Hate Speech Degree Detection on English Data
With the emergence of a variety of social media platforms, and the freedom to express one’s thought, sadly, there is a lot of hateful content available on social media. Some platforms like Twitter filter out any posts which involve abusive and highly provocative language. However, Gab is a platform where freedom of speech is retained. Thus, hate content can be easily found on Gab. It becomes important to analyze the data, posts and comments. Hate Speech detection thus, plays an important role in identifying any kind of trend, troll, threat, etc:
Hate Speech detection is not a new problem. However, the fact that there can be various degrees of hate in a data is often ignored. It should be noted that not all kind of hate speech can be severly detrimental and some post can contain severe violent provocation or threat that needs to handled to prevent any deteriorating consequences.
Our team has acquired past 4 years dataset of Gab using pushshift.io .
Initially we decided to divide the dataset into 5 classes, as mentioned here : https://docs.google.com/document/d/19FPhzMzea46NnUGA6IqrYO460dYE9n1AwbsiNKwfnwo/edit?usp=sharing and https://docs.google.com/document/d/1LH8cYBsik5zPbGwRKP_AyhM2y5XGqu2KdjI6CxCimEQ/edit?usp=sharing
This classification is based on three parameters :
- hate speech or non hate speech
- against a group or individual
- violent or non-violent
Based on the above parameters we got the classes. We had a hypothesis that most hateful remark are more hurtful when it is pointed towards a group as compared to an individual. Also, most violent post are more dangerous than a non-violent one.
We referred to following papers and article to come up with the above stated classes : https://www.aclweb.org/anthology/W18-4413 https://www.aclweb.org/anthology/N16-2013 http://www.yichang-cs.com/yahoo/WWW16_Abusivedetection.pdf https://aclweb.org/anthology/W16-5618 http://www.robertjsternberg.com/hate/
However, we soon realized that there is a flaw in the given class system because a sarcastic post against an individual is much lighter in terms of hate as compared to an accusing post against an individual but according to the given class system both were assigned to the same class. Also, some sarcastic posts against a group can be less hateful than an accusation or intimidation against an individual, thus, however according to the classification, the former will be considered more hateful than latter.
Thus, we decided that we can instead annotate within three categories as mentioned in the below doc : https://docs.google.com/document/d/1C49RknV-eYGIp83IEzxAtNxoRSTPrncc5ZNegA1krUM/edit?usp=sharing