TY - JOUR
T1 - PROVOKE
T2 - Toxicity trigger detection in conversations from the top 100 subreddits
AU - Almerekhi, Hind
AU - Kwak, Haewoon
AU - Salminen, Joni
AU - Jansen, Bernard J.
N1 - Publisher Copyright:
© 2022 Wuhan University
PY - 2022/10
Y1 - 2022/10
N2 - Promoting healthy discourse on community-based online platforms like Reddit can be challenging, especially when conversations show ominous signs of toxicity. Therefore, in this study, we identify the turning points (i.e., toxicity triggers) that make conversations toxic. Before finding toxicity triggers, we built and evaluated various machine learning models to detect toxicity in Reddit comments. Subsequently, we used our best-performing model, a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model that achieved an area under the receiver operating characteristic curve (AUC) score of 0.983, to detect toxicity. Next, we constructed conversation threads and used the toxicity prediction results to build a training set for detecting toxicity triggers. This procedure entailed using our large-scale dataset to refine the definition of toxicity triggers and to build a trigger detection dataset of 991,806 conversation threads from the top 100 communities on Reddit. Then, we extracted sentiment shift, topical shift, and context-based features from the trigger detection dataset and used them to build a dual embedding BiLSTM neural network that achieved an AUC score of 0.789. Analysis of the trigger detection dataset showed that some triggering keywords, such as ‘racist’ and ‘women’, are common across all communities, whereas others, such as ‘overwatch’ in r/Games, are specific to particular communities. The implication is that toxicity trigger detection algorithms can leverage generic approaches but must also tailor detection to specific communities.
AB - Promoting healthy discourse on community-based online platforms like Reddit can be challenging, especially when conversations show ominous signs of toxicity. Therefore, in this study, we identify the turning points (i.e., toxicity triggers) that make conversations toxic. Before finding toxicity triggers, we built and evaluated various machine learning models to detect toxicity in Reddit comments. Subsequently, we used our best-performing model, a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model that achieved an area under the receiver operating characteristic curve (AUC) score of 0.983, to detect toxicity. Next, we constructed conversation threads and used the toxicity prediction results to build a training set for detecting toxicity triggers. This procedure entailed using our large-scale dataset to refine the definition of toxicity triggers and to build a trigger detection dataset of 991,806 conversation threads from the top 100 communities on Reddit. Then, we extracted sentiment shift, topical shift, and context-based features from the trigger detection dataset and used them to build a dual embedding BiLSTM neural network that achieved an AUC score of 0.789. Analysis of the trigger detection dataset showed that some triggering keywords, such as ‘racist’ and ‘women’, are common across all communities, whereas others, such as ‘overwatch’ in r/Games, are specific to particular communities. The implication is that toxicity trigger detection algorithms can leverage generic approaches but must also tailor detection to specific communities.
KW - Conversation threads
KW - Neural networks
KW - Online toxicity
KW - Reddit
KW - Social media
KW - Toxicity triggers
UR - http://www.scopus.com/inward/record.url?scp=85139638316&partnerID=8YFLogxK
U2 - 10.1016/j.dim.2022.100019
DO - 10.1016/j.dim.2022.100019
M3 - Article
AN - SCOPUS:85139638316
SN - 2543-9251
VL - 6
JO - Data and Information Management
JF - Data and Information Management
IS - 4
M1 - 100019
ER -