TY - GEN
T1 - Fighting the COVID-19 Infodemic
T2 - 2021 Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021
AU - Alam, Firoj
AU - Shaar, Shaden
AU - Dalvi, Fahim
AU - Sajjad, Hassan
AU - Nikolov, Alex
AU - Mubarak, Hamdy
AU - Da San Martino, Giovanni
AU - Abdelali, Ahmed
AU - Durrani, Nadir
AU - Darwish, Kareem
AU - Al-Homai, Abdulaziz
AU - Zaghouani, Wajdi
AU - Caselli, Tommaso
AU - Danoe, Gijs
AU - Stolk, Friso
AU - Bruntink, Britt
AU - Nakov, Preslav
N1 - Publisher Copyright:
© 2021 Association for Computational Linguistics.
PY - 2021
Y1 - 2021
N2 - With the emergence of the COVID-19 pandemic, the political and the medical aspects of disinformation merged as the problem got elevated to a whole new level to become the first global infodemic. Fighting this infodemic has been declared one of the most important focus areas of the World Health Organization, with dangers ranging from promoting fake cures, rumors, and conspiracy theories to spreading xenophobia and panic. Addressing the issue requires solving a number of challenging problems such as identifying messages containing claims, determining their check-worthiness and factuality, and their potential to do harm as well as the nature of that harm, to mention just a few. To address this gap, we release a large dataset of 16K manually annotated tweets for fine-grained disinformation analysis that (i) focuses on COVID19, (ii) combines the perspectives and the interests of journalists, fact-checkers, social media platforms, policy makers, and society, and (iii) covers Arabic, Bulgarian, Dutch, and English. Finally, we show strong evaluation results using pretrained Transformers, thus confirming the practical utility of the dataset in monolingual vs. multilingual, and single task vs. multitask settings.
AB - With the emergence of the COVID-19 pandemic, the political and the medical aspects of disinformation merged as the problem got elevated to a whole new level to become the first global infodemic. Fighting this infodemic has been declared one of the most important focus areas of the World Health Organization, with dangers ranging from promoting fake cures, rumors, and conspiracy theories to spreading xenophobia and panic. Addressing the issue requires solving a number of challenging problems such as identifying messages containing claims, determining their check-worthiness and factuality, and their potential to do harm as well as the nature of that harm, to mention just a few. To address this gap, we release a large dataset of 16K manually annotated tweets for fine-grained disinformation analysis that (i) focuses on COVID19, (ii) combines the perspectives and the interests of journalists, fact-checkers, social media platforms, policy makers, and society, and (iii) covers Arabic, Bulgarian, Dutch, and English. Finally, we show strong evaluation results using pretrained Transformers, thus confirming the practical utility of the dataset in monolingual vs. multilingual, and single task vs. multitask settings.
UR - http://www.scopus.com/inward/record.url?scp=85116712351&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85116712351
T3 - Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021
SP - 611
EP - 649
BT - Findings of the Association for Computational Linguistics, Findings of ACL
A2 - Moens, Marie-Francine
A2 - Huang, Xuanjing
A2 - Specia, Lucia
A2 - Yih, Scott Wen-Tau
PB - Association for Computational Linguistics (ACL)
Y2 - 7 November 2021 through 11 November 2021
ER -