ArCOV19-Rumors: Arabic COVID-19 Twitter Dataset for Misinformation Detection

Fatima Haouari, Maram Hasanain, Reem Suwaileh, Tamer Elsayed

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

27 Citations (Scopus)

Abstract

In this paper, we introduce ArCOV19-Rumors, an Arabic COVID-19 Twitter dataset for misinformation detection, composed of tweets containing claims posted from 27 January until the end of April 2020. We collected 138 verified claims, mostly from popular fact-checking websites, and identified 9.4K tweets relevant to those claims. The tweets were manually annotated for veracity to support research on misinformation detection, one of the major problems faced during a pandemic. ArCOV19-Rumors supports two levels of misinformation detection over Twitter: verifying free-text claims (claim-level verification) and verifying claims expressed in tweets (tweet-level verification). In addition to health, our dataset covers claims in other topical categories that were influenced by COVID-19, namely social, politics, sports, entertainment, and religious. Moreover, we present benchmarking results for tweet-level verification on the dataset. We experimented with state-of-the-art (SOTA) models from a range of approaches that variously exploit content, user profile features, temporal features, and the propagation structure of conversational threads for tweet verification.

Original language: English
Title of host publication: WANLP 2021 - 6th Arabic Natural Language Processing Workshop, Proceedings of the Workshop
Editors: Nizar Habash, Houda Bouamor, Hazem Hajj, Walid Magdy, Wajdi Zaghouani, Fethi Bougares, Nadi Tomeh, Ibrahim Abu Farha, Samia Touileb
Publisher: Association for Computational Linguistics (ACL)
Pages: 72-81
Number of pages: 10
ISBN (Electronic): 9781954085091
Publication status: Published - 2021
Externally published: Yes
Event: 6th Arabic Natural Language Processing Workshop, WANLP 2021 - Virtual, Kyiv, Ukraine
Duration: 19 Apr 2021 → …

Publication series

Name: WANLP 2021 - 6th Arabic Natural Language Processing Workshop, Proceedings of the Workshop

Conference

Conference: 6th Arabic Natural Language Processing Workshop, WANLP 2021
Country/Territory: Ukraine
City: Virtual, Kyiv
Period: 19/04/21 → …
