TY - GEN
T1 - ArCOV19-Rumors
T2 - 6th Arabic Natural Language Processing Workshop, WANLP 2021
AU - Haouari, Fatima
AU - Hasanain, Maram
AU - Suwaileh, Reem
AU - Elsayed, Tamer
N1 - Publisher Copyright: © WANLP 2021 - 6th Arabic Natural Language Processing Workshop
PY - 2021
Y1 - 2021
N2 - In this paper, we introduce ArCOV19-Rumors, an Arabic COVID-19 Twitter dataset for misinformation detection composed of tweets containing claims from 27 January to the end of April 2020. We collected 138 verified claims, mostly from popular fact-checking websites, and identified 9.4K tweets relevant to those claims. Tweets were manually annotated by veracity to support research on misinformation detection, which is one of the major problems faced during a pandemic. ArCOV19-Rumors supports two levels of misinformation detection over Twitter: verifying free-text claims (called claim-level verification) and verifying claims expressed in tweets (called tweet-level verification). In addition to health, our dataset covers claims related to other topical categories that were influenced by COVID-19, namely social, politics, sports, entertainment, and religious. Moreover, we present benchmarking results for tweet-level verification on the dataset. We experimented with SOTA models from a range of approaches that exploit content, user profile features, temporal features, or the propagation structure of conversational threads for tweet verification.
UR - http://www.scopus.com/inward/record.url?scp=85106703336&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85106703336
T3 - WANLP 2021 - 6th Arabic Natural Language Processing Workshop, Proceedings of the Workshop
SP - 72
EP - 81
BT - WANLP 2021 - 6th Arabic Natural Language Processing Workshop, Proceedings of the Workshop
A2 - Habash, Nizar
A2 - Bouamor, Houda
A2 - Hajj, Hazem
A2 - Magdy, Walid
A2 - Zaghouani, Wajdi
A2 - Bougares, Fethi
A2 - Tomeh, Nadi
A2 - Farha, Ibrahim Abu
A2 - Touileb, Samia
PB - Association for Computational Linguistics (ACL)
Y2 - 19 April 2021
ER -