TY - JOUR
T1 - When a disaster happens, we are ready
T2 - Location mention recognition from crisis tweets
AU - Suwaileh, Reem
AU - Elsayed, Tamer
AU - Imran, Muhammad
AU - Sajjad, Hassan
N1 - Publisher Copyright: © 2022 Elsevier Ltd
PY - 2022/8
Y1 - 2022/8
N2 - Geolocation information is important for humanitarian organizations to gain situational awareness and deliver timely aid during disasters. Towards addressing the problem of recognizing locations, i.e., Location Mention Recognition (LMR), within social media posts during disasters, past studies mainly focused on proposing techniques that assume the availability of abundant training data at the disaster onset. In this work, we adopt the more realistic assumption that no (i.e., zero-shot setting) or as little as a few hundred examples (i.e., few-shot setting) from the just-occurred event is available for training. Specifically, we examine the effect of training a BERT-based LMR model on past events using different settings, datasets, languages, and geo-proximity. Extensive empirical analysis provides several insights for building an effective LMR model during disasters, including (i) Twitter crisis-related and location-specific data from geographically nearby disaster events is more useful than all other combinations of training datasets in the zero-shot monolingual setting, (ii) using as few as 263-356 training tweets from the target language (i.e., few-shot setting) remarkably boosts the performance in the cross- and multilingual settings, and (iii) labeling about 500 target event's tweets leads to an acceptable LMR performance, higher than F1 of 0.7, in the monolingual settings. Finally, we conduct an extensive error analysis and highlight issues related to the quality of the available datasets and weaknesses of the current model.
AB - Geolocation information is important for humanitarian organizations to gain situational awareness and deliver timely aid during disasters. Towards addressing the problem of recognizing locations, i.e., Location Mention Recognition (LMR), within social media posts during disasters, past studies mainly focused on proposing techniques that assume the availability of abundant training data at the disaster onset. In this work, we adopt the more realistic assumption that no (i.e., zero-shot setting) or as little as a few hundred examples (i.e., few-shot setting) from the just-occurred event is available for training. Specifically, we examine the effect of training a BERT-based LMR model on past events using different settings, datasets, languages, and geo-proximity. Extensive empirical analysis provides several insights for building an effective LMR model during disasters, including (i) Twitter crisis-related and location-specific data from geographically nearby disaster events is more useful than all other combinations of training datasets in the zero-shot monolingual setting, (ii) using as few as 263-356 training tweets from the target language (i.e., few-shot setting) remarkably boosts the performance in the cross- and multilingual settings, and (iii) labeling about 500 target event's tweets leads to an acceptable LMR performance, higher than F1 of 0.7, in the monolingual settings. Finally, we conduct an extensive error analysis and highlight issues related to the quality of the available datasets and weaknesses of the current model.
KW - Geolocation recognition
KW - Social good
KW - Twitter
UR - https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=hbku_researchportal&SrcAuth=WosAPI&KeyUT=WOS:000831719000007&DestLinkType=FullRecord&DestApp=WOS_CPL
U2 - 10.1016/j.ijdrr.2022.103107
DO - 10.1016/j.ijdrr.2022.103107
M3 - Article
SN - 2212-4209
VL - 78
JO - International Journal of Disaster Risk Reduction
JF - International Journal of Disaster Risk Reduction
M1 - 103107
ER -