TY - GEN
T1 - Halwasa
T2 - Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024
AU - Mubarak, Hamdy
AU - Al-Khalifa, Hend
AU - Alkhalefah, Khaloud
N1 - Publisher Copyright:
© 2024 ELRA Language Resource Association: CC BY-NC 4.0.
PY - 2024/5
Y1 - 2024/5
N2 - Large Language Models (LLMs) have shown superb abilities to generate texts that are, in many cases, indistinguishable from human-generated texts. However, they sometimes generate false, incorrect, or misleading content, often described as “hallucinations”. Quantifying and analyzing hallucinations in LLMs can increase their reliability and usage. While hallucination is being actively studied for English and other languages, and different benchmarking datasets have been created, this area has not been studied at all for Arabic. In our paper, we create the first Arabic dataset that contains 10K sentences generated by LLMs and annotate it for factuality and correctness. We provide a detailed analysis of the dataset covering factual and linguistic errors. We found that 25% of the generated sentences are factually incorrect. We share the dataset with the research community.
AB - Large Language Models (LLMs) have shown superb abilities to generate texts that are, in many cases, indistinguishable from human-generated texts. However, they sometimes generate false, incorrect, or misleading content, often described as “hallucinations”. Quantifying and analyzing hallucinations in LLMs can increase their reliability and usage. While hallucination is being actively studied for English and other languages, and different benchmarking datasets have been created, this area has not been studied at all for Arabic. In our paper, we create the first Arabic dataset that contains 10K sentences generated by LLMs and annotate it for factuality and correctness. We provide a detailed analysis of the dataset covering factual and linguistic errors. We found that 25% of the generated sentences are factually incorrect. We share the dataset with the research community.
UR - http://www.scopus.com/inward/record.url?scp=85195919206&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85195919206
T3 - 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
SP - 8008
EP - 8015
BT - 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
A2 - Calzolari, Nicoletta
A2 - Kan, Min-Yen
A2 - Hoste, Veronique
A2 - Lenci, Alessandro
A2 - Sakti, Sakriani
A2 - Xue, Nianwen
PB - European Language Resources Association (ELRA)
Y2 - 20 May 2024 through 25 May 2024
ER -