Exploring Semantic Hadith Overlap Across Topics

Devi G. Kurup*, Amina Daoud, Jens Schneider, Wajdi Zaghouani*, Saeed Mohd H.M. Al Marri, Hamada R.H. Al-Absi, Younss Ait Mou

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Semantic sentence similarity measures the degree of resemblance between sentences. It is a foundational element of information retrieval, machine translation, and related tasks. This paper applies natural language processing techniques to analyze semantic similarity in Hadiths, which are significant religious texts in Islam. Our objective is to investigate the extent of semantic overlap between Hadiths across various topics, with the aim of providing insights into the cohesion and interconnectedness of Hadiths. We use AraVec and GPT embeddings to represent Hadiths numerically, followed by UMAP (Uniform Manifold Approximation and Projection) to project these embeddings to 2D. The projection serves to visually interpret the relationships between Hadiths, facilitating a deeper understanding of content and semantic interrelations. Our results unveil semantic clusters and connections within Hadiths, contributing to the exploration of Islamic textual heritage through modern computational methodologies. This study suggests that GPT outperforms AraVec, providing a more advanced representation that discerns intricate semantic relationships and subtle nuances within the Hadiths.
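As a rough illustration of what "semantic overlap across topics" could mean numerically, the sketch below averages pairwise cosine similarity between two groups of embedding vectors. It is not the authors' method: the abstract does not name a similarity measure, so cosine similarity is an assumption here, and the function names, topic labels, and 3-dimensional toy vectors are all hypothetical stand-ins for real AraVec or GPT embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def mean_cross_topic_similarity(topic_a, topic_b):
    """Average pairwise cosine similarity between two groups of embeddings."""
    scores = [cosine_similarity(u, v) for u in topic_a for v in topic_b]
    return sum(scores) / len(scores)


# Toy vectors standing in for real embeddings (hypothetical data and topics).
topic_prayer = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
topic_fasting = [[0.7, 0.3, 0.1], [0.1, 0.1, 0.9]]

overlap = mean_cross_topic_similarity(topic_prayer, topic_fasting)
print(f"mean cross-topic similarity: {overlap:.3f}")
```

With real embeddings, the same per-Hadith vectors would also be the input to a 2D projection such as UMAP; that step is left out here to keep the sketch dependency-free.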

Original language: English
Title of host publication: Arabic Language Processing
Subtitle of host publication: From Theory to Practice - 8th International Conference, ICALP 2023, Proceedings
Editors: Boutaina Hdioud, Si Lhoussain Aouragh
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 127-138
Number of pages: 12
ISBN (Print): 9783031791635
Publication status: Published - 2025
Event: 8th International Conference on Arabic Language Processing, ICALP 2023 - Rabat, Morocco
Duration: 19 Apr 2024 to 20 Apr 2024

Publication series

Name: Communications in Computer and Information Science
Volume: 2339 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 8th International Conference on Arabic Language Processing, ICALP 2023
Country/Territory: Morocco
City: Rabat
Period: 19/04/24 to 20/04/24

Keywords

  • Arabic Natural Language Processing
  • AraVec & GPT
  • Hadith
  • Hadith Corpus
  • Semantic relatedness
  • UMAP
