Towards Bangla Named Entity Recognition

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

12 Citations (Scopus)

Abstract

Named Entity Recognition (NER) is one of the fundamental problems in Information Extraction; the task is to find the entities mentioned in a text. Over the years there has been significant progress in NER research for resource-rich languages such as English, Chinese, and Italian. Although there are a number of studies on Bangla NER, most of them were conducted almost a decade ago and focused on a single geographical location (i.e., India). Therefore, in this paper, we present a corpus annotated with seven named entity types, with a particular focus on Bangladeshi Bangla. The corpus is part of the development of the Bangla Content Annotation Bank (B-CAB). We also present baseline results, which can be useful for future research. For the baselines, we employed word-level, POS, gazetteer, and contextual features with Conditional Random Fields (CRFs). Our study also includes an exploration of deep neural networks. Additionally, we investigated another large corpus from a different geographical location (i.e., India) and concluded that geography-specific NER resources are important for a language.
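The feature families named in the abstract (word-level, POS, gazetteer, and contextual) can be illustrated with a minimal Python sketch. This is not the authors' feature set: the function names, the one-token context window, the three-character affixes, the transliterated tokens, and the POS tags are all assumptions made for illustration. It only shows the general shape of per-token feature dictionaries that CRF toolkits typically consume.

```python
from typing import Dict, List, Set, Tuple

Token = Tuple[str, str]  # (word, POS tag); the tag set here is hypothetical


def token_features(sentence: List[Token], index: int,
                   gazetteer: Set[str], window: int = 1) -> Dict[str, object]:
    """Build a feature dictionary for one token (illustrative features only)."""
    word, pos = sentence[index]
    feats: Dict[str, object] = {
        "word": word,                       # word-level feature
        "prefix3": word[:3],                # word-level: leading characters
        "suffix3": word[-3:],               # word-level: trailing characters
        "is_digit": word.isdigit(),         # word-level: numeric token
        "pos": pos,                         # POS feature
        "in_gazetteer": word in gazetteer,  # gazetteer lookup
    }
    # Contextual features: neighbouring words and tags inside the window.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = index + offset
        if 0 <= j < len(sentence):
            ctx_word, ctx_pos = sentence[j]
            feats[f"word[{offset:+d}]"] = ctx_word
            feats[f"pos[{offset:+d}]"] = ctx_pos
        else:
            # Mark positions that fall outside the sentence boundary.
            feats[f"pad[{offset:+d}]"] = True
    return feats


def sentence_features(sentence: List[Token],
                      gazetteer: Set[str]) -> List[Dict[str, object]]:
    """Feature dictionaries for every token in a sentence."""
    return [token_features(sentence, i, gazetteer) for i in range(len(sentence))]


# Hypothetical transliterated example ("Dhaka is the capital of Bangladesh").
example = [("Dhaka", "NNP"), ("Bangladesher", "NNP"), ("rajdhani", "NN")]
location_gazetteer = {"Dhaka"}  # assumed toy gazetteer
features = sentence_features(example, location_gazetteer)
```

Each sentence becomes a list of dictionaries, one per token, which a linear-chain CRF implementation can pair with a parallel list of entity labels during training.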

Original language: English
Title of host publication: 2018 21st International Conference of Computer and Information Technology, ICCIT 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781538692424
Publication status: Published - 2 Jul 2018
Event: 21st International Conference of Computer and Information Technology, ICCIT 2018 - Dhaka, Bangladesh
Duration: 21 Dec 2018 to 23 Dec 2018

Publication series

Name: 2018 21st International Conference of Computer and Information Technology, ICCIT 2018

Conference

Conference: 21st International Conference of Computer and Information Technology, ICCIT 2018
Country/Territory: Bangladesh
City: Dhaka
Period: 21/12/18 to 23/12/18

Keywords

  • Bangla
  • CRF
  • LSTM
  • Named Entity Recognition
  • Neural Network
  • Sequence Labeling
