Overview of character-based models for natural language processing

Heike Adel, Ehsaneddin Asgari, Hinrich Schütze*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

Character-based models are becoming increasingly popular for a range of natural language processing tasks, especially due to the success of neural networks. They make it possible to model text sequences directly, without the need for tokenization, and therefore enhance the traditional preprocessing pipeline. This paper provides an overview of character-based models for a variety of natural language processing tasks. We group existing work into three categories: tokenization-based approaches, bag-of-n-gram models, and end-to-end models. For each category, we present prominent examples of studies, with a particular focus on recent character-based deep learning work.

Original language: English
Title of host publication: Computational Linguistics and Intelligent Text Processing - 18th International Conference, CICLing 2017, Revised Selected Papers
Editors: Alexander Gelbukh
Publisher: Springer Verlag
Pages: 3-16
Number of pages: 14
ISBN (Print): 9783319771120
Publication status: Published - 2018
Externally published: Yes
Event: 18th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2017 - Budapest, Hungary
Duration: 17 Apr 2017 → 23 Apr 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10761 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2017
Country/Territory: Hungary
City: Budapest
Period: 17/04/17 → 23/04/17

Keywords

  • Document representation
  • Feature selection
  • Language models
  • Natural language generation
  • Natural language processing
  • Neural networks
  • Structured prediction
  • Supervised learning by classification
