Quantitative parameters in corpus design: Estimating the optimum text size in Modern Greek language

George Mikros*

*Corresponding author for this work

Research output: Contribution to conference › Paper › peer-review

Abstract

The aim of this paper is to investigate the major quantitative parameters related to the definition of the optimum text size in Modern Greek corpus development. Using the Hellenic National Corpus (HNC) (Hatzigeorgiu et al., 2000) as a reference point, we estimated a number of critical statistical measures regarding feature counting in different text sizes. The results indicate that frequent linguistic features behave differently from medium-frequency and rare ones, and that an increase in text size does not affect them uniformly.

Original language: English
Pages: 834-838
Number of pages: 5
Publication status: Published - 2002
Externally published: Yes
Event: 3rd International Conference on Language Resources and Evaluation, LREC 2002 - Las Palmas, Canary Islands, Spain
Duration: 29 May 2002 to 31 May 2002

Conference

Conference: 3rd International Conference on Language Resources and Evaluation, LREC 2002
Country/Territory: Spain
City: Las Palmas, Canary Islands
Period: 29/05/02 to 31/05/02
