Content Words in Authorship Attribution: An Evaluation of Stylometric Features in a Literary Corpus

Research output: Chapter in Book/Report/Conference proceeding, Chapter, peer-reviewed

Abstract

The aim of this paper is to explore quantitatively the literary style of five Modern Greek novels and to conduct an authorship attribution experiment comparing different sets of stylometric variables. To this end, we created a corpus of five Modern Greek novels written by four authors. Each novel was split into equal-sized word chunks of several different sizes, and for each text sample we calculated a number of common stylometric variables. Furthermore, we used the most Frequent Function Words (FFW) of the corpus as well as the most distinctive Author-Specific Words (ASW). The resulting datasets were analyzed using Discriminant Function Analysis, and the ASW method exhibited superior authorship attribution accuracy across all text sizes compared to the stylometric and FFW variables.
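
The chunking and frequency-feature steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the whitespace tokenization, the dropping of a trailing partial chunk, and the use of the overall most frequent words as a stand-in for a curated function-word list are all assumptions made for illustration. The Discriminant Function Analysis step is not shown.

```python
from collections import Counter


def split_into_chunks(words, chunk_size):
    """Split a token list into consecutive chunks of exactly chunk_size words.

    Assumption: a trailing remainder shorter than chunk_size is dropped.
    """
    n_full = len(words) // chunk_size
    return [words[i * chunk_size:(i + 1) * chunk_size] for i in range(n_full)]


def most_frequent_words(chunks, top_n):
    """Return the top_n most frequent words across all chunks.

    Assumption: a stand-in for a function-word list, which would normally
    be filtered against a curated list for the language.
    """
    counts = Counter(word for chunk in chunks for word in chunk)
    return [word for word, _ in counts.most_common(top_n)]


def relative_frequencies(chunk, vocabulary):
    """Return the relative frequency of each vocabulary word in one chunk."""
    counts = Counter(chunk)
    return [counts[word] / len(chunk) for word in vocabulary]


# Hypothetical toy input; a real experiment would use full novels.
tokens = "to be or not to be that is the question to be".split()
chunks = split_into_chunks(tokens, chunk_size=5)
vocabulary = most_frequent_words(chunks, top_n=2)
features = [relative_frequencies(chunk, vocabulary) for chunk in chunks]
```

Each row of `features` is one text sample. Rows of this kind, labeled by author, are the sort of input a discriminant classifier would take.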
Original language: English
Title of host publication: Issues in Quantitative Linguistics
Publication status: Published - 2009
Externally published: Yes
