TY - GEN
T1 - Authorship attribution in Greek tweets using author's multilevel N-gram profiles
AU - Mikros, George K.
AU - Perifanos, Kostas A.
PY - 2013
Y1 - 2013
N2 - The aim of this study is to explore authorship attribution methods in Greek tweets. We have developed the first Modern Greek Twitter corpus (GTC), consisting of 12,973 tweets crawled from 10 popular Greek users. We used this corpus to study the effectiveness of a specific document representation called Author's Multilevel N-gram Profile (AMNP) and the impact of different methods of training data construction for the task of authorship attribution. To address these research questions, we used GTC to create 4 different datasets containing tweets merged into texts of different sizes (100, 75, 50 and 25 words). Results were evaluated using authorship attribution accuracy both in 10-fold cross-validation and on an external test set compiled from actual tweets. The AMNP representation achieved significantly better accuracies than single feature groups across all text sizes.
UR - http://www.scopus.com/inward/record.url?scp=84883273767&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84883273767
SN - 9781577355984
T3 - AAAI Spring Symposium - Technical Report
SP - 17
EP - 23
BT - Analyzing Microtext - Papers from the AAAI Spring Symposium, Technical Report
T2 - 2013 AAAI Spring Symposium
Y2 - 25 March 2013 through 27 March 2013
ER -