TY - GEN
T1 - Improving the classification of newsgroup messages through social network analysis
AU - Fortuna, Blaz
AU - Rodrigues, Eduarda Mendes
AU - Milic-Frayling, Natasa
PY - 2007
Y1 - 2007
AB - Newsgroup participants interact with their communities through conversation threads. They may respond to a message to answer a question, debate a topic, support or disagree with another person's point, or digress and write about a different subject. Understanding the structure of threads and the sentiment of the participants' interaction is valuable for search and moderation of newsgroups. In this paper, we focus on automatic classification of message replies into several types. For representing messages we consider rich feature sets that combine the standard author reply-to network properties with features derived from four additional structures identified in the data: 1) a network of authors who participate in the same threads, 2) network of authors who post similar content, 3) network of threads sharing common authors, and 4) network of content-related threads. For selected newsgroups we train linear SVM classifiers to identify agreement and disagreement with the original message, and question and answer patterns in the threads. We show that the use of newly defined features substantially improves classification of messages in comparison with the SVM model based only on the standard reply-to network.
KW - Communities
KW - Message classification
KW - Newsgroups
KW - Social networks
UR - http://www.scopus.com/inward/record.url?scp=63449127205&partnerID=8YFLogxK
U2 - 10.1145/1321440.1321565
DO - 10.1145/1321440.1321565
M3 - Conference contribution
AN - SCOPUS:63449127205
SN - 9781595938039
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 877
EP - 880
BT - CIKM 2007 - Proceedings of the 16th ACM Conference on Information and Knowledge Management
T2 - 16th ACM Conference on Information and Knowledge Management, CIKM 2007
Y2 - 6 November 2007 through 9 November 2007
ER -