AI-Writing Detection Using an Ensemble of Transformers and Stylometric Features

George Mikros, Athanasios Koursaris, Dimitrios Bilianos, George Markopoulos

Research output: Contribution to journal / Conference article / peer-review

1 Citation (Scopus)

Abstract

This study develops a methodology for detecting AI-generated text that combines transformer models with stylometric features. We used two datasets provided by the AuTexTification (Automated Text Identification) shared task, part of IberLEF 2023, the 5th Workshop on Iberian Languages Evaluation Forum, held at the SEPLN 2023 Conference. Our team took part in both English-language subtasks: binary classification of texts as human-written or AI-generated, and multiclass classification to identify which of six AI writing models produced a text. Our main approach was to experiment with multiple transformer models while also running an extensive stylometric feature engineering workflow. Each method (transformers and stylometric features) was first applied separately, and we then explored various ways to combine them. The most effective method was an ensemble based on majority voting, which combined the two transformer models that were most accurate on our training data with a concatenation of many different stylometric feature groups. The macro-F1 scores on the test sets for subtasks 1 and 2 were 60.78 and 55.87, respectively, placing our group above the median of the competing teams. These results underscore the potential of combining transformer models and stylometric features to improve the accuracy of AI-generated text detection.
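As a rough illustration of the majority-voting step described in the abstract, the sketch below combines per-text label predictions from three classifiers by hard voting. It is not the authors' implementation: the function name `majority_vote`, the label strings, the example predictions, and the tie-breaking rule (prefer the earliest-listed model) are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter
from typing import Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> list[str]:
    """Hard-vote across models; predictions[i][j] is model i's label for text j.

    Assumption: ties are broken in favor of the earliest-listed model.
    """
    if not predictions:
        raise ValueError("need at least one model's predictions")
    n_texts = len(predictions[0])
    if any(len(p) != n_texts for p in predictions):
        raise ValueError("all models must label the same number of texts")

    voted = []
    for labels in zip(*predictions):  # one tuple of votes per text
        counts = Counter(labels)
        top = max(counts.values())
        # The first label (in model order) that reaches the top count wins.
        voted.append(next(label for label in labels if counts[label] == top))
    return voted


# Hypothetical outputs of two transformer models and one stylometric classifier.
transformer_a = ["human", "generated", "generated", "human"]
transformer_b = ["human", "human", "generated", "generated"]
stylometric = ["generated", "generated", "generated", "human"]

ensemble = majority_vote([transformer_a, transformer_b, stylometric])
print(ensemble)
```

With three voters and only two classes, a strict majority always exists. In the six-class subtask all three voters can disagree, which is why the sketch needs an explicit tie-breaking rule.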

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 3496
Publication status: Published - 2023
Event: 2023 Iberian Languages Evaluation Forum, IberLEF 2023 - Jaen, Spain
Duration: 26 Sept 2023 → …

Keywords

  • AI-writing detection
  • ensemble learning
  • stylometry
  • transformers
