Abstract
We propose a novel approach to the study of how artificial neural networks perceive the distinction between grammatical and ungrammatical sentences, a crucial task in the growing field of synthetic linguistics. The method is based on performance measures of language models that are trained on corpora, fine-tuned with either grammatical or ungrammatical sentences, and then applied to (different types of) grammatical or ungrammatical sentences. The results show that, both in the difficult and highly symmetrical task of detecting subject islands and in the more open CoLA dataset, grammatical sentences give rise to better scores than ungrammatical ones, possibly because they can be better integrated within the body of linguistic structural knowledge that the language model has accumulated.
Original language | English |
---|---|
Title of host publication | Proceedings of the Second BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP |
Pages | 204-212 |
Publication status | Published - 2019 |
Externally published | Yes |