Neuron-level Interpretation of Deep NLP Models: A Survey

Hassan Sajjad, Nadir Durrani, Fahim Dalvi

Research output: Contribution to journal, Article, peer-review

33 Citations (Scopus)

Abstract

The proliferation of Deep Neural Networks in various domains has increased the need for interpretability of these models. Preliminary work along this line, and the papers that survey it, focus on high-level representation analysis. However, a recent branch of work has concentrated on interpretability at a more granular level: analyzing neurons within these models. In this paper, we survey the work done on neuron analysis, including: i) methods to discover and understand neurons in a network; ii) evaluation methods; iii) major findings, including cross-architectural comparisons, that neuron analysis has uncovered; iv) applications of neuron probing, such as controlling the model and domain adaptation; and v) a discussion of open issues and future research directions.

Original language: English
Pages (from-to): 1285-1303
Number of pages: 19
Journal: Transactions of the Association for Computational Linguistics
Volume: 10
Publication status: Published - 22 Nov 2022
