TY - GEN
T1 - Can LLMs Facilitate Interpretation of Pre-trained Language Models?
AU - Mousi, Basel
AU - Durrani, Nadir
AU - Dalvi, Fahim
N1 - Publisher Copyright:
©2023 Association for Computational Linguistics.
PY - 2023
Y1 - 2023
AB - Work done to uncover the knowledge encoded within pre-trained language models relies on annotated corpora or human-in-the-loop methods. However, these approaches are limited in terms of scalability and the scope of interpretation. We propose using a large language model, ChatGPT, as an annotator to enable fine-grained interpretation analysis of pre-trained language models. We discover latent concepts within pre-trained language models by applying agglomerative hierarchical clustering over contextualized representations and then annotate these concepts using ChatGPT. Our findings demonstrate that ChatGPT produces accurate and semantically richer annotations than human-annotated concepts. Additionally, we showcase how GPT-based annotations empower interpretation analysis methodologies, of which we demonstrate two: probing frameworks and neuron interpretation. To facilitate further exploration and experimentation in the field, we make available a substantial Concept-Net dataset (TCN) comprising 39,000 annotated concepts.
UR - http://www.scopus.com/inward/record.url?scp=85179780678&partnerID=8YFLogxK
DO - 10.18653/v1/2023.emnlp-main.196
M3 - Conference contribution
AN - SCOPUS:85179780678
T3 - EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings
SP - 3248
EP - 3268
BT - EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings
A2 - Bouamor, Houda
A2 - Pino, Juan
A2 - Bali, Kalika
PB - Association for Computational Linguistics (ACL)
T2 - 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023
Y2 - 6 December 2023 through 10 December 2023
ER -