What is NeuroX?

NeuroX is a framework for interpreting deep NLP models and increasing the transparency of their inner workings and predictions. The framework and its methodologies aim to go beyond input features and provide richer explanations of a given model and its predictions. It encompasses several lines of work, including Neuron Probing, which highlights which components (layers, attention heads, neurons) of a network learn specific concepts, and Latent Concept Discovery, which extracts the concepts captured within the learned representations.
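
As a rough illustration of the neuron probing idea, the sketch below trains a small linear classifier (a "probe") on activations and ranks neurons by the magnitude of their probe weights. It is not the NeuroX API: the function names `train_linear_probe` and `rank_neurons`, the L2-regularized softmax probe, and the synthetic activations are all assumptions made for this example.

```python
import numpy as np

def train_linear_probe(activations, labels, num_classes, lr=0.2, epochs=1000, l2=1e-3):
    """Fit a softmax probe on activations with plain gradient descent."""
    n, d = activations.shape
    weights = np.zeros((d, num_classes))
    bias = np.zeros(num_classes)
    one_hot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = activations @ weights + bias
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad_logits = (probs - one_hot) / n
        weights -= lr * (activations.T @ grad_logits + l2 * weights)
        bias -= lr * grad_logits.sum(axis=0)
    return weights, bias

def rank_neurons(weights):
    """Order neurons from most to least salient by total absolute probe weight."""
    return np.argsort(-np.abs(weights).sum(axis=1))

# Synthetic stand-in for extracted activations: 400 tokens, 16 "neurons".
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=400)
activations = rng.normal(size=(400, 16))
activations[:, 5] += 3.0 * labels  # by construction, neuron 5 encodes the label

weights, bias = train_linear_probe(activations, labels, num_classes=2)
predictions = (activations @ weights + bias).argmax(axis=1)
accuracy = float((predictions == labels).mean())
top_neurons = rank_neurons(weights)

print("probe accuracy:", accuracy)
print("most salient neuron:", int(top_neurons[0]))
```

In a real setting the activations would come from a trained model and the labels from an annotated linguistic task; the ranking then suggests which neurons carry information about that task.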

Active Collaborations

Past Collaborations

Projects

  • NeuroX Toolkit

    A Python library that encapsulates various methods for neuron interpretation and analysis, geared towards Deep NLP models. The library is a one-stop shop for activation extraction, probe training, clustering analysis, neuron selection and more.

  • Model Explorer

    A GUI toolkit that provides several methods to identify salient neurons with respect to a model itself or an external task. It supports visualization, ablation, and manipulation of neurons within a given model.

    Try the Demo

  • ConceptX

    Explore latent concepts learned by a trained neural network model like BERT.

    Coming Soon!

  • ExplainMyPredictions

    Coming Soon!

  • Policy Police

    Coming Soon!
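
Neuron ablation, mentioned above for Model Explorer, can be sketched in a few lines: set a neuron's activation to zero and measure how much a downstream classifier's accuracy drops. The helper names `ablate` and `accuracy`, the hand-built classifier, and the synthetic data are assumptions for illustration only and do not reflect any of the tools listed here.

```python
import numpy as np

def accuracy(activations, labels, weights, bias):
    """Accuracy of a linear classifier applied to the given activations."""
    predictions = (activations @ weights + bias).argmax(axis=1)
    return float((predictions == labels).mean())

def ablate(activations, neuron_indices):
    """Return a copy of the activations with the chosen neurons set to zero."""
    ablated = activations.copy()
    ablated[:, neuron_indices] = 0.0
    return ablated

# Synthetic activations: 500 examples, 8 "neurons"; neuron 2 separates the labels.
rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=500)
activations = rng.normal(size=(500, 8))
activations[:, 2] += 4.0 * (labels - 0.5)

# Hypothetical classifier that reads only neuron 2.
weights = np.zeros((8, 2))
weights[2] = [-1.0, 1.0]
bias = np.zeros(2)

baseline = accuracy(activations, labels, weights, bias)
drops = {
    n: baseline - accuracy(ablate(activations, [n]), labels, weights, bias)
    for n in range(activations.shape[1])
}
most_important = max(drops, key=drops.get)

print("baseline accuracy:", baseline)
print("neuron with the largest accuracy drop:", most_important)
```

Neurons whose removal causes the largest drop are the ones the classifier depends on most; ablating a neuron the classifier never reads leaves accuracy unchanged.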

Publications

[1] Nadir Durrani, Hassan Sajjad, and Fahim Dalvi. How transfer learning impacts linguistic knowledge in deep NLP models? In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 4947--4957, Online, August 2021. Association for Computational Linguistics. [ bib | DOI | http ]
[2] Hassan Sajjad, Narine Kokhlikyan, Fahim Dalvi, and Nadir Durrani. Fine-grained interpretation and causation analysis in deep NLP models. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorials, pages 5--10, Online, June 2021. Association for Computational Linguistics. [ bib | DOI | http ]
[3] Shammur Absar Chowdhury, Nadir Durrani, and Ahmed Ali. What do end-to-end speech models learn about speaker, language and channel information? A layer-wise and neuron-level analysis, 2021. [ bib | arXiv ]
[4] Hassan Sajjad, Firoj Alam, Fahim Dalvi, and Nadir Durrani. Effect of post-processing on contextualized word representations, 2021. [ bib | arXiv ]
[5] Hassan Sajjad, Nadir Durrani, and Fahim Dalvi. Neuron-level interpretation of deep NLP models: A survey, 2021. [ bib | arXiv ]
[6] Fahim Dalvi, Hassan Sajjad, Nadir Durrani, and Yonatan Belinkov. Analyzing redundancy in pretrained transformer models. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4908--4926, Online, November 2020. Association for Computational Linguistics. [ bib | DOI | http ]
[7] Nadir Durrani, Hassan Sajjad, Fahim Dalvi, and Yonatan Belinkov. Analyzing individual neurons in pre-trained language models. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4865--4880, Online, November 2020. Association for Computational Linguistics. [ bib | DOI | http ]
[8] John Wu, Yonatan Belinkov, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, and James Glass. Similarity analysis of contextual word representation models. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4638--4655, Online, July 2020. Association for Computational Linguistics. [ bib | DOI | http ]
[9] Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, and James Glass. On the linguistic representational power of neural machine translation models. Computational Linguistics, 46(1):1--52, March 2020. [ bib | DOI | http ]
[10] Hassan Sajjad, Fahim Dalvi, Nadir Durrani, and Preslav Nakov. Poor man's BERT: smaller and faster transformer models. CoRR, abs/2004.03844, 2020. [ bib | arXiv | http ]
[11] Nadir Durrani, Fahim Dalvi, Hassan Sajjad, Yonatan Belinkov, and Preslav Nakov. One size does not fit all: Comparing NMT representations of different granularities. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 1504--1516, Minneapolis, Minnesota, June 2019. Association for Computational Linguistics. [ bib | DOI | http ]
[12] Fahim Dalvi, Avery Nortonsmith, D. Anthony Bau, Yonatan Belinkov, Hassan Sajjad, Nadir Durrani, and James Glass. NeuroX: A toolkit for analyzing individual neurons in neural networks. In AAAI Conference on Artificial Intelligence (AAAI), January 2019. [ bib | http ]
[13] Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Yonatan Belinkov, D. Anthony Bau, and James Glass. What is one grain of sand in the desert? Analyzing individual neurons in deep NLP models. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI, Oral presentation), January 2019. [ bib | http ]
[14] Anthony Bau, Yonatan Belinkov, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, and James Glass. Identifying and controlling important neurons in neural machine translation. In International Conference on Learning Representations, 2019. [ bib | http ]
[15] Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Yonatan Belinkov, and Stephan Vogel. Understanding and improving morphological learning in the neural machine translation decoder. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 142--151, Taipei, Taiwan, November 2017. Asian Federation of Natural Language Processing. [ bib | http ]
[16] Yonatan Belinkov, Lluís Màrquez, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, and James Glass. Evaluating layers of representation in neural machine translation on part-of-speech and semantic tagging tasks. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1--10, Taipei, Taiwan, November 2017. Asian Federation of Natural Language Processing. [ bib | http ]
[17] Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, and James Glass. What do Neural Machine Translation Models Learn about Morphology? In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL), Vancouver, July 2017. Association for Computational Linguistics. [ bib | .pdf ]

Media Coverage

Several works from these projects have received coverage in science media.

Team

Core Team

Hassan Sajjad, Senior Scientist, Qatar Computing Research Institute
Nadir Durrani, Senior Scientist, Qatar Computing Research Institute
Fahim Dalvi, Software Engineer, Qatar Computing Research Institute

Collaborators

Abdul Rafae Khan, Postdoctoral Researcher, Stevens Institute of Technology
Ahmed Abdelali, Senior Software Engineer, Qatar Computing Research Institute
Firoj Alam, Scientist, Qatar Computing Research Institute
Jia Xu, Assistant Professor, Stevens Institute of Technology

Past Collaborators

Anthony Bau, Undergraduate Student, MIT CSAIL
James Glass, Senior Research Scientist, MIT CSAIL
Narine Kokhlikyan, Software Engineer, Facebook AI
Yonatan Belinkov, Postdoctoral Researcher, MIT and Harvard University