- Knowledge-Grounded Natural Language Recommendation Explanation. Anthony M. Colas, Jun Araki, Zhengyu Zhou, Bingqing Wang, Zhe Feng. 1-15 [doi]
- Emergent Linear Representations in World Models of Self-Supervised Sequence Models. Neel Nanda, Andrew Lee, Martin Wattenberg. 16-30 [doi]
- Explaining Data Patterns in Natural Language with Language Models. Chandan Singh, John X. Morris, Jyoti Aneja, Alexander M. Rush, Jianfeng Gao. 31-55 [doi]
- Probing Quantifier Comprehension in Large Language Models: Another Example of Inverse Scaling. Akshat Gupta. 56-64 [doi]
- Disentangling the Linguistic Competence of Privacy-Preserving BERT. Stefan Arnold, Nils Kemmerzell, Annika Schreiner. 65-75 [doi]
- "Honey, Tell Me What's Wrong", Global Explanation of Textual Discriminative Models through Cooperative Generation. Antoine Chaffin, Julien Delaunay. 76-88 [doi]
- Self-Consistency of Large Language Models under Ambiguity. Henning Bartsch, Ole Jørgensen, Domenic Rosati, Jason Hoelscher-Obermaier, Jacob Pfau. 89-105 [doi]
- Character-Level Chinese Backpack Language Models. Hao Sun, John Hewitt. 106-119 [doi]
- Unveiling Multilinguality in Transformer Models: Exploring Language Specificity in Feed-Forward Networks. Sunit Bhattacharya, Ondrej Bojar. 120-126 [doi]
- Why Bother with Geometry? On the Relevance of Linear Decompositions of Transformer Embeddings. Timothee Mickus, Raúl Vázquez. 127-141 [doi]
- Investigating Semantic Subspaces of Transformer Sentence Embeddings through Linear Structural Probing. Dmitry Nikolaev, Sebastian Padó. 142-154 [doi]
- Causal Abstraction for Chain-of-Thought Reasoning in Arithmetic Word Problems. Juanhe (TJ) Tan. 155-168 [doi]
- Enhancing Interpretability Using Human Similarity Judgements to Prune Word Embeddings. Natalia Flechas Manrique, Wanqian Bao, Aurélie Herbelot, Uri Hasson. 169-179 [doi]
- When Your Language Model Cannot Even Do Determiners Right: Probing for Anti-Presuppositions and the Maximize Presupposition! Principle. Judith Sieker, Sina Zarrieß. 180-198 [doi]
- Introducing VULCAN: A Visualization Tool for Understanding Our Models and Data by Example. Jonas Groschwitz. 199-211 [doi]
- The Self-Contained Negation Test Set. David Kletz, Pascal Amsili, Marie Candito. 212-221 [doi]
- Investigating the Effect of Discourse Connectives on Transformer Surprisal: Language Models Understand Connectives, Even So They Are Surprised. Yan Cong, Emmanuele Chersoni, Yu-Yin Hsu, Philippe Blache. 222-232 [doi]
- METAPROBE: A Representation- and Task-Agnostic Probe. Yichu Zhou, Vivek Srikumar. 233-249 [doi]
- How Much Consistency Is Your Accuracy Worth? Jacob K. Johnson, Ana Marasovic. 250-260 [doi]
- Investigating the Encoding of Words in BERT's Neurons Using Feature Textualization. Tanja Baeumel, Soniya Vijayakumar, Josef van Genabith, Guenter Neumann, Simon Ostermann. 261-270 [doi]
- Evaluating Transformer's Ability to Learn Mildly Context-Sensitive Languages. Shunjie Wang, Shane Steinert-Threlkeld. 271-283 [doi]
- Layered Bias: Interpreting Bias in Pretrained Large Language Models. Nirmalendu Prakash, Roy Ka-Wei Lee. 284-295 [doi]
- Not Wacky vs. Definitely Wacky: A Study of Scalar Adverbs in Pretrained Language Models. Isabelle Lorge, Janet B. Pierrehumbert. 296-316 [doi]
- Rigorously Assessing Natural Language Explanations of Neurons. Jing Huang, Atticus Geiger, Karel D'Oosterlinck, Zhengxuan Wu, Christopher Potts. 317-331 [doi]
- NPIs Aren't Exactly Easy: Variation in Licensing across Large Language Models. Deanna DeCarlo, William Palmer, Michael Wilson, Bob Frank. 332-341 [doi]
- Memory Injections: Correcting Multi-Hop Reasoning Failures During Inference in Transformer-Based Language Models. Mansi Sakarvadia, Aswathy Ajith, Arham Khan, Daniel Grzenda, Nathaniel Hudson, André Bauer, Kyle Chard, Ian T. Foster. 342-356 [doi]
- Systematic Generalization by Finetuning? Analyzing Pretrained Language Models Using Constituency Tests. Aishik Chakraborty, Jackie C. K. Cheung, Timothy J. O'Donnell. 357-366 [doi]
- On Quick Kisses and How to Make Them Count: A Study on Event Construal in Light Verb Constructions with BERT. Chenxin Liu, Emmanuele Chersoni. 367-378 [doi]
- Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model. Abhijith Chintam, Rahel Beloch, Willem H. Zuidema, Michael Hanna, Oskar van der Wal. 379-394 [doi]