- Knowledge-Grounded Natural Language Recommendation Explanation. Anthony M. Colas, Jun Araki, Zhengyu Zhou, Bingqing Wang, Zhe Feng. 1-15 [doi]
- Emergent Linear Representations in World Models of Self-Supervised Sequence Models. Neel Nanda, Andrew Lee, Martin Wattenberg. 16-30 [doi]
- Explaining Data Patterns in Natural Language with Language Models. Chandan Singh, John X. Morris, Jyoti Aneja, Alexander M. Rush, Jianfeng Gao. 31-55 [doi]
- Probing Quantifier Comprehension in Large Language Models: Another Example of Inverse Scaling. Akshat Gupta. 56-64 [doi]
- Disentangling the Linguistic Competence of Privacy-Preserving BERT. Stefan Arnold, Nils Kemmerzell, Annika Schreiner. 65-75 [doi]
- "Honey, Tell Me What's Wrong", Global Explanation of Textual Discriminative Models through Cooperative Generation. Antoine Chaffin, Julien Delaunay. 76-88 [doi]
- Self-Consistency of Large Language Models under Ambiguity. Henning Bartsch, Ole Jørgensen, Domenic Rosati, Jason Hoelscher-Obermaier, Jacob Pfau. 89-105 [doi]
- Character-Level Chinese Backpack Language Models. Hao Sun, John Hewitt. 106-119 [doi]
- Unveiling Multilinguality in Transformer Models: Exploring Language Specificity in Feed-Forward Networks. Sunit Bhattacharya, Ondrej Bojar. 120-126 [doi]
- Why Bother with Geometry? On the Relevance of Linear Decompositions of Transformer Embeddings. Timothee Mickus, Raúl Vázquez. 127-141 [doi]
- Investigating Semantic Subspaces of Transformer Sentence Embeddings through Linear Structural Probing. Dmitry Nikolaev, Sebastian Padó. 142-154 [doi]
- Causal Abstraction for Chain-of-Thought Reasoning in Arithmetic Word Problems. Juanhe (TJ) Tan. 155-168 [doi]
- Enhancing Interpretability Using Human Similarity Judgements to Prune Word Embeddings. Natalia Flechas Manrique, Wanqian Bao, Aurélie Herbelot, Uri Hasson. 169-179 [doi]
- When Your Language Model Cannot Even Do Determiners Right: Probing for Anti-Presuppositions and the Maximize Presupposition! Principle. Judith Sieker, Sina Zarrieß. 180-198 [doi]
- Introducing VULCAN: A Visualization Tool for Understanding Our Models and Data by Example. Jonas Groschwitz. 199-211 [doi]
- The Self-Contained Negation Test Set. David Kletz, Pascal Amsili, Marie Candito. 212-221 [doi]
- Investigating the Effect of Discourse Connectives on Transformer Surprisal: Language Models Understand Connectives, Even So They Are Surprised. Yan Cong, Emmanuele Chersoni, Yu-Yin Hsu, Philippe Blache. 222-232 [doi]
- METAPROBE: A Representation- and Task-Agnostic Probe. Yichu Zhou, Vivek Srikumar. 233-249 [doi]
- How Much Consistency Is Your Accuracy Worth? Jacob K. Johnson, Ana Marasovic. 250-260 [doi]
- Investigating the Encoding of Words in BERT's Neurons Using Feature Textualization. Tanja Baeumel, Soniya Vijayakumar, Josef van Genabith, Guenter Neumann, Simon Ostermann. 261-270 [doi]
- Evaluating Transformer's Ability to Learn Mildly Context-Sensitive Languages. Shunjie Wang, Shane Steinert-Threlkeld. 271-283 [doi]
- Layered Bias: Interpreting Bias in Pretrained Large Language Models. Nirmalendu Prakash, Roy Ka-Wei Lee. 284-295 [doi]
- Not Wacky vs. Definitely Wacky: A Study of Scalar Adverbs in Pretrained Language Models. Isabelle Lorge, Janet B. Pierrehumbert. 296-316 [doi]
- Rigorously Assessing Natural Language Explanations of Neurons. Jing Huang, Atticus Geiger, Karel D'Oosterlinck, Zhengxuan Wu, Christopher Potts. 317-331 [doi]
- NPIs Aren't Exactly Easy: Variation in Licensing across Large Language Models. Deanna DeCarlo, William Palmer, Michael Wilson, Bob Frank. 332-341 [doi]
- Memory Injections: Correcting Multi-Hop Reasoning Failures During Inference in Transformer-Based Language Models. Mansi Sakarvadia, Aswathy Ajith, Arham Khan, Daniel Grzenda, Nathaniel Hudson, André Bauer, Kyle Chard, Ian T. Foster. 342-356 [doi]
- Systematic Generalization by Finetuning? Analyzing Pretrained Language Models Using Constituency Tests. Aishik Chakraborty, Jackie C. K. Cheung, Timothy J. O'Donnell. 357-366 [doi]
- On Quick Kisses and How to Make Them Count: A Study on Event Construal in Light Verb Constructions with BERT. Chenxin Liu, Emmanuele Chersoni. 367-378 [doi]
- Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model. Abhijith Chintam, Rahel Beloch, Willem H. Zuidema, Michael Hanna, Oskar van der Wal. 379-394 [doi]