- Complete Chess Games Enable LLM Become A Chess Master. Yinqi Zhang, Xintian Han, Haolong Li, Kedi Chen, Shaohui Lin. 1-7
- Predicting the Target Word of Game-playing Conversations using a Low-Rank Dialect Adapter for Decoder Models. Dipankar Srirag, Aditya Joshi, Jacob Eisenstein. 8-17
- ChaI-TeA: A Benchmark for Evaluating Autocompletion of Interactions with LLM-based Chatbots. Shani Goren, Oren Kalinsky, Tomer Stav, Yuri Rapoport, Yaron Fairstein, Ram Yazdi, Nachshon Cohen, Alexander Libov, Guy Kushilevitz. 18-32
- Cross-Lingual Transfer Learning for Speech Translation. Rao Ma, Mengjie Qian, Yassir Fathullah, Siyuan Tang, Mark J. F. Gales, Kate M. Knill. 33-43
- Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can't Answer? Nishant Balepur, Feng Gu, Abhilasha Ravichander, Shi Feng 0005, Jordan Lee Boyd-Graber, Rachel Rudinger. 44-64
- Personalized Help for Optimizing Low-Skilled Users' Strategy. Feng Gu, Wichayaporn Wongkamjan, Jordan Lee Boyd-Graber, Jonathan K. Kummerfeld, Denis Peskoff, Jonathan May. 65-74
- Local Prompt Optimization. Yash Jain, Vishal Chowdhary. 75-81
- Cross-lingual Transfer of Reward Models in Multilingual Alignment. Jiwoo Hong, Noah Lee, Rodrigo Martínez-Castaño, César Rodríguez, James Thorne. 82-94
- Inference-Time Selective Debiasing to Enhance Fairness in Text Classification Models. Gleb Kuzmin, Neemesh Yadav, Ivan V. Smirnov, Timothy Baldwin, Artem Shelmanov. 95-107
- Automatic Evaluation of Healthcare LLMs Beyond Question-Answering. Anna Arias-Duart, Pablo Agustin Martin-Torres, Daniel Hinjos, Pablo Bernabeu-Perez, Lucia Urcelay-Ganzabal, Marta Gonzalez-Mallo 0001, Ashwin Kumar Gururajan, Enrique Lopez-Cuena, Sergio Álvarez-Napagao, Dario Garcia-Gasulla. 108-130
- STRUX: An LLM for Decision-Making with Structured Explanations. Yiming Lu, Yebowen Hu, Hassan Foroosh, Wei Jin, Fei Liu 0004. 131-141
- Improving Vietnamese-English Cross-Lingual Retrieval for Legal and General Domains. Toan Ngoc Nguyen, Nam Le Hai, Nguyen Doan Hieu, Dai An Nguyen, Linh Ngo Van 0001, Thien Huu Nguyen, Sang Dinh. 142-153
- Computational Discovery of Chiasmus in Ancient Religious Text. Hope McGovern, Hale Sirin, Tom Lippincott. 154-160
- Characterizing the Effects of Translation on Intertextuality using Multilingual Embedding Spaces. Hope McGovern, Hale Sirin, Tom Lippincott. 161-167
- LLM2: Let Large Language Models Harness System 2 Reasoning. Cheng Yang 0007, Chufan Shi, Siheng Li, Bo Shui, Yujiu Yang, Wai Lam. 168-177
- Context-Efficient Retrieval with Factual Decomposition. Yanhong Li, David Yunis, David McAllester, Jiawei Zhou. 178-194
- Sports and Women's Sports: Gender Bias in Text Generation with Olympic Data. Laura Biester. 195-205
- Alligators All Around: Mitigating Lexical Confusion in Low-resource Machine Translation. Elizabeth Nielsen, Isaac Caswell, Jiaming Luo, Colin Cherry. 206-221
- PROM: Pivoted and Regulated Optimization for Multilingual Instruction Learning. Jaeseong Lee 0002, Seung-won Hwang, Hojin Lee 0006, Yunju Bak, Changmin Lee. 222-228
- Concept-Reversed Winograd Schema Challenge: Evaluating and Improving Robust Reasoning in Large Language Models via Abstraction. Kaiqiao Han, Tianqing Fang, Zhaowei Wang 0003, Yangqiu Song, Mark Steedman. 229-243
- Defense against Prompt Injection Attacks via Mixture of Encodings. Ruiyi Zhang, David Sullivan, Kyle Jackson, Pengtao Xie, Mei Chen. 244-252
- Watching the AI Watchdogs: A Fairness and Robustness Analysis of AI Safety Moderation Classifiers. Akshit Achara, Anshuman Chhabra. 253-264
- CoRAG: Collaborative Retrieval-Augmented Generation. Aashiq Muhamed, Mona T. Diab, Virginia Smith. 265-276
- Is It Navajo? Accurate Language Detection for Endangered Athabaskan Languages. Ivory Yang, Weicheng Ma, Chunhui Zhang, Soroush Vosoughi. 277-284
- Don't Touch My Diacritics. Kyle Gorman, Yuval Pinter. 285-291
- Pretrained Image-Text Models are Secretly Video Captioners. Chunhui Zhang, Yiren Jian, Zhongyu Ouyang, Soroush Vosoughi. 292-305
- Reverse Modeling in Large Language Models. Sicheng Yu, Yuanchen Xu, Cunxiao Du, Yanying Zhou, Minghui Qiu, Qianru Sun, Hao Zhang, Jiawei Wu. 306-320
- Preserving Multilingual Quality While Tuning Query Encoder on English Only. Oleg Vasilyev, Randy Sawaya, John Bohannon. 321-341
- Using Contextually Aligned Online Reviews to Measure LLMs' Performance Disparities Across Language Varieties. Zixin Tang, Chieh-Yang Huang, Tsung-Chi Li, Ho Yin Sam Ng, Hen-Hsen Huang, Ting-Hao Kenneth Huang. 342-355
- Towards Federated Low-Rank Adaptation of Language Models with Rank Heterogeneity. Yuji Byun, Jaeho Lee. 356-362
- Related Knowledge Perturbation Matters: Rethinking Multiple Pieces of Knowledge Editing in Same-Subject. Zenghao Duan, Wenbin Duan, Zhiyi Yin, Yinghan Shen, Shaoling Jing, Jie Zhang, Huawei Shen, Xueqi Cheng. 363-373
- STEP: Staged Parameter-Efficient Pre-training for Large Language Models. Kazuki Yano, Takumi Ito, Jun Suzuki 0001. 374-384
- Language Models Encode Numbers Using Digit Representations in Base 10. Amit A. Levy, Mor Geva. 385-395
- A Systematic Study of Cross-Layer KV Sharing for Efficient LLM Inference. You Wu, Haoyi Wu, Kewei Tu. 396-403
- AMPS: ASR with Multimodal Paraphrase Supervision. Abhishek Gupta, Amruta Parulekar, Sameep Chattopadhyay, Preethi Jyothi. 404-413
- Taxi1500: A Dataset for Multilingual Text Classification in 1500 Languages. Chunlan Ma, Ayyoob Imani, Haotian Ye, Renhao Pei, Ehsaneddin Asgari, Hinrich Schütze. 414-439
- GameTox: A Comprehensive Dataset and Analysis for Enhanced Toxicity Detection in Online Gaming Communities. Usman Naseem, Shuvam Shiwakoti, Siddhant Bikram Shah, Surendrabikram Thapa, Qi Zhang 0020. 440-447
- FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs. Forrest Sheng Bao, Miaoran Li, Renyi Qu, Ge Luo 0002, Erana Wan, Yujia Tang, Weisi Fan, Manveer Singh Tamber, Suleman Kazi, Vivek Sourabh, Mike Qi, Ruixuan Tu, Chenyu Xu, Matthew Gonzales, Ofer Mendelevitch, Amin Ahmad. 448-461
- Debate-Feedback: A Multi-Agent Framework for Efficient Legal Judgment Prediction. Xi Chen, Mao Mao, Shuo Li, Haotian Shangguan. 462-470
- Great Memory, Shallow Reasoning: Limits of kNN-LMs. Shangyi Geng, Wenting Zhao, Alexander M. Rush. 471-482
- Repetition Neurons: How Do Language Models Produce Repetitions? Tatsuya Hiraoka, Kentaro Inui. 483-495
- STAR: Spectral Truncation and Rescale for Model Merging. Yu-Ang Lee, Ching Yun Ko, Tejaswini Pedapati, I-Hsin Chung, Mi-Yen Yeh, Pin-Yu Chen. 496-505
- Task-driven Layerwise Additive Activation Intervention. Hieu Trung Nguyen, Bao Nguyen, Binh Nguyen, Viet Anh Nguyen. 506-513
- Scaling Multi-Document Event Summarization: Evaluating Compression vs. Full-Text Approaches. Adithya Pratapa, Teruko Mitamura. 514-528
- Black-Box Visual Prompt Engineering for Mitigating Object Hallucination in Large Vision Language Models. Sangmin Woo, Kang Zhou, Yun Zhou, Shuai Wang, Sheng Guan, Haibo Ding, Lin Lee Cheong. 529-538
- A Layered Debating Multi-Agent System for Similar Disease Diagnosis. Yutian Zhao, Huimin Wang, Yefeng Zheng 0001, Xian Wu 0001. 539-549
- The Geometry of Numerical Reasoning: Language Models Compare Numeric Properties in Linear Subspaces. Ahmed Oumar El-Shangiti, Tatsuya Hiraoka, Hilal AlQuabeh, Benjamin Heinzerling, Kentaro Inui. 550-561
- AlignFreeze: Navigating the Impact of Realignment on the Layers of Multilingual Models Across Diverse Languages. Steve Bakos, David Guzmán, Riddhi More, Kelly Chutong Li, Félix Gaschi, En-Shiun Annie Lee. 562-586
- FLIQA-AD: a Fusion Model with Large Language Model for Better Diagnose and MMSE Prediction of Alzheimer's Disease. Junhao Chen, Zhiyuan Ding, Yan Liu 0054, Xiangzhu Zeng, Ling Wang 0013. 587-594
- Transform Retrieval for Textual Entailment in RAG. Quan Guo, Xin Liang. 595-599
- How do Multimodal Foundation Models Encode Text and Speech? An Analysis of Cross-Lingual and Cross-Modal Representations. Hyunji Lee, Danni Liu, Supriti Sinhamahapatra, Jan Niehues. 600-610
- Explore the Reasoning Capability of LLMs in the Chess Testbed. Shu Wang 0002, Lei Ji 0001, Renxi Wang, Wenxiao Zhao, Haokun Liu, Yifan Hou, Ying Nian Wu. 611-622
- Auto-Cypher: Improving LLMs on Cypher generation via LLM-supervised generation-verification framework. Aman Tiwari, Shiva Krishna Reddy Malay, Vikas Yadav, Masoud Hashemi, Sathwik Tejaswi Madhusudhan. 623-640
- Leveraging Moment Injection for Enhanced Semi-supervised Natural Language Inference with Large Language Models. Seo-Yeon Park. 641-648
- A Fair Comparison without Translationese: English vs. Target-language Instructions for Multilingual LLMs. Taisei Enomoto, Hwichan Kim, Zhousi Chen, Mamoru Komachi. 649-670
- Evaluating Multimodal Generative AI with Korean Educational Standards. Sanghee Park, Geewook Kim. 671-688
- ScratchEval: Are GPT-4o Smarter than My Child? Evaluating Large Multimodal Models with Visual Programming Challenges. Rao Fu, Ziyang Luo, Hongzhan Lin 0001, Zhen Ye, Jing Ma 0004. 689-699
- Interpret and Control Dense Retrieval with Sparse Latent Features. Hao Kang, Tevin Wang, Chenyan Xiong. 700-709
- DART: An AIGT Detector using AMR of Rephrased Text. Hyeonchu Park, Byungjun Kim, Bugeun Kim. 710-721
- Scaling Graph-Based Dependency Parsing with Arc Vectorization and Attention-Based Refinement. Nicolas Floquet, Joseph Le Roux, Nadi Tomeh, Thierry Charnois. 722-734
- Language Models "Grok" to Copy. Ang Lv, Ruobing Xie, Xingwu Sun, Zhanhui Kang, Rui Yan 0001. 735-741
- Evaluating LLMs for Quotation Attribution in Literary Texts: A Case Study of LLaMa3. Gaspard Michel, Elena V. Epure, Romain Hennequin, Christophe Cerisara. 742-755
- Beyond Literal Token Overlap: Token Alignability for Multilinguality. Katharina Hämmerl, Tomasz Limisiewicz, Jindrich Libovický, Alexander Fraser 0001. 756-767
- IdentifyMe: A Challenging Long-Context Mention Resolution Benchmark for LLMs. Kawshik Manikantan, Makarand Tapaswi, Vineet Gandhi, Shubham Toshniwal. 768-777
- kNN Retrieval for Simple and Effective Zero-Shot Multi-speaker Text-to-Speech. Karl El Hajal, Ajinkya Kulkarni, Enno Hermann, Mathew Magimai-Doss. 778-786
- CORD: Balancing COnsistency and Rank Distillation for Robust Retrieval-Augmented Generation. Youngwon Lee 0003, Seung-won Hwang, Daniel F. Campos, Filip Gralinski, Zhewei Yao, Yuxiong He. 787-796
- GraphLSS: Integrating Lexical, Structural, and Semantic Features for Long Document Extractive Summarization. Margarita Bugueño, Hazem Abou Hamdan, Gerard de Melo. 797-804
- Step-by-Step Fact Verification System for Medical Claims with Explainable Reasoning. Juraj Vladika, Ivana Hacajová, Florian Matthes. 805-816
- Developing multilingual speech synthesis system for Ojibwe, Mi'kmaq, and Maliseet. Shenran Wang, Changbing Yang, Mike Parkhill, Chad Quinn, Christopher Hammerly, Jian Zhu. 817-826
- Bottom-Up Synthesis of Knowledge-Grounded Task-Oriented Dialogues with Iteratively Self-Refined Prompts. Kun Qian 0016, Maximillian Chen, Siyan Li, Arpit Sharma 0001, Zhou Yu 0005. 827-844
- Sociodemographic Prompting is Not Yet an Effective Approach for Simulating Subjective Judgments with LLMs. Huaman Sun, Jiaxin Pei, Minje Choi, David Jurgens. 845-854
- Identifying Power Relations in Conversations using Multi-Agent Social Reasoning. Zhaoqing Wu, Dan Goldwasser, Maria Leonor Pacheco, Leora Morgenstern. 855-865
- Examining Spanish Counseling with MIDAS: a Motivational Interviewing Dataset in Spanish. Aylin Gunal, Bowen Yi, John Piette, Rada Mihalcea, Verónica Pérez-Rosas. 866-872
- Self-Debiasing Large Language Models: Zero-Shot Recognition and Reduction of Stereotypes. Isabel O. Gallegos, Ryan Aponte, Ryan A. Rossi, Joe Barrow, Md. Mehrab Tanjim, Tong Yu 0001, Hanieh Deilamsalehy, Ruiyi Zhang 0002, SungChul Kim, Franck Dernoncourt, Nedim Lipka, Deonna M. Owens, Jiuxiang Gu. 873-888
- EqualizeIR: Mitigating Linguistic Biases in Retrieval Models. Jiali Cheng, Hadi Amiri. 889-898
- Do Audio-Language Models Understand Linguistic Variations? Ramaneswaran Selvakumar, Sonal Kumar, Hemant Kumar Giri, Nishit Anand, Ashish Seth, Sreyan Ghosh, Dinesh Manocha. 899-913
- Giving the Old a Fresh Spin: Quality Estimation-Assisted Constrained Decoding for Automatic Post-Editing. Sourabh Dattatray Deoghare, Diptesh Kanojia, Pushpak Bhattacharyya. 914-925
- RuleR: Improving LLM Controllability by Rule-based Data Recycling. Ming Li, Han Chen, Chenguang Wang, Dang Nguyen, Dianqi Li, Tianyi Zhou 0001. 926-943
- MixRevDetect: Towards Detecting AI-Generated Content in Hybrid Peer Reviews. Sandeep Kumar, Samarth Garg, Sagnik Sengupta, Tirthankar Ghosal, Asif Ekbal. 944-953
- DiscoGraMS: Enhancing Movie Screen-Play Summarization using Movie Character-Aware Discourse Graph. Maitreya Prafulla Chitale, Uday Bindal, Rajakrishnan Rajkumar, Rahul Mishra. 954-965
- Capturing Human Cognitive Styles with Language: Towards an Experimental Evaluation Paradigm. Vasudha Varadarajan, Syeda Mahwish, Xiaoran Liu, Julia Buffolino, Christian C. Luhmann, Ryan L. Boyd, H. Andrew Schwartz. 966-979