- Frontmatter
- Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models. Zhengxin Zhang, Dan Zhao 0003, Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Qing Li 0006, Yong Jiang 0001, Zhihao Jia. 1-17
- Unsupervised Multimodal Clustering for Semantics Discovery in Multimodal Utterances. Hanlei Zhang, Hua Xu, Fei Long, Xin Wang, Kai Gao. 18-35
- MAGE: Machine-generated Text Detection in the Wild. Yafu Li, Qintong Li, Leyang Cui, Wei Bi, Zhilin Wang, Longyue Wang, Linyi Yang, Shuming Shi 0001, Yue Zhang 0004. 36-53
- PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models. Haoran Li 0003, Dadi Guo, Donghao Li, Wei Fan, Qi Hu, Xin Liu 0039, Chunkit Chan, Duanyi Yao, Yuan Yao, Yangqiu Song. 54-73
- GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators. Yuchen Hu, Chen Chen 0075, Chao-Han Huck Yang, Ruizhe Li 0001, Dong Zhang, Zhehuai Chen, Engsiong Chng. 74-90
- Exploring Chain-of-Thought for Multi-modal Metaphor Detection. Yanzhi Xu, Yueying Hua, Shichen Li, Zhongqing Wang. 91-101
- BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation. Dayou Du, Yijia Zhang, Shijie Cao, Jiaqi Guo, Ting Cao, Xiaowen Chu, Ningyi Xu. 102-116
- A Unified Temporal Knowledge Graph Reasoning Model Towards Interpolation and Extrapolation. Kai Chen, Ye Wang, Yitong Li, Aiping Li, Han Yu, Xin Song. 117-132
- Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation. Shicheng Xu, Liang Pang, Mo Yu, Fandong Meng, Huawei Shen, Xueqi Cheng, Jie Zhou 0016. 133-145
- CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers. Yong Hu, Fandong Meng, Jie Zhou 0016. 146-159
- Evaluating Dynamic Topic Models. Charu James, Mayank Nagda, Nooshin Haji Ghassemi, Marius Kloft, Sophie Fellenz. 160-176
- How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition. Guanting Dong, Hongyi Yuan, Keming Lu, Chengpeng Li, Mingfeng Xue, Dayiheng Liu, Wei Wang 0225, Zheng Yuan 0002, Chang Zhou, Jingren Zhou. 177-198
- Through the Lens of Split Vote: Exploring Disagreement, Difficulty and Calibration in Legal Case Outcome Classification. Shanshan Xu, T. Y. S. S. Santosh, Oana Ichim, Barbara Plank, Matthias Grabmair. 199-216
- Inference to the Best Explanation in Large Language Models. Dhairya Dalal, Marco Valentino, André Freitas, Paul Buitelaar. 217-235
- A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus. Eduard Poesina, Cornelia Caragea, Radu-Tudor Ionescu. 236-253
- MinPrompt: Graph-based Minimal Prompt Data Augmentation for Few-shot Question Answering. Xiusi Chen, Jyun-Yu Jiang, Wei-Cheng Chang, Cho-Jui Hsieh, Hsiang-Fu Yu, Wei Wang 0010. 254-266
- SportsMetrics: Blending Text and Numerical Data to Understand Information Fusion in LLMs. Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh, Dong Yu 0001, Fei Liu 0004. 267-278
- SciMON: Scientific Inspiration Machines Optimized for Novelty. Qingyun Wang 0005, Doug Downey, Heng Ji, Tom Hope. 279-299
- Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction. Yiren Jian, Tingkai Liu, Yunzhe Tao, Chunhui Zhang, Soroush Vosoughi, Hongxia Yang. 300-314
- Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models. Abhishek Kumar, Robert Morabito, Sanzhar Umbet, Jad Kabbara, Ali Emami. 315-334
- Retrieval-Augmented Multilingual Knowledge Editing. Weixuan Wang, Barry Haddow, Alexandra Birch. 335-354
- Picturing Ambiguity: A Visual Twist on the Winograd Schema Challenge. Brendan Park, Madeline Janecek, Naser Ezzati Jivan, Yifeng Li, Ali Emami. 355-374
- Subtle Biases Need Subtler Measures: Dual Metrics for Evaluating Representative and Affinity Bias in Large Language Models. Abhishek Kumar, Sarfaroz Yunusov, Ali Emami. 375-392
- Framing in the Presence of Supporting Data: A Case Study in U.S. Economic News. Alexandria Leto, Elliot Pickens, Coen D. Needell, David Rothschild, Maria Leonor Pacheco. 393-415
- Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences. Xiyao Wang, Yuhang Zhou, Xiaoyu Liu 0003, Hongjin Lu, Yuancheng Xu, Feihong He, Jaehong Yoon, Taixi Lu, Fuxiao Liu, Gedas Bertasius, Mohit Bansal, Huaxiu Yao, Furong Huang. 416-442
- TTM-RE: Memory-Augmented Document-Level Relation Extraction. Chufan Gao, Xuan Wang 0008, Jimeng Sun 0001. 443-458
- Answer is All You Need: Instruction-following Text Embedding via Answering the Question. Letian Peng, Yuwei Zhang 0001, Zilong Wang 0002, Jayanth Srinivasa, Gaowen Liu, Zihan Wang 0001, Jingbo Shang. 459-477
- Explore Spurious Correlations at the Concept Level in Language Models for Text Classification. Yuhang Zhou, Paiheng Xu, Xiaoyu Liu 0003, Bang An, Wei Ai 0002, Furong Huang. 478-492
- Every Answer Matters: Evaluating Commonsense with Probabilistic Measures. Qi Cheng, Michael Boratko, Pranay Kumar Yelugam, Tim O'Gorman, Nalini Singh, Andrew McCallum, Xiang Li 0069. 493-506
- GradSafe: Detecting Jailbreak Prompts for LLMs via Safety-Critical Gradient Analysis. Yueqi Xie, Minghong Fang, Renjie Pi, Neil Gong 0001. 507-518
- Pouring Your Heart Out: Investigating the Role of Figurative Language in Online Expressions of Empathy. Gyeongeun Lee, Christina Wong, Meghan Guo, Natalie Parde. 519-529
- An Information-Theoretic Approach to Analyze NLP Classification Tasks. Luran Wang, Mark J. F. Gales, Vatsal Raina. 530-551
- Can Your Model Tell a Negation from an Implicature? Unravelling Challenges With Intent Encoders. Yuwei Zhang, Siffi Singh, Sailik Sengupta, Igor Shalyminov, Hang Su, Hwanjun Song, Saab Mansour. 552-567
- Wav2Gloss: Generating Interlinear Glossed Text from Speech. Taiqi He, KwangHee Choi, Lindia Tjuatja, Nathaniel Robinson, Jiatong Shi, Shinji Watanabe 0001, Graham Neubig, David R. Mortensen, Lori S. Levin. 568-582
- Leveraging Codebook Knowledge with NLI and ChatGPT for Zero-Shot Political Relation Classification. Yibo Hu 0002, Erick Skorupa Parolin, Latifur Khan, Patrick T. Brandt, Javier Osorio, Vito D'Orazio. 583-603
- SPOR: A Comprehensive and Practical Evaluation Method for Compositional Generalization in Data-to-Text Generation. Ziyao Xu, Houfeng Wang. 604-621
- OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following. Haochen Shi, Zhiyuan Sun, Xingdi Yuan, Marc-Alexandre Côté, Bang Liu. 622-636
- Multimodal Instruction Tuning with Conditional Mixture of LoRA. Ying Shen 0001, Zhiyang Xu, Qifan Wang, Yu Cheng, Wenpeng Yin 0001, Lifu Huang. 637-648
- DocLens: Multi-aspect Fine-grained Medical Text Evaluation. Yiqing Xie, Sheng Zhang 0012, Hao Cheng 0002, Pengfei Liu, Zelalem Gero, Cliff Wong, Tristan Naumann, Hoifung Poon, Carolyn P. Rosé. 649-679
- FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability. Congying Xia, Chen Xing, Jiangshu Du, Xinyi Yang, Yihao Feng, Ran Xu, Wenpeng Yin 0001, Caiming Xiong. 680-699
- Hyper-CL: Conditioning Sentence Representations with Hypernetworks. Young Hyun Yoo, Jii Cha, Changhyeon Kim, Taeuk Kim. 700-711
- Analysis of Multi-Source Language Training in Cross-Lingual Transfer. Seong Hoon Lim, Taejun Yun, Jinhyeon Kim, Jihun Choi, Taeuk Kim. 712-725
- ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions. Sreyan Ghosh, Utkarsh Tyagi, Sonal Kumar, Chandra Kiran Reddy Evuru, Ramaneswaran S., S. Sakshi, Dinesh Manocha. 726-748
- The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants. Lucas Bandarkar, Davis Liang, Benjamin Muller, Mikel Artetxe, Satya Narayan Shukla, Donald Husa, Naman Goyal, Abhinandan Krishnan, Luke Zettlemoyer, Madian Khabsa. 749-775
- Learn from Failure: Fine-tuning LLMs with Trial-and-Error Data for Intuitionistic Propositional Logic Proving. Chenyang An, Zhibo Chen 0009, Qihao Ye, Emily First, Letian Peng, Jiayun Zhang, Zihan Wang 0001, Sorin Lerner, Jingbo Shang. 776-790
- Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach. Saehyung Lee, Sangwon Yu, Junsung Park, Jihun Yi, Sungroh Yoon. 791-809
- IMBUE: Improving Interpersonal Effectiveness through Simulation and Just-in-time Feedback with Human-Language Model Interaction. Inna W. Lin, Ashish Sharma 0004, Christopher Michael Rytting, Adam S. Miner, Jina Suh, Tim Althoff. 810-840
- Token-wise Influential Training Data Retrieval for Large Language Models. Huawei Lin, Jikai Long, Zhaozhuo Xu, Weijie Zhao 0001. 841-860
- Tree-of-Counterfactual Prompting for Zero-Shot Stance Detection. Maxwell A. Weinzierl, Sanda M. Harabagiu. 861-880
- VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks. Jing Yu Koh, Robert Lo, Lawrence Jang, Vikram Duvvur, Ming Chong Lim, Po-Yu Huang, Graham Neubig, Shuyan Zhou, Russ Salakhutdinov, Daniel Fried. 881-905
- FineSurE: Fine-grained Summarization Evaluation using LLMs. Hwanjun Song, Hang Su, Igor Shalyminov, Jason Cai, Saab Mansour. 906-922
- Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback. Daechul Ahn, Yura Choi, Youngjae Yu, Dongyeop Kang, Jonghyun Choi. 923-940
- Prompt Refinement with Image Pivot for Text-to-Image Generation. Jingtao Zhan, Qingyao Ai, Yiqun Liu 0001, Yingwei Pan, Ting Yao, Jiaxin Mao, Shaoping Ma, Tao Mei 0001. 941-954
- Striking Gold in Advertising: Standardization and Exploration of Ad Text Generation. Masato Mita, Soichiro Murakami, Akihiko Kato, Peinan Zhang. 955-972
- AbsInstruct: Eliciting Abstraction Ability from LLMs through Explanation Tuning with Plausibility Estimation. Zhaowei Wang 0003, Wei Fan, Qing Zong, Hongming Zhang 0009, Sehyun Choi, Tianqing Fang, Xin Liu 0039, Yangqiu Song, Ginny Y. Wong, Simon See. 973-994
- Reflect-RL: Two-Player Online RL Fine-Tuning for LMs. Runlong Zhou, Simon S. Du, Beibin Li. 995-1015
- Can ChatGPT's Performance be Improved on Verb Metaphor Detection Tasks? Bootstrapping and Combining Tacit Knowledge. Cheng Yang, Puli Chen, Qingbao Huang. 1016-1027
- Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning. Zhaorui Yang, Tianyu Pang, Haozhe Feng, Han Wang, Wei Chen, Minfeng Zhu, Qian Liu. 1028-1043
- An Information Bottleneck Perspective for Effective Noise Filtering on Retrieval-Augmented Generation. Kun Zhu 0025, Xiaocheng Feng, Xiyuan Du, Yuxuan Gu, Weijiang Yu, Haotian Wang 0007, Qianglong Chen, Zheng Chu, Jingchang Chen, Bing Qin 0001. 1044-1069
- RORA: Robust Free-Text Rationale Evaluation. Zhengping Jiang, Yining Lu, Hanjie Chen, Daniel Khashabi, Benjamin Van Durme, Anqi Liu. 1070-1087
- Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents. Cheng Qian, Bingxiang He, Zhong Zhuang, Jia Deng, Yujia Qin, Xin Cong, Zhong Zhang, Jie Zhou, Yankai Lin, Zhiyuan Liu 0001, Maosong Sun 0001. 1088-1113
- InstructProtein: Aligning Human and Protein Language via Knowledge Instruction. Zeyuan Wang, Qiang Zhang, Keyan Ding, Ming Qin, Xiang Zhuang, Xiaotong Li, Huajun Chen. 1114-1136
- ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models. Aparna Elangovan, Ling Liu, Lei Xu, Sravan Babu Bodapati, Dan Roth. 1137-1160
- Linguistically Conditioned Semantic Textual Similarity. Jingxuan Tu, Keer Xu, Liulu Yue, Bingyang Ye, Kyeongmin Rim, James Pustejovsky. 1161-1172
- Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future. Zheng Chu, Jingchang Chen, Qianglong Chen, Weijiang Yu, Tao He, Haotian Wang 0007, Weihua Peng, Ming Liu 0004, Bing Qin 0001, Ting Liu 0001. 1173-1203
- TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models. Zheng Chu, Jingchang Chen, Qianglong Chen, Weijiang Yu, Haotian Wang 0007, Ming Liu 0004, Bing Qin 0001. 1204-1228
- BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering. Zheng Chu, Jingchang Chen, Qianglong Chen, Haotian Wang 0007, Kun Zhu 0025, Xiyuan Du, Weijiang Yu, Ming Liu 0004, Bing Qin 0001. 1229-1248
- ANALOGYKB: Unlocking Analogical Reasoning of Language Models with A Million-scale Knowledge Base. Siyu Yuan, Jiangjie Chen, Changzhi Sun, Jiaqing Liang, Yanghua Xiao, Deqing Yang. 1249-1265
- TaSL: Continual Dialog State Tracking via Task Skill Localization and Consolidation. Yujie Feng, Xu Chu, Yongxin Xu, Guangyuan Shi, Bo Liu 0049, Xiao-Ming Wu 0003. 1266-1279
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models. Damai Dai, Chengqi Deng, Chenggang Zhao, R. X. Xu, Huazuo Gao, Deli Chen, Jiashi Li, Wangding Zeng, Xingkai Yu, Y. Wu, Zhenda Xie, Y. K. Li, Panpan Huang, Fuli Luo, Chong Ruan, Zhifang Sui, Wenfeng Liang. 1280-1297
- Grounding Language Model with Chunking-Free In-Context Retrieval. Hongjin Qian, Zheng Liu 0011, Kelong Mao, Yujia Zhou 0002, Zhicheng Dou. 1298-1311
- Advancing Abductive Reasoning in Knowledge Graphs through Complex Logical Hypothesis Generation. Jiaxin Bai, Yicheng Wang, Tianshi Zheng, Yue Guo, Xin Liu 0039, Yangqiu Song. 1312-1329
- Active Prompting with Chain-of-Thought for Large Language Models. Shizhe Diao, Pengcheng Wang, Yong Lin, Rui Pan, Xiang Liu, Tong Zhang 0001. 1330-1350
- EasyGen: Easing Multimodal Generation with BiDiffuser and LLMs. Xiangyu Zhao, Bo Liu 0049, Qijiong Liu, Guangyuan Shi, Xiao-Ming Wu 0003. 1351-1370
- Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search. Haochen Li, Xin Zhou, Zhiqi Shen. 1371-1389
- A Multidimensional Framework for Evaluating Lexical Semantic Change with Social Science Applications. Naomi Baes, Nick Haslam, Ekaterina Vylomova. 1390-1415
- Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal. Jianheng Huang, Leyang Cui, Ante Wang, Chengyi Yang, Xinting Liao, Linfeng Song, Junfeng Yao, Jinsong Su. 1416-1428
- Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency. Baizhou Huang, Shuai Lu, Xiaojun Wan 0001, Nan Duan. 1429-1450
- Citation-Enhanced Generation for LLM-based Chatbots. Weitao Li, Junkai Li, Weizhi Ma, Yang Liu. 1451-1466
- Transitive Consistency Constrained Learning for Entity-to-Entity Stance Detection. Haoyang Wen, Eduard H. Hovy, Alexander Hauptmann 0001. 1467-1480
- Feature-Adaptive and Data-Scalable In-Context Learning. Jiahao Li, Quan Wang 0002, Licheng Zhang, Guoqing Jin, Zhendong Mao. 1481-1494
- Probing the Multi-turn Planning Capabilities of LLMs via 20 Question Games. Yizhe Zhang 0002, Jiarui Lu, Navdeep Jaitly. 1495-1516
- WaterBench: Towards Holistic Evaluation of Watermarks for Large Language Models. Shangqing Tu, Yuliang Sun, Yushi Bai, Jifan Yu, Lei Hou 0001, Juanzi Li. 1517-1542
- Dependency Transformer Grammars: Integrating Dependency Structures into Transformer Language Models. Yida Zhao, Chao Lou, Kewei Tu. 1543-1556
- A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation. Zhengrui Ma, Qingkai Fang, Shaolei Zhang, Shoutao Guo, Yang Feng 0004, Min Zhang. 1557-1575
- Probing Language Models for Pre-training Data Detection. Zhenhua Liu, Tong Zhu 0002, Chuanyuan Tan, Bing Liu, Haonan Lu, Wenliang Chen. 1576-1587
- Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding. Zhihan Zhang, Yixin Cao 0002, Chenchen Ye 0001, Yunshan Ma, Lizi Liao, Tat-Seng Chua. 1588-1606
- IBSEN: Director-Actor Agent Collaboration for Controllable and Interactive Drama Script Generation. Senyu Han, Lu Chen, Li-Min Lin, Zhengshan Xu, Kai Yu. 1607-1619
- Language Model Adaption for Reinforcement Learning with Natural Language Action Space. Jiangxing Wang, Jiachen Li, Xiao Han, Deheng Ye, Zongqing Lu. 1620-1634
- Evaluating Intention Detection Capability of Large Language Models in Persuasive Dialogues. Hiromasa Sakurai, Yusuke Miyao. 1635-1657
- LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression. Huiqiang Jiang, Qianhui Wu, Xufang Luo, Dongsheng Li 0002, Chin-Yew Lin, Yuqing Yang 0001, Lili Qiu. 1658-1677
- Persuading across Diverse Domains: a Dataset and Persuasion Large Language Model. Chuhao Jin, Kening Ren, Lingzhen Kong, Xiting Wang, Ruihua Song, Huan Chen. 1678-1706
- HealMe: Harnessing Cognitive Reframing in Large Language Models for Psychotherapy. Mengxi Xiao, Qianqian Xie, Ziyan Kuang, Zhicheng Liu, Kailai Yang, Min Peng 0002, Weiguang Han, Jimin Huang. 1707-1725
- Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition. Zirun Guo, Tao Jin 0004, Zhou Zhao. 1726-1736
- An Effective Pronunciation Assessment Approach Leveraging Hierarchical Transformers and Pre-training Strategies. Bi-Cheng Yan, Jiun-Ting Li, Yi-Cheng Wang, Hsin-Wei Wang, Tien-Hong Lo, Yung-Chang Hsu, Wei-Cheng Chao, Berlin Chen. 1737-1747
- Detection-Correction Structure via General Language Model for Grammatical Error Correction. Wei Li 0101, Houfeng Wang. 1748-1763
- Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer. Yongxin Zhu, Dan Su 0002, Liqiang He, Linli Xu, Dong Yu 0001. 1764-1775
- Selene: Pioneering Automated Proof in Software Verification. Lichen Zhang, Shuai Lu, Nan Duan. 1776-1789
- Dissecting Human and LLM Preferences. Junlong Li, Fan Zhou, Shichao Sun, Yikai Zhang, Hai Zhao 0001, Pengfei Liu 0003. 1790-1811
- UniCoder: Scaling Code Large Language Model via Universal Code. Tao Sun, Linzheng Chai, Jian Yang, Yuwei Yin, Hongcheng Guo, Jiaheng Liu, Bing Wang, Liqun Yang, Zhoujun Li. 1812-1824
- AoE: Angle-optimized Embeddings for Semantic Textual Similarity. Xianming Li, Jing Li. 1825-1839
- InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews. Xintao Wang, Yunze Xiao, Jen-Tse Huang, Siyu Yuan, Rui Xu, Haoran Guo, Quan Tu, Yaying Fei, Ziang Leng, Wei Wang 0009, Jiangjie Chen, Cheng Li, Yanghua Xiao. 1840-1873
- Does DetectGPT Fully Utilize Perturbation? Bridging Selective Perturbation to Fine-tuned Contrastive Learning Detector would be Better. Shengchao Liu, Xiaoming Liu, Yichen Wang, Zehua Cheng, Chengzhengxu Li, Zhaohan Zhang, Yu Lan, Chao Shen. 1874-1889
- AFaCTA: Assisting the Annotation of Factual Claim Detection with Reliable LLM Annotators. Jingwei Ni, Minjing Shi, Dominik Stammbach, Mrinmaya Sachan, Elliott Ash, Markus Leippold. 1890-1912
- Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering. Tobias Schimanski, Jingwei Ni, Mathias Kraus, Elliott Ash, Markus Leippold. 1913-1931
- LoRAMoE: Alleviating World Knowledge Forgetting in Large Language Models via MoE-Style Plugin. Shihan Dou, Enyu Zhou, Yan Liu 0002, Songyang Gao, Wei Shen, Limao Xiong, Yuhao Zhou, Xiao Wang 0001, Zhiheng Xi, Xiaoran Fan, Shiliang Pu, Jiang Zhu, Rui Zheng, Tao Gui, Qi Zhang 0001, Xuanjing Huang 0001. 1932-1945
- Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation. Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Lifeng Jin, Linfeng Song, Haitao Mi, Helen Meng. 1946-1965
- M-RAG: Reinforcing Large Language Model Performance through Retrieval-Augmented Generation with Multiple Partitions. Zheng Wang, Shu Xian Teo, Jieer Ouyang, Yongjun Xu, Wei Shi. 1966-1978
- AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension. Qian Yang, Jin Xu, Wenrui Liu, Yunfei Chu, Ziyue Jiang 0001, Xiaohuan Zhou, Yichong Leng, Yuanjun Lv, Zhou Zhao, Chang Zhou, Jingren Zhou. 1979-1998
- Navigating the Metrics Maze: Reconciling Score Magnitudes and Accuracies. Tom Kocmi, Vilém Zouhar, Christian Federmann, Matt Post. 1999-2014
- ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of Large Language Models. Yuanyi Ren, Haoran Ye, Hanjun Fang, Xin Zhang, Guojie Song. 2015-2040
- DM-BLI: Dynamic Multiple Subspaces Alignment for Unsupervised Bilingual Lexicon Induction. Ling Hu, Yuemei Xu. 2041-2052
- SparseFit: Few-shot Prompting with Sparse Fine-tuning for Jointly Generating Predictions and Natural Language Explanations. Jesus Solano, Mardhiyah Sanni, Oana-Maria Camburu, Pasquale Minervini. 2053-2077
- Handling Ambiguity in Emotion: From Out-of-Domain Detection to Distribution Estimation. Wen Wu, Bo Li 0028, Chao Zhang 0031, Chung-Cheng Chiu, Qiujia Li, Junwen Bai, Tara N. Sainath, Philip C. Woodland. 2078-2093
- REANO: Optimising Retrieval-Augmented Reader Models through Knowledge Graph Generation. Jinyuan Fang, Zaiqiao Meng, Craig Macdonald. 2094-2112
- Learning Disentangled Semantic Spaces of Explanations via Invertible Neural Networks. Yingji Zhang, Danilo Carvalho, André Freitas. 2113-2134
- MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation. Yan Ma, Yu Qiao 0001, Pengfei Liu 0003. 2135-2169
- Open-Set Semi-Supervised Text Classification via Adversarial Disagreement Maximization. Junfan Chen, Richong Zhang, Junchi Chen, Chunming Hu. 2170-2180
- ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages. Junjie Ye, Sixian Li, Guanyu Li, Caishuang Huang, Songyang Gao, Yilong Wu, Qi Zhang 0001, Tao Gui, Xuanjing Huang 0001. 2181-2211
- A synthetic data approach for domain generalization of NLI models. Mohammad Javad Hosseini, Andrey Petrov, Alex Fabrikant, Annie Louis. 2212-2226
- Enhancing Contrastive Learning with Noise-Guided Attack: Towards Continual Relation Extraction in the Wild. Ting Wu, Jingyi Liu, Rui Zheng, Tao Gui, Qi Zhang 0001, Xuanjing Huang 0001. 2227-2239
- LRQuant: Learnable and Robust Post-Training Quantization for Large Language Models. Jiaqi Zhao, Miao Zhang, Chao Zeng, Ming Wang, Xuebo Liu, Liqiang Nie. 2240-2255
- VariErr NLI: Separating Annotation Error from Human Label Variation. Leon Weber-Genzel, Siyao Peng, Marie-Catherine de Marneffe, Barbara Plank. 2256-2269
- Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation. Xunjian Yin, Xu Zhang, Jie Ruan, Xiaojun Wan 0001. 2270-2286
- ListT5: Listwise Reranking with Fusion-in-Decoder Improves Zero-shot Retrieval. Soyoung Yoon, Eunbi Choi, Jiyeon Kim, Hyeongu Yun, Yireun Kim, Seung-won Hwang. 2287-2308
- Exploring the Potential of Large Language Models in Computational Argumentation. Guizhen Chen, LiYing Cheng, Anh Tuan Luu, Lidong Bing. 2309-2330
- TaxoLLaMA: WordNet-based Model for Solving Multiple Lexical Semantic Tasks. Viktor Moskvoretskii, Ekaterina Neminova, Alina Lobanova, Alexander Panchenko, Irina Nikishina. 2331-2350
- CANDLE: Iterative Conceptualization and Instantiation Distillation from Large Language Models for Commonsense Reasoning. Weiqi Wang 0001, Tianqing Fang, Chunyang Li, Haochen Shi, Wenxuan Ding 0001, Baixuan Xu, Zhaowei Wang 0003, Jiaxin Bai, Xin Liu 0039, Cheng Jiayang, Chunkit Chan, Yangqiu Song. 2351-2374
- MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter. Jitai Hao, Weiwei Sun 0001, Xin Xin 0003, Qi Meng, Zhumin Chen, Pengjie Ren, Zhaochun Ren. 2375-2388
- Surgical Feature-Space Decomposition of LLMs: Why, When and How? Arnav Chavan, Nahush Lele, Deepak K. Gupta. 2389-2400
- Reasoning in Flux: Enhancing Large Language Models Reasoning through Uncertainty-aware Adaptive Guidance. Zhangyue Yin, Qiushi Sun, Qipeng Guo, Zhiyuan Zeng, Xiaonan Li, Junqi Dai, Qinyuan Cheng, Xuanjing Huang 0001, Xipeng Qiu. 2401-2416
- Modality-Aware Integration with Large Language Models for Knowledge-Based Visual Question Answering. Junnan Dong, Qinggang Zhang, Huachi Zhou, Daochen Zha, Pai Zheng, Xiao Huang 0001. 2417-2429
- Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression. Peiyu Liu 0002, Ze-Feng Gao, Xin Zhao 0018, Yipeng Ma, Tao Wang, Ji-Rong Wen. 2430-2440
- VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models. Seoyeon Kim, Kwangwook Seo, Hyungjoo Chae, Jinyoung Yeo, Dongha Lee. 2441-2461
- Making Long-Context Language Models Better Multi-Hop Reasoners. Yanyang Li, Shuo Liang, Michael R. Lyu, Liwei Wang 0009. 2462-2475
- TransliCo: A Contrastive Learning Framework to Address the Script Barrier in Multilingual Pretrained Language Models. Yihong Liu, Chunlan Ma, Haotian Ye, Hinrich Schütze. 2476-2499
- Extreme Miscalibration and the Illusion of Adversarial Robustness. Vyas Raina, Samson Tan, Volkan Cevher, Aditya Rawal, Sheng Zha, George Karypis. 2500-2525
- HyCoRec: Hypergraph-Enhanced Multi-Preference Learning for Alleviating Matthew Effect in Conversational Recommendation. Yongsen Zheng, Ruilin Xu, Ziliang Chen, Guohua Wang, Mingjie Qian, Jinghui Qin, Liang Lin. 2526-2537
- Co-training for Low Resource Scientific Natural Language Inference. Mobashir Sadat, Cornelia Caragea. 2538-2550
- RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models. Jiongxiao Wang, Junlin Wu 0001, Muhao Chen, Yevgeniy Vorobeychik, Chaowei Xiao. 2551-2570
- Time is Encoded in the Weights of Finetuned Language Models. Kai Nylund, Suchin Gururangan, Noah A. Smith. 2571-2587
- Long-Context Language Modeling with Parallel Context Encoding. Howard Yen, Tianyu Gao, Danqi Chen 0001. 2588-2610
- SirLLM: Streaming Infinite Retentive LLM. Yao Yao, Zuchao Li, Hai Zhao 0001. 2611-2624
- IMO: Greedy Layer-Wise Sparse Representation Learning for Out-of-Distribution Text Classification with Pre-trained Models. Tao Feng 0013, Lizhen Qu, Zhuang Li, Haolan Zhan, Yuncheng Hua, Reza Haf. 2625-2639
- Generative Pretrained Structured Transformers: Unsupervised Syntactic Language Models at Scale. Xiang Hu, Pengyu Ji, Qingyang Zhu, Wei Wu, Kewei Tu. 2640-2657
- MELA: Multilingual Evaluation of Linguistic Acceptability. Ziyin Zhang, Yikang Liu, Weifang Huang, Junyu Mao, Rui Wang, Hai Hu. 2658-2674
- CopyNE: Better Contextual ASR by Copying Named Entities. Shilin Zhou, Zhenghua Li, Yu Hong, Min Zhang 0005, Zhefeng Wang 0001, Baoxing Huai. 2675-2686
- Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval. Peter Baile Chen, Yi Zhang, Dan Roth. 2687-2699
- Generalizing Conversational Dense Retrieval via LLM-Cognition Data Augmentation. Haonan Chen 0005, Zhicheng Dou, Kelong Mao, Jiongnan Liu, Ziliang Zhao. 2700-2718
- ItD: Large Language Models Can Teach Themselves Induction through Deduction. Wangtao Sun, Haotian Xu, Xuanqing Yu, Pei Chen, Shizhu He, Jun Zhao 0001, Kang Liu 0001. 2719-2731
- MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs. Zimu Lu, Aojun Zhou, Houxing Ren, Ke Wang, Weikang Shi, Junting Pan, Mingjie Zhan, Hongsheng Li. 2732-2747
- Rethinking Task-Oriented Dialogue Systems: From Complex Modularity to Zero-Shot Autonomous Agent. Heng-Da Xu, Xian-Ling Mao, Puhai Yang, Fanshu Sun, Heyan Huang. 2748-2763
- On Context Utilization in Summarization with Large Language Models. Mathieu Ravaut, Aixin Sun, Nancy F. Chen, Shafiq Joty. 2764-2781
- INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning. Yutao Zhu 0001, Peitian Zhang, Chenghao Zhang, Yifei Chen, Binyu Xie, Zheng Liu 0011, Ji-Rong Wen, Zhicheng Dou. 2782-2809
- Enhancing In-Context Learning via Implicit Demonstration Augmentation. Xiaoling Zhou, Wei Ye 0004, Yidong Wang, Chaoya Jiang, Zhemg Lee, Rui Xie 0003, Shikun Zhang. 2810-2828
- PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA. Sheng Wang, Boyang Xue, Jiacheng Ye, Jiyue Jiang, Liheng Chen, Lingpeng Kong, Chuan Wu. 2829-2841
- Improving Event Definition Following For Zero-Shot Event Detection. Zefan Cai, Po-Nien Kung, Ashima Suvarna, Mingyu Derek Ma, Hritik Bansal, Baobao Chang, P. Jeffrey Brantingham, Wei Wang 0010, Nanyun Peng. 2842-2863
- Through the MUD: A Multi-Defendant Charge Prediction Benchmark with Linked Crime Elements. Xiao Wei, Qi Xu, Hang Yu, Qian Liu 0012, Erik Cambria. 2864-2878
- Interpreting Conversational Dense Retrieval by Rewriting-Enhanced Inversion of Session Embedding. Yiruo Cheng, Kelong Mao, Zhicheng Dou. 2879-2893
- Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks. Yichen Wang, Shangbin Feng, Abe Bohan Hou, Xiao Pu 0003, Chao Shen 0001, Xiaoming Liu 0001, Yulia Tsvetkov, Tianxing He. 2894-2925
- Training Language Models to Generate Text with Citations via Fine-grained Rewards. Chengyu Huang, Zeqiu Wu, Yushi Hu, Wenya Wang. 2926-2949
- Hypergraph based Understanding for Document Semantic Entity Recognition. Qiwei Li 0002, Zuchao Li, Ping Wang, Haojun Ai, Hai Zhao 0001. 2950-2960
- GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers. Qintong Li, Leyang Cui, Xueliang Zhao, Lingpeng Kong, Wei Bi. 2961-2984
- Synergetic Event Understanding: A Collaborative Approach to Cross-Document Event Coreference Resolution with Large Language Models. Qingkai Min, Qipeng Guo, Xiangkun Hu, Songfang Huang, Zheng Zhang, Yue Zhang. 2985-3002
- AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning. Shuofei Qiao, Ningyu Zhang 0001, Runnan Fang, Yujie Luo, Wangchunshu Zhou, Yuchen Eleanor Jiang, Chengfei Lv, Huajun Chen. 3003-3021
- ChronosLex: Time-aware Incremental Training for Temporal Generalization of Legal Classification Tasks. T. Y. S. S. Santosh, Tuan-Quang Vuong, Matthias Grabmair. 3022-3039
- Virtual Compiler Is All You Need For Assembly Code Search. Zeyu Gao, Hao Wang, Yuanda Wang, Chao Zhang. 3040-3051
- MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning. Pengjie Ren, Chengshun Shi, Shiguang Wu 0003, Mengqi Zhang, Zhaochun Ren, Maarten de Rijke, Zhumin Chen, Jiahuan Pei. 3052-3064
- Can LLMs Learn from Previous Mistakes? Investigating LLMs' Errors to Boost for Reasoning. Yongqi Tong, Dawei Li, Sizhe Wang, Yujia Wang, Fei Teng, Jingbo Shang. 3065-3080
- An Iterative Associative Memory Model for Empathetic Response Generation. Zhou Yang, Zhaochun Ren, Wang Yufeng, Haizhou Sun, Chao Chen, Xiaofei Zhu, Xiangwen Liao. 3081-3092
- Detoxifying Large Language Models via Knowledge Editing. Mengru Wang, Ningyu Zhang 0001, Ziwen Xu, Zekun Xi, Shumin Deng, Yunzhi Yao, Qishen Zhang, Linyi Yang, Jindong Wang 0001, Huajun Chen. 3093-3118
- LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding. Yushi Bai, Xin Lv, Jiajie Zhang, Hongchang Lyu, Jiankai Tang, Zhidian Huang, Zhengxiao Du, Xiao Liu 0036, Aohan Zeng, Lei Hou 0001, Yuxiao Dong, Jie Tang 0001, Juanzi Li. 3119-3137
- Dr.Academy: A Benchmark for Evaluating Questioning Capability in Education for Large Language Models. Yuyan Chen, Songzhou Yan, Panjun Liu, Yanghua Xiao. 3138-3167
- UniBridge: A Unified Approach to Cross-Lingual Transfer Learning for Low-Resource Languages. Trinh Pham, Khoi Le, Anh Tuan Luu. 3168-3184
- VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval. Junjie Zhou, Zheng Liu, Shitao Xiao, Bo Zhao, Yongping Xiong. 3185-3200
- Black-Box Prompt Optimization: Aligning Large Language Models without Model Training. Jiale Cheng, Xiao Liu 0036, Kehan Zheng, Pei Ke, Hongning Wang, Yuxiao Dong, Jie Tang 0001, Minlie Huang. 3201-3219
- Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark. Chanjun Park, Hyeonwoo Kim, DahYun Kim, SeongHwan Cho, Sanghoon Kim, Sukyung Lee, Yungi Kim, Hwalsuk Lee. 3220-3234
- Unified Hallucination Detection for Multimodal Large Language Models. Xiang Chen 0016, Chenxi Wang, Yida Xue, Ningyu Zhang 0001, Xiaoyan Yang, Qiang Li 0022, Yue Shen, Lei Liang, Jinjie Gu, Huajun Chen. 3235-3252
- Empowering Character-level Text Infilling by Eliminating Sub-Tokens. Houxing Ren, Mingjie Zhan, Zhongyuan Wu, Hongsheng Li 0001. 3253-3267
- Landmark Embedding: A Chunking-Free Embedding Method For Retrieval Augmented Long-Context Large Language Models. Kun Luo, Zheng Liu, Shitao Xiao, Tong Zhou, Yubo Chen 0001, Jun Zhao 0001, Kang Liu 0001. 3268-3281
- GrowOVER: How Can LLMs Adapt to Growing Real-World Knowledge?Dayoon Ko, Jinyoung Kim, Hahyeon Choi, Gunhee Kim. 3282-3308 [doi]
- Attribute First, then Generate: Locally-attributable Grounded Text GenerationAviv Slobodkin, Eran Hirsch, Arie Cattan, Tal Schuster, Ido Dagan. 3309-3344 [doi]
- T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from TextAoxiong Yin, Haoyuan Li, Kai Shen, Siliang Tang, Yueting Zhuang. 3345-3356 [doi]
- OceanGPT: A Large Language Model for Ocean Science TasksZhen Bi, Ningyu Zhang 0001, Yida Xue, Yixin Ou, Daxiong Ji, Guozhou Zheng, Huajun Chen. 3357-3372 [doi]
- Beyond Memorization: The Challenge of Random Memory Access in Language ModelsTongyao Zhu, Qian Liu, Liang Pang, Zhengbao Jiang, Min-Yen Kan, Min Lin. 3373-3388 [doi]
- BIPED: Pedagogically Informed Tutoring System for ESL EducationSoonwoo Kwon, Sojung Kim, Minju Park, Seunghyun Lee, Kyuseok Kim. 3389-3414 [doi]
- Timeline-based Sentence Decomposition with In Context Learning for Temporal Fact ExtractionJianhao Chen, Haoyuan Ouyang, Junyang Ren, Wentao Ding, Wei Hu, Yuzhong Qu. 3415-3432 [doi]
- Collaboration or Corporate Capture? Quantifying NLP's Reliance on Industry Artifacts and ContributionsWill Aitken, Mohamed Abdalla, Karen Rudie, Catherine Stinson. 3433-3448 [doi]
- Prompt Expansion for Adaptive Text-to-Image GenerationSiddhartha Datta, Alexander Ku, Deepak Ramachandran, Peter Anderson. 3449-3476 [doi]
- Progressively Modality Freezing for Multi-Modal Entity AlignmentYani Huang, Xuefeng Zhang, Richong Zhang, Junfan Chen, Jaein Kim. 3477-3489 [doi]
- Llama2Vec: Unsupervised Adaptation of Large Language Models for Dense RetrievalChaofan Li, Zheng Liu 0011, Shitao Xiao, Yingxia Shao, Defu Lian. 3490-3500 [doi]
- Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse PromptsXuan-Phi Nguyen, Mahani Aljunied, Shafiq Joty, Lidong Bing. 3501-3516 [doi]
- Metaphor Understanding Challenge Dataset for LLMsXiaoyu Tong, Rochelle Choenni, Martha Lewis, Ekaterina Shutova. 3517-3536 [doi]
- A Multi-Task Embedder For Retrieval Augmented LLMsPeitian Zhang, Zheng Liu 0011, Shitao Xiao, Zhicheng Dou, Jian-Yun Nie. 3537-3553 [doi]
- Language Models Don't Learn the Physical Manifestation of LanguageBruce W. Lee, Jaehyuk Lim. 3554-3579 [doi]
- What Does the Bot Say? Opportunities and Risks of Large Language Models in Social Media Bot DetectionShangbin Feng, Herun Wan, Ningnan Wang, Zhaoxuan Tan, Minnan Luo, Yulia Tsvetkov. 3580-3601 [doi]
- Self-Contrast: Better Reflection Through Inconsistent Solving PerspectivesWenqi Zhang, Yongliang Shen 0001, Linjuan Wu, Qiuying Peng, Jun Wang, Yueting Zhuang, Weiming Lu 0001. 3602-3622 [doi]
- Relying on the Unreliable: The Impact of Language Models' Reluctance to Express UncertaintyKaitlyn Zhou, Jena D. Hwang, Xiang Ren 0001, Maarten Sap. 3623-3643 [doi]
- Unity in Diversity: Collaborative Pre-training Across Multimodal Medical SourcesXiaochen Wang, Junyu Luo 0001, Jiaqi Wang 0002, Yuan Zhong, Xiaokun Zhang, Yaqing Wang, Parminder Bhatia, Cao Xiao, Fenglong Ma. 3644-3656 [doi]
- When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLPSara Papi, Marco Gaido, Andrea Pilzer, Matteo Negri. 3657-3672 [doi]
- SBAAM! Eliminating Transcript Dependency in Automatic SubtitlingMarco Gaido, Sara Papi, Matteo Negri, Mauro Cettolo, Luisa Bentivogli. 3673-3691 [doi]
- StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History SelectionSara Papi, Marco Gaido, Matteo Negri, Luisa Bentivogli. 3692-3707 [doi]
- ARL2: Aligning Retrievers with Black-box Large Language Models via Self-guided Adaptive Relevance LabelingLingxi Zhang, Yue Yu, Kuan Wang, Chao Zhang. 3708-3719 [doi]
- Crayon: Customized On-Device LLM via Instant Adapter Blending and Edge-Server Hybrid InferenceJihwan Bang, Juntae Lee, Kyuhong Shim, Seunghan Yang, Simyung Chang. 3720-3731 [doi]
- FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal ModelYebin Lee, Imseong Park, Myungjoo Kang. 3732-3746 [doi]
- MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation in ConversationsYuxin Wang, Ivory Yang, Saeed Hassanpour, Soroush Vosoughi. 3747-3764 [doi]
- MPCoder: Multi-user Personalized Code Generator with Explicit and Implicit Style Representation LearningZhenlong Dai, Chang Yao, WenKang Han, Ying Yuan, Zhipeng Gao, Jingyuan Chen. 3765-3780 [doi]
- DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM WorkflowsAjay Patel, Colin Raffel, Chris Callison-Burch. 3781-3799 [doi]
- Understanding and Addressing the Under-Translation Problem from the Perspective of Decoding ObjectiveChenze Shao, Fandong Meng, Jiali Zeng, Jie Zhou 0016. 3800-3814 [doi]
- Identifying while Learning for Document Event Causality IdentificationCheng Liu, Wei Xiang, Bang Wang. 3815-3827 [doi]
- OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific ProblemsChaoqun He, Renjie Luo, Yuzhuo Bai, Shengding Hu, Zhen Leng Thai, Junhao Shen, Jinyi Hu, Xu Han 0007, Yujie Huang, Yuxiang Zhang, Jie Liu, Lei Qi, Zhiyuan Liu 0001, Maosong Sun 0001. 3828-3850 [doi]
- Insert or Attach: Taxonomy Completion via Box EmbeddingWei Xue, Yongliang Shen 0001, Wenqi Ren, Jietian Guo, Shiliang Pu, Weiming Lu 0001. 3851-3863 [doi]
- Semiparametric Token-Sequence Co-SupervisionHyunji Lee, Doyoung Kim, Jihoon Jun, Se June Joo, Joel Jang, Kyoung-woon On, Minjoon Seo. 3864-3882 [doi]
- Instruction Fusion: Advancing Prompt Evolution through HybridizationWeidong Guo, Jiuding Yang, Kaitong Yang, Xiangyang Li, Zhuwei Rao, Yu Xu, Di Niu. 3883-3893 [doi]
- TimeArena: Shaping Efficient Multitasking Language Agents in a Time-Aware SimulationYikai Zhang, Siyu Yuan, Caiyu Hu, Kyle Richardson 0001, Yanghua Xiao, Jiangjie Chen. 3894-3916 [doi]
- Exploring Memorization in Fine-tuned Language ModelsShenglai Zeng, Yaxin Li 0001, Jie Ren 0019, Yiding Liu, Han Xu 0002, Pengfei He, Yue Xing, Shuaiqiang Wang, Jiliang Tang, Dawei Yin. 3917-3948 [doi]
- Towards Real-world Scenario: Imbalanced New Intent DiscoveryShun Zhang, Chaoran Yan, Jian Yang, Jiaheng Liu, Ying Mo, Jiaqi Bai, Tongliang Li, Zhoujun Li. 3949-3963 [doi]
- M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text DetectionYuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti 0002, Thomas Arnold 0002, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov. 3964-3992 [doi]
- Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for DialogueJian Wang 0054, Chak Tou Leong, Jiashuo Wang, Dongding Lin, Wenjie Li 0002, Xiaoyong Wei. 3993-4010 [doi]
- SoftDedup: an Efficient Data Reweighting Method for Speeding Up Language Model Pre-trainingNan He, Weichen Xiong, Hanwen Liu, Yi Liao, Lei Ding, Kai Zhang, Guohua Tang, Xiao Han, Yang Wei. 4011-4022 [doi]
- Rule or Story, Which is a Better Commonsense Expression for Talking with Large Language Models?Ning Bian, Xianpei Han, Hongyu Lin, Yaojie Lu 0001, Ben He, Le Sun 0001. 4023-4043 [doi]
- Learning Global Controller in Latent Space for Parameter-Efficient Fine-TuningZeqi Tan, Yongliang Shen 0001, Xiaoxia Cheng, Chang Zong, Wenqi Zhang, Jian Shao, Weiming Lu 0001, Yueting Zhuang. 4044-4055 [doi]
- CaMML: Context-Aware Multimodal Learner for Large ModelsYixin Chen, Shuai Zhang 0007, Boran Han, Tong He 0001, Bo Li 0026. 4056-4071 [doi]
- MAVEN-ARG: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument AnnotationXiaozhi Wang, Hao Peng 0015, Yong Guan, Kaisheng Zeng, Jianhui Chen, Lei Hou 0001, Xu Han 0007, Yankai Lin, Zhiyuan Liu 0001, Ruobing Xie, Jie Zhou 0016, Juanzi Li. 4072-4091 [doi]
- NPHardEval: Dynamic Benchmark on Reasoning Ability of Large Language Models via Complexity ClassesLizhou Fan, Wenyue Hua, Lingyao Li, Haoyang Ling, Yongfeng Zhang. 4092-4114 [doi]
- Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language ModelsZhiwei He 0002, Binglin Zhou, Hongkun Hao, Aiwei Liu, Xing Wang 0007, Zhaopeng Tu, Zhuosheng Zhang 0001, Rui Wang 0015. 4115-4129 [doi]
- Multi-Level Feedback Generation with Large Language Models for Empowering Novice Peer CounselorsAlicja Chaszczewicz, Raj Sanjay Shah, Ryan Louie, Bruce A. Arnow, Robert E. Kraut, Diyi Yang. 4130-4161 [doi]
- In-context Mixing (ICM): Code-mixed Prompts for Multilingual LLMsBhavani Shankar, Preethi Jyothi, Pushpak Bhattacharyya. 4162-4176 [doi]
- Respond in my Language: Mitigating Language Inconsistency in Response Generation based on Large Language ModelsLiang Zhang, Qin Jin, Haoyang Huang, DongDong Zhang, Furu Wei. 4177-4192 [doi]
- Transferable Embedding Inversion Attack: Uncovering Privacy Risks in Text Embeddings without Model QueriesYu-Hsiang Huang, Yu-Che Tsai, Hsiang Hsiao, Hong-Yi Lin, Shou-de Lin. 4193-4205 [doi]
- Enhancing Reinforcement Learning with Label-Sensitive Reward for Natural Language UnderstandingKuo Liao, Shuang Li, Meng Zhao, Liqun Liu, Mengge Xue, ZhenYu Hu, Honglin Han, ChengGuo Yin. 4206-4220 [doi]
- Intuitive or Dependent? Investigating LLMs' Behavior Style to Conflicting PromptsJiahao Ying, Yixin Cao 0002, Kai Xiong 0002, Long Cui, Yidong He, Yongbin Liu. 4221-4246 [doi]
- CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window ExtendingShiyi Zhu, Jing Ye, Wei Jiang, Siqiao Xue, Qi Zhang, Yifan Wu, Jianguo Li. 4247-4262 [doi]
- InfoLossQA: Characterizing and Recovering Information Loss in Text SimplificationJan Trienes, Sebastian Joseph, Jörg Schlötterer, Christin Seifert, Kyle Lo, Wei Xu 0004, Byron C. Wallace, Junyi Jessy Li. 4263-4294 [doi]
- CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction FollowingKaiyan Zhang, Jianyu Wang, Ermo Hua, Biqing Qi, Ning Ding 0002, Bowen Zhou. 4295-4312 [doi]
- DAPR: A Benchmark on Document-Aware Passage RetrievalKexin Wang, Nils Reimers 0001, Iryna Gurevych. 4313-4330 [doi]
- Strengthened Symbol Binding Makes Large Language Models Reliable Multiple-Choice SelectorsMengge Xue, ZhenYu Hu, Liqun Liu, Kuo Liao, Shuang Li, Honglin Han, Meng Zhao, ChengGuo Yin. 4331-4344 [doi]
- SAC-KG: Exploiting Large Language Models as Skilled Automatic Constructors for Domain Knowledge GraphHanzhu Chen, Xu Shen, Qitan Lv, Jie Wang 0005, Xiaoqi Ni, Jieping Ye. 4345-4360 [doi]
- Uncertainty-Guided Modal Rebalance for Hateful Memes DetectionChuanpeng Yang, Yaxin Liu, Fuqing Zhu, Jizhong Han, Songlin Hu. 4361-4371 [doi]
- Missci: Reconstructing Fallacies in Misrepresented ScienceMax Glockner, Yufang Hou 0001, Preslav Nakov, Iryna Gurevych. 4372-4405 [doi]
- Uncovering the Full Potential of Visual Grounding Methods in VQADaniel Reich, Tanja Schultz. 4406-4419 [doi]
- Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMsJiejun Tan, Zhicheng Dou, Yutao Zhu 0001, Peidong Guo, Kun Fang, Ji-Rong Wen. 4420-4436 [doi]
- Favi-Score: A Measure for Favoritism in Automated Preference Ratings for Generative AI EvaluationPius von Däniken, Jan Deriu, Don Tuggener, Mark Cieliebak. 4437-4454 [doi]
- LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine FeedbackTimon Ziegenbein, Gabriella Skitalinskaya, Alireza Bayat Makou, Henning Wachsmuth. 4455-4476 [doi]
- Graph Language ModelsMoritz Plenz, Anette Frank. 4477-4494 [doi]
- Analyzing Semantic Change through Lexical ReplacementsFrancesco Periti, Pierluigi Cassotti, Haim Dubossarsky, Nina Tahmasebi. 4495-4510 [doi]
- Exploiting Intrinsic Multilateral Logical Rules for Weakly Supervised Natural Language Video LocalizationZhe Xu, Kun Wei, Xu Yang 0019, Cheng Deng. 4511-4521 [doi]
- Interpretability of Language Models via Task SpacesLucas Weber, Jaap Jumelet, Elia Bruni, Dieuwke Hupkes. 4522-4538 [doi]
- Using Synchronic Definitions and Semantic Relations to Classify Semantic Change TypesPierluigi Cassotti, Stefano De Pascale, Nina Tahmasebi. 4539-4553 [doi]
- Factual Confidence of LLMs: on Reliability and Robustness of Current EstimatorsMatéo Mahaut, Laura Aina, Paula Czarnowska, Momchil Hardalov, Thomas Müller, Lluís Màrquez. 4554-4570 [doi]
- StepCoder: Improving Code Generation with Reinforcement Learning from Compiler FeedbackShihan Dou, Yan Liu 0002, Haoxiang Jia, Enyu Zhou, Limao Xiong, Junjie Shan, Caishuang Huang, Xiao Wang 0001, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang 0001, Tao Gui, Xuanjing Huang 0001. 4571-4585 [doi]
- One-Shot Learning as Instruction Data Prospector for Large Language ModelsYunshui Li, Binyuan Hui, Xiaobo Xia, Jiaxi Yang, Min Yang 0007, Lei Zhang, Shuzheng Si, Ling-Hao Chen, Junhao Liu, Tongliang Liu, Fei Huang 0004, Yongbin Li. 4586-4601 [doi]
- Navigating the OverKill in Large Language ModelsChenyu Shi, Xiao Wang 0042, Qiming Ge, Songyang Gao, Xianjun Yang, Tao Gui, Qi Zhang 0001, Xuanjing Huang 0001, Xun Zhao, Dahua Lin. 4602-4614 [doi]
- A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning ChainsAlon Jacovi, Yonatan Bitton, Bernd Bohnet, Jonathan Herzig, Or Honovich, Michael Tseng, Michael Collins 0001, Roee Aharoni, Mor Geva. 4615-4634 [doi]
- Re3: A Holistic Framework and Dataset for Modeling Collaborative Document RevisionQian Ruan, Ilia Kuznetsov, Iryna Gurevych. 4635-4655 [doi]
- NextLevelBERT: Masked Language Modeling with Higher-Level Representations for Long DocumentsTamara Czinczoll, Christoph Hönes, Maximilian Schall, Gerard de Melo. 4656-4666 [doi]
- FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language ModelsYuxin Jiang, Yufei Wang 0005, Xingshan Zeng, Wanjun Zhong, Liangyou Li, Fei Mi, Lifeng Shang, Xin Jiang 0002, Qun Liu 0001, Wei Wang. 4667-4688 [doi]
- Learning to Edit: Aligning LLMs with Knowledge EditingYuxin Jiang, Yufei Wang 0005, Chuhan Wu, Wanjun Zhong, Xingshan Zeng, Jiahui Gao, Liangyou Li, Xin Jiang 0002, Lifeng Shang, Ruiming Tang, Qun Liu 0001, Wei Wang. 4689-4705 [doi]
- DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction TuningYejie Wang, Keqing He 0001, Guanting Dong, Pei Wang, Weihao Zeng, Muxi Diao, Weiran Xu, Jingang Wang, Mengdi Zhang, Xunliang Cai. 4706-4721 [doi]
- When Only Time Will Tell: Interpreting How Transformers Process Local Ambiguities Through the Lens of Restart-IncrementalityBrielen Madureira, Patrick Kahardipraja, David Schlangen. 4722-4749 [doi]
- SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language ModelsMd Imbesat Hassan Rizvi, Xiaodan Zhu, Iryna Gurevych. 4750-4767 [doi]
- Planning Like Human: A Dual-process Framework for Dialogue PlanningTao He, Lizi Liao, Yixin Cao 0002, Yuanxing Liu 0001, Ming Liu 0004, Zerui Chen, Bing Qin 0001. 4768-4791 [doi]
- Spectral Filters, Dark Signals, and Attention SinksNicola Cancedda. 4792-4808 [doi]
- DiffuCOMET: Contextual Commonsense Knowledge DiffusionSilin Gao, Mete Ismayilzada, Mengjie Zhao, Hiromi Wakaki, Yuki Mitsufuji, Antoine Bosselut. 4809-4831 [doi]
- Systematic Task Exploration with LLMs: A Study in Citation Text GenerationFurkan Sahinuç, Ilia Kuznetsov, Yufang Hou 0001, Iryna Gurevych. 4832-4855 [doi]
- Limits of Theory of Mind Modelling in Dialogue-Based Collaborative Plan AcquisitionMatteo Bortoletto, Constantin Ruhdorfer, Adnen Abdessaied, Lei Shi, Andreas Bulling. 4856-4871 [doi]
- Temporal Knowledge Question Answering via Abstract Reasoning InductionZiyang Chen, Dongfang Li, Xiang Zhao, Baotian Hu, Min Zhang. 4872-4889 [doi]
- Who Wrote this Code? Watermarking for Code GenerationTaehyun Lee, Seokhee Hong 0002, Jaewoo Ahn, Ilgee Hong, Hwaran Lee, Sangdoo Yun, Jamin Shin, Gunhee Kim. 4890-4911 [doi]
- MapCoder: Multi-Agent Code Generation for Competitive Problem SolvingMd. Ashraful Islam, Mohammed Eunus Ali, Md. Rizwan Parvez. 4912-4944 [doi]
- RelayAttention for Efficient Large Language Model Serving with Long System PromptsLei Zhu, Xinjiang Wang, Wayne Zhang 0001, Rynson W. H. Lau. 4945-4957 [doi]
- Boosting Language Models Reasoning with Chain-of-Knowledge PromptingJianing Wang, Qiushi Sun, Xiang Li 0067, Ming Gao 0001. 4958-4981 [doi]
- Open Grounded Planning: Challenges and Benchmark ConstructionShiguang Guo, Ziliang Deng, Hongyu Lin, Yaojie Lu 0001, Xianpei Han, Le Sun 0001. 4982-5003 [doi]
- LLM Knows Body Language, Too: Translating Speech Voices into Human GesturesChenghao Xu, Guangtao Lyu, Jiexi Yan, Muli Yang, Cheng Deng. 5004-5013 [doi]
- QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback based Self-CorrectionXiang Huang, Sitao Cheng, Shanshan Huang, Jiayu Shen, Yong Xu, Chaoyun Zhang, Yuzhong Qu. 5014-5035 [doi]
- PITA: Prompting Task Interaction for Argumentation MiningYang Sun, Muyi Wang, Jianzhu Bao, Bin Liang, Xiaoyan Zhao 0005, Caihua Yang, Min Yang 0007, Ruifeng Xu. 5036-5049 [doi]
- Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language ModelsJinhao Duan, Hao Cheng, Shiqi Wang 0002, Alex Zavalny, Chenan Wang, Renjing Xu, Bhavya Kailkhura, Kaidi Xu. 5050-5063 [doi]
- Babel-ImageNet: Massively Multilingual Evaluation of Vision-and-Language RepresentationsGregor Geigle, Radu Timofte, Goran Glavas. 5064-5084 [doi]
- Estimating Agreement by Chance for Sequence AnnotationDiya Li, Carolyn P. Rosé, Ao Yuan, Chunxiao Zhou. 5085-5097 [doi]
- Are Emergent Abilities in Large Language Models just In-Context Learning?Sheng Lu, Irina Bigoulaeva, Rachneet Sachdeva, Harish Tayyar Madabushi, Iryna Gurevych. 5098-5139 [doi]
- WaveCoder: Widespread And Versatile Enhancement For Code Large Language Models By Instruction TuningZhaojian Yu, Xin Zhang, Ning Shang, Yangyu Huang, Can Xu, Yishujie Zhao, Wenxiang Hu, Qiufeng Yin. 5140-5153 [doi]
- Eliciting Better Multilingual Structured Reasoning from LLMs through CodeBryan Li, Tamer Alkhouli, Daniele Bonadiman, Nikolaos Pappas 0004, Saab Mansour. 5154-5169 [doi]
- OLIVE: Object Level In-Context Visual EmbeddingsTimothy Ossowski, Junjie Hu. 5170-5185 [doi]
- Quantifying Uncertainty in Answers from any Language Model and Enhancing their TrustworthinessJiuhai Chen, Jonas Mueller 0001. 5186-5200 [doi]
- Marathon: A Race Through the Realm of Long Context with Large Language ModelsLei Zhang, Yunshui Li, Ziqiang Liu, Jiaxi Yang, Junhao Liu, Longze Chen, Run Luo, Min Yang. 5201-5217 [doi]
- Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency GraphXiaochen Gao, Feng Yao, Kewen Zhao, Beilei He, Animesh Kumar, Vish Krishnan, Jingbo Shang. 5218-5234 [doi]
- PCAD: Towards ASR-Robust Spoken Language Understanding via Prototype Calibration and Asymmetric DecouplingXianwei Zhuang, Xuxin Cheng, Liming Liang, Yuxin Xie, Zhichang Wang, Zhiqi Huang, Yuexian Zou. 5235-5246 [doi]
- Rethinking the Multimodal Correlation of Multimodal Sequential Learning via Generalizable Attentional Results AlignmentTao Jin, Wang Lin, Ye Wang, Linjun Li, Xize Cheng, Zhou Zhao. 5247-5265 [doi]
- UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained GenerationXun Liang, Shichao Song, Simin Niu, Zhiyu Li, Feiyu Xiong, Bo Tang, Yezhaohui Wang, Dawei He, Cheng Peng, Zhonghao Wang, Haiying Deng. 5266-5293 [doi]
- PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal RetrieversWeizhe Lin, Jingbiao Mei, Jinghong Chen, Bill Byrne. 5294-5316 [doi]
- Triple-Encoders: Representations That Fire Together, Wire TogetherJustus-Jonas Erker, Florian Mai, Nils Reimers 0001, Gerasimos Spanakis, Iryna Gurevych. 5317-5332 [doi]
- Improving Hateful Meme Detection through Retrieval-Guided Contrastive LearningJingbiao Mei, Jinghong Chen, Weizhe Lin, Bill Byrne, Marcus Tomalin. 5333-5347 [doi]
- Agent-Pro: Learning to Evolve via Policy-Level Reflection and OptimizationWenqi Zhang, Ke Tang, Hai Wu, Mengna Wang, Yongliang Shen 0001, Guiyang Hou, Zeqi Tan, Peng Li 0031, Yueting Zhuang, Weiming Lu 0001. 5348-5375 [doi]
- Your Transformer is Secretly LinearAnton Razzhigaev, Matvey Mikhalchuk, Elizaveta Goncharova, Nikolai Gerasimenko, Ivan V. Oseledets, Denis Dimitrov, Andrey Kuznetsov. 5376-5384 [doi]
- Noise Correction on Subjective DatasetsUthman Jinadu, Yi Ding. 5385-5395 [doi]
- Generative Explore-Exploit: Training-free Optimization of Generative Recommender Systems using LLM OptimizersLütfi Kerem Senel, Besnik Fetahu, Davis Yoshida, Zhiyu Chen 0001, Giuseppe Castellucci, Nikhita Vedula, Jason Ingyu Choi, Shervin Malmasi. 5396-5420 [doi]
- Instruction-tuned Language Models are Better Knowledge LearnersZhengbao Jiang, Zhiqing Sun, Weijia Shi, Pedro Rodríguez 0001, Chunting Zhou, Graham Neubig, Xi Victoria Lin, Wen-tau Yih, Srini Iyer 0001. 5421-5434 [doi]
- What Do Language Models Hear? Probing for Auditory Representations in Language ModelsJerry Ngo, Yoon Kim. 5435-5448 [doi]
- Threads of Subtlety: Detecting Machine-Generated Texts Through Discourse MotifsZae Myung Kim, Kwang Hee Lee, Preston Zhu, Vipul Raheja, Dongyeop Kang. 5449-5474 [doi]
- Jailbreak Open-Sourced Large Language Models via Enforced DecodingHangfan Zhang, Zhimeng Guo, Huaisheng Zhu, Bochuan Cao, Lu Lin 0001, Jinyuan Jia 0001, Jinghui Chen, Dinghao Wu. 5475-5493 [doi]
- NICE: To Optimize In-Context Examples or Not?Pragya Srivastava, Satvik Golechha, Amit Deshpande, Amit Sharma 0007. 5494-5510 [doi]
- CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and GenerationWeixiang Yan, Haitian Liu, Yunkun Wang, Yunzhe Li, Qian Chen 0003, Wen Wang, Tingyu Lin 0002, Weishan Zhao, Li Zhu, Hari Sundaram, ShuiGuang Deng. 5511-5558 [doi]
- Digital Socrates: Evaluating LLMs through Explanation CritiquesYuling Gu, Oyvind Tafjord, Peter Clark. 5559-5586 [doi]
- SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware DecodingZhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia 0001, Bill Yuchen Lin, Radha Poovendran. 5587-5605 [doi]
- Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once?Guijin Son, Sangwon Baek, Sangdae Nam, Ilgyun Jeong, Seungone Kim. 5606-5627 [doi]
- Experiential Co-Learning of Software-Developing AgentsChen Qian, Yufan Dang, Jiahao Li, Wei Liu, Zihao Xie, Yifei Wang, Weize Chen, Cheng Yang 0002, Xin Cong, Xiaoyin Che, Zhiyuan Liu 0001, Maosong Sun 0001. 5628-5640 [doi]
- Learning Geometry-Aware Representations for New Intent DiscoveryKai Tang, Junbo Zhao 0002, Xiao Ding, Runze Wu, Lei Feng 0006, Gang Chen 0001, Haobo Wang. 5641-5654 [doi]
- Speaker Verification in Agent-generated ConversationsYizhe Yang, Palakorn Achananuparp, Heyan Huang, Jing Jiang 0001, Ee-Peng Lim. 5655-5676 [doi]
- Benchmarking Data Science AgentsYuge Zhang, Qiyang Jiang, Xingyu Han, Nan Chen, Yuqing Yang 0001, Kan Ren. 5677-5700 [doi]
- Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language ModelsTianyi Tang, Wenyang Luo, Haoyang Huang, Dongdong Zhang 0001, Xiaolei Wang, Xin Zhao 0018, Furu Wei, Ji-Rong Wen. 5701-5715 [doi]
- Forgetting before Learning: Utilizing Parametric Arithmetic for Knowledge Updating in Large Language ModelsShiwen Ni, Dingwei Chen, Chengming Li, Xiping Hu 0001, Ruifeng Xu, Min Yang 0007. 5716-5731 [doi]
- A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment TechniquesMegh Thakkar, Quentin Fournier, Matthew Riemer, Pin-Yu Chen, Amal Zouaq, Payel Das, Sarath Chandar. 5732-5745 [doi]
- Zero-Shot Cross-Domain Dialogue State Tracking via Dual Low-Rank AdaptationXiang Luo 0003, Zhiwen Tang, Jin Wang, Xuejie Zhang. 5746-5765 [doi]
- PRP-Graph: Pairwise Ranking Prompting to LLMs with Graph Aggregation for Effective Text Re-rankingJian Luo, Xuanang Chen, Ben He, Le Sun 0001. 5766-5776 [doi]
- RepCodec: A Speech Representation Codec for Speech TokenizationZhichao Huang, Chutong Meng, Tom Ko. 5777-5790 [doi]
- GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trickJiayi Fu, Xuandong Zhao, Ruihan Yang, Yuansen Zhang, Jiangjie Chen, Yanghua Xiao. 5791-5808 [doi]
- Event-Radar: Event-driven Multi-View Learning for Multimodal Fake News DetectionZihan Ma, Minnan Luo, Hao Guo, Zhi Zeng, Yiran Hao, Xiang Zhao 0002. 5809-5821 [doi]
- Fine-Grained Modeling of Narrative Context: A Coherence Perspective via Retrospective QuestionsLiyan Xu, Jiangnan Li, Mo Yu, Jie Zhou. 5822-5838 [doi]
- Stealthy Attack on Large Language Model based RecommendationJinghao Zhang, Yuting Liu, Qiang Liu 0006, Shu Wu, Guibing Guo, Liang Wang 0001. 5839-5857 [doi]
- Multi-Dimensional Optimization for Text Summarization via Reinforcement LearningSangwon Ryu, Heejin Do, Yunsu Kim 0001, Gary Lee, Jungseul Ok. 5858-5871 [doi]
- Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language ModelsChangyu Chen, Xiting Wang, Ting-En Lin, Ang Lv, Yuchuan Wu, Xin Gao 0001, Ji-Rong Wen, Rui Yan 0001, Yongbin Li. 5872-5900 [doi]
- SEER: Facilitating Structured Reasoning and Explanation via Reinforcement LearningGuoxin Chen, Kexin Tang, Chao Yang 0026, Fuying Ye, Yu Qiao 0001, Yiming Qian. 5901-5921 [doi]
- Towards Robust and Generalized Parameter-Efficient Fine-Tuning for Noisy Label LearningYeachan Kim, Junho Kim, SangKeun Lee 0001. 5922-5936 [doi]
- SparseFlow: Accelerating Transformers by Sparsifying Information FlowsYeachan Kim, SangKeun Lee 0001. 5937-5948 [doi]
- ProtT3: Protein-to-Text Generation for Text-based Protein UnderstandingZhiyuan Liu 0001, An Zhang 0003, Hao Fei 0001, Enzhi Zhang, Xiang Wang 0010, Kenji Kawaguchi, Tat-Seng Chua. 5949-5966 [doi]
- KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language ModelsZhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Wei Ye 0004, Jindong Wang 0001, Xing Xie 0001, Yue Zhang 0004, Shikun Zhang. 5967-5985 [doi]
- EmoBench: Evaluating the Emotional Intelligence of Large Language ModelsSahand Sabour, Siyang Liu 0003, Zheyuan Zhang, June M. Liu, Jinfeng Zhou, Alvionna S. Sunaryo, Tatia M. C. Lee, Rada Mihalcea, Minlie Huang. 5986-6004 [doi]
- Are AI-Generated Text Detectors Robust to Adversarial Perturbations?Guanhua Huang, Yuchen Zhang, Zhe Li, Yongjian You, Mingze Wang, Zhouwang Yang. 6005-6024 [doi]
- FinTextQA: A Dataset for Long-form Financial Question AnsweringJian Chen, Peilin Zhou, Yining Hua, Loh Xin, Kehui Chen, Ziyuan Li, Bing Zhu, Junwei Liang 0006. 6025-6047 [doi]
- On Measuring Faithfulness or Self-consistency of Natural Language ExplanationsLetitia Parcalabescu, Anette Frank. 6048-6089 [doi]
- Learning or Self-aligning? Rethinking Instruction Fine-tuningMengjie Ren, Boxi Cao, Hongyu Lin, Cao Liu, Xianpei Han, Ke Zeng, Guanglu Wan, Xunliang Cai, Le Sun 0001. 6090-6105 [doi]
- Rethinking the Bounds of LLM Reasoning: Are Multi-Agent Discussions the Key?Qineng Wang, Zihao Wang 0001, Ying Su, Hanghang Tong, Yangqiu Song. 6106-6131 [doi]
- Soft Knowledge Prompt: Help External Knowledge Become a Better Teacher to Instruct LLM in Knowledge-based VQAQunbo Wang, Ruyi Ji, Tianhao Peng, Wenjun Wu, Zechao Li, Jing Liu 0001. 6132-6143 [doi]
- TasTe: Teaching Large Language Models to Translate through Self-ReflectionYutong Wang, Jiali Zeng, Xuebo Liu 0002, Fandong Meng, Jie Zhou 0016, Min Zhang 0005. 6144-6158 [doi]
- Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language ModelsXudong Lu, Qi Liu, Yuhui Xu, Aojun Zhou, Siyuan Huang, Bo Zhang 0069, Junchi Yan, Hongsheng Li 0001. 6159-6172 [doi]
- UNIMO-G: Unified Image Generation through Multimodal Conditional DiffusionWei Li, Xue Xu, Jiachen Liu, Xinyan Xiao. 6173-6188 [doi]
- The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM AbilitiesDavid Stap, Eva Hasler, Bill Byrne, Christof Monz, Ke Tran. 6189-6206 [doi]
- Blinded by Generated Contexts: How Language Models Merge Generated and Retrieved Contexts When Knowledge Conflicts?Hexiang Tan, Fei Sun 0001, Wanli Yang, Yuanzhuo Wang, Qi Cao, Xueqi Cheng. 6207-6227 [doi]
- Unveiling Linguistic Regions in Large Language ModelsZhihao Zhang, Jun Zhao 0019, Qi Zhang 0001, Tao Gui, Xuanjing Huang 0001. 6228-6247 [doi]
- Text-to-Song: Towards Controllable Music Generation Incorporating Vocal and AccompanimentZhiqing Hong, Rongjie Huang, Xize Cheng, Yongqi Wang, Ruiqi Li, Fuming You, Zhou Zhao, Zhimeng Zhang. 6248-6261 [doi]
- FastFiD: Improve Inference Efficiency of Open Domain Question Answering via Sentence SelectionYufei Huang 0008, Xu Han 0007, Maosong Sun 0001. 6262-6276 [doi]
- Discursive Socratic Questioning: Evaluating the Faithfulness of Language Models' Understanding of Discourse RelationsYisong Miao, Hongfu Liu 0002, Wenqiang Lei, Nancy F. Chen, Min-Yen Kan. 6277-6295 [doi]
- An Open Multilingual System for Scoring Readability of WikipediaMykola Trokhymovych, Indira Sen, Martin Gerlach. 6296-6311 [doi]
- Unlearning Traces the Influential Training Data of Language ModelsMasaru Isonuma, Ivan Titov. 6312-6325 [doi]
- Exploring Alignment in Shared Cross-lingual SpacesBasel Mousi, Nadir Durrani, Fahim Dalvi, Majd Hawasly, Ahmed Abdelali. 6326-6348 [doi]
- Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language ModelsWenxuan Wang 0001, Wenxiang Jiao, Jingyuan Huang, Ruyi Dai, Jen-tse Huang 0001, Zhaopeng Tu, Michael R. Lyu. 6349-6384 [doi]
- Self-Evolving GPT: A Lifelong Autonomous Experiential LearnerJinglong Gao, Xiao Ding, Yiming Cui, Jianbai Zhao, Hepeng Wang, Ting Liu 0001, Bing Qin 0001. 6385-6432 [doi]
- WRP: Weight Recover Prune for Structured SparsityZhenDong Tan, Xingjun Zhang, Zheng Wei. 6433-6443 [doi]
- Error-preserving Automatic Speech Recognition of Young English Learners' LanguageJanick Michot, Manuela Hürlimann, Jan Deriu, Luzia Sauer, Katsiaryna Mlynchyk, Mark Cieliebak. 6444-6454 [doi]
- DiFiNet: Boundary-Aware Semantic Differentiation and Filtration Network for Nested Named Entity RecognitionYuxiang Cai, Qiao Liu 0003, Yanglei Gan, Run Lin, Changlin Li, Xueyi Liu, Da Luo, Jiaye Yang. 6455-6471 [doi]
- Legal Case Retrieval: A Survey of the State of the ArtYi Feng 0005, Chuanyi Li, Vincent Ng 0001. 6472-6485 [doi]
- Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text GenerationTianqi Zhong, Zhaoyi Li, Quan Wang 0002, Linqi Song, Ying Wei 0001, Defu Lian, Zhendong Mao. 6486-6517 [doi]
- LLaMA Pro: Progressive LLaMA with Block ExpansionChengyue Wu, Yukang Gan, Yixiao Ge, Zeyu Lu, Jiahao Wang, Ye Feng, Ying Shan, Ping Luo. 6518-6537 [doi]
- Generating Contrastive Narratives Using the Brownian Bridge Process for Narrative Coherence LearningFeiteng Mu, Wenjie Li 0002. 6538-6555 [doi]
- A Causal Approach for Counterfactual Reasoning in NarrativesFeiteng Mu, Wenjie Li 0002. 6556-6569 [doi]
- SIP: Injecting a Structural Inductive Bias into a Seq2Seq Model by SimulationMatthias Lindemann, Alexander Koller, Ivan Titov. 6570-6587 [doi]
- The Hidden Space of Transformer Language AdaptersJesujoba Alabi, Marius Mosbach, Matan Eyal, Dietrich Klakow, Mor Geva. 6588-6607 [doi]
- A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated TextsNafis Irtiza Tripto, Saranya Venkatraman, Dominik Macko, Róbert Móro, Ivan Srba, Adaku Uchendu, Thai Le, Dongwon Lee 0001. 6608-6625 [doi]
- Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken ConversationsGuan-Ting Lin, Cheng-Han Chiang, Hung-yi Lee. 6626-6642 [doi]
- RetinaQA: A Robust Knowledge Base Question Answering Model for both Answerable and Unanswerable QuestionsPrayushi Faldu, Indrajit Bhattacharya, Mausam. 6643-6656 [doi]
- GroundingGPT: Language Enhanced Multi-modal Grounding ModelZhaowei Li, Qi Xu, Dong Zhang, Hang Song, Yiqing Cai, Qi Qi, Ran Zhou, Junting Pan, Zefeng Li, Vu Tu, Zhida Huang, Tao Wang. 6657-6678 [doi]
- Automated Justification Production for Claim Veracity in Fact Checking: A Survey on Architectures and ApproachesIslam Eldifrawi, Shengrui Wang, Amine Trabelsi. 6679-6692 [doi]
- Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen LanguagesCarlos Mullov, Quan Pham, Alexander Waibel. 6693-6709 [doi]
- SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory BudgetRui Kong, Yuanchun Li, Qingtian Feng, Weijun Wang, Xiaozhou Ye, Ye Ouyang, Linghe Kong, Yunxin Liu. 6710-6720 [doi]
- PixT3: Pixel-based Table-To-Text GenerationIñigo Alonso, Eneko Agirre, Mirella Lapata. 6721-6736 [doi]
- Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity AnswersGal Yona, Roee Aharoni, Mor Geva. 6737-6751 [doi]
- TAMS: Translation-Assisted Morphological SegmentationEnora Rice, Ali Marashian, Luke Gessler, Alexis Palmer, Katharina von der Wense. 6752-6765 [doi]
- XCodeEval: An Execution-based Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and RetrievalMohammad Abdullah Matin Khan, M. Saiful Bari, Xuan Do Long, Weishi Wang, Md. Rizwan Parvez, Shafiq Joty. 6766-6805 [doi]
- ProxyQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language ModelsHaochen Tan, Zhijiang Guo, Zhan Shi, Lu Xu, Zhili Liu, Yunlong Feng, Xiaoguang Li, Yasheng Wang, Lifeng Shang, Qun Liu, Linqi Song. 6806-6827 [doi]
- A Glitch in the Matrix? Locating and Detecting Language Model Grounding with FakepediaGiovanni Monea, Maxime Peyrard, Martin Josifoski, Vishrav Chaudhary, Jason Eisner, Emre Kiciman, Hamid Palangi, Barun Patra, Robert West 0001. 6828-6844 [doi]
- Muffin or Chihuahua? Challenging Multimodal Large Language Models with Multipanel VQAYue Fan, Jing Gu, Kaiwen Zhou, Qianqi Yan, Shan Jiang, Ching-Chen Kuo, Yang Zhao, Xinze Guan, Xin Wang 0061. 6845-6863 [doi]
- WebVoyager: Building an End-to-End Web Agent with Large Multimodal ModelsHongliang He, Wenlin Yao, Kaixin Ma, Wenhao Yu 0011, Yong Dai, Hongming Zhang 0009, Zhenzhong Lan, Dong Yu 0001. 6864-6890 [doi]
- Translation-based Lexicalization Generation and Lexical Gap Detection: Application to Kinship TermsSenyu Li, Bradley Hauer, Ning Shi, Grzegorz Kondrak. 6891-6900 [doi]
- Leveraging Machine-Generated Rationales to Facilitate Social Meaning Detection in ConversationsRitam Dutt, Zhen Wu, Jiaxin Shi, Divyanshu Sheth, Prakhar Gupta, Carolyn P. Rosé. 6901-6929 [doi]
- Robust Frame-Semantic Models with Lexical Unit Trees and Negative SamplesJacob Daniel Devasier, Yogesh Gurjar, Chengkai Li 0001. 6930-6941 [doi]
- Harnessing the Power of Large Language Models for Natural Language to First-Order Logic TranslationYuan Yang, Siheng Xiong, Ali Payani, Ehsan Shareghi, Faramarz Fekri. 6942-6959 [doi]
- Lightweight reranking for language model generationsSiddhartha Jain 0001, Xiaofei Ma 0001, Anoop Deoras, Bing Xiang. 6960-6984 [doi]
- ARIES: A Corpus of Scientific Paper Edits Made in Response to Peer ReviewsMike D'Arcy, Alexis Ross, Erin Bransom, Bailey Kuehl, Jonathan Bragg, Tom Hope, Doug Downey. 6985-7001 [doi]
- The Unreasonable Effectiveness of Easy Training Data for Hard TasksPeter Hase, Mohit Bansal, Peter Clark, Sarah Wiegreffe. 7002-7024 [doi]
- PLUG: Leveraging Pivot Language in Cross-Lingual Instruction TuningZhihan Zhang, Dong-Ho Lee, Yuwei Fang, Wenhao Yu 0002, Mengzhao Jia, Meng Jiang 0001, Francesco Barbieri. 7025-7046 [doi]
- MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense ReasoningInderjeet Nair, Lu Wang 0008. 7047-7065 [doi]
- ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMsJustin Chih-Yao Chen, Swarnadeep Saha, Mohit Bansal. 7066-7085 [doi]
- Mirror: Multiple-perspective Self-Reflection Method for Knowledge-rich ReasoningHanqi Yan, Qinglin Zhu, Xinyu Wang, Lin Gui 0003, Yulan He 0001. 7086-7103 [doi]
- Where Do People Tell Stories Online? Story Detection Across Online CommunitiesMaria Antoniak, Joel Mire, Maarten Sap, Elliott Ash, Andrew Piper. 7104-7130 [doi]
- Large Language Models Are No Longer Shallow ParsersYuanhe Tian, Fei Xia, Yan Song. 7131-7142 [doi]
- Dialogue Summarization with Mixture of Experts based on Large Language ModelsYuanhe Tian, Fei Xia, Yan Song. 7143-7155 [doi]
- ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human PreferencesYuanhe Tian, Ruyi Gan, Yan Song 0004, Jiaxing Zhang 0001, Yongdong Zhang 0001. 7156-7173 [doi]
- An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMsDaking Rai, Ziyu Yao. 7174-7193 [doi]
- Leveraging Large Language Models for Learning Complex Legal Concepts through StorytellingHang Jiang, Xiajie Zhang, Robert Mahari, Daniel Kessler, Eric Ma, Tal August, Irene Li, Alex Pentland, Yoon Kim, Deb Roy, Jad Kabbara. 7194-7219 [doi]
- Intrinsic Task-based Evaluation for Referring Expression GenerationGuanyi Chen, Fahime Same, Kees van Deemter. 7220-7231 [doi]
- From Moments to Milestones: Incremental Timeline Summarization Leveraging Large Language ModelsQisheng Hu, Geonsik Moon, Hwee Tou Ng. 7232-7246 [doi]
- End-to-end Learning of Logical Rules for Enhancing Document-level Relation ExtractionKunxun Qi, Jianfeng Du, Hai Wan. 7247-7263 [doi]
- Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data?Qingkai Fang, Shaolei Zhang, Zhengrui Ma, Min Zhang, Yang Feng 0004. 7264-7277 [doi]
- Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked AutoencoderJiaqi Wang, Zhenxi Song, Zhengyu Ma, Xipeng Qiu, Min Zhang, Zhiguo Zhang. 7278-7292 [doi]
- CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent LayersLongwei Zou, Qingyang Wang, Han Zhao, Jiangang Kong, Yi Yang, Yangdong Deng. 7293-7307 [doi]
- Prompt Optimization via Adversarial In-Context LearningDo Long, Yiran Zhao, Hannah Brown, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Shieh, Junxian He. 7308-7327 [doi]
- StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice ConversionZhichao Wang 0002, Yuanzhe Chen, Xinsheng Wang, Lei Xie 0001, Yuping Wang. 7328-7338 [doi]
- Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question AnsweringZhengliang Shi, Shuo Zhang 0006, Weiwei Sun 0001, Shen Gao, Pengjie Ren, Zhumin Chen, Zhaochun Ren. 7339-7353 [doi]
- Multimodal Contextualized Semantic Parsing from SpeechJordan Voas, David Harwath, Raymond Mooney. 7354-7369 [doi]
- LaMP: When Large Language Models Meet PersonalizationAlireza Salemi, Sheshera Mysore, Michael Bendersky, Hamed Zamani. 7370-7392 [doi]
- AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data FiltersLi Lucy, Suchin Gururangan, Luca Soldaini, Emma Strubell, David Bamman, Lauren Klein, Jesse Dodge. 7393-7420 [doi]
- MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn DialoguesGe Bai, Jie Liu, Xingyuan Bu, Yancheng He, Jiaheng Liu, Zhanhui Zhou, Zhuoran Lin, Wenbo Su, Tiezheng Ge, Bo Zheng 0007, Wanli Ouyang. 7421-7454 [doi]
- EFSA: Towards Event-Level Financial Sentiment AnalysisTianyu Chen, Yiming Zhang, Guoxin Yu, Dapeng Zhang, Li Zeng, Qing He 0003, Xiang Ao 0001. 7455-7467 [doi]
- What Evidence Do Language Models Find Convincing?Alexander Wan, Eric Wallace, Dan Klein. 7468-7484 [doi]
- Advancement in Graph Understanding: A Multimodal Benchmark and Fine-Tuning of Vision-Language ModelsQihang Ai, Jiafan Li, Jincheng Dai, Jianwu Zhou, Lemao Liu, Haiyun Jiang, Shuming Shi 0001. 7485-7501 [doi]
- LangBridge: Multilingual Reasoning Without Multilingual SupervisionDongkeun Yoon, Joel Jang, Sungdong Kim, Seungone Kim, Sheikh Shafayat, Minjoon Seo. 7502-7522 [doi]
- Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMsSiyuan Wang, Zhongyu Wei, Yejin Choi 0001, Xiang Ren 0001. 7523-7543 [doi]
- SEGO: Sequential Subgoal Optimization for Mathematical Problem-SolvingXueliang Zhao, Xinting Huang, Wei Bi, Lingpeng Kong. 7544-7565 [doi]
- Unlocking the Power of Large Language Models for Entity AlignmentXuhui Jiang, Yinghan Shen, ZhiChao Shi, Chengjin Xu, Wei Li, Zixuan Li, Jian Guo, Huawei Shen, Yuanzhuo Wang. 7566-7583 [doi]
- Trial and Error: Exploration-Based Trajectory Optimization of LLM AgentsYifan Song, Da Yin, Xiang Yue, Jie Huang, Sujian Li, Bill Yuchen Lin. 7584-7600 [doi]
- ReFT: Reasoning with Reinforced Fine-TuningLuong Quoc Trung, Xinbo Zhang, Zhanming Jie, Peng Sun, Xiaoran Jin, Hang Li. 7601-7614 [doi]
- Cognitive Visual-Language Mapper: Advancing Multimodal Comprehension with Enhanced Visual Knowledge AlignmentYunxin Li, Xinyu Chen, Baotian Hu, Haoyuan Shi, Min Zhang. 7615-7626 [doi]
- FreeCtrl: Constructing Control Centers with Feedforward Layers for Learning-Free Controllable Text GenerationZijian Feng, Hanzhang Zhou, Kezhi Mao, Zixiao Zhu. 7627-7640 [doi]
- HD-Eval: Aligning Large Language Model Evaluators Through Hierarchical Criteria DecompositionYuxuan Liu, Tianchi Yang, Shaohan Huang, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun 0008, Qi Zhang 0066. 7641-7660 [doi]
- Conundrums in Cross-Prompt Automated Essay Scoring: Making Sense of the State of the ArtShengjie Li 0002, Vincent Ng 0001. 7661-7681 [doi]
- Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion AttributionFlor Miriam Plaza del Arco, Amanda Cercas Curry, Alba Cercas Curry, Gavin Abercrombie, Dirk Hovy. 7682-7696 [doi]
- Label Augmentation for Zero-Shot Hierarchical Text ClassificationLorenzo Paletto, Valerio Basile, Roberto Esposito. 7697-7706 [doi]
- STICKERCONV: Generating Multimodal Empathetic Responses from ScratchYiqun Zhang, Fanheng Kong, Peidong Wang, Shuang Sun, Lingshuai Wang, Shi Feng 0001, Daling Wang, Yifei Zhang 0003, Kaisong Song. 7707-7733 [doi]
- EIT: Enhanced Interactive TransformerTong Zheng, Bei Li, Huiwen Bao, Tong Xiao, Jingbo Zhu. 7734-7751 [doi]
- MARS: Meaning-Aware Response Scoring for Uncertainty Estimation in Generative LLMsYavuz Faruk Bakman, Duygu Nur Yaldiz, Baturalp Buyukates, Chenyang Tao, Dimitrios Dimitriadis, Salman Avestimehr. 7752-7767 [doi]
- EXAMS-V: A Multi-Discipline Multilingual Multimodal Exam Benchmark for Evaluating Vision Language ModelsRocktim Jyoti Das, Simeon Emilov Hristov, Haonan Li 0002, Dimitar Dimitrov, Ivan Koychev, Preslav Nakov. 7768-7791 [doi]
- Order-Agnostic Data Augmentation for Few-Shot Named Entity RecognitionHuiming Wang, LiYing Cheng, Wenxuan Zhang, De Wen Soh, Lidong Bing. 7792-7807 [doi]
- Text Embedding Inversion Security for Multilingual Language ModelsYiyi Chen 0002, Heather C. Lent, Johannes Bjerva. 7808-7827 [doi]
- Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-AlignmentKeming Lu, Bowen Yu 0002, Chang Zhou, Jingren Zhou. 7828-7840 [doi]
- PlatoLM: Teaching LLMs in Multi-Round Dialogue via a User SimulatorChuyi Kong, Yaxin Fan, Xiang Wan, Feng Jiang, Benyou Wang. 7841-7863 [doi]
- Synthesizing Text-to-SQL Data from Weak and Strong LLMsJiaxi Yang, Binyuan Hui, Min Yang, Jian Yang 0003, Junyang Lin, Chang Zhou. 7864-7875 [doi]
- STRUCTSUM Generation for Faster Text ComprehensionParag Jain, Andreea Marzoca, Francesco Piccinno. 7876-7896 [doi]
- Analysing The Impact of Sequence Composition on Language Model Pre-TrainingYu Zhao, Yuanbin Qu, Konrad Staniszewski, Szymon Tworkowski, Wei Liu, Piotr Milos, Yuxiang Wu, Pasquale Minervini. 7897-7912 [doi]
- NACL: A General and Effective KV Cache Eviction Framework for LLM at Inference TimeYilong Chen, Guoxia Wang, Junyuan Shang, Shiyao Cui, Zhenyu Zhang 0006, Tingwen Liu, Shuohuan Wang, Yu Sun, Dianhai Yu, Hua Wu 0003. 7913-7926 [doi]
- SpikeVoice: High-Quality Text-to-Speech Via Efficient Spiking Neural NetworkKexin Wang, Jiahong Zhang, Yong Ren, Man Yao, Di Shang, Bo Xu 0002, Guoqi Li. 7927-7940 [doi]
- Context-aware Difference Distilling for Multi-change CaptioningYunbin Tu, Liang Li 0003, Li Su 0003, Zheng-Jun Zha, Chenggang Yan 0001, Qingming Huang. 7941-7956 [doi]
- Dataflow-Guided Retrieval Augmentation for Repository-Level Code CompletionWei Cheng, Yuhan Wu, Wei Hu. 7957-7977 [doi]
- Chain-of-Exemplar: Enhancing Distractor Generation for Multimodal Educational Question GenerationHaohao Luo, Yang Deng 0002, Ying Shen 0001, See-Kiong Ng, Tat-Seng Chua. 7978-7993 [doi]
- LLMEmbed: Rethinking Lightweight LLM's Genuine Function in Text ClassificationChun Liu, Hongguang Zhang, Kainan Zhao, Xinghai Ju, Lin Yang. 7994-8004 [doi]
- LEMON: Reviving Stronger and Smaller LMs from Larger LMs with Linear Parameter FusionYilong Chen, Junyuan Shang, Zhenyu Zhang 0006, Shiyao Cui, Tingwen Liu, Shuohuan Wang, Yu Sun 0029, Hua Wu 0003. 8005-8019 [doi]
- Speech Sense Disambiguation: Tackling Homophone Ambiguity in End-to-End Speech TranslationTengfei Yu, Xuebo Liu 0002, Liang Ding 0006, Kehai Chen, Dacheng Tao, Min Zhang 0005. 8020-8035 [doi]
- To be Continuous, or to be Discrete, Those are Bits of QuestionsYiran Wang 0006, Masao Utiyama. 8036-8049 [doi]
- Moûsai: Efficient Text-to-Music Diffusion ModelsFlavio Schneider, Ojasv Kamal, Zhijing Jin, Bernhard Schölkopf. 8050-8068 [doi]
- PokeMQA: Programmable knowledge editing for Multi-hop Question AnsweringHengrui Gu 0002, Kaixiong Zhou, Xiaotian Han, Ninghao Liu, Ruobing Wang 0003, Xin Wang 0035. 8069-8083 [doi]
- MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme InterventionPrince Jha, Raghav Jain, Konika Mandal, Aman Chadha, Sriparna Saha 0001, Pushpak Bhattacharyya. 8084-8104 [doi]
- Efficient OCR for Building a Diverse Digital HistoryJacob Carlson, Tom Bryan, Melissa Dell. 8105-8115 [doi]
- Acquiring Clean Language Models from Backdoor Poisoned Datasets by Downscaling Frequency SpaceZongru Wu, Zhuosheng Zhang 0001, Pengzhou Cheng, Gongshen Liu. 8116-8134 [doi]
- ANAH: Analytical Annotation of Hallucinations in Large Language ModelsZiwei Ji, Yuzhe Gu, Wenwei Zhang, Chengqi Lyu, Dahua Lin, Kai Chen 0026. 8135-8158 [doi]
- Aligning Large Language Models for Controllable RecommendationsWensheng Lu, Jianxun Lian, Wei Zhang, Guanghua Li, Mingyang Zhou 0001, Hao Liao, Xing Xie. 8159-8172 [doi]
- Revealing the Parametric Knowledge of Language Models: A Unified Framework for Attribution MethodsHaeun Yu, Pepa Atanasova, Isabelle Augenstein. 8173-8186 [doi]
- Full Parameter Fine-tuning for Large Language Models with Limited ResourcesKai Lv, Yuqing Yang 0004, Tengxiao Liu, Qipeng Guo, Xipeng Qiu. 8187-8198 [doi]
- M³CoT: A Novel Benchmark for Multi-Domain Multi-step Multi-modal Chain-of-ThoughtQiguang Chen, Libo Qin 0001, Jin Zhang, Zhi Chen 0006, Xiao Xu 0005, Wanxiang Che. 8199-8221 [doi]
- Long Context is Not Long at All: A Prospector of Long-Dependency Data for Large Language ModelsLongze Chen, Ziqiang Liu, Wanwei He, Yinhe Zheng, Hao Sun, Yunshui Li, Run Luo, Min Yang. 8222-8234 [doi]
- Label-Synchronous Neural Transducer for E2E Simultaneous Speech TranslationKeqi Deng, Philip C. Woodland. 8235-8251 [doi]
- Hard Prompts Made Interpretable: Sparse Entropy Regularization for Prompt Tuning with RLYunseon Choi, Sangmin Bae, Seonghyun Ban, Minchan Jeong, Chuheng Zhang, Lei Song, Li Zhao 0007, Jiang Bian 0002, Kee-Eung Kim. 8252-8271 [doi]
- A Modular Approach for Multimodal Summarization of TV ShowsLouis Mahon, Mirella Lapata. 8272-8291 [doi]
- Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind CapabilitiesAlex Wilf, Sihyun Shawn Lee, Paul Pu Liang, Louis-Philippe Morency. 8292-8308 [doi]
- BizBench: A Quantitative Reasoning Benchmark for Business and FinanceMichael Krumdick, Rik Koncel-Kedziorski, Viet Dac Lai, Varshini Reddy, Charles Lovering, Chris Tanner. 8309-8332 [doi]
- Direct Metric Optimization for Image Captioning through Reward-Weighted Augmented Data UtilizationTakumi Takada, Yuma Suzuki, Hiroki Takushima, Hayato Tanoue, Haruki Sato, Aiswariya Manoj Kumar, Hiroki Nishihara, Takayuki Hori, Kazuya Ueki. 8333-8346 [doi]
- Deciphering Hate: Identifying Hateful Memes and Their TargetsEftekhar Hossain, Omar Sharif, Mohammed Moshiul Hoque, Sarah Masud Preum. 8347-8359 [doi]
- Inducing Systematicity in Transformers by Attending to Structurally Quantized EmbeddingsYichen Jiang, Xiang Zhou, Mohit Bansal. 8360-8383 [doi]
- Label-Efficient Model Selection for Text GenerationShir Ashury-Tahan, Ariel Gera, Benjamin Sznajder, Leshem Choshen, Liat Ein-Dor, Eyal Shnarch. 8384-8402 [doi]
- Machine Unlearning of Pre-trained Large Language ModelsJin Yao, Eli Chien, Minxin Du, Xinyao Niu, Tianhao Wang 0001, Zezhou Cheng, Xiang Yue. 8403-8419 [doi]
- Competition of Mechanisms: Tracing How Language Models Handle Facts and CounterfactualsFrancesco Ortu, Zhijing Jin, Diego Doimo, Mrinmaya Sachan, Alberto Cazzaniga, Bernhard Schölkopf. 8420-8436 [doi]
- FactPICO: Factuality Evaluation for Plain Language Summarization of Medical EvidenceSebastian Joseph, Lily Chen, Jan Trienes, Hannah Louisa Göke, Monika Coers, Wei Xu 0004, Byron C. Wallace, Junyi Jessy Li. 8437-8464 [doi]
- BvSP: Broad-view Soft Prompting for Few-Shot Aspect Sentiment Quad PredictionYinhao Bai, Yalan Xie, Xiaoyi Liu, Yuhua Zhao, Zhixin Han, Mengting Hu, Hang Gao 0003, Renhong Cheng. 8465-8482 [doi]
- Safety Alignment in NLP Tasks: Weakly Aligned Summarization as an In-Context AttackYu Fu, Yufei Li, Wen Xiao, Cong Liu 0005, Yue Dong 0002. 8483-8502 [doi]
- Speech language models lack important brain-relevant semanticsSubba Reddy Oota, Emin Çelik, Fatma Deniz, Mariya Toneva. 8503-8528 [doi]
- DocLLM: A Layout-Aware Generative Language Model for Multimodal Document UnderstandingDongsheng Wang 0005, Natraj Raman, Mathieu Sibue, Zhiqiang Ma, Petr Babkin, Simerjot Kaur, Yulong Pei, Armineh Nourbakhsh, Xiaomo Liu. 8529-8548 [doi]
- Bypassing LLM Watermarks with Color-Aware SubstitutionsQilong Wu 0002, Varun Chandrasekaran. 8549-8581 [doi]
- Parallel Structures in Pre-training Data Yield In-Context LearningYanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He 0001. 8582-8592 [doi]
- OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language ModelsHainiu Xu, Runcong Zhao, Lixing Zhu, Jinhua Du, Yulan He 0001. 8593-8623 [doi]
- Towards Privacy-Aware Sign Language Translation at ScalePhillip Rust, Bowen Shi, Skyler Wang, Necati Cihan Camgöz, Jean Maillard. 8624-8641 [doi]
- Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective RewardsHaoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang. 8642-8655 [doi]
- Towards Real-World Writing Assistance: A Chinese Character Checking Benchmark with Faked and Misspelled CharactersYinghui Li, Zishan Xu, Shaoshen Chen, Haojing Huang, Yangning Li, Shirong Ma, Yong Jiang 0001, Zhongli Li, Qingyu Zhou, Hai-Tao Zheng 0002, Ying Shen 0001. 8656-8668 [doi]
- RAVEL: Evaluating Interpretability Methods on Disentangling Language Model RepresentationsJing Huang 0020, Zhengxuan Wu, Christopher Potts, Mor Geva, Atticus Geiger. 8669-8687 [doi]
- Large Language Models as Zero-shot Dialogue State Tracker through Function CallingZekun Li 0008, Zhiyu Chen 0002, Mike Ross, Patrick Huber, Seungwhan Moon, Zhaojiang Lin, Xin Dong 0001, Adithya Sagar, Xifeng Yan, Paul A. Crook. 8688-8704 [doi]
- Faithful Chart Summarization with ChaTS-PiSyrine Krichene, Francesco Piccinno, Fangyu Liu 0001, Julian Eisenschlos. 8705-8723 [doi]
- Enhancing Dialogue State Tracking Models through LLM-backed User-Agents SimulationCheng Niu, Xingguang Wang, Xuxin Cheng, Juntong Song, Tong Zhang 0001. 8724-8741 [doi]
- MetaSumPerceiver: Multimodal Multi-Document Evidence Summarization for Fact-CheckingTing-Chih Chen, Chia-Wei Tang, Chris Thomas. 8742-8757 [doi]
- KnowCoder: Coding Structured Knowledge into LLMs for Universal Information ExtractionZixuan Li, Yutao Zeng, Yuxin Zuo, Weicheng Ren, Wenxuan Liu, Miao Su, Yucan Guo, Yantao Liu, Xiang Li, Zhilei Hu, Long Bai 0002, Wei Li 0176, Yidan Liu, Pan Yang, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng. 8758-8779 [doi]
- ERA-CoT: Improving Chain-of-Thought through Entity Relationship AnalysisYanming Liu, Xinyue Peng, Tianyu Du, Jianwei Yin, Weihao Liu, Xuhong Zhang 0002. 8780-8794 [doi]
- On the Multi-turn Instruction Following for Conversational Web AgentsYang Deng 0002, Xuan Zhang, Wenxuan Zhang, Yifei Yuan 0002, See-Kiong Ng, Tat-Seng Chua. 8795-8812 [doi]
- Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile AgentsShihan Deng, Weikai Xu, Hongda Sun, Wei Liu 0005, Tao Tan, Jianfeng Liu 0005, Ang Li, Jian Luan 0001, Bin Wang 0004, Rui Yan 0001, Shuo Shang. 8813-8831 [doi]
- MC²: Towards Transparent and Culturally-Aware NLP for Minority Languages in ChinaChen Zhang 0019, Mingxu Tao, Quzhe Huang, Jiuheng Lin, Zhibin Chen, Yansong Feng. 8832-8850 [doi]
- Decoder-only Streaming Transformer for Simultaneous TranslationShoutao Guo, Shaolei Zhang, Yang Feng 0004. 8851-8864 [doi]
- Defending Large Language Models Against Jailbreaking Attacks Through Goal PrioritizationZhexin Zhang, Junxiao Yang, Pei Ke, Fei Mi, Hongning Wang, Minlie Huang. 8865-8887 [doi]
- I am a Strange Dataset: Metalinguistic Tests for Language ModelsTristan Thrush, Jared Moore, Miguel Monares, Christopher Potts, Douwe Kiela. 8888-8907 [doi]
- TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful SpaceShaolei Zhang, Tian Yu, Yang Feng 0004. 8908-8949 [doi]
- ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-TrainingLe Zhuo, Zewen Chi, Minghao Xu, Heyan Huang, Jianan Zhao 0002, Heqi Zheng, Conghui He, Xian-Ling Mao, Wentao Zhang. 8950-8963 [doi]
- StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task LearningShaolei Zhang, Qingkai Fang, Shoutao Guo, Zhengrui Ma, Min Zhang, Yang Feng 0004. 8964-8986 [doi]
- Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large Language ModelsTianjie Ju, Yijin Chen, Xinwei Yuan, Zhuosheng Zhang 0001, Wei Du, Yubin Zheng, Gongshen Liu. 8987-9001 [doi]
- Why Don't Prompt-Based Fairness Metrics Correlate?Abdelrahman Zayed, Gonçalo Mordido, Ioana Baldini, Sarath Chandar. 9002-9019 [doi]
- NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative DataManuel Tonneau, Pedro Vitor Quinta de Castro, Karim Lasri, Ibrahim Farouq, Lakshmi Subramanian, Víctor Orozco-Olvera, Samuel Fraiberger. 9020-9040 [doi]
- M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture DatasetZhe Chen, Heyang Liu, Wenyi Yu, Guangzhi Sun, Hongcheng Liu, Ji Wu, Chao Zhang, Yu Wang, Yanfeng Wang. 9041-9060 [doi]
- Mitigating Biases for Instruction-following Language Models via Bias Neurons EliminationNakyeong Yang, Taegwan Kang, Stanley Jungkyu Choi, Honglak Lee, Kyomin Jung. 9061-9073 [doi]
- Domain Adaptation for Subjective Induction Questions Answering on Products by Adversarial Disentangled LearningYufeng Zhang, Jianxing Yu, Yanghui Rao, Libin Zheng, Qinliang Su, Huaijie Zhu, Jian Yin 0001. 9074-9089 [doi]
- Revisiting Demonstration Selection Strategies in In-Context LearningKeqin Peng, Liang Ding 0006, Yancheng Yuan, Xuebo Liu 0002, Min Zhang 0005, Yuanxin Ouyang, Dacheng Tao. 9090-9101 [doi]
- Multimodal Table UnderstandingMingyu Zheng, Xinwei Feng, Qingyi Si, Qiaoqiao She, Zheng Lin 0001, Wenbin Jiang, Weiping Wang 0005. 9102-9124 [doi]
- Ex3: Automatic Novel Writing by Extracting, Excelsior and ExpandingHuang Lei, Jiaming Guo, Guanhua He, Xishan Zhang, Rui Zhang 0040, Shaohui Peng, Shaoli Liu, Tianshi Chen 0002. 9125-9146 [doi]
- Few-shot Transfer Learning for Knowledge Base Question Answering: Fusing Supervised Models with In-Context LearningMayur Patidar, Riya Sawhney, Avinash Kumar Singh, Biswajit Chatterjee, Mausam, Indrajit Bhattacharya. 9147-9165 [doi]
- WatME: Towards Lossless Watermarking Through Lexical RedundancyLiang Chen 0001, Yatao Bian, Yang Deng 0002, Deng Cai 0002, Shuaiyi Li, Peilin Zhao, Kam-Fai Wong. 9166-9180 [doi]
- Text-like Encoding of Collaborative Information in Large Language Models for RecommendationYang Zhang 0072, Keqin Bao, Ming Yan, Wenjie Wang 0007, Fuli Feng, Xiangnan He 0001. 9181-9191 [doi]
- MM-SAP: A Comprehensive Benchmark for Assessing Self-Awareness of Multimodal Large Language Models in PerceptionYuhao Wang, Yusheng Liao, Heyang Liu, Hongcheng Liu, Yanfeng Wang, Yu Wang. 9192-9205 [doi]
- Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense ReasoningJiachun Li, Pengfei Cao, Chenhao Wang, Zhuoran Jin, Yubo Chen 0001, Daojian Zeng, Kang Liu 0001, Jun Zhao 0001. 9206-9230 [doi]
- Multi-Aspect Controllable Text Generation with Disentangled Counterfactual AugmentationYi Liu, Xiangyu Liu, Xiangrong Zhu, Wei Hu. 9231-9253 [doi]
- Reward-based Input Construction for Cross-document Relation ExtractionByeonghu Na, Suhyeon Jo, Yeongmin Kim, Il-Chul Moon. 9254-9270 [doi]
- Hyperspherical Multi-Prototype with Optimal Transport for Event Argument ExtractionGuangjun Zhang, Hu Zhang 0003, Yujie Wang 0003, Ru Li 0001, Hongye Tan, Jiye Liang. 9271-9284 [doi]
- Understanding Retrieval Robustness for Retrieval-augmented Image CaptioningWenyan Li, Jiaang Li, Rita Ramos, Raphael Tang, Desmond Elliott. 9285-9299 [doi]
- Semi-Supervised Spoken Language GlossificationHuijie Yao, Wengang Zhou, Hao Zhou, Houqiang Li. 9300-9312 [doi]
- SeeClick: Harnessing GUI Grounding for Advanced Visual GUI AgentsKanzhi Cheng, Qiushi Sun, Yougang Chu, Fangzhi Xu, Yantao Li 0003, Jianbing Zhang, Zhiyong Wu 0003. 9313-9332 [doi]
- InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated AnswersYakir Yehuda, Itzik Malkiel, Oren Barkan, Jonathan Weill, Royi Ronen, Noam Koenigstein. 9333-9347 [doi]
- F-Eval: Assessing Fundamental Abilities with Refined Evaluation MethodsYu Sun, Keyu Chen, Shujie Wang, Peiji Li, Qipeng Guo, Hang Yan 0001, Xipeng Qiu, Xuanjing Huang 0001, Dahua Lin. 9348-9369 [doi]
- Comparing Inferential Strategies of Humans and Large Language Models in Deductive ReasoningPhilipp Mondorf, Barbara Plank. 9370-9402 [doi]
- Whose Preferences? Differences in Fairness Preferences and Their Impact on the Fairness of AI Utilizing Human FeedbackMaria Lerner, Florian E. Dorner, Elliott Ash, Naman Goel. 9403-9425 [doi]
- Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human AnnotationsPeiyi Wang, Lei Li, Zhihong Shao, Runxin Xu, Damai Dai, Yifei Li, Deli Chen, Yu Wu, Zhifang Sui. 9426-9439 [doi]
- Large Language Models are not Fair EvaluatorsPeiyi Wang, Lei Li, Liang Chen, Zefan Cai, Dawei Zhu, Binghuai Lin, Yunbo Cao, Lingpeng Kong, Qi Liu, Tianyu Liu, Zhifang Sui. 9440-9450 [doi]
- Improving Large Language Models in Event Relation Logical PredictionMeiqi Chen 0001, Yubo Ma, Kaitao Song, Yixin Cao 0002, Yan Zhang, Dongsheng Li 0002. 9451-9478 [doi]
- Synchronized Video Storytelling: Generating Video Narrations with Structured StorylineDingyi Yang, Chunru Zhan, Ziheng Wang, Biao Wang, Tiezheng Ge, Bo Zheng 0007, Qin Jin. 9479-9493 [doi]
- Fine-Grained Image-Text Alignment in Medical Imaging Enables Explainable Cyclic Image-Report GenerationWenting Chen, LinLin Shen, Jingyang Lin, Jiebo Luo, Xiang Li 0001, Yixuan Yuan. 9494-9509 [doi]
- T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by StepZehui Chen, Weihua Du, Wenwei Zhang, Kuikun Liu, Jiangning Liu, Miao Zheng, Jingming Zhuo, Songyang Zhang, Dahua Lin, Kai Chen 0026, Feng Zhao. 9510-9529 [doi]
- Are LLM-based Evaluators Confusing NLG Quality Criteria?Xinyu Hu, Mingqi Gao 0002, Sen Hu, Yang Zhang, Yicheng Chen, Teng Xu 0007, Xiaojun Wan 0001. 9530-9570 [doi]
- Synergistic Interplay between Search and Large Language Models for Information RetrievalJiazhan Feng, Chongyang Tao, Xiubo Geng, Tao Shen 0001, Can Xu, Guodong Long, Dongyan Zhao 0001, Daxin Jiang. 9571-9583 [doi]
- Linear Transformers with Learnable Kernel Functions are Better In-Context ModelsYaroslav Aksenov, Nikita Balagansky, Sofia Maria Lo Cicero Vaina, Boris Shaposhnikov, Alexey Gorbatovski, Daniil Gavrilov. 9584-9597 [doi]
- Temperature-scaling surprisal estimates improve fit to human reading times - but does it do so for the "right reasons"?Tong Liu, Iza Skrjanec, Vera Demberg. 9598-9619 [doi]
- Beyond Recognising Entailment: Formalising Natural Language Inference from an Argumentative PerspectiveAmeer Saadat-Yazdi, Nadin Kökciyan. 9620-9636 [doi]
- AnyGPT: Unified Multimodal LLM with Discrete Sequence ModelingJun Zhan, Junqi Dai, Jiasheng Ye, Yunhua Zhou, Dong Zhang, Zhigeng Liu, Xin Zhang, Ruibin Yuan, Ge Zhang, Linyang Li, Hang Yan 0001, Jie Fu, Tao Gui, Tianxiang Sun, Yu-Gang Jiang, Xipeng Qiu. 9637-9662 [doi]
- CofiPara: A Coarse-to-fine Paradigm for Multimodal Sarcasm Target Identification with Large Multimodal ModelsZixin Chen, Hongzhan Lin 0001, Ziyang Luo, Mingfei Cheng, Jing Ma 0004, Guang Chen 0003. 9663-9687 [doi]
- Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt DistillationAiwei Liu, Haoping Bai, Zhiyun Lu, Xiang Kong, Xiaoming Wang, Jiulong Shan, Meng Cao, Lijie Wen 0001. 9688-9712 [doi]
- Diffusion Lens: Interpreting Text Encoders in Text-to-Image PipelinesMichael Toker, Hadas Orgad, Mor Ventura, Dana Arad, Yonatan Belinkov. 9713-9728 [doi]
- Parrot: Enhancing Multi-Turn Instruction Following for Large Language ModelsYuchong Sun, Che Liu, Kun Zhou 0002, Jinwen Huang, Ruihua Song, Xin Zhao 0018, Fuzheng Zhang, Di Zhang, Kun Gai. 9729-9750 [doi]
- Robust Singing Voice Transcription Serves SynthesisRuiqi Li, Yu Zhang 0126, Yongqi Wang, Zhiqing Hong, Rongjie Huang, Zhou Zhao. 9751-9766 [doi]
- VulLibGen: Generating Names of Vulnerability-Affected Packages via a Large Language ModelTianyu Chen, Lin Li, Liuchuan Zhu, Zongyang Li, Xueqing Liu 0001, Guangtai Liang, Qianxiang Wang, Tao Xie 0001. 9767-9780 [doi]
- Self-Modifying State Modeling for Simultaneous Machine TranslationDonglei Yu, Xiaomian Kang, Yuchen Liu 0007, Yu Zhou 0001, Chengqing Zong. 9781-9795 [doi]
- MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language NavigationJiaqi Chen, Bingqian Lin, Ran Xu, Zhenhua Chai, Xiaodan Liang, Kwan-Yee Kenneth Wong. 9796-9810 [doi]
- BadAgent: Inserting and Activating Backdoor Attacks in LLM AgentsYifei Wang, Dizhan Xue, Shengjie Zhang, Shengsheng Qian. 9811-9827 [doi]
- DetermLR: Augmenting LLM-based Logical Reasoning from Indeterminacy to DeterminacyHongda Sun, Weikai Xu, Wei Liu 0005, Jian Luan 0001, Bin Wang 0004, Shuo Shang, Ji-Rong Wen, Rui Yan 0001. 9828-9862 [doi]
- LePaRD: A Large-Scale Dataset of Judicial Citations to PrecedentRobert Mahari, Dominik Stammbach, Elliott Ash, Alex Pentland. 9863-9877 [doi]
- To Generate or to Retrieve? On the Effectiveness of Artificial Contexts for Medical Open-Domain Question AnsweringGiacomo Frisoni, Alessio Cocchieri, Alex Presepi, Gianluca Moro, Zaiqiao Meng. 9878-9919 [doi]
- MERA: A Comprehensive LLM Evaluation in RussianAlena Fenogenova, Artem Chervyakov, Nikita Martynov, Anastasia Kozlova, Maria Tikhonova, Albina Akhmetgareeva, Anton A. Emelyanov, Denis Shevelev, Pavel Lebedev, Leonid Sinev, Ulyana Isaeva, Katerina Kolomeytseva, Daniil Moskovskiy, Elizaveta Goncharova, Nikita Savushkin, Polina Mikhailova, Anastasia Minaeva, Denis Dimitrov, Alexander Panchenko, Sergey Markov. 9920-9948 [doi]
- SC2: Towards Enhancing Content Preservation and Style Consistency in Long Text Style TransferJie Zhao 0013, Ziyu Guan, Cai Xu, Wei Zhao 0019, Yue Jiang. 9949-9960 [doi]
- Dodo: Dynamic Contextual Compression for Decoder-only LMsGuanghui Qin, Corby Rosset, Ethan C. Chau, Nikhil Rao, Benjamin Van Durme. 9961-9975 [doi]
- POMP: Probability-driven Meta-graph Prompter for LLMs in Low-resource Unsupervised Neural Machine TranslationShilong Pan, Zhiliang Tian, Liang Ding 0006, Haoqi Zheng, Zhen Huang 0006, Zhihua Wen, Dongsheng Li 0001. 9976-9992 [doi]
- NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese JournalismMiao Li, Ming-Bin Chen, Bo Tang, Shengbin Hou, Pengyu Wang, Haiying Deng, Zhiyu Li, Feiyu Xiong, Keming Mao, Cheng Peng, Yi Luo. 9993-10014 [doi]
- MAPO: Advancing Multilingual Reasoning through Multilingual-Alignment-as-Preference OptimizationShuaijie She, Wei Zou, Shujian Huang, Wenhao Zhu, Xiang Liu, Xiang Geng, Jiajun Chen. 10015-10027 [doi]
- Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial TrainingFeiteng Fang, Yuelin Bai, Shiwen Ni, Min Yang 0007, Xiaojun Chen 0006, Ruifeng Xu. 10028-10039 [doi]
- Predicting Text Preference Via Structured Comparative ReasoningJing Nathan Yan, Tianqi Liu 0002, Justin T. Chiu, Jiaming Shen, Zhen Qin 0001, Yue Yu, Charumathi Lakshmanan, Yair Kurzion, Alexander M. Rush, Jialu Liu, Michael Bendersky. 10040-10060 [doi]
- CoELM: Construction-Enhanced Language ModelingLvxiaowei Xu, Zhilin Gong, Jianhua Dai, Tianxiang Wang, Ming Cai, Jiawei Peng. 10061-10081 [doi]
- Uni-Dubbing: Zero-Shot Speech Synthesis from Visual ArticulationSongju Lei, Xize Cheng, Mengjiao Lyu, Jianqiao Hu, Jintao Tan, Runlin Liu, Lingyu Xiong, Tao Jin 0004, Xiandong Li, Zhou Zhao. 10082-10099 [doi]
- On the Impact of Calibration Data in Post-training Quantization and PruningMiles Williams, Nikolaos Aletras. 10100-10118 [doi]
- SymKGQA: Few-Shot Knowledge Graph Question Answering via Symbolic Program Generation and ExecutionPrerna Agarwal, Nishant Kumar, Srikanta Bedathur. 10119-10140 [doi]
- Meta-Task Prompting Elicits Embeddings from Large Language ModelsYibin Lei, Di Wu, Tianyi Zhou 0001, Tao Shen 0001, Yu Cao, Chongyang Tao, Andrew Yates. 10141-10157 [doi]
- A Sentiment Consolidation Framework for Meta-Review GenerationMiao Li, Jey Han Lau, Eduard H. Hovy. 10158-10177 [doi]
- Revisiting Structured Sentiment Analysis as Latent Dependency Graph ParsingChengjie Zhou, Bobo Li, Hao Fei 0001, Fei Li, Chong Teng, Donghong Ji. 10178-10191 [doi]
- OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language IdentificationYifan Peng, Yui Sudo, Muhammad Shakeel 0001, Shinji Watanabe 0001. 10192-10209 [doi]
- Do Large Language Models Latently Perform Multi-Hop Reasoning?Sohee Yang, Elena Gribovskaya, Nora Kassner, Mor Geva, Sebastian Riedel 0001. 10210-10229 [doi]
- MuggleMath: Assessing the Impact of Query and Response Augmentation on Math ReasoningChengpeng Li, Zheng Yuan 0002, Hongyi Yuan, Guanting Dong, Keming Lu, Jiancan Wu, Chuanqi Tan, Xiang Wang 0010, Chang Zhou. 10230-10258 [doi]
- Harnessing Toulmin's theory for zero-shot argument explicationAnkita Gupta, Ethan Zuckerman, Brendan T. O'Connor 0001. 10259-10276 [doi]
- BinaryAlign: Word Alignment as Binary Sequence LabelingGaetan Latouche, Marc-André Carbonneau, Benjamin Swanson. 10277-10288 [doi]
- Quantifying the Persona Effect in LLM SimulationsTiancheng Hu, Nigel Collier. 10289-10307 [doi]
- Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question?Nishant Balepur, Abhilasha Ravichander, Rachel Rudinger. 10308-10330 [doi]
- Retrieval Augmented Fact Verification by Synthesizing Contrastive ArgumentsZhenrui Yue, Huimin Zeng, Lanyu Shang, Yifan Liu, Yang Zhang 0031, Dong Wang 0002. 10331-10343 [doi]
- SyllabusQA: A Course Logistics Question Answering DatasetNigel Fernandez, Alexander Scarlatos, Andrew S. Lan. 10344-10369 [doi]
- MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language ModelsYilin Wen 0006, Zifeng Wang, Jimeng Sun. 10370-10388 [doi]
- AGB-DE: A Corpus for the Automated Legal Assessment of Clauses in German Consumer ContractsDaniel Braun 0003, Florian Matthes. 10389-10405 [doi]
- Examining the robustness of LLM evaluation to the distributional assumptions of benchmarksCharlotte Siska, Katerina Marazopoulou, Melissa Ailem, James Bono. 10406-10421 [doi]
- Re-Tuning: Overcoming the Compositionality Limits of Large Language Models with Recursive TuningEric Pasewark, Kyle Montgomery, Kefei Duan, Dawn Song, Chenguang Wang 0001. 10422-10437 [doi]
- Bridging the Preference Gap between Retrievers and LLMsZixuan Ke, Weize Kong, Cheng Li 0012, Mingyang Zhang 0001, Qiaozhu Mei, Michael Bendersky. 10438-10451 [doi]
- Large Language Models Can Learn Temporal ReasoningSiheng Xiong, Ali Payani, Ramana Kompella, Faramarz Fekri. 10452-10470 [doi]
- Learning Relational Decomposition of Queries for Question Answering from TablesRaphaël Mouravieff, Benjamin Piwowarski, Sylvain Lamprier. 10471-10485 [doi]
- Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with PeopleDun-Ming Huang, Pol van Rijn, Ilia Sucholutsky, Raja Marjieh, Nori Jacoby. 10486-10512 [doi]
- Pareto Optimal Learning for Estimating Large Language Model ErrorsTheodore Zhao, Mu Wei, Joseph Preston, Hoifung Poon. 10513-10529 [doi]
- Simul-LLM: A Framework for Exploring High-Quality Simultaneous Translation with Large Language ModelsVictor Agostinelli, Max Wild, Matthew Raffel, Kazi Ahmed Asif Fuad, Lizhong Chen. 10530-10541 [doi]
- Defending Against Alignment-Breaking Attacks via Robustly Aligned LLMBochuan Cao, Yuanpu Cao, Lu Lin 0001, Jinghui Chen. 10542-10560 [doi]
- Interactive-KBQA: Multi-Turn Interactions for Knowledge Base Question Answering with Large Language ModelsGuanming Xiong, Junwei Bao 0001, Wen Zhao. 10561-10582 [doi]
- LLMs in the Imaginarium: Tool Learning through Simulated Trial and ErrorBoshi Wang, Hao Fang, Jason Eisner, Benjamin Van Durme, Yu Su 0001. 10583-10604 [doi]
- HyperMoE: Towards Better Mixture of Experts via Transferring Among ExpertsHao Zhao, Zihan Qiu, Huijia Wu, Zili Wang, Zhaofeng He, Jie Fu. 10605-10618 [doi]
- Aligning Large Language Models with Human Preferences through Representation EngineeringWenhao Liu, Xiaohua Wang, Muling Wu, Tianlong Li, Changze Lv, Zixuan Ling, Jianhao Zhu, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang 0001. 10619-10638 [doi]
- CODIS: Benchmarking Context-dependent Visual Comprehension for Multimodal Large Language ModelsFuwen Luo, Chi Chen 0005, Zihao Wan, Zhaolu Kang, Qidong Yan, Yingjie Li, Xiaolong Wang, Siyu Wang, Ziyue Wang, Xiaoyue Mi, Peng Li 0030, Ning Ma, Maosong Sun 0001, Yang Liu 0005. 10639-10659 [doi]
- ARAIDA: Analogical Reasoning-Augmented Interactive Data AnnotationChen Huang, Yiping Jin, Ilija Ilievski, Wenqiang Lei, Jiancheng Lv 0001. 10660-10675 [doi]
- PolCLIP: A Unified Image-Text Word Sense Disambiguation Model via Generating Multimodal Complementary RepresentationsQihao Yang, Yong Li, Xuelin Wang, Fu Lee Wang, Tianyong Hao. 10676-10690 [doi]
- Prompted Aspect Key Point Analysis for Quantitative Review SummarizationAn Tang, Xiuzhen Zhang 0001, Minh Dinh, Erik Cambria. 10691-10708 [doi]
- Ask Again, Then Fail: Large Language Models' Vacillations in JudgmentQiming Xie, Zengzhi Wang, Yi Feng, Rui Xia. 10709-10745 [doi]
- CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language ModelsTong Zhang, Peixin Qin, Yang Deng 0002, Chen Huang, Wenqiang Lei, Junhong Liu, Dingnan Jin, Hongru Liang, Tat-Seng Chua. 10746-10766 [doi]
- Multimodal Reasoning with Multimodal Knowledge GraphJunlin Lee, Yequan Wang, Jing Li, Min Zhang. 10767-10782 [doi]
- Confidence is not Timeless: Modeling Temporal Validity for Rule-based Temporal Knowledge Graph ForecastingRikui Huang, Wei Wei 0002, Xiaoye Qu, Shengzhe Zhang, Dangyang Chen, Yu Cheng 0001. 10783-10794 [doi]
- CARE: A Clue-guided Assistant for CSRs to Read User ManualsWeihong Du, Jia Liu, Zujie Wen, Dingnan Jin, Hongru Liang, Wenqiang Lei. 10795-10811 [doi]
- Enhancing Numerical Reasoning with the Guidance of Reliable Reasoning ProcessesDingzirui Wang, Longxu Dou, Xuanliang Zhang, Qingfu Zhu, Wanxiang Che. 10812-10828 [doi]
- PAGED: A Benchmark for Procedural Graphs Extraction from DocumentsWeihong Du, Wenrui Liao, Hongru Liang, Wenqiang Lei. 10829-10846 [doi]
- Navigating the Shadows: Unveiling Effective Disturbances for Modern AI Content DetectorsYing Zhou, Ben He, Le Sun 0001. 10847-10861 [doi]
- RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language ModelsCheng Niu, Yuanhao Wu, Juno Zhu, Siliang Xu, Kashun Shum, Randy Zhong, Juntong Song, Tong Zhang 0001. 10862-10878 [doi]
- The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language ModelsJunyi Li, Jie Chen, Ruiyang Ren, Xiaoxue Cheng, Xin Zhao 0018, Jian-Yun Nie, Ji-Rong Wen. 10879-10899 [doi]
- Revisiting Knowledge Distillation for Autoregressive Language ModelsQihuang Zhong, Liang Ding 0006, Li Shen 0008, Juhua Liu, Bo Du 0001, Dacheng Tao. 10900-10913 [doi]
- Continual Learning with Semi-supervised Contrastive Distillation for Incremental Neural Machine TranslationYunlong Liang, Fandong Meng, Jiaan Wang, Jinan Xu, Yufeng Chen 0005, Jie Zhou 0016. 10914-10928 [doi]
- Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask LearnersRongjie Huang, Chunlei Zhang, Yongqi Wang, Dongchao Yang, Jinchuan Tian, Zhenhui Ye, Luping Liu, Zehan Wang 0001, Ziyue Jiang 0001, Xuankai Chang, Jiatong Shi, Chao Weng, Zhou Zhao, Dong Yu 0001. 10929-10942 [doi]
- Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New LanguagesShih-Cheng Huang, Pin-Zu Li, Yu-Chi Hsu, Kuang-Ming Chen, Yu-Tung Lin, Shih-Kai Hsiao, Richard Tzong-Han Tsai, Hung-yi Lee. 10943-10959 [doi]
- PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-RailsNeal Mangaokar, Ashish Hooda, Jihye Choi, Shreyas Chandrashekaran, Kassem Fawaz, Somesh Jha, Atul Prakash 0001. 10960-10976 [doi]
- Hide and Seek in Noise Labels: Noise-Robust Collaborative Active Learning with LLMs-Powered AssistanceBo Yuan, Yulin Chen, Yin Zhang, Wei Jiang. 10977-11011 [doi]
- CLOMO: Counterfactual Logical Modification with Large Language ModelsYinya Huang, Ruixin Hong, Hongming Zhang 0009, Wei Shao 0009, Zhicheng Yang, Dong Yu 0001, Changshui Zhang, Xiaodan Liang, Linqi Song. 11012-11034 [doi]
- Exploring Hybrid Question Answering via Program-based PromptingQi Shi 0002, Han Cui, Haofeng Wang, Qingfu Zhu, Wanxiang Che, Ting Liu 0001. 11035-11046 [doi]
- IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic LanguagesHarman Singh, Nitish Gupta, Shikhar Bharadwaj, Dinesh Tewari, Partha Talukdar. 11047-11073 [doi]
- Simple but Effective Compound Geometric Operations for Temporal Knowledge Graph CompletionRui Ying, Mengting Hu, Jianfeng Wu, Yalan Xie, Xiaoyi Liu, Zhunheng Wang, Ming Jiang, Hang Gao, Linlin Zhang, Renhong Cheng. 11074-11086 [doi]
- Uncertainty Aware Learning for Language Model AlignmentYikun Wang, Rui Zheng, Liang Ding 0006, Qi Zhang 0001, Dahua Lin, Dacheng Tao. 11087-11099 [doi]
- Interpretable User Satisfaction Estimation for Conversational Systems with Large Language ModelsYing Chun Lin, Jennifer Neville, Jack W. Stokes, Longqi Yang, Tara Safavi, Mengting Wan, Scott Counts, Siddharth Suri, Reid Andersen, Xiaofeng Xu, Deepak Gupta, Sujay Kumar Jauhar, Xia Song, Georg Buscher, Saurabh Tiwary, Brent J. Hecht, Jaime Teevan. 11100-11115 [doi]
- Fundamental Capabilities of Large Language Models and their Applications in Domain Scenarios: A SurveyJiawei Li, Yizhe Yang, Yu Bai 0018, Xiaofeng Zhou, Yinghao Li, Huashan Sun, Yuhang Liu, Xingpeng Si, Yuhao Ye, Yixiao Wu, Yiguan Lin, Bin Xu, Ren Bowen, Chong Feng, Yang Gao 0016, Heyan Huang. 11116-11141 [doi]
- Measuring Political Bias in Large Language Models: What Is Said and How It Is SaidYejin Bang, Delong Chen, Nayeon Lee, Pascale Fung. 11142-11159 [doi]
- Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool UseYuhan Chen, Ang Lv, Ting-En Lin, Changyu Chen, Yuchuan Wu, Fei Huang 0004, Yongbin Li, Rui Yan 0001. 11160-11174 [doi]
- Layer-Condensed KV Cache for Efficient Inference of Large Language ModelsHaoyi Wu, Kewei Tu. 11175-11188 [doi]
- Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich LanguagesYuanchi Zhang, Yile Wang, Zijun Liu, Shuo Wang, Xiaolong Wang, Peng Li 0030, Maosong Sun 0001, Yang Liu 0005. 11189-11204 [doi]
- Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization CorrelationsJiaxing Sun, Weiquan Huang, Jiang Wu, Chenya Gu, Wei Li 0044, Songyang Zhang, Hang Yan 0001, Conghui He. 11205-11228 [doi]
- Browse and Concentrate: Comprehending Multimodal Content via Prior-LLM Context FusionZiyue Wang, Chi Chen 0005, Yiqi Zhu, Fuwen Luo, Peng Li 0030, Ming Yan, Ji Zhang 0011, Fei Huang 0004, Maosong Sun 0001, Yang Liu 0005. 11229-11245 [doi]
- Model Composition for Multimodal Large Language ModelsChi Chen 0005, Yiyang Du, Zheng Fang, Ziyue Wang, Fuwen Luo, Peng Li 0030, Ming Yan, Ji Zhang 0011, Fei Huang 0004, Maosong Sun 0001, Yang Liu 0005. 11246-11262 [doi]
- Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative DecodingJun Zhang, Jue Wang 0019, Huan Li 0003, Lidan Shou, Ke Chen 0005, Gang Chen 0001, Sharad Mehrotra. 11263-11282 [doi]
- Soul-Mix: Enhancing Multimodal Machine Translation with Manifold MixupXuxin Cheng, Ziyu Yao 0001, Yifei Xin, Hao An, Hongxiang Li, Yaowei Li, Yuexian Zou. 11283-11294 [doi]
- Measuring Meaning Composition in the Human Brain with Composition Scores from Large Language ModelsChangjiang Gao, Jixing Li, Jiajun Chen, Shujian Huang. 11295-11308 [doi]
- MIST: Mutual Information Maximization for Short Text ClusteringKrissanee Kamthawee, Can Udomcharoenchaikit, Sarana Nutanong. 11309-11324 [doi]
- Self-chats from Large Language Models Make Small Emotional Support Chatbot BetterZhonghua Zheng, Lizi Liao, Yang Deng 0002, Libo Qin 0001, Liqiang Nie. 11325-11345 [doi]
- Improving Conversational Abilities of Quantized Large Language Models via Direct Preference AlignmentJanghwan Lee, Seongmin Park, Sukjin Hong, Minsoo Kim, Du-Seong Chang, Jungwook Choi. 11346-11364 [doi]
- Complex Reasoning over Logical Queries on Commonsense Knowledge GraphsTianqing Fang, Zeming Chen, Yangqiu Song, Antoine Bosselut. 11365-11384 [doi]
- An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token RoutingZiwei Chai, Guoyin Wang 0002, Jing Su, Tianjie Zhang, Xuanwen Huang, Xuwu Wang, Jingjing Xu, Jianbo Yuan, Hongxia Yang, Fei Wu 0001, Yang Yang 0009. 11385-11396 [doi]
- Learning to Plan and Generate Text with CitationsConstanza Fierro, Reinald Kim Amplayo, Fantine Huot, Nicola De Cao, Joshua Maynez, Shashi Narayan, Mirella Lapata. 11397-11417 [doi]
- Exploring Precision and Recall to assess the quality and diversity of LLMsFlorian Le Bronnec, Alexandre Verine, Benjamin Négrevergne, Yann Chevaleyre, Alexandre Allauzen. 11418-11441 [doi]
- Aligning Large Language Models by On-Policy Self-JudgmentSangkyu Lee, Sungdong Kim, Ashkan Yousefpour, Minjoon Seo, Kang Min Yoo, Youngjae Yu. 11442-11459 [doi]
- IL-TUR: Benchmark for Indian Legal Text Understanding and ReasoningAbhinav Joshi, Shounak Paul, Akshat Sharma, Pawan Goyal 0002, Saptarshi Ghosh 0001, Ashutosh Modi. 11460-11499 [doi]
- JumpCoder: Go Beyond Autoregressive Coder via Online ModificationMouxiang Chen, Hao Tian, Zhongxin Liu, Xiaoxue Ren, Jianling Sun. 11500-11520 [doi]
- Aya Dataset: An Open-Access Collection for Multilingual Instruction TuningShivalika Singh, Freddie Vargus, Daniel D'Souza, Börje Karlsson 0001, Abinaya Mahendiran, Wei-Yin Ko, Herumb Shandilya, Jay Patel, Deividas Mataciunas, Laura O'Mahony, Mike Zhang, Ramith Hettiarachchi, Joseph Wilson, Marina Machado, Luisa Souza Moura, Dominik Krzeminski, Hakimeh Fadaei, Irem Ergün, Ifeoma Okoh, Aisha Alaagib, Oshan Mudannayake, Zaid Alyafeai, Minh Vu Chien, Sebastian Ruder, Surya Guthikonda, Emad A. Alghamdi, Sebastian Gehrmann, Niklas Muennighoff, Max Bartolo, Julia Kreutzer, Ahmet Üstün, Marzieh Fadaee, Sara Hooker. 11521-11567 [doi]
- Language Models can Exploit Cross-Task In-context Learning for Data-Scarce Novel TasksAnwoy Chatterjee, Eshaan Tanwar, Subhabrata Dutta, Tanmoy Chakraborty 0002. 11568-11587 [doi]
- Split and Rephrase with Large Language ModelsDavid Ponce, Thierry Etchegoyhen, Jesus Calleja, Harritxu Gete. 11588-11607 [doi]
- ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase PartitionLu Ye, Ze Tao, Yong Huang, Yang Li. 11608-11620 [doi]
- AlignBench: Benchmarking Chinese Alignment of Large Language ModelsXiao Liu 0036, Xuanyu Lei, Shengyuan Wang, Yue Huang, Andrew Feng, Bosi Wen, Jiale Cheng, Pei Ke, Yifan Xu, Weng Lam Tam, Xiaohan Zhang, Lichao Sun 0001, Xiaotao Gu, Hongning Wang, Jing Zhang 0015, Minlie Huang, Yuxiao Dong, Jie Tang 0001. 11621-11640 [doi]
- SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language ModelsWeixiang Zhao, Shilong Wang, Yulin Hu, Yanyan Zhao, Bing Qin 0001, Xuanyu Zhang, Qing Yang 0033, Dongliang Xu, Wanxiang Che. 11641-11661 [doi]
- DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank DistributionYulong Mao, Kaiyu Huang, Changhao Guan, Ganglin Bao, Fengran Mo, Jinan Xu. 11662-11675 [doi]
- Cross-Lingual Knowledge Editing in Large Language ModelsJiaan Wang, Yunlong Liang, Zengkui Sun, Yuxuan Cao, Jiarong Xu, Fandong Meng. 11676-11686 [doi]
- Argument Mining in Data Scarce Settings: Cross-lingual Transfer and Few-shot TechniquesAnar Yeginbergen, Maite Oronoz, Rodrigo Agerri. 11687-11699 [doi]
- Learning Task Decomposition to Assist Humans in Competitive ProgrammingJiaxin Wen, Ruiqi Zhong, Pei Ke, Zhihong Shao, Hongning Wang, Minlie Huang. 11700-11723 [doi]
- An Entropy-based Text Watermarking Detection MethodYijian Lu, Aiwei Liu, Dianzhi Yu, Jingjing Li 0007, Irwin King. 11724-11735 [doi]
- Enhancing Explainable Rating Prediction through Annotated Macro ConceptsHuachi Zhou, Shuang Zhou 0012, Hao Chen 0062, Ninghao Liu, Fan Yang 0023, Xiao Huang 0001. 11736-11748 [doi]
- How to Engage your Readers? Generating Guiding Questions to Promote Active ReadingPeng Cui, Vilém Zouhar, Xiaoyu Zhang, Mrinmaya Sachan. 11749-11765 [doi]
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision PerspectiveZihao Yue, Liang Zhang, Qin Jin. 11766-11781 [doi]
- Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language GenerationXinglin Wang, Yiwei Li, Shaoxiong Feng, Peiwen Yuan, Boyuan Pan, Heda Wang, Yao Hu, Kan Li 0001. 11782-11794 [doi]
- More frequent verbs are associated with more diverse valency frames: Efficient principles at the lexicon-grammar interfaceSiyu Tao, Lucia Donatelli, Michael Hahn. 11795-11810 [doi]
- Quantifying Generalizations: Exploring the Divide Between Human and LLMs' Sensitivity to QuantificationClaudia Collacciani, Giulia Rambelli, Marianna Bolognesi. 11811-11822 [doi]
- Can Large Language Models Interpret Noun-Noun Compounds? A Linguistically-Motivated Study on Lexicalized and Novel CompoundsGiulia Rambelli, Emmanuele Chersoni, Claudia Collacciani, Marianna Bolognesi. 11823-11835 [doi]
- CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent EvaluationQuan Tu, Shilong Fan, Zihang Tian, Tianhao Shen, Shuo Shang, Xin Gao 0001, Rui Yan 0001. 11836-11850 [doi]
- Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and BeyondYongqi Li 0001, Wenjie Wang 0007, Leigang Qu, Liqiang Nie, Wenjie Li 0002, Tat-Seng Chua. 11851-11861 [doi]
- Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad PredictionYice Zhang, Jie Zeng, Weiming Hu, Ziyi Wang, Shiwei Chen, Ruifeng Xu. 11862-11875 [doi]
- Learning to Generate Answers with Citations via Factual Consistency ModelsRami Aly, ZhiQiang Tang, Samson Tan, George Karypis. 11876-11896 [doi]
- Improving Text Embeddings with Large Language ModelsLiang Wang 0046, Nan Yang 0002, Xiaolong Huang, Linjun Yang, Rangan Majumder, Furu Wei. 11897-11916 [doi]
- Self-Training with Direct Preference Optimization Improves Chain-of-Thought ReasoningTianduo Wang, Shichen Li, Wei Lu. 11917-11928 [doi]
- UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning DatasetHaoyu Wang, Shuo Wang, Yukun Yan, Xujia Wang, Zhiyu Yang, Yuzhuang Xu, Zhenghao Liu, Liner Yang, Ning Ding 0002, Xu Han 0007, Zhiyuan Liu 0001, Maosong Sun 0001. 11929-11942 [doi]
- Document-level Claim Extraction and Decontextualisation for Fact-CheckingZhenyun Deng, Michael Schlichtkrull, Andreas Vlachos 0001. 11943-11954 [doi]
- PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive LearningXiaoqi Qiu, YongJie Wang, Xu Guo 0002, Zhiwei Zeng, Yu Yue, Yuhong Feng, Chunyan Miao. 11955-11971 [doi]
- LLMs Learn Task Heuristics from Demonstrations: A Heuristic-Driven Prompting Strategy for Document-Level Event Argument ExtractionHanzhang Zhou, Junlang Qian, Zijian Feng, Hui Lu, Zixiao Zhu, Kezhi Mao. 11972-11990 [doi]
- Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language ModelsWeihong Zhong, Xiaocheng Feng, Liang Zhao, Qiming Li, Lei Huang 0021, Yuxuan Gu, Weitao Ma, Yuan Xu, Bing Qin 0001. 11991-12011 [doi]
- mCoT: Multilingual Instruction Tuning for Reasoning Consistency in Language ModelsHuiyuan Lai, Malvina Nissim. 12012-12026 [doi]
- GunStance: Stance Detection for Gun Control and Gun RegulationNikesh Gyawali, Iustin Sirbu, Tiberiu Sosea, Sarthak Khanal, Doina Caragea, Traian Rebedea, Cornelia Caragea. 12027-12044 [doi]
- Beyond Traditional Benchmarks: Analyzing Behaviors of Open LLMs on Data-to-Text GenerationZdenek Kasner, Ondrej Dusek. 12045-12072 [doi]
- Don't Go To Extremes: Revealing the Excessive Sensitivity and Calibration Limitations of LLMs in Implicit Hate Speech DetectionMin Zhang, Jianfeng He, Taoran Ji, Chang-Tien Lu. 12073-12086 [doi]
- Don't Rank, Combine! Combining Machine Translation Hypotheses Using Quality EstimationGiorgos Vernikos, Andrei Popescu-Belis. 12087-12105 [doi]
- Generating and Evaluating Plausible Explanations for Knowledge Graph CompletionAntonio Di Mauro, Zhao Xu 0001, Wiem Ben Rim, Timo Sztyler, Carolin Lawrence. 12106-12118 [doi]
- One Prompt To Rule Them All: LLMs for Opinion Summary EvaluationTejpalsingh Siledar, Swaroop Nath, Sankara Sri Raghava Ravindra Muddu, Rupasai Rangaraju, Swaprava Nath, Pushpak Bhattacharyya, Suman Banerjee, Amey Patil, Sudhanshu Singh, Muthusamy Chelliah, Nikesh Garera. 12119-12134 [doi]
- LANDeRMT: Dectecting and Routing Language-Aware Neurons for Selectively Finetuning LLMs to Machine TranslationShaoLin Zhu, Leiyu Pan, Bo Li, Deyi Xiong. 12135-12148 [doi]
- A Joint Coreference-Aware Approach to Document-Level Target Sentiment AnalysisHongjie Cai, Heqing Ma, Jianfei Yu, Rui Xia. 12149-12160 [doi]
- VisDiaHalBench: A Visual Dialogue Benchmark For Diagnosing Hallucination in Large Vision-Language ModelsQingxing Cao, Junhao Cheng, Xiaodan Liang, Liang Lin. 12161-12176 [doi]
- AutoDSL: Automated domain-specific language design for structural representation of procedures with constraintsYu-Zhe Shi, Haofei Hou, Zhangqian Bi, Fanxu Meng, Xiang Wei, Lecheng Ruan, Qining Wang. 12177-12214 [doi]
- Multipath parsing in the brainBerta Franzluebbers, Donald Dunagan, Milos Stanojevic, Jan Buys, John T. Hale. 12215-12229 [doi]
- Search-Adaptor: Embedding Customization for Information RetrievalJinsung Yoon, Yanfei Chen, Sercan Ö. Arik, Tomas Pfister. 12230-12247 [doi]
- Back to Basics: Revisiting REINFORCE-Style Optimization for Learning from Human Feedback in LLMsArash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker. 12248-12267 [doi]
- VIEScore: Towards Explainable Metrics for Conditional Image Synthesis EvaluationMax Ku, Dongfu Jiang, Cong Wei, Xiang Yue, Wenhu Chen. 12268-12290 [doi]
- Tree Transformer's Disambiguation Ability of Prepositional Phrase Attachment and Garden Path EffectsLingling Zhou, Suzan Verberne, Gijs Wijnholds. 12291-12301 [doi]
- Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Knowledge GraphsElan Markowitz, Anil Ramakrishna, Jwala Dhamala, Ninareh Mehrabi, Charith Peris, Rahul Gupta 0001, Kai-Wei Chang, Aram Galstyan. 12302-12319 [doi]
- Structured Tree Alignment for Evaluation of (Speech) Constituency ParsingFreda Shi, Kevin Gimpel, Karen Livescu. 12320-12332 [doi]
- ViSAGe: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image GenerationAkshita Jha, Vinodkumar Prabhakaran, Remi Denton, Sarah Laszlo, Shachi Dave, Rida Qadri, Chandan K. Reddy, Sunipa Dev. 12333-12347 [doi]
- Transferable and Efficient Non-Factual Content Detection via Probe Training with Offline Consistency CheckingXiaokang Zhang, Zijun Yao 0002, Jing Zhang 0001, Kaifeng Yun, Jifan Yu, Juanzi Li, Jie Tang 0001. 12348-12364 [doi]
- What Do Language Models Learn in Context? The Structured Task HypothesisJiaoda Li, Yifan Hou, Mrinmaya Sachan, Ryan Cotterell. 12365-12379 [doi]
- Agent Lumos: Unified and Modular Training for Open-Source Language AgentsDa Yin, Faeze Brahman, Abhilasha Ravichander, Khyathi Raghavi Chandu, Kai-Wei Chang, Yejin Choi 0001, Bill Yuchen Lin. 12380-12403 [doi]
- Investigating Cultural Alignment of Large Language ModelsBadr AlKhamissi, Muhammad N. ElNokrashy, Mai Alkhamissi, Mona T. Diab. 12404-12422 [doi]
- More Victories, Less Cooperation: Assessing Cicero's Diplomacy PlayWichayaporn Wongkamjan, Feng Gu, Yanze Wang, Ulf Hermjakob, Jonathan May, Brandon M. Stewart, Jonathan K. Kummerfeld, Denis Peskoff, Jordan L. Boyd-Graber. 12423-12441 [doi]
- VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the WildPuyuan Peng, Po-Yao Huang 0001, Shang-wen Li 0001, Abdelrahman Mohamed, David Harwath. 12442-12462 [doi]
- RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text DetectorsLiam Dugan, Alyssa Hwang, Filip Trhlík, Andrew Zhu, Josh Magnus Ludan, Hainiu Xu, Daphne Ippolito, Chris Callison-Burch. 12463-12492 [doi]
- Silent Signals, Loud Impact: LLMs for Word-Sense Disambiguation of Coded Dog WhistlesJulia Kruk, Michela Marchini, Rijul Magu, Caleb Ziems, David Muchlinski, Diyi Yang. 12493-12509 [doi]
- On the Representational Capacity of Neural Language Models with Chain-of-Thought ReasoningFranz Nowak, Anej Svete, Alexandra Butoi, Ryan Cotterell. 12510-12548 [doi]
- Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination TrendsSanjana Ramprasad, Elisa Ferracane, Zachary C. Lipton. 12549-12561 [doi]
- LLM in a flash: Efficient Large Language Model Inference with Limited MemoryKeivan Alizadeh, Seyed-Iman Mirzadeh, Dmitry Belenko, S. Khatamifard, Minsik Cho, Carlo C. del Mundo, Mohammad Rastegari, Mehrdad Farajtabar. 12562-12584 [doi]
- Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language ModelsMuhammad Maaz 0001, Hanoona Abdul Rasheed, Salman Khan 0001, Fahad Khan. 12585-12602 [doi]
- To Distill or Not to Distill? On the Robustness of Robust Knowledge DistillationAbdul Waheed, Karima Kadaoui, Muhammad Abdul-Mageed. 12603-12621 [doi]
- LayerSkip: Enabling Early Exit Inference and Self-Speculative DecodingMostafa Elhoushi, Akshat Shrivastava, Diana Liskovich, Basil Hosmer, Bram Wasti, Liangzhen Lai, Anas Mahmoud 0002, Bilge Acun, Saurabh Agarwal, Ahmed Roman, Ahmed A Aly, Beidi Chen, Carole-Jean Wu. 12622-12642 [doi]
- Classist Tools: Social Class Correlates with Performance in NLPAmanda Cercas Curry, Giuseppe Attanasio, Zeerak Talat, Dirk Hovy. 12643-12655 [doi]
- ActionIE: Action Extraction from Scientific Literature with Programming LanguagesXianrui Zhong, Yufeng Du, Siru Ouyang, Ming Zhong 0005, Tingfeng Luo, Qirong Ho, Hao Peng, Heng Ji, Jiawei Han 0001. 12656-12671 [doi]
- A Community-Centric Perspective for Characterizing and Detecting Anti-Asian Violence-Provoking SpeechGaurav Verma, Rynaa Grover, Jiawei Zhou 0002, Binny Mathew, Jordan Kraemer, Munmun De Choudhury, Srijan Kumar. 12672-12684 [doi]
- Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMsZhiwei Cao, Qian Cao, Yu Lu, Ningxin Peng, Luyang Huang, Shanbo Cheng, Jinsong Su. 12685-12695 [doi]
- COSMIC: Mutual Information for Task-Agnostic Summarization EvaluationMaxime Darrin, Philippe Formont, Jackie Chi Kit Cheung, Pablo Piantanida. 12696-12717 [doi]
- EUROPA: A Legal Multilingual Keyphrase Generation DatasetOlivier Salaün, Frédéric Piedboeuf, Guillaume Le Berre, David Alfonso-Hermelo, Philippe Langlais. 12718-12736 [doi]
- GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly ReviewsMaxime Darrin, Ines Arous, Pablo Piantanida, Jackie Chi Kit Cheung. 12737-12752 [doi]
- Peacock: A Family of Arabic Multimodal Large Language Models and BenchmarksFakhraddin Alwajih, El Moatez Billah Nagoudi, Gagan Bhatia, Abdelrahman Mohamed, Muhammad Abdul-Mageed. 12753-12776 [doi]
- Generating Coherent Sequences of Visual Illustrations for Real-World Manual TasksJoão Bordalo, Vasco Ramos, Rodrigo Valerio, Diogo Glória-Silva, Yonatan Bitton, Michal Yarom, Idan Szpektor, João Magalhães. 12777-12797 [doi]
- Cheetah: Natural Language Generation for 517 African LanguagesIfe Adebara, AbdelRahim A. Elmadany, Muhammad Abdul-Mageed. 12798-12823 [doi]
- TaPERA: Enhancing Faithfulness and Interpretability in Long-Form Table QA by Content Planning and Execution-based ReasoningYilun Zhao 0001, Lyuhao Chen, Arman Cohan, Chen Zhao. 12824-12840 [doi]
- KnowledgeFMath: A Knowledge-Intensive Math Reasoning Dataset in Finance DomainsYilun Zhao 0001, Hongjun Liu, Yitao Long, Rui Zhang 0037, Chen Zhao, Arman Cohan. 12841-12858 [doi]
- API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMsKinjal Basu 0002, Ibrahim Abdelaziz, Subhajit Chaudhury, Soham Dan, Maxwell Crouse, Asim Munawar, Vernon Austel, Sadhana Kumaravel, Vinod Muthusamy, Pavan Kapanipathi, Luis A. Lastras. 12859-12870 [doi]
- LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative TasksHanqing Wang, Bowen Ping, Shuo Wang, Xu Han 0007, Yun Chen 0007, Zhiyuan Liu 0001, Maosong Sun 0001. 12871-12882 [doi]
- Harder Task Needs More Experts: Dynamic Routing in MoE ModelsQuzhe Huang, Zhenwei An, Nan Zhuang, Mingxu Tao, Chen Zhang 0019, Yang Jin, Kun Xu, Liwei Chen, Songfang Huang, Yansong Feng. 12883-12895 [doi]
- XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech PerceptionHyoJung Han, Mohamed Anwar, Juan Pino 0001, Wei-Ning Hsu, Marine Carpuat, Bowen Shi, Changhan Wang. 12896-12911 [doi]
- SOTOPIA-π: Interactive Learning of Socially Intelligent Language AgentsRuiyi Wang, Haofei Yu, Wenxin Sharon Zhang, Zhengyang Qi, Maarten Sap, Yonatan Bisk, Graham Neubig, Hao Zhu 0011. 12912-12940 [doi]
- XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-ExpertsYifeng Ding, Jiawei Liu 0004, Yuxiang Wei 0003, Lingming Zhang 0001. 12941-12955 [doi]
- Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model PruningTuc Nguyen, Thai Le. 12956-12973 [doi]
- Learning to Decode Collaboratively with Multiple Language ModelsZejiang Shen 0001, Hunter Lang, Bailin Wang, Yoon Kim, David A. Sontag. 12974-12990 [doi]
- DRAGIN: Dynamic Retrieval Augmented Generation based on the Real-time Information Needs of Large Language ModelsWeihang Su, Yichen Tang, Qingyao Ai, Zhijing Wu 0001, Yiqun Liu 0001. 12991-13013 [doi]
- Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?Zhaochen Su, Juntao Li, Jun Zhang, Tong Zhu 0002, Xiaoye Qu, Pan Zhou 0001, Yan Bowen, Yu Cheng 0001, Min Zhang 0005. 13014-13033 [doi]
- CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model GenerationPei Ke, Bosi Wen, Andrew Feng, Xiao Liu 0036, Xuanyu Lei, Jiale Cheng, Shengyuan Wang, Aohan Zeng, Yuxiao Dong, Hongning Wang, Jie Tang 0001, Minlie Huang. 13034-13054 [doi]
- LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent EnvironmentsJunzhe Chen, Xuming Hu, Shuodi Liu, Shiyu Huang, Wei-Wei Tu, Zhaofeng He, Lijie Wen 0001. 13055-13077 [doi]
- Small But Funny: A Feedback-Driven Approach to Humor DistillationSahithya Ravi, Patrick Huber, Akshat Shrivastava, Vered Shwartz, Arash Einolghozati. 13078-13090 [doi]
- Symbol-LLM: Towards Foundational Symbol-centric Interface For Large Language ModelsFangzhi Xu, Zhiyong Wu 0003, Qiushi Sun, Siyu Ren, Fei Yuan, Shuai Yuan, Qika Lin, Yu Qiao, Jun Liu 0002. 13091-13116 [doi]
- From Sights to Insights: Towards Summarization of Multimodal Clinical DocumentsAkash Ghosh, Mohit Tomar, Abhisek Tiwari, Sriparna Saha 0001, Jatin Salve, Setu Sinha. 13117-13129 [doi]
- When Phrases Meet Probabilities: Enabling Open Relation Extraction with Cooperating Large Language ModelsJiaxin Wang, Lingling Zhang, Wee Sun Lee, Yujie Zhong, Liwei Kang, Jun Liu. 13130-13147 [doi]
- Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentationJán Cegin, Branislav Pecher, Jakub Simko, Ivan Srba, Mária Bieliková, Peter Brusilovsky. 13148-13171 [doi]
- Beyond Orthography: Automatic Recovery of Short Vowels and Dialectal Sounds in ArabicYassine El Kheir, Hamdy Mubarak, Ahmed Ali 0002, Shammur Absar Chowdhury. 13172-13184 [doi]
- Document-Level Machine Translation with Large-Scale Public Parallel CorporaProyag Pal, Alexandra Birch, Kenneth Heafield. 13185-13197 [doi]
- Bridging the Empirical-Theoretical Gap in Neural Network Formal Language Learning Using Minimum Description LengthNur Lan, Emmanuel Chemla, Roni Katzir. 13198-13210 [doi]
- Context versus Prior Knowledge in Language ModelsKevin Du, Vésteinn Snæbjarnarson, Niklas Stoehr, Jennifer C. White, Aaron Schein, Ryan Cotterell. 13211-13235 [doi]
- Word Matters: What Influences Domain Adaptation in Summarization?Yinghao Li, Siyu Miao, Heyan Huang, Yang Gao 0016. 13236-13249 [doi]
- Visualization Recommendation with Prompt-based Reprogramming of Large Language ModelsXinhang Li, Jingbo Zhou, Wei Chen 0156, Derong Xu, Tong Xu 0001, Enhong Chen. 13250-13262 [doi]
- HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMsPranoy Panda, Ankush Agarwal, Chaitanya Devaguptapu, Manohar Kaul, Prathosh A P. 13263-13282 [doi]
- Toward In-Context Teaching: Adapting Examples to Students' MisconceptionsAlexis Ross, Jacob Andreas. 13283-13310 [doi]
- Bridging Word-Pair and Token-Level Metaphor Detection with Explainable Domain MiningYuan Tian, Ruike Zhang, Nan Xu, Wenji Mao. 13311-13325 [doi]
- Faithful Logical Reasoning via Symbolic Chain-of-ThoughtJundong Xu, Hao Fei 0001, Liangming Pan, Qian Liu 0012, Mong-Li Lee, Wynne Hsu. 13326-13365 [doi]
- S²GSL: Incorporating Segment to Syntactic Enhanced Graph Structure Learning for Aspect-based Sentiment AnalysisBingfeng Chen, Qihan Ouyang, Yongqi Luo, Boyan Xu, Ruichu Cai, Zhifeng Hao. 13366-13379 [doi]
- Maverick: Efficient and Accurate Coreference Resolution Defying Recent TrendsGiuliano Martinelli, Edoardo Barba, Roberto Navigli. 13380-13394 [doi]
- ESCoT: Towards Interpretable Emotional Support Dialogue SystemsTenggan Zhang, Xinjie Zhang, Jinming Zhao, Li Zhou, Qin Jin. 13395-13412 [doi]
- PathReasoner: Modeling Reasoning Path with Equivalent Extension for Logical Question AnsweringFangzhi Xu, Qika Lin, Tianzhe Zhao, Jiawei Han 0010, Jun Liu 0002. 13413-13429 [doi]
- WARDEN: Multi-Directional Backdoor Watermarks for Embedding-as-a-Service Copyright ProtectionAnudeex Shetty, Yue Teng, Ke He, Qiongkai Xu. 13430-13444 [doi]
- Advancing Parameter Efficiency in Fine-tuning via Representation EditingMuling Wu, Wenhao Liu, Xiaohua Wang, Tianlong Li, Changze Lv, Zixuan Ling, Jianhao Zhu, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang 0001. 13445-13464 [doi]
- Context Consistency between Training and Inference in Simultaneous Machine TranslationMeizhi Zhong, Lemao Liu, Kehai Chen, Mingming Yang, Min Zhang 0005. 13465-13476 [doi]
- Using Natural Language Explanations to Improve Robustness of In-context LearningXuanli He, Yuxiang Wu, Oana-Maria Camburu, Pasquale Minervini, Pontus Stenetorp. 13477-13499 [doi]
- Chunk, Align, Select: A Simple Long-sequence Processing Method for TransformersJiawen Xie, Pengyu Cheng, Xiao Liang, Yong Dai, Nan Du. 13500-13519 [doi]
- ArchCode: Incorporating Software Requirements in Code Generation with Large Language ModelsHojae Han, Jaejin Kim, Jaeseok Yoo, Youngwon Lee 0003, Seung-won Hwang. 13520-13552 [doi]
- Combining Supervised Learning and Reinforcement Learning for Multi-Label Classification Tasks with Partial LabelsZixia Jia, Junpeng Li, Shichuan Zhang, Anji Liu, Zilong Zheng. 13553-13569 [doi]
- MULFE: A Multi-Level Benchmark for Free Text Model EditingChenhao Wang, Pengfei Cao, Zhuoran Jin, Yubo Chen 0001, Daojian Zeng, Kang Liu 0001, Jun Zhao 0001. 13570-13587 [doi]
- MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-SpeechShengpeng Ji, Ziyue Jiang 0001, Hanting Wang, Jialong Zuo, Zhou Zhao. 13588-13600 [doi]
- Spatially-Aware Speaker for Vision-and-Language Navigation Instruction GenerationMuraleekrishna Gopinathan, Martin Masek, Jumana Abu-Khalaf, David Suter. 13601-13614 [doi]
- HiRoPE: Length Extrapolation for Code Models Using Hierarchical PositionKechi Zhang, Ge Li 0001, Huangzhao Zhang, Zhi Jin. 13615-13627 [doi]
- Never Lost in the Middle: Mastering Long-Context Question Answering with Position-Agnostic Decompositional TrainingJunqing He, Kunhao Pan, Xiaoqun Dong, Zhuoyang Song, LiuYiBo LiuYiBo, Qianguosun Qianguosun, Yuxin Liang, Hao Wang, Enming Zhang, Jiaxing Zhang. 13628-13642 [doi]
- CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding ChallengesKechi Zhang, Jia Li 0012, Ge Li 0001, Xianjie Shi, Zhi Jin. 13643-13658 [doi]
- When is Tree Search Useful for LLM Planning? It Depends on the DiscriminatorZiru Chen, Michael White, Raymond J. Mooney, Ali Payani, Yu Su 0001, Huan Sun 0001. 13659-13678 [doi]
- LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language ModelsMihir Parmar, Nisarg Patel, Neeraj Varshney, Mutsumi Nakamura, Man Luo 0003, Santosh Mashetty, Arindam Mitra, Chitta Baral. 13679-13707 [doi]
- Meta-Tuning LLMs to Leverage Lexical Knowledge for Generalizable Language Style UnderstandingRuohao Guo, Wei Xu, Alan Ritter. 13708-13731 [doi]
- Reducing Privacy Risks in Online Self-Disclosures with Language ModelsYao Dou, Isadora Krsek, Tarek Naous, Anubha Kabra, Sauvik Das, Alan Ritter, Wei Xu. 13732-13754 [doi]
- Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language ModelsZihao Lin 0003, Mohammad Beigi, Hongxuan Li, Yufan Zhou, Yuxiang Zhang, Qifan Wang, Wenpeng Yin 0001, Lifu Huang. 13755-13772 [doi]
- REFINESUMM: Self-Refining MLLM for Generating a Multimodal Summarization DatasetVaidehi Patil, Leonardo F. R. Ribeiro, Mengwen Liu, Mohit Bansal, Markus Dreyer. 13773-13786 [doi]
- When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model LeaderboardsNorah Alzahrani, Hisham Abdullah Alyahya, Yazeed Alnumay, Sultan Alrashed, Shaykhah Alsubaie, Yousef Almushayqih, Faisal Mirza, Nouf Alotaibi, Nora Al-Twairesh, Areeb Alowisheq, M. Saiful Bari, Haidar Khan. 13787-13805 [doi]
- LLM-Rubric: A Multidimensional, Calibrated Approach to Automated Evaluation of Natural Language TextsHelia Hashemi, Jason Eisner, Corby Rosset, Benjamin Van Durme, Chris Kedzie. 13806-13834 [doi]
- LIEDER: Linguistically-Informed Evaluation for Discourse Entity RecognitionXiaomeng Zhu, Robert Frank 0001. 13835-13850 [doi]
- Evaluating Very Long-Term Conversational Memory of LLM AgentsAdyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Barbieri, Yuwei Fang. 13851-13870 [doi]
- Prototypical Reward Network for Data-Efficient Model AlignmentJinghan Zhang, Xiting Wang, Yiqiao Jin, Changyu Chen, Xinhao Zhang, Kunpeng Liu 0001. 13871-13884 [doi]
- NEO-BENCH: Evaluating Robustness of Large Language Models with NeologismsJonathan Zheng, Alan Ritter, Wei Xu 0004. 13885-13906 [doi]
- Impacts of Misspelled Queries on Translation and Product SearchGreg Hanneman, Natawut Monaikul, Taichi Nakatani. 13907-13920 [doi]
- Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMsBilgehan Sel, Priya Shanmugasundaram, Mohammad Kachuee, Kun Zhou, Ruoxi Jia 0001, Ming Jin 0002. 13921-13959 [doi]
- The MERSA Dataset and a Transformer-Based Approach for Speech Emotion RecognitionEnshi Zhang, Rafael Trujillo, Christian Poellabauer. 13960-13970 [doi]
- Transparent and Scrutable Recommendations Using Natural Language User ProfilesJerome Ramos, Hossein A. Rahmani, Xi Wang 0012, Xiao Fu 0007, Aldo Lipani. 13971-13984 [doi]
- Fora: A corpus and framework for the study of facilitated dialogueHope Schroeder, Deb Roy, Jad Kabbara. 13985-14001 [doi]
- Explanation-aware Soft Ensemble Empowers Large Language Model In-context LearningYue Yu, Jiaming Shen, Tianqi Liu 0002, Zhen Qin 0001, Jing Nathan Yan, Jialu Liu, Chao Zhang 0014, Michael Bendersky. 14002-14024 [doi]
- What is the Best Way for ChatGPT to Translate Poetry?Shanshan Wang, Derek F. Wong, Jingming Yao, Lidia S. Chao. 14025-14043 [doi]
- Rephrasing the Web: A Recipe for Compute and Data-Efficient Language ModelingPratyush Maini, Skyler Seto, Richard He Bai, David Grangier, Yizhe Zhang 0002, Navdeep Jaitly. 14044-14072 [doi]
- DeCoT: Debiasing Chain-of-Thought for Knowledge-Intensive Tasks in Large Language Models via Causal InterventionJunda Wu, Tong Yu 0001, Xiang Chen, Haoliang Wang, Ryan A. Rossi, SungChul Kim, Anup B. Rao, Julian J. McAuley. 14073-14087 [doi]
- Representation Learning with Conditional Information Flow MaximizationDou Hu 0001, Lingwei Wei, Wei Zhou 0019, Songlin Hu. 14088-14103 [doi]
- GPT is Not an Annotator: The Necessity of Human Annotation in Fairness Benchmark ConstructionVirginia K. Felkner, Jennifer A. Thompson, Jonathan May. 14104-14115 [doi]
- Quantifying Contamination in Evaluating Code Generation Capabilities of Language ModelsMartin Riddell, Ansong Ni, Arman Cohan. 14116-14137 [doi]
- Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task ArithmeticRishabh Bhardwaj, Do Duc Anh, Soujanya Poria. 14138-14149 [doi]
- Tracking the Newsworthiness of Public DocumentsAlexander Spangher, Serdar Tumgoren, Ben Welsh, Nanyun Peng, Emilio Ferrara, Jonathan May. 14150-14168 [doi]
- EWEK-QA: Enhanced Web and Efficient Knowledge Graph Retrieval for Citation-based Question Answering SystemsMohammad Dehghan, Mohammad Ali Alomrani, Sunyam Bagga, David Alfonso-Hermelo, Khalil Bibi, Abbas Ghaddar, Yingxue Zhang 0001, Xiaoguang Li, Jianye Hao, Qun Liu 0001, Jimmy Lin, Boxing Chen, Prasanna Parthasarathi, Mahdi Biparva, Mehdi Rezagholizadeh. 14169-14187 [doi]
- Multi-modal Preference Alignment Remedies Degradation of Visual Instruction Tuning on Language ModelsShengzhi Li, Rongyu Lin, Shichao Pei. 14188-14200 [doi]
- Multistage Collaborative Knowledge Distillation from a Large Language Model for Semi-Supervised Sequence GenerationJiachen Zhao, Wenlong Zhao 0001, Andrew Drozdov, Benjamin Rozonoyer, Md. Arafat Sultan, Jay Yoon Lee, Mohit Iyyer, Andrew McCallum. 14201-14214 [doi]
- Controlled Text Generation for Black-box Language Models via Score-based Progressive EditorSangwon Yu, Changmin Lee, Hojin Lee 0006, Sungroh Yoon. 14215-14237 [doi]
- LogogramNLP: Comparing Visual and Textual Representations of Ancient Logographic Writing Systems for NLPDanlu Chen, Freda Shi, Aditi Agarwal, Jacobo Myerston, Taylor Berg-Kirkpatrick. 14238-14254 [doi]
- Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-TuningMing Li, Yong Zhang, Shwai He, Zhitao Li, Hongyu Zhao, Jianzong Wang, Ning Cheng 0001, Tianyi Zhou 0001. 14255-14273 [doi]
- Confabulation: The Surprising Value of Large Language Model HallucinationsPeiqi Sui, Eamon Duede, Sophie Wu, Richard Jean So. 14274-14284 [doi]
- IAPT: Instance-Aware Prompt Tuning for Large Language ModelsWei Zhu 0016, Aaron Xuxiang Tian, Congrui Yin, Yuan Ni, Xiaoling Wang, Guotong Xie. 14285-14304 [doi]
- DeVAn: Dense Video Annotation for Video-Language ModelsTingkai Liu, Yunzhe Tao, Haogeng Liu, Qihang Fan, Ding Zhou, Huaibo Huang, Ran He 0001, Hongxia Yang. 14305-14321 [doi]
- How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMsYi Zeng 0005, Hongpeng Lin, Jingwen Zhang, Diyi Yang, Ruoxi Jia 0001, Weiyan Shi. 14322-14350 [doi]
- The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language ModelsAdithya Bhaskar, Dan Friedman, Danqi Chen 0001. 14351-14368 [doi]
- Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language ModelsLei Li 0039, Yuqi Wang, Runxin Xu, Peiyi Wang, Xiachong Feng, Lingpeng Kong, Qi Liu 0049. 14369-14387 [doi]
- L-Eval: Instituting Standardized Evaluation for Long Context Language ModelsChenxin An, Shansan Gong, Ming Zhong 0005, Xingjian Zhao, Mukai Li, Jun Zhang, Lingpeng Kong, Xipeng Qiu. 14388-14411 [doi]
- DIALECTBENCH: An NLP Benchmark for Dialects, Varieties, and Closely-Related LanguagesFahim Faisal, Orevaoghene Ahia, Aarohi Srivastava, Kabir Ahuja, David Chiang 0001, Yulia Tsvetkov, Antonios Anastasopoulos. 14412-14454 [doi]
- Causal-Guided Active Learning for Debiasing Large Language ModelsZhouhao Sun, Li Du, Xiao Ding, Yixuan Ma, Yang Zhao, Kaitao Qiu, Ting Liu 0001, Bing Qin 0001. 14455-14469 [doi]
- PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM AgentsQisen Yang, Zekun Wang, Honghui Chen, Shenzhi Wang, Yifan Pu, Xin Gao, Wenhao Huang, Shiji Song, Gao Huang 0001. 14470-14505 [doi]
- Towards Better Understanding of Contrastive Sentence Representation Learning: A Unified Paradigm for GradientMingxin Li, Richong Zhang, Zhijie Nie. 14506-14521 [doi]
- Emergent Word Order Universals from Cognitively-Motivated Language ModelsTatsuki Kuribayashi, Ryo Ueda, Ryo Yoshida, Yohei Oseki, Ted Briscoe, Timothy Baldwin. 14522-14543 [doi]
- Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology ViewJintian Zhang, Xin Xu, Ningyu Zhang 0001, Ruibo Liu, Bryan Hooi, Shumin Deng. 14544-14607 [doi]
- MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Module PluginTianshuo Zhou, Sen Mei, Xinze Li, Zhenghao Liu, Chenyan Xiong, Zhiyuan Liu 0001, Yu Gu 0002, Ge Yu 0001. 14608-14624 [doi]
- Distributional Inclusion Hypothesis and Quantifications: Probing for Hypernymy in Functional Distributional SemanticsChun Hei Lo, Wai Lam, Hong Cheng 0001, Guy Emerson. 14625-14637 [doi]
- CausalGym: Benchmarking causal interpretability methods on linguistic tasksAryaman Arora, Dan Jurafsky, Christopher Potts. 14638-14663 [doi]
- Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM CollaborationShangbin Feng, Weijia Shi, Yike Wang 0002, Wenxuan Ding 0001, Vidhisha Balachandran, Yulia Tsvetkov. 14664-14690 [doi]
- Mission: Impossible Language ModelsJulie Kallini, Isabel Papadimitriou, Richard Futrell, Kyle Mahowald, Christopher Potts. 14691-14714 [doi]
- Semisupervised Neural Proto-Language ReconstructionLiang Lu, Peirong Xie, David R. Mortensen. 14715-14759 [doi]
- Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?Marco Gaido, Sara Papi, Matteo Negri, Luisa Bentivogli. 14760-14778 [doi]
- Speech vs. Transcript: Does It Matter for Human Annotators in Speech Summarization?Roshan Sharma 0001, Suwon Shon, Mark Lindsey, Hira Dhamyal, Bhiksha Raj. 14779-14797 [doi]
- D2LLM: Decomposed and Distilled Large Language Models for Semantic SearchZihan Liao, Hang Yu 0002, Jianguo Li, Jun Wang 0006, Wei Zhang 0056. 14798-14814 [doi]
- Arabic Diacritics in the Wild: Exploiting Opportunities for Improved DiacritizationSalman Elgamal, Ossama Obeid, MHD Tameem Kabbani, Go Inoue, Nizar Habash. 14815-14829 [doi]
- Disinformation Capabilities of Large Language ModelsIvan Vykopal, Matús Pikuliak, Ivan Srba, Róbert Móro, Dominik Macko, Mária Bieliková. 14830-14847 [doi]
- Learn or Recall? Revisiting Incremental Learning with Pre-trained Language ModelsJunhao Zheng, Shengjie Qiu, Qianli Ma 0001. 14848-14877 [doi]
- How to Handle Different Types of Out-of-Distribution Scenarios in Computational Argumentation? A Comprehensive and Fine-Grained Field StudyAndreas Waldis, Yufang Hou 0001, Iryna Gurevych. 14878-14898 [doi]
- Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian LanguagesSamuel Cahyawijaya, Holy Lovenia, Fajri Koto, Rifki Afina Putri, Tjeng Wawan Cenggoro, Jhonson Lee, Salsabil Maulana Akbar, Emmanuel Dave, Nuur Shadieq, Muhammad Ihza Mahendra, Dea Annisayanti Putri, Bryan Wilie, Genta Indra Winata, Alham Fikri Aji, Ayu Purwarianti, Pascale Fung. 14899-14914 [doi]
- Must NLP be Extractive?Steven Bird. 14915-14929 [doi]
- Spiral of Silence: How is Large Language Model Killing Information Retrieval? - A Case Study on Open Domain Question AnsweringXiaoyang Chen, Ben He, Hongyu Lin, Xianpei Han, Tianshu Wang, Boxi Cao, Le Sun 0001, Yingfei Sun. 14930-14951 [doi]
- Latxa: An Open Language Model and Evaluation Suite for BasqueJulen Etxaniz, Oscar Sainz, Naiara Miguel, Itziar Aldabe, German Rigau, Eneko Agirre, Aitor Ormazabal, Mikel Artetxe, Aitor Soroa. 14952-14972 [doi]
- Why are Sensitive Functions Hard for Transformers?Michael Hahn 0001, Mark Rofin. 14973-15008 [doi]
- Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and ReactionHaoqiu Yan, Yongxin Zhu, Kai Zheng, Bing Liu, Haoyu Cao, Deqiang Jiang, Linli Xu. 15009-15022 [doi]
- IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code GeneratorsIndraneil Paul, Goran Glavas, Iryna Gurevych. 15023-15041 [doi]
- The Echoes of Multilinguality: Tracing Cultural Value Shifts during Language Model Fine-tuningRochelle Choenni, Anne Lauscher, Ekaterina Shutova. 15042-15058 [doi]
- MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language ModelingTomasz Limisiewicz, Terra Blevins, Hila Gonen, Orevaoghene Ahia, Luke Zettlemoyer. 15059-15076 [doi]
- MultiLegalPile: A 689GB Multilingual Legal CorpusJoel Niklaus, Veton Matoshi, Matthias Stürmer, Ilias Chalkidis, Daniel E. Ho. 15077-15094 [doi]
- WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with CitationsHaolin Deng, Chang Wang, Xin Li, Dezhang Yuan, Junlang Zhan, Tianhua Zhou, Jin Ma 0003, Jun Gao, Ruifeng Xu. 15095-15114 [doi]
- What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular LanguagesNadav Borenstein, Anej Svete, Robin Chan, Josef Valvoda, Franz Nowak, Isabelle Augenstein, Eleanor Chodroff, Ryan Cotterell. 15115-15134 [doi]
- Tree-Averaging Algorithms for Ensemble-Based Unsupervised Discontinuous Constituency ParsingBehzad Shayegh, Yuqiao Wen, Lili Mou. 15135-15156 [doi]
- ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMsFengqing Jiang, Zhangchen Xu, Luyao Niu, Zhen Xiang, Bhaskar Ramasubramanian, Bo Li 0026, Radha Poovendran. 15157-15173 [doi]
- ChatDev: Communicative Agents for Software DevelopmentChen Qian, Wei Liu, Hongzhang Liu, Nuo Chen, Yufan Dang, Jiahao Li, Cheng Yang 0002, Weize Chen, Yusheng Su, Xin Cong, Juyuan Xu, Dahai Li, Zhiyuan Liu 0001, Maosong Sun 0001. 15174-15186 [doi]
- Disentangled Learning with Synthetic Parallel Data for Text Style TransferJingxuan Han, Quan Wang 0002, Zikang Guo, Benfeng Xu, Licheng Zhang, Zhendong Mao. 15187-15201 [doi]
- PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System SafetyZaibin Zhang, Yongting Zhang, Lijun Li, Jing Shao, Hongzhi Gao, Yu Qiao 0001, Lijun Wang, Huchuan Lu, Feng Zhao. 15202-15231 [doi]
- Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support ConversationDongjin Kang, Sunghwan Kim, Taeyoon Kwon, Seungjun Moon, Hyunsouk Cho, Youngjae Yu, Dongha Lee, Jinyoung Yeo. 15232-15261 [doi]
- ∞Bench: Extending Long Context Evaluation Beyond 100K TokensXinrong Zhang, Yingfa Chen, Shengding Hu, Zihang Xu, Junhao Chen, Moo Khai Hao, Xu Han 0007, Zhen Leng Thai, Shuo Wang, Zhiyuan Liu 0001, Maosong Sun 0001. 15262-15277 [doi]
- Natural Language Satisfiability: Exploring the Problem Distribution and Evaluating Transformer-based Language ModelsTharindu Madusanka, Ian Pratt-Hartmann, Riza Batista-Navarro. 15278-15294 [doi]
- Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language ModelsPaul Röttger, Valentin Hofmann, Valentina Pyatkin, Musashi Hinck, Hannah Kirk, Hinrich Schütze, Dirk Hovy. 15295-15311 [doi]
- AI 'News' Content Farms Are Easy to Make and Hard to Detect: A Case Study in ItalianGiovanni Puccetti 0002, Anna Rogers, Chiara Alzetta, Felice dell'Orletta, Andrea Esuli. 15312-15338 [doi]
- Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language ModelsMosh Levy, Alon Jacoby, Yoav Goldberg. 15339-15353 [doi]
- Disambiguate Words like Composing Them: A Morphology-Informed Approach to Enhance Chinese Word Sense DisambiguationYue Wang, Qiliang Liang, Yaqi Yin, Hansi Wang, Yang Liu. 15354-15365 [doi]
- Do Llamas Work in English? On the Latent Language of Multilingual TransformersChris Wendler, Veniamin Veselovsky, Giovanni Monea, Robert West 0001. 15366-15394 [doi]
- G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine TranslationXingyuan Pan, Luyang Huang, Liyan Kang, Zhicheng Liu, Yu Lu, Shanbo Cheng. 15395-15406 [doi]
- Media Framing: A typology and Survey of Computational Approaches Across DisciplinesYulia Otmakhova 0001, Shima Khanehzar, Lea Frermann. 15407-15428 [doi]
- SPZ: A Semantic Perturbation-based Data Augmentation Method with Zonal-Mixing for Alzheimer's Disease DetectionFangfang Li, Cheng Huang, Puzhen Su, Jie Yin. 15429-15439 [doi]
- Calibrating Large Language Models Using Their Generations OnlyDennis Ulmer, Martin Gubri, Hwaran Lee, Sangdoo Yun, Seong Joon Oh. 15440-15459 [doi]
- Iterative Forward Tuning Boosts In-Context Learning in Language ModelsJiaxi Yang, Binyuan Hui, Min Yang 0007, Bailin Wang, Bowen Li, Binhua Li, Fei Huang 0004, Yongbin Li. 15460-15473 [doi]
- Pride and Prejudice: LLM Amplifies Self-Bias in Self-RefinementWenda Xu, Guanglei Zhu, Xuandong Zhao, Liangming Pan, Lei Li 0005, William Wang 0001. 15474-15492 [doi]
- Language Complexity and Speech Recognition Accuracy: Orthographic Complexity Hurts, Phonological Complexity Doesn'tChihiro Taguchi, David Chiang 0001. 15493-15503 [doi]
- Steering Llama 2 via Contrastive Activation AdditionNina Rimsky, Nick Gabrieli, Julian Schulz, Meg Tong, Evan Hubinger, Alexander Matt Turner. 15504-15522 [doi]
- EconAgent: Large Language Model-Empowered Agents for Simulating Macroeconomic ActivitiesNian Li, Chen Gao 0001, Mingyu Li, Yong Li 0008, Qingmin Liao. 15523-15536 [doi]
- SafetyBench: Evaluating the Safety of Large Language ModelsZhexin Zhang, Leqi Lei, Lindong Wu, Rui Sun, Yongkang Huang, Chong Long, Xiao Liu 0036, Xuanyu Lei, Jie Tang 0001, Minlie Huang. 15537-15553 [doi]
- Deciphering Oracle Bone Language with Diffusion ModelsHaisu Guan, Huanxin Yang, Xinyu Wang 0010, Shengwei Han, Yongge Liu, Lianwen Jin, Xiang Bai, Yuliang Liu. 15554-15567 [doi]
- M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language ModelsWai-Chung Kwan, Xingshan Zeng, Yufei Wang 0005, Yusen Sun, Liangyou Li, Yuxin Jiang, Lifeng Shang, Qun Liu 0001, Kam-Fai Wong. 15568-15592 [doi]
- RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via RomanizationJaavid Aktar Husain, Raj Dabre, Aswanth M., Jay Gala, Thanmay Jayakumar, Ratish Puduppully, Anoop Kunchukuttan. 15593-15615 [doi]
- Causal Estimation of Memorisation ProfilesPietro Lesci, Clara Meister, Thomas Hofmann, Andreas Vlachos 0001, Tiago Pimentel. 15616-15635 [doi]
- CHECKWHY: Causal Fact Verification via Argument StructureJiasheng Si, YiBo Zhao, YingJie Zhu, Haiyang Zhu, Wenpeng Lu, Deyu Zhou. 15636-15659 [doi]
- Quality-Aware Translation Models: Efficient Generation and Quality Estimation in a Single ModelChristian Tomani, David Vilar, Markus Freitag, Colin Cherry, Subhajit Naskar, Mara Finkelstein, Xavier Garcia, Daniel Cremers. 15660-15679 [doi]
- On Efficient and Statistical Quality Estimation for Data AnnotationJan-Christoph Klie, Juan Haladjian, Marc Kirchner, Rahul Nair. 15680-15696 [doi]
- EZ-STANCE: A Large Dataset for English Zero-Shot Stance DetectionChenye Zhao, Cornelia Caragea. 15697-15714 [doi]
- American Sign Language Handshapes Reflect Pressures for Communicative EfficiencyKayo Yin, Terry Regier, Dan Klein. 15715-15724 [doi]
- Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining ResearchLuca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Raghavi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar 0009, Li Lucy, Xinxi Lyu, Nathan Lambert 0001, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson 0001, Zejiang Shen 0001, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Evan Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo. 15725-15788 [doi]
- OLMo: Accelerating the Science of Language ModelsDirk Groeneveld, Iz Beltagy, Evan Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert 0001, Kyle Richardson 0001, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi. 15789-15809 [doi]
- Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!Zhanhui Zhou, Jie Liu, Zhichen Dong, Jiaheng Liu, Chao Yang 0026, Wanli Ouyang, Yu Qiao 0001. 15810-15830 [doi]
- IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian LanguagesMohammed Safi Ur Rahman Khan, Priyam Mehta, Ananth Sankar, Umashankar Kumaravelan, Sumanth Doddapaneni, Suriyaprasaad B, Varun Balan G, Sparsh Jain, Anoop Kunchukuttan, Pratyush Kumar, Raj Dabre, Mitesh M. Khapra. 15831-15879 [doi]
- Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language ModelsXiaolong Wang, Yile Wang, Yuanchi Zhang, Fuwen Luo, Peng Li 0030, Maosong Sun, Yang Liu. 15880-15893 [doi]
- Aya Model: An Instruction Finetuned Open-Access Multilingual Language ModelAhmet Üstün, Viraat Aryabumi, Zheng Xin Yong, Wei-Yin Ko, Daniel D'Souza, Gbemileke Onilude, Neel Bhandari, Shivalika Singh, Hui-Lee Ooi, Amr Kayid, Freddie Vargus, Phil Blunsom, Shayne Longpre, Niklas Muennighoff, Marzieh Fadaee, Julia Kreutzer, Sara Hooker. 15894-15939 [doi]
- BatchEval: Towards Human-like Text EvaluationPeiwen Yuan, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Boyuan Pan, Heda Wang, Yao Hu, Kan Li 0001. 15940-15958 [doi]
- ToMBench: Benchmarking Theory of Mind in Large Language ModelsZhuang Chen 0002, Jincenzi Wu, Jinfeng Zhou, Bosi Wen, Guanqun Bi, Gongyao Jiang, Yaru Cao, Mengting Hu, Yunghwei Lai, Zexuan Xiong, Minlie Huang. 15959-15983 [doi]
- COKE: A Cognitive Knowledge Graph for Machine Theory of MindJincenzi Wu, Zhuang Chen 0002, Jiawen Deng, Sahand Sabour, Helen Meng, Minlie Huang. 15984-16007 [doi]
- MultiPICo: Multilingual Perspectivist Irony CorpusSilvia Casola, Simona Frenda, Soda Marem Lo, Erhan Sezerer, Antonio Uva 0001, Valerio Basile, Cristina Bosco, Alessandro Pedrani, Chiara Rubagotti, Viviana Patti, Davide Bernardi. 16008-16021 [doi]
- AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding AgentsHarsh Trivedi, Tushar Khot, Mareike Hartmann, Ruskin Manku, Vinty Dong, Edward Li, Shashank Gupta, Ashish Sabharwal, Niranjan Balasubramanian. 16022-16076 [doi]
- MMToM-QA: Multimodal Theory of Mind Question AnsweringChuanyang Jin, Yutong Wu, Jing Cao, Jiannan Xiang, Yen-ling Kuo, Zhiting Hu, Tomer D. Ullman, Antonio Torralba 0001, Joshua B. Tenenbaum, Tianmin Shu. 16077-16102 [doi]
- DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Financial DocumentsYilun Zhao 0001, Yitao Long, Hongjun Liu, Ryo Kamoi, Linyong Nan, Lyuhao Chen, Yixin Liu 0003, Xiangru Tang, Rui Zhang 0037, Arman Cohan. 16103-16120 [doi]
- Unintended Impacts of LLM Alignment on Global RepresentationMichael J. Ryan, William Held, Diyi Yang. 16121-16140 [doi]
- ICLEF: In-Context Learning with Expert Feedback for Explainable Style TransferArkadiy Saakyan, Smaranda Muresan. 16141-16163 [doi]
- MAP's not dead yet: Uncovering true language model modes by conditioning away degeneracyDavis Yoshida, Kartik Goyal, Kevin Gimpel. 16164-16215 [doi]
- Guardians of the Machine Translation Meta-Evaluation: Sentinel Metrics Fall In!Stefano Perrella, Lorenzo Proietti 0002, Alessandro Scirè, Edoardo Barba, Roberto Navigli. 16216-16244 [doi]
- NounAtlas: Filling the Gap in Nominal Semantic Role LabelingRoberto Navigli, Marco Pinto, Pasquale Silvestri, Dennis Rotondi, Simone Ciciliano, Alessandro Scirè. 16245-16258 [doi]
- The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive ConversationRongwu Xu, Brian S. Lin, Shujian Yang, Tianqi Zhang, Weiyan Shi, Tianwei Zhang 0004, Zhixuan Fang, Wei Xu, Han Qiu 0001. 16259-16303 [doi]
- LooGLE: Can Long-Context Language Models Understand Long Contexts?Jiaqi Li, Mengmeng Wang, Zilong Zheng, Muhan Zhang. 16304-16333 [doi]
- Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face ConversationSe Jin Park, Chae Won Kim, Hyeongseop Rha, Minsu Kim, Joanna Hong, Jeong Hun Yeo, Yong Man Ro. 16334-16348 [doi]
- ECBD: Evidence-Centered Benchmark Design for NLPYu-Lu Liu, Su Lin Blodgett, Jackie C. K. Cheung, Vera Liao, Alexandra Olteanu, Ziang Xiao. 16349-16365 [doi]
- Having Beer after Prayer? Measuring Cultural Bias in Large Language ModelsTarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu 0004. 16366-16393 [doi]
- Explicating the Implicit: Argument Detection Beyond Sentence BoundariesPaul Roit, Aviv Slobodkin, Eran Hirsch, Arie Cattan, Ayal Klein, Valentina Pyatkin, Ido Dagan. 16394-16409 [doi]
- Word Embeddings Are Steers for Language ModelsChi Han, Jialiang Xu, Manling Li, Yi Fung 0001, Chenkai Sun, Nan Jiang, Tarek F. Abdelzaher, Heng Ji. 16410-16430 [doi]