- Frontmatter
- Are LLMs Good Annotators for Discourse-level Event Relation Extraction? Kangda Wei, Aayush Gautam, Ruihong Huang. 1-19
- Transferability of Syntax-Aware Graph Neural Networks in Zero-Shot Cross-Lingual Semantic Role Labeling. Rachel Devianti, Yusuke Miyao. 20-42
- Should Cross-Lingual AMR Parsing go Meta? An Empirical Assessment of Meta-Learning and Joint Learning AMR Parsing. Jeongwoo Kang 0001, Maximin Coavoux, Cédric Lopez, Didier Schwab. 43-51
- General Collaborative Framework between Large Language Model and Experts for Universal Information Extraction. Kunlong Bao, Ning Wang. 52-77
- SEAVER: Attention Reallocation for Mitigating Distractions in Language Models for Conditional Semantic Textual Similarity Measurement. Baixuan Li, Yunlong Fan, Zhiqiang Gao. 78-95
- Search if you don't know! Knowledge-Augmented Korean Grammatical Error Correction with Large Language Models. Seonmin Koo, Jinsung Kim, Chanjun Park, HeuiSeok Lim. 96-125
- Measuring the Robustness of NLP Models to Domain Shifts. Nitay Calderon, Naveh Porat, Eyal Ben-David, Alexander Chapanin, Zorik Gekhman, Nadav Oved, Vitaly Shalumov, Roi Reichart. 126-154
- Text2Model: Text-based Model Induction for Zero-shot Image Classification. Ohad Amosy, Tomer Volk, Eilam Shapira, Eyal Ben-David, Roi Reichart, Gal Chechik. 155-172
- InsertGNN: A Hierarchical Graph Neural Network for the TOEFL Sentence Insertion Problem. Fang Wu, Stan Z. Li. 173-180
- Unleashing Large Language Models' Proficiency in Zero-shot Essay Scoring. Sanwoo Lee, Yida Cai, Desong Meng, Ziyang Wang, Yunfang Wu. 181-198
- DetectBench: Can Large Language Model Detect and Piece Together Implicit Evidence? Zhouhong Gu, Lin Zhang, Xiaoxuan Zhu, Jiangjie Chen, Wenhao Huang, Yikai Zhang, Shusen Wang, Zheyu Ye, Yan Gao, Hongwei Feng, Yanghua Xiao. 199-222
- Improve Meta-learning for Few-Shot Text Classification with All You Can Acquire from the Tasks. Xinyue Liu, Yunlong Gao, Linlin Zong, Bo Xu 0009. 223-235
- CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity. Moshe Berchansky, Daniel Fleischer, Moshe Wasserblat, Peter Izsak. 236-246
- SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM. Jielin Qiu, Andrea Madotto, Zhaojiang Lin, Paul A. Crook, Yifan Ethan Xu, Babak Damavandi, Xin Dong 0001, Christos Faloutsos, Lei Li 0005, Seungwhan Moon. 247-266
- SRAP-Agent: Simulating and Optimizing Scarce Resource Allocation Policy with LLM-based Agent. Jiarui Ji, Yang Li, Hongtao Liu, Zhicheng Du, Zhewei Wei, Qi Qi, Weiran Shen, Yankai Lin. 267-293
- Ukrainian Resilience: A Dataset for Detection of Help-Seeking Signals Amidst the Chaos of War. MSVPJ Sathvik, Abhilash Dowpati, Srreyansh Sethi. 294-300
- Selective Annotation via Data Allocation: These Data Should Be Triaged to Experts for Annotation Rather Than the Model. Chen Huang, Yang Deng 0002, Wenqiang Lei, Jiancheng Lv 0001, Ido Dagan. 301-320
- Document Hashing with Multi-Grained Prototype-Induced Hierarchical Generative Model. Qian Zhang, Qinliang Su, Jiayang Chen, Zhenpeng Song. 321-333
- Predictive Multiplicity of Knowledge Graph Embeddings in Link Prediction. Yuqicheng Zhu, Nico Potyka, Mojtaba Nayyeri, Bo Xiong, Yunjie He, Evgeny Kharlamov, Steffen Staab. 334-354
- Temporal Fact Reasoning over Hyper-Relational Knowledge Graphs. Zifeng Ding, Jingcheng Wu, Jingpei Wu, Yan Xia 0003, Bo Xiong, Volker Tresp. 355-373
- GREEN: Generative Radiology Report Evaluation and Error Notation. Sophie Ostmeier, Justin Xu, Zhihong Chen, Maya Varma, Louis Blankemeier, Christian Bluethgen, Arne Md, Michael E. Moseley, Curtis P. Langlotz, Akshay Chaudhari, Jean-Benoit Delbrouck. 374-390
- XRec: Large Language Models for Explainable Recommendation. Qiyao Ma, Xubin Ren, Chao Huang 0001. 391-402
- LLM Questionnaire Completion for Automatic Psychiatric Assessment. Gony Rosenman, Talma Hendler, Lior Wolf. 403-415
- Disordered-DABS: A Benchmark for Dynamic Aspect-Based Summarization in Disordered Texts. Xiaobo Guo, Soroush Vosoughi. 416-431
- Walia-LLM: Enhancing Amharic-LLaMA by Integrating Task-Specific and Generative Datasets. Israel Abebe Azime, Atnafu Lambebo Tonja, Tadesse Destaw Belay, Mitiku Yohannes Fuge, Aman Kassahun Wassie, Eyasu Shiferaw Jada, Yonas Chanie, Walelign Tewabe Sewunetie, Seid Muhie Yimam. 432-444
- Can Large Language Models Identify Authorship? Baixiang Huang, Canyu Chen, Kai Shu. 445-460
- TransLLaMa: LLM-based Simultaneous Translation System. Roman Koshkin, Katsuhito Sudoh, Satoshi Nakamura 0001. 461-476
- Axis Tour: Word Tour Determines the Order of Axes in ICA-transformed Embeddings. Hiroaki Yamagiwa, Yusuke Takase, Hidetoshi Shimodaira. 477-506
- Granularity is crucial when applying differential privacy to text: An investigation for neural machine translation. Doan Nam Long Vu, Timour Igamberdiev, Ivan Habernal. 507-527
- An Open-Source Data Contamination Report for Large Language Models. Yucheng Li 0001, Yunhao Guo, Frank Guerin, Chenghua Lin. 528-541
- Few shot chain-of-thought driven reasoning to prompt LLMs for open-ended medical question answering. Saeel Sandeep Nachane, Ojas Gramopadhye, Prateek Chanda, Ganesh Ramakrishnan, Kshitij Sharad Jadhav, Yatin Nandwani, Dinesh Raghu, Sachindra Joshi. 542-573
- Reformatted Alignment. Run-Ze Fan, Xuefeng Li, Haoyang Zou, Junlong Li, Shwai He, Ethan Chern, Jiewen Hu, Pengfei Liu. 574-597
- Unsupervised Domain Adaptation for Keyphrase Generation using Citation Contexts. Florian Boudin, Akiko Aizawa. 598-614
- SMILE: Single-turn to Multi-turn Inclusive Language Expansion via ChatGPT for Mental Health Support. Huachuan Qiu, Hongliang He, Shuai Zhang, Anqi Li, Zhenzhong Lan. 615-636
- DocEE-zh: A Fine-grained Benchmark for Chinese Document-level Event Extraction. Minghui Liu, Meihan Tong, Yangda Peng, Lei Hou 0001, Juanzi Li, Bin Xu 0001. 637-649
- MalayMMLU: A Multitask Benchmark for the Low-Resource Malay Language. Soon Chang Poh, Sze Jue Yang, Jeraelyn Tan, Lawrence Chieng, Jia Huei Tan, Zhenyu Yu, Foong Mun, Chee Seng Chan. 650-669
- Symbolic Prompt Program Search: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization. Tobias Schnabel, Jennifer Neville. 670-686
- Learning to Route for Dynamic Adapter Composition in Continual Learning with Language Models. Vladimir Araujo, Marie-Francine Moens, Tinne Tuytelaars. 687-696
- LLM-supertagger: Categorial Grammar Supertagging via Large Language Models. Jinman Zhao, Gerald Penn. 697-705
- Editing Conceptual Knowledge for Large Language Models. Xiaohan Wang, Shengyu Mao, Shumin Deng, Yunzhi Yao, Yue Shen, Lei Liang, Jinjie Gu, Huajun Chen, Ningyu Zhang 0001. 706-724
- RAG-Studio: Towards In-Domain Adaptation of Retrieval Augmented Generation Through Self-Alignment. Kelong Mao, Zheng Liu 0011, Hongjin Qian, Fengran Mo, Chenlong Deng, Zhicheng Dou. 725-735
- MMCode: Benchmarking Multimodal Large Language Models for Code Generation with Visually Rich Programming Problems. Kaixin Li, Yuchen Tian, Qisheng Hu, Ziyang Luo, Zhiyong Huang, Jing Ma 0004. 736-783
- Enabling Discriminative Reasoning in LLMs for Legal Judgment Prediction. Chenlong Deng, Kelong Mao, Yuyao Zhang, Zhicheng Dou. 784-796
- Preserving Pre-trained Representation Space: On Effectiveness of Prefix-tuning for Large Multi-modal Models. Donghoon Kim, Gusang Lee, Kyuhong Shim, Byonghyo Shim. 797-819
- What Would Happen Next? Predicting Consequences from An Event Causality Graph. Chuanhong Zhan, Wei Xiang 0005, Liang Chao, Bang Wang. 820-832
- Can LLMs Learn From Mistakes? An Empirical Study on Reasoning Tasks. Shengnan An, Zexiong Ma, Siqi Cai, Zeqi Lin, Nanning Zheng 0001, Jian-Guang Lou, Weizhu Chen. 833-854
- Temporal Cognitive Tree: A Hierarchical Modeling Approach for Event Temporal Relation Extraction. Wanting Ning, Lishuang Li, Xueyang Qin, Yubo Feng, Jingyao Tang. 855-864
- LongGenBench: Long-context Generation Benchmark. Xiang Liu, Peijie Dong, Xuming Hu, Xiaowen Chu 0001. 865-883
- RaFe: Ranking Feedback Improves Query Rewriting for RAG. Shengyu Mao, Yong Jiang 0001, Boli Chen, Xiao Li, Peng Wang 0104, Xinyu Wang 0013, Pengjun Xie, Fei Huang 0004, Huajun Chen, Ningyu Zhang 0001. 884-901
- BASES: Large-scale Web Search User Simulation with Large Language Model based Agents. Ruiyang Ren, Peng Qiu, Yingqi Qu, Jing Liu 0022, Xin Zhao 0018, Hua Wu 0003, Ji-Rong Wen, Haifeng Wang 0001. 902-917
- Make Large Language Model a Better Ranker. Wenshuo Chao, Zhi Zheng 0008, Hengshu Zhu, Hao Liu 0026. 918-929
- SpeciaLex: A Benchmark for In-Context Specialized Lexicon Learning. Joseph Marvin Imperial, Harish Tayyar Madabushi. 930-965
- Devil's Advocate: Anticipatory Reflection for LLM Agents. Haoyu Wang 0005, Tao Li 0039, Zhiwei Deng, Dan Roth, Yang Li 0150. 966-978
- API Is Enough: Conformal Prediction for Large Language Models Without Logit-Access. Jiayuan Su, Jing Luo, Hongwei Wang, Lu Cheng. 979-995
- Introducing Compiler Semantics into Large Language Models as Programming Language Translators: A Case Study of C to x86 Assembly. Shuoming Zhang, Jiacheng Zhao, Chunwei Xia, Zheng Wang 0001, Yunji Chen, Huimin Cui. 996-1011
- Negating Negatives: Alignment with Human Negative Samples via Distributional Dispreference Optimization. Shitong Duan, Xiaoyuan Yi, Peng Zhang 0060, Yan Liu 0002, Zheng Liu 0011, Tun Lu, Xing Xie 0001, Ning Gu. 1012-1042
- OffsetBias: Leveraging Debiased Data for Tuning Evaluators. Junsoo Park, Seungyeon Jwa, Meiying Ren, Daeyoung Kim, Sanghyuk Choi. 1043-1067
- Employing Glyphic Information for Chinese Event Extraction with Vision-Language Model. Xiaoyi Bao, Jinghang Gu, Zhongqing Wang, Minjie Qiang, Chu-Ren Huang. 1068-1080
- Can CLIP Count Stars? An Empirical Study on Quantity Bias in CLIP. Zeliang Zhang, Zhuo Liu, Mingqian Feng, Chenliang Xu. 1081-1086
- LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning. Silin Meng, Yiwei Wang 0001, Cheng-Fu Yang, Nanyun Peng, Kai-Wei Chang. 1087-1102
- Guided Knowledge Generation with Language Models for Commonsense Reasoning. Xiao Wei, Haoran Chen, Hang Yu, Hao Fei, Qian Liu. 1103-1136
- BSharedRAG: Backbone Shared Retrieval-Augmented Generation for the E-commerce Domain. Kaisi Guan, Qian Cao 0001, Yuchong Sun, Xiting Wang, Ruihua Song. 1137-1158
- NCPrompt: NSP-Based Prompt Learning and Contrastive Learning for Implicit Discourse Relation Recognition. Yuetong Rong, Yijun Mo. 1159-1169
- SAFETY-J: Evaluating Safety with Critique. Yixiu Liu, Yuxiang Zheng, Shijie Xia, Jiajun Li, Yi Tu, Chaoling Song, Pengfei Liu. 1170-1192
- Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL. Dingzirui Wang, Longxu Dou, Xuanliang Zhang, Qingfu Zhu, Wanxiang Che. 1193-1207
- A Unified Framework and Dataset for Assessing Societal Bias in Vision-Language Models. Ashutosh Sathe, Prachi Jain, Sunayana Sitaram. 1208-1249
- Breaking the Boundaries: A Unified Framework for Chinese Named Entity Recognition Across Text and Speech. Jinzhong Ning, Yuanyuan Sun, Bo Xu, Zhihao Yang, Ling Luo 0001, Hongfei Lin. 1250-1260
- VGA: Vision GUI Assistant - Minimizing Hallucinations through Image-Centric Fine-Tuning. Ziyang Meng, Yu Dai, Zezheng Gong, Shaoxiong Guo, Minglong Tang, Tongquan Wei. 1261-1279
- Understanding the Therapeutic Relationship between Counselors and Clients in Online Text-based Counseling using LLMs. Anqi Li, Yu Lu, Nirui Song, Shuai Zhang, Lizhi Ma, Zhenzhong Lan. 1280-1303
- Dynamic Planning for LLM-based Graphical User Interface Automation. Shaoqing Zhang, Zhuosheng Zhang 0001, Kehai Chen, Xinbei Ma, Muyun Yang, Tiejun Zhao, Min Zhang 0005. 1304-1320
- SeRTS: Self-Rewarding Tree Search for Biomedical Retrieval-Augmented Generation. Minda Hu, Licheng Zong, Hongru Wang 0003, Jingyan Zhou, Jingjing Li 0007, Yichen Gao, Kam-Fai Wong, Yu Li 0006, Irwin King. 1321-1335
- Large Language Model-based Human-Agent Collaboration for Complex Task Solving. Xueyang Feng, Zhiyuan Chen, Yujia Qin, Yankai Lin, Xu Chen 0017, Zhiyuan Liu 0001, Ji-Rong Wen. 1336-1357
- MM-MATH: Advancing Multimodal Math Evaluation with Process Evaluation and Fine-grained Classification. Kai Sun, Yushi Bai, Ji Qi, Lei Hou 0001, Juan-Zi Li. 1358-1375
- LongAlign: A Recipe for Long Context Alignment of Large Language Models. Yushi Bai, Xin Lv, Jiajie Zhang, Yuze He, Ji Qi, Lei Hou 0001, Jie Tang 0001, Yuxiao Dong, Juanzi Li. 1376-1395
- Let's Ask GNN: Empowering Large Language Model for Graph In-Context Learning. Zhengyu Hu, Yichuan Li 0001, Zhengyu Chen 0001, Jingang Wang, Han Liu, Kyumin Lee, Kaize Ding. 1396-1409
- CoXQL: A Dataset for Parsing Explanation Requests in Conversational XAI Systems. Qianli Wang, Tatiana Anikina, Nils Feldhus, Simon Ostermann 0002, Sebastian Möller 0001. 1410-1422
- Evaluating Language Model Character Traits. Francis Rhys Ward, Zejia Yang, Alex Jackson, Randy Brown, Chandler Smith, Grace Colverd, Louis Thomson, Raymond Douglas, Patrik Bartak, Andrew Rowan. 1423-1443
- Self-Explore: Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards. Hyeonbin Hwang, Doyoung Kim, Seungone Kim, Seonghyeon Ye, Minjoon Seo. 1444-1466
- R-Judge: Benchmarking Safety Risk Awareness for LLM Agents. Tongxin Yuan, Zhiwei He 0002, Lingzhong Dong, Yiming Wang, Ruijie Zhao 0001, Tian Xia, Lizhen Xu, Binglin Zhou, Fangqi Li 0001, Zhuosheng Zhang 0001, Rui Wang 0015, Gongshen Liu. 1467-1490
- EAVE: Efficient Product Attribute Value Extraction via Lightweight Sparse-layer Interaction. Li Yang, Qifan Wang, Jianfeng Chi, Jiahao Liu, Jingang Wang, Fuli Feng, Zenglin Xu, Yi Fang 0008, Lifu Huang, Dongfang Liu. 1491-1505
- MultiSkill: Evaluating Large Multimodal Models for Fine-grained Alignment Skills. Zhenran Xu, Senbao Shi, Baotian Hu, Longyue Wang, Min Zhang 0005. 1506-1523
- To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models. Bozhong Tian, Xiaozhuan Liang, Siyuan Cheng 0008, Qingbin Liu, Mengru Wang, Dianbo Sui, Xi Chen 0003, Huajun Chen, Ningyu Zhang 0001. 1524-1537
- EchoSight: Advancing Visual-Language Models with Wiki Knowledge. Yibin Yan, Weidi Xie. 1538-1551
- Diversify, Rationalize, and Combine: Ensembling Multiple QA Strategies for Zero-shot Knowledge-based VQA. Miaoyu Li, Haoxin Li, Zilin Du, Boyang Li. 1552-1566
- Reconfidencing LLMs from the Grouping Loss Perspective. Lihu Chen, Alexandre Perez-Lebel, Fabian M. Suchanek, Gaël Varoquaux. 1567-1581
- Tokenization Falling Short: On Subword Robustness in Large Language Models. Yekun Chai, Yewei Fang, Qiwei Peng 0002, Xuhong Li 0002. 1582-1599
- AC-EVAL: Evaluating Ancient Chinese Language Understanding in Large Language Models. Yuting Wei, Yuanxing Xu, Xinru Wei, Simin Yang, Yangfu Zhu, Yuqing Li, Di Liu, Bin Wu. 1600-1617
- MMAR: Multilingual and Multimodal Anaphora Resolution in Instructional Videos. Cennet Oguz, Pascal Denis, Simon Ostermann 0002, Emmanuel Vincent 0001, Natalia Skachkova, Josef von Genabith. 1618-1633
- Dealing with Controversy: An Emotion and Coping Strategy Corpus Based on Role Playing. Enrica Troiano, Sofie Labat, Marco Stranisci, Rossana Damiano, Viviana Patti, Roman Klinger. 1634-1658
- MATE: Meet At The Embedding - Connecting Images with Long Texts. Young-Kyun Jang, Junmo Kang, Yong Jae Lee, Donghyun Kim. 1659-1672
- Mixed Distillation Helps Smaller Language Models Reason Better. Chenglin Li, Qianglong Chen, Liangyue Li, Caiyu Wang, Feng Tao, Yicheng Li, Zulong Chen, Yin Zhang 0006. 1673-1690
- The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models. Xinyi Chen, Baohao Liao, Jirui Qi, Panagiotis Eustratiadis, Christof Monz, Arianna Bisazza, Maarten de Rijke. 1691-1706
- Optimizing Instruction Synthesis: Effective Exploration of Evolutionary Space with Tree Search. Chenglin Li, Qianglong Chen, Zhi Li, Feng Tao, Yicheng Li, Hao Chen, Fei Yu, Yin Zhang. 1707-1721
- Suri: Multi-constraint Instruction Following in Long-form Text Generation. Chau Pham, Simeng Sun, Mohit Iyyer. 1722-1753
- Augmenting Black-box LLMs with Medical Textbooks for Biomedical Question Answering. Yubo Wang, Xueguang Ma, Wenhu Chen. 1754-1770
- Exploring Multilingual Concepts of Human Values in Large Language Models: Is Value Alignment Consistent, Transferable and Controllable across Languages? Shaoyang Xu, Weilong Dong, Zishan Guo, Xinwei Wu, Deyi Xiong. 1771-1793
- PaCoST: Paired Confidence Significance Testing for Benchmark Contamination Detection in Large Language Models. Huixuan Zhang, Yun Lin, Xiaojun Wan 0001. 1794-1809
- UrbanLLM: Autonomous Urban Activity Planning and Management with Large Language Models. Yue Jiang, Qin Chao, Yile Chen 0001, Xiucheng Li, Shuai Liu, Gao Cong. 1810-1825
- Breaking the Ceiling of the LLM Community by Treating Token Generation as a Classification for Ensembling. Yao-Ching Yu, Chun-Chih Kuo, Ziqi Ye, Yu-Cheng Chang, Yueh-Se Li. 1826-1839
- Eliciting Instruction-tuned Code Language Models' Capabilities to Utilize Auxiliary Function for Code Generation. Seonghyeon Lee, Suyeon Kim, Joonwon Jang, Heejae Chon, Dongha Lee, Hwanjo Yu. 1840-1846
- AHP-Powered LLM Reasoning for Multi-Criteria Evaluation of Open-Ended Responses. Xiaotian Lu, Jiyi Li, Koh Takeuchi, Hisashi Kashima. 1847-1856
- Enhancing Fine-Grained Image Classifications via Cascaded Vision Language Models. Canshi Wei. 1857-1871
- Exploring the Best Practices of Query Expansion with Large Language Models. Le Zhang, Yihong Wu, Qian Yang, Jian-Yun Nie. 1872-1883
- Chain-of-Rewrite: Aligning Question and Documents for Open-Domain Question Answering. Chunlei Xin, Yaojie Lu 0001, Hongyu Lin, Shuheng Zhou, Huijia Zhu, Weiqiang Wang, Zhongyi Liu, Xianpei Han, Le Sun 0001. 1884-1896
- MGCL: Multi-Granularity Clue Learning for Emotion-Cause Pair Extraction via Cross-Grained Knowledge Distillation. Yang Yu, Xin Lin 0001, Changqun Li, Shizhou Huang, Liang He 0001. 1897-1907
- Efficient Data Generation for Source-grounded Information-seeking Dialogs: A Use Case for Meeting Transcripts. Lotem Golany, Filippo Galgani, Maya Mamo, Nimrod Parasol, Omer Vandsburger, Nadav Bar, Ido Dagan. 1908-1925
- Visual Question Decomposition on Multimodal Large Language Models. Haowei Zhang, Jianzhe Liu, Zhen Han, Shuo Chen 0014, Bailan He, Volker Tresp, Zhiqiang Xu, Jindong Gu. 1926-1949
- ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs. Jingming Zhuo, Songyang Zhang, XinYu Fang, Haodong Duan, Dahua Lin, Kai Chen 0026. 1950-1976
- Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models. Kai Yao, Penglei Gao, Lichun Li, Yuan Zhao, Xiaofeng Wang 0007, Wei Wang 0002, Jianke Zhu. 1977-1992
- Abstraction-of-Thought Makes Language Models Better Reasoners. Ruixin Hong, Hongming Zhang 0009, Xiaoman Pan, Dong Yu 0001, Changshui Zhang. 1993-2027
- LLMs Cannot (Yet) Match the Specificity and Simplicity of Online Communities in Long Form Question Answering. Kris-Fillip Kahl, Tolga Buz, Russa Biswas, Gerard de Melo. 2028-2053
- Automated Tone Transcription and Clustering with Tone2Vec. Yi Yang, Yiming Wang, ZhiQiang Tang, Jiahong Yuan. 2054-2065
- Multi-dimensional Evaluation of Empathetic Dialogue Responses. Zhichao Xu, Jiepu Jiang. 2066-2087
- Translation of Multifaceted Data without Re-Training of Machine Translation Systems. Hyeonseok Moon, Seungyoon Lee, Seongtae Hong, Seungjun Lee, Chanjun Park, HeuiSeok Lim. 2088-2108
- Reward Difference Optimization For Sample Reweighting In Offline RLHF. Shiqi Wang 0003, Zhengze Zhang, Rui Zhao, Fei Tan, Cam-Tu Nguyen. 2109-2123
- AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories. Yifan Song, Weimin Xiong, Xiutian Zhao, Dawei Zhu, Wenhao Wu, Ke Wang, Cheng Li, Wei Peng, Sujian Li. 2124-2141
- Are LLMs Aware that Some Questions are not Open-ended? Dongjie Yang, Hai Zhao 0001. 2142-2152
- Conditional Language Policy: A General Framework For Steerable Multi-Objective Finetuning. Kaiwen Wang, Rahul Kidambi, Ryan Sullivan, Alekh Agarwal, Christoph Dann, Andrea Michi, Marco Gelmi, Yunxuan Li, Raghav Gupta, Kumar Dubey, Alexandre Ramé, Johan Ferret, Geoffrey Cideron, Le Hou, Hongkun Yu 0001, Amr Ahmed 0001, Aranyak Mehta, Léonard Hussenot, Olivier Bachem, Edouard Leurent. 2153-2186
- DALK: Dynamic Co-Augmentation of LLMs and KG to answer Alzheimer's Disease Questions with Scientific Literature. Dawei Li 0008, Shu Yang, Zhen Tan, Jae Young Baik, Sukwon Yun, Joseph Lee, Aaron Chacko, Bojian Hou, Duy Duong Tran, Ying Ding, Huan Liu 0001, Li Shen 0001, Tianlong Chen. 2187-2205
- Can AI Relate: Testing Large Language Model Response for Mental Health Support. Saadia Gabriel, Isha Puri, Xuhai Xu, Matteo Malgaroli, Marzyeh Ghassemi. 2206-2221
- Towards Robust Extractive Question Answering Models: Rethinking the Training Methodology. Son Tran, Matt Kretchmar. 2222-2236
- Enhancing Polyglot Voices by Leveraging Cross-Lingual Fine-Tuning in Any-to-One Voice Conversion. Giuseppe Ruggiero, Matteo Testa, Jurgen Van de Walle, Luigi Di Caro. 2237-2246
- IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce. Wenxuan Ding 0001, Weiqi Wang 0001, Sze Heng Douglas Kwok, Minghao Liu, Tianqing Fang, Jiaxin Bai, Xin Liu 0039, Changlong Yu, Zheng Li 0018, Chen Luo 0003, Qingyu Yin, Bing Yin, Junxian He, Yangqiu Song. 2247-2266
- Draft on the Fly: Adaptive Self-Speculative Decoding using Cosine Similarity. Michael R. Metel, Peng Lu, Boxing Chen, Mehdi Rezagholizadeh, Ivan Kobyzev. 2267-2272
- EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning. Yinzhu Quan, Zefang Liu. 2273-2282
- The Base-Rate Effect on LLM Benchmark Performance: Disambiguating Test-Taking Strategies from Benchmark Performance. Kyle Moore, Jesse Roberts, Thao Pham, Oseremhen Ewaleifoh, Douglas H. Fisher. 2283-2288
- Can LLM Graph Reasoning Generalize beyond Pattern Memorization? Yizhuo Zhang, Heng Wang 0008, Shangbin Feng, Zhaoxuan Tan, Xiaochuang Han, Tianxing He, Yulia Tsvetkov. 2289-2305
- Improving Multilingual Instruction Finetuning via Linguistically Natural and Diverse Datasets. Sathish Reddy Indurthi, Wenxuan Zhou, Shamil Chollampatt, Ravi Agrawal, Kaiqiang Song, Lingxiao Zhao, Chenguang Zhu. 2306-2323
- ASTE-Transformer: Modelling Dependencies in Aspect-Sentiment Triplet Extraction. Iwo Naglik, Mateusz Lango. 2324-2339
- Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach. Adam Wojciechowski, Mateusz Lango, Ondrej Dusek. 2340-2351
- SynTQA: Synergistic Table-based Question Answering via Mixture of Text-to-SQL and E2E TQA. Siyue Zhang, Anh Tuan Luu, Chen Zhao. 2352-2364
- OpenGraph: Towards Open Graph Foundation Models. Lianghao Xia, Ben Kao, Chao Huang 0001. 2365-2379
- Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework. Lu Chen, Ruqing Zhang 0001, Jiafeng Guo, Yixing Fan, Xueqi Cheng. 2380-2393
- Learning to Paraphrase for Alignment with LLM Preference. Junbo Fu, Guoshuai Zhao, Yimin Deng, Yunqi Mi, Xueming Qian. 2394-2407
- Mirror-Consistency: Harnessing Inconsistency in Majority Voting. Siyuan Huang 0003, Zhiyuan Ma, Jintao Du, Changhua Meng, Weiqiang Wang, Zhouhan Lin. 2408-2420
- Adaptive Contrastive Decoding in Retrieval-Augmented Generation for Handling Noisy Contexts. Youna Kim, Hyuhng Joon Kim, Cheonbok Park, Choonghyun Park, Hyunsoo Cho, Junyeob Kim, Kang Min Yoo, Sang-goo Lee, Taeuk Kim. 2421-2431
- AnyTrans: Translate AnyText in the Image with Large Scale Models. Zhipeng Qian, Pei Zhang 0011, Baosong Yang, Kai Fan 0002, Yiwei Ma, Derek F. Wong, Xiaoshuai Sun, Rongrong Ji. 2432-2444
- In-Context Former: Lightning-fast Compressing Context for Large Language Model. Xiangfeng Wang, Zaiyi Chen, Tong Xu 0001, Zheyong Xie, Yongyi He, Enhong Chen. 2445-2460
- How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States. Zhenhong Zhou, Haiyang Yu, Xinghua Zhang, Rongwu Xu, Fei Huang, Yongbin Li. 2461-2488
- A Coarse-to-Fine Prototype Learning Approach for Multi-Label Few-Shot Intent Detection. Xiaotong Zhang 0003, Xinyi Li, Feng Zhang, Zhiyi Wei, Junfeng Liu, Han Liu 0008. 2489-2502
- Can Large Language Models Understand DL-Lite Ontologies? An Empirical Study. Keyu Wang, Guilin Qi, Jiaqi Li, Songlin Zhai. 2503-2519
- Enhancing Healthcare LLM Trust with Atypical Presentations Recalibration. Jeremy Qin, Bang Liu, Quoc Dinh Nguyen. 2520-2537
- EvoR: Evolving Retrieval for Code Generation. Hongjin Su, Shuyang Jiang, Yuhang Lai, Haoyuan Wu, Boao Shi, Che Liu, Qian Liu, Tao Yu 0009. 2538-2554
- Head-wise Shareable Attention for Large Language Models. Zouying Cao, Yifei Yang, Hai Zhao 0001. 2555-2571
- Divide-or-Conquer? Which Part Should You Distill Your LLM? Zhuofeng Wu 0001, Richard He Bai, Aonan Zhang, Jiatao Gu, V. G. Vinod Vydiswaran, Navdeep Jaitly, Yizhe Zhang 0002. 2572-2585
- Navigating the Shortcut Maze: A Comprehensive Analysis of Shortcut Learning in Text Classification by Language Models. Yuqing Zhou, Ruixiang Tang, Ziyu Yao, Ziwei Zhu 0001. 2586-2614
- Privacy Evaluation Benchmarks for NLP Models. Wei Huang 0039, Yinggui Wang, Cen Chen. 2615-2636
- MM-ChatAlign: A Novel Multimodal Reasoning Framework based on Large Language Models for Entity Alignment. Xuhui Jiang, Yinghan Shen, ZhiChao Shi, Chengjin Xu, Wei Li, Huang Zihe, Jian Guo, Yuanzhuo Wang. 2637-2654
- Towards Explainable Computerized Adaptive Testing with Large Language Model. Cheng Cheng, Guanhao Zhao, Zhenya Huang, Yan Zhuang, Zhaoyuan Pan, Qi Liu 0003, Xin Li 0064, Enhong Chen. 2655-2672
- MC-indexing: Effective Long Document Retrieval via Multi-view Content-aware Indexing. Kuicai Dong, Derrick-Goh-Xin Deik, Yi Lee, Hao Zhang, Xiangyang Li, Cong Zhang, Yong Liu. 2673-2691
- PSLM: Parallel Generation of Text and Speech with LLMs for Low-Latency Spoken Dialogue Systems. Kentaro Mitsui, Koh Mitsuda, Toshiaki Wakatsuki, Yukiya Hono, Kei Sawada. 2692-2700
- Correct after Answer: Enhancing Multi-Span Question Answering with Post-Processing Method. Jiayi Lin, Chenyang Zhang, Haibo Tong, Dongyu Zhang, Qingqing Hong, Bingxuan Hou, Junli Wang. 2701-2717
- Are Large Language Models (LLMs) Good Social Predictors? Kaiqi Yang, Hang Li 0007, Hongzhi Wen, Tai-Quan Peng, Jiliang Tang, Hui Liu 0031. 2718-2730
- Bahasa Harmony: A Comprehensive Dataset for Bahasa Text-to-Speech Synthesis with Discrete Codec Modeling of EnGen-TTS. Onkar Susladkar, Vishesh Tripathi, Biddwan Ahmed. 2731-2741
- MINERS: Multilingual Language Models as Semantic Retrievers. Genta Indra Winata, Ruochen Zhang, David Ifeoluwa Adelani. 2742-2766
- BoolQuestions: Does Dense Retrieval Understand Boolean Logic in Language? Zongmeng Zhang, Jinhua Zhu 0001, Wengang Zhou, Xiang Qi, Peng Zhang 0080, Houqiang Li. 2767-2779
- McCrolin: Multi-consistency Cross-lingual Training for Retrieval Question Answering. Peerat Limkonchotiwat, Wuttikorn Ponwitayarat, Lalita Lowphansirikul, Potsawee Manakul, Can Udomcharoenchaikit, Ekapol Chuangsuwanich, Sarana Nutanong. 2780-2793
- A Novel Metric for Measuring the Robustness of Large Language Models in Non-adversarial Scenarios. Samuel Ackerman, Ella Rabinovich, Eitan Farchi, Ateret Anaby-Tavor. 2794-2802
- Learning Musical Representations for Music Performance Question Answering. Xingjian Diao, Chunhui Zhang, Tingxuan Wu, Ming Cheng, Zhongyu Ouyang, Weiyi Wu, Jiang Gui. 2803-2813
- Transfer Learning for Text Classification via Model Risk Analysis. Yujie Sun, Chuyi Fan, Qun Chen. 2814-2825
- Typos that Broke the RAG's Back: Genetic Attack on RAG Pipeline by Simulating Documents in the Wild via Low-level Perturbations. Sukmin Cho, Soyeong Jeong, Jeongyeon Seo, Taeho Hwang, Jong Park. 2826-2844
- Enhancing Temporal Modeling of Video LLMs via Time Gating. Zi-Yuan Hu, Yiwu Zhong, Shijia Huang, Michael R. Lyu, Liwei Wang 0009. 2845-2856
- AlignedCoT: Prompting Large Language Models via Native-Speaking Demonstrations. Zhicheng Yang, Yinya Huang, Jing Xiong, Liang Feng, Xiaodan Liang, Yiwei Wang, Jing Tang 0004. 2857-2896
- On the Empirical Complexity of Reasoning and Planning in LLMs. Liwei Kang, Zirui Zhao, David Hsu, Wee Sun Lee. 2897-2936
- Learning from Mistakes: Iterative Prompt Relabeling for Text-to-Image Diffusion Model Training. Xinyan Chen, Jiaxin Ge, Tianjun Zhang, Jiaming Liu 0003, Shanghang Zhang. 2937-2952
- Are modern neural ASR architectures robust for polysynthetic languages? Éric Le Ferrand, Zoey Liu, Antti Arppe, Emily Prud'hommeaux. 2953-2963
- A Notion of Complexity for Theory of Mind via Discrete World Models. X. Angelo Huang, Emanuele La Malfa, Samuele Marro, Andrea Asperti, Anthony G. Cohn 0001, Michael J. Wooldridge. 2964-2983
- Learning Dynamic Multi-attribute Interest for Personalized Product Search. Yutong Bai, Zhicheng Dou, Ji-Rong Wen. 2984-2993
- Evaluating Automatic Metrics with Incremental Machine Translation Systems. Guojun Wu, Shay B. Cohen, Rico Sennrich. 2994-3005
- LLM-Based Offline Learning for Embodied Agents via Consistency-Guided Reward Ensemble. Yujeong Lee, Sangwoo Shin, Wei-Jin Park, Honguk Woo. 3006-3029
- Self-Renewal Prompt Optimizing with Implicit Reasoning. ZiHan Liang, Ben Chen, Zhuoran Ran, Zihan Wang, Huangyu Dai, Yufei Ma 0011, Dehong Gao, Xiaoyan Cai, Libin Yang. 3030-3041
- Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models. Jiaming Li, Lei Zhang, Yunshui Li, Ziqiang Liu, Yuelin Bai, Run Luo, Longze Chen, Min Yang 0007. 3042-3059
- Women Are Beautiful, Men Are Leaders: Gender Stereotypes in Machine Translation and Language Modeling. Matús Pikuliak, Stefan Oresko, Andrea Hrckova, Marián Simko. 3060-3083
- Recent Trends in Linear Text Segmentation: A Survey. Iacopo Ghinassi, Lin Wang 0009, Chris Newell, Matthew Purver. 3084-3095
- mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding. Anwen Hu, Haiyang Xu, Jiabo Ye, Ming Yan, Liang Zhang, Bo Zhang 0071, Ji Zhang 0011, Qin Jin, Fei Huang 0004, Jingren Zhou. 3096-3120
- Exploring Question Guidance and Answer Calibration for Visually Grounded Video Question Answering. Yuanxing Xu, Yuting Wei, Shuai Zhong, Xinming Chen, Jinsheng Qi, Bin Wu. 3121-3133
- LoRAN: Improved Low-Rank Adaptation by a Non-Linear Transformation. Yinqiao Li, Linqi Song, Hanxu Hou. 3134-3143
- Large Language Models are Limited in Out-of-Context Knowledge Reasoning. Peng Hu, Changjiang Gao, RuiQi Gao, Jiajun Chen, Shujian Huang. 3144-3155
- BiKT: Enabling Bidirectional Knowledge Transfer Between Pretrained Models and Sequential Downstream Tasks. Hang Zeng, Chaoyue Niu, Fan Wu 0006, Shaojie Tang 0001, Leihao Pei, Chengfei Lv, Guihai Chen. 3156-3171
- Double-Checker: Large Language Model as a Checker for Few-shot Named Entity Recognition. Wei Chen 0156, Lili Zhao 0002, Zhi Zheng 0008, Tong Xu 0001, Yang Wang, Enhong Chen. 3172-3181
- Scaling Sentence Embeddings with Large Language Models. Ting Jiang, Shaohan Huang, Zhongzhi Luan, Deqing Wang, Fuzhen Zhuang. 3182-3196
- Exploring the Relationship between In-Context Learning and Instruction Tuning. Hanyu Duan, Yixuan Tang, Yi Yang 0042, Ahmed Abbasi, Kar Yan Tam. 3197-3210
- Granular Entity Mapper: Advancing Fine-grained Multimodal Named Entity Recognition and Grounding. Ziqi Wang, Chen Zhu 0003, Zhi Zheng 0008, Xinhang Li, Tong Xu 0001, Yongyi He, Qi Liu 0003, Ying Yu, Enhong Chen. 3211-3226
- JobFair: A Framework for Benchmarking Gender Hiring Bias in Large Language Models. Ze Wang, Zekun Wu 0003, Xin Guan, Michael Thaler, Adriano S. Koshiyama, Skylar Lu, Sachin Beepath, Ediz Ertekin Jr., María Pérez-Ortiz 0001. 3227-3246
- Contrastive Token Learning with Similarity Decay for Repetition Suppression in Machine Translation. Huangyu Dai, Ben Chen, Kaidi Chen, Ying Han, ZiHan Liang, Wen Jiang. 3247-3261
- A Psycholinguistic Evaluation of Language Models' Sensitivity to Argument Roles. Eun Kyoung Lee, Sathvik Nair, Naomi Feldman. 3262-3274
- Tending Towards Stability: Convergence Challenges in Small Language Models. Richard Diehl Martinez, Pietro Lesci, Paula Buttery. 3275-3286
- Be a Multitude to Itself: A Prompt Evolution Framework for Red Teaming. Rui Li, Peiyi Wang, Jingyuan Ma, Di Zhang, Lei Sha, Zhifang Sui. 3287-3301
- Modeling News Interactions and Influence for Financial Market Prediction. Mengyu Wang, Shay Cohen, Tiejun Ma. 3302-3314
- Multi-Stage Balanced Distillation: Addressing Long-Tail Challenges in Sequence-Level Knowledge Distillation. Yuhang Zhou, Jing Zhu 0005, Paiheng Xu, Xiaoyu Liu 0003, Xiyao Wang, Danai Koutra, Wei Ai 0002, Furong Huang. 3315-3333
- Are Large Vision Language Models up to the Challenge of Chart Comprehension and Reasoning. Mohammed Saidul Islam, Raian Rahman, Ahmed Masry, Md. Tahmid Rahman Laskar, Mir Tafseer Nayeem, Enamul Hoque. 3334-3368
- HoneyComb: A Flexible LLM-Based Agent System for Materials Science. Huan Zhang, Yu Song, Ziyu Hou, Santiago Miret, Bang Liu. 3369-3382
- Revealing COVID-19's Social Dynamics: Diachronic Semantic Analysis of Vaccine and Symptom Discourse on Twitter. Zeqiang Wang, Jiageng Wu, Yuqi Wang, Wei Xjtlu, Jie Yang, Nishanth Sastry, Jon Johnson, Suparna De. 3383-3394
- Divide and Conquer: Legal Concept-guided Criminal Court View Generation. Qi Xu, Xiao Wei 0002, Hang Yu 0006, Qian Liu 0012, Hao Fei 0001. 3395-3410
- Data Diversity Matters for Robust Instruction Tuning. Alexander Bukharin, Shiyang Li, Zhengyang Wang, Jingfeng Yang 0001, Bing Yin, Xian Li, Chao Zhang 0014, Tuo Zhao, Haoming Jiang. 3411-3425
- GE2PE: Persian End-to-End Grapheme-to-Phoneme ConversionElnaz Rahmati, Hossein Sameti. 3426-3436 [doi]
- Characterizing LLM Abstention Behavior in Science QA with Context PerturbationsBingbing Wen, Bill Howe, Lucy Lu Wang. 3437-3450 [doi]
- Plausibly Problematic Questions in Multiple-Choice Benchmarks for Commonsense ReasoningShramay Palta, Nishant Balepur, Peter Rankel, Sarah Wiegreffe, Marine Carpuat, Rachel Rudinger. 3451-3473 [doi]
- Cost-Efficient Subjective Task Annotation and Modeling through Few-Shot Annotator AdaptationPreni Golazizian, Alireza Salkhordeh Ziabari, Ali Omrani, Morteza Dehghani. 3474-3491 [doi]
- EDEN: Empathetic Dialogues for English LearningSiyan Li, Teresa Shao, Zhou Yu, Julia Hirschberg. 3492-3511 [doi]
- Language Models Still Struggle to Zero-shot Reason about Time SeriesMike A. Merrill, Mingtian Tan, Vinayak Gupta, Thomas Hartvigsen, Tim Althoff. 3512-3533 [doi]
- Enhancing Agent Learning through World Dynamics ModelingZhiyuan Sun, Haochen Shi, Marc-Alexandre Côté, Glen Berseth, Xingdi Yuan, Bang Liu. 3534-3568 [doi]
- NormTab: Improving Symbolic Reasoning in LLMs Through Tabular Data NormalizationMd Mahadi Hasan Nahid, Davood Rafiei. 3569-3585 [doi]
- Zero-Resource Hallucination Prevention for Large Language ModelsJunyu Luo 0001, Cao Xiao, Fenglong Ma. 3586-3602 [doi]
- Measuring and Improving Attentiveness to Partial Inputs with CounterfactualsYanai Elazar, Bhargavi Paranjape, Hao Peng 0009, Sarah Wiegreffe, Khyathi Raghavi Chandu, Vivek Srikumar, Sameer Singh 0001, Noah A. Smith. 3603-3623 [doi]
- LaRS: Latent Reasoning Skills for Chain-of-Thought ReasoningZifan Xu, Haozhu Wang, Dmitriy Bespalov, Xian Wu, Peter Stone, Yanjun Qi. 3624-3643 [doi]
- TROPE: TRaining-Free Object-Part Enhancement for Seamlessly Improving Fine-Grained Zero-Shot Image CaptioningJoshua Feinglass, Yezhou Yang. 3644-3655 [doi]
- The Craft of Selective Prediction: Towards Reliable Case Outcome Classification - An Empirical Study on European Court of Human Rights CasesT. Y. S. S. Santosh, Irtiza Chowdhury, Shanshan Xu, Matthias Grabmair. 3656-3674 [doi]
- InfuserKI: Enhancing Large Language Models with Knowledge Graphs via Infuser-Guided Knowledge IntegrationFali Wang, Runxue Bao, Suhang Wang, Wenchao Yu, Yanchi Liu, Wei Cheng 0002, Haifeng Chen. 3675-3688 [doi]
- SummaCoz: A Dataset for Improving the Interpretability of Factual Consistency Detection for SummarizationGe Luo 0002, Weisi Fan, Miaoran Li, Guoruizhe Sun, Runlong Zhang, Chenyu Xu, Forrest Sheng Bao. 3689-3702 [doi]
- Precision or Recall? An Analysis of Image Captions for Training Text-to-Image Generation ModelSheng Cheng, Maitreya Patel, Yezhou Yang. 3703-3709 [doi]
- Deciphering the Factors Influencing the Efficacy of Chain-of-Thought: Probability, Memorization, and Noisy ReasoningAkshara Prabhakar, Thomas L. Griffiths 0001, R. Thomas McCoy. 3710-3724 [doi]
- Self-contradictory reasoning evaluation and detectionZiyi Liu, Soumya Sanyal 0001, Isabelle Lee, Yongkang Du, Rahul Gupta, Yang Liu, Jieyu Zhao. 3725-3742 [doi]
- Incorporating Precedents for Legal Judgement Prediction on European Court of Human Rights CasesT. Y. S. S. Santosh, Mohamed Hesham Elganayni, Stanislaw Sójka, Matthias Grabmair. 3743-3750 [doi]
- Molecular Facts: Desiderata for Decontextualization in LLM Fact VerificationAnisha Gunjal, Greg Durrett. 3751-3768 [doi]
- MoleculeQA: A Dataset to Evaluate Factual Accuracy in Molecular ComprehensionXingyu Lu, He Cao, Zijing Liu, Shengyuan Bai, Leqing Chen, Yuan Yao 0013, Hai-Tao Zheng 0002, Yu Li. 3769-3789 [doi]
- Sanitizing Large Language Models in Bug Detection with Data-FlowChengpeng Wang, Wuqi Zhang, Zian Su, Xiangzhe Xu, Xiangyu Zhang 0001. 3790-3805 [doi]
- Scaling Behavior for Large Language Models regarding Numeral Systems: An Example using PythiaZhejian Zhou, Jiayu Wang, Dahua Lin, Kai Chen 0026. 3806-3820 [doi]
- When and Where Did it Happen? An Encoder-Decoder Model to Identify Scenario ContextEnrique Noriega-Atala, Robert Vacareanu, Salena Ashton, Adarsh Pyarelal, Clayton T. Morrison, Mihai Surdeanu. 3821-3829 [doi]
- Enhancing Incremental Summarization with Structured RepresentationsEunJeong Hwang, Yichao Zhou 0001, James B. Wendt, Beliz Gunel, Nguyen Vo, Jing Xie 0002, Sandeep Tata. 3830-3842 [doi]
- Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language ModelsSongtao Jiang, Tuo Zheng, Yan Zhang 0004, Yeying Jin, Li Yuan, Zuozhu Liu. 3843-3860 [doi]
- Multiple Knowledge-Enhanced Interactive Graph Network for Multimodal Conversational Emotion RecognitionGeng Tu, Jun Wang, Zhenyu Li, Shiwei Chen, Bin Liang, Xi Zeng, Min Yang, Ruifeng Xu. 3861-3874 [doi]
- AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented GenerationJia Fu, Xiaoting Qin, Fangkai Yang, Lu Wang 0008, Jue Zhang, Qingwei Lin, Yubo Chen 0001, Dongmei Zhang 0001, Saravan Rajmohan, Qi Zhang. 3875-3891 [doi]
- Unleashing the Potential of Large Language Models through Spectral ModulationPeng Sun, Yao Zhu, Yunjian Zhang, Xiu Yan, Zizhe Wang, Xiangyang Ji. 3892-3911 [doi]
- LinguAlchemy: Fusing Typological and Geographical Elements for Unseen Language GeneralizationMuhammad Farid Adilazuarda, Samuel Cahyawijaya, Genta Indra Winata, Ayu Purwarianti, Alham Fikri Aji. 3912-3928 [doi]
- QUEST: Efficient Extreme Multi-Label Text Classification with Large Language Models on Commodity HardwareChuang Zhou 0002, Junnan Dong, Xiao Huang 0001, Zirui Liu 0001, Kaixiong Zhou, Zhaozhuo Xu. 3929-3940 [doi]
- UniSumEval: Towards Unified, Fine-grained, Multi-dimensional Summarization Evaluation for LLMsYuho Lee, Taewon Yun, Jason Cai, Hang Su, Hwanjun Song. 3941-3960 [doi]
- Enhancing Arguments Recognition for Financial Mathematical Reasoning over Hybrid DataJinsu Lim, Yechan Hwang, Young-Jun Lee, Ho-Jin Choi. 3961-3973 [doi]
- Bi-DCSpell: A Bi-directional Detector-Corrector Interactive Framework for Chinese Spelling CheckHaiming Wu, Hanqing Zhang, Richeng Xuan, Dawei Song 0001. 3974-3984 [doi]
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language ModelsZexuan Qiu, Jingjing Li 0007, Shijue Huang, Xiaoqi Jiao, Wanjun Zhong, Irwin King. 3985-4004 [doi]
- Guided Profile Generation Improves Personalization with Large Language ModelsJiarui Zhang. 4005-4016 [doi]
- mABC: Multi-Agent Blockchain-inspired Collaboration for Root Cause Analysis in Micro-Services ArchitectureWei Zhang, Hongcheng Guo, Jian Yang, Zhoujin Tian, Yi Zhang, Chaoran Yan, Zhoujun Li, Tongliang Li, Xu Shi, Liangfan Zheng, Bo Zhang. 4017-4033 [doi]
- Taking a Deep Breath: Enhancing Language Modeling of Large Language Models with Sentinel TokensWeiyao Luo, Suncong Zheng, Heming Xia, Weikang Wang, Yan Lei, Tianyu Liu, Shuang Chen, Zhifang Sui. 4034-4040 [doi]
- Reward Modeling Requires Automatic Adjustment Based on Data QualityBinghai Wang, Rui Zheng, Lu Chen, Zhiheng Xi, Wei Shen, Yuhao Zhou, Dong Yan, Tao Gui, Qi Zhang 0001, Xuanjing Huang 0001. 4041-4064 [doi]
- LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context InferenceZhongwei Wan, Ziang Wu, Che Liu, Jinfa Huang, Zhihong Zhu, Peng Jin 0001, Longyue Wang, Li Yuan 0007. 4065-4078 [doi]
- The Fall of ROME: Understanding the Collapse of LLMs in Model EditingWanli Yang, Fei Sun 0001, Jiajun Tan, Xinyu Ma, Du Su, Dawei Yin, Huawei Shen. 4079-4087 [doi]
- OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMsJintian Zhang, Cheng Peng, Mengshu Sun, Xiang Chen 0016, Lei Liang, Zhiqiang Zhang 0012, Jun Zhou 0011, Huajun Chen, Ningyu Zhang 0001. 4088-4119 [doi]
- Self-Evolution Fine-Tuning for Policy OptimizationRuiJun Chen, Jiehao Liang, Shiping Gao, Fanqi Wan, Xiaojun Quan. 4120-4137 [doi]
- Deeper Insights Without Updates: The Power of In-Context Learning Over Fine-TuningQingyu Yin, Xuzheng He, Chak Tou Leong, Fan Wang, Yanzhao Yan, Xiaoyu Shen, Qiang Zhang 0026. 4138-4151 [doi]
- Adaptive Feature-based Low-Rank Compression of Large Language Models via Bayesian OptimizationYixin Ji, Yang Xiang, Juntao Li, Qingrong Xia, Zi Ye, Xinyu Duan, Zhefeng Wang 0001, Kehai Chen, Min Zhang 0005. 4152-4168 [doi]
- Emosical: An Emotion-Annotated Musical Theatre DatasetHayoon Kim, Ahyeon Choi, Sungho Lee, Hyun Jung, Kyogu Lee. 4169-4180 [doi]
- Inference-Time Language Model Alignment via Integrated Value GuidanceZhixuan Liu, Zhanhui Zhou, Yuanfu Wang, Chao Yang, Yu Qiao. 4181-4195 [doi]
- TongGu: Mastering Classical Chinese Understanding with Knowledge-Grounded Large Language ModelsJiahuan Cao, Dezhi Peng, Peirong Zhang, Yongxin Shi, Yang Liu, Kai Ding 0009, Lianwen Jin. 4196-4210 [doi]
- NegotiationToM: A Benchmark for Stress-testing Machine Theory of Mind on Negotiation SurroundingChunkit Chan, Cheng Jiayang, Yauwai Yim, Zheye Deng, Wei Fan, Haoran Li 0003, Xin Liu 0039, Hongming Zhang 0009, Weiqi Wang 0001, Yangqiu Song. 4211-4241 [doi]
- A Robust Dual-debiasing VQA Model based on Counterfactual Causal EffectLingyun Song, Chengkun Yang, Xuanyu Li, Xuequn Shang. 4242-4252 [doi]
- PyramidCodec: Hierarchical Codec for Long-form Music Generation in Audio DomainJianyi Chen, Zheqi Dai, Zhen Ye, Xu Tan 0003, Qifeng Liu, Yike Guo, Wei Xue. 4253-4263 [doi]
- Beyond Persuasion: Towards Conversational Recommender System with Credible ExplanationsPeixin Qin, Chen Huang, Yang Deng 0002, Wenqiang Lei, Tat-Seng Chua. 4264-4282 [doi]
- Revisiting Query Variation Robustness of Transformer ModelsTim Hagen, Harrisen Scells, Martin Potthast. 4283-4296 [doi]
- Revisiting Catastrophic Forgetting in Large Language Model TuningHongYu Li, Liang Ding 0006, Meng Fang, Dacheng Tao. 4297-4308 [doi]
- M5 - A Diverse Benchmark to Assess the Performance of Large Multimodal Models Across Multilingual and Multicultural Vision-Language TasksFlorian Schneider, Sunayana Sitaram. 4309-4345 [doi]
- Divine LLaMAs: Bias, Stereotypes, Stigmatization, and Emotion Representation of Religion in Large Language ModelsFlor Miriam Plaza del Arco, Amanda Cercas Curry, Susanna Paoli, Alba Cercas Curry, Dirk Hovy. 4346-4366 [doi]
- Boosting Large Language Models with Continual Learning for Aspect-based Sentiment AnalysisXuanwen Ding, Jie Zhou 0015, Liang Dou, Qin Chen, Yuanbin Wu, Arlene Chen, Liang He 0001. 4367-4377 [doi]
- ProTrix: Building Models for Planning and Reasoning over Tables with Sentence ContextZirui Wu, Yansong Feng 0002. 4378-4406 [doi]
- Recent Advances in Online Hate Speech Moderation: Multimodality and the Role of Large ModelsMing Shan Hee, Shivam Sharma, Rui Cao 0002, Palash Nandi, Preslav Nakov, Tanmoy Chakraborty 0002, Roy Ka-Wei Lee. 4407-4419 [doi]
- Quantifying Generative Media Bias with a Corpus of Real-world and Generated News ArticlesFilip Trhlík, Pontus Stenetorp. 4420-4445 [doi]
- OEE-CFC: A Dataset for Open Event Extraction from Chinese Financial CommentaryQizhi Wan, Changxuan Wan, Rong Hu, Dexi Liu, Xu Wenwu, Kang Xu, Zou Meihua, Liu Tao, Jie Yang, Zhenwei Xiong. 4446-4459 [doi]
- Graph-tree Fusion Model with Bidirectional Information Propagation for Long Document ClassificationSudipta Singha Roy, Xindi Wang, Robert E. Mercer, Frank Rudzicz. 4460-4470 [doi]
- BookWorm: A Dataset for Character Description and AnalysisArgyrios Papoudakis, Mirella Lapata, Frank Keller. 4471-4500 [doi]
- Leveraging Grammar Induction for Language Understanding and GenerationJushi Kai, Shengyuan Hou, Yusheng Huang, Zhouhan Lin. 4501-4513 [doi]
- SH2: Self-Highlighted Hesitation Helps You Decode More TruthfullyJushi Kai, Tianhang Zhang, Hai Hu, Zhouhan Lin. 4514-4530 [doi]
- RoQLlama: A Lightweight Romanian Adapted Language ModelGeorge-Andrei Dima, Andrei-Marius Avram, Cristian-George Craciun, Dumitru-Clementin Cercel. 4531-4541 [doi]
- Reference-free Hallucination Detection for Large Vision-Language ModelsQing Li 0038, Jiahui Geng, Chenyang Lyu, Derui Zhu, Maxim Panov, Fakhri Karray. 4542-4551 [doi]
- WavLLM: Towards Robust and Adaptive Speech Large Language ModelShujie Hu, Long Zhou, Shujie Liu 0001, Sanyuan Chen, Lingwei Meng, Hongkun Hao, Jing Pan, Xunying Liu, Jinyu Li 0001, Sunit Sivasankaran, Linquan Liu, Furu Wei. 4552-4572 [doi]
- Learning from Implicit User Feedback, Emotions and Demographic Information in Task-Oriented and Document-Grounded DialoguesDominic Petrak, Thy Thy Tran, Iryna Gurevych. 4573-4603 [doi]
- Improving Argument Effectiveness Across Ideologies using Instruction-tuned Large Language ModelsRoxanne El Baff, Khalid Al Khatib, Milad Alshomary, Kai Konen, Benno Stein 0001, Henning Wachsmuth. 4604-4622 [doi]
- KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable ApproachesJiayi Yuan, Hongyi Liu, Shaochen Zhong, Yu-Neng Chuang, Songchen Li, Guanchu Wang, Duy Le, Hongye Jin, Vipin Chaudhary, Zhaozhuo Xu, Zirui Liu 0001, Xia Ben Hu. 4623-4648 [doi]
- An Evaluation Mechanism of LLM-based Agents on Manipulating APIsBing Liu, Jianxiang Zhou, Dan Meng, Haonan Lu. 4649-4662 [doi]
- Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language ModelsWenhao Shi, Zhiqiang Hu, Yi Bin, Junhua Liu, Yang Yang 0002, See-Kiong Ng, Lidong Bing, Roy Ka-Wei Lee. 4663-4680 [doi]
- Navigating the Nuances: A Fine-grained Evaluation of Vision-Language NavigationZehao Wang, Minye Wu, Yixin Cao 0002, Yubo Ma, Meiqi Chen 0001, Tinne Tuytelaars. 4681-4704 [doi]
- Re-Invoke: Tool Invocation Rewriting for Zero-Shot Tool RetrievalYanfei Chen, Jinsung Yoon, Devendra Singh Sachan, Qingze Wang, Vincent Cohen-Addad, MohammadHossein Bateni, Chen-Yu Lee, Tomas Pfister. 4705-4726 [doi]
- Rethinking Evaluation Methods for Machine UnlearningLeon Wichert, Sandipan Sikdar. 4727-4739 [doi]
- Evaluating Moral Beliefs across LLMs through a Pluralistic FrameworkXuelin Liu, Yanfei Zhu, Shucheng Zhu, Pengyuan Liu, Ying Liu, Dong Yu. 4740-4760 [doi]
- Knowledge Editing in Language Models via Adapted Direct Preference OptimizationAmit Rozner, Barak Battash, Lior Wolf, Ofir Lindenbaum. 4761-4774 [doi]
- Disentangling Questions from Query Generation for Task-Adaptive RetrievalYoonsang Lee 0004, Minsoo Kim, Seung-won Hwang. 4775-4785 [doi]
- Reap the Wild Wind: Detecting Media Storms in Large-Scale News CorporaDror K. Markus, Effi Levi, Tamir Sheafer, Shaul R. Shenhav. 4786-4797 [doi]
- A Survey on Natural Language Counterfactual GenerationYongJie Wang, Xiaoqi Qiu, Yu Yue, Xu Guo 0002, Zhiwei Zeng, Yuhong Feng, Zhiqi Shen 0001. 4798-4818 [doi]
- Geneverse: A Collection of Open-source Multimodal Large Language Models for Genomic and Proteomic ResearchTianyu Liu, Yijia Xiao, Xiao Luo 0001, Hua Xu 0001, Wenjin Jim Zheng, Hongyu Zhao. 4819-4836 [doi]
- QRMeM: Unleash the Length Limitation through Question then Reflection Memory MechanismBo Wang, Heyan Huang, Yixin Cao 0002, Jiahao Ying, Wei Tang 0015, Chong Feng. 4837-4851 [doi]
- LONG²RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point RecallZehan Qi, Rongwu Xu, Zhijiang Guo, Cunxiang Wang, Hao Zhang, Wei Xu. 4852-4872 [doi]
- IndoCL: Benchmarking Indonesian Language Development AssessmentNankai Lin, Hongyan Wu, Weixiong Zheng, Xingming Liao, Shengyi Jiang, Aimin Yang, Lixian Xiao. 4873-4885 [doi]
- Context-Driven Index Trimming: A Data Quality Perspective to Enhancing Precision of RALMsKexin Ma, Ruochun Jin, Haotian Wang 0001, Xi Wang, Huan Chen, Yuhua Tang, Qian Wang. 4886-4901 [doi]
- Counter Turing Test (CT²): Investigating AI-Generated Text Detection for Hindi - Ranking LLMs based on Hindi AI Detectability Index (ADI_hi)Ishan Kavathekar, Anku Rani, Ashmit Chamoli, Ponnurangam Kumaraguru, Amit Sheth 0001, Amitava Das. 4902-4926 [doi]
- Generating Media Background Checks for Automated Source Critical ReasoningMichael Schlichtkrull. 4927-4947 [doi]
- In Defense of Structural Sparse Adapters for Concurrent LLM ServingJunda Su, Zirui Liu, Zeju Qiu, Weiyang Liu, Zhaozhuo Xu. 4948-4953 [doi]
- CONSTRUCTURE: Benchmarking CONcept STRUCTUre REasoning for Multimodal Large Language ModelsZhiwei Zha, Xiangru Zhu, Yuanyi Xu, Chenghua Huang, JingPing Liu, Zhixu Li, Xuwu Wang, Yanghua Xiao, Bei Yang, Xiaoxiao Xu. 4954-4968 [doi]
- Stanceformer: Target-Aware Transformer for Stance DetectionKrishna Garg, Cornelia Caragea. 4969-4984 [doi]
- Learning Autonomous Driving Tasks via Human Feedbacks with Large Language ModelsYunsheng Ma, Xu Cao, Wenqian Ye, Can Cui, Kai Mei, Ziran Wang. 4985-4995 [doi]
- CultureBank: An Online Community-Driven Knowledge Base Towards Culturally Aware Language TechnologiesWeiyan Shi, Ryan Li, Yutong Zhang, Caleb Ziems, Sunny Yu, Raya Horesh, Rogério de Paula, Diyi Yang. 4996-5025 [doi]
- TOOLVERIFIER: Generalization to New Tools via Self-VerificationDheeraj Mekala, Jason Weston, Jack Lanchantin, Roberta Raileanu, Maria Lomeli, Jingbo Shang, Jane Dwivedi-Yu. 5026-5041 [doi]
- FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language ModelsLiqiang Jing, Ruosen Li, Yunmo Chen, Xinya Du. 5042-5063 [doi]
- Learning to Ask Informative Questions: Enhancing LLMs with Preference Optimization and Expected Information GainDavide Mazzaccara, Alberto Testoni, Raffaella Bernardi. 5064-5074 [doi]
- Adversarial Math Word Problem GenerationRoy Xie, Chengxuan Huang, Junlin Wang, Bhuwan Dhingra. 5075-5093 [doi]
- Defending Large Language Models Against Jailbreak Attacks via Layer-specific EditingWei Zhao, Zhe Li, Yige Li, Ye Zhang, Jun Sun 0001. 5094-5109 [doi]
- Promoting Constructive Deliberation: Reframing for ReceptivenessGauri Kambhatla, Matthew Lease, Ashwin Rajadesingan. 5110-5132 [doi]
- A Simple but Effective Approach to Improve Structured Language Model Output for Information ExtractionYinghao Li, Rampi Ramprasad, Chao Zhang 0014. 5133-5148 [doi]
- Rater Cohesion and Quality from a Vicarious PerspectiveDeepak Pandita, Tharindu Cyril Weerasooriya, Sujan Dutta, Sarah Luger, Tharindu Ranasinghe, Ashiqur R. KhudaBukhsh, Marcos Zampieri, Christopher Homan. 5149-5162 [doi]
- Shall We Team Up: Exploring Spontaneous Cooperation of Competing LLM AgentsZengqing Wu, Run Peng, Shuyuan Zheng, Qianying Liu, Xu Han, Brian Inhyuk Kwon, Makoto Onizuka, Shaojie Tang 0001, Chuan Xiao 0001. 5163-5186 [doi]
- Normalized Narrow Jump To Conclusions: Normalized Narrow Shortcuts for Parameter Efficient Early Exit Transformer PredictionAmrit Diggavi Seshadri. 5187-5192 [doi]
- From Test-Taking to Test-Making: Examining LLM Authoring of Commonsense Assessment ItemsMelissa Roemmele, Andrew Gordon. 5193-5203 [doi]
- "I Never Said That": A dataset, taxonomy and baselines on response clarity classificationKonstantinos Thomas, Giorgos Filandrianos, Maria Lymperaiou, Chrysoula Zerva, Giorgos Stamou. 5204-5233 [doi]
- Immunization against harmful fine-tuning attacksDomenic Rosati, Jan Wehner, Kai Williams, Lukasz Bartoszcze, Hassan Sajjad 0001, Frank Rudzicz. 5234-5247 [doi]
- UniMEEC: Towards Unified Multimodal Emotion Recognition and Emotion CauseGuimin Hu, Zhihong Zhu, Daniel Hershcovich, Lijie Hu, Hasti Seifi, Jiayuan Xie. 5248-5261 [doi]
- CodeFort: Robust Training for Code Generation ModelsYuhao Zhang, Shiqi Wang 0002, Haifeng Qian, Zijian Wang 0002, Mingyue Shang, Linbo Liu, Sanjay Krishna Gouda, Baishakhi Ray, Murali Krishna Ramanathan, Xiaofei Ma 0001, Anoop Deoras. 5262-5277 [doi]
- MP-RNA: Unleashing Multi-species RNA Foundation Model via Calibrated Secondary Structure PredictionHeng Yang, Ke Li. 5278-5296 [doi]
- "Any Other Thoughts, Hedgehog?" Linking Deliberation Chains in Collaborative DialoguesAbhijnan Nath, Videep Venkatesha, Mariah Bradford, Avyakta Chelle, Austin Youngren, Carlos Mabrey, Nathaniel Blanchard, Nikhil Krishnaswamy. 5297-5314 [doi]
- Evaluation of Question Answer Generation for Portuguese: Insights and DatasetsFelipe Paula, Cassiana Michelin, Viviane P. Moreira. 5315-5327 [doi]
- Evolutionary Contrastive Distillation for Language Model AlignmentJulian Katz-Samuels, Zheng Li, Hyokun Yun, Priyanka Nigam, Yi Xu, Vaclav Petricek, Bing Yin, Trishul Chilimbi. 5328-5345 [doi]
- A Fairness-Driven Method for Learning Human-Compatible Negotiation StrategiesRyan Shea, Zhou Yu 0005. 5346-5370 [doi]
- Using RL to Identify Divisive Perspectives Improves LLMs Abilities to Identify Communities on Social MediaNikhil Mehta 0003, Dan Goldwasser. 5371-5390 [doi]
- Are LLMs Effective Negotiators? Systematic Evaluation of the Multifaceted Capabilities of LLMs in Negotiation DialoguesDeuksin Kwon, Emily Weiss, Tara Kulshrestha, Kushal Chawla, Gale M. Lucas, Jonathan Gratch. 5391-5413 [doi]
- When Raw Data Prevails: Are Large Language Model Embeddings Effective in Numerical Data Representation for Medical Machine Learning Applications?Yanjun Gao, Skatje Myers, Shan Chen, Dmitriy Dligach, Timothy A. Miller, Danielle S. Bitterman, Matthew M. Churpek, Majid Afshar. 5414-5428 [doi]
- Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long ContextsAditya Sharma, Michael Saxon, William Yang Wang. 5429-5451 [doi]
- Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question ScoringJiazheng Li 0002, Hainiu Xu, Zhaoyue Sun, Yuxiang Zhou, David West, Cesare Aloisi, Yulan He 0001. 5452-5479 [doi]
- LOCR: Location-Guided Transformer for Optical Character RecognitionYu Sun, Dongzhan Zhou, Chen Lin 0003, Conghui He, Wanli Ouyang, Han-Sen Zhong. 5480-5497 [doi]
- Sing it, Narrate it: Quality Musical Lyrics TranslationZhuorui Ye, Jinhan Li, Rongwu Xu. 5498-5520 [doi]
- Exploring Automated Keyword Mnemonics Generation with Large Language Models via Overgenerate-and-RankJaewook Lee 0006, Hunter McNichols, Andrew S. Lan. 5521-5542 [doi]
- Dual-teacher Knowledge Distillation for Low-frequency Word TranslationYifan Guo, Hongying Zan, Hongfei Xu. 5543-5552 [doi]
- A Simple Angle-based Approach for Contrastive Learning of Unsupervised Sentence RepresentationYoo Hyun Jeong, Myeong Soo Han, Dong-Kyu Chae. 5553-5572 [doi]
- Developing a Pragmatic Benchmark for Assessing Korean Legal Language Understanding in Large Language ModelsKimyeeun Kimyeeun, Choi Youngrok, Eunkyung Choi, Jinhwan Choi, Hai Jin Park, Wonseok Hwang. 5573-5595 [doi]
- Visual Pivoting Unsupervised Multimodal Machine Translation in Low-Resource Distant Language PairsTurghun Tayir, Lin Li 0001, Xiaohui Tao 0001, Mieradilijiang Maimaiti, Ming Li, Jianquan Liu. 5596-5607 [doi]
- Scalable Fine-tuning from Multiple Data Sources: A First-Order Approximation ApproachDongyue Li, Ziniu Zhang, Lu Wang, Hongyang Zhang. 5608-5623 [doi]
- In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language ModelsPengrui Han, Peiyang Song, Haofei Yu, Jiaxuan You. 5624-5643 [doi]
- MathFish: Evaluating Language Model Math Reasoning via Grounding in Educational CurriculaLi Lucy, Tal August, Rose E. Wang, Luca Soldaini, Courtney Allison, Kyle Lo. 5644-5673 [doi]
- Enhancing Multi-Label Text Classification under Label-Dependent Noise: A Label-Specific Denoising FrameworkPengyu Xu, Liping Jing, Jian Yu 0001. 5674-5688 [doi]
- Automatic Reconstruction of Ancient Chinese PronunciationsZhige Huang, Haoan Jin, Mengyue Wu, Kenny Q. Zhu. 5689-5698 [doi]
- Instance-Level Dynamic LoRAs Composition for Cross-Task GeneralizationZhiqi Wang, Shizhu He, Kang Liu, Jun Zhao. 5699-5708 [doi]
- LongWanjuan: Towards Systematic Measurement for Long Text QualityXiaoran Liu, Kai Lv, Qipeng Guo, Hang Yan 0001, Conghui He, Xipeng Qiu, Dahua Lin. 5709-5725 [doi]
- Large Language Model for Multi-Domain Translation: Benchmarking and Domain CoT Fine-tuningTianxiang Hu, Pei Zhang 0011, Baosong Yang, Jun Xie, Derek F. Wong, Rui Wang 0015. 5726-5746 [doi]
- TriageAgent: Towards Better Multi-Agents Collaborations for Large Language Model-Based Clinical TriageMeng Lu, Brandon Ho, Dennis Ren, Xuan Wang. 5747-5764 [doi]
- Generative Deduplication For Social Media Data SelectionXianming Li, Jing Li. 5765-5776 [doi]
- Gender Bias in Decision-Making with Large Language Models: A Study of Relationship ConflictsSharon Levy, William D. Adler, Tahilin Sanchez Karver, Mark Dredze, Michelle R. Kaufman. 5777-5800 [doi]
- Evaluating Biases in Context-Dependent Sexual and Reproductive Health QuestionsSharon Levy, Tahilin Sanchez Karver, William D. Adler, Michelle R. Kaufman, Mark Dredze. 5801-5812 [doi]
- Self-Evaluation of Large Language Model based on Glass-box FeaturesHui Huang, Yingqi Qu, Jing Liu, Muyun Yang, Bing Xu, Tiejun Zhao, Wenpeng Lu. 5813-5820 [doi]
- FASTTRACK: Reliable Fact Tracing via Clustering and LLM-Powered Evidence ValidationSi Chen, Feiyang Kang, Ning Yu, Ruoxi Jia 0001. 5821-5836 [doi]
- PKAD: Pretrained Knowledge is All You Need to Detect and Mitigate Textual Backdoor AttacksYu Chen, Qi Cao, Kaike Zhang, Xuchao Liu, Huawei Shen. 5837-5849 [doi]
- Merely Judging Metaphor is Not Enough: Research on Reasonable Metaphor DetectionPuli Chen, Cheng Yang, Qingbao Huang. 5850-5860 [doi]
- Can we teach language models to gloss endangered languages?Michael Ginn, Mans Hulden, Alexis Palmer. 5861-5876 [doi]
- On the token distance modeling ability of higher RoPE attention dimensionXiangyu Hong, Che Jiang, Biqing Qi, Fandong Meng, Mo Yu, Bowen Zhou 0002, Jie Zhou 0016. 5877-5888 [doi]
- Enhancing Byzantine-Resistant Aggregations with Client EmbeddingZhiyuan Zhang 0001, Hao Zhou 0012, Fandong Meng, Jie Zhou 0016, Xu Sun 0001. 5889-5896 [doi]
- Exploiting Careful Design of SVM Solution for Aspect-term Sentiment AnalysisHanfeng Liu, Minping Chen, Zhenya Zheng, Zeyi Wen. 5897-5906 [doi]
- Learning to Generate Rules for Realistic Few-Shot Relation Classification: An Encoder-Decoder ApproachMayank Singh, Eduardo Blanco 0002. 5907-5921 [doi]
- Plot Twist: Multimodal Models Don't Comprehend Simple Chart DetailsYasaman Razeghi, Ishita Dasgupta 0001, Fangyu Liu, Vinay Ramasesh, Sameer Singh 0001. 5922-5937 [doi]
- HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language ModelsHuy Nghiem, Hal Daumé III. 5938-5956 [doi]
- Giving Control Back to Models: Enabling Offensive Language Detection Models to Autonomously Identify and Mitigate BiasesJiapeng Liu, Weijie Li, Xiaochao Fan, Wenjun Deng, Liang Yang 0003, Yong Li, Yufeng Diao. 5957-5966 [doi]
- Toolken+: Improving LLM Tool Usage with Reranking and a Reject OptionKonstantin Yakovlev, Sergey I. Nikolenko, Andrey Bout. 5967-5974 [doi]
- SecureSQL: Evaluating Data Leakage of Large Language Models as Natural Language Interfaces to DatabasesYanqi Song, Ruiheng Liu, Shu Chen, Qianhao Ren, Yu Zhang 0030, Yongqi Yu. 5975-5990 [doi]
- Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge InjectionTianxiang Chen, Zhentao Tan, Tao Gong, Yue Wu, Qi Chu 0001, Bin Liu 0016, Jieping Ye, Nenghai Yu. 5991-6002 [doi]
- Entity or Relation Embeddings? An Analysis of Encoding Strategies for Relation ExtractionFrank Mtumbuka, Steven Schockaert. 6003-6022 [doi]
- Self-Consistency Boosts Calibration for Math ReasoningAnte Wang, Linfeng Song, Ye Tian, Baolin Peng, Lifeng Jin, Haitao Mi, Jinsong Su, Dong Yu. 6023-6029 [doi]
- Distilling Instruction-following Abilities of Large Language Models with Task-aware Curriculum PlanningYuanhao Yue, Chengyu Wang, Jun Huang, Peng Wang. 6030-6054 [doi]
- On Creating an English-Thai Code-switched Machine Translation in Medical DomainParinthapat Pengpun, Krittamate Tiankanon, Amrest Chinkamol, Jiramet Kinchagawat, Pitchaya Chairuengjitjaras, Pasit Supholkhan, Pubordee Aussavavirojekul, Chiraphat Boonnag, Kanyakorn Veerakanjana, Hirunkul Phimsiri, Boonthicha Sae-jia, Nattawach Sataudom, Piyalitt Ittichaiwong, Peerat Limkonchotiwat. 6055-6073 [doi]
- CogGPT: Unleashing the Power of Cognitive Dynamics on Large Language ModelsYaojia Lv, Haojie Pan, Zekun Wang, Jiafeng Liang, Yuanxing Liu 0001, Ruiji Fu, Ming Liu 0004, Zhongyuan Wang 0006, Bing Qin 0001. 6074-6091 [doi]
- Can LLMs Recognize Toxicity? A Structured Investigation Framework and Toxicity MetricHyukhun Koh, Dohyung Kim, Minwoo Lee 0003, Kyomin Jung. 6092-6114 [doi]
- Toeing the Party Line: Election Manifestos as a Key to Understand Political Discourse on TwitterMaximilian Maurer, Tanise Ceron, Sebastian Padó, Gabriella Lapesa. 6115-6130 [doi]
- UniTabNet: Bridging Vision and Language Models for Enhanced Table Structure RecognitionZhenrong Zhang, Shuhang Liu, Pengfei Hu 0006, Jiefeng Ma, Jun Du, Jianshu Zhang, Yu Hu 0003. 6131-6143 [doi]
- PolyWER: A Holistic Evaluation Framework for Code-Switched Speech RecognitionKarima Kadaoui, Maryam Al Ali, Hawau Olamide Toyin, Ibrahim Mohammed, Hanan Aldarmaki. 6144-6153 [doi]
- A Deep Analysis of the Impact of Multiword Expressions and Named Entities on Chinese-English Machine TranslationsHuacheng Song, Hongzhi Xu. 6154-6165 [doi]
- SCA: Selective Compression Attention for Efficiently Extending the Context Window of Large Language ModelsHuanran Zheng, Wei Zhu, Xiaoling Wang. 6166-6178 [doi]
- FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and RerankingZhuoer Wang, Leonardo F. R. Ribeiro, Alexandros Papangelis, Rohan Mukherjee, Tzu-Yen Wang, Xinyan Zhao, Arijit Biswas, James Caverlee, Angeliki Metallinou. 6179-6191 [doi]
- Beyond Lines and Circles: Unveiling the Geometric Reasoning Gap in Large Language ModelsSpyridon Mouselinos, Henryk Michalewski, Mateusz Tomasz Malinowski. 6192-6222 [doi]
- AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language ModelsZihao Zeng, Yibo Miao, Hongcheng Gao, Hao Zhang 0108, Zhijie Deng. 6223-6235 [doi]
- Learning from Relevant Subgoals in Successful Dialogs using Iterative Training for Task-oriented Dialog SystemsMagdalena Kaiser, Patrick Ernst, György Szarvas. 6236-6246 [doi]
- CLEAR: Can Language Models Really Understand Causal Graphs?Sirui Chen, Mengying Xu, Kun Wang, Xingyu Zeng, Rui Zhao, Shengjie Zhao, Chaochao Lu. 6247-6265 [doi]
- PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt TuningGyeongman Kim, Doohyuk Jang, Eunho Yang. 6266-6282 [doi]
- M2QA: Multi-domain Multilingual Question AnsweringLeon Engländer, Hannah Sterz, Clifton Poth, Jonas Pfeiffer, Ilia Kuznetsov, Iryna Gurevych. 6283-6305 [doi]
- Unveiling the Invisible: Captioning Videos with MetaphorsAbisek Rajakumar Kalarani, Pushpak Bhattacharyya, Sumit Shekhar. 6306-6320 [doi]
- How Reliable Are Automatic Evaluation Methods for Instruction-Tuned LLMs?Ehsan Doostmohammadi, Oskar Holmström, Marco Kuhlmann. 6321-6336 [doi]
- RippleCOT: Amplifying Ripple Effect of Knowledge Editing in Language Models via Chain-of-Thought In-Context LearningZihao Zhao, Yuchen Yang, Yijiang Li, Yinzhi Cao. 6337-6347 [doi]
- Authorship Obfuscation in Multilingual Machine-Generated Text DetectionDominik Macko, Róbert Móro, Adaku Uchendu, Ivan Srba, Jason Samuel Lucas, Michiharu Yamashita, Nafis Irtiza Tripto, Dongwon Lee 0001, Jakub Simko, Mária Bieliková. 6348-6368 [doi]
- Comparing Edge-based and Node-based Methods on a Citation Prediction TaskPeter Vickers, Kenneth Church 0001. 6369-6388 [doi]
- DAdEE: Unsupervised Domain Adaptation in Early Exit PLMsDivya Jyoti Bajpai, Manjesh K. Hanawal. 6389-6400 [doi]
- LaCo: Large Language Model Pruning via Layer CollapseYifei Yang, Zouying Cao, Hai Zhao 0001. 6401-6417 [doi]
- Llamipa: An Incremental Discourse ParserKate Thompson, Akshay Chaturvedi, Julie Hunter, Nicholas Asher. 6418-6430 [doi]
- Nebula: A discourse aware Minecraft BuilderAkshay Chaturvedi, Kate Thompson, Nicholas Asher. 6431-6443 [doi]
- Improving Referring Ability for Biomedical Language ModelsJunfeng Jiang, Fei Cheng 0002, Akiko Aizawa. 6444-6457 [doi]
- CapEEN: Image Captioning with Early Exits and Knowledge DistillationDivya Jyoti Bajpai, Manjesh Kumar Hanawal. 6458-6472 [doi]
- LumberChunker: Long-Form Narrative Document SegmentationAndré V. Duarte, João Marques, Miguel Graça, Miguel Freire, Lei Li 0005, Arlindo L. Oliveira. 6473-6486 [doi]
- Exploring the Limits of Fine-grained LLM-based Physics Inference via Premise Removal InterventionsJordan Meadows, Tamsin James, André Freitas. 6487-6502 [doi]
- Unlocking Continual Learning Abilities in Language ModelsWenyu Du, Shuang Cheng, Tongxu Luo, Zihan Qiu, Zeyu Huang, Ka-Chun Cheung, Reynold Cheng, Jie Fu 0001. 6503-6522 [doi]
- On the Rigour of Scientific Writing: Criteria, Analysis, and InsightsJoseph James, Chenghao Xiao, Yucheng Li, Chenghua Lin. 6523-6538 [doi]
- MMUTF: Multimodal Multimedia Event Argument Extraction with Unified Template FillingPhilipp Seeberger, Dominik Wagner 0002, Korbinian Riedhammer. 6539-6548 [doi]
- Not All Preference Pairs Are Created Equal: A Recipe for Annotation-Efficient Iterative Preference LearningSen Yang 0005, Leyang Cui, Deng Cai 0002, Xinting Huang, Shuming Shi 0001, Wai Lam. 6549-6561 [doi]
- Cross-lingual Contextualized Phrase RetrievalHuayang Li, Deng Cai 0002, Zhi Qu, Qu Cui, Hidetaka Kamigaito, Lemao Liu, Taro Watanabe. 6562-6576 [doi]
- VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMsRuotong Liao, Max Erler, Huiyu Wang, Guangyao Zhai, Gengyuan Zhang, Yunpu Ma, Volker Tresp. 6577-6602 [doi]
- Self-Constructed Context Decompilation with Fined-grained Alignment EnhancementYunlong Feng, Dechuan Teng, Yang Xu 0049, Honglin Mu, Xiao Xu 0005, Libo Qin 0001, Qingfu Zhu, Wanxiang Che. 6603-6614 [doi]
- Efficiently Computing Susceptibility to Context in Language ModelsTianyu Liu 0004, Kevin Du, Mrinmaya Sachan, Ryan Cotterell. 6615-6626 [doi]
- ESG-Kor: A Korean Dataset for ESG-related Information Extraction and Practical Use CasesJaeyoung Lee, Geonyeong Son, Misuk Kim. 6627-6643 [doi]
- Wrong-of-Thought: An Integrated Reasoning Framework with Multi-Perspective Verification and Wrong InformationYongheng Zhang, Qiguang Chen, Jingxuan Zhou, Peng Wang, Jiasheng Si, Jin Wang, Wenpeng Lu, Libo Qin 0001. 6644-6653 [doi]
- Hope 'The Paragraph Guy' explains the rest : Introducing MeSum, the Meme SummarizerAnas Khan, Tanik Saikh, Arpan Phukan, Asif Ekbal. 6654-6668 [doi]
- Learning Semantic Structure through First-Order-Logic TranslationAkshay Chaturvedi, Nicholas Asher. 6669-6680 [doi]
- A Training Data Recipe to Accelerate A* Search with Language ModelsDevaansh Gupta, Boyang Li. 6681-6695 [doi]
- From Generation to Selection: Findings of Converting Analogical Problem-Solving into Multiple-Choice QuestionsDonghyeon Shin, Seungpil Lee, Klea Kovacec, Sundong Kim. 6696-6708 [doi]
- What's under the hood: Investigating Automatic Metrics on Meeting SummarizationFrederic Kirstein, Jan Philip Wahle, Terry Ruas, Bela Gipp. 6709-6723 [doi]
- Self-Distillation for Model Stacking Unlocks Cross-Lingual NLU in 200+ LanguagesFabian David Schmidt, Philipp Borchert, Ivan Vulic, Goran Glavas. 6724-6743 [doi]
- CERD: A Comprehensive Chinese Rhetoric Dataset for Rhetorical Understanding and Generation in EssaysNuowei Liu, Xinhao Chen, Hongyi Wu, Changzhi Sun, Man Lan, Yuanbin Wu, Xiaopeng Bai, Shaoguang Mao, Yan Xia 0005. 6744-6759 [doi]
- An Empirical Study on Cross-lingual Vocabulary Adaptation for Efficient Language Model InferenceAtsuki Yamaguchi, Aline Villavicencio, Nikolaos Aletras. 6760-6785 [doi]
- AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language ModelsJiale Cheng, Yida Lu, Xiaotao Gu, Pei Ke, Xiao Liu 0036, Yuxiao Dong, Hongning Wang, Jie Tang 0001, Minlie Huang. 6786-6803 [doi]
- BAPO: Base-Anchored Preference Optimization for Overcoming Forgetting in Large Language Models PersonalizationGihun Lee, Minchan Jeong, Yujin Kim, Hojung Jung, Jaehoon Oh, Sangmook Kim, Se-Young Yun. 6804-6820 [doi]
- Beyond Common Words: Enhancing ASR Cross-Lingual Proper Noun Recognition Using Large Language ModelsRishabh Kumar, Sabyasachi Ghosh, Ganesh Ramakrishnan. 6821-6828 [doi]
- Few-shot clinical entity recognition in English, French and Spanish: masked language models outperform generative model promptingMarco Naguib, Xavier Tannier, Aurélie Névéol. 6829-6852 [doi]
- STTATTS: Unified Speech-To-Text And Text-To-Speech ModelHawau Olamide Toyin, Hao Li, Hanan Aldarmaki. 6853-6863 [doi]
- From Text Segmentation to Enhanced Representation Learning: A Novel Approach to Multi-Label Classification for Long TextsWang Zhang, Xin Wang, Qian Wang, Tao Deng, Xiaoru Wu. 6864-6873 [doi]
- Learning from Imperfect Data: Towards Efficient Knowledge Distillation of Autoregressive Language Models for Text-to-SQLQihuang Zhong, Kunfeng Chen, Liang Ding 0006, Juhua Liu, Bo Du 0001, Dacheng Tao. 6874-6885 [doi]
- ConU: Conformal Uncertainty in Large Language Models with Correctness Coverage GuaranteesZhiyuan Wang, Jinhao Duan, Lu Cheng, Yue Zhang, Qingni Wang, Xiaoshuang Shi, Kaidi Xu, Heng Tao Shen, Xiaofeng Zhu 0001. 6886-6898 [doi]
- Irrelevant Alternatives Bias Large Language Model Hiring DecisionsKremena Valkanova, Pencho Yordanov. 6899-6912 [doi]
- PclGPT: A Large Language Model for Patronizing and Condescending Language DetectionHongbo Wang, LiMingDa LiMingDa, Junyu Lu, Hebin Xia, Liang Yang 0003, Bo Xu 0009, Ruizhu Liu, Hongfei Lin. 6913-6928 [doi]
- MultiAgent Collaboration Attack: Investigating Adversarial Attacks in Large Language Model Collaborations via DebateAlfonso Amayuelas, Xianjun Yang, Antonis Antoniades, Wenyue Hua, Liangming Pan, William Yang Wang. 6929-6948 [doi]
- CEAMC: Corpus and Empirical Study of Argument Analysis in Education via LLMsYupei Ren, Hongyi Wu, Zhaoguang Long, Shangqing Zhao, Xinyi Zhou, Zheqin Yin, Xinlin Zhuang, Xiaopeng Bai, Man Lan. 6949-6966 [doi]
- Ada-Instruct: Adapting Instruction Generators for Complex ReasoningWanyun Cui, Qianle Wang. 6967-6984 [doi]
- LINKAGE: Listwise Ranking among Varied-Quality References for Non-Factoid QA Evaluation via LLMsSihui Yang, Keping Bi, Wanqing Cui, Jiafeng Guo, Xueqi Cheng. 6985-7000 [doi]
- Breaking Language Barriers in Multilingual Mathematical Reasoning: Insights and ObservationsNuo Chen 0001, Zinan Zheng, Ning Wu, Ming Gong, Dongmei Zhang 0001, Jia Li 0009. 7001-7016 [doi]
- SynthEval: Hybrid Behavioral Testing of NLP Models with Synthetic EvaluationRaoyuan Zhao, Abdullatif Köksal, Yihong Liu, Leonie Weissweiler, Anna Korhonen, Hinrich Schütze. 7017-7034 [doi]
- TurkishMMLU: Measuring Massive Multitask Language Understanding in TurkishArda Yüksel, Abdullatif Köksal, Lütfi Kerem Senel, Anna Korhonen, Hinrich Schütze. 7035-7055 [doi]
- LongForm: Effective Instruction Tuning with Reverse InstructionsAbdullatif Köksal, Timo Schick, Anna Korhonen, Hinrich Schütze. 7056-7078 [doi]
- Explaining Graph Neural Networks with Large Language Models: A Counterfactual Perspective on Molecule GraphsYinhan He, Zaiyi Zheng, Patrick Soga, Yaochen Zhu, Yushun Dong, Jundong Li. 7079-7096 [doi]
- Knowledge Mechanisms in Large Language Models: A Survey and PerspectiveMengru Wang, Yunzhi Yao, Ziwen Xu, Shuofei Qiao, Shumin Deng, Peng Wang 0104, Xiang Chen 0016, Jia-Chen Gu, Yong Jiang 0001, Pengjun Xie, Fei Huang 0004, Huajun Chen, Ningyu Zhang 0001. 7097-7135 [doi]
- LongHeads: Multi-Head Attention is Secretly a Long Context ProcessorYi Lu, Xin Zhou 0012, Wei He 0024, Jun Zhao 0019, Tao Ji, Tao Gui, Qi Zhang 0001, Xuanjing Huang 0001. 7136-7148 [doi]
- Crisis counselor language and perceived genuine concern in crisis conversationsGreg Buda, Ignacio Tripodi, Margaret Meagher, Elizabeth A. Olson. 7149-7160 [doi]
- Edit-Constrained Decoding for Sentence SimplificationTatsuya Zetsu, Yuki Arase, Tomoyuki Kajiwara. 7161-7173 [doi]
- Modeling Human Subjectivity in LLMs Using Explicit and Implicit Human Factors in PersonasSalvatore Giorgi, Tingting Liu, Ankit Aich, Kelsey Isman, Garrick Sherman, Zachary Fried, João Sedoc, Lyle H. Ungar, Brenda Curtis. 7174-7188 [doi]
- Multi-Loss Fusion: Angular and Contrastive Integration for Machine-Generated Text DetectionIqra Zahid, Yue Chang, Tharindu Madusanka, Youcheng Sun, Riza Batista-Navarro. 7189-7202 [doi]
- Intermediate Layer Distillation with the Reused Teacher Classifier: A Study on the Importance of the Classifier of Attention-based ModelsHang Zhang, Seyyed Hasan Mozafari, James J. Clark, Brett H. Meyer, Warren J. Gross. 7203-7212 [doi]
- Enhancing Large Language Model Based Sequential Recommender Systems with Pseudo Labels ReconstructionHyunsoo Na, Minseok Gang, Youngrok Ko, Jinseok Seol, Sang-goo Lee. 7213-7222 [doi]
- On the Generalization of Training-based ChatGPT Detection MethodsHan Xu 0002, Jie Ren 0019, Pengfei He, Shenglai Zeng, Yingqian Cui, Amy Liu, Hui Liu, Jiliang Tang. 7223-7243 [doi]
- Private prediction for large-scale synthetic text generationKareem Amin 0002, Alex Bie, Weiwei Kong, Alexey Kurakin, Natalia Ponomareva 0001, Umar Syed, Andreas Terzis, Sergei Vassilvitskii. 7244-7262 [doi]
- Generalists vs. Specialists: Evaluating Large Language Models for UrduSamee Arif, Abdul Hameed Azeemi, Agha Ali Raza, Awais Athar. 7263-7280 [doi]
- Improving Multi-Agent Debate with Sparse Communication TopologyYunxuan Li, Yibing Du, Jiageng Zhang, Le Hou, Peter Grabowski, Yeqing Li, Eugene Ie. 7281-7294 [doi]
- Evidence Retrieval for Fact Verification using Multi-stage RerankingShrikant Malviya, Stamos Katsigiannis. 7295-7308 [doi]
- Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process SupervisionZihan Wang 0001, Yunxuan Li, Yuexin Wu, Liangchen Luo, Le Hou, Hongkun Yu 0001, Jingbo Shang. 7309-7319 [doi]
- MUSCLE: A Model Update Strategy for Compatible LLM EvolutionJessica Maria Echterhoff, Fartash Faghri, Raviteja Vemulapalli, Ting-Yao Hu, Chun-Liang Li, Oncel Tuzel, Hadi Pouransari. 7320-7332 [doi]
- Event-Keyed SummarizationWilliam Gantt, Alexander Martin 0006, Pavlo Kuchmiichuk, Aaron Steven White. 7333-7345 [doi]
- The Effect of Sampling Temperature on Problem Solving in Large Language ModelsMatthew Renze. 7346-7356 [doi]
- HiCuLR: Hierarchical Curriculum Learning for Rhetorical Role Labeling of Legal DocumentsT. Y. S. S. Santosh, Apolline Isaia, Shiyu Hong, Matthias Grabmair. 7357-7364 [doi]
- Semi-Supervised Reward Modeling via Iterative Self-TrainingYifei He, Haoxiang Wang 0003, Ziyan Jiang, Alexandros Papangelis, Han Zhao 0002. 7365-7377 [doi]
- Demonstration Selection Strategies for Numerical Time Series Data-to-TextMasayuki Kawarada, Tatsuya Ishigaki, Goran Topic, Hiroya Takamura. 7378-7392 [doi]
- ALIGN-SIM: A Task-Free Test Bed for Evaluating and Interpreting Sentence Embeddings through Semantic Similarity AlignmentYash Mahajan, Naman Bansal, Eduardo Blanco 0002, Santu Karmaker. 7393-7428 [doi]
- BIPEFT: Budget-Guided Iterative Search for Parameter Efficient Fine-Tuning of Large Pretrained Language ModelsAofei Chang, Jiaqi Wang 0002, Han Liu 0008, Parminder Bhatia, Cao Xiao, Ting Wang 0006, Fenglong Ma. 7429-7440 [doi]
- In-Context Learning with Iterative Demonstration SelectionChengwei Qin, Aston Zhang, Chen Chen 0075, Anirudh Dagar, Wenming Ye. 7441-7455 [doi]
- On Evaluating Explanation Utility for Human-AI Decision Making in NLPFateme Hashemi Chaleshtori, Atreya Ghosal, Alexander Gill, Purbid Bambroo, Ana Marasovic. 7456-7504 [doi]
- Unsupervised Hierarchical Topic Modeling via Anchor Word Clustering and Path GuidanceJiyuan Liu, Hegang Chen, Chunjiang Zhu, Yanghui Rao. 7505-7517 [doi]
- GuardEmb: Dynamic Watermark for Safeguarding Large Language Model Embedding Service Against Model Stealing AttackLiaoyaqi Wang, Minhao Cheng. 7518-7534 [doi]
- Difficult Task Yes but Simple Task No: Unveiling the Laziness in Multimodal LLMsSihang Zhao, Youliang Yuan, Xiaoying Tang 0002, Pinjia He. 7535-7548 [doi]
- Pseudo-Label Enhanced Prototypical Contrastive Learning for Uniformed Intent DiscoveryYimin Deng, Yuxia Wu, Guoshuai Zhao, Li Zhu, Xueming Qian. 7549-7562 [doi]
- RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation QuantizationXijie Huang, Zechun Liu, Shih-Yang Liu, Kwang-Ting Cheng. 7563-7576 [doi]
- Can Large Language Models Grasp Legal Theories? Enhance Legal Reasoning with Insights from Multi-Agent CollaborationWeikang Yuan, Junjie Cao, Zhuoren Jiang, Yangyang Kang, Jun Lin, Kaisong Song, Tianqianjin Lin, Pengwei Yan, Changlong Sun, Xiaozhong Liu. 7577-7597 [doi]
- Retrieval and Reasoning on KGs: Integrate Knowledge Graphs into Large Language Models for Complex Question AnsweringYixin Ji, Kaixin Wu, Juntao Li, Wei Chen 0034, Mingjie Zhong, Xu Jia, Min Zhang 0005. 7598-7610 [doi]
- Insights into LLM Long-Context Failures: When Transformers Know but Don't TellMuhan Gao, Taiming Lu, Kuai Yu, Adam Byerly, Daniel Khashabi. 7611-7625 [doi]
- E²CL: Exploration-based Error Correction Learning for Embodied AgentsHanlin Wang, Chak Tou Leong, Jian Wang 0054, Wenjie Li 0002. 7626-7639 [doi]
- BERGEN: A Benchmarking Library for Retrieval-Augmented GenerationDavid Rau, Hervé Déjean, Nadezhda Chirkova, Thibault Formal, Shuai Wang 0004, Stéphane Clinchant, Vassilina Nikoulina. 7640-7663 [doi]
- Contextualized Graph Representations for Generating Counter-Narratives against Hate SpeechSelene Baez Santamaría, Helena Gómez-Adorno, Ilia Markov. 7664-7674 [doi]
- Modeling Historical Relevant and Local Frequency Context for Representation-Based Temporal Knowledge Graph ForecastingShengzhe Zhang, Wei Wei 0002, Rikui Huang, Wenfeng Xie, Dangyang Chen. 7675-7686 [doi]
- Representation Alignment and Adversarial Networks for Cross-lingual Dependency ParsingYing Li, Jianjian Liu, Zhengtao Yu 0001, Shengxiang Gao, Yuxin Huang, Cunli Mao. 7687-7697 [doi]
- An Instruction Tuning-Based Contrastive Learning Framework for Aspect Sentiment Quad Prediction with Implicit Aspects and OpinionsHao Zhang, Yu-N Cheah, Congqing He, Feifan Yi. 7698-7714 [doi]
- MACAROON: Training Vision-Language Models To Be Your Engaged PartnersShujin Wu, Yi Fung 0001, Sha Li, Yixin Wan, Kai-Wei Chang, Heng Ji. 7715-7731 [doi]
- ICL: Iterative Continual Learning for Multi-domain Neural Machine TranslationZhibo Man, Kaiyu Huang, Yujie Zhang, Yuanmeng Chen, Yufeng Chen 0005, Jinan Xu. 7732-7743 [doi]
- Mitigating Hallucinations of Large Language Models in Medical Information Extraction via Contrastive DecodingDerong Xu, Ziheng Zhang, Zhihong Zhu, Zhenxi Lin, Qidong Liu, Xian Wu 0001, Tong Xu 0001, Xiangyu Zhao 0001, Yefeng Zheng 0001, Enhong Chen. 7744-7757 [doi]
- NeuroMax: Enhancing Neural Topic Modeling via Maximizing Mutual Information and Group Topic RegularizationDuy-Tung Pham, Thien Trang Nguyen Vu, Tung Nguyen, Linh Ngo, Duc Anh Nguyen, Thien Huu Nguyen. 7758-7772 [doi]
- LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple ConstraintsThomas Palmeira Ferraz, Kartik Mehta, Yu-Hsiang Lin, Haw-Shiuan Chang, Shereen Oraby, Sijia Liu 0001, Vivek Subramanian, Tagyoung Chung, Mohit Bansal, Nanyun Peng. 7773-7812 [doi]
- Learning to Plan for Retrieval-Augmented Large Language Models from Knowledge GraphsJunjie Wang, Mingyang Chen, Binbin Hu, Dan Yang, Ziqi Liu, Yue Shen, Peng Wei, Zhiqiang Zhang 0012, Jinjie Gu, Jun Zhou 0011, Jeff Z. Pan, Wen Zhang 0015, Huajun Chen. 7813-7835 [doi]
- Is Compound Aspect-Based Sentiment Analysis Addressed by LLMs?Yinhao Bai, Zhixin Han, Yuhua Zhao, Hang Gao 0003, Zhuowei Zhang, Xunzhi Wang, Mengting Hu. 7836-7861 [doi]
- Multilingual Fine-Grained News Headline Hallucination DetectionJiaming Shen, Tianqi Liu 0002, Jialu Liu, Zhen Qin 0001, Jay Pavagadhi, Simon Baumgartner, Michael Bendersky. 7862-7875 [doi]
- PE: A Poincare Explanation Method for Fast Text Hierarchy GenerationQian Chen, Dongyang Li, Xiaofeng He, Hongzhao Li, Hongyu Yi. 7876-7888 [doi]
- Step-level Value Preference Optimization for Mathematical ReasoningGuoxin Chen, Minpeng Liao, Chengxi Li 0014, Kai Fan 0002. 7889-7903 [doi]
- Towards Benchmarking Situational Awareness of Large Language Models: Comprehensive Benchmark, Evaluation and AnalysisGuo Tang, Zheng Chu, Wenxiang Zheng, Ming Liu 0004, Bing Qin 0001. 7904-7928 [doi]
- Balancing Visual Context Understanding in Dialogue for Image RetrievalZhaohui Wei, Lizi Liao, Xiaoyu Du, Xinguang Xiang. 7929-7942 [doi]
- Mechanistic Understanding and Mitigation of Language Model Non-Factual HallucinationsLei Yu, Meng Cao, Jackie C. K. Cheung, Yue Dong 0002. 7943-7956 [doi]
- A Study of Implicit Ranking Unfairness in Large Language ModelsChen Xu 0010, Wenjie Wang 0007, Yuxin Li, Liang Pang, Jun Xu 0001, Tat-Seng Chua. 7957-7970 [doi]
- Information Parity: Measuring and Predicting the Multilingual Capabilities of Language ModelsAlexander Tsvetkov, Alon Kipnis. 7971-7989 [doi]
- Better Call SAUL: Fluent and Consistent Language Model Editing with Generation RegularizationMingyang Wang, Lukas Lange, Heike Adel, Jannik Strötgen, Hinrich Schütze. 7990-8000 [doi]
- A Semantic Search Engine for Mathlib4Guoxiong Gao, Haocheng Ju, Jiedong Jiang, Zihan Qin, Bin Dong 0001. 8001-8013 [doi]
- DyKnow: Dynamically Verifying Time-Sensitive Factual Knowledge in LLMsSeyed Mahed Mousavi, Simone Alghisi, Giuseppe Riccardi. 8014-8029 [doi]
- Rewarding What Matters: Step-by-Step Reinforcement Learning for Task-Oriented DialogueHuifang Du, Shuqin Li, Minghao Wu, Xuejing Feng, Yuan-Fang Li, Haofen Wang. 8030-8046 [doi]
- Assistive Large Language Model Agents for Socially-Aware Negotiation DialoguesYuncheng Hua, Lizhen Qu, Reza Haf. 8047-8074 [doi]
- HoLLMwood: Unleashing the Creativity of Large Language Models in Screenwriting via Role PlayingJing Chen, Xinyu Zhu, Cheng Yang, Chufan Shi, Yadong Xi, Yuxiang Zhang, Junjie Wang, Jiashu Pu, Tian Feng, Yujiu Yang, Rongsheng Zhang. 8075-8121 [doi]
- Advancing Cross-Lingual Entity Alignment with Large Language Models: Tailored Sample Segmentation and Zero-Shot PromptsLinyan Yang, Jingwei Cheng, Fu Zhang 0001. 8122-8138 [doi]
- Causal Discovery Inspired Unsupervised Domain Adaptation for Emotion-Cause Pair ExtractionYuncheng Hua, Yujin Huang, Shuo Huang, Tao Feng 0013, Lizhen Qu, Christopher Bain, Richard Bassed, Reza Haf. 8139-8156 [doi]
- Large Language Models are Students at Various Levels: Zero-shot Question Difficulty EstimationJae-Woo Park, Seong-Jin Park, Hyun-Sik Won, Kang Min Kim. 8157-8177 [doi]
- Inverse-Q*: Token Level Reinforcement Learning for Aligning Large Language Models Without Preference DataHan-xia, Songyang Gao, Qiming Ge, Zhiheng Xi, Qi Zhang 0001, Xuanjing Huang 0001. 8178-8188 [doi]
- Activation Scaling for Steering and Interpreting Language ModelsNiklas Stoehr, Kevin Du, Vésteinn Snæbjarnarson, Robert West, Ryan Cotterell, Aaron Schein. 8189-8200 [doi]
- LaRA: Large Rank Adaptation for Speech and Text Cross-Modal Learning in Large Language ModelsZuhair Hasan Shaik, Pradyoth Hegde, Prashant Bannulmath, Deepak K. T.. 8201-8211 [doi]
- DTS-SQL: Decomposed Text-to-SQL with Small Large Language ModelsMohammadreza Pourreza, Davood Rafiei. 8212-8220 [doi]
- MedINST: Meta Dataset of Biomedical InstructionsWenhan Han, Meng Fang, Zihan Zhang, Yu Yin, Zirui Song, Ling Chen 0006, Mykola Pechenizkiy, Qingyu Chen 0001. 8221-8240 [doi]
- PropTest: Automatic Property Testing for Improved Visual ProgrammingJaywon Koo, Ziyan Yang, Paola Cascante-Bonilla, Baishakhi Ray, Vicente Ordonez. 8241-8256 [doi]
- BadFair: Backdoored Fairness Attacks with Group-conditioned TriggersJiaqi Xue, Qian Lou, Mengxin Zheng. 8257-8270 [doi]
- Is GPT-4V(ision) All You Need for Automating Academic Data Visualization? Exploring Vision-Language Models' Capability in Reproducing Academic ChartsZhehao Zhang, Weicheng Ma, Soroush Vosoughi. 8271-8288 [doi]
- Financial Forecasting from Textual and Tabular Time SeriesRoss Koval, Nicholas Andrews, Xifeng Yan. 8289-8300 [doi]
- Learning to Ask Denotative and Connotative Questions for Knowledge-based VQAXiaoying Xing, Peixi Xiong, Lei Fan, Yunxuan Li, Ying Wu 0001. 8301-8315 [doi]
- CONTOR: Benchmarking Strategies for Completing Ontologies with Plausible Missing RulesNa Li, Thomas Bailleux, Zied Bouraoui, Steven Schockaert. 8316-8334 [doi]
- Towards Pareto-Efficient RLHF: Paying Attention to a Few High-Reward Samples with Reward DropoutChanghun Lee, Chiehyeon Lim. 8335-8349 [d