- Automating Alternative Generation in Decision-MakingYevhen Kostiuk, Clara Seyfried, Chris Reed 0001. 1-15 [doi]
- Bias Analysis and Mitigation through Protected Attribute Detection and Regard ClassificationTakuma Udagawa, Yang Zhao, Hiroshi Kanayama, Bishwaranjan Bhattacharjee. 16-25 [doi]
- Large Language Models Might Not Care What You Are Saying: Prompt Format Beats DescriptionsChenming Tang, Zhixiang Wang, Hao Sun, Yunfang Wu. 26-48 [doi]
- Boundary Matters: Leveraging Structured Text Plots for Long Text Outline GenerationYuanchi Ma, Jiamou Liu, Hui He, Libo Zhang 0006, Haoyuan Li, Zhendong Niu. 49-63 [doi]
- Can Large Language Models Personalize Dialogues to Generational Styles?Pier Felice Balestrucci, Ondrej Dusek, Luca Anselma, Alessandro Mazzei. 64-77 [doi]
- Toward Optimal LLM Alignments Using Two-Player GamesRui Zheng, Hongyi Guo, Zhihan Liu, Xiaoying Zhang, Yuanshun Yao, Xiaojun Xu, Zhaoran Wang 0001, Zhiheng Xi, Tao Gui, Qi Zhang 0001, Xuanjing Huang 0001, Yang Liu 0018, Hang Li 0001. 78-99 [doi]
- Structural Patent Classification Using Label Hierarchy OptimizationMengting Gui, Shufeng Hao, Chongyang Shi 0001, Qi Zhang 0020. 100-114 [doi]
- Exploring Hyperbolic Hierarchical Structure for Multimodal Rumor DetectionMd Mahbubur Rahman, Shufeng Hao, Chongyang Shi 0001, An Lao, Jinyan Liu. 115-134 [doi]
- Multi-Surrogate-Objective Optimization for Neural Topic ModelsTue Le, Hoang Tran Vuong, Tung Nguyen, Linh Ngo Van 0001, Dinh Viet Sang, Trung Le 0001, Thien Huu Nguyen. 135-151 [doi]
- How Diversely Can Language Models Solve Problems? Exploring the Algorithmic Diversity of Model-Generated CodeSeonghyeon Lee, Heejae Chon, Joonwon Jang, Dongha Lee 0003, Hwanjo Yu. 152-167 [doi]
- ReAL: How Can LLMs Simulate the Real Teacher? Retrieval-enhanced Agent for Adaptive LearningRui Lv, Qi Liu 0003, Weibo Gao, Jiatong Li 0002, Kai Zhang 0038, Shiwei Tong. 168-181 [doi]
- LLMsPark: A Benchmark for Evaluating Large Language Models in Strategic Gaming ContextsJunhao Chen, Jingbo Sun, Xiang Li, Haidong Xin, Yuhao Xue, Yibin Xu, Hao Zhao. 182-194 [doi]
- Versatile Framework for Song Generation with Prompt-based ControlYu Zhang 0126, Wenxiang Guo, Changhao Pan, Zhiyuan Zhu, Ruiqi Li, Jingyu Lu, Rongjie Huang 0001, Ruiyuan Zhang, Zhiqing Hong, Ziyue Jiang 0001, Zhou Zhao 0001. 195-219 [doi]
- InsBank: Evolving Instruction Subset for Ongoing AlignmentJiayi Shi, Yiwei Li 0001, Shaoxiong Feng, Peiwen Yuan, Xinglin Wang, Yueqi Zhang, Chuyi Tan, Boyuan Pan, Huan Ren, Yao Hu 0002, Kan Li 0001. 220-238 [doi]
- TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool UseJunjie Ye 0005, Yilong Wu, Sixian Li, Yuming Yang, Zhiheng Xi, Tao Gui, Qi Zhang 0001, Xuanjing Huang 0001, Peng Wang 0095, Zhongchao Shi, Jianping Fan 0007, Zhengyin Du. 239-258 [doi]
- DCMKC: A Dual Consistency Matching Approach for Multi-hop Question Answering in LLMsXinyi Wang, Yiping Song, Chang Liu, Tingjin Luo, Bo Liu, Zheng Xie, Minlie Huang. 259-273 [doi]
- On Domain-Adaptive Post-Training for Multimodal Large Language ModelsDaixuan Cheng, Shaohan Huang, Ziyu Zhu, Xintong Zhang, Xin Zhao 0018, Zhongzhi Luan, Bo Dai 0026, Zhenliang Zhang. 274-296 [doi]
- CPO: Addressing Reward Ambiguity in Role-playing Dialogue via Comparative Policy OptimizationJing Ye, Rui Wang, Yuchuan Wu, Victor Ma, Feiteng Fang, Fei Huang 0002, Yongbin Li. 297-323 [doi]
- SPPD: Self-training with Process Preference Learning Using Dynamic Value MarginHao Yi, Qingyang Li 0001, Yulan Hu, Fuzheng Zhang, Di Zhang 0026, Yong Liu 0018. 324-337 [doi]
- Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive FrameworkZhangyue Yin, Yuhong Sun, Xuanjing Huang 0001, Xipeng Qiu, Hui Zhao. 338-365 [doi]
- sudoLLM: On Multi-role Alignment of Language ModelsSoumadeep Saha, Akshay Chaturvedi, Joy Mahapatra, Utpal Garain. 366-384 [doi]
- DAC: Decomposed Automation Correction for Text-to-SQLDingzirui Wang, Longxu Dou, Xuanliang Zhang, Qingfu Zhu, Wanxiang Che. 385-402 [doi]
- VehicleWorld: A Highly Integrated Multi-Device Environment for Intelligent Vehicle InteractionJie Yang, Jiajun Chen, Zhangyue Yin, Shuo Chen, Yuxin Wang 0005, Yiran Guo, Yuan Li, Yining Zheng, Xuanjing Huang 0001, Xipeng Qiu. 403-442 [doi]
- End-to-End Optimization for Multimodal Retrieval-Augmented Generation via Reward BackpropagationZhiyuan Fan, Longfei Yun, Ming Yan 0008, Yumeng Wang 0010, Dadi Guo, Brian Mak, James T. Kwok, Yi R. Fung 0001. 443-466 [doi]
- Audio-Aware Large Language Models as Judges for Speaking StylesCheng-Han Chiang, Xiaofei Wang, Chung-Ching Lin, Kevin Lin, Linjie Li, Radu Kopetz, Yao Qian, Zhendong Wang, Zhengyuan Yang, Hung-yi Lee, Lijuan Wang. 467-480 [doi]
- Evaluation of Text-to-Image Generation from a Creativity PerspectiveXinhao Wang, Xinyu Ma, Shengyong Ding, Derek F. Wong. 481-493 [doi]
- Perovskite-LLM: Knowledge-Enhanced Large Language Models for Perovskite Solar Cell ResearchXiang Liu 0001, Penglei Sun, Shuyan Chen, Longhan Zhang, Peijie Dong, Huajie You, Yongqi Zhang, Chang Yan, Xiaowen Chu 0001, Tong-Yi Zhang. 494-518 [doi]
- ProPy: Building Interactive Prompt Pyramids upon CLIP for Partially Relevant Video RetrievalYi Pan, Yujia Zhang 0001, Michael Kampffmeyer, Xiaoguang Zhao. 519-533 [doi]
- Multilingual Datasets for Custom Input Extraction and Explanation Requests Parsing in Conversational XAI SystemsQianli Wang, Tatiana Anikina, Nils Feldhus, Simon Ostermann 0002, Fedor Splitt, Jiaao Li, Yoana Tsoneva, Sebastian Möller 0001, Vera Schmitt. 534-555 [doi]
- Toolscaler: Scalable Generative Tool Calling via Structure-Aware Semantic TokenizationYunyue Su, Zhang Jinshuai, Bowen Fang, Wen Ye, Jinghao Zhang, Bowen Song, Weiqiang Wang, Qiang Liu 0006, Liang Wang. 556-578 [doi]
- LaMP-Val: Large Language Models Empower Personalized Valuation in AuctionJie Sun 0030, Tianyu Zhang, Houcheng Jiang, Kexin Huang, Xiang Shu, Zhibo Zhu, Lintao Ma, Xingyu Lu 0004, Jun Zhou 0011, Junkang Wu, Chi Luo, An Zhang 0003, Jiancan Wu, Xiang Wang 0010. 579-595 [doi]
- Exploring Model Kinship for Merging Large Language ModelsYedi Hu, Yunzhi Yao, Ningyu Zhang 0001, Huajun Chen, Shumin Deng. 596-625 [doi]
- MULTITAT: Benchmarking Multilingual Table-and-Text Question AnsweringXuanliang Zhang, Dingzirui Wang, Keyan Xu, Qingfu Zhu, Wanxiang Che. 626-647 [doi]
- LoRA-MGPO: Mitigating Double Descent in Low-Rank Adaptation via Momentum-Guided Perturbation OptimizationYupeng Chang, Chenlu Guo, Yi Chang 0001, Yuan Wu 0002. 648-659 [doi]
- R-LoRA: Randomized Multi-Head LoRA for Efficient Multi-task LearningJinda Liu, Yi Chang, Yuan Wu. 660-674 [doi]
- RACQC: Advanced Retrieval-Augmented Generation for Chinese Query CorrectionJinbo Su, Lingzhe Gao, Wei Li, Shihao Liu, Haojie Lei, Xinyi Wang, Yuanzhao Guo, Ke Wang, Daiting Shi, Dawei Yin. 675-689 [doi]
- Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language ModelsErcong Nie, Helmut Schmid, Hinrich Schütze. 690-706 [doi]
- Assessing and Mitigating Medical Knowledge Drift and Conflicts in Large Language ModelsWeiyi Wu, Xinwen Xu, Chongyang Gao, Xingjian Diao, Siting Li, Lucas A. Salas, Jiang Gui. 707-730 [doi]
- Improving LLM Reasoning through Interpretable Role-Playing SteeringAnyi Wang, Dong Shu, Yifan Wang, Yunpu Ma, Mengnan Du. 731-751 [doi]
- R2A-TLS: Reflective Retrieval-Augmented Timeline Summarization with Causal-Semantic IntegrationChenlong Bao, Shijie Li, Minghao Hu, Ming Qiao, Bin Zhang, Jin-Tao Tang, Shasha Li, Ting Wang. 752-766 [doi]
- MedEBench: Diagnosing Reliability in Text-Guided Medical Image EditingMinghao Liu, Zhitao He 0001, Zhiyuan Fan, Qingyun Wang 0005, Yi R. Fung 0001. 767-791 [doi]
- FairCoT: Enhancing Fairness in Text-to-Image Generation via Chain of Thought Reasoning with Multimodal Large Language ModelsZahraa Al Sahili, Ioannis Patras, Matthew Purver. 792-816 [doi]
- Bag of Tricks for Sparse Mixture-of-Experts: A Benchmark Across Reasoning, Efficiency, and SafetyMufan Qiu, Zheyu Shen, Pingzhi Li, Ang Li 0005, Tianlong Chen 0001. 817-835 [doi]
- Don't Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language ModelsJinzhe Li, Gengxu Li, Yi Chang, Yuan Wu. 836-869 [doi]
- Mitigating Geospatial Knowledge Hallucination in Large Language Models: Benchmarking and Dynamic Factuality AligningShengyuan Wang, Jie Feng 0002, Tianhui Liu, Dan Pei, Yong Li 0008. 870-888 [doi]
- The Power of Framing: How News Headlines Guide Search BehaviorAmrit Poudel, Maria Milkowski, Tim Weninger. 889-900 [doi]
- DivLogicEval: A Framework for Benchmarking Logical Reasoning Evaluation in Large Language ModelsTsz Ting Chung, Lemao Liu, Mo Yu, Dit-Yan Yeung. 901-915 [doi]
- THCM-CAL: Temporal-Hierarchical Causal Modelling with Conformal Calibration for Clinical Risk PredictionXin Zhang 0108, Qiyu Wei, YingJie Zhu, Fanyi Wu, Sophia Ananiadou. 916-928 [doi]
- GenPilot: A Multi-Agent System for Test-Time Prompt Optimization in Image GenerationWen Ye, Zhaocheng Liu, Yuwei Gui, Tingyu Yuan, Yunyue Su, Bowen Fang, Chaoyang Zhao, Qiang Liu 0006, Liang Wang 0001. 929-958 [doi]
- Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language ModelsHaibo Wang, Zhiyang Xu, Yu Cheng, Shizhe Diao, Yufan Zhou 0001, Yixin Cao 0006, Qifan Wang 0001, Weifeng Ge, Lifu Huang. 959-975 [doi]
- DongbaMIE: A Multimodal Information Extraction Dataset for Evaluating Semantic Understanding of Dongba PictogramsXiaojun Bi 0002, Shuo Li, Junyao Xing, Ziyue Wang, Fuwen Luo, Weizheng Qiao, Lu Han, Ziwei Sun, Peng Li, Yang Liu. 976-990 [doi]
- Optimizing Cross-Client Domain Coverage for Federated Instruction Tuning of Large Language ModelsZezhou Wang, Yaxin Du, Xingjun Ma, Yu-Gang Jiang, Zhuzhong Qian, Siheng Chen. 991-1011 [doi]
- Aligning Black-Box LLMs for Aspect Sentiment Quad PredictionShichen Li, Jiawei Zhang, Zhongqing Wang, Peifeng Li. 1012-1025 [doi]
- Multifaceted Evaluation of Audio-Visual Capability for MLLMs: Effectiveness, Efficiency, Generalizability and RobustnessYusheng Zhao, Xiao Luo 0001, Junyu Luo 0002, Weizhi Zhang 0001, Zhiping Xiao 0001, Wei Ju 0001, Philip S. Yu, Ming Zhang 0004. 1026-1041 [doi]
- Two Steps from Hell: Compositionality on Chemical LMsVeronika Ganeeva, Kuzma Khrabrov, Artur Kadurin, Elena Tutubalina. 1042-1049 [doi]
- GTA: Supervised-Guided Reinforcement Learning for Text Classification with Large Language ModelsMin Zeng, Jingfei Sun, Xueyou Luo, ShiQi Zhang, Li Xie, Caiquan Liu, Xiaoxin Chen 0001. 1050-1060 [doi]
- Unearthing Gems from Stones: Policy Optimization with Negative Sample Augmentation for LLM ReasoningZhaohui Yang, Yuxiao Ye, Shilei Jiang, Shihong Deng, Chen Hu, Linjing Li, Daxin Jiang. 1061-1075 [doi]
- LEAF: Large Language Diffusion Model for Time Series ForecastingYuhang Pei, Tao Ren 0002, Yifan Wang, Zhipeng Sun, Wei Ju 0001, Chong Chen 0002, Xiansheng Hua 0001, Xiao Luo 0001. 1076-1091 [doi]
- SPFT-SQL: Enhancing Large Language Model for Text-to-SQL Parsing by Self-Play Fine-TuningYuhao Zhang, Shaoming Duan, Jinhang Su, Chuanyi Liu, Peiyi Han. 1092-1110 [doi]
- Multilingual Verbalisation of Knowledge GraphsYifei Song, William Soto Martinez, Anna Nikiforovskaya, Evan Parker Kelly Chapple, Claire Gardent. 1111-1162 [doi]
- LAGCL4Rec: When LLMs Activate Interactions Potential in Graph Contrastive Learning for RecommendationLeqi Zheng, Chaokun Wang, Canzhi Chen, Jiajun Zhang, Cheng Wu 0004, Zixin Song, Shannan Yan, Ziyang Liu 0004, Hongwei Li 0032. 1163-1184 [doi]
- English as Defense Proxy: Mitigating Multilingual Jailbreak via Eliciting English Safety KnowledgeZekai Zhang, Yiduo Guo, Jiuheng Lin, Shanghaoran Quan, Huishuai Zhang, Dongyan Zhao 0001. 1185-1196 [doi]
- Dagger Behind Smile: Fool LLMs with a Happy Ending StoryXurui Song, Zhixin Xie, Shuo Huai, Jiayi Kong 0002, Jun Luo 0001. 1197-1229 [doi]
- Mitigating Object Hallucinations in MLLMs via Multi-Frequency PerturbationsShuo Li, Jiajun Sun, Guodong Zheng, Xiaoran Fan, Yujiong Shen, Yi Lu, Zhiheng Xi, Yuming Yang, Wenming Tan, Tao Ji, Tao Gui, Qi Zhang 0001, Xuanjing Huang 0001. 1230-1247 [doi]
- Natural Context Drift Undermines the Natural Language Understanding of Large Language ModelsYulong Wu, Viktor Schlegel, Riza Batista-Navarro. 1248-1259 [doi]
- Minimal Ranks, Maximum Confidence: Parameter-efficient Uncertainty Quantification for LoRAPatryk Marszalek, Klaudia Balazy, Jacek Tabor, Tomasz Kusmierczyk. 1260-1271 [doi]
- Chain-of-Thought Prompting Obscures Hallucination Cues in Large Language Models: An Empirical EvaluationJiahao Cheng, Tiancheng Su, Jia Yuan, Guoxiu He, Jiawei Liu, Xinqi Tao, Jingwen Xie, Huaxia Li. 1272-1305 [doi]
- Large Language Model Evaluation via Matrix Nuclear-NormYahan Li, Tingyu Xia, Yuan Wu 0002, Yi Chang 0001. 1306-1323 [doi]
- From Grounding to Manipulation: Case Studies of Foundation Model Integration in Embodied Robotic SystemsXiuchao Sui, Daiying Tian, Qi Sun, Ruirui Chen 0002, Dongkyu Choi, Kenneth Kwok, Soujanya Poria. 1324-1340 [doi]
- Flexible Thinking for Multimodal Emotional Support Conversation via Reinforcement LearningFanfan Wang, Xiangqing Shen, Jianfei Yu, Rui Xia. 1341-1356 [doi]
- ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent DiffusionRana Muhammad Shahroz, Dongwen Tang, Pingzhi Li, Kai Wang 0036, Tianlong Chen 0001. 1357-1370 [doi]
- NLoRA: Nyström-Initiated Low-Rank Adaptation for Large Language ModelsChenlu Guo, Yi Chang, Yuan Wu. 1371-1385 [doi]
- Bhaasha, Bhāṣā, Zaban: A Survey for Low-Resourced Languages in South Asia - Current Stage and ChallengesSampoorna Poria, Xiaolei Huang. 1386-1406 [doi]
- DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced DataYuhang Zhou, Jing Zhu 0005, Shengyi Qian 0001, Zhuokai Zhao, Xiyao Wang, Xiaoyu Liu 0003, Ming Li, Paiheng Xu, Wei Ai 0002, Furong Huang. 1407-1419 [doi]
- What Makes for Good Image Captions?Delong Chen, Samuel Cahyawijaya, Etsuko Ishii, Ho Shu Chan, Yejin Bang, Pascale Fung. 1420-1437 [doi]
- What's Not Said Still Hurts: A Description-Based Evaluation Framework for Measuring Social Bias in LLMsJinhao Pan, Chahat Raj, Ziyu Yao 0002, Ziwei Zhu 0001. 1438-1459 [doi]
- Identifying Rare Languages in Common Crawl Data is a Needles-in-a-Haystack ProblemRasul Dent, Pedro Ortiz Suarez, Thibault Clérice, Benoît Sagot. 1460-1473 [doi]
- Training Language Models to Critique With Multi-agent FeedbackTian Lan 0003, Wenwei Zhang, Chengqi Lyu, Shuaibin Li, Chen Xu, Heyan Huang, Dahua Lin, Xian-Ling Mao, Kai Chen 0026. 1474-1501 [doi]
- RELIC: Enhancing Reward Model Generalization for Low-Resource Indic Languages with Few-Shot ExamplesSoumya Suvra Ghosal, Vaibhav Singh, Akash Ghosh, Soumyabrata Pal, Subhadip Baidya, Sriparna Saha 0001, Dinesh Manocha. 1502-1517 [doi]
- Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question AnsweringJihao Zhao, Chunlai Zhou, Daixuan Li, Shuaishuai Zu, Biao Qin. 1518-1532 [doi]
- SQLSpace: A Representation Space for Text-to-SQL to Discover and Mitigate Robustness GapsNeha Srikanth, Victor S. Bursztyn, Puneet Mathur, Ani Nenkova. 1533-1559 [doi]
- One More Modality: Does Abstract Meaning Representation Benefit Visual Question Answering?Abhidip Bhattacharyya, Emma Markle, Shira Wein. 1560-1572 [doi]
- DP-GTR: Differentially Private Prompt Protection via Group Text RewritingMingchen Li, Heng Fan 0001, Song Fu, Junhua Ding, Yunhe Feng. 1573-1585 [doi]
- Legal Mathematical Reasoning with LLMs: Procedural Alignment through Two-Stage Reinforcement LearningKepu Zhang, Guofu Xie, Weijie Yu 0003, Mingyue Xu, Xu Tang, Yaxin Li, Jun Xu 0001. 1586-1598 [doi]
- ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World ChallengesCheng Qian 0008, Hongyi Du, Hongru Wang 0011, Xiusi Chen, Yuji Zhang 0002, Avirup Sil, ChengXiang Zhai, Kathleen McKeown, Heng Ji 0001. 1599-1633 [doi]
- Beyond Coarse Labels: Fine-Grained Problem Augmentation and Multi-Dimensional Feedback for Emotional Support ConversationYuanchen Shi, Jiawang Hao, Fang Kong 0001. 1634-1647 [doi]
- FinHEAR: Human Expertise and Adaptive Risk-Aware Temporal Reasoning for Financial Decision-MakingJiaxiang Chen, Mingxi Zou, Zhuo Wang, Qifan Wang 0001, Danny Dongning Sun, Zhang Chi, Zenglin Xu. 1648-1672 [doi]
- EvolKV: Evolutionary KV Cache Compression for LLM InferenceBohan Yu, Yekun Chai. 1673-1689 [doi]
- A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language ModelsDong Shu, Xuansheng Wu, Haiyan Zhao 0003, Daking Rai, Ziyu Yao 0002, Ninghao Liu 0001, Mengnan Du. 1690-1712 [doi]
- Large Vision-Language Model Alignment and Misalignment: A Survey Through the Lens of ExplainabilityDong Shu, Haiyan Zhao 0003, Jingyu Hu 0002, Weiru Liu, Ali Payani, Lu Cheng, Mengnan Du. 1713-1735 [doi]
- Attention Consistency for LLMs ExplanationTian Lan, Jinyuan Xu, Xue He, Jenq-Neng Hwang, Lei Li 0050. 1736-1750 [doi]
- Confusion is the Final Barrier: Rethinking Jailbreak Evaluation and Investigating the Real Misuse Threat of LLMsYu Yan, Sheng Sun, Zhe Wang, Yijun Lin 0007, Zenghao Duan, Zhifei Zheng, Min Liu, Zhiyi Yin, Jianping Zhang. 1751-1767 [doi]
- CCL-XCoT: An Efficient Cross-Lingual Knowledge Transfer Method for Mitigating Hallucination GenerationWeihua Zheng, Roy Ka-Wei Lee, Zhengyuan Liu, Wu Kui, AiTi Aw, Bowei Zou. 1768-1788 [doi]
- Evaluating Step-by-step Reasoning Traces: A SurveyJinu Lee, Julia Hockenmaier. 1789-1814 [doi]
- Beyond Guilt: Legal Judgment Prediction with Trichotomous ReasoningKepu Zhang, Haoyue Yang, Xu Tang, Weijie Yu 0003, Jun Xu 0001. 1815-1826 [doi]
- Not Every Token Needs Forgetting: Selective Unlearning Balancing Forgetting and Utility in Large Language ModelsYixin Wan, Anil Ramakrishna, Kai-Wei Chang 0001, Volkan Cevher, Rahul Gupta 0001. 1827-1835 [doi]
- DisastIR: A Comprehensive Information Retrieval Benchmark for Disaster ManagementKai Yin, Xiangjue Dong, Chengkai Liu, Lipai Huang, Yiming Xiao, Zhewei Liu, Ali Mostafavi, James Caverlee. 1836-1867 [doi]
- Data or Language Supervision: What Makes CLIP Better than DINO?Yiming Liu, Yuhui Zhang, Dhruba Ghosh, Ludwig Schmidt, Serena Yeung-Levy. 1868-1874 [doi]
- Do LLMs Understand Wine Descriptors Across Cultures? A Benchmark for Cultural Adaptations of Wine ReviewsChenye Zou, Xingyue Wen, Tianyi Hu, Qian Janice Wang, Daniel Hershcovich. 1875-1894 [doi]
- DeFT-X: Denoised Sparse Fine-Tuning for Zero-Shot Cross-Lingual TransferSona Elza Simon, Preethi Jyothi. 1895-1909 [doi]
- Memory-enhanced Large Language Model for Cross-lingual Dependency Parsing via Deep Hierarchical Syntax UnderstandingJianjian Liu, Ying Li 0127, Zhengtao Yu 0001, Shun Su, Shengxiang Gao, Yuxin Huang 0004. 1910-1923 [doi]
- Developing and Utilizing a Large-Scale Cantonese Dataset for Multi-Tasking in Large Language ModelsJiyue Jiang, Alfred Kar Yin Truong, Yanyu Chen, Qinghang Bao, Sheng Wang, Pengan Chen, Jiuming Wang, Lingpeng Kong, Yu Li, Chuan Wu. 1924-1944 [doi]
- A Structured Framework for Evaluating and Enhancing Interpretive Capabilities of Multimodal LLMs in Culturally Situated TasksHaorui Yu, Ramon Ruiz-Dolz, Qiufeng Yi. 1945-1971 [doi]
- Train a Unified Multimodal Data Quality Classifier with Synthetic DataWeizhi Wang, Rongmei Lin, Shiyang Li, Colin Lockard, Ritesh Sarkhel, Sanket Lokegaonkar, Jingbo Shang, Xifeng Yan, Nasser Zalmout, Xian Li. 1972-1986 [doi]
- Self-Improvement in Multimodal Large Language Models: A SurveyShijian Deng, Kai Wang 0068, Tianyu Yang, Harsh Singh, Yapeng Tian. 1987-2006 [doi]
- Towards Achieving Concept Completeness for Textual Concept Bottleneck ModelsMilan Bhan, Yann Choho, Jean-Noël Vittaut, Nicolas Chesneau, Pierre Moreau, Marie-Jeanne Lesot. 2007-2024 [doi]
- EmoBench-UA: A Benchmark Dataset for Emotion Detection in UkrainianDaryna Dementieva, Nikolay Babakov, Alexander Fraser 0001. 2025-2048 [doi]
- Scientific Paper Retrieval with LLM-Guided Semantic-Based RankingYunyi Zhang 0001, Ruozhen Yang, Siqi Jiao, SeongKu Kang, Jiawei Han 0001. 2049-2060 [doi]
- DLIR: Spherical Adaptation for Cross-Lingual Knowledge Transfer of Sociological Concepts AlignmentZeqiang Wang, Jon Johnson, Suparna De. 2061-2075 [doi]
- Test-Time Steering for Lossless Text Compression via Weighted Product of ExpertsQihang Zhang, Muchen Li, Ziao Wang, Renjie Liao, Lele Wang. 2076-2088 [doi]
- Zero-Shot Contextual Embeddings via Offline Synthetic Corpus GenerationPhilip Lippmann, Jie Yang. 2089-2104 [doi]
- The Hallucination Tax of Reinforcement FinetuningLinxin Song, Taiwei Shi, Jieyu Zhao 0001. 2105-2120 [doi]
- Tracing Multilingual Factual Knowledge Acquisition in PretrainingYihong Liu 0001, Mingyang Wang 0003, Amir Hossein Kargaran, Felicia Körner, Ercong Nie, Barbara Plank, François Yvon, Hinrich Schütze. 2121-2146 [doi]
- Exploring the Vulnerability of the Content Moderation Guardrail in Large Language Models via Intent ManipulationJun Zhuang 0004, Hai Jin 0001, Ye Zhang, Zhengjian Kang, Wenbin Zhang 0002, Gaby G. Dagher, Haohan Wang. 2147-2160 [doi]
- Examining Multilingual Embedding Models Cross-Lingually Through LLM-Generated Adversarial ExamplesAndrianos Michail, Simon Clematide, Rico Sennrich. 2161-2170 [doi]
- EmoGist: Efficient In-Context Learning for Visual Emotion UnderstandingRonald Seoh, Dan Goldwasser. 2171-2182 [doi]
- Soft Token Attacks Cannot Reliably Audit Unlearning in Large Language ModelsHaokun Chen, Sebastian Szyller, Weilin Xu, Nageen Himayat. 2183-2192 [doi]
- Bridging the Editing Gap in LLMs: FineEdit for Precise and Targeted Text ModificationsYiming Zeng 0012, Wanhao Yu, Zexin Li, Tao Ren, Yu Ma, Jinghan Cao, Xiyan Chen, Tingting Yu. 2193-2206 [doi]
- LLM-based Conversational Recommendation Agents with Collaborative Verbalized ExperienceYaochen Zhu, Harald Steck, Dawen Liang, Yinhan He, Nathan Kallus, Jundong Li. 2207-2220 [doi]
- Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM InferenceHao Mark Chen, Wayne Luk, Yiu Ka Fai Cedric, Rui Li 0052, Konstantin Mishchenko, Stylianos I. Venieris, Hongxiang Fan. 2221-2238 [doi]
- Measuring Sycophancy of Language Models in Multi-turn DialoguesJiseung Hong, Grace Byun, Seungone Kim, Kai Shu. 2239-2259 [doi]
- On the Role of Entity and Event Level Conceptualization in Generalizable Reasoning: A Survey of Tasks, Methods, Applications, and Future DirectionsWeiqi Wang 0001, Tianqing Fang, Haochen Shi, Baixuan Xu, Wenxuan Ding 0001, Liyu Zhang 0005, Wei Fan 0001, Jiaxin Bai, Haoran Li 0003, Xin Liu 0039, Yangqiu Song. 2260-2281 [doi]
- Mitigating Visual Knowledge Forgetting in MLLM Instruction-tuning via Modality-decoupled Gradient DescentJunda Wu, Yuxin Xiong, Xintong Li 0001, Yu Xia 0007, Ruoyu Wang 0038, Yu Wang 0160, Tong Yu 0001, SungChul Kim, Ryan A. Rossi, Lina Yao 0001, Jingbo Shang, Julian J. McAuley. 2282-2295 [doi]
- PathoHR: Hierarchical Reasoning for Vision-Language Models in PathologyYating Huang, Ziyan Huang, Lintao Xiang, Qijun Yang, Hujun Yin. 2296-2311 [doi]
- "What's Up, Doc?": Analyzing How Users Seek Health Information in Large-Scale Conversational AI DatasetsAkshay Paruchuri, Maryam Aziz, Rohit Vartak, Ayman Ali, Best Uchehara, Xin Liu 0034, Ishan Chatterjee, Monica Agrawal. 2312-2336 [doi]
- Dynamic Evaluation for Oversensitivity in LLMsSophia Xiao Pu, Sitao Cheng, Xin Eric Wang, William Yang Wang. 2337-2344 [doi]
- Self-Correcting Code Generation Using Small Language ModelsJeonghun Cho 0002, Deokhyung Kang, Hyounghun Kim, Gary Lee 0001. 2345-2368 [doi]
- A Unified Framework for N-ary Property Information Extraction in Materials ScienceVan-Thuy Phi, Yuji Matsumoto 0001. 2369-2388 [doi]
- A Benchmark for Translations Across Styles and Language VariantsXin Tan, Bowei Zou, AiTi Aw. 2389-2402 [doi]
- ManuSearch: Democratizing Deep Search in Large Language Models with a Transparent and Open Multi-Agent FrameworkLisheng Huang, Yichen Liu, Jinhao Jiang, Rongxiang Zhang, Jiahao Yan, Junyi Li 0001, Xin Zhao 0018. 2403-2417 [doi]
- Proactive User Information Acquisition via Chats on User-Favored TopicsShiki Sato, Jun Baba, Asahi Hentona, Shinji Iwata, Akifumi Yoshimoto, Koichiro Yoshino. 2418-2443 [doi]
- Evaluating Text Generation Quality Using Spectral Distances of SurprisalZhichen Liu, Yongyuan Li, Yang Xu, Yu Wang, Yingfang Yuan, Zuhao Yang. 2444-2463 [doi]
- NLP-ADBench: NLP Anomaly Detection BenchmarkYuangang Li 0002, Jiaqi Li, Zhuo Xiao, Tiankai Yang 0001, Yi Nian, Xiyang Hu, Yue Zhao 0016. 2464-2474 [doi]
- Toward Inclusive Language Models: Sparsity-Driven Calibration for Systematic and Interpretable Mitigation of Social Biases in LLMsPrommy Sultana Hossain, Chahat Raj, Ziwei Zhu 0001, Jessica Lin 0001, Emanuela Marasco. 2475-2508 [doi]
- Table-Text Alignment: Explaining Claim Verification Against Tables in Scientific PapersXanh Ho, Sunisth Kumar, Yun-Ang Wu, Florian Boudin, Atsuhiro Takasu, Akiko Aizawa. 2509-2517 [doi]
- DCRM: A Heuristic to Measure Response Pair Quality in Preference OptimizationChengyu Huang, Tanya Goyal. 2518-2537 [doi]
- Advancing Reasoning with Off-the-Shelf LLMs: A Semantic Structure PerspectivePengfei He, Zitao Li, Yue Xing 0002, Yaliang Li, Jiliang Tang, Bolin Ding. 2538-2566 [doi]
- LLM-based Open Domain Planning by Leveraging Entity-Attribute-Level Domain ModelsDongning Rao, Songlin He, Zhihua Jiang, Ruishi Liang. 2567-2588 [doi]
- DICP: Deep In-Context Prompt for Event Causality IdentificationLin Mu, Jun Shen, Li Ni 0001, Lei Sang 0001, Zhize Wu, Peiquan Jin, Yiwen Zhang 0001. 2589-2599 [doi]
- Seeing is Believing: Emotion-Aware Audio-Visual Language Modeling for Expressive Speech GenerationWeiting Tan, Jiachen Lian, Hirofumi Inaguma, Paden Tomasello, Philipp Koehn, Xutai Ma. 2600-2617 [doi]
- GRV-KBQA: A Three-Stage Framework for Knowledge Base Question Answering with Decoupled Logical Structure, Semantic Grounding and Structure-Aware ValidationYuhang Tian, Pan Yang, Dandan Song 0005, Zhijing Wu 0001, Hao Wang 0163. 2618-2632 [doi]
- Improving Prompt Generalization for Cross-prompt Essay Trait Scoring from the Scoring-invariance PerspectiveJiong Wang, Shengquan Yu. 2633-2646 [doi]
- When Format Changes Meaning: Investigating Semantic Inconsistency of Large Language ModelsCheongwoong Kang, Jongeun Baek, Yeonjea Kim, Jaesik Choi. 2647-2667 [doi]
- ASTPrompter: Preference-Aligned Automated Language Model Red-Teaming to Generate Low-Perplexity Unsafe PromptsAmelia Hardy, Houjun Liu, Allie Griffith, Bernard Lange, Duncan Eddy, Mykel J. Kochenderfer. 2668-2683 [doi]
- How Do Large Language Models Perform on PDE Discovery: A Coarse-to-fine PerspectiveXiao Luo 0001, Changhu Wang, Yizhou Sun, Wei Wang 0010. 2684-2697 [doi]
- Rethinking Data Selection at Scale: Random Selection is Almost All You NeedTingyu Xia, Bowen Yu 0002, Kai Dang, An Yang, Yuan Wu 0002, Yuan Tian 0016, Yi Chang 0001, Junyang Lin. 2698-2711 [doi]
- PromptKeeper: Safeguarding System Prompts for LLMsZhifeng Jiang 0006, Zhihua Jin, Guoliang He. 2712-2728 [doi]
- Automating eHMI Action Design with LLMs for Automated Vehicle CommunicationDing Xia, Xinyue Gui, Fan Gao, Dongyuan Li, Mark Colley, Takeo Igarashi. 2729-2752 [doi]
- A Dynamic Fusion Model for Consistent Crisis ResponseXiaoying Song, Anirban Saha Anik, Eduardo Blanco 0002, Vanessa Frías-Martínez, Lingzi Hong. 2753-2768 [doi]
- UIOrchestra: Generating High-Fidelity Code from UI Designs with a Multi-agent SystemChuhuai Yue, Jiajun Chai, Yufei Zhan, Zixiang Ding, Xihao Liang, Peixin Wang, Shihai Chen, Wang Yixuan, Wang Yanping, Guojun Yin, Wei Lin. 2769-2782 [doi]
- CrossQG: Improving Difficulty-Controllable Question Generation through Consistency EnhancementKunze Li, Yu Zhang. 2783-2798 [doi]
- Progressive Facial Granularity Aggregation with Bilateral Attribute-based Enhancement for Face-to-Speech SynthesisYejin Jeon, Youngjae Kim, Jihyun Lee, Hyounghun Kim, Gary Lee. 2799-2811 [doi]
- Speaking at the Right Level: Literacy-Controlled Counterspeech Generation with RAG-RLXiaoying Song, Anirban Saha Anik, Dibakar Barua, Pengcheng Luo, Junhua Ding, Lingzi Hong. 2812-2830 [doi]
- FNSCC: Fuzzy Neighborhood-Aware Self-Supervised Contrastive Clustering for Short TextZijian Zheng, Yonghe Lu, Jian Yin 0001. 2831-2846 [doi]
- AuraDial: A Large-Scale Human-Centric Dialogue Dataset for Chinese AI Psychological CounselingXiantao Zhang. 2847-2863 [doi]
- TS-SQL: Test-driven Self-refinement for Text-to-SQLWenbo Xu, Haifeng Zhu, Liang Yan, Chuanyi Liu, Peiyi Han, Shaoming Duan, Jeff Z. Pan. 2864-2889 [doi]
- DemonAgent: Dynamically Encrypted Multi-Backdoor Implantation Attack on LLM-based AgentPengyu Zhu, Zhenhong Zhou, Yuanhe Zhang, Shilinlu Yan, Kun Wang 0056, Sen Su. 2890-2912 [doi]
- MotivGraph-SoIQ: Integrating Motivational Knowledge Graphs and Socratic Dialogue for Enhanced LLM IdeationXinping Lei, Tong Zhou 0014, Yubo Chen 0001, Kang Liu 0001, Jun Zhao 0001. 2913-2933 [doi]
- ExpertGenQA: Open-ended QA generation in Specialized DomainsHaz Sameen Shahgir, Chansong Lim, Jia Chen 0002, Evangelos E. Papalexakis, Yue Dong 0002. 2934-2955 [doi]
- VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code GenerationYuansheng Ni, Ping Nie, Kai Zou, Xiang Yue, Wenhu Chen. 2956-2983 [doi]
- Conversational Education at Scale: A Multi-LLM Agent Workflow for Procedural Learning and Pedagogic Quality AssessmentJiahuan Pei, Fanghua Ye 0001, Xin Sun 0016, Wentao Deng, Koen V. Hindriks, Junxiao Wang. 2984-2997 [doi]
- Visual Program Distillation with Template-Based AugmentationMichal Shlapentokh-Rothman, Yu-Xiong Wang, Derek Hoiem. 2998-3018 [doi]
- NeighXLM: Enhancing Cross-Lingual Transfer in Low-Resource Languages via Neighbor-Augmented Contrastive PretrainingSicheng Wang, Wenyi Wu, Zibo Zhang. 3019-3030 [doi]
- ICLER: Intent CLassification with Enhanced ReasoningDezheng Gao, Xiaozheng Dong, Shuangtao Yang, Bo Fu. 3031-3044 [doi]
- PreGenie: An Agentic Framework for High-quality Visual Presentation GenerationXiaojie Xu, Xinli Xu, Sirui Chen, Haoyu Chen 0003, Fan Zhang, Ying-Cong Chen. 3045-3063 [doi]
- RIVAL: Reinforcement Learning with Iterative and Adversarial Optimization for Machine TranslationTianjiao Li, Mengran Yu, Chenyu Shi, Yanjun Zhao 0001, Xiaojing Liu, Qi Zhang 0020, Xuanjing Huang 0001, Qiang Zhang, Jiayin Wang. 3064-3079 [doi]
- MRAG: A Modular Retrieval Framework for Time-Sensitive Question AnsweringSiyue Zhang, Yuxiang Xue, Yiming Zhang, Xiaobao Wu, Anh Tuan Luu, Chen Zhao 0013. 3080-3118 [doi]
- CoT-RAG: Integrating Chain of Thought and Retrieval-Augmented Generation to Enhance Reasoning in Large Language ModelsFeiyang Li, Peng Fang, Zhan Shi, Arijit Khan, Fang Wang, Weihao Wang, Xin Zhang, Cui Yongjian. 3119-3171 [doi]
- TabDSR: Decompose, Sanitize, and Reason for Complex Numerical Reasoning in Tabular DataChangjiang Jiang, Fengchang Yu, Haihua Chen 0002, Wei Lu 0019, Jin Zeng. 3172-3196 [doi]
- Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path SupervisionDawei Zhu, Xiyu Wei, Guangxiang Zhao, Wenhao Wu, Haosheng Zou, Junfeng Ran, Xun Wang, Lin Sun, Xiangzheng Zhang, Sujian Li. 3197-3211 [doi]
- Multimodal Document-level Triple Extraction via Dynamic Graph Enhancement and Relation-Aware ReflectionXiang Li, Runhai Jiao, Changyu Zhou, Shoupeng Qiao, Ruojiao Qiao, Ruifan Li. 3212-3223 [doi]
- Distill Visual Chart Reasoning Ability from LLMs to MLLMsWei He 0024, Zhiheng Xi, Wanxu Zhao, Xiaoran Fan, Yiwen Ding, Zifei Shan, Tao Gui, Qi Zhang 0001, Xuanjing Huang 0001. 3224-3250 [doi]
- FlowMalTrans: Unsupervised Binary Code Translation for Malware Detection Using Flow-Adapter ArchitectureMinghao Hu, Junzhe Wang, Weisen Zhao, Qiang Zeng 0001, Lannan Luo. 3251-3272 [doi]
- AdaTP: Attention-Debiased Token Pruning for Video Large Language ModelsFengyuan Sun, Leqi Shen, Hui Chen 0013, Sicheng Zhao, Jungong Han, Guiguang Ding. 3273-3286 [doi]
- AdaptFlow: Adaptive Workflow Optimization via Meta-LearningRunchuan Zhu, Bowen Jiang, Lingrui Mei, Fangkai Yang, Lu Wang 0029, Haoxiang Gao, Fengshuo Bai, Pu Zhao 0004, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang 0001. 3287-3302 [doi]
- LMUNIT: Fine-grained Evaluation with Natural Language Unit TestsJon Saad-Falcon, Rajan Vivek, William Berrios, Nandita Shankar Naik, Matija Franklin, Bertie Vidgen, Amanpreet Singh, Douwe Kiela, Shikib Mehri. 3303-3324 [doi]
- ThinkAnswer Loss: Balancing Semantic Similarity and Exact Matching for LLM Reasoning EnhancementShan Yang, Kun Wu, Zeju Li, Linlin Zhang, Xiangyu Pei, Leike An, Yu Liu. 3325-3347 [doi]
- Detecting Stealthy Backdoor Samples based on Intra-class Distance for Large Language ModelsJinwen Chen 0001, Hainan Zhang 0001, Fei Sun, Qinnan Zhang, Sijia Wen, Ziwei Wang, Zhiming Zheng 0001. 3348-3365 [doi]
- Rust-doctor: Enhanced Feature for Rust Ownership and Lifetime Repair with Balanced Training Data GenerationWenzhang Yang, Xiaoning Ren, Cuifeng Gao, Yinxing Xue. 3366-3376 [doi]
- SLIM: Subtrajectory-Level Elimination for More Effective ReasoningXifeng Yao, Chengyuan Ma, Dongyu Lang, Yinhao Ni, Zhiwei Xu, Huarui Xie, Zihao Chen, Guang Shen, Dandan Tu, Yi Bai, Changzheng Zhang. 3377-3394 [doi]
- From Cross-Task Examples to In-Task Prompts: A Graph-Based Pseudo-Labeling Framework for In-context LearningZihan Chen 0002, Song Wang 0013, Xingbo Fu, Chengshuai Shi, Zhenyu Lei 0004, Cong Shen 0001, Jun-Dong Li. 3395-3410 [doi]
- Instance-level Randomization: Toward More Stable LLM EvaluationsYiyang Li, Yonghuang Wu, Ying Luo, Liangtai Sun, Zishu Qin, Lin Qiu, Xuezhi Cao, Xunliang Cai. 3411-3425 [doi]
- Not All Voices Are Rewarded Equally: Probing and Repairing Reward Models across Human DiversityZihao Li 0006, Feihao Fang, Xitong Zhang, Jiaru Zou, Zhining Liu 0002, Wei Xiong, Ziwei Wu, Baoyu Jing, Jingrui He. 3426-3455 [doi]
- PAMN: Multi-phase Correlation Modeling for Contrast-Enhanced 3D Medical Image RetrievalHaonan Tong, Ke Liu, Chuang Zhang, Xinglin Zhang, Tao Chen, Jenq-Neng Hwang, Lei Li 0050. 3456-3467 [doi]
- Safety in Large Reasoning Models: A SurveyCheng Wang, Yue Liu 0008, Baolong Bi, Duzhen Zhang, Zhong-Zhi Li, Yingwei Ma, Yufei He, Shengju Yu, Xinfeng Li, Junfeng Fang, Jiaheng Zhang, Bryan Hooi. 3468-3482 [doi]
- SafeConf: A Confidence-Calibrated Safety Self-Evaluation Method for Large Language ModelsBo Zhang, Cong Gao, Linkang Yang, Bingxu Han, Minghao Hu, Zhunchen Luo, Guotong Geng, Xiaoying Bai, Jun Zhang, Wen Yao, Zhong Wang. 3483-3495 [doi]
- DocAssistant: Integrating Key-region Reading and Step-wise Reasoning for Robust Document Visual Question AnsweringJinxu Zhang, Qiyuan Fan, Yu Zhang. 3496-3511 [doi]
- LNE-Blocking: An Efficient Framework for Contamination Mitigation Evaluation on Large Language ModelsRuijie Hou, Jiao Yueyang, Hanxu Hu, Yingming Li, Wai Lam, Huajian Zhang, Hongyuan Lu. 3512-3528 [doi]
- Enhancing Hate Speech Classifiers through a Gradient-assisted Counterfactual Text Generation StrategyMichael van Supranes, Shaowen Peng, Shoko Wakamiya, Eiji Aramaki. 3529-3544 [doi]
- Learning SQL Like a Human: Structure-Aware Curriculum Learning for Text-to-SQL GenerationXiaohu Zhu, Qian Li 0043, LiZhen Cui, YunTao Du. 3545-3559 [doi]
- Chain-of-Interactions: Multi-step Iterative ICL Framework for Abstractive Task-Oriented Dialogue Summarization of Conversational AI InteractionsJason S. Lucas, Ali Al-Lawati, Mahjabin Nahar, John Chen, Mahnoosh Mehrabani. 3560-3599 [doi]
- Your Semantic-Independent Watermark is Fragile: A Semantic Perturbation Attack against EaaS WatermarkZekun Fei, Biao Yi, Jianing Geng, Ruiqi He, Lihai Nie, Zheli Liu. 3600-3614 [doi]
- Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language ModelsYouan Cong, Pritom Saha Akash, Cheng Wang, Kevin Chen-Chuan Chang. 3615-3625 [doi]
- SKA-Bench: A Fine-Grained Benchmark for Evaluating Structured Knowledge Understanding of LLMsZhiqiang Liu, Enpei Niu, Yin-Hua, Mengshu Sun, Lei Liang 0002, Huajun Chen, Wen Zhang 0015. 3626-3640 [doi]
- PD³F: A Pluggable and Dynamic DoS-Defense Framework against resource consumption attacks targeting Large Language ModelsYuanhe Zhang, Xinyue Wang, Haoran Gao, Zhenhong Zhou, Fanyu Meng, Yuyao Zhang, Sen Su. 3641-3671 [doi]
- From Implicit Exploration to Structured Reasoning: Guideline and Refinement for LLMsJiaxiang Chen, Zhuo Wang, Mingxi Zou, Zhucong Li, Zhijian Zhou, Song Wang, Zenglin Xu. 3672-3684 [doi]
- PIP: Perturbation-based Iterative Pruning for Large Language ModelsYi Cao, Wei-Jie Xu, Yucheng Shen, Weijie Shi, Chi-Min Chan, Jianfeng Qu, Jiajie Xu. 3685-3701 [doi]
- Convolutional LoRA Aggregation for Unseen Tasks AdaptationXinhao Wu, Jialin Liu, Yutai Duan, Jie Liu. 3702-3714 [doi]
- CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition, Domain, and TaskHaosi Mo, Xinyu Ma, Xuebo Liu 0002, Derek F. Wong, Yu Li 0007, Jie Liu 0001, Min Zhang 0005. 3715-3734 [doi]
- Multilingual Collaborative Defense for Large Language ModelsHongliang Li, Jinan Xu, Gengping Cui, Changhao Guan, Fengran Mo, Kaiyu Huang. 3735-3755 [doi]
- Role-Guided Annotation and Prototype-Aligned Representation Learning for Historical Literature Sentiment ClassificationHongfei Du, Jiacheng Shi, Jacobo Myerston, Sidi Lu, Gang Zhou 0002, Ashley Gao. 3756-3768 [doi]
- MetaMixSpeech: Meta Task Augmentation for Low-Resource Speech RecognitionYaqi Chen, Hao Zhang, Wenlin Zhang, Xukui Yang, Dan Qu, Yunpeng Liu. 3769-3779 [doi]
- RECAST: Retrieval-Augmented Contextual ASR via Decoder-State Keyword SpottingAshish R. Mittal, Sunita Sarawagi, Preethi Jyothi. 3780-3793 [doi]
- PREE: Towards Harmless and Adaptive Fingerprint Editing in Large Language Models via Knowledge Prefix EnhancementXubin Yue, Zhenhua Xu 0004, Wenpeng Xing, Jiahui Yu, Mohan Li, Meng Han. 3794-3804 [doi]
- Beyond Spurious Signals: Debiasing Multimodal Large Language Models via Counterfactual Inference and Adaptive Expert RoutingZichen Wu, Hsiu-Yuan Huang, Yunfang Wu. 3805-3825 [doi]
- Text-centric Alignment for Bridging Test-time Unseen ModalityYun-Da Tsai, Ting-Yu Yen, Pei-Fu Guo, Zhe-Yan Li, Shou-de Lin. 3826-3845 [doi]
- HierPrompt: Zero-Shot Hierarchical Text Classification with LLM-Enhanced PrototypesQian Zhang, Qinliang Su, Wei Zhu, Pang Yachun. 3846-3859 [doi]
- RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMsZhongzhan Huang, Guoming Ling, Yupei Lin, Yandong Chen, ShanShan Zhong, Hefeng Wu, Liang Lin. 3860-3887 [doi]
- Can We Steer Reasoning Direction by Thinking Intervention?Xingsheng Zhang, Luxi Xing, Chen Zhang, Yanbing Liu, Yifan Deng, Yunpeng Li, Yue Hu, Chenxu Niu. 3888-3913 [doi]
- MPO: Boosting LLM Agents with Meta Plan OptimizationWeimin Xiong, Yifan Song 0002, Qingxiu Dong, Bingchan Zhao, Feifan Song 0001, Xun Wang, Sujian Li. 3914-3935 [doi]
- Exploring the Generalizability of Factual Hallucination Mitigation via Enhancing Precise Knowledge UtilizationSiyuan Zhang, Yichi Zhang 0012, Yinpeng Dong, Hang Su 0006. 3936-3968 [doi]
- Learning What to Remember: Adaptive Probabilistic Memory Retention for Memory-Efficient Language ModelsS. M. Rafiuddin, Muntaha Nujat Khan. 3969-3981 [doi]
- Unlocking Smarter Device Control: Foresighted Planning with a World Model-Driven Code Execution ApproachXiaoran Yin, Xu Luo 0003, Hao Wu 0070, Lianli Gao, Jingkuan Song. 3982-4005 [doi]
- RGAR: Recurrence Generation-augmented Retrieval for Factual-aware Medical Question AnsweringSichu Liang, Linhai Zhang, Hongyu Zhu 0004, Wenwen Wang, Yulan He 0001, Deyu Zhou. 4006-4033 [doi]
- EcoSafeRAG: Efficient Security through Context Analysis in Retrieval-Augmented GenerationRuobing Yao, Yifei Zhang, Shuang Song, Neng Gao, Chenyang Tu. 4034-4050 [doi]
- StereoDetect: Detecting Stereotypes and Anti-stereotypes the Correct Way Using Social Psychological UnderpinningsKaustubh Shivshankar Shejole, Pushpak Bhattacharyya. 4051-4082 [doi]
- Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Spatial ReasoningYihong Tang, Ao Qu, Zhaokai Wang, Dingyi Zhuang, Zhaofeng Wu, Wei Ma, Shenhao Wang, Yunhan Zheng, Zhan Zhao, Jinhua Zhao 0001. 4083-4103 [doi]
- How Does Knowledge Selection Help Retrieval Augmented Generation?Xiangci Li, Jessica Ouyang. 4104-4121 [doi]
- UPLex: Fine-Grained Personality Control in Large Language Models via Unsupervised Lexical ModulationTianlong Li, Wenhao Liu, Muling Wu, Shihan Dou, Zhenghua Wang, Changze Lv, Xiaohua Wang, Xiaoqing Zheng, Xuanjing Huang 0001. 4122-4136 [doi]
- ParetoRAG: Leveraging Sentence-Context Attention for Robust and Efficient Retrieval-Augmented GenerationRuobing Yao, Yifei Zhang, Shuang Song, Yuhan Liu, Neng Gao, Chenyang Tu. 4137-4151 [doi]
- FlexQuant: A Flexible and Efficient Dynamic Precision Switching Framework for LLM QuantizationFangxin Liu, Zongwu Wang, Jinhong Xia, Junping Zhao, Shouren Zhao, Jinjin Li, Jian Liu, Li Jiang 0002, Haibing Guan. 4152-4161 [doi]
- ReLoop: "Seeing Twice and Thinking Backwards" via Closed-loop Training to Mitigate Hallucinations in Multimodal understandingJianjiang Yang, Yanshu Li, Ziyan Huang. 4162-4179 [doi]
- Sequence Structure Aware Retriever for Procedural Document Retrieval: A New Dataset and BaselineZhenqi Ye, Haopeng Ren, Yi Cai 0001, Qingbao Huang, Jing Qin 0001, Pinli Zhu, Songwen Gong. 4180-4198 [doi]
- The Effect of Language Diversity When Fine-Tuning Large Language Models for TranslationDavid Stap, Christof Monz. 4199-4211 [doi]
- David vs. Goliath: Cost-Efficient Financial QA via Cascaded Multi-Agent ReasoningChenghao Liu, Qian Liu, Ziqin Zhu, Hao Fei, Aniket Mahanti. 4212-4229 [doi]
- Benchmarking Uncertainty Metrics for LLM Target-Aware SearchPei-Fu Guo, Yun-Da Tsai, Shou-de Lin. 4230-4238 [doi]
- ZOGRASCOPE: A New Benchmark for Semantic Parsing over Property GraphsFrancesco Cazzaro, Justin Kleindienst, Sofia Márquez Gomez, Ariadna Quattoni. 4239-4246 [doi]
- FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical ReasoningRuosen Li, Ziming Luo, Xinya Du. 4247-4278 [doi]
- Recipe2Plan: Evaluating Planning Abilities of LLMs for Efficient and Feasible Multitasking with Time Constraints Between ActionsZirui Wu, Xiao Liu 0032, Jiayi Li, Lingpeng Kong, Yansong Feng 0002. 4279-4301 [doi]
- Unlocking the Effectiveness of LoRA-FP for Seamless Transfer Implantation of Fingerprints in Downstream ModelsZhenhua Xu 0004, Zhaokun Yan, Binhan Xu, Xin Tong, Haitao Xu, Yourong Chen, Meng Han. 4302-4312 [doi]
- AELC: Adaptive Entity Linking with LLM-Driven ContextualizationFang Wang, Zhengwei Tao, Ming Wang, Minghao Hu, Xiaoying Bai. 4313-4327 [doi]
- MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning TransferHonglin Lin, Zhuoshi Pan, Qizhi Pei, Xin Gao 0001, Yu Li 0006, Mengzhang Cai, Conghui He, Lijun Wu 0003. 4328-4354 [doi]
- GLProtein: Global-and-Local Structure Aware Protein Representation LearningYunqing Liu, Wenqi Fan, Xiaoyong Wei, Li Qing. 4355-4372 [doi]
- Reward Mixology: Crafting Hybrid Signals for Reinforcement Learning Driven In-Context LearningChangshuo Zhang, Ang Gao, Xiao Zhang, Yong Liu, Deyang Li, Fangchao Liu, Xinyu Zhang. 4373-4383 [doi]
- Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials CharacterizationZhengzhao Lai, Youbin Zheng, Zhenyang Cai, Haonan Lyu, Jingpu Yang, Hongqing Liang, Yan Hu, Benyou Wang. 4384-4404 [doi]
- GRADE: Generating multi-hop QA and fine-gRAined Difficulty matrix for RAG EvaluationJeongsoo Lee, Daeyong Kwon, Kyohoon Jin. 4405-4424 [doi]
- FusionDTI: Fine-grained Binding Discovery with Token-level Fusion for Drug-Target InteractionZhaohan Meng, Zaiqiao Meng, Ke Yuan, Iadh Ounis. 4425-4444 [doi]
- A Survey on Training-free Alignment of Large Language ModelsBirong Pan, Yongqi Li 0002, Weiyu Zhang 0001, Wenpeng Lu, Mayi Xu, Shen Zhou, Yuanyuan Zhu 0001, Ming Zhong 0002, Tieyun Qian. 4445-4461 [doi]
- CIVET: Systematic Evaluation of Understanding in VLMsMassimo Rizzoli, Simone Alghisi, Olha Khomyn, Gabriel Roccabruna, Seyed Mahed Mousavi, Giuseppe Riccardi. 4462-4480 [doi]
- How Does Cognitive Bias Affect Large Language Models? A Case Study on the Anchoring Effect in Price Negotiation SimulationsYoshiki Takenami, Yin Jou Huang, Yugo Murawaki, Chenhui Chu. 4481-4498 [doi]
- Enhancing Speech-to-Speech Dialogue Modeling with End-to-End Retrieval-Augmented GenerationPengchao Feng, Ziyang Ma 0001, Wenxi Chen, Yao Li, Sheng Wang, Kai Yu 0004, Xie Chen 0001. 4499-4507 [doi]
- Backdoor-Powered Prompt Injection Attacks Nullify Defense MethodsYulin Chen, Haoran Li 0003, Yuan Sui 0001, Yangqiu Song, Bryan Hooi. 4508-4527 [doi]
- Path-enhanced Pre-trained Language Model for Knowledge Graph CompletionHao Wang 0163, Dandan Song 0005, Zhijing Wu 0001, Yuhang Tian, Pan Yang. 4528-4540 [doi]
- Zero-shot Cross-lingual NER via Mitigating Language Difference: An Entity-aligned Translation PerspectiveZhihao Zhang, Sophia Yat Mei Lee, Dong Zhang, Shoushan Li, Guodong Zhou. 4541-4557 [doi]
- Zero-Shot Cross-Domain Aspect-Based Sentiment Analysis via Domain-Contextualized Chain-of-Thought ReasoningChuming Shen, Wei Wei, Dong Wang, Zhong-Hao Wang. 4558-4573 [doi]
- Tree of Agents: Improving Long-Context Capabilities of Large Language Models through Multi-Perspective ReasoningSong Yu, Xiaofei Xu, Ke Deng, Li Li, Lin Tian. 4574-4592 [doi]
- Cross-Cultural Transfer of Commonsense Reasoning in LLMs: Evidence from the Arab WorldSaeed Almheiri, Rania Elbadry, Mena Attia, Chenxi Wang, Preslav Nakov, Timothy Baldwin, Fajri Koto. 4593-4614 [doi]
- Enhancing Partially Relevant Video Retrieval with Robust Alignment LearningLong Zhang, Peipei Song, Jianfeng Dong, Kun Li 0008, Xun Yang 0001. 4615-4629 [doi]
- Multi-level Diagnosis and Evaluation for Robust Tabular Feature Engineering with Large Language ModelsYebin Lim, Susik Yoon. 4630-4655 [doi]
- Prejudge-Before-Think: Enhancing Large Language Models at Test-Time by Process Prejudge ReasoningJianing Wang, Jin Jiang, Yang Liu, Mengdi Zhang, Xunliang Cai. 4656-4673 [doi]
- FroM: Frobenius Norm-Based Data-Free Adaptive Model MergingZijian Li 0020, Xiaocheng Feng, Huixin Liu, Yichong Huang, Ting Liu 0001, Bing Qin 0001. 4674-4687 [doi]
- Dynamic Simulation Framework for Disinformation Dissemination and Correction With Social BotsBoyu Qiao, Kun Li, Wei Zhou 0019, Songlin Hu 0001. 4688-4710 [doi]
- Beyond the First Error: Process Reward Models for Reflective Mathematical ReasoningZhaohui Yang, Chenghua He, Xiaowen Shi, Shihong Deng, Linjing Li, Qiyue Yin, Daxin Jiang. 4711-4728 [doi]
- PrAd: Prompt Adaptive Tuning for Decoder-only Language ModelsYouneng Ma, Junyi He, Haojun Fei. 4729-4743 [doi]
- Personalized Question Answering with User Profile Generation and CompressionHang Su, Yun Yang, Tianyang Liu, Xin Liu, Peng Pu, Xuesong Lu. 4744-4763 [doi]
- Dream to Chat: Model-based Reinforcement Learning on Dialogues with User Belief ModelingYue Zhao, Xiaoyu Wang, Dan Wang, Zhonglin Jiang, Qingqing Gu, Teng Chen, Ningyuan Xi, Jinxian Qu, Yong Chen, Luo Ji. 4764-4781 [doi]
- FakeSV-VLM: Taming VLM for Detecting Fake Short-Video News via Progressive Mixture-Of-Experts AdapterJunxi Wang, Yaxiong Wang, Lechao Cheng, Zhun Zhong. 4782-4798 [doi]
- Beyond Inherent Cognition Biases in LLM-Based Event Forecasting: A Multi-Cognition Agentic FrameworkZhen Wang, Xi Zhou, Yating Yang, Bo Ma 0004, Lei Wang 0065, Rui Dong 0002, Azmat Anwar. 4799-4818 [doi]
- Breaking the Reviewer: Assessing the Vulnerability of Large Language Models in Automated Peer Review Under Textual Adversarial AttacksTzu-Ling Lin, Wei-Chih Chen, Teng-Fang Hsiao, Hou-I Liu, Ya-Hsin Yeh, Yu Kai Chan, Wen-Sheng Lien, Po-Yen Kuo, Philip S. Yu, Hong-Han Shuai. 4819-4839 [doi]
- Watermarking with Low-Entropy POS-Guided Token Partitioning and Z-Score-Driven Dynamic Bias for Large Language ModelsHe Li, Xiaojun Chen, Zhendong Zhao, Yunfei Yang, Xin Zhao, Jingcheng He. 4840-4859 [doi]
- Knowledge Graph-Driven Memory Editing with Directional InterventionsJinhu Fu, Kun Wang 0056, Chongye Guo, Junfeng Fang, Wentao Zhang, Sen Su. 4860-4874 [doi]
- DTDES-KGE: Dual-Teacher Knowledge Distillation with Distinct Embedding Spaces for Knowledge Graph EmbeddingsBofan Wei, Hongyuan Xu, Yuhang Niu, Jiarui Ren, Yanlong Wen, Xiaojie Yuan. 4875-4887 [doi]
- LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician ValidationMing Zhang 0030, Yujiong Shen, Zelin Li, Huayu Sha, Binze Hu, Yuhui Wang, Chenhao Huang, Shichun Liu, Jingqi Tong, Changhao Jiang, Mingxu Chai, Zhiheng Xi, Shihan Dou, Tao Gui, Qi Zhang 0001, Xuanjing Huang 0001. 4888-4914 [doi]
- Watermark Smoothing Attacks against Language ModelsHongyan Chang, Hamed Hassani, Reza Shokri. 4915-4941 [doi]
- PICD-Instruct: A Generative Instruction Learning Framework for Few-Shot Multi-Intent Spoken Language UnderstandingWenbin Hua, Rui Fan 0005, Tingting He 0003, Ming Dong 0004. 4942-4956 [doi]
- Forewarned is Forearmed: Pre-Synthesizing Jailbreak-like Instructions to Enhance LLM Safety Guardrail to Potential AttacksSheng Liu, Qiang Sheng, Danding Wang, Yang Liu 0005, Guang Yang, Juan Cao 0001. 4957-4974 [doi]
- Are Knowledge and Reference in Multilingual Language Models Cross-Lingually Consistent?Xi Ai, Mahardika Krisna Ihsani, Min-Yen Kan. 4975-5011 [doi]
- Krikri: Advancing Open Large Language Models for GreekDimitris Roussis, Leon Voukoutis, Georgios Paraskevopoulos, Sokratis Sofianopoulos, Prokopis Prokopidis, Vassilis P. Plagianakos, Athanasios Katsamanis, Stelios Piperidis, Vassilis Katsouros. 5012-5033 [doi]
- Beyond the Scientific Document: A Citation-Aware Multi-Granular Summarization Approach with Heterogeneous GraphsQuoc-An Nguyen, Xuan Hung Le, Thi-Minh-Thu Vu, Hoang-Quynh Le. 5034-5046 [doi]
- Detecting Continuously Evolving Scam Calls under Limited Annotation: A LLM-Augmented Expert Rule FrameworkHaoyu Ma, Qinliang Su, Minhua Huang, Wu Kai. 5047-5068 [doi]
- An Empirical Study of Position Bias in Modern Information RetrievalZiyang Zeng, Dun Zhang, Jiacheng Li, Panxiang Zou, Yudong Zhou, YuQing Yang. 5069-5081 [doi]
- GenPoE: Generative Passage-level Mixture of Experts for Knowledge Enhancement of LLMsXuebing Liu, Shanbao Qiao, Seung-Hoon Na. 5082-5097 [doi]
- CoRanking: Collaborative Ranking with Small and Large Ranking AgentsWenhan Liu, Xinyu Ma, Yutao Zhu 0001, Lixin Su, Shuaiqiang Wang, Dawei Yin 0001, Zhicheng Dou. 5098-5110 [doi]
- HIRAG: Hierarchical-Thought Instruction-Tuning Retrieval-Augmented GenerationYiHan Jiao, ZheHao Tan, Dan Yang 0004, Duolin Sun, Jie Feng, Yue Shen, Jian Wang 0108, Peng Wei. 5111-5130 [doi]
- Towards Personalized Conversational Sales Agents: Contextual User Profiling for Strategic ActionTongyoung Kim, Jeongeun Lee, Soojin Yoon, Sunghwan Kim, Dongha Lee. 5131-5154 [doi]
- WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and RollbackMinda Hu, Tianqing Fang, Jianshu Zhang, Jun-Yu Ma, Zhisong Zhang, Jingyan Zhou, Hongming Zhang 0009, Haitao Mi, Dong Yu 0001, Irwin King. 5155-5173 [doi]
- Interesting Culture: Social Relation Recognition from Videos via Culture De-confoundingYuxuan Zhang, Yangfu Zhu, Haorui Wang, Bin Wu 0001. 5174-5184 [doi]
- ThinkSwitcher: When to Think Hard, When to Think FastGuosheng Liang, Longguang Zhong, Ziyi Yang, Xiaojun Quan. 5185-5201 [doi]
- MaGiX: A Multi-Granular Adaptive Graph Intelligence Framework for Enhancing Cross-Lingual RAGNguyen Manh Hieu, Vu Lam Anh, Hung Pham Van, Nam Le Hai, Ngo Van Linh 0001, Nguyen Thi Ngoc Diep, Thien Huu Nguyen. 5202-5219 [doi]
- LexTime: A Benchmark for Temporal Ordering of Legal EventsClaire Barale, Leslie Barrett, Vikram Sunil Bajaj, Michael Rovatsos. 5220-5236 [doi]
- Beyond the Surface: A Solution-Aware Retrieval Model for Competition-level Code GenerationShiwen Zhang, Lingxiang Wang, Hainan Zhang 0001, Ziwei Wang, Sijia Wen, Zhiming Zheng 0001. 5237-5246 [doi]
- X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Jailbreak Attacks without Compromising UsabilityXiaoya Lu, Dongrui Liu, Yi Yu 0012, Luxin Xu, Jing Shao. 5247-5272 [doi]
- Tag&Tab: Pretraining Data Detection in Large Language Models Using Keyword-Based Membership Inference AttackSagiv Antebi, Edan Habler, Asaf Shabtai, Yuval Elovici. 5273-5286 [doi]
- EcoLANG: Efficient and Effective Agent Communication Language Induction for Social SimulationXinyi Mou, Chen Qian, Wei Liu, Ling Yan, Yao Hu 0002, Xuanjing Huang 0001, Zhongyu Wei. 5287-5304 [doi]
- Revealing the Inherent Instructability of Pre-Trained Language ModelsSeokhyun An, Minji Kim, Hyounghun Kim. 5305-5336 [doi]
- What Media Frames Reveal About Stance: A Dataset and Study about Memes in Climate Change DiscourseShijia Zhou, Siyao Peng, Simon Luebke, Jörg Haßler, Mario Haim, Saif M. Mohammad, Barbara Plank. 5337-5356 [doi]
- Rethinking Personality Assessment from Human-Agent Dialogues: Fewer Rounds May Be Better Than MoreBaiqiao Zhang, Zhifeng Liao, Xiangxian Li, Chao Zhou 0012, Juan Liu 0008, Xiaojuan Ma, Yulong Bian. 5357-5380 [doi]
- TailorRPA: A Retrieval-Based Framework for Eliciting Personalized and Coherent Role-Playing Agents in General DomainZhenpeng Gao, Xiaofen Xing, Xiangmin Xu. 5381-5412 [doi]
- SCE: Semantic Consistency Enhanced Reinforcement Learning for Multi-Hop Knowledge Graph ReasoningYanwen Huang, Yao Liu, Qiao Liu 0003, Rui Hou 0005, Tingting Dai. 5413-5425 [doi]
- ReGraphRAG: Reorganizing Fragmented Knowledge Graphs for Multi-Perspective Retrieval-Augmented GenerationSoohyeong Kim, Seok Jun Hwang, JungHyoun Kim, Jeonghyeon Park, Yong Suk Choi. 5426-5443 [doi]
- GASE: Generatively Augmented Sentence EncodingManuel Frank, Haithem Afli. 5444-5461 [doi]
- The "r" in "woman" stands for rights. Auditing LLMs in Uncovering Social Dynamics in Implicit MisogynyArianna Muti, Chris Emmery, Debora Nozza, Alberto Barrón-Cedeño, Tommaso Caselli. 5462-5479 [doi]
- Fact Verification on Knowledge Graph via Programmatic Graph ReasoningYuanzhen Hao, Desheng Wu. 5480-5495 [doi]
- Agent Trading Arena: A Study on Numerical Understanding in LLM-Based AgentsTianmi Ma, Jiawei Du, Wenxin Huang, Wenjie Wang, Liang Xie, Xian Zhong, Joey Tianyi Zhou. 5496-5514 [doi]
- Why We Feel What We Feel: Joint Detection of Emotions and Their Opinion Triggers in E-commerceArnav Attri, Anuj Attri, Suman Banerjee, Amey Patil, Muthusamy Chelliah, Nikesh Garera, Pushpak Bhattacharyya. 5515-5532 [doi]
- Use Random Selection for Now: Investigation of Few-Shot Selection Strategies in LLM-based Text AugmentationJán Cegin, Branislav Pecher, Jakub Simko, Ivan Srba, Mária Bieliková, Peter Brusilovsky. 5533-5550 [doi]
- BanglaByT5: Byte-Level Modelling for BanglaPramit Bhattacharyya, Arnab Bhattacharya 0001. 5551-5560 [doi]
- XTRA: Cross-Lingual Topic Modeling with Topic and Representation AlignmentsTien-Phat Nguyen, Vu Minh Ngo, Tung Nguyen, Linh Ngo Van 0001, Duc Anh Nguyen, Dinh Viet Sang, Trung Le 0001. 5561-5575 [doi]
- CodeContests+: High-Quality Test Case Generation for Competitive ProgrammingZihan Wang, Siyao Liu, Yang Sun, Ming Ding, Hongyan Li. 5576-5600 [doi]
- SPO: Self Preference Optimization with Self RegularizationYuhao Sun, Yifan Zhang, Quandong Wang, Qinzhuo Wu, Wei Liu, Jian Luan 0001. 5601-5614 [doi]
- Long-context Language Models Fail in Basic Retrieval Tasks Without Sufficient Reasoning StepsYijiong Yu, Zhixiao Qi, Yongfeng Huang 0001, Wei Wang, Weifeng Liu, Ran Chen, Ji Pei. 5615-5634 [doi]
- Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language ModelsBlanca Calvo Figueras, Rodrigo Agerri. 5635-5652 [doi]
- ResearchArena: Benchmarking Large Language Models' Ability to Collect and Organize Information as Research AgentsHao Kang, Chenyan Xiong. 5653-5671 [doi]
- LLMs are Privacy ErasableZipeng Ye, Wenjian Luo. 5672-5692 [doi]
- How Good are LLM-based Rerankers? An Empirical Analysis of State-of-the-Art Reranking ModelsAbdelrahman Abdallah, Bhawna Piryani, Jamshid Mozafari, Mohammed Ali, Adam Jatowt. 5693-5709 [doi]
- DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM DistillationAbdelrahman Abdallah, Jamshid Mozafari, Bhawna Piryani, Adam Jatowt. 5710-5723 [doi]
- CANDY: Benchmarking LLMs' Limitations and Assistive Potential in Chinese Misinformation Fact-CheckingRuiling Guo, Xinwei Yang, Chen Huang 0006, Tong Zhang, Yong Hu. 5724-5758 [doi]
- E-Verify: A Paradigm Shift to Scalable Embedding-based Factuality VerificationZeyang Liu, Jingfeng Xue, Xiuqi Yang, Wenbiao Du, Jiarun Fu, Junbao Chen, Wenjie Guo, Yong Wang 0010. 5759-5776 [doi]
- LLM Jailbreak Detection for (Almost) Free!Guorui Chen, Yifan Xia, Xiaojun Jia, Zhijiang Li, Philip Torr 0001, Jindong Gu. 5777-5807 [doi]
- When to Continue Thinking: Adaptive Thinking Mode Switching for Efficient ReasoningXiaoyun Zhang, Jingqing Ruan, Xing Ma, Yawen Zhu, Haodong Zhao, Hao Li, Jiansong Chen, Ke Zeng, Xunliang Cai. 5808-5828 [doi]
- Plugging Schema Graph into Multi-Table QA: A Human-Guided Framework for Reducing LLM RelianceXixi Wang, Miguel Costa, Jordanka Kovaceva, Shuai Wang, Francisco C. Pereira. 5829-5842 [doi]
- Evolution in Simulation: AI-Agent School with Dual Memory for High-Fidelity Educational DynamicsSheng Jin, Haoming Wang, Zhiqi Gao, Yongbo Yang, Bao Chunjia, Chengliang Wang. 5843-5857 [doi]
- Retrieval-Augmented Machine Translation with Unstructured KnowledgeJiaan Wang, Fandong Meng, Yingxue Zhang 0003, Jie Zhou 0016. 5858-5871 [doi]
- MARS-Bench: A Multi-turn Athletic Real-world Scenario Benchmark for Dialogue EvaluationChenghao Yang, Yinbo Luo, Zhoufutu Wen, Qi Chu 0001, Tao Gong, Longxiang Liu, Kaiyuan Zhang, Jianpeng Jiao, Ge Zhang 0009, Wenhao Huang 0001, Nenghai Yu. 5872-5898 [doi]
- UTMath: A Benchmark for Math Evaluation with Unit TestBo Yang, Qingping Yang, Yingwei Ma, Runtao Liu. 5899-5915 [doi]
- The Green KNIGHT: Green Machine Translation with Knowledge-Distilled, Narrow, Inexpensive, Greedy, Hybrid TransformersAndreas Guta, Frithjof Petrick, Peter Polák. 5916-5931 [doi]
- Constructing Your Model's Value Distinction: Towards LLM Alignment with Anchor Words TuningZhen Yang, Ping Jian, Chengzhi Li, Chenxu Wang, Xinyue Zhang, Wenpeng Lu. 5932-5948 [doi]
- MCiteBench: A Multimodal Benchmark for Generating Text with CitationsCaiyu Hu, Yikai Zhang 0004, Tinghui Zhu, Yiwei Ye, Yanghua Xiao. 5949-5966 [doi]
- Do LLMs Know and Understand Domain Conceptual Knowledge?Sijia Shen, Feiyan Jiang, Peiyan Wang, Yubo Feng, Yuchen Jiang, Chang Liu. 5967-5976 [doi]
- Agent Laboratory: Using LLM Agents as Research AssistantsSamuel Schmidgall, Yusheng Su, Ze Wang 0008, Ximeng Sun, Jialian Wu, Xiaodong Yu, Jiang Liu 0014, Michael Moor, Zicheng Liu 0001, Emad Barsoum. 5977-6043 [doi]
- Retrieval-Augmented Generation with Hierarchical KnowledgeHaoyu Huang, Yongfeng Huang, Junjie Yang, Zhenyu Pan, Yongqiang Chen 0002, Kaili Ma 0001, Hongzhi Chen, James Cheng. 6044-6060 [doi]
- Regularized Contrastive Decoding with Hard Negative Samples for LLM Hallucination MitigationHaonan Sheng, Dou Hu 0001, Lingwei Wei, Wei Zhou 0019, Songlin Hu 0001. 6061-6073 [doi]
- CharacterCraft: Bridging the Literature-Reality Dialogue Gap for Practical Role-Playing AgentsXuyan Yin, Xinran Yang, Zihao Li 0005, Lixin Zou, Chenliang Li 0005. 6074-6106 [doi]
- Drift: Decoding-time Personalized Alignments with Implicit User PreferencesMinbeom Kim, Kang Il Lee, Seongho Joo, Hwaran Lee, Thibaut Thonet, Kyomin Jung. 6107-6126 [doi]
- Discovering Semantic Subdimensions through Disentangled Conceptual RepresentationsYunhao Zhang, Shaonan Wang, Nan Lin, Xinyi Dong, Chong Li, Chengqing Zong. 6127-6144 [doi]
- Identifying Aspects in Peer ReviewsSheng Lu, Ilia Kuznetsov, Iryna Gurevych. 6145-6167 [doi]
- Tree-Structured Non-Autoregressive Decoding for Sequence-to-Sequence Text GenerationPengyu Ji, Yufei Liu, Xiang Hu, Kewei Tu. 6168-6174 [doi]
- Towards More Efficient Post-training via Fourier Domain Adapter FrameworkYijia Fan, Jusheng Zhang, Keze Wang. 6175-6193 [doi]
- KERAG: Knowledge-Enhanced Retrieval-Augmented Generation for Advanced Question AnsweringYushi Sun, Kai Sun 0006, Yifan Ethan Xu, Xiao Yang, Xin Luna Dong, Nan Tang 0001, Lei Chen 0002. 6194-6216 [doi]
- Not All Features Deserve Attention: Graph-Guided Dependency Learning for Tabular Data Generation with Language ModelsZheyu Zhang 0007, Shuo Yang, Bardh Prenkaj, Gjergji Kasneci. 6217-6242 [doi]
- CCG: Rare-Label Prediction via Neural SEM-Driven Causal GameYijia Fan, Jusheng Zhang, Kaitong Cai, Jing Yang, Keze Wang. 6243-6256 [doi]
- Multimodal Emotion Recognition in Conversations: A Survey of Methods, Trends, Challenges and ProspectsChengyan Wu, Yiqiang Cai, Yang Liu 0004, Pengxu Zhu, Yun Xue, Ziwei Gong, Julia Hirschberg, Bolei Ma. 6257-6274 [doi]
- When Allies Turn Foes: Exploring Group Characteristics of LLM-Based Multi-Agent Collaborative Systems Under Adversarial AttacksJiahao Zhang, Baoshuo Kan, Tao Gong 0001, Fu Lee Wang, Tianyong Hao. 6275-6300 [doi]
- EditID: Training-Free Editable ID Customization for Text-to-Image GenerationGuandong Li, Zhaobin Chu. 6301-6319 [doi]
- OSC: Cognitive Orchestration through Dynamic Knowledge Alignment in Multi-Agent LLM CollaborationJusheng Zhang, Yijia Fan, Kaitong Cai, Xiaofei Sun, Keze Wang. 6320-6337 [doi]
- VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction FormatYueqian Wang, Xiaojun Meng, Yuxuan Wang 0004, Jianxin Liang, Jiansheng Wei, Huishuai Zhang, Dongyan Zhao 0001. 6338-6359 [doi]
- To Answer or Not to Answer (TAONA): A Robust Textual Graph Understanding and Question Answering ApproachYuchen Yan, Aakash Kolekar, Sahika Genc, Wenju Xu, Edward W. Huang, Anirudh Srinivasan, Mukesh Jain, Qi He 0002, Hanghang Tong. 6360-6376 [doi]
- Understanding Refusal in Language Models with Sparse AutoencodersWei Jie Yeo, Nirmalendu Prakash, Clement Neo, Ranjan Satapathy, Roy Ka-Wei Lee, Erik Cambria. 6377-6399 [doi]
- Where Did That Come From? Sentence-Level Error-Tolerant AttributionOri Ernst, Aviv Slobodkin, Meng Cao 0003, Sihui Wei, Jackie CK Cheung. 6400-6417 [doi]
- Alleviating Performance Degradation Caused by Out-of-Distribution Issues in Embedding-Based RetrievalHaotong Bao, Jianjin Zhang, Qi Chen, Weihao Han, Zhengxin Zeng, Ruiheng Chang, Mingzheng Li, Hao Sun, Weiwei Deng, Feng Sun, Qian Zhang 0001. 6418-6427 [doi]
- Can LLMs Find a Needle in a Haystack? A Look at Anomaly Detection Language ModelingLeslie Barrett, Vikram Sunil Bajaj, Robert J. Kingan. 6428-6435 [doi]
- Beyond Single Frames: Can LMMs Comprehend Implicit Narratives in Comic Strip?Xiaochen Wang, Heming Xia, Jialin Song, Longyu Guan, Qingxiu Dong, Rui Li 0094, Yixin Yang, Yifan Pu, Weiyao Luo, Yiru Wang, Xiangdi Meng, Wenjie Li 0002, Zhifang Sui. 6436-6452 [doi]
- Enhancing Multi-Agent Debate System Performance via Confidence ExpressionZijie Lin, Bryan Hooi. 6453-6471 [doi]
- The Face of Persuasion: Analyzing Bias and Generating Culture-Aware AdsAysan Aghazadeh, Adriana Kovashka. 6472-6500 [doi]
- SIFT: Grounding LLM Reasoning in Contexts via StickersZihao Zeng, Xuyao Huang, Boxiu Li, Zhijie Deng. 6501-6513 [doi]
- When Inverse Data Outperforms: Exploring the Pitfalls of Mixed Data in Multi-Stage Fine-TuningMengyi Deng, Xin Li, Tingyu Zhu, Zhicheng Yang, Zhijiang Guo, Wei Wang. 6514-6523 [doi]
- LUME: LLM Unlearning with Multitask EvaluationsAnil Ramakrishna, Yixin Wan, Xiaomeng Jin, Kai-Wei Chang 0001, Zhiqi Bu, Bhanukiran Vinzamuri, Volkan Cevher, Mingyi Hong 0001, Rahul Gupta 0001. 6524-6535 [doi]
- How do Language Models Generate Slang: A Systematic Comparison between Human and Machine-Generated Slang UsagesSiyang Wu, Zhewei Sun. 6536-6559 [doi]
- Bridging the Dynamic Perception Gap: Training-Free Draft Chain-of-Thought for Dynamic Multimodal Spatial ReasoningSiqu Ou, Hongcheng Liu, Pingjie Wang, Yusheng Liao, Chuan Xuan, Yanfeng Wang, Yu Wang. 6560-6578 [doi]
- MedCOD: Enhancing English-to-Spanish Medical Translation of Large Language Models Using Enriched Chain-of-Dictionary FrameworkMd. Shahidul Salim, Lian Fu, Arav Adikesh Ramakrishnan, Zonghai Yao, Hong Yu 0001. 6579-6597 [doi]
- Chatbot To Help Patients Understand Their HealthWon-Seok Jang, Hieu Tran, Manav Mistry, SaiKiran Gandluri, Yifan Zhang, Sharmin Sultana, Sunjae Kwon, Yuan Zhang, Zonghai Yao, Hong Yu 0001. 6598-6627 [doi]
- A Knapsack by Any Other Name: Presentation impacts LLM performance on NP-hard problemsAlex Duchnowski, Ellie Pavlick, Alexander Koller. 6628-6651 [doi]
- Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language ModelsYeonjun In, Wonjoong Kim, Kanghoon Yoon, SungChul Kim, Mehrab Tanjim, Sangwu Park, Kibum Kim, Chanyoung Park 0001. 6652-6671 [doi]
- Jailbreak Attack Initializations as Extractors of Compliance DirectionsAmit Levi, Rom Himelstein, Yaniv Nemcovsky, Avi Mendelson, Chaim Baskin. 6672-6705 [doi]
- Train Once for All: A Transitional Approach for Efficient Aspect Sentiment Triplet ExtractionXinmeng Hou, Lingyue Fu, Chenhao Meng, Kounianhua Du, Hai Hu 0001. 6706-6719 [doi]
- A Comprehensive Survey on the Trustworthiness of Large Language Models in HealthcareManar Aljohani, Jun Hou, Sindhura Kommu, Xuan Wang. 6720-6748 [doi]
- Self-Correction Makes LLMs Better ParsersZiyan Zhang, Yang Hou 0001, Chen Gong 0004, Zhenghua Li. 6749-6762 [doi]
- Explaining Length Bias in LLM-Based Preference EvaluationsZhengyu Hu, Linxin Song, Jieyu Zhang 0001, Zheyuan Xiao, Tianfu Wang 0002, Zhengyu Chen 0001, Nicholas Jing Yuan, Jianxun Lian, Kaize Ding, Hui Xiong 0001. 6763-6794 [doi]
- Investigating Controversy Framing across Topics on Social MediaMaxwell A. Weinzierl, Sanda M. Harabagiu. 6795-6814 [doi]
- HEAL: Hybrid Enhancement with LLM-based Agents for Text-attributed Hypergraph Self-supervised Representation LearningRuochang Li, Xiao Luo 0001, Zhiping Xiao 0001, Wei Ju 0001, Ming Zhang 0004. 6815-6829 [doi]
- ReMamba: Equip Mamba with Effective Long-Sequence ModelingDanlong Yuan, Jiahao Liu, Bei Li, Huishuai Zhang, Jingang Wang, Xunliang Cai, Dongyan Zhao 0001. 6830-6840 [doi]
- QUITO-X: A New Perspective on Context Compression from the Information Bottleneck TheoryYihang Wang, Xu Huang, Bowen Tian, Yueyang Su, Lei Yu 0012, Huaming Liao, Yixing Fan, Jiafeng Guo, Xueqi Cheng. 6841-6856 [doi]
- Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in TransformersYingyu Liang, Heshan Liu, Zhenmei Shi, Zhao Song 0002, Zhuoyan Xu, Jiale Zhao, Zhen Zhuang. 6857-6894 [doi]
- Mitigating Gender Bias via Fostering Exploratory Thinking in LLMsKangda Wei, Hasnat Md Abdullah, Ruihong Huang. 6895-6917 [doi]
- Beyond the Textual: Generating Coherent Visual Options for MCQsWanqiang Wang, Longzhu He, Wei Zheng. 6918-6935 [doi]
- SafeSwitch: Steering Unsafe LLM Behavior via Internal Activation SignalsPeixuan Han, Cheng Qian 0008, Xiusi Chen, Yuji Zhang 0002, Heng Ji 0001, Denghui Zhang. 6936-6955 [doi]
- MADD: Multi-Agent Drug Discovery OrchestraGleb V. Solovev, Alina B. Zhidkovskaya, Anastasia Orlova, Nina Gubina, Anastasia Vepreva, Rodion Golovinskii, Ilya Tonkii, Ivan Dubrovsky, Ivan Gurev, Dmitry Gilemkhanov, Denis Chistiakov, Timur A. Aliev, Ivan Poddiakov, Galina Zubkova, Ekaterina V. Skorb, Vladimir Vinogradov, Alexander Boukhanovsky, Nikolay O. Nikitin, Andrei Dmitrenko, Anna V. Kaluzhnaya, Andrey V. Savchenko. 6956-6998 [doi]
- PersonaGym: Evaluating Persona Agents and LLMsVinay Samuel, Henry Peng Zou, Yue Zhou, Shreyas Chaudhari, Ashwin Kalyan, Tanmay Rajpurohit, Ameet Deshpande, Karthik R. Narasimhan, Vishvak Murahari. 6999-7022 [doi]
- LM2Protein: A Structure-to-Token Protein Large Language ModelChang Zhou, Yuheng Shan, Pengan Chen, Xiangyu Shi, Zikang Wang, Yanting Li, Jiyue Jiang. 7023-7029 [doi]
- How Well Can Reasoning Models Identify and Recover from Unhelpful Thoughts?Sohee Yang, Sang-Woo Lee, Nora Kassner, Daniela Gottesman, Sebastian Riedel 0001, Mor Geva. 7030-7047 [doi]
- From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information RetrievalDohyeon Lee, Yeonseok Jeong, Seung-won Hwang. 7048-7064 [doi]
- Locate-then-Merge: Neuron-Level Parameter Fusion for Mitigating Catastrophic Forgetting in Multimodal LLMsZeping Yu, Sophia Ananiadou. 7065-7078 [doi]
- Improving Influence-based Instruction Tuning Data Selection for Balanced Learning of Diverse CapabilitiesQirun Dai 0002, Dylan Zhang, Jiaqi W. Ma, Hao Peng. 7079-7102 [doi]
- Diagnosing Moral Reasoning Acquisition in Language Models: Pragmatics and GeneralizationGuangliang Liu, Zimo Qi, Xitong Zhang, Lei Jiang, Kristen Marie Johnson. 7103-7117 [doi]
- Discourse Heuristics For Paradoxically Moral Self-CorrectionGuangliang Liu, Zimo Qi, Xitong Zhang, Kristen Marie Johnson. 7118-7132 [doi]
- Invisible Prompts, Visible Threats: Malicious Font Injection in External Resources for Large Language ModelsJunjie Xiong, Changjia Zhu, Shuhang Lin, Chong Zhang 0006, Yongfeng Zhang 0003, Yao Liu 0007, Lingyao Li. 7133-7147 [doi]
- Turning the Tide: Repository-based Code ReflectionWei Zhang, Jian Yang, Jiaxi Yang, Ya Wang, Zhoujun Li, Zeyu Cui, Binyuan Hui, Junyang Lin. 7148-7164 [doi]
- Reinforcement Learning with Supervised AlignmentJoão Luís Lins, Jia Xu. 7165-7181 [doi]
- EmByte: Decomposition and Compression Learning for Small yet Private NLPShenglan Li, Jia Xu, Mengjiao Zhang. 7182-7201 [doi]
- GUARD: Glocal Uncertainty-Aware Robust Decoding for Effective and Efficient Open-Ended Text GenerationYuanhao Ding, Esteban Garces Arias, Meimingwei Li, Julian Rodemann, Matthias Aßenmacher, Danlu Chen, Gaojuan Fan, Christian Heumann, Chongsheng Zhang. 7202-7226 [doi]
- Efficiently Editing Mixture-of-Experts Models with Compressed ExpertsYifei He, Yang Liu 0124, Chen Liang, Hany Hassan Awadalla. 7227-7238 [doi]
- FinGEAR: Financial Mapping-Guided Enhanced Answer RetrievalYing Li, Mengyu Wang, Miguel De Carvalho, Sotirios Sabanis, Tiejun Ma. 7239-7255 [doi]
- FM2DS: Few-Shot Multimodal Multihop Data Synthesis with Knowledge Distillation for Question AnsweringAmirhossein Abaskohi, Spandana Gella, Giuseppe Carenini, Issam H. Laradji. 7256-7282 [doi]
- SQUARE: Unsupervised Retrieval Adaptation via Synthetic DataJinsung Yoon, Junhao Zeng, Sercan Ö. Arik. 7283-7297 [doi]
- Knowledge-enhanced Multimodal ECG Representation Learning with Arbitrary-Lead InputsChe Liu 0002, Cheng Ouyang, Zhongwei Wan, Haozhe Wang 0002, Wenjia Bai, Rossella Arcucci. 7298-7316 [doi]
- Seeing Race, Feeling Bias: Emotion Stereotyping in Multimodal Language ModelsMahammed Kamruzzaman, Amanda Cercas Curry, Alba Cercas Curry, Flor Miriam Plaza del Arco. 7317-7351 [doi]
- AdaptMerge: Inference Time Adaptive Visual and Language-Guided Token Merging for Efficient Large Multimodal ModelsZahidul Islam, Mrigank Rochan. 7352-7361 [doi]
- Federated Retrieval-Augmented Generation: A Systematic Mapping StudyAbhijit Chakraborty 0004, Chahana Dahal, Vivek Gupta. 7362-7374 [doi]
- A Survey of Pun Generation: Datasets, Evaluations and MethodologiesYuchen Su, Yonghua Zhu, Ruofan Wang, Zijian Huang 0003, Diana Benavides Prado, Michael J. Witbrock. 7375-7395 [doi]
- Evaluating the Robustness and Accuracy of Text Watermarking Under Real-World Cross-Lingual ManipulationsMansour Al Ghanim, Jiaqi Xue, Rochana Prih Hastuti, Mengxin Zheng, Yan Solihin, Qian Lou. 7396-7416 [doi]
- HDiff: Confidence-Guided Denoising Diffusion for Robust Hyper-relational Link PredictionXiangfeng Luo, Ruoxin Zheng, Jianqiang Huang, Hang Yu 0006. 7417-7434 [doi]
- Spotlighter: Revisiting Prompt Tuning from a Representative Mining ViewYutong Gao 0001, Maoyuan Shao, Xinyang Huang, Chuang Zhu, Yu Weng, Xuan Liu 0008, Lijuan Sun, Guoshun Nan. 7435-7449 [doi]
- Offloaded Reasoning: Efficient Inference for Large Language Models via Modular Reasoning and RefinementIshan Jindal, Jayant Taneja, Chandana Badrinath, Vikas Kapur, Sachin Dev Sharma. 7450-7458 [doi]
- Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning EfficiencyChenlong Wang, Yuanning Feng, Dongping Chen, Zhaoyang Chu, Ranjay Krishna, Tianyi Zhou 0001. 7459-7482 [doi]
- Towards Reverse Engineering of Language Models: A SurveyXinpeng Ti, Wentao Ye, Zhifang Zhang, Junbo Zhao 0002, Chang Yao 0001, Lei Feng 0006, Haobo Wang 0001. 7483-7502 [doi]
- LIFTED: Multimodal Clinical Trial Outcome Prediction via Large Language Models and Mixture-of-ExpertsWenhao Zheng, Liaoyaqi Wang, Dongsheng Peng, Hongxia Xu, Yun Li, Hongtu Zhu, Tianfan Fu, Huaxiu Yao. 7503-7517 [doi]
- Addition in Four Movements: Mapping Layer-wise Information Trajectories in LLMsYao Yan. 7518-7532 [doi]
- CoMoE: Contrastive Representation for Mixture-of-Experts in Parameter-Efficient Fine-tuningJinyuan Feng, Chaopeng Wei, Tenghai Qiu, Tianyi Hu, Zhiqiang Pu. 7533-7551 [doi]
- GuiLoMo: Allocating Experts and Ranks for LoRA-MoE via Bilevel Optimization with GuidedSelection VectorsXinrong Chen, Hengyuan Zhang, Yingmin Qiu, Xiao Liang, Ziyue Li, Guanyu Wang, Weiping Li, Tong Mo, Hayden Kwok-Hay So, Ngai Wong 0001. 7552-7567 [doi]
- Rotate, Clip, and Partition: Towards W2A4KV4 Quantization by Integrating Rotation and Learnable Non-uniform QuantizerEuntae Choi, Sumin Song, Woosang Lim, Sungjoo Yoo. 7568-7590 [doi]
- Decoding in Latent Spaces for Efficient Inference in LLM-based RecommendationChengbing Wang, Yang Zhang 0072, Zhicheng Wang, Tianhao Shi, Keqin Bao, Fuli Feng, Tat-Seng Chua. 7591-7603 [doi]
- Forget for Get: A Lightweight Two-phase Gradient Method for Knowledge Editing in Large Language ModelsYanhong Li, Min Yang 0007, Xiping Hu, Chengming Li 0004. 7604-7623 [doi]
- AutoEvolve: Automatically Evolving Queries for Applicable and Scalable Retrieval-Augmented Generation BenchmarkingDing-Chu Zhang, Xiaowen Zhang, Yue Fei, Renjun Hu, Xiao-Wen Yang, Zhi Zhou, Baixuan Li, Yu-Feng Li, Xing Shi, Wei Lin. 7624-7639 [doi]
- Temporal Alignment of Time Sensitive Facts with Activation EngineeringSanjay Govindan, Maurice Pagnucco, Yang Song 0001. 7640-7657 [doi]
- ChronoBias: A Benchmark for Evaluating Temporal Group Bias in the Time-sensitive Knowledge of Large Language ModelsKyungmin Kim, Youngbin Choi, Hyounghun Kim, Dongwoo Kim, Sangdon Park. 7658-7693 [doi]
- MC²: A Minimum-Coverage and Dataset-Agnostic Framework for Compositional Generalization of LLMs on Semantic ParsingZiyao Xu 0001, Zhe Yang 0013, Houfeng Wang. 7694-7706 [doi]
- Learning to Instruct: Fine-Tuning a Task-Aware Instruction Optimizer for Black-Box LLMsYunzhe Qi, Jinjin Tian, Tianci Liu 0003, Ruirui Li 0002, Tianxin Wei, Hui Liu 0031, Xianfeng Tang, Monica Xiao Cheng, Jingrui He. 7707-7733 [doi]
- Enriching Patent Claim Generation with European Patent DatasetLekang Jiang, Chengzu Li, Stefan Goetz. 7734-7751 [doi]
- StepKE: Stepwise Knowledge Editing for Multi-Hop Question AnsweringJaewook Lee 0008, Dahyun Jung, HeuiSeok Lim. 7752-7765 [doi]
- AutoDCWorkflow: LLM-based Data Cleaning Workflow Auto-Generation and BenchmarkLan Li 0002, Liri Fang, Bertram Ludäscher, Vetle I. Torvik. 7766-7780 [doi]
- Hidden Ghost Hand: Unveiling Backdoor Vulnerabilities in MLLM-Powered Mobile GUI AgentsPengzhou Cheng, Haowen Hu, Zheng Wu, Zongru Wu, Tianjie Ju, Daizong Ding, Zhuosheng Zhang 0001, Gongshen Liu. 7781-7805 [doi]
- Scale Down to Speed Up: Dynamic Data Selection for Reinforcement LearningZhuoyue Chen, Jihai Zhang 0001, Ben Liu, Fangquan Lin, Wotao Yin. 7806-7817 [doi]
- Towards Efficient CoT Distillation: Self-Guided Rationale Selector for Better Performance with Fewer RationalesJianzhi Yan, Le Liu, Youcheng Pan, Shiwei Chen, Yang Xiang 0003, Buzhou Tang. 7818-7835 [doi]
- GeoDANO: Geometric VLM with Domain Agnostic Vision EncoderSeunghyuk Cho, Zhenyue Qin, Yang Liu 0249, Youngbin Choi, Seungbeom Lee, Dongwoo Kim 0002. 7836-7851 [doi]
- Leveraging 3D Gaussian for Temporal Knowledge Graph EmbeddingJiang Li, Xiangdong Su, Guanglai Gao. 7852-7865 [doi]
- LLMAP: LLM-Assisted Multi-Objective Route Planning with User PreferencesLiangqi Yuan, Dong-Jun Han, Christopher G. Brinton, Sabine Brunswicker. 7866-7894 [doi]
- ZEBRA: Leveraging Model-Behavioral Knowledge for Zero-Annotation Preference Dataset ConstructionJeesu Jung, Chanjun Park, Sangkeun Jung. 7895-7911 [doi]
- Token Knowledge: A New Perspective For Knowledge in Large Language ModelsJieyong Wang, Chunyao Song, Tingjian Ge. 7912-7926 [doi]
- Adaptive Schema-aware Event Extraction with Retrieval-Augmented GenerationSheng Liang, Hang Lv 0012, Zhihao Wen, Yaxiong Wu 0005, Yongyue Zhang, Hao Wang 0076, Yong Li 0008. 7927-7946 [doi]
- Enhancing Attributed Question Answering using Tailored Progressive Curriculum LearningYuhan Chen, Bowei Zou, Yifan Fan, Yuchong Chen, Shujun Cao, Yu Hong. 7947-7956 [doi]
- REAR: Reinforced Reasoning Optimization for Event Argument Extraction with Relation-Aware SupportJianwen Luo, Yu Hong 0001, Shuai Yang, Jianmin Yao. 7957-7972 [doi]
- COMI-LINGUA: Expert Annotated Large-Scale Dataset for Multitask NLP in Hindi-English Code-MixingRajvee Sheth, Himanshu Beniwal, Mayank Singh 0001. 7973-7992 [doi]
- Nine Ways to Break Copyright Law and Why Our LLM Won't: A Fair Use Aligned Generation FrameworkAakash Sen Sharma, Debdeep Sanyal, Priyansh Srivastava, Sundar Athreya H, Shirish S. Karande, Mohan Kankanhalli, Murari Mandal. 7993-8023 [doi]
- InteractSpeech: A Speech Dialogue Interaction Corpus for Spoken Dialogue ModelYifu Chen, Shengpeng Ji, Ziqing Wang, Hanting Wang, Zhou Zhao 0001. 8024-8033 [doi]
- Enhancing SQL Table Acquisition with Reverse Engineering for Text-to-SQLShixin Liu, Haoyu Xu, Yu Hong. 8034-8041 [doi]
- DynamicKV: Task-Aware Adaptive KV Cache Compression for Long Context LLMsXiabin Zhou, Wenbin Wang, Minyan Zeng, Jiaxian Guo, Xuebo Liu 0002, Li Shen 0008, Min Zhang 0005, Liang Ding 0006. 8042-8057 [doi]
- ASD-iLLM: An Intervention Large Language Model for Autistic Children based on Real Clinical Dialogue Intervention DatasetShuzhong Lai, Chenxi Li, Junhong Lai, Yucun Zhong, Chenyu Yan, Xiang Li, Haifeng Li, Gang Pan, Lin Yao, Yueming Wang. 8058-8079 [doi]
- GDLLM: A Global Distance-aware Modeling Approach Based on Large Language Models for Event Temporal Relation ExtractionJie Zhao, Wanting Ning, Yuxiao Fei, Yubo Feng, Lishuang Li. 8080-8091 [doi]
- More Tokens, Lower Precision: Towards the Optimal Token-Precision Trade-off in KV Cache CompressionJiebin Zhang, Dawei Zhu, Yifan Song 0002, Wenhao Wu, Chuqiao Kuang, Xiaoguang Li, Lifeng Shang, Qun Liu 0001, Sujian Li. 8092-8105 [doi]
- cAST: Enhancing Code Retrieval-Augmented Generation with Structural Chunking via Abstract Syntax TreeYilin Zhang, Xinran Zhao, Zora Zhiruo Wang, Chenyang Yang 0002, Jiayi Wei, Tongshuang Wu. 8106-8116 [doi]
- A Group Fairness Lens for Large Language ModelsGuanqun Bi, Yuqiang Xie, Lei Shen 0001, Yanan Cao 0001. 8117-8139 [doi]
- VLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced Reranking and Noise-injected TrainingZhanpeng Chen, Chengjin Xu, Yiyan Qi, Xuhui Jiang, Jian Guo. 8140-8158 [doi]
- Rethinking DPO: The Role of Rejected Responses in Preference MisalignmentJae Hyeon Cho, Junhyeok Oh, Myunsoo Kim, Byung Jun Lee. 8159-8176 [doi]
- Enhancing Recommendation Explanations through User-Centric RefinementJingsen Zhang, Zihang Tian, Xueyang Feng, Xu Chen 0017, Chong Chen. 8177-8191 [doi]
- Distributional Surgery for Language Model ActivationsBao Nguyen, Binh T. Nguyen 0001, Duy Nguyen 0003, Viet Anh Nguyen. 8192-8212 [doi]
- Improving Alignment in LVLMs with Debiased Self-JudgmentSihan Yang 0001, Chenhang Cui, Zihao Zhao, Yiyang Zhou, Weilong Yan, Ying Wei, Huaxiu Yao. 8213-8232 [doi]
- Low-Confidence Gold: Refining Low-Confidence Samples for Efficient Instruction TuningHongyi Cai, Jie Li, Mohammad Mahdinur Rahman, Wenzhen Dong. 8233-8240 [doi]
- Safeguarding Privacy of Retrieval Data against Membership Inference Attacks: Is This Query Too Close to Home?Yujin Choi, YoungJoo Park, Junyoung Byun, Jaewook Lee 0001, Jinseong Park 0001. 8241-8258 [doi]
- Causal-LLM: A Unified One-Shot Framework for Prompt- and Data-Driven Causal Graph DiscoveryAmartya Roy, Devharish N, Shreya Ganguly, Kripabandhu Ghosh. 8259-8279 [doi]
- LRPLAN: A Multi-Agent Collaboration of Large Language and Reasoning Models for Planning with Implicit & Explicit ConstraintsOm Dehlan T. Karthikeyan, Manish Gupta Mausam. 8280-8310 [doi]
- DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning PerspectiveDengyun Peng, Yuhang Zhou, Qiguang Chen, Jinhao Liu, Jingjing Chen, Libo Qin 0001, Wanxiang Che. 8311-8334 [doi]
- Towards Robust Few-Shot Relation Classification: Incorporating Relation Description with AgreementMengting Hu, Jianfeng Wu, Ming Jiang, Yalan Xie, Zhunheng Wang, Rui Ying, Xiaoyi Liu, Ruixuan Xu, Hang Gao, Renhong Cheng. 8335-8349 [doi]
- For a Fistful of Puns: Evaluating a Puns in Multiword Expressions Identification Algorithm Without Dedicated DatasetJulien Bezançon, Gaël Lejeune. 8350-8370 [doi]
- Watermarking for Factuality: Guiding Vision-Language Models Toward Truth via Tri-layer Contrastive DecodingKyungryul Back, Seongbeom Park, Milim Kim, Mincheol Kwon, Sanghyeok Lee, Hyunyoung Lee, Junhee Cho, Seunghyun Park, Jinkyu Kim. 8371-8387 [doi]
- Are the Reasoning Models Good at Automated Essay Scoring?Lui Yoshida. 8388-8394 [doi]
- Rethinking LLM-Based Recommendations: A Personalized Query-Driven Parallel IntegrationDonghee Han 0001, Hwanjun Song, Mun Yong Yi. 8395-8419 [doi]
- RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image GenerationAviv Slobodkin, Hagai Taitelbaum, Yonatan Bitton, Brian Gordon, Michal Sokolik, Nitzan Bitton Guetta, Almog Gueta, Royi Rassin, Dani Lischinski, Idan Szpektor. 8420-8438 [doi]
- What data should I include in my POS tagging training set?Zoey Liu, Masoud Jasbi, Christan Grant, Kenji Sagae, Emily Prud'hommeaux. 8439-8455 [doi]
- AttnComp: Attention-Guided Adaptive Context Compression for Retrieval-Augmented GenerationLvzhou Luo, Yixuan Cao 0001, Ping Luo 0001. 8456-8472 [doi]
- SafeInt: Shielding Large Language Models from Jailbreak Attacks via Safety-Aware Representation InterventionJiaQi Wu, Chen Chen 0012, Chunyan Hou, Xiaojie Yuan. 8473-8488 [doi]
- Staged Knowledge Distillation Through Least-to-Most Prompting: Optimizing Teacher Guidance via Difficulty-Aware TrainingMengxiang Zhang, Lingyuan Liu. 8489-8501 [doi]
- LLM Distillation for Efficient Few-Shot Multiple Choice Question AnsweringPatrick Sutanto, Joan Santoso, Esther Irawati Setiawan, Aji Prasetya Wibawa. 8502-8530 [doi]
- Teaching LLMs to Plan, Not Just Solve: Plan Learning Boosts LLMs Generalization in Reasoning TasksTianlong Wang, Junzhe Chen, Weibin Liao, Xueting Han, Jing Bai. 8531-8545 [doi]
- FedCoT: Federated Chain-of-Thought Distillation for Large Language ModelsTao Fan 0002, WeiJing Chen, Yan Kang 0001, Guoqiang Ma, Hanlin Gu, Yuanfeng Song, Lixin Fan, Qiang Yang 0001. 8546-8557 [doi]
- SalaMAnder: Shapley-based Mathematical Expression Attribution and Metric for Chain-of-Thought ReasoningYue Xin, Chen Shen 0003, Shaotian Yan, Xiaosong Yuan, Yaoming Wang, Xiaofeng Zhang, Chenxi Huang 0004, Jieping Ye. 8558-8577 [doi]
- Representing LLMs in Prompt Semantic Task SpaceIdan Kashani, Avi Mendelson, Yaniv Nemcovsky. 8578-8597 [doi]
- PersLLM: A Personified Training Approach for Large Language ModelsZheni Zeng, Jiayi Chen 0005, Huimin Chen, Yukun Yan, Yuxuan Chen, Zhenghao Liu 0001, Zhiyuan Liu 0001, Maosong Sun 0001. 8598-8617 [doi]
- The Illusion of Randomness: How LLMs Fail to Emulate Stochastic Decision-Making in Rock-Paper-Scissors Games?Zihao Guo, Hongtao Lv, Chaoli Zhang, Yibowen Zhao, Yixin Zhang, LiZhen Cui. 8618-8637 [doi]
- DAPE-BR: Distance-Aware Positional Encoding for Mitigating Object Hallucination in LVLMsMingrui Xie, Tianxiang Xu, Qianhai Tang, Shanming Yao, Xiaofeng Zhang, Junliang Du. 8638-8649 [doi]
- From Confidence to Collapse in LLM Factual RobustnessAlina Fastowski, Bardh Prenkaj, Gjergji Kasneci. 8650-8667 [doi]
- CtrlNews: LLM-based Multi-Agent Controllable News Writing via Knowledge Gravitational FieldYifei Xu, Yingjie Zong, Wang Zhonghua, Sirui Wu, Yuan Rao, Dan Zhang, ShuiGuang Deng. 8668-8705 [doi]
- Joint Enhancement of Relational Reasoning for Long-Context LLMsZhirui Chen, Wei Shen, Jiashui Huang, Ling Shao 0001. 8706-8720 [doi]
- Training Medical QA Models Based on Mixed Rewards from Multiple-Choice and Open-Ended QuestionsYue Qiu, Yujan Ting, Pei-Dong, Terrence Chen, Weijing Huang. 8721-8729 [doi]
- Rethink Rumor Detection in the Era of LLMs: A ReviewChang Yang, Peng Zhang 0002, Jing Zhang, Hui Gao, Changhao Song. 8730-8749 [doi]
- ScholarBench: A Bilingual Benchmark for Abstraction, Comprehension, and Reasoning Evaluation in Academic ContextsDongwon Noh, Donghyeok Koh, Junghun Yuk, Gyuwan Kim, Jae-Yong Lee, Kyungtae Lim, Cheoneum Park. 8750-8782 [doi]
- MAGIC: A Multi-Hop and Graph-Based Benchmark for Inter-Context Conflicts in Retrieval-Augmented GenerationJungyeon Lee, Kangmin Lee, Taeuk Kim. 8783-8803 [doi]
- Align Attention Heads Before Merging Them: An Effective Way for Converting MHA to GQAQingyun Jin, Xiaohui Song, Feng Zhou, Zengchang Qin. 8804-8816 [doi]
- DRBO: Mitigating Short Board Effect via Dynamic Reward Balancing in Multi-reward LLM OptimizationNuo Chen 0002, Yufei Gao, Yongnan Jin, Yan Hu, Anningzhe Gao, Lingyong Yan, Benyou Wang. 8817-8841 [doi]
- Enhancing LLM Knowledge Learning through GeneralizationMingkang Zhu, Xi Chen 0119, Zhongdao Wang, Bei Yu 0001, Hengshuang Zhao, Jiaya Jia. 8842-8855 [doi]
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient Training R1-like Reasoning ModelsMingYang Song, Mao Zheng, Zheng Li, Wenjie Yang, Xuan Luo. 8856-8866 [doi]
- TR-MTEB: A Comprehensive Benchmark and Embedding Model Suite for Turkish Sentence RepresentationsMehmet Selman Baysan, Tunga Gungor. 8867-8887 [doi]
- ImpRAG: Retrieval-Augmented Generation with Implicit QueriesWenzheng Zhang, Xi Victoria Lin, Karl Stratos, Wen-tau Yih, Mingda Chen. 8888-8900 [doi]
- HEAL: A Hypothesis-Based Preference-Aware Analysis FrameworkYifu Huo, Chenglong Wang 0002, Qiren Zhu, Shunjie Xing, Tong Xiao 0001, Chunliang Zhang, Tongran Liu, Jingbo Zhu. 8901-8919 [doi]
- A Survey of Multilingual Reasoning in Language ModelsAkash Ghosh, Debayan Datta, Sriparna Saha 0001, Chirag Agarwal. 8920-8936 [doi]
- CLEAR: A Framework Enabling Large Language Models to Discern Confusing Legal ParagraphsQi Xu, Qian Liu 0012, Hao Fei 0001, Hang Yu 0006, Shuhao Guan, Xiao Wei 0002. 8937-8953 [doi]
- NAP2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from HumanShuo Huang, William MacLean, Xiaoxi Kang, Qiongkai Xu, Zhuang Li 0001, Xingliang Yuan, Gholamreza Haffari, Lizhen Qu. 8954-8970 [doi]
- Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM AgentsLong Li, Weiwen Xu, Jiayan Guo, Ruochen Zhao, Xingxuan Li, Yuqian Yuan, Boqiang Zhang, Yuming Jiang 0007, Yifei Xin, Ronghao Dang, Yu Rong 0001, Deli Zhao, Tian Feng, Lidong Bing. 8971-9004 [doi]
- Unveiling Multimodal Processing: Exploring Activation Patterns in Multimodal LLMs for Interpretability and EfficiencyChuan Wu, Meng Su, Youxuan Fang, ShaoLin Zhu. 9005-9016 [doi]
- Self-Supervised Prompt OptimizationJinyu Xiang, Jiayi Zhang 0017, Zhaoyang Yu, Xinbing Liang, Fengwei Teng, Jinhao Tu, Fashen Ren, Xiangru Tang, Sirui Hong, Chenglin Wu 0001, Yuyu Luo. 9017-9041 [doi]
- Polish-English medical knowledge transfer: A new benchmark and resultsLukasz Grzybowski, Jakub Pokrywka, Michal Ciesiólka, Jeremi Kaczmarek, Marek Kubis. 9042-9063 [doi]
- Hard Negatives, Hard Lessons: Revisiting Training Data Quality for Robust Information Retrieval with LLMsNandan Thakur, Crystina Zhang, Xueguang Ma, Jimmy Lin. 9064-9083 [doi]
- EventRelBench: A Comprehensive Benchmark for Evaluating Event Relation Understanding in Large Language ModelsJie Gong, Biaoshuai Zheng, Qiwang Hu. 9084-9099 [doi]
- S2LPP: Small-to-Large Prompt Prediction across LLMsLiang Cheng, Tianyi Li, Zhaowei Wang 0003, Mark Steedman. 9100-9115 [doi]
- DroidCall: A Dataset for LLM-powered Android Intent InvocationWeikai Xie, Li Zhang 0133, Shihe Wang, Rongjie Yi, Mengwei Xu 0001. 9116-9134 [doi]
- Tool Zero: Training Tool-Augmented LLMs via Pure RL from ScratchYirong Zeng, Xiao Ding, Yutai Hou, Yuxian Wang, Li Du, Juyi Dai, Qiuyang Ding, Duyu Tang, Dandan Tu, Weiwen Liu, Bing Qin 0001, Ting Liu 0001. 9135-9147 [doi]
- INREACT: An Inspire-Then-Reinforce Training Framework For Multimodal GUI AgentYuanlei Wang, Liuzhou Zhang, Haohao Luo, Ying Shen. 9148-9160 [doi]
- Facts Fade Fast: Evaluating Memorization of Outdated Medical Knowledge in Large Language ModelsJuraj Vladika, Mahdi Dhaini, Florian Matthes. 9161-9174 [doi]
- Zero-Shot Privacy-Aware Text Rewriting via Iterative Tree SearchShuo Huang, Xingliang Yuan, Gholamreza Haffari, Lizhen Qu. 9175-9190 [doi]
- KoLEG: On-the-Fly Korean Legal Knowledge Editing with Continuous RetrievalJaehyung Seo, Dahyun Jung, Jaewook Lee 0008, Yongchan Chun, Dongjun Kim, Hwijung Ryu, Donghoon Shin, HeuiSeok Lim. 9191-9217 [doi]
- HARE: an entity and relation centric evaluation framework for histopathology reportsYunsoo Kim, Michal W. S. Ong, Alex Shavick, Honghan Wu, Adam P. Levine. 9218-9233 [doi]
- VeriFastScore: Speeding up long-form factuality evaluationRishanth Rajendhran, Amir Zadeh, Matthew Sarte, Chuan Li, Mohit Iyyer. 9234-9259 [doi]
- B-REASO: A Multi-Level Multi-Faceted Bengali Evaluation Suite for Foundation ModelsMd. Tanzib Hosain, Md. Kishor Morol. 9260-9274 [doi]
- Extracting Conceptual Spaces from LLMs Using Prototype EmbeddingsNitesh Kumar, Usashi Chatterjee, Steven Schockaert. 9275-9298 [doi]
- FC-Attack: Jailbreaking Multimodal Large Language Models via Auto-Generated FlowchartsZiyi Zhang, Zhen Sun 0001, Zongmin Zhang, Jihui Guo, Xinlei He 0001. 9299-9316 [doi]
- Multilingual Data Filtering using Synthetic Data from Large Language ModelsJonas Waldendorf, Barry Haddow, Alexandra Birch, Mateusz Klimaszewski. 9317-9334 [doi]
- SAFE: A Sparse Autoencoder-Based Framework for Robust Query Enrichment and Hallucination Mitigation in LLMsSamir Abdaljalil, Filippo Pallucchini, Andrea Seveso, Hasan Kurban, Fabio Mercorio, Erchin Serpedin. 9335-9346 [doi]
- Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety AlignmentSomnath Banerjee 0002, Sayan Layek, Pratyush Chatterjee, Animesh Mukherjee 0001, Rima Hazra. 9347-9364 [doi]
- LLMs as a synthesis between symbolic and distributed approaches to languageGemma Boleda. 9365-9379 [doi]
- MIND: Towards Immersive Psychological Healing with Multi-Agent Inner DialogueYujia Chen, Changsong Li, Yiming Wang, Tianjie Ju, Qingqing Xiao, Nan Zhang, Zifan Kong, Peng Wang, Binyu Yan. 9380-9413 [doi]
- A Monte-Carlo Sampling Framework For Reliable Evaluation of Large Language Models Using Behavioral AnalysisDavood Wadi, Marc Fredette. 9414-9432 [doi]
- Understanding How Value Neurons Shape the Generation of Specified Values in LLMsYi Su, Jiayi Zhang, Shu Yang 0010, Xinhai Wang, Lijie Hu, Di Wang 0015. 9433-9452 [doi]
- Likelihood Variance as Text Importance for Resampling Texts to Map Language ModelsMomose Oyama, Ryo Kishino, Hiroaki Yamagiwa, Hidetoshi Shimodaira. 9453-9465 [doi]
- Think Twice, Generate Once: Safeguarding by Progressive Self-ReflectionHoang Phan, Victor Li, Qi Lei. 9466-9483 [doi]
- Efficient Integration of External Knowledge to LLM-based World Models via Retrieval-Augmented Generation and Reinforcement LearningChang Yang, Xinrun Wang, Qinggang Zhang, Qi Jiang, Xiao Huang 0001. 9484-9501 [doi]
- Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical JokesTyler Loakman, William Thorne, Chenghua Lin. 9502-9518 [doi]
- Modeling, Evaluating, and Embodying Personality in LLMs: A SurveyIago Alves Brito, Julia Soares Dollis, Fernanda Bufon Färber, Pedro Schindler Freire Brasil Ribeiro, Rafael Teixeira Sousa, Arlindo Rodrigues Galvão Filho. 9519-9532 [doi]
- Benchmarking the Detection of LLMs-Generated Modern Chinese PoetryShanshan Wang 0009, Junchao Wu, Fengying Ye, Derek F. Wong, Jingming Yao, Lidia S. Chao. 9533-9552 [doi]
- Leveraging the Cross-Domain & Cross-Linguistic Corpus for Low Resource NMT: A Case Study On Bhili-Hindi-English Parallel CorpusPooja Singh, Shashwat Bhardwaj, Vaibhav Sharma, Sandeep Kumar. 9553-9579 [doi]
- Creative Preference OptimizationMete Ismayilzada, Antonio Laverghetta Jr., Simone Luchini, Reet Patel, Antoine Bosselut, Lonneke van der Plas, Roger E. Beaty. 9580-9609 [doi]
- Assistant-Guided Mitigation of Teacher Preference Bias in LLM-as-a-JudgeZhuo Liu, Moxin Li, Xun Deng, Qifan Wang 0001, Fuli Feng. 9610-9631 [doi]
- Uplift-RAG: Uplift-Driven Knowledge Preference Alignment for Retrieval-Augmented GenerationChangle Qu, Sunhao Dai, Hengyi Cai, Yiyang Cheng, Jun Xu 0001, Shuaiqiang Wang, Dawei Yin 0001. 9632-9644 [doi]
- Sugar-Coated Poison: Benign Generation Unlocks JailbreakingYuhang Wu, Yu-Jie Xiong, Hao Zhang, Jia-Chen Zhang, Zheng Zhou. 9645-9665 [doi]
- DivScene: Towards Open-Vocabulary Object Navigation with Large Vision Language Models in Diverse ScenesZhaowei Wang 0003, Hongming Zhang 0009, Tianqing Fang, Ye Tian, Yue Yang, Kaixin Ma, Xiaoman Pan, Yangqiu Song, Dong Yu 0001. 9666-9686 [doi]
- Data-scarce Behavior Editing of Language ModelsJoykirat Singh, Subhabrata Dutta, Tanmoy Chakraborty 0002. 9687-9701 [doi]
- FIER: Fine-Grained and Efficient KV Cache Retrieval for Long-context LLM InferenceDongwei Wang, Zijie Liu, Song Wang, Yuxin Ren, Jianing Deng, Jingtong Hu, Tianlong Chen 0001, Huanrui Yang. 9702-9713 [doi]
- SVeritas: Benchmark for Robust Speaker Verification under Diverse ConditionsMassa Baali, Sarthak Bisht, Francisco Teixeira, Kateryna Shapovalenko, Rita Singh, Bhiksha Raj. 9714-9731 [doi]
- CAARMA: Class Augmentation with Adversarial Mixup RegularizationMassa Baali, Xiang Li 0106, Hao Chen 0102, Syed Abdul Hannan, Rita Singh, Bhiksha Raj. 9732-9742 [doi]
- Bringing Pedagogy into Focus: Evaluating Virtual Teaching Assistants' Question-Answering in Asynchronous Learning EnvironmentsLi Siyan, Zhen Xu, Vethavikashini Chithrra Raghuram, Xuanming Zhang, Renzhe Yu, Zhou Yu. 9743-9774 [doi]
- Demystifying Multilingual Reasoning in Process Reward ModelingWeixuan Wang, Minghao Wu, Barry Haddow, Alexandra Birch. 9775-9788 [doi]
- BehaviorSFT: Behavioral Token Conditioning for Health Agents Across the Proactivity SpectrumYubin Kim 0002, Zhiyuan Hu, Hyewon Jeong, Eugene Park, Shuyue Stella Li, Chanwoo Park, Shiyun Xiong, Mingyu Lu, Hyeonhoon Lee, Xin Liu 0034, Daniel McDuff, Cynthia Breazeal, Samir Tulebaev, Hae Won Park 0001. 9789-9817 [doi]
- LaMP-Cap: Personalized Figure Caption Generation With Multimodal Figure ProfilesHo Yin Sam Ng, Edward Hsu, Aashish Anantha Ramakrishnan, Branislav Kveton, Nedim Lipka, Franck Dernoncourt, Dongwon Lee 0001, Tong Yu 0001, SungChul Kim, Ryan A. Rossi, Ting-Hao Kenneth Huang. 9818-9832 [doi]
- Efficient Dynamic Clustering-Based Document Compression for Retrieval-Augmented-GenerationWeitao Li, Xiangyu Zhang, Kaiming Liu, Xuanyu Lei, Weizhi Ma, Yang Liu. 9833-9849 [doi]
- HebID: Detecting Social Identities in Hebrew-language Political TextGuy Mor-Lan, Naama Rivlin-Angert, Yael R. Kaplan, Tamir Sheafer, Shaul R. Shenhav. 9850-9870 [doi]
- Dub-S2ST: Textless Speech-to-Speech Translation for Seamless DubbingJeongsoo Choi, Jaehun Kim, Joon Son Chung. 9871-9881 [doi]
- FinGrAct: A Framework for FINe-GRrained Evaluation of ACTionability in Explainable Automatic Fact-CheckingIslam Eldifrawi, Shengrui Wang, Amine Trabelsi. 9882-9901 [doi]
- What Has Been Lost with Synthetic Evaluation?Alexander Gill, Abhilasha Ravichander, Ana Marasovic. 9902-9945 [doi]
- Bold Claims or Self-Doubt? Factuality Hallucination Type Detection via Belief StateDongyu Zhang 0003, Qingqing Hong, Bingxuan Hou, Jiayi Lin 0008, Chenyang Zhang 0004, Jialin Li, Junli Wang 0001. 9946-9959 [doi]
- Proxy Barrier: A Hidden Repeater Layer Defense Against System Prompt Leakage and JailbreakingPedro Schindler Freire Brasil Ribeiro, Iago Alves Brito, Rafael Teixeira Sousa, Fernanda Bufon Färber, Julia Soares Dollis, Arlindo Rodrigues Galvão Filho. 9960-9975 [doi]
- AraSafe: Benchmarking Safety in Arabic LLMsHamdy Mubarak, Abubakr Mohamed, Majd Hawasly. 9976-9992 [doi]
- Nested Named Entity Recognition as Single-Pass Sequence LabelingAlberto Muñoz-Ortiz, David Vilares 0001, Caio Corro, Carlos Gómez-Rodríguez. 9993-10002 [doi]
- DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate HallucinationsAryo Pradipta Gema, Chen Jin, Ahmed Abdulaal, Tom Diethe, Philip Alexander Teare, Beatrice Alex, Pasquale Minervini, Amrutha Saseendran. 10003-10039 [doi]
- Catch Me If You Can? Not Yet: LLMs Still Struggle to Imitate the Implicit Writing Styles of Everyday AuthorsZhengxiang Wang, Nafis Irtiza Tripto, Solha Park, Zhenzhen Li, Jiawei Zhou. 10040-10055 [doi]
- Fine-Tuning Encoder-Decoder Models with Contrastive Learning for In-Context Distractor GenerationElaf Alhazmi, Quan Z. Sheng, Wei Emma Zhang, Mohammed I. Thanoon, Haojie Zhuang, Behnaz Soltani, Munazza Zaib. 10056-10072 [doi]
- Conflicts in Texts: Data, Implications and ChallengesSiyi Liu, Dan Roth 0001. 10073-10091 [doi]
- Recognizing Limits: Investigating Infeasibility in Large Language ModelsWenbo Zhang 0010, Zihang Xu, Hengrui Cai. 10092-10112 [doi]
- VQA-Augmented Machine Translation with Cross-Modal Contrastive LearningZhihui Zhang, Shiliang Sun, Jing Zhao 0015, Tengfei Song, Hao Yang 0006. 10113-10124 [doi]
- Learning to Describe Implicit Changes: Noise-robust Pre-training for Image Difference CaptioningZixin Guo, Jiayang Sun, Tzu-Jui Julius Wang, Abduljalil Radman, Selen Pehlivan, Min Cao, Jorma Laaksonen. 10125-10145 [doi]
- SOLAR: Serendipity Optimized Language Model Aligned for RecommendationZichen Yuan, Lifan Sun, Yucen Zhuang, Yue Wang, Xinyuan Song 0002, Tianqi Xu, Siyuan Li, Junchen Fu, Youhua Li, Sirui Hong, Jiaqi Chen, Joemon M. Jose, Yongxin Ni. 10146-10169 [doi]
- AIRepr: An Analyst-Inspector Framework for Evaluating Reproducibility of LLMs in Data ScienceQiuhai Zeng, Claire Jin, Xinyue Wang, Yuhan Zheng, Qunhua Li. 10170-10201 [doi]
- MisinfoBench: A Multi-Dimensional Benchmark for Evaluating LLMs' Resilience to MisinformationYe Yang, Donghe Li, Zuchen Li, Fengyuan Li, Jingyi Liu, Li Sun, Qingyu Yang 0003. 10202-10229 [doi]
- Fuzzy Reasoning Chain (FRC): An Innovative Reasoning Framework from Fuzziness to ClarityPing Chen, Xiang Li 0001, Zhaoxiang Liu, Zezhou Chen, Xingpeng Zhang, Huan Hu, Zipeng Wang, Kai Wang 0012, Shuming Shi, Shiguo Lian. 10230-10240 [doi]
- HighMATH: Evaluating Math Reasoning of Large Language Models in Breadth and DepthYan Liu, Minghui Zhang, Bojian Xiong, Yifan Xiao, Yinong Sun, Yating Mei, Longyu Zeng, Jingchao Yang, Yang Wang, Deyi Xiong. 10241-10253 [doi]
- CATCH: A Novel Data Synthesis Framework for High Therapy Fidelity and Memory-Driven Planning Chain of Thought in AI CounselingMingyu Chen 0016, Jingkai Lin, Zhaojie Chu, Xiaofen Xing, Yirong Chen, Xiangmin Xu. 10254-10286 [doi]
- MediVLM: A Vision Language Model for Radiology Report Generation from Medical ImagesDebanjan Goswami, Ronast Subedi, Shayok Chakraborty. 10287-10304 [doi]
- AdDriftBench: A Benchmark for Detecting Data Drift and Label Drift in Short Video AdvertisingYinghao Song, Xiangji Zeng, Shuai Cui, Lu Sun, Zhaowei Liu, Yuan Yuan, Yulu Wang, Hai Zhou, Zhaohan Gong. 10305-10321 [doi]
- NIM: Neuro-symbolic Ideographic Metalanguage for Inclusive CommunicationPrawaal Sharma, Poonam Goyal, Navneet Goyal, Vidisha Sharma. 10322-10340 [doi]
- ViFT: Towards Visual Instruction-Free Fine-tuning for Large Vision-Language ModelsZikang Liu 0001, Kun Zhou 0002, Wayne Xin Zhao, Dawei Gao, Yaliang Li, Ji-Rong Wen. 10341-10366 [doi]
- Do Code Semantics Help? A Comprehensive Study on Execution Trace-Based Information for Code Large Language ModelsJian Wang 0067, Xiaofei Xie, Qiang Hu, Shangqing Liu, Yi Li 0008. 10367-10385 [doi]
- LongWeave: A Long-Form Generation Benchmark Bridging Real-World Relevance and VerifiabilityZikai Xiao, Fei Huang 0002, Jianhong Tu, Jianhui Wei, Wen Ma, Yuxuan Zhou, Jian Wu, Bowen Yu 0002, Zuozhu Liu, Junyang Lin. 10386-10417 [doi]
- XL-Suite: Cross-Lingual Synthetic Training and Evaluation Data for Open-Ended GenerationVivek Iyer, Pinzhen Chen, Ricardo Rei, Alexandra Birch. 10418-10432 [doi]
- Accelerating LLM Reasoning via Early Rejection with Partial Reward ModelingSeyyed Saeid Cheshmi, Azal Ahmad Khan, Xinran Wang, Zirui Liu 0001, Ali Anwar 0001. 10433-10447 [doi]
- CultureSynth: A Hierarchical Taxonomy-Guided and Retrieval-Augmented Framework for Cultural Question-Answer SynthesisXinyu Zhang 0017, Pei Zhang 0011, Shuang Luo, Jialong Tang, Yu Wan 0004, Baosong Yang, Fei Huang 0002. 10448-10467 [doi]
- DesignCLIP: Multimodal Learning with CLIP for Design Patent UnderstandingZhu Wang, Homaira Huda Shomee, Sathya N. Ravi, Sourav Medya. 10468-10490 [doi]
- R3-RAG: Learning Step-by-Step Reasoning and Retrieval for LLMs via Reinforcement LearningYuan Li, Qi Luo, Xiaonan Li, Bufan Li, Qinyuan Cheng, Bo Wang 0084, Yining Zheng, Yuxin Wang 0005, Zhangyue Yin, Xipeng Qiu. 10491-10507 [doi]
- 'Hello, World!': Making GNNs Talk with LLMsSunwoo Kim, Soo Yong Lee, Jaemin Yoo, Kijung Shin. 10508-10526 [doi]
- Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLMDingjie Song, Sicheng Lai, Mingxuan Wang, Shunian Chen, Lichao Sun 0001, Benyou Wang. 10527-10542 [doi]
- NLKI: A Lightweight Natural Language Knowledge Integration Framework for Improving Small VLMs in Commonsense VQA TasksAritra Dutta, Swapnanil Mukherjee, Deepanway Ghosal, Somak Aditya. 10543-10563 [doi]
- Text or Pixels? Evaluating Efficiency and Understanding of LLMs with Visual Text InputsYanhong Li, Zixuan Lan, Jiawei Zhou. 10564-10578 [doi]
- Assessing Socio-Cultural Alignment and Technical Safety of Sovereign LLMsKyubyung Chae, Gihoon Kim, Gyuseong Lee, Taesup Kim, Jaejin Lee, Heejin Kim. 10579-10600 [doi]
- Sample Efficient Alignment Learning With Episodic ControlVan Dai Do, Quan Hung Tran, Ahmed Kirmani, Lu Zhang, Hung Le 0002. 10601-10618 [doi]
- Evaluating Automatic Speech Recognition Systems for Korean Meteorological ExpertsChaeHun Park, Hojun Cho, Jaegul Choo. 10619-10627 [doi]
- 3D-Aware Vision-Language Models Fine-Tuning with Geometric DistillationSeonho Lee, Jiho Choi, Inha Kang, Jiwook Kim, Junsung Park, Hyunjung Shim. 10628-10647 [doi]
- CAPE: Context-Aware Personality Evaluation Framework for Large Language ModelsJivnesh Sandhan, Fei Cheng 0002, Tushar Sandhan, Yugo Murawaki. 10648-10662 [doi]
- AgentThink: A Unified Framework for Tool-Augmented Chain-of-Thought Reasoning in Vision-Language Models for Autonomous DrivingKangan Qian, Sicong Jiang, Yang Zhong, Ziang Luo, Zilin Huang, Tianze Zhu, Kun Jiang 0002, Mengmeng Yang 0001, Zheng Fu, Jinyu Miao, Yining Shi 0002, He Zhe Lim, Li Liu, Tianbao Zhou, Hongyi Wang, Huang Yu, Yifei Hu, Guang Li, Guang Chen, Hao Ye, Lijun Sun 0001, Diange Yang. 10663-10682 [doi]
- Select to Know: An Internal-External Knowledge Self-Selection Framework for Domain-Specific Question AnsweringBolei He, Xinran He, Run Shao, Shanfu Shu, Xianwei Xue, Mingquan Cheng, Haifeng Li, Zhen-Hua Ling. 10683-10703 [doi]
- GenPTQ: Green Post-Training Quantization for Large-Scale ASR Models with Mixed-Precision Bit AllocationBeom-Jin Kang, Hyun Kim. 10704-10718 [doi]
- "Where Does This Strange Smell Come from?": Enabling Conversational Interfaces for Artificial OlfactionXueyi Zhou, Qi Lu, Dong-Kyu Chae. 10719-10745 [doi]
- LightRAG: Simple and Fast Retrieval-Augmented GenerationZirui Guo, Lianghao Xia, Yanhua Yu, Tu Ao, Chao Huang 0001. 10746-10761 [doi]
- Beyond Distribution: Investigating Language Models' Understanding of Sino-Korean MorphemesTaehee Jeon. 10762-10772 [doi]
- Sarcasm-R1: Enhancing Sarcasm Detection through Focused ReasoningQi Yang, Jingjie Zeng, Liang Yang 0003, Kai Ma, Hongfei Lin. 10773-10785 [doi]
- ISACL: Internal State Analyzer for Copyrighted Training Data LeakageGuangwei Zhang, Qisheng Su, Jiateng Liu, Cheng Qian 0008, Yanzhou Pan, Yanjie Fu, Denghui Zhang. 10786-10807 [doi]
- Steering LVLMs via Sparse Autoencoder for Hallucination MitigationZhenglin Hua, Jinghan He, Zijun Yao 0002, Tianxu Han, Haiyun Guo, Yuheng Jia, Junfeng Fang. 10808-10828 [doi]
- On the Perception Bottleneck of VLMs for Chart UnderstandingJunteng Liu, Weihao Zeng, Xiwen Zhang, Yijun Wang, Zifei Shan, Junxian He. 10829-10841 [doi]
- Self-Guided Function Calling in Large Language Models via Stepwise Experience RecallSijia Cui, Aiyao He, Shuai Xu, Hongming Zhang 0003, Yanna Wang, Qingyang Zhang 0004, Yajing Wang, Bo Xu 0002. 10842-10854 [doi]
- Multilingual Generative Retrieval via Cross-lingual Semantic CompressionYuxin Huang 0004, Simeng Wu, Ran Song 0002, Yan Xiang, Yantuan Xian, Shengxiang Gao, Zhengtao Yu 0001. 10855-10866 [doi]
- Towards Multi-Document Question Answering in Scientific Literature: Pipeline, Dataset, and EvaluationHui Huang, Julien Velcin, Yacine Kessaci. 10867-10881 [doi]
- Multilingual Knowledge Graph Completion via Efficient Multilingual Knowledge SharingCunli Mao, Xiaofei Gao, Ran Song 0002, Shizhu He, Shengxiang Gao, Kang Liu 0001, Zhengtao Yu 0001. 10882-10896 [doi]
- Mitigating Attention Localization in Small Scale: Self-Attention Refinement via One-step Belief PropagationNakyung Lee, Yeongoon Kim, Minhae Oh, Suhwan Kim, Jin Woo Koo, Hyewon Jo, Jungwoo Lee. 10897-10912 [doi]
- Imagination and Contemplation: A Balanced Framework for Semantic-Augmented Multimodal Machine TranslationZhuang Yu, Shiliang Sun, Jing Zhao 0015, Tengfei Song, Hao Yang 0006. 10913-10928 [doi]
- NeLLCom-Lex: A Neural-agent Framework to Study the Interplay between Lexical Systems and Language UseYuqing Zhang, Ecesu Ürker, Tessa Verhoef, Gemma Boleda, Arianna Bisazza. 10929-10945 [doi]
- RLMEval: Evaluating Research-Level Neural Theorem ProvingAuguste Poiroux, Antoine Bosselut, Viktor Kuncak. 10946-10957 [doi]
- KaeDe: Progressive Generation of Logical Forms via Knowledge-Aware Question Decomposition for Improved KBQARanran Bu, Jian Cao 0001, Jianqi Gao 0001, Shiyou Qian, Hongming Cai. 10958-10973 [doi]
- Where Fact Ends and Fairness Begins: Redefining AI Bias Evaluation through Cognitive BiasesJen-tse Huang 0001, Yuhang Yan 0002, Linqi Liu, Yixin Wan, Wenxuan Wang 0001, Kai-Wei Chang 0001, Michael R. Lyu. 10974-10993 [doi]
- Equal Truth: Rumor Detection with Invariant Group FairnessJunyi Chen, Mengjia Wu, Qian Liu 0012, Jing Sun, Ying Ding 0001, Yi Zhang 0095. 10994-11007 [doi]
- STEAM: A Semantic-Level Knowledge Editing Framework for Large Language ModelsGeunyeong Jeong, Juoh Sun, Seonghee Lee, Harksoo Kim. 11008-11023 [doi]
- SoT: Structured-of-Thought Prompting Guides Multilingual Reasoning in Large Language ModelsRui Qi, Zhibo Man, Yufeng Chen 0005, Fengran Mo, Jinan Xu, Kaiyu Huang. 11024-11039 [doi]
- How Reliable is Multilingual LLM-as-a-Judge?Xiyan Fu, Wei Liu 0145. 11040-11053 [doi]
- Cognitive-Level Adaptive Generation via Capability-Aware Retrieval and Style AdaptationQingsong Wang, Tao Wu, Wang Lin, Yueying Feng, Gongsheng Yuan, Chang Yao 0001, Jingyuan Chen. 11054-11069 [doi]
- Data Doping or True Intelligence? Evaluating the Transferability of Injected Knowledge in LLMsEssa Jan, Moiz Ali, Muhammad Saram Hassan, Muhammad Fareed Zaffar, Yasir Zaki. 11070-11077 [doi]
- INDOORWORLD: Integrating Physical Task Solving and Social Simulation in A Heterogeneous Multi-Agent EnvironmentDekun Wu, Frederik Brudy, Bang Liu, Yi Wang. 11078-11099 [doi]
- ARXSA: A General Negative Feedback Control Theory in Vision-Language ModelsZeyu Zhang, Tianqi Cheng, Yuki Todo. 11100-11110 [doi]
- Breaking the Attention Trap in Code LLMs: A Rejection Sampling Approach to Enhance Code Execution PredictionXingcheng Ruan, Haoxiang Geng, Yunhui Xia, Bingran Zhao. 11111-11120 [doi]
- HiMATE: A Hierarchical Multi-Agent Framework for Machine Translation EvaluationShijie Zhang, Renhao Li, Songsheng Wang, Philipp Koehn, Min Yang 0007, Derek F. Wong. 11121-11145 [doi]
- ReliableEval: A Recipe for Stochastic LLM Evaluation via Method of MomentsGili Lior, Eliya Habba, Shahar Levy, Avi Caciularu, Gabriel Stanovsky. 11146-11153 [doi]
- From Characters to Tokens: Dynamic Grouping with Hierarchical BPERares Dolga, Lucas Maystre, Tudor Berariu, David Barber. 11154-11162 [doi]
- Auto-SLURP: A Benchmark Dataset for Evaluating Multi-Agent Frameworks in Smart Personal AssistantLei Shen 0002, Xiaoyu Shen. 11163-11174 [doi]
- NER Retriever: Zero-Shot Named Entity Retrieval with Type-Aware EmbeddingsOr Shachar, Uri Katz, Yoav Goldberg, Oren Glickman. 11175-11186 [doi]
- MMATH: A Multilingual Benchmark for Mathematical ReasoningWenyang Luo, Wayne Xin Zhao, Jing Sha, Shijin Wang 0001, Ji-Rong Wen. 11187-11202 [doi]
- MultiClaimNet: A Massively Multilingual Dataset of Fact-Checked Claim ClustersRrubaa Panchendrarajan, Rubén Míguez Pérez, Arkaitz Zubiaga. 11203-11215 [doi]
- DS-MHP: Improving Chain-of-Thought through Dynamic Subgraph-Guided Multi-Hop PathYongqiang Liu, Qiyao Peng, Binrong Liu, Hongtao Liu, Xuewei Li, Wenjun Wang. 11216-11230 [doi]
- LongTail-Swap: benchmarking language models' abilities on rare wordsRobin Algayres, Charles-Éric Saint-James, Mahi Luthra, Jiayi Shen, Youssef Benchekroun, Dongyan Lin, Rashel Moritz, Juan Pino 0001, Emmanuel Dupoux. 11231-11251 [doi]
- TF-Mamba: Text-enhanced Fusion Mamba with Missing Modalities for Robust Multimodal Sentiment AnalysisXiang Li 0117, Xianfu Cheng, Dezhuang Miao, Xiaoming Zhang, Zhoujun Li 0001. 11252-11267 [doi]
- Are Economists Always More Introverted? Analyzing Consistency in Persona-Assigned LLMsManon Reusens, Bart Baesens, David Jurgens. 11268-11287 [doi]
- Can you SPLICE it together? A Human Curated Benchmark for Probing Visual Reasoning in VLMsMohamad Ballout, Okajevo Wilfred, Seyedalireza Yaghoubi, Nohayr Abdelmoneim, Julius Mayer 0001, Elia Bruni. 11288-11309 [doi]
- On the Effectiveness of Prompt-Moderated LLMs for Math Tutoring at the Tertiary LevelSebastian Steindl, Fabian Brunner, Nada Sissouno, Dominik Schwagerl, Florian Schöler-Niewiera, Ulrich Schäfer 0001. 11310-11323 [doi]
- SkewRoute: Training-Free LLM Routing for Knowledge Graph Retrieval-Augmented Generation via Score Skewness of Retrieved ContextHairu Wang 0002, Yuan Feng, Yukun Cao, Xike Xie, S. Kevin Zhou. 11324-11340 [doi]
- Acquiescence Bias in Large Language ModelsDaniel Braun. 11341-11355 [doi]
- Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia GamesNiv Eckhaus, Uri Berger, Gabriel Stanovsky. 11356-11368 [doi]
- How Sampling Affects the Detectability of Machine-written texts: A Comprehensive StudyMatthieu Dubois, François Yvon, Pablo Piantanida. 11369-11387 [doi]
- An Improved, Strong Baseline for Pre-Trained Large Language Models as Task-Oriented Dialogue SystemsSebastian Steindl, André Kestler, Ulrich Schäfer 0001, Bernd Ludwig. 11388-11398 [doi]
- MATCH: Task-Driven Code Evaluation through Contrastive LearningMarah Ghoummaid, Vladimir Tchuiev, Ofek Glick, Michal Moshkovitz, Dotan Di Castro. 11399-11414 [doi]
- Evaluating Large Language Models for Cross-Lingual RetrievalLongfei Zuo, Pingjun Hong, Oliver Kraus, Barbara Plank, Robert Litschko. 11415-11429 [doi]
- SGCD: Subtask-Guided Causal-Debiasing Framework for Robust Cross-Utterance Sentiment Quadruple Extraction in DialoguesXiang Li, Keyu Yao, Gang Shen. 11430-11440 [doi]
- FaMTEB: Massive Text Embedding Benchmark in Persian LanguageErfan Zinvandi, Morteza Alikhani, Mehran Sarmadi, Zahra Pourbahman, Sepehr Arvin, Reza Kazemi, Arash Amini. 11441-11468 [doi]
- Leveraging High-Resource English Corpora for Cross-lingual Domain Adaptation in Low-Resource Japanese Medicine via Continued Pre-trainingKazuma Kobayashi, Zhen Wan, Fei Cheng 0002, Tsuta Yuma, Xin Zhao, Junfeng Jiang, Jiahao Huang, Zhiyi Huang, Yusuke Oda, Rio Yokota, Yuki Arase, Daisuke Kawahara, Akiko Aizawa, Sadao Kurohashi. 11469-11488 [doi]
- Structure Trumps Size: Rethinking Data Quality for LLM ReasoningHu Xu, Zeyan Li, Rui Wang, Jianfeng Xu. 11489-11513 [doi]
- A Zero-Shot Neuro-Symbolic Approach for Complex Knowledge Graph Question AnsweringPrerna Agarwal, Srikanta Bedathur. 11514-11527 [doi]
- Making Every Step Effective: Jailbreaking Large Vision-Language Models Through Hierarchical KV EqualizationShuyang Hao, Yiwei Wang 0001, Bryan Hooi, Jun Liu 0036, Muhao Chen 0001, Zi Huang, Yujun Cai. 11528-11543 [doi]
- MT-Mol: Multi Agent System with Tool-based Reasoning for Molecular OptimizationHyomin Kim, Yunhui Jang, Sungsoo Ahn. 11544-11573 [doi]
- A Survey on LLM-powered Agents for Recommender SystemsQiyao Peng 0001, Hongtao Liu 0008, Hua Huang, Jian Yang 0030, Qing Yang 0033, Minglai Shao 0001. 11574-11583 [doi]
- Efficiently Selecting Response Generation Strategies for Synthetic Data Construction by Self-Aligned PerplexityXuan Ren, Qi Chen 0014, Lingqiao Liu. 11584-11605 [doi]
- Benchmarking for Domain-Specific LLMs: A Case Study on Academia and BeyondRubing Chen, Jiaxin Wu 0001, Jian Wang 0054, Xulu Zhang, Wenqi Fan, Chenghua Lin, Xiaoyong Wei, Li Qing. 11606-11619 [doi]
- FrameEOL: Semantic Frame Induction using Causal Language ModelsChihiro Yano, Kosuke Yamada, Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda 0003. 11620-11632 [doi]
- CaTER: A Framework for Context-aware Topology Entity Retrieval Contrastive Learning in End-to-End Task-Oriented Dialogue SystemsDi Wu Hebeu, Zhizhi Yu. 11633-11648 [doi]
- Attribution and Application of Multiple Neurons in Multimodal Large Language ModelsFeiyu Wang, Ziran Zhao, Dong Yu, Pengyuan Liu. 11649-11662 [doi]
- When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQAElisei Rykov, Kseniia Petrushina, Maksim Savkin, Valerii Olisov, Artem Vazhentsev, Kseniia Titova, Alexander Panchenko, Vasily Konovalov, Julia Belikova. 11663-11682 [doi]
- Unraveling Misinformation Propagation in LLM ReasoningYiyang Feng, Yichen Wang, Shaobo Cui 0006, Boi Faltings, Mina Lee 0002, Jiawei Zhou. 11683-11707 [doi]
- RAISE: Reinforced Adaptive Instruction Selection For Large Language ModelsQingsong Lv, Yangning Li, Zihua Lan, Zishan Xu, Jiwei Tang, Tingwei Lu, Yinghui Li, Wenhao Jiang, Hong-Gee Kim, Hai-Tao Zheng, Philip S. Yu. 11708-11723 [doi]
- Teaching According to Talents! Instruction Tuning LLMs with Competence-Aware Curriculum LearningYangning Li, Tingwei Lu, Yinghui Li, Yankai Chen, Wei-Chieh Huang, Wenhao Jiang, Hui Wang, Hai-Tao Zheng, Philip S. Yu. 11724-11741 [doi]
- Let Them Down Easy! Contextual Effects of LLM Guardrails on User Perceptions and PreferencesMingqian Zheng, Wenjia Hu, Patrick Zhao, Motahhare Eslami, Jena D. Hwang, Faeze Brahman, Carolyn Rose, Maarten Sap. 11742-11772 [doi]
- From Hypothesis to Publication: A Comprehensive Survey of AI-Driven Research Support SystemsZekun Zhou, Xiaocheng Feng, Lei Huang 0021, Xiachong Feng, Ziyun Song, Ruihan Chen, Liang Zhao, Weitao Ma, Yuxuan Gu 0004, Baoxin Wang, Dayong Wu, Guoping Hu, Ting Liu 0001, Bing Qin 0001. 11773-11803 [doi]
- Enhancing Model Privacy in Federated Learning with Random Masking and QuantizationZhibo Xu, Jianhao Zhu, Jingwen Xu, Changze Lv, Zhenghua Wang, Zisu Huang, Xiaohua Wang, Muling Wu, Qi Qian, Xiaoqing Zheng, Xuanjing Huang 0001. 11804-11816 [doi]
- SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation LearningMingsheng Cai, Jiuming Jiang, Wenhao Huang, Che Liu 0002, Rossella Arcucci. 11817-11844 [doi]
- Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring TechniquePala Tej Deep, Vernon Toh, Rishabh Bhardwaj, Soujanya Poria. 11845-11860 [doi]
- Do What? Teaching Vision-Language-Action Models to Reject the ImpossibleWen-Han Hsieh, Elvis Hsieh, Dantong Niu, Trevor Darrell, Roei Herzig, David M. Chan. 11861-11869 [doi]
- AgentInit: Initializing LLM-based Multi-Agent Systems via Diversity and Expertise Orchestration for Effective and Efficient CollaborationChunhao Tian, Yutong Wang, Xuebo Liu 0002, Zhexuan Wang, Liang Ding 0006, Miao Zhang 0037, Min Zhang 0005. 11870-11902 [doi]
- Time to Revisit Exact MatchAuss Abbood, Zaiqiao Meng, Nigel Collier. 11903-11926 [doi]
- LongTableBench: Benchmarking Long-Context Table Reasoning across Real-World Formats and DomainsLiyao Li, Jiaming Tian, Hao Chen 0102, Wentao Ye, Chao Ye 0002, Haobo Wang 0001, Ningtao Wang, Xing Fu, Gang Chen 0001, Junbo Zhao 0002. 11927-11965 [doi]
- Exploring and Evaluating Multimodal Knowledge Reasoning Consistency of Multimodal Large Language ModelsBoyu Jia, Junzhe Zhang 0004, Huixuan Zhang, Xiaojun Wan 0001. 11966-11981 [doi]
- MPTA: MultiTask Personalization AssessmentMatthieu Tehenan, Eric Chamoun, Andreas Vlachos 0001. 11982-11992 [doi]
- Semantic Geometry of Sentence EmbeddingsMatthieu Tehenan. 11993-12004 [doi]
- ReAlign: Structured Revision for Small Language Model AlignmentRuijun Chen 0001, Jiajian Guo, Hongzhan Chen, Fanqi Wan, Qifan Wang 0001, Xiaojun Quan. 12005-12020 [doi]
- Curr-ReFT: Overcoming Training Bottlenecks in Small-scale Vision-Language Models via Curriculum Reinforcement FinetuningHuilin Deng, Ding Zou, Xinghao Zhao, Rui Ma 0011, Yanming Guo, Yang Cao 0010, Yu Kang 0001. 12021-12032 [doi]
- Layer-Aware Task Arithmetic: Disentangling Task-Specific and Instruction-Following KnowledgeYan-Lun Chen, Yi-Ru Wei, Chia-Yi Hsu, Chi-Yu Li 0001, Chun-Ying Huang, Ying-Dar Lin, Yu-Sung Wu, Wei-Bin Lee. 12033-12054 [doi]
- Revisiting Pruning vs Quantization for Small Language ModelsZihan Zhou, Simon Kurz, Zhixue Zhao. 12055-12070 [doi]
- CLaw: Benchmarking Chinese Legal Knowledge in Large Language Models - A Fine-grained Corpus and Reasoning AnalysisXinzhe Xu, Liang Zhao, Hongshen Xu, Chen Chen. 12071-12103 [doi]
- polyBART: A Chemical Linguist for Polymer Property Prediction and Generative DesignAnagha Savit, Harikrishna Sahu, Shivank Shukla, Wei Xiong, Rampi Ramprasad. 12104-12119 [doi]
- A Survey of RAG-Reasoning Systems in Large Language ModelsYangning Li, Weizhi Zhang, Yuyao Yang, Wei-Chieh Huang, Yaozu Wu, Junyu Luo 0002, Yuanchen Bei, Henry Peng Zou, Xiao Luo 0001, Yusheng Zhao, Chunkit Chan, Yankai Chen 0001, Zhongfen Deng, Yinghui Li, Hai-Tao Zheng, Dongyuan Li, Renhe Jiang, Ming Zhang, Yangqiu Song, Philip S. Yu. 12120-12145 [doi]
- REGen: A Reliable Evaluation Framework for Generative Event Argument ExtractionOmar Sharif, Joseph Gatto, Madhusudan Basak, Sarah Masud Preum. 12146-12168 [doi]
- Mitigating Interviewer Bias in Multimodal Depression Detection: An Approach with Adversarial Learning and Contextual Positional EncodingEnshi Zhang, Christian Poellabauer. 12169-12188 [doi]
- AMIA: Automatic Masking and Joint Intention Analysis Makes LVLMs Robust Jailbreak DefendersYuqi Zhang, Yuchun Miao, Zuchao Li, Liang Ding. 12189-12199 [doi]
- Disentangling Language Understanding and Reasoning Structures in Cross-lingual Chain-of-Thought PromptingKhanh-Tung Tran, Nguyet-Hang Vu, Barry O'Sullivan, Hoang D. Nguyen. 12200-12206 [doi]
- MoRoVoc: A Large Dataset for Geographical Variation Identification of the Spoken Romanian LanguageAndrei-Marius Avram, Ema-Ioana Banescu, Anda-Teodora Robea, Dumitru-Clementin Cercel, Mihaela-Claudia Cercel. 12207-12216 [doi]
- Language-Informed Synthesis of Rational Agent Models for Grounded Theory-of-Mind Reasoning On-the-flyLance Ying, Ryan Truong, Katherine M. Collins, Cedegao E. Zhang, Megan Wei, Tyler Brooke-Wilson, Tan Zhi-Xuan, Lionel Wong, Joshua B. Tenenbaum. 12217-12235 [doi]
- MOLE: Metadata Extraction and Validation in Scientific Papers Using LLMsZaid Alyafeai, Maged Saeed AlShaibani, Bernard Ghanem. 12236-12264 [doi]
- MASSV: Multimodal Adaptation and Self-Data Distillation for Speculative Decoding of Vision-Language ModelsMugilan Ganesan, Shane Segal, Ankur Aggarwal, Nish Sinnadurai, Sean Lie, Vithursan Thangarasa. 12265-12276 [doi]
- FESTA: Functionally Equivalent Sampling for Trust Assessment of Multimodal LLMsDebarpan Bhattacharya, Apoorva Kulkarni, Sriram Ganapathy. 12277-12295 [doi]
- ClaimGen-CN: A Large-scale Chinese Dataset for Legal Claim GenerationSiying Zhou, Yiquan Wu 0001, Hui Chen 0023, Xueyu Hu, Kun Kuang 0001, Adam Jatowt, Chunyan Zheng, Fei Wu 0001. 12296-12323 [doi]
- Summarize-Exemplify-Reflect: Data-driven Insight Distillation Empowers LLMs for Few-shot Tabular ClassificationYifei Yuan 0002, Jiatong Li, Weijia Zhang 0004, Mohammad Aliannejadi, Evangelos Kanoulas, Renjun Hu. 12324-12348 [doi]
- Rethinking LLM Uncertainty: A Multi-Agent Approach to Estimating Black-Box Model UncertaintyYu Feng 0013, Phu Mon Htut, Zheng Qi, Wei Xiao, Manuel Mager, Nikolaos Pappas 0004, Kishaloy Halder, Yang Li 0150, Yassine Benajiba, Dan Roth 0001. 12349-12375 [doi]
- Stress-Testing the Reasoning Competence of Language Models With Formal ProofsKonstantine Arkoudas, Serafim Batzoglou. 12376-12394 [doi]
- Topic-Guided Reinforcement Learning with LLMs for Enhancing Multi-Document SummarizationChuyuan Li, Austin Xu, Shafiq Joty, Giuseppe Carenini. 12395-12412 [doi]
- FACTCHECKMATE: Preemptively Detecting and Mitigating Hallucinations in LMsDeema Alnuhait, Neeraja Kirtane, Muhammad Khalifa, Hao Peng 0009. 12413-12428 [doi]
- Dialectal Toxicity Detection: Evaluating LLM-as-a-Judge Consistency Across Language VarietiesFahim Faisal, Md Mushfiqur Rahman, Antonios Anastasopoulos. 12429-12452 [doi]
- Mitigate One, Skew Another? Tackling Intersectional Biases in Text-to-Image ModelsPushkar Shukla, Aditya Chinchure, Emily Diana, Alexander Tolbert, Kartik Hosanagar, Vineeth N. Balasubramanian, Leonid Sigal, Matthew A. Turk. 12453-12472 [doi]
- Language-Specific Layer Matters: Efficient Multilingual Enhancement for Large Vision-Language ModelsYuchun Fan, Yilin Wang, Yongyu Mu, Lei Huang, Bei Li, Xiaocheng Feng, Tong Xiao 0001, Jingbo Zhu. 12473-12500 [doi]
- InfAL: Inference Time Adversarial Learning for Improving Research IdeationSikun Guo, Amir Hassan Shariatmadari, Peng Wang, Albert Huang, Aidong Zhang 0001. 12501-12522 [doi]
- Speculative Decoding for Multi-Sample InferenceYiwei Li 0001, Jiayi Shi, Shaoxiong Feng, Peiwen Yuan, Xinglin Wang, Yueqi Zhang, Ji Zhang 0017, Chuyi Tan, Boyuan Pan, Yao Hu 0002, Kan Li 0001. 12523-12533 [doi]
- LSRL: Process-Supervised GRPO on Latent Recurrent States Improves Mathematical ReasoningHangliang Ren. 12534-12545 [doi]
- Multi-token Mask-filling and Implicit Discourse RelationsMeinan Liu, Yunfang Dong, Xixian Liao, Bonnie Webber. 12546-12560 [doi]
- Schema Generation for Large Knowledge Graphs Using Large Language ModelsBohui Zhang, Yuan He, Lydia Pintscher, Albert Meroño-Peñuela, Elena Simperl. 12561-12580 [doi]
- MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree SearchYunhai Hu, Yilun Zhao 0001, Chen Zhao 0013, Arman Cohan. 12581-12597 [doi]
- What if Othello-Playing Language Models Could See?Xinyi Chen, Yifei Yuan 0002, Jiaang Li 0002, Serge J. Belongie, Maarten de Rijke, Anders Søgaard. 12598-12609 [doi]
- LLM-Based Web Data Collection for Research Dataset CreationThomas Berkane, Marie-Laure Charpignon, Maimuna S. Majumder. 12610-12622 [doi]
- PsyScam: A Benchmark for Psychological Techniques in Real-World ScamsShang Ma, Tianyi Ma, Jiahao Liu 0005, Wei Song, Zhenkai Liang, Xusheng Xiao, Yanfang Ye 0001. 12623-12637 [doi]
- LoRaDA: Low-Rank Direct Attention Adaptation for Efficient LLM Fine-tuningZhangming Li, Qinghao Hu, Yiqun Chen, Peisong Wang, Yifan Zhang, Jian Cheng. 12638-12655 [doi]
- Inductive Reasoning on Few-Shot Knowledge Graphs with Task-Aware Language ModelsCheng Yan, Feng Zhao, Ruilin Zhao, Hong Zhang. 12656-12666 [doi]
- ForestCast: Open-Ended Event Forecasting with Semantic News ForestZi Yu, Shaoxiang Wang, Guozheng Li 0002, Yu Zhang 0043, Chi Harold Liu. 12667-12681 [doi]
- Agentic Medical Knowledge Graphs Enhance Medical Question Answering: Bridging the Gap Between LLMs and Evolving Medical KnowledgeMohammad R. Rezaei, Reza Saadati Fard, Jayson Parker, Rahul G. Krishnan, Milad Lankarany. 12682-12701 [doi]
- Text Anomaly Detection with Simplified Isolation KernelYang Cao 0019, Sikun Yang, Yu-Jiu Yang 0001, Lianyong Qi, Ming Liu. 12702-12713 [doi]
- Idola Tribus of AI: Large Language Models tend to perceive order where none existsShin-nosuke Ishikawa, Masato Todo, Taiki Ogihara, Hirotsugu Ohba. 12714-12727 [doi]
- Thunder-DeID: Accurate and Efficient De-identification Framework for Korean Court JudgmentsSungeun Hahm, Heejin Kim, Gyuseong Lee, Hyunji M. Park, Jaejin Lee. 12728-12755 [doi]
- Multi-Agent Autonomous Driving Systems with Large Language Models: A Survey of Recent Advances, Resources, and Future DirectionsYaozu Wu, Dongyuan Li, Yankai Chen 0001, Renhe Jiang, Henry Peng Zou, Wei-Chieh Huang, Yangning Li, Liancheng Fang, Zhen Wang 0004, Philip S. Yu. 12756-12773 [doi]
- Comprehensive Evaluation on Lexical Normalization: Boundary-Aware Approaches for Unsegmented LanguagesShohei Higashiyama, Masao Utiyama. 12774-12799 [doi]
- Explainable Text Classification with LLMs: Enhancing Performance through Dialectical Prompting and Explanation-Guided TrainingHuaming Du, Lei Yuan, Cancan Feng, Guisong Liu, Gang Kou, Carl Yang 0001. 12800-12816 [doi]
- MultiPL-MoE: Multi-Programming-Lingual Extension of Large Language Models through Hybrid Mixture-of-ExpertsQing Wang, Xue Han 0018, Jiahui Wang, Lehao Xing, Qian Hu, Lianlian Zhang, Chao Deng, Junlan Feng. 12817-12828 [doi]
- AutoSpec: An Agentic Framework for Automatically Drafting Patent SpecificationRyan Shea, Zhou Yu. 12829-12840 [doi]
- LimaCost: Data Valuation for Instruction Tuning of Large Language ModelsHyeonseok Moon, Jaehyung Seo, Seonmin Koo, Jinsung Kim, Young-kyoung Ham, Jiwon Moon, HeuiSeok Lim. 12841-12854 [doi]
- Two Challenges, One Solution: Robust Multimodal Learning through Dynamic Modality Recognition and EnhancementLanxin Bi, Yunqi Zhang, Luyi Wang, Yake Niu, Hui Zhao. 12855-12867 [doi]
- SwiftPrune: Hessian-Free Weight Pruning for Large Language ModelsYuhan Kang, Yang Shi 0008, Mei Wen, Jun He, Jianchao Yang, Zeyu Xue, Jing Feng, Xinwang Liu. 12868-12879 [doi]
- Training LLMs for Optimization Modeling via Iterative Data Synthesis and Structured ValidationYang Wu, Yifan Zhang, Yurong Wu, Yuran Wang, Junkai Zhang, Jian Cheng. 12880-12896 [doi]
- Exploiting Prompt-induced Confidence for Black-Box Attacks on LLMsMeina Chen, Yihong Tang, Kehai Chen. 12897-12903 [doi]
- DPF-CM: A Data Processing Framework with Privacy-Preserving Vector Databases for Chinese Medical LLMs Training and DeploymentWei Huang 0039, Anda Cheng, Zhao Zhang, Yinggui Wang. 12904-12916 [doi]
- Graph-Reward-SQL: Execution-Free Reinforcement Learning for Text-to-SQL via Graph Matching and Stepwise RewardHan Weng, Puzhen Wu, Longjie Cui, Yi Zhan, Boyi Liu, Yuanfeng Song, Dun Zeng, Yingxiang Yang, Qianru Zhang, Dong Huang, Xiaoming Yin, Yang Sun, Xing Chen. 12917-12943 [doi]
- StatsChartMWP: A Dataset for Evaluating Multimodal Mathematical Reasoning Abilities on Math Word Problems with Statistical ChartsDan Zhu, Tianqiao Liu, Zitao Liu 0001. 12944-12954 [doi]
- Logic-Thinker: Teaching Large Language Models to Think more LogicallyChengyao Wen, Qiang Cheng, Shaofei Wang, Zhizhen Liu, Deng Zhao, Lei Liang. 12955-12969 [doi]
- ACEBench: A Comprehensive Evaluation of LLM Tool UsageChen Chen 0017, Xinlong Hao, Weiwen Liu, Xu Huang 0008, Xingshan Zeng, Shuai Yu, Dexun Li, Yuefeng Huang, Xiangcheng Liu, Xinzhi Wang, Wu Liu. 12970-12998 [doi]
- RevPRAG: Revealing Poisoning Attacks in Retrieval-Augmented Generation through LLM Activation AnalysisXue Tan, Hao Luan, Mingyu Luo, Xiaoyan Sun 0003, Ping Chen 0003, Jun Dai. 12999-13011 [doi]
- DaMoC: Efficiently Selecting the Optimal Large Language Model for Fine-tuning Domain Tasks Based on Data and Model CompressionWei Huang 0039, Huang Wei, Yinggui Wang. 13012-13027 [doi]
- CoAT: Chain-of-Associated-Thoughts Framework for Enhancing Large Language Models ReasoningJianfeng Pan, Senyou Deng, Shaomang Huang. 13028-13045 [doi]
- ChartM³: A Multi-Stage Code-Driven Pipeline for Constructing Multi-Dimensional and Multi-Step Visual Reasoning Data in Chart ComprehensionDuo Xu, Hao Cheng, Xin Lin, Zhen Xie, Hao Henry Wang. 13046-13068 [doi]
- Can LLMs Truly Plan? A Comprehensive Evaluation of Planning CapabilitiesGayeon Jung, HyeonSeok Lim, MinJun Kim, Joon-Ho Lim, Kyungtae Lim, Hansaem Kim. 13069-13084 [doi]
- MARIO-0.5B: A Multi-Agent Lightweight Model for Real-Time Open Information Extraction in Low-Resource SettingsDonghai Zhang, Shuangtao Yang, Xiaozheng Dong, Wei Song, Bo Fu. 13085-13094 [doi]
- BiMax: Bidirectional MaxSim Score for Document-Level AlignmentXiaotian Wang, Takehito Utsuro, Masaaki Nagata. 13095-13116 [doi]
- DocMMIR: A Framework for Document Multi-modal Information RetrievalZirui Li, Siwei Wu, Yizhi Li, Xingyu Wang, Yi Zhou, Chenghua Lin. 13117-13130 [doi]
- MoVoC: Morphology-Aware Subword Construction for Ge'ez Script LanguagesHailay Kidu Teklehaymanot, Dren Fazlija, Wolfgang Nejdl. 13131-13144 [doi]
- MMA: Cross-Domain Knowledge Integration via Mixture of Multi-Domain AgentsKehang Jia, Juntao Li, Xiaobo Liang, Yisheng Xiao, Yixuan Yang, Min Zhang. 13145-13160 [doi]
- HAWK: Highlighting Entity-aware Knowledge for Alleviating Information Sparsity in Long ContextsSeonmin Koo, Jinsung Kim, Chanjun Park, HeuiSeok Lim. 13161-13184 [doi]
- Sensitivity-LoRA: Low-Load Sensitivity-Based Fine-Tuning for Large Language ModelsHao Zhang, Bo Huang, Zhenjia Li, Xi Xiao, Hui Yi Leong, Zumeng Zhang, Xinwei Long, Tianyang Wang, Hao Xu. 13185-13199 [doi]
- ROSE: A Reward-Oriented Data Selection Framework for LLM Task-Specific Instruction TuningYang Wu, Huayi Zhang, Yizheng Jiao, Lin Ma, Xiaozhong Liu, Jinhong Yu, Dongyu Zhang, Dezhi Yu, Wei Xu. 13200-13219 [doi]
- SimBA: Simplifying Benchmark Analysis Using Performance Matrices AloneNishant Subramani, Alfredo Gomez, Mona T. Diab. 13220-13233 [doi]
- MarathiEmoExplain: A Dataset for Sentiment, Emotion, and Explanation in Low-Resource MarathiAnuj Kumar, Mohammed Faisal Sayed, Satyadev Ahlawat, Yamuna Prasad. 13234-13243 [doi]
- Active Domain Knowledge Acquisition with 100-Dollar Budget: Enhancing LLMs via Cost-Efficient, Expert-Involved Interaction in Sensitive DomainsYang Wu, Raha Moraffah, Rujing Yao, Jinhong Yu, Zhimin Tao, Xiaozhong Liu. 13244-13257 [doi]
- Structure-aware Propagation Generation with Large Language Models for Fake News DetectionMengyang Chen, Lingwei Wei, Wei Zhou 0019, Songlin Hu 0001. 13258-13272 [doi]
- UniCoM: A Universal Code-Switching Speech GeneratorSangmin Lee, Woojin Chung, Seyun Um, Hong-Goo Kang. 13273-13288 [doi]
- Mitigating Sequential Dependencies: A Survey of Algorithms and Systems for Generation-Refinement Frameworks in Autoregressive ModelsYunhai Hu, Zining Liu, Zhenyuan Dong, Tianfan Peng, Bradley McDanel, Sai Qian Zhang. 13289-13304 [doi]
- Do We Really Need All Those Dimensions? An Intrinsic Evaluation Framework for Compressed EmbeddingsNathan Inkiriwang, Necva Bölücü, Garth Tarr, Maciej Rybinski. 13305-13323 [doi]
- Mixture of LoRA Experts for Continual Information Extraction with LLMsZitao Wang, Xinyi Wang, Wei Hu. 13324-13339 [doi]
- Spelling-out is not Straightforward: LLMs' Capability of Tokenization from Token to CharactersTatsuya Hiraoka, Kentaro Inui. 13340-13353 [doi]
- OAgents: An Empirical Study of Building Effective AgentsHe Zhu, Tianrui Qin, King Zhu, Heyuan Huang, Yeyi Guan, Jinxiang Xia, Hanhao Li, Yi Yao, Ningning Wang, Pai Liu, Tianhao Peng, Xin Gui, Xiaowan Li, Yuhui Liu, Xiangru Tang, Jian Yang 0003, Ge Zhang 0009, Xitong Gao, Yuchen Eleanor Jiang, Changwang Zhang, Jun Wang, Jiaheng Liu, Wangchunshu Zhou. 13354-13369 [doi]
- 2Columns1Row: A Russian Benchmark for Textual and Multimodal Table Understanding and ReasoningVildan Saburov, Daniil Vodolazsky, Danil Sazanakov, Alena Fenogenova. 13370-13389 [doi]
- Permitted Knowledge Boundary: Evaluating the Knowledge-Constrained Responsiveness of Large Language ModelsWenrui Bao, Kai Wang 0057, Siqiang Luo, Xiang Li 0067. 13390-13405 [doi]
- A Closer Look at Bias and Chain-of-Thought Faithfulness of Large (Vision) Language ModelsSriram Balasubramanian, Samyadeep Basu, Soheil Feizi. 13406-13439 [doi]
- From Remembering to Metacognition: Do Existing Benchmarks Accurately Evaluate LLMs?Geng Zhang, Yizhou Ying, Sihang Jiang, Jiaqing Liang, Guanglei Yue, Yifei Fu, Hailin Hu 0002, Yanghua Xiao. 13440-13457 [doi]
- How a Bilingual LM Becomes Bilingual: Tracing Internal Representations with Sparse AutoencodersTatsuro Inaba, Go Kamoda, Kentaro Inui, Masaru Isonuma, Yusuke Miyao, Yohei Oseki, Yu Takagi, Benjamin Heinzerling. 13458-13470 [doi]
- MultiConIR: Towards Multi-Condition Information RetrievalXuan Lu, SiFan Liu, Bochao Yin, Yongqi Li 0001, Xinghao Chen 0009, Hui Su, Yaohui Jin, Wenjun Zeng, Xiaoyu Shen 0001. 13471-13494 [doi]
- HMCL: Task-Optimal Text Representation Adaptation through Hierarchical Contrastive LearningZhenyi Wang, Yapeng Jia, Haiyan Ning, Peng Wang, Dan Wang, Yitao Cao. 13495-13518 [doi]
- KBAlign: Efficient Self Adaptation on Specific Textual Knowledge BasesZheni Zeng, Yuxuan Chen, Shi Yu 0001, Ruobing Wang, Yukun Yan, Zhenghao Liu 0001, Shuo Wang 0013, Xu Han 0007, Zhiyuan Liu 0001, Maosong Sun 0001. 13519-13532 [doi]
- Revisiting Chain-of-Thought Prompting: Zero-shot Can Be Stronger than Few-shotXiang Cheng 0007, Chengyan Pan, Minjun Zhao, Deyang Li, Fangchao Liu, Xinyu Zhang, Xiao Zhang, Yong Liu 0020. 13533-13554 [doi]
- RMTBench: Benchmarking LLMs Through Multi-Turn User-Centric Role-PlayingHao Xiang, Tianyi Tang, Yang Su, Bowen Yu 0002, An Yang, Fei Huang 0002, Yichang Zhang, Yaojie Lu 0001, Hongyu Lin, Xianpei Han, Jingren Zhou 0001, Junyang Lin, Le Sun 0001. 13555-13571 [doi]
- Smart-Searcher: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement LearningHuatong Song, Jinhao Jiang, Wenqing Tian, Zhipeng Chen 0001, Yuhuan Wu, Jiahao Zhao 0002, Yingqian Min, Xin Zhao 0018, Lei Fang, Ji-Rong Wen. 13572-13586 [doi]
- InteGround: On the Evaluation of Verification and Retrieval Planning in Integrative GroundingCheng Jiayang, Qianqian Zhuang, Haoran Li 0003, Chunkit Chan, Xin Liu 0039, Lin Qiu, Yangqiu Song. 13587-13602 [doi]
- MM-CRITIC: A Holistic Evaluation of Large Multimodal Models as Multimodal CritiqueGailun Zeng, Ziyang Luo, Hongzhan Lin 0001, Yuchen Tian, Kaixin Li, Ziyang Gong, Jianxiong Guo, Jing Ma 0004. 13603-13630 [doi]
- On the Correspondence between the Squared Norm and Information Content in Text EmbeddingsEnrique Amigó, Adrián Ghajari, Alejandro Benito-Santos, Diego De La Fuente Rodríguez. 13631-13643 [doi]
- Adversary-Aware DPO: Enhancing Safety Alignment in Vision Language Models via Adversarial TrainingFenghua Weng, Jian Lou 0001, Jun Feng, Minlie Huang, Wenjie Wang 0008. 13644-13657 [doi]
- SLiNT: Structure-aware Language Model with Injection and Contrastive Training for Knowledge Graph CompletionMengxue Yang, Chun Yang, Jiaqi Zhu, Jiafan Li, Jingqi Zhang, Yuyang Li, Ying Li. 13658-13671 [doi]
- LAVa: Layer-wise KV Cache Eviction with Dynamic Budget AllocationYiqun Shen, Song Yuan, Zhengze Zhang, Xiaoliang Wang 0001, Daxin Jiang, Cam-Tu Nguyen. 13672-13692 [doi]
- LoRA-PAR: A Flexible Dual-System LoRA Partitioning Approach to Efficient LLM Fine-TuningYining Huang, Bin Li, Keke Tang, Meilian Chen. 13693-13704 [doi]
- SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory SynthesisShuang Sun, Huatong Song, Yuhao Wang 0007, Ruiyang Ren, Jinhao Jiang, Junjie Zhang 0009, Fei Bai, Jia Deng, Xin Zhao 0018, Zheng Liu 0011, Lei Fang, Zhongyuan Wang 0006, Ji-Rong Wen. 13705-13720 [doi]
- LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive LearningZhibin Lan, Liqiang Niu, Fandong Meng, Jie Zhou 0016, Jinsong Su. 13721-13735 [doi]
- SampleMix: A Sample-wise Pre-training Data Mixing Strategy by Coordinating Data Quality and DiversityXiangyu Xi, Deyang Kong, Jian Yang 0003, Jiawei Yang, Zhengyu Chen, Wei Wang 0225, Jingang Wang, Xunliang Cai, Shikun Zhang, Wei Ye 0004. 13736-13758 [doi]
- Evaluating Test-Time Scaling LLMs for Legal Reasoning: OpenAI o1, DeepSeek-R1, and BeyondYinghao Hu, Yaoyao Yu, Leilei Gan, Bin Wei 0006, Kun Kuang 0001, Fei Wu 0001. 13759-13781 [doi]
- LLM Agents for Education: Advances and ApplicationsZhendong Chu, Shen Wang 0005, Jian Xie, Tinghui Zhu, Yibo Yan, Jingheng Ye, Aoxiao Zhong, Xuming Hu, Jing Liang 0008, Philip S. Yu, Qingsong Wen. 13782-13810 [doi]
- Modeling Subjectivity in Cognitive Appraisal with Language ModelsYuxiang Zhou, Hainiu Xu, Desmond C. Ong, Maria Liakata, Petr Slovák, Yulan He 0001. 13811-13833 [doi]
- Dementia Through Different Eyes: Explainable Modeling of Human and LLM Perceptions for Early AwarenessLotem Peled-Cohen, Maya Zadok, Nitay Calderon, Hila Gonen, Roi Reichart. 13834-13860 [doi]
- Mitigating Hallucinations in Large Vision-Language Models by Self-Injecting HallucinationsYifan Lu 0001, Ziqi Zhang, Chunfeng Yuan, Jun Gao, Congxuan Zhang, Xiaojuan Qi 0001, Bing Li 0001, Weiming Hu 0004. 13861-13877 [doi]
- How Much Do Large Language Models Know about Human Motion? A Case Study in 3D Avatar ControlKunhang Li, Jason Naradowsky, Yansong Feng, Yusuke Miyao. 13878-13921 [doi]
- The Search for Conflicts of Interest: Open Information Extraction in Scientific PublicationsGarima Gaur, Oana Balalau, Ioana Manolescu, Prajna Upadhyay. 13922-13936 [doi]
- On Collaborating Small and Large Models For Few-shot Intent DetectionPeng Chen, Bang Wang. 13937-13953 [doi]
- A Survey on LLMs for Story GenerationMaria Teleki, Vedangi Bengali, Xiangjue Dong, Sai Janjur, Haoran Liu, Tian Liu 0006, Cong Wang, Ting Liu, Yin Zhang 0011, Frank Shipman, James Caverlee. 13954-13966 [doi]
- From Knowledge to Treatment: Large Language Model Assisted Biomedical Concept Representation for Drug RepurposingChengrui Xiang, Tengfei Ma 0002, Xiangzheng Fu, Yiping Liu, Bosheng Song, Xiangxiang Zeng. 13967-13982 [doi]
- SKRAG: A Retrieval-Augmented Generation Framework Guided by Reasoning Skeletons over Knowledge GraphsXiaotong Xu, Yizhao Wang, Yunfei Liu, Shengyang Li. 13983-13994 [doi]
- A Generative Framework for Personalized Sticker RetrievalChangjiang Zhou, Ruqing Zhang 0001, Jiafeng Guo, Yu-An Liu, Fan Zhang, Ganyuan Luo, Xueqi Cheng. 13995-14009 [doi]
- Bridging Semantic and Modality Gaps in Zero-Shot Captioning via Retrieval from Synthetic DataZhiyue Liu, Wenkai Zhou. 14010-14023 [doi]
- Humor in Pixels: Benchmarking Large Multimodal Models Understanding of Online ComicsYuriel Ryan, Rui Yang Tan, Kenny Tsu Wei Choo, Roy Ka-Wei Lee. 14024-14050 [doi]
- BiMediX2: Bio-Medical EXpert LMM for Diverse Medical ModalitiesSahal Shaji Mullappilly, Mohammed Irfan Kurpath, Sara Pieri, Saeed Yahya Alseiari, Shanavas Cholakkal, Khaled Aldahmani, Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman H. Khan 0001, Timothy Baldwin, Hisham Cholakkal. 14051-14071 [doi]
- DeMAC: Enhancing Multi-Agent Coordination with Dynamic DAG and Manager-Player FeedbackYuhan Liu 0014, Cong Xu 0001, Lu Liu, Yihua Wang, Feiyu Chen 0005, Qi Jia 0004, Yaqian Zhao, Zhichun Wang, Xiang Li. 14072-14098 [doi]
- Coherence of Argumentative Dialogue Snippets: A New Method for Large Scale Evaluation with an Application to Inference Anchoring TheoryPaul Piwek, Jacopo Amidei, Svetlana Stoyanchev. 14099-14119 [doi]
- Angular Dispersion Accelerates k-Nearest Neighbors Machine TranslationEvgeniia Tokarchuk, Sergey Troshin 0001, Vlad Niculae. 14120-14132 [doi]
- Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild DataQiongqiong Wang, Hardik B. Sailor, Tianchi Liu 0004, Wenyu Zhang, Muhammad Huzaifah 0001, Nattadaporn Lertcheva, Shuo Sun, Nancy F. Chen, Jinyang Wu, AiTi Aw. 14133-14148 [doi]
- This is not a Disimprovement: Improving Negation Reasoning in Large Language Models via Prompt EngineeringJoshua Jose Dias Barreto, Abhik Jana. 14149-14156 [doi]
- Make Every Letter Count: Building Dialect Variation Dictionaries from Monolingual CorporaRobert Litschko, Verena Blaschke, Diana Burkhardt, Barbara Plank, Diego Frassinelli. 14157-14174 [doi]
- SelfAug: Mitigating Catastrophic Forgetting in Retrieval-Augmented Generation via Distribution Self-AlignmentYuqing Huang, Rongyang Zhang, Qimeng Wang, Chengqiang Lu, Yan Gao 0017, Yi Wu, Yao Hu 0002, Xuyang Zhi, Guiquan Liu, Xin Li 0064, Hao Wang 0076, Enhong Chen. 14175-14190 [doi]
- SEKE: Specialised Experts for Keyword ExtractionMatej Martinc, Tran Thi Hong Hanh, Senja Pollak, Boshko Koloski. 14191-14205 [doi]
- 1+1>2: A Synergistic Sparse and Low-Rank Compression Method for Large Language ModelsZeliang Zong, Kai Zhang 0055, Zheyang Li, Wenming Tan, Ye Ren, Yiyan Zhai, Jilin Hu. 14206-14220 [doi]
- InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical ReasoningXiaotian Han, Yiren Jian, Xuefeng Hu, Haogeng Liu, Yiqi Wang 0004, Qihang Fan, Yuang Ai, Huaibo Huang, Ran He 0001, Zhenheng Yang, Quanzeng You. 14221-14231 [doi]
- Zero-Shot Defense Against Toxic Images via Inherent Multimodal Alignment in LVLMsWei Zhao, Zhe Li, Yige Li, Jun Sun 0001. 14232-14246 [doi]
- Retrieval Augmented Generation based context discovery for ASRDimitrios Siskos, Stavros Papadopoulos 0002, Pablo Peso Parada, Jisi Zhang, Karthikeyan Saravanan, Anastasios Drosou. 14247-14254 [doi]
- pFedRAG: A Personalized Federated Retrieval-Augmented Generation System with Depth-Adaptive Tiered Embedding TuningHangyu He, Xin Yuan 0004, Kai Wu 0004, Ren Ping Liu 0001, Wei Ni 0001. 14255-14268 [doi]
- ReCUT: Balancing Reasoning Length and Accuracy in LLMs via Stepwise Trails and Preference OptimizationZhensheng Jin, Xinze Li, Yifan Ji, Chunyi Peng, Zhenghao Liu 0001, Qi Shi 0002, Yukun Yan, Shuo Wang 0013, Furong Peng, Ge Yu 0001. 14269-14282 [doi]
- CURE: Controlled Unlearning for Robust Embeddings - Mitigating Conceptual Shortcuts in Pre-Trained Language ModelsAysenur Kocak, Shuo Yang, Bardh Prenkaj, Gjergji Kasneci. 14283-14297 [doi]
- MLAlgo-Bench: Can Machines Implement Machine Learning Algorithms?Yunfei Wang, Yeqin Zhang, Yuyang Wu, Liang Lu, Phi-Le Nguyen, Xiaoliang Wang, Nguyen Cam-Tu. 14298-14329 [doi]
- Fair Text-Attributed Graph Representation LearningRuilin Luo, Tianle Gu, Lin Wang, Yunfeng Zhou, Songtao Jiang, Lei Wang, Yujiu Yang. 14330-14353 [doi]
- Human-Inspired Obfuscation for Model Unlearning: Local and Global Strategies with Hyperbolic RepresentationsZekun Wang, Jingjie Zeng, Yingxu Li, Liang Yang, Hongfei Lin. 14354-14366 [doi]
- Do Influence Functions Work on Large Language Models?Zhe Li, Wei Zhao, Yige Li, Jun Sun 0001. 14367-14382 [doi]
- TRUEBench: Can LLM Response Meet Real-world Constraints as Productivity Assistant?Jiho Park, Jongyoon Song, Minjin Choi 0004, Kyuho Heo, Taehun Huh, Ji-Won Kim. 14383-14409 [doi]
- CausalMACE: Causality Empowered Multi-Agents in Minecraft Cooperative TasksQi Chai, Zhang Zheng, Junlong Ren, Deheng Ye, Zichuan Lin, Hao Wang 0094. 14410-14426 [doi]
- Harry Potter is Still Here! Probing Knowledge Leakage in Targeted Unlearned Large Language ModelsBang Trinh Tran To, Thai Le. 14427-14439 [doi]
- Learning Trajectories of Figurative Language for Pre-Trained Language ModelsNicola Arici, Luca Putelli, Ejdis Gjinika, Ivan Serina, Alfonso Gerevini. 14440-14461 [doi]
- BcQLM: Efficient Vision-Language Understanding with Distilled Q-Gated Cross-Modal FusionSike Xiang, Shuang Chen, Amir Atapour Abarghouei. 14462-14472 [doi]
- HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic SignalsGuimin Hu, Daniel Hershcovich, Hasti Seifi. 14473-14489 [doi]
- SubDocTrans: Enhancing Document-level Machine Translation with Plug-and-play Multi-granularity Knowledge AugmentationHanghai Hong, Yibo Xie, Jiawei Zheng, Xiaoli Wang 0002. 14490-14506 [doi]
- Social Bias Evaluation for Large Language Models Requires Prompt VariationsRem Hida, Masahiro Kaneko, Naoaki Okazaki. 14507-14530 [doi]
- Training with Fewer Bits: Unlocking Edge LLMs Training with Stochastic RoundingTaowen Liu, Marta Andronic, Deniz Gündüz, George Anthony Constantinides. 14531-14546 [doi]
- FactReasoner: A Probabilistic Approach to Long-Form Factuality Assessment for Large Language ModelsRadu Marinescu 0002, Debarun Bhattacharjya, Junkyu Lee 0001, Tigran T. Tchrakian, Javier Carnerero-Cano, Yufang Hou 0001, Elizabeth M. Daly, Alessandra Pascale. 14547-14577 [doi]
- Robust Knowledge Editing via Explicit Reasoning Chains for Distractor-Resilient Multi-Hop QAYuchen Wu, Liang Ding 0006, Li Shen 0008, Dacheng Tao. 14578-14586 [doi]
- RadialRouter: Structured Representation for Efficient and Robust Large Language Models RoutingRuihan Jin, Pengpeng Shao, Zhengqi Wen, Jinyang Wu, Mingkuan Feng, Shuai Zhang 0014, Jianhua Tao 0001. 14587-14600 [doi]
- Decoding Uncertainty: The Impact of Decoding Strategies for Uncertainty Estimation in Large Language ModelsWataru Hashimoto, Hidetaka Kamigaito, Taro Watanabe. 14601-14613 [doi]
- Elucidating Mechanisms of Demographic Bias in LLMs for HealthcareHiba Ahsan, Arnab Sen Sharma, Silvio Amir, David Bau, Byron C. Wallace. 14614-14631 [doi]
- Can You Trick the Grader? Adversarial Persuasion of LLM JudgesYerin Hwang, Dongryeol Lee, Taegwan Kang, Yongil Kim, Kyomin Jung. 14632-14651 [doi]
- Navigating the Unknown: Intent Classification and Out-of-Distribution Detection Using Large Language ModelsYusuf Sali, Sitki Can Toraman. 14652-14664 [doi]
- Trust Me, I'm Wrong: LLMs Hallucinate with Certainty Despite Knowing the AnswerAdi Simhi, Itay Itzhak, Fazl Barez, Gabriel Stanovsky, Yonatan Belinkov. 14665-14688 [doi]
- QUARTZ: QA-based Unsupervised Abstractive Refinement for Task-oriented Dialogue SummarizationMohamed Imed Eddine Ghebriout, Gaël Guibon, Ivan Lerner, Emmanuel Vincent 0001. 14689-14706 [doi]
- MDSEval: A Meta-Evaluation Benchmark for Multimodal Dialogue SummarizationYinhong Liu, Jianfeng He, Hang Su, Ruixue Lian, Yi Nian, Jake W. Vincent, Srikanth Vishnubhotla, Robinson Piramuthu, Saab Mansour. 14707-14727 [doi]
- PMPO: Probabilistic Metric Prompt Optimization for Small and Large Language ModelsChenzhuo Zhao, Ziqian Liu, Xinda Wang, Junting Lu, Chaoyi Ruan. 14728-14761 [doi]
- Evaluating the Creativity of LLMs in Persian Literary Text GenerationArmin Tourajmehr, Mohammad Reza Modarres, Yadollah Yaghoobzadeh. 14762-14774 [doi]
- SCDTour: Embedding Axis Ordering and Merging for Interpretable Semantic Change DetectionTaichi Aida, Danushka Bollegala. 14775-14785 [doi]
- Resolving UnderEdit & OverEdit with Iterative & Neighbor-Assisted Model EditingBhiman Kumar Baghel, Emma Jordan, Zheyuan Ryan Shi, Xiang Lorraine Li. 14786-14808 [doi]
- LLM-empowered Dynamic Prompt Routing for Vision-Language Models Tuning under Long-Tailed DistributionsYongju Jia, Jiarui Ma, Xiangxian Li, Baiqiao Zhang, Xianhui Cao, Juan Liu 0008, Yulong Bian. 14809-14822 [doi]
- HGAdapter: Hypergraph-based Adapters in Language Models for Code Summarization and Clone DetectionGuang Yang, Yujie Zhu. 14823-14833 [doi]
- Evaluating distillation methods for data-efficient syntax learningTakateru Yamakoshi, Thomas L. Griffiths 0001, R. Thomas McCoy, Robert D. Hawkins. 14834-14847 [doi]
- "Going to a trap house" conveys more fear than "Going to a mall": Benchmarking Emotion Context Sensitivity for LLMsEojin Jeon, Mingyu Lee, Sangyun Kim, Junho Kim, Wanzee Cho, Tae-Eui Kam, SangKeun Lee 0001. 14848-14869 [doi]
- [MASK]ED - Language Modeling for Explainable Classification and Disentangling of Socially Unacceptable DiscourseDimitra Niaouri, Rayane Ghilene, Michele Linardi, Julien Longhi. 14870-14883 [doi]
- A Survey of Cognitive Distortion Detection and Classification in NLPArchie Sage, Jeroen Keppens, Helen Yannakoudakis. 14884-14899 [doi]
- Curse of Knowledge: Your Guidance and Provided Knowledge are biasing LLM Judges in Complex EvaluationWeiyuan Li, Xintao Wang 0001, Siyu Yuan, Rui Xu 0026, Jiangjie Chen, Qingqing Dong, Yanghua Xiao, Deqing Yang. 14900-14924 [doi]
- Self-Training Large Language Models with Confident ReasoningHyosoon Jang, Yunhui Jang, Sungjae Lee, Jungseul Ok, Sungsoo Ahn. 14925-14939 [doi]
- Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical SupervisionPala Tej Deep, Panshul Sharma, Amir Zadeh 0001, Chuan Li, Soujanya Poria. 14940-14954 [doi]
- Enhancing LLM-Based Persuasion Simulations with Cultural and Speaker-Specific InformationWeicheng Ma, Hefan Zhang 0001, Shiyu Ji, Farnoosh Hashemi, Qichao Wang, Ivory Yang, Joice Chen, Juanwen Pan, Michael Macy, Saeed Hassanpour, Soroush Vosoughi. 14955-14976 [doi]
- An LLM-based Temporal-spatial Data Generation and Fusion Approach for Early Detection of Late Onset Alzheimer's Disease (LOAD) Stagings Especially in Chinese and English-speaking PopulationsYang Han 0006, Jacqueline C. K. Lam, Victor On Kwok Li, Lawrence Y. L. Cheung. 14977-14990 [doi]
- Side Effects of Erasing Concepts from Diffusion ModelsShaswati Saha, Sourajit Saha, Manas Gaur, Tejas Gokhale. 14991-15007 [doi]
- SaCa: A Highly Compatible Reinforcing Framework for Knowledge Graph Embedding via Structural Pattern ContrastJiashi Lin, Changhong Jiang, Yixiao Wang, Xinyi Zhu, Zhongtian Hu, Wei Zhang. 15008-15021 [doi]
- Real, Fake, or Manipulated? Detecting Machine-Influenced TextYitong Wang, Zhongping Zhang, Margherita Piana, Zheng Zhou, Peter Gerstoft, Bryan A. Plummer. 15022-15037 [doi]
- Character is Destiny: Can Persona-assigned Language Models Make Personal Choices?Rui Xu 0026, Xintao Wang 0001, Jiangjie Chen, Siyu Yuan, Xinfeng Yuan, Jiaqing Liang, Zulong Chen, Xiaoqing Dong, Yanghua Xiao. 15038-15059 [doi]
- Neutral Is Not Unbiased: Evaluating Implicit and Intersectional Identity Bias in LLMs Through Structured Narrative ScenariosSaba Ghanbari Haez, Mauro Dragoni. 15060-15088 [doi]
- BTW: A Non-Parametric Variance Stabilization Framework for Multimodal Model IntegrationJun Hou, Le Wang, Xuan Wang. 15089-15103 [doi]
- Can LLMs Be Efficient Predictors of Conversational Derailment?Kaustubh Olpadkar, Vikram Sunil Bajaj, Leslie Barrett. 15104-15112 [doi]
- Q-PRM: Adaptive Query Rewriting for Retrieval-Augmented Generation via Step-level Process SupervisionXiaopeng Ye, Chen Xu 0010, Chaoliang Zhang, Zhaocheng Du, Jun Xu 0001, Gang Wang 0056, Zhenhua Dong. 15113-15128 [doi]
- Factuality Beyond Coherence: Evaluating LLM Watermarking Methods for Medical TextsRochana Prih Hastuti, Rian Adam Rajagede, Mansour Al Ghanim, Mengxin Zheng, Qian Lou. 15129-15147 [doi]
- Guess What I am Thinking: A Benchmark for Inner Thought Reasoning of Role-Playing Language AgentsRui Xu 0026, Mingyu Wang, Xintao Wang 0001, Dakuan Lu, Xiaoyu Tan, Wei Chu, Yinghui Xu. 15148-15168 [doi]
- Dropping Experts, Recombining Neurons: Retraining-Free Pruning for Sparse Mixture-of-Experts LLMsYixiao Zhou, Ziyu Zhao 0001, Dongzhou Cheng, Zhiliang Wu, Jie Gui, Yi Yang 0001, Fei Wu 0001, Yu Cheng 0001, Hehe Fan. 15169-15186 [doi]
- BiasFilter: An Inference-Time Debiasing Framework for Large Language ModelsXiaoqing Cheng, Ruizhe Chen, Hongying Zan, Yuxiang Jia, Min Peng. 15187-15205 [doi]
- X-LeBench: A Benchmark for Extremely Long Egocentric Video UnderstandingWenqi Zhou, Kai Cao, Hao Zheng, Yunze Liu, Xinyi Zheng, Miao Liu, Per Ola Kristensson, Walterio W. Mayol-Cuevas, Fan Zhang, Weizhe Lin, Junxiao Shen. 15206-15222 [doi]
- A Survey on Multi-modal Intent Recognition: Recent Advances and New FrontiersZhihong Zhu, Fan Zhang 0111, Yunyan Zhang, Jinghan Sun, Zhiqi Huang 0001, Qingqing Long, Bowen Xing, Xian Wu 0001. 15223-15236 [doi]
- Will Annotators Disagree? Identifying Subjectivity in Value-Laden ArgumentsAmir Homayounirad, Enrico Liscio, Tong Wang, Catholijn M. Jonker, Luciano Cavalcante Siebert. 15237-15252 [doi]
- LLMs Can Compensate for Deficiencies in Visual RepresentationsSho Takishita, Jay P. Gala, Abdelrahman Mohamed, Kentaro Inui, Yova Kementchedjhieva. 15253-15272 [doi]
- Adapting Large Language Models for Character-based Augmentative and Alternative CommunicationDylan Gaines, Keith Vertanen. 15273-15291 [doi]
- Token-Level Metrics for Detecting Incorrect Gold Annotations in Named Entity RecognitionElena Merdjanovska, Alan Akbik. 15292-15304 [doi]
- Exploring Paraphrasing Strategies for CEFR A1-Level Constraints in LLMsEugenio Marzona, Maria Goikhman, Alessio Palmero Aprosio, Massimo Zancanaro. 15305-15318 [doi]
- Efficient Layer-wise LLM Fine-tuning for Revision Intention PredictionZhexiong Liu, Diane J. Litman. 15319-15334 [doi]
- ConText-LE: Cross-Distribution Generalization for Longitudinal Experiential Data via Narrative-Based LLM RepresentationsAhatsham Hayat, Bilal Khan, Mohammad Rashedul Hasan. 15335-15360 [doi]
- Chain of Strategy Optimization Makes Large Language Models Better Emotional SupporterWeixiang Zhao, Xingyu Sui, Xinyang Han, Yang Deng 0002, Yulin Hu, Jiahe Guo, Libo Qin 0001, Qianyun Du, Shijin Wang 0001, Yanyan Zhao, Bing Qin 0001, Ting Liu 0001. 15361-15381 [doi]
- Unlocking Legal Knowledge: A Multilingual Dataset for Judicial Summarization in SwitzerlandLuca Rolshoven, Vishvaksenan Rasiah, Srinanda Brügger Bose, Sarah Hostettler, Lara Burkhalter, Matthias Stürmer, Joel Niklaus. 15382-15411 [doi]
- Context Minimization for Resource-Constrained Text Classification: Optimizing Performance-Efficiency Trade-offs through Linguistic FeaturesNahid Hossain, Md Faisal Kabir. 15412-15426 [doi]
- FLAIRR-TS - Forecasting LLM-Agents with Iterative Refinement and Retrieval for Time SeriesGunjan Jalori, Preetika Verma, Sercan Ö Arik. 15427-15437 [doi]
- ULTRABENCH: Benchmarking LLMs under Extreme Fine-grained Text GenerationLongfei Yun, Letian Peng, Jingbo Shang. 15438-15453 [doi]
- The Price of Format: Diversity Collapse in LLMsLongfei Yun, Chenyang An, Zilong Wang 0002, Letian Peng, Jingbo Shang. 15454-15468 [doi]
- Zipf's and Heaps' Laws for Tokens and LLM-generated TextsNikolay Mikhaylovskiy. 15469-15481 [doi]
- LLMs for Bayesian Optimization in Scientific Domains: Are We There Yet?Rushil Gupta, Jason Hartford, Bang Liu. 15482-15510 [doi]
- A Comprehensive Taxonomy of Negation for NLP and Neural RetrieversRoxana Petcu, Samarth Bhargav 0001, Maarten de Rijke, Evangelos Kanoulas. 15511-15533 [doi]
- Identifying Noise in Human-Created Datasets using Training Dynamics from Generative ModelsMaeda F. Hanafi, Ishan Jindal, Yannis Katsis, Lucian Popa 0001, Huaiyu Zhu 0001. 15534-15550 [doi]
- Can Multiple Responses from an LLM Reveal the Sources of Its Uncertainty?Yang Nan, Pengfei He, Ravi Tandon, Han Xu. 15551-15569 [doi]
- AfroXLMR-Social: Adapting Pre-trained Language Models for African Languages Social Media TextTadesse Destaw Belay, Israel Abebe Azime, Ibrahim Said Ahmad, David Ifeoluwa Adelani, Idris Abdulmumin, Abinew Ali Ayele, Shamsuddeen Hassan Muhammad, Seid Muhie Yimam. 15570-15587 [doi]
- Teaching Language Models To Gather Information ProactivelyTenghao Huang, Sihao Chen, Muhao Chen 0001, Jonathan May, Longqi Yang 0001, Mengting Wan, Pei Zhou. 15588-15599 [doi]
- Linguistic Alignment Predicts Learning in Small Group Tutoring SessionsDorothea French, Robert G. Moulder, Kelechi Ezema, Katharina von der Wense, Sidney K. D'Mello. 15600-15611 [doi]
- EfficientXLang: Towards Improving Token Efficiency Through Cross-Lingual ReasoningSanchit Ahuja, Praneetha Vaddamanu, Barun Patra. 15612-15624 [doi]
- Not Lost After All: How Cross-Encoder Attribution Challenges Position Bias Assumptions in LLM SummarizationElahe Rahimi, Hassan Sajjad 0001, Domenic Rosati, Abeer Badawi, Elham Dolatabadi, Frank Rudzicz. 15625-15641 [doi]
- FuzzAug: Data Augmentation by Coverage-guided Fuzzing for Neural Test GenerationYifeng He, Jicheng Wang, Yuyang Rong, Hao Chen. 15642-15655 [doi]
- DrAgent: Empowering Large Language Models as Medical Agents for Multi-hop Medical ReasoningFenglin Liu, Zheng Li 0018, Hongjian Zhou, Qingyu Yin, Jingfeng Yang 0001, Xin Liu 0039, Zhengyang Wang, Xianfeng Tang, Shiyang Li, Xiang He, Ruijie Wang 0004, Bing Yin, Xiao Gu 0003, Lei A. Clifton, David A. Clifton. 15656-15668 [doi]
- XRAG: Cross-lingual Retrieval-Augmented GenerationWei Liu, Sony Trenous, Leonardo F. R. Ribeiro, Bill Byrne, Felix Hieber. 15669-15690 [doi]
- Can VLMs Recall Factual Associations From Visual References?Dhananjay Ashok, Ashutosh Chaubey, Hirona Jacqueline Arai, Jonathan May, Jesse Thomason. 15691-15708 [doi]
- MFTCXplain: A Multilingual Benchmark Dataset for Evaluating the Moral Reasoning of LLMs through Multi-hop Hate Speech ExplanationJackson Trager, Francielle Vargas, Diego Alves, Matteo Guida, Mikel K. Ngueajio, Ameeta Agrawal, Yalda Daryani, Farzan Karimi-Malekabadi, Flor Miriam Plaza del Arco. 15709-15740 [doi]
- Large Language Models for Multilingual Previously Fact-Checked Claim DetectionIvan Vykopal, Matús Pikuliak, Simon Ostermann 0002, Tatiana Anikina, Michal Gregor, Marián Simko. 15741-15765 [doi]
- Debating for Better Reasoning in Vision-Language ModelsAshutosh Adhikari, Mirella Lapata. 15766-15784 [doi]
- Fine-tuning LLMs with Cross-Attention-based Weight Decay for Bias MitigationFarsheed Haque, Zhe Fu, Depeng Xu 0001, Shuhan Yuan, Xi Niu. 15785-15798 [doi]
- Profiling LLM's Copyright Infringement Risks under Adversarial Persuasive PromptingJikai Long, Ming Liu, Xiusi Chen, Jialiang Xu, Shenglan Li, Zhaozhuo Xu, Denghui Zhang. 15799-15823 [doi]
- Residualized Similarity for Faithfully Explainable Authorship VerificationPeter Zeng, Pegah Alipoormolabashi, Jihu Mun, Gourab Dey, Nikita Soni 0002, Niranjan Balasubramanian, Owen Rambow, H. Andrew Schwartz. 15824-15837 [doi]
- Post-hoc Study of Climate Microtargeting on Social Media Ads with LLMs: Thematic Insights and Fairness EvaluationTunazzina Islam, Dan Goldwasser. 15838-15859 [doi]
- MRFD: Multi-Region Fusion Decoding with Self-Consistency for Mitigating Hallucinations in LVLMsHaonan Ge, Yiwei Wang 0001, Ming-Hsuan Yang 0001, Yujun Cai. 15860-15879 [doi]
- SIMBA UQ: Similarity-Based Aggregation for Uncertainty Quantification in Large Language ModelsDebarun Bhattacharjya, Balaji Ganesan, Junkyu Lee 0001, Radu Marinescu 0002, Katsiaryna Mirylenka, Michael R. Glass, Xiao Shou. 15880-15894 [doi]
- Mind the Dialect: NLP Advancements Uncover Fairness Disparities for Arabic Users in Recommendation SystemsAbdulla Alshabanah, Murali Annavaram. 15895-15903 [doi]
- Hopscotch: Discovering and Skipping Redundancies in Language ModelsMustafa Eyceoz, Nikhil Shivakumar Nayak, Hao Wang, Ligong Han, Akash Srivastava. 15904-15913 [doi]
- CLEAR: A Clinically Grounded Tabular Framework for Radiology Report EvaluationYuyang Jiang, Chacha Chen, Shengyuan Wang, Feng Liu 0011, Zecong Tang, Benjamin M. Mervak, Lydia Chelala, Christopher M. Straus, Reve Chahine, Samuel G. Armato III, Chenhao Tan. 15914-15933 [doi]
- Parsing the Switch: LLM-Based UD Annotation for Complex Code-Switched and Low-Resource LanguagesOlga Kellert, Nemika Tyagi, Muhammad Imran, Nelvin Licona-Guevara, Carlos Gómez-Rodríguez. 15934-15949 [doi]
- HetGCoT: Heterogeneous Graph-Enhanced Chain-of-Thought LLM Reasoning for Academic Question AnsweringRunsong Jia, Mengjia Wu, Ying Ding 0001, Jie Lu 0001, Yi Zhang 0095. 15950-15963 [doi]
- S*: Test Time Scaling for Code GenerationDacheng Li, Shiyi Cao, Chengkun Cao, Xiuyu Li, Shangyin Tan, Kurt Keutzer, Jiarong Xing, Joseph E. Gonzalez, Ion Stoica. 15964-15978 [doi]
- Language Models Can Easily Learn to Reason from DemonstrationsDacheng Li, Shiyi Cao, Tyler Griggs, Shu Liu, Xiangxi Mo, Eric Tang, Sumanth Hegde, Kourosh Hakhamaneshi, Shishir G. Patil, Matei Zaharia, Joseph E. Gonzalez, Ion Stoica. 15979-15997 [doi]
- FSTs vs ICL: Generalisation in LLMs for an under-resourced languageXimena Gutierrez, Mikel Segura Elizalde, Victor Mijangos. 15998-16006 [doi]
- SRM-LLM: Semantic Relationship Mining with LLMs for Temporal Knowledge Graph ExtrapolationFu Zhang 0001, Panfeng Zhang, Jingwei Cheng. 16007-16021 [doi]
- Captioning for Text-Video Retrieval via Dual-Group Direct Preference OptimizationJi Soo Lee, Byungoh Ko, Jaewon Cho, Howoong Lee, Jaewoon Byun, Hyunwoo J. Kim. 16022-16039 [doi]
- Benchmarking and Improving LLM Robustness for Personalized GenerationChimaobi Okite, Naihao Deng, Kiran Bodipati, Huaidian Hou, Joyce Chai, Rada Mihalcea. 16040-16072 [doi]
- MemeInterpret: Towards an All-in-One Dataset for Meme UnderstandingJeongsik Park, Khoi P. N. Nguyen, JiHyung Park, Minseok Kim, Jaeheon Lee, Jae-Won Choi, Kalyani Ganta, Phalgun Ashrit Kasu, Rohan Sarakinti, Sanjana Vipperla, Sai Sathanapalli, Nishan Vaghani, Vincent Ng 0001. 16073-16087 [doi]
- CoRAG: Enhancing Hybrid Retrieval-Augmented Generation through a Cooperative Retriever ArchitectureZaiyi Zheng, Song Wang 0013, Zihan Chen 0002, Yaochen Zhu, Yinhan He, Liangjie Hong, Qi Guo, Jundong Li. 16088-16101 [doi]
- Hallucination Detection in Structured Query Generation via LLM Self-DebatingMiaoran Li, Jiangning Chen, Minghua Xu, Xiaolong Wang. 16102-16113 [doi]
- Not All Options Are Created Equal: Textual Option Weighting for Token-Efficient LLM-Based Knowledge TracingJongwoo Kim, SeongYeub Chu, Bryan Wong, Mun Yong Yi. 16114-16128 [doi]
- Public Data Assisted Differentially Private In-Context LearningSeongho Joo, Hyukhun Koh, Kyomin Jung. 16129-16152 [doi]
- Inducing Argument Facets for Faithful Opinion SummarizationJian Wang 0118, Yanjie Liang, Yuqing Sun 0001, Bin Gong. 16153-16166 [doi]
- Scaling Laws Are Unreliable for Downstream Tasks: A Reality CheckNicholas Lourie, Michael Y. Hu, KyungHyun Cho. 16167-16180 [doi]
- Familiarity-Aware Evidence Compression for Retrieval-Augmented GenerationDongwon Jung, Qin Liu 0010, Tenghao Huang, Ben Zhou, Muhao Chen 0001. 16181-16196 [doi]
- O_O-VC: Synthetic Data-Driven One-to-One Alignment for Any-to-Any Voice ConversionHuu Tuong Tu, Huan Vu, Nguyen Tien Cuong, Dien Hy Ngo, Nguyen Thi Thu Trang. 16197-16208 [doi]
- Simple Factuality Probes Detect Hallucinations in Long-Form Natural Language GenerationJiatong Han, Neil Band, Muhammed Razzak, Jannik Kossen, Tim G. J. Rudner, Yarin Gal. 16209-16226 [doi]
- CESRec: Constructing Pseudo Interactions for Sequential Recommendation via Conversational FeedbackYifan Wang 0023, Shen Gao, Jiabao Fang, Rui Yan 0001, Billy Chiu, Shuo Shang. 16227-16239 [doi]
- TTPA: Token-level Tool-use Preference Alignment Training Framework with Fine-grained EvaluationChengrui Huang 0001, Shen Gao, Zhengliang Shi, Dongsheng Wang, Shuo Shang. 16240-16255 [doi]
- Avoiding Knowledge Edit Skipping in Multi-hop Question Answering with Guided DecompositionYi Liu, Xiangrong Zhu, Xiangyu Liu, Wei Wei, Wei Hu. 16256-16272 [doi]
- Bridging the Creativity Understanding Gap: Small-Scale Human Alignment Enables Expert-Level Humor Ranking in LLMsKuan Lok Zhou, Jiayi Chen, Siddharth Suresh, Reuben Narad, Timothy T. Rogers, Lalit K. Jain, Robert D. Nowak, Bob Mankoff, Jifan Zhang. 16273-16287 [doi]
- SMARTMiner: Extracting and Evaluating SMART Goals from Low-Resource Health Coaching NotesIva Bojic, Qi Chwen Ong, Stephanie Hilary Xinyi Ma, Lin Ai, Zheng Liu, Ziwei Gong, Julia Hirschberg, Andy Hau Yan Ho, Andy W. H. Khong. 16288-16305 [doi]
- GRIL: Knowledge Graph Retrieval-Integrated Learning with Large Language ModelsJialin Chen, Houyu Zhang, Seongjun Yun, Alejandro Mottini, Rex Ying, Xiang Song 0003, Vassilis N. Ioannidis, Zheng Li, Qingjun Cui. 16306-16319 [doi]
- Exploring Deductive and Inductive Reasoning Capabilities of Large Language Models in Procedural PlanningJiabao Kang, Xinye Li, Liyan Xu, Qingbin Liu, Xi Chen 0003, Zhiying Tu, Dianhui Chu, Dianbo Sui. 16320-16341 [doi]
- KELE: A Multi-Agent Framework for Structured Socratic Teaching with Large Language ModelsXian Peng, Pan Yuan, Dong Li, Junlong Cheng, Qin Fang, Zhi Liu. 16342-16362 [doi]
- VisualEDU: A Benchmark for Assessing Coding and Visual Comprehension through Educational Problem-Solving Video GenerationHao Chen, Tianyu Shi, Pengran Huang, Zeyuan Li, Jiahui Pan, Qianglong Chen, Lewei He. 16363-16394 [doi]
- OkraLong: A Flexible Retrieval-Augmented Framework for Long-Text Question AnsweringYulong Hui, Yihao Liu 0008, Yao Lu 0028, Huanchen Zhang. 16395-16409 [doi]
- VerifiAgent: a Unified Verification Agent in Language Model ReasoningJiuzhou Han, Wray L. Buntine, Ehsan Shareghi. 16410-16431 [doi]
- DrKGC: Dynamic Subgraph Retrieval-Augmented LLMs for Knowledge Graph Completion across General and Biomedical DomainsYongkang Xiao, Sinian Zhang, Yi Dai, Huixue Zhou, Jue Hou, Jie Ding 0002, Rui Zhang 0028. 16432-16445 [doi]
- Understanding the Language Model to Solve the Symbolic Multi-Step Reasoning Problem from the Perspective of Buffer MechanismZhiwei Wang, Yunji Wang, Zhongwang Zhang, Zhangchen Zhou, Hui Jin, Tianyang Hu, Jiacheng Sun, Zhenguo Li, Yaoyu Zhang, Zhi-Qin John Xu. 16446-16474 [doi]
- TwT: Thinking without Tokens by Habitual Reasoning Distillation with Multi-Teachers' GuidanceJingxian Xu, Mengyu Zhou, Weichang Liu, Hanbing Liu, Shi Han, Dongmei Zhang 0001. 16475-16489 [doi]
- DAVIS: Planning Agent with Knowledge Graph-Powered Inner MonologueMinh Pham Dinh, Michael G. Yankoski, Munira Syed, Trenton W. Ford. 16490-16505 [doi]
- When Instructions Multiply: Measuring and Estimating LLM Capabilities of Multiple Instructions FollowingKeno Harada, Yudai Yamazaki, Masachika Taniguchi, Edison Marrese-Taylor, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo. 16506-16526 [doi]
- FormosanBench: Benchmarking Low-Resource Austronesian Languages in the Era of Large Language ModelsKaiying Kevin Lin, Hsi-Yu Chen, Haopeng Zhang. 16527-16539 [doi]
- SeaPO: Strategic Error Amplification for Robust Preference Optimization of Large Language ModelsJun Rao, Yunjie Liao, Xuebo Liu 0002, Zepeng Lin, Lian-lian, Dong Jin, Shengjun Cheng, Jun Yu 0002, Min Zhang 0005. 16540-16557 [doi]
- FigEx: Aligned Extraction of Scientific Figures and CaptionsJifeng Song, Arun Das, Ge Cui, Yufei Huang. 16558-16571 [doi]
- PATIMT-Bench: A Multi-Scenario Benchmark for Position-Aware Text Image Machine Translation in Large Vision-Language ModelsWanru Zhuang, Wenbo Li, Zhibin Lan, Xu Han 0007, Peng Li 0030, Jinsong Su. 16572-16588 [doi]
- Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model MergingHua Farn, Hsuan Su, Shachi H. Kumar, Saurav Sahay, Shang-Tse Chen, Hung-yi Lee. 16589-16602 [doi]
- Self-Ensemble: Mitigating Confidence Distortion for Large Language ModelsZicheng Xu, Guanchu Wang, Guangyao Zheng, Yu-Neng Chuang, Alex Szalay, Xia Hu 0001, Vladimir Braverman. 16603-16615 [doi]
- Annotation-Efficient Language Model Alignment via Diverse and Representative Response TextsYuu Jinnai, Ukyo Honda. 16616-16659 [doi]
- Explainable Chain-of-Thought Reasoning: An Empirical Analysis on State-Aware Reasoning DynamicsSheldon Yu, Yuxin Xiong, Junda Wu, Xintong Li 0001, Tong Yu 0001, Xiang Chen 0010, Ritwik Sinha, Jingbo Shang, Julian J. McAuley. 16660-16667 [doi]
- DecisionFlow: Advancing Large Language Model as Principled Decision MakerXiusi Chen, Shanyong Wang, Cheng Qian 0008, Hongru Wang 0011, Peixuan Han, Heng Ji 0001. 16668-16692 [doi]
- M-Ped: Multi-Prompt Ensemble Decoding for Large Language ModelsJiaxin Guo, Daimeng Wei, Yuanchang Luo, Hengchao Shang, Zongyao Li, Jinlong Yang, Zhanglin Wu, Zhiqiang Rao, Shimin Tao, Hao Yang. 16693-16711 [doi]
- Butterfly Effects in Toolchains: A Comprehensive Analysis of Failed Parameter Filling in LLM Tool-Agent SystemsQian Xiong, Yuekai Huang, Ziyou Jiang, Zhiyuan Chang, Yu Zheng 0012, Tianhao Li, Mingyang Li. 16712-16729 [doi]
- FinLFQA: Evaluating Attributed Text Generation of LLMs in Financial Long-Form Question AnsweringYitao Long, Tiansheng Hu, Yilun Zhao 0001, Arman Cohan, Chen Zhao 0013. 16730-16750 [doi]
- BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language ModelsXu Huang, Wenhao Zhu, Hanxu Hu, Conghui He, Lei Li 0005, Shujian Huang, Fei Yuan 0006. 16751-16774 [doi]
- Assessing the Sensitivity and Alignment of FOL Closeness MetricsRamya Keerthy Thatikonda, Wray L. Buntine, Ehsan Shareghi. 16775-16785 [doi]
- FoodSafeSum: Enabling Natural Language Processing Applications for Food Safety Document Summarization and AnalysisJuli Bakagianni, Korbinian Randl, Guido Rocchietti, Cosimo Rulli, Franco Maria Nardini, Salvatore Trani, Aron Henriksson, Anna Romanova, John Pavlopoulos. 16786-16804 [doi]
- Self-adaptive Dataset Construction for Real-World Multimodal Safety ScenariosJingen Qu, Lijun Li, Bo Zhang, Yichen Yan, Jing Shao. 16805-16829 [doi]
- EnDive: A Cross-Dialect Benchmark for Fairness and Performance in Large Language ModelsAbhay Gupta, Jacob Cheung, Philip Meng, Shayan Sayyed, Kevin Zhu, Austen Liao, Sean O'Brien. 16830-16855 [doi]
- FAEDKV: Infinite-Window Fourier Transform for Unbiased KV Cache CompressionRunchao Li, Yao Fu, Mu Sheng, Xianxuan Long, Haotian Yu, Pan Li 0001. 16856-16866 [doi]
- Dynamic Injection of Entity Knowledge into Dense RetrieversIkuya Yamada, Ryokan Ri, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo. 16867-16879 [doi]
- When Personalization Meets Reality: A Multi-Faceted Analysis of Personalized Preference LearningYijiang River Dong, Tiancheng Hu, Yinhong Liu, Ahmet Üstün, Nigel Collier. 16880-16894 [doi]
- MASTER: Multi-Agent Security Through Exploration of Roles and Topological Structures - A Comprehensive FrameworkYifan Zhu, Chao Zhang, Xin Shi, Xueqiao Zhang, Yi Yang 0001, Yawei Luo. 16895-16921 [doi]
- MONAQ: Multi-Objective Neural Architecture Querying for Time-Series Analysis on Resource-Constrained DevicesPatara Trirat, Jae-Gil Lee. 16922-16950 [doi]
- StandUp4AI: A New Multilingual Dataset for Humor Detection in Stand-up Comedy VideosValentin Barrière, Nahuel Gomez, Léo Hemamou, Sofía Callejas, Brian Ravenet. 16951-16959 [doi]
- Does Visual Grounding Enhance the Understanding of Embodied Knowledge in Large Language Models?Zhihui Yang, Yupei Wang, Kaijie Mo, Zhe Zhao 0006, Renfen Hu. 16960-16978 [doi]
- Semantic Contribution-Aware Adaptive Retrieval for Black-Box ModelsQinhong Lin, Zhongliang Yang, Yuang Cai, Dingfu Yu, Xuan Xu, Yu Li, Linna Zhou. 16979-16994 [doi]
- On Guardrail Models' Robustness to Mutations and Adversarial AttacksElias Bassani, Ignacio Sanchez. 16995-17006 [doi]
- IP-Dialog: Evaluating Implicit Personalization in Dialogue Systems with Synthetic DataBo Peng, Zhiheng Wang, Heyang Gong, Chaochao Lu. 17007-17040 [doi]
- Zero-shot Graph Reasoning via Retrieval Augmented Framework with LLMsHanqing Li, Sharika Mahadevan, Kiran Sheena Jyothi, Henry Liang, Diego Klabjan. 17041-17054 [doi]
- Privacy in Action: Towards Realistic Privacy Mitigation and Evaluation for LLM-Powered AgentsShouju Wang, Fenglin Yu, Xirui Liu, Xiaoting Qin, Jue Zhang, Qingwei Lin, Dongmei Zhang 0001, Saravan Rajmohan. 17055-17074 [doi]
- Dissecting Logical Reasoning in LLMs: A Fine-Grained Evaluation and Supervision StudyYujun Zhou 0002, Jiayi Ye, Zipeng Ling, Yufei Han 0001, Yue Huang 0001, Haomin Zhuang, Zhenwen Liang, Kehan Guo, Taicheng Guo, Xiangqi Wang, Xiangliang Zhang 0001. 17075-17098 [doi]
- ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning ModelsRazvan Gabriel Dumitru, Darius Peteleaza, Vikas Yadav, Liangming Pan. 17099-17123 [doi]
- Faster and Better LLMs via Latency-Aware Test-Time ScalingZili Wang 0005, Tianyu Zhang, Haoli Bai, Lu Hou 0002, Xianzhi Yu, Wulong Liu, Shiming Xiang, Lei Zhu. 17124-17137 [doi]
- Reasoning-Augmented Conversation for Multi-Turn Jailbreak Attacks on Large Language ModelsZonghao Ying, Deyue Zhang, Zonglei Jing, Yisong Xiao, Quanchen Zou, Aishan Liu, Siyuan Liang 0004, Xiangzheng Zhang, Xianglong Liu 0001, Dacheng Tao. 17138-17157 [doi]
- Distilling Many-Shot In-Context Learning into a Cheat SheetUkyo Honda, Soichiro Murakami, Peinan Zhang. 17158-17178 [doi]
- Tracing Training Footprints: A Calibration Approach for Membership Inference Attacks Against Multimodal Large Language ModelsXiaofan Zheng, Huixuan Zhang, Xiaojun Wan 0001. 17179-17191 [doi]
- PolBiX: Detecting LLMs' Political Bias in Fact-Checking through X-phemismsCharlott Jakob, David Harbecke, Patrick Parschan, Pia Wenzel Neves, Vera Schmitt. 17192-17210 [doi]
- URO-Bench: Towards Comprehensive Evaluation for End-to-End Spoken Dialogue ModelsRuiqi Yan, Xiquan Li, Wenxi Chen, Zhikang Niu, Chen Yang, Ziyang Ma 0001, Kai Yu 0004, Xie Chen 0001. 17211-17242 [doi]
- Low-Hallucination and Efficient Coreference Resolution with LLMsYujian Gan, Yuan Liang, Jinxia Xie, Yanni Lin, Juntao Yu, Massimo Poesio. 17243-17256 [doi]
- Your Mileage May Vary: How Empathy and Demographics Shape Human Preferences in LLM ResponsesYishan Wang, Amanda Cercas Curry, Flor Miriam Plaza del Arco. 17257-17270 [doi]
- Diving into Mitigating Hallucinations from a Vision Perspective for Large Vision-Language ModelsWeihang Wang 0011, Xinhao Li, Ziyue Wang, Yan Pang, Jielei Zhang, Peiyi Li, Qiang Zhang, Longwen Gao. 17271-17289 [doi]
- PhysicsArena: The First Multimodal Physics Reasoning Benchmark Exploring Variable, Process, and Solution DimensionsSong Dai, Yibo Yan, Jiamin Su, Dongfang Zihao, Yubo Gao, Yonghua Hei, Jungang Li, Junyan Zhang, Sicheng Tao, Zhuoran Gao, Xuming Hu. 17290-17316 [doi]
- Ko-LongRAG: A Korean Long-Context RAG Benchmark Built with a Retrieval-Free ApproachYongil Kim, Heuiyeen Yeen, Hyeongu Yun, Jinsik Lee. 17317-17329 [doi]
- Choosing a Model, Shaping a Future: Comparing LLM Perspectives on Sustainability and its Relationship with AIAnnika Bush, Meltem Aksoy, Markus Pauly, Greta Ontrup. 17330-17341 [doi]
- Optimising Factual Consistency in Summarisation via Preference Learning from Multiple Imperfect MetricsYuxuan Ye, Raúl Santos-Rodríguez, Edwin Simpson. 17342-17355 [doi]
- Judging with Many Minds: Do More Perspectives Mean Less Prejudice? On Bias Amplification and Resistance in Multi-Agent Based LLM-as-JudgeChiyu Ma, Enpei Zhang, Yilun Zhao 0001, Wenjun Liu, Yaning Jia, Peijun Qing, Lin Shi, Arman Cohan, Yujun Yan, Soroush Vosoughi. 17356-17392 [doi]
- Investigating the Impact of Conceptual Metaphors on LLM-based NLI through Shapley InteractionsMeghdut Sengupta, Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, Eyke Hüllermeier, Debanjan Ghosh, Henning Wachsmuth. 17393-17403 [doi]
- KurTail : Kurtosis-based LLM QuantizationMohammad Sadegh Akhondzadeh, Aleksandar Bojchevski, Evangelos Eleftheriou, Martino Dazzi. 17404-17419 [doi]
- VIVA+: Human-Centered Situational Decision-MakingZhe Hu, Yixiao Ren, Guanzhong Liu, Jing Li 0049, Yu Yin 0001. 17420-17437 [doi]
- QuantAgents: Towards Multi-agent Financial System via Simulated TradingXiangyu Li 0010, Yawen Zeng, Xiaofen Xing, Jin Xu 0014, Xiangmin Xu. 17438-17464 [doi]
- LLMs Reproduce Stereotypes of Sexual and Gender MinoritiesRuby Ostrow, Adam Lopez. 17465-17477 [doi]
- Accept or Deny? Evaluating LLM Fairness and Performance in Loan Approval across Table-to-Text Serialization ApproachesIsrael Abebe Azime, Deborah Dormah Kanubala, Tejumade Afonja, Mario Fritz, Isabel Valera, Dietrich Klakow, Philipp Slusallek. 17478-17503 [doi]
- Transfer-Aware Data Selection for Domain Adaptation in Text RetrievalLinzhu Yu, Huan Li 0003, Ke Chen 0005, Lidan Shou. 17504-17519 [doi]
- Understanding and Improving Information Preservation in Prompt Compression for LLMsWeronika Lajewska, Momchil Hardalov, Laura Aina, Neha Anna John, Hang Su, Lluís Màrquez. 17520-17541 [doi]
- A Benchmark for Hindi Verb-Argument Structure AlternationsKanishka Jain, Ashwini Vaidya. 17542-17549 [doi]
- Beyond Binary Preferences: Semi-Online Label-Free GRACE-KTO with Group-Wise Adaptive Calibration for High-Quality Long-Text GenerationJingyang Deng, Ran Chen 0002, Jo-Ku Cheng, Jinwen Ma. 17550-17562 [doi]
- Representation-based Broad Hallucination Detectors Fail to Generalize Out of DistributionZuzanna Dubanowska, Maciej Zelaszczyk, Michal Brzozowski, Paolo Mandica, Michal P. Karpowicz. 17563-17575 [doi]
- MAFMO: Multi-modal Adaptive Fusion with Meta-template Optimization for Vision-Language ModelsMingrui Xie, Lulu Xu, Junliang Du. 17576-17585 [doi]
- Multimodal UNcommonsense: From Odd to Ordinary and Ordinary to OddYejin Son, Saejin Kim, Dongjun Min, Youngjae Yu. 17586-17609 [doi]
- Analyzing Gambling Addictions: A Spanish Corpus for Understanding Pathological BehaviorManuel Couto, Marcos Fernández-Pichel, Mario Ezra Aragón, David E. Losada. 17610-17619 [doi]
- Beyond Surface Alignment: Rebuilding LLMs Safety Mechanism via Probabilistically Ablating Refusal DirectionYuanbo Xie, Yingjie Zhang, Tianyun Liu, Duohe Ma, Tingwen Liu. 17620-17632 [doi]
- Distributed LLM Serving on Consumer-Grade GPUs by Reconciling Computation and CommunicationLewei Jin, Kui Zhang, Yongqi Chen, Yifan Zhuo, Renjie Li, Yi Gao 0001, Bowei Yang, Zhengong Cai, Wei Dong 0001. 17633-17642 [doi]
- SafeToolBench: Pioneering a Prospective Benchmark to Evaluating Tool Utilization Safety in LLMsHongfei Xia, Hongru Wang 0003, Zeming Liu, Qian Yu 0016, Yuhang Guo 0001, Haifeng Wang 0001. 17643-17660 [doi]
- Sparsifying MambaAn Wang, Ruobing Xie, Shuaipeng Li, Xingwu Sun, Zhanhui Kang. 17661-17667 [doi]
- Beneath the Facade: Probing Safety Vulnerabilities in LLMs via Auto-Generated Jailbreak PromptsHeehyeon Kim, Kyeongryul Lee, Joyce Jiyoung Whang. 17668-17700 [doi]
- ET-MIER: Entity Type-guided Key Mention Identification and Evidence Retrieval for Document-level Relation ExtractionXin Li, Huangming Xu, Fu Zhang 0001, Jingwei Cheng. 17701-17714 [doi]
- Position IDs Matter: An Enhanced Position Layout for Efficient Context Compression in Large Language ModelsRunsong Zhao, Xin Liu, Xinyu Liu, Pengcheng Huang 0004, Chunyang Xiao, Tong Xiao 0001, Jingbo Zhu. 17715-17734 [doi]
- Can Role Vectors Affect LLM Behaviour?Daniele Potertì, Andrea Seveso, Fabio Mercorio. 17735-17747 [doi]
- Semantic Component Analysis: Introducing Multi-Topic Distributions to Clustering-Based Topic ModelingFlorian Eichin, Carolin M. Schuster, Georg Groh, Michael A. Hedderich. 17748-17771 [doi]
- ThinkQE: Query Expansion via an Evolving Thinking ProcessYibin Lei, Tao Shen 0001, Andrew Yates. 17772-17781 [doi]
- Hierarchical Reward Modeling for Fault Localization in Large Code RepositoriesJiwei Zhang 0020, Jianxun Lian, Haiming Qin, Mingyang Zhou 0001, Kezhong Lu, Rui Mao 0001, Hao Liao. 17782-17796 [doi]
- Layer Duplication in LLMsNeo Eyal, Nachum Dershowitz, Kfir Bar. 17797-17807 [doi]
- Semantic-Aware Action Space Compression via LLM-DRL Synergy for Efficient Task-oriented Dialogue Policy ExplorationYangyang Zhao, Ben Niu, Yuxuan Tan, Shihan Wang, Libo Qin. 17808-17820 [doi]
- Linear Steerability in Language Models: When It Emerges and How It EvolvesJianshu She, Xinyue Li, Eric P. Xing, Zhengzhong Liu 0001, Qirong Ho. 17821-17846 [doi]
- A Comprehensive Survey on Learning from Rewards for Large Language Models: Reward Models and Learning StrategiesXiaobao Wu. 17847-17875 [doi]
- InFact: Informativeness Alignment for Improved LLM FactualityRoi Cohen, Russa Biswas, Gerard de Melo. 17876-17888 [doi]
- Large Language Model Agents in Finance: A Survey Bridging Research, Practice, and Real-World DeploymentYifei Dong, Fengyi Wu, Kunlin Zhang, Yilong Dai, Sanjian Zhang, Wanghao Ye, Sihan Chen, Zhi-Qi Cheng. 17889-17907 [doi]
- Problem Solved? Information Extraction Design Space for Layout-Rich Documents using LLMsGaye Colakoglu, Gürkan Solmaz, Jonathan Fürst. 17908-17927 [doi]
- Generation-Augmented Retrieval: Rethinking the Role of Large Language Models in Zero-Shot Relation ExtractionZehan Li, Fu Zhang 0001, Tianyue Peng, He Liu, Jingwei Cheng. 17928-17941 [doi]
- Following Occam's Razor: Dynamic Combination of Structured Knowledge for Multi-Hop Question Answering using LLMsWei Chen 0156, Zhi Zheng, Lili Zhao, Huijun Hou, Tong Xu. 17942-17956 [doi]
- Large Language Models as Reader for Bias DetectionXuan Luo, Jing Li 0049, Zhong Wenzhong, Geng Tu, Ruifeng Xu 0001. 17957-17967 [doi]
- LOHRec: Leveraging Order and Hierarchy in Generative Sequential RecommendationJiawen Xie, Haiyang Wu, Deyi Ji, Yuekui Yang, Shaoping Ma. 17968-17983 [doi]
- Biology-Instructions: A Dataset and Benchmark for Multi-Omics Sequence Understanding Capability of Large Language ModelsHaonan He, Yuchen Ren 0001, Yining Tang, Ziyang Xu, Junxian Li 0001, Minghao Yang, Di Zhang 0026, Dong Yuan, Tao Chen 0003, Shufei Zhang, Yuqiang Li, Nanqing Dong, Wanli Ouyang, Dongzhan Zhou, Peng Ye 0006. 17984-18016 [doi]
- AssistedDS: Benchmarking How External Domain Knowledge Assists LLMs in Automated Data ScienceAn Luo, Xun Xian, Jin Du, Fangqiao Tian, Ganghua Wang, Ming Zhong, Shengchun Zhao, Xuan Bi, Zirui Liu 0001, Jiawei Zhou, Jayanth Srinivasa, Ashish Kundu, Charles Fleming, Mingyi Hong 0001, Jie Ding 0002. 18017-18060 [doi]
- Are you sure? Measuring models bias in content moderation through uncertaintyAlessandra Urbinati, Mirko Lai, Simona Frenda, Marco Stranisci. 18061-18076 [doi]
- FOSSIL: Harnessing Feedback on Suboptimal Samples for Data-Efficient Generalisation with Imitation Learning for Embodied Vision-and-Language TasksSabrina McCallum, Amit Parekh 0001, Alessandro Suglia. 18077-18101 [doi]
- Assess and Prompt: A Generative RL Framework for Improving Engagement in Online Mental Health CommunitiesBhagesh Gaur, Karan Gupta, Aseem Srivastava, Manish Gupta, Md. Shad Akhtar. 18102-18118 [doi]
- Logic: Long-form Outline Generation via Imitative and Critical Self-refinementHengwei Liu, Yongliang Shen 0001, Zhe Zheng, Haoyuan Ma, Xingyu Wu, Yin Zhang, Weiming Lu. 18119-18144 [doi]
- No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in LLMs, Even for Vigilant UsersMengxuan Hu, Hongyi Wu, Ronghang Zhu, Zihan Guan 0001, Dongliang Guo 0002, Daiqing Qi, Sheng Li 0001. 18145-18170 [doi]
- LegoSLM: Connecting LLM with Speech Encoder using CTC PosteriorsRao Ma, Tongzhou Chen, Kartik Audhkhasi, Bhuvana Ramabhadran. 18171-18186 [doi]
- Skeleton-Guided-Translation: A Benchmarking Framework for Code Repository Translation with Fine-Grained Quality EvaluationXing Zhang, Jiaheng Wen, Fangkai Yang, Yu Kang 0006, Pu Zhao 0004, Junhao Wang, Maoquan Wang, Yufan Huang, Shengyu Fu, Elsie Nallipogu, Qingwei Lin, Yingnong Dang, Saravan Rajmohan, Dongmei Zhang 0001. 18187-18198 [doi]
- Parallel Communities Across the Surface Web and the Dark WebWenchao Dong, Megha Sundriyal, Seongchan Park, Jaehong Kim, Meeyoung Cha, Tanmoy Chakraborty 0002, Wonjae Lee. 18199-18218 [doi]
- Lemma Dilemma: On Lemma Generation Without Domain- or Language-Specific Training DataOlia Toporkov, Alan Akbik, Rodrigo Agerri. 18219-18232 [doi]
- LlmFixer: Fix the Helpfulness of Defensive Large Language ModelsZelong Yu, Xiaoming Zhang, Litian Zhang, Yu Yuan, Chaozhuo Li. 18233-18247 [doi]
- Universal Acoustic Adversarial Attacks for Flexible Control of Speech-LLMsRao Ma, Mengjie Qian 0001, Vyas Raina, Mark J. F. Gales, Kate M. Knill. 18248-18262 [doi]
- Probing Semantic Routing in Large Mixture-of-Expert ModelsMatthew Lyle Olson, Neale Ratzlaff, Musashi Hinck, Man Luo, Sungduk Yu, Chendi Xue, Vasudev Lal. 18263-18278 [doi]
- CMT-Eval: A Novel Chinese Multi-turn Dialogue Evaluation Dataset Addressing Real-world Conversational ChallengesSiyu Tian, Kaijie Mo, Yupei Wang, Renfen Hu. 18279-18303 [doi]
- LastingBench: Defend Benchmarks Against Knowledge LeakageYixiong Fang, Tianran Sun, Yuling Shi, Min Wang, Xiaodong Gu. 18304-18317 [doi]
- Learning API Functionality from In-Context Demonstrations for Tool-based AgentsBhrij Patel, Ashish Jagmohan, Aditya Vempaty. 18318-18336 [doi]
- Predicting Language Models' Success at Zero-Shot Probabilistic PredictionKevin Ren, Santiago Cortes-Gomez, Carlos Miguel Patiño, Ananya Joshi 0001, Ruiqi Lyu, Jingjing Tang, Alistair Turcan, Khurram Yamin, Steven Wu 0001, Bryan Wilder. 18337-18363 [doi]
- GAMIC: Graph-Aligned Molecular In-context Learning for Molecule Analysis via LLMsAli Al-Lawati, Jason Lucas, Zhiwei Zhang 0028, Prasenjit Mitra, Suhang Wang. 18364-18378 [doi]
- Rethinking Sign Language Translation: The Impact of Signer Dependence on Model EvaluationKeren Artiaga, Sabyasachi Kamila, Haithem Afli, Conor Lynch, Mohammed Hasanuzzaman. 18379-18391 [doi]
- Can Large Language Models Identify Implicit Suicidal Ideation? An Empirical EvaluationTong Li, Shu Yang 0010, Junchao Wu, Jiyao Wei, Lijie Hu, Mengdi Li 0006, Derek F. Wong, Joshua R. Oltmanns, Di Wang 0015. 18392-18413 [doi]
- Adaptive Platt Scaling with Causal Interpretations for Self-Reflective Language Model Uncertainty EstimatesAnthony Sicilia, Malihe Alikhani. 18414-18422 [doi]
- Treble Counterfactual VLMs: A Causal Approach to HallucinationLi Li 0006, Jiashu Qu, Linxin Song, Yuxiao Zhou 0006, Yuehan Qin, Tiankai Yang 0001, Yue Zhao 0016. 18423-18434 [doi]
- Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video ReasoningDaeun Lee 0001, Jaehong Yoon, Jaemin Cho 0001, Mohit Bansal. 18435-18449 [doi]
- Glitter: A Multi-Sentence, Multi-Reference Benchmark for Gender-Fair German Machine TranslationA Pranav 0001, Janiça Hackenbuchner, Giuseppe Attanasio, Manuel Lardelli, Anne Lauscher. 18450-18477 [doi]
- From n-gram to Attention: How Model Architectures Learn and Propagate Bias in Language ModelingMohsinul Kabir, Tasfia Tahsin, Sophia Ananiadou. 18478-18498 [doi]
- SENTRA: Selected-Next-Token Transformer for LLM Text DetectionMitchell Plyler, Yilun Zhang, Alexander Tuzhilin, Saoud Khalifah, Sen Tian. 18499-18516 [doi]
- Automate Strategy Finding with LLM in Quant InvestmentZhizhuo Kou, Holam Yu, Junyu Luo 0002, Jingshu Peng, Xujia Li, Chengzhong Liu, Juntao Dai, Lei Chen 0002, Sirui Han, Yike Guo. 18517-18533 [doi]
- Does Reasoning Introduce Bias? A Study of Social Bias Evaluation and Mitigation in LLM ReasoningXuyang Wu 0002, Jinming Nian, Ting-Ruen Wei, Zhiqiang Tao, Hsin-Tai Wu, Yi Fang 0008. 18534-18555 [doi]
- MT-RewardTree: A Comprehensive Framework for Advancing LLM-Based Machine Translation via Reward ModelingZhaopeng Feng, Jiahan Ren, Jiayuan Su, Jiamei Zheng, Hongwei Wang 0001, Zuozhu Liu. 18556-18567 [doi]
- Bias after Prompting: Persistent Discrimination in Large Language ModelsNivedha Sivakumar, Natalie Mackraz, Samira Khorshidi, Krishna Patel, Barry-John Theobald, Luca Zappella, Nicholas Apostoloff. 18568-18593 [doi]
- CARVQ: Corrective Adaptor with Group Residual Vector Quantization for LLM Embedding CompressionDayin Gou, SangHyun Byun, Nilesh Malpeddi, Gabrielle De Micheli, Prathamesh Vaste, Jacob Song, Woo Seong Chung. 18594-18604 [doi]
- Consistent Discourse-level Temporal Relation Extraction Using Large Language ModelsYi Fan, Michael Strube 0001. 18605-18622 [doi]
- MMPlanner: Zero-Shot Multimodal Procedural Planning with Chain-of-Thought Object State ReasoningAfrina Tabassum, Bin Guo, Xiyao Ma, Hoda Eldardiry, Ismini Lourentzou. 18623-18639 [doi]
- Internal states before wait modulate reasoning patternsDmitrii Troitskii, Koyena Pal, Chris Wendler, Callum McDougall. 18640-18649 [doi]
- Sparsity May Be All You Need: Sparse Random Parameter AdaptationJesus Rios, Pierre L. Dognin, Ronny Luss, Karthikeyan Natesan Ramamurthy. 18650-18666 [doi]
- Learning to Align: Addressing Character Frequency Distribution Shifts in Handwritten Text RecognitionPanagiotis Kaliosis, John Pavlopoulos. 18667-18684 [doi]
- MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement LearningZhaopeng Feng, Shaosheng Cao, Jiahan Ren, Jiayuan Su, Ruizhe Chen, Yan Zhang 0004, Jian Wu 0001, Zuozhu Liu. 18685-18702 [doi]
- Discrete Minds in a Continuous World: Do Language Models Know Time Passes?Minghan Wang, Ye Bai, Thuy-Trang Vu, Ehsan Shareghi, Gholamreza Haffari. 18703-18729 [doi]
- DLTKG: Denoising Logic-based Temporal Knowledge Graph ReasoningXiaoke Wang, Fu Zhang, Jingwei Cheng, Yiwen Chi, Jiashun Peng, Yingsong Ning. 18730-18743 [doi]
- EMO-RL: Emotion-Rule-Based Reinforcement Learning Enhanced Audio-Language Model for Generalized Speech Emotion RecognitionPengcheng Li, Botao Zhao 0001, Zuheng Kang, Junqing Peng, Xiaoyang Qu, Yayun He, Jianzong Wang. 18744-18754 [doi]
- MANTA: A Scalable Pipeline for Transmuting Massive Web Corpora into Instruction DatasetsHeuiyeen Yeen, Seokhee Hong, Hyeongu Yun, Jinsik Lee. 18755-18770 [doi]
- Fast Quiet-STaR: Thinking Without Thought TokensWei Huang, Yizhe Xiong, Xin Ye, Zhijie Deng, Hui Chen 0013, Zijia Lin, Guiguang Ding. 18771-18781 [doi]
- Lock on Target! Precision Unlearning via Directional ControlYuntao Wen, Ruixiang Feng, Feng Guo, Yifan Wang 0023, Ran Le, Yang Song 0021, Shen Gao, Shuo Shang. 18782-18794 [doi]
- UniRAG: A Unified RAG Framework for Knowledge-Intensive Queries with Decomposition, Break-Down Reasoning, and Iterative RewritingGun Il Kim, Jong Wook Kim, Beakcheol Jang. 18795-18810 [doi]
- One Shot Dominance: Knowledge Poisoning Attack on Retrieval-Augmented Generation SystemsZhiyuan Chang, Mingyang Li 0005, Xiaojun Jia, Junjie Wang 0001, Yuekai Huang, Ziyou Jiang, Yang Liu 0003, Qing Wang 0001. 18811-18825 [doi]
- From Generic Empathy to Personalized Emotional Support: A Self-Evolution Framework for User Preference AlignmentJing Ye, Lu Xiang, Yaping Zhang, Chengqing Zong. 18826-18853 [doi]
- MaskCD: Mitigating LVLM Hallucinations by Image Head Masked Contrastive DecodingJingyuan Deng, Yujiu Yang. 18854-18866 [doi]
- ClusterUCB: Efficient Gradient-Based Data Selection for Targeted Fine-Tuning of LLMsZige Wang, Qi Zhu 0011, Fei Mi, Minghui Xu, Ruochun Jin, Wenjing Yang 0002. 18867-18880 [doi]
- TrapDoc: Deceiving LLM Users by Injecting Imperceptible Phantom Tokens into DocumentsHyundong Jin, Sicheol Sung, Shinwoo Park, Seung-Yeop Baik, Yo-Sub Han. 18881-18897 [doi]
- AraReasoner: Evaluating Reasoning-Based LLMs for Arabic NLPAhmed Abul Hasanaath, Aisha Alansari, Ahmed Ashraf, Salmane Chafik, Hamzah Luqman, Saad Ezzini. 18898-18914 [doi]
- Tales of Morality: Comparing Human- and LLM-Generated Moral Stories from Visual CuesRezvaneh Rezapour, Sullam Jeoung, Zhiwen You, Jana Diesner. 18915-18933 [doi]
- AirRAG: Autonomous Strategic Planning and Reasoning Steer Retrieval Augmented GenerationWenfeng Feng 0001, Chuzhan Hao, Yuewei Zhang 0003, Guochao Jiang, Jingyi Song. 18934-18953 [doi]
- Evaluating NL2SQL via SQL2NLMohammadtaher Safarzadeh, Afshin Oroojlooy, Dan Roth 0001. 18954-18968 [doi]
- DB-Explore: Automated Database Exploration and Instruction Synthesis for Text-to-SQLHaoyuan Ma, Yongliang Shen 0001, Hengwei Liu, Wenqi Zhang 0001, Haolei Xu, Qiuying Peng, Jun Wang, Weiming Lu 0001. 18969-18979 [doi]
- Do BERT-Like Bidirectional Models Still Perform Better on Text Classification in the Era of LLMs?Junyan Zhang, Yiming Huang 0002, Shuliang Liu, Yubo Gao, Xuming Hu. 18980-18989 [doi]
- Divide, Optimize, Merge: Scalable Fine-Grained Generative Optimization for LLM AgentsJiale Liu, Yifan Zeng, Shaokun Zhang, Chi Zhang, Malte Højmark-Bertelsen, Marie Normann Gadeberg, Huazheng Wang, Qingyun Wu. 18990-19012 [doi]
- Evaluating Evaluation Metrics - The Mirage of Hallucination DetectionAtharva Kulkarni, Yuan Zhang, Joel Ruben Antony Moniz, Xiou Ge, Bo-Hsiang Tseng, Dhivya Piraviperumal, Swabha Swayamdipta, Hong Yu. 19013-19032 [doi]
- The Progress Illusion: Revisiting meta-evaluation standards of LLM evaluatorsTianruo Rose Xu, Vedant Gaur, Liu Leqi, Tanya Goyal. 19033-19043 [doi]
- MidPO: Dual Preference Optimization for Safety and Helpfulness in Large Language Models via a Mixture of Experts FrameworkYupeng Qi, Ziyu Lyu, Min Yang 0002, Yanlin Wang 0001, Lu Bai 0001, Lixin Cui. 19044-19066 [doi]
- From KMMLU-Redux to Pro: A Professional Korean Benchmark Suite for LLM EvaluationSeokhee Hong, Sunkyoung Kim 0002, Guijin Son, Soyeon Kim, Yeonjung Hong, Jinsik Lee. 19067-19096 [doi]
- RealBench: A Chinese Multi-image Understanding Benchmark Close to Real-world ScenariosFei Zhao 0012, Chengqiang Lu, Yufan Shen, Qimeng Wang, Yicheng Qian, Haoxin Zhang, Yan Gao 0017, Wu Yi, Yao Hu 0002, Zhen Wu 0002, Shangyu Xing, Xinyu Dai. 19097-19115 [doi]
- The More, The Better? A Critical Study of Multimodal Context in Radiology Report SummarizationMong Yuan Sim, Wei Emma Zhang, Xiang Dai 0001, Biaoyan Fang, Sarbin Ranjitkar, Arjun Burlakoti, Jamie Taylor, Haojie Zhuang. 19116-19131 [doi]
- Localizing Malicious Outputs from CodeLLMMayukh Borana, Junyi Liang, Sai Sathiesh Rajan, Sudipta Chattopadhyay 0001. 19132-19143 [doi]
- Knowing More, Acting Better: Hierarchical Representation for Embodied Decision-MakingChunhui Zhang, Zhongyu Ouyang, Xingjian Diao, Zheyuan Liu 0010, Soroush Vosoughi. 19144-19155 [doi]
- Culture is Everywhere: A Call for Intentionally Cultural EvaluationJuhyun Oh, Inha Cha, Michael Saxon, Hyunseung Lim, Shaily Bhatt, Alice Oh. 19156-19168 [doi]
- Fairness in Automatic Speech Recognition Isn't a One-Size-Fits-AllHend Elghazaly, Bahman Mirheidari, Heidi Christensen, Nafise Sadat Moosavi. 19169-19178 [doi]
- Uncovering Factor-Level Preference to Improve Human-Model AlignmentJuhyun Oh, Eunsu Kim, Jiseon Kim, Wenda Xu, Inha Cha, William Yang Wang, Alice Oh. 19179-19203 [doi]
- Adaptive Preference Optimization with Uncertainty-aware Utility AnchorXiaobo Wang 0004, Zixia Jia, Jiaqi Li 0021, Qi Liu, Zilong Zheng. 19204-19225 [doi]
- GRAD: Generative Retrieval-Aligned Demonstration Sampler for Efficient Few-Shot ReasoningOussama Gabouj, Kamel Charaf, Ivan Zakazov, Nicolas Mario Baldwin, Robert West 0001. 19226-19244 [doi]
- IoTMigrator: LLM-driven Embedded IoT Code Migration across Different OSes for Cloud-device IntegrationYingqi Peng, Kaijie Gong, Yi Gao 0001, Hao Wang, Wei Dong 0001. 19245-19257 [doi]
- ClueAnchor: Clue-Anchored Knowledge Reasoning Exploration and Optimization for Retrieval-Augmented GenerationHao Chen, Yukun Yan, Sen Mei, Wanxiang Che, Zhenghao Liu 0001, Qi Shi 0002, Xinze Li, Yuchun Fan, Pengcheng Huang 0004, Qiushi Xiong, Zhiyuan Liu 0001, Maosong Sun 0001. 19258-19278 [doi]
- BAGELS: Benchmarking the Automated Generation and Extraction of Limitations from Scholarly TextIbrahim Al Azher, Miftahul Jannat Mokarrama, Zhishuai Guo, Sagnik Ray Choudhury, Hamed Alhoori. 19279-19294 [doi]
- Dense Retrievers Can Fail on Simple Queries: Revealing The Granularity Dilemma of EmbeddingsLiyan Xu, Zhenlin Su, Mo Yu, Jiangnan Li, Fandong Meng, Jie Zhou 0016. 19295-19305 [doi]
- Over-Generation and Compaction: A Prompting Strategy for Procedural Text Adaptation with Large Language ModelsHyeongSik Kim, Xu Yanheng, Chaoqun Dong, Fei Du. 19306-19337 [doi]
- TransBERT: A Framework for Synthetic Translation in Domain-Specific Language ModelingJulien Knafou, Luc Mottin, Anaïs Mottaz, Alexandre Flament, Patrick Ruch. 19338-19354 [doi]
- Beyond Fixed-Length Calibration for Post-Training Compression of LLMsJaehoon Oh, Dokwan Oh. 19355-19366 [doi]
- Attributes as Textual Genes: Leveraging LLMs as Genetic Algorithm Simulators for Conditional Synthetic Data GenerationGuangzeng Han, Weisi Liu, Xiaolei Huang. 19367-19389 [doi]
- ReCoVeR the Target Language: Language Steering without Sacrificing Task PerformanceHannah Sterz, Fabian David Schmidt, Goran Glavas, Ivan Vulic. 19390-19405 [doi]
- LC-Eval: A Bilingual Multi-Task Evaluation Benchmark for Long-Context UnderstandingSheikh Jubair, Arwa Omayrah, Amal Alshammari, Alhanoof Althnian, Abdulhamed Alothaimen, Norah A. Alzahrani, Shahad D. Alzaidi, Nora Al-Twairesh, AbdulMohsen Al-Thubaity. 19406-19439 [doi]
- OVFact: Measuring and Improving Open-Vocabulary Factuality for Long Caption ModelsMonika Wysoczanska, Shyamal Buch, Anurag Arnab, Cordelia Schmid. 19440-19457 [doi]
- GRPO-Guided Modality Selection Enhanced LoRA-Tuned LLMs for Multimodal Emotion RecognitionYang Chen, Shuwan Yang, Yan Xiang, Ran Song 0002, Yuxin Huang 0004, Zhengtao Yu 0001. 19458-19471 [doi]
- Defending against Indirect Prompt Injection by Instruction DetectionTongyu Wen, Chenglong Wang 0004, Xiyuan Yang, Haoyu Tang, Yueqi Xie, Lingjuan Lyu, Zhicheng Dou, Fangzhao Wu. 19472-19487 [doi]
- MUG-Eval: A Proxy Evaluation Framework for Multilingual Generation Capabilities in Any LanguageSeyoung Song, Seogyeong Jeong, Eunsu Kim, Jiho Jin, Dongkwan Kim 0001, Jay Shin, Alice Oh. 19488-19514 [doi]
- CAC-CoT: Connector-Aware Compact Chain-of-Thought for Efficient Reasoning Data Synthesis Across Dual-System Cognitive TasksSunguk Choi, YongHoon Kwon, Heondeuk Lee. 19515-19530 [doi]
- On the Versatility of Sparse Autoencoders for In-Context LearningIkhyun Cho, Gaeul Kwon, Julia Hockenmaier. 19531-19538 [doi]
- More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAGShahar Levy, Nir Mazor, Lihi Shalmon, Michael Hassid, Gabriel Stanovsky. 19539-19547 [doi]
- CLEAR: A Comprehensive Linguistic Evaluation of Argument Rewriting by Large Language ModelsThomas Huber 0001, Christina Niklaus. 19548-19568 [doi]
- ALRPHFS: Adversarially Learned Risk Patterns with Hierarchical Fast & Slow Reasoning for Robust Agent DefenseShiyu Xiang, Tong Zhang, Ronghao Chen. 19569-19587 [doi]
- Stop Playing the Guessing Game! Evaluating Conversational Recommender Systems via Target-free User SimulationSunghwan Kim, Kwangwook Seo, Tongyoung Kim, Jinyoung Yeo, Dongha Lee 0003. 19588-19605 [doi]
- Out-of-Context Reasoning in Large Language ModelsJonathan Shaki, Emanuele La Malfa, Michael J. Wooldridge, Sarit Kraus. 19606-19615 [doi]
- CodeComplex: Dataset for Worst-Case Time Complexity PredictionSeung-Yeop Baik, Joonghyuk Hahn, Jungin Kim, Aditi, Mingi Jeon, Yo-Sub Han, Sang-Ki Ko. 19616-19638 [doi]
- Weak2Wise: An Automated, Lightweight Framework for Weak-LLM-Friendly Reasoning SynthesisJianing Lin, Yuanfang Guo, Shunning Liu, Zeming Liu, Yunhong Wang 0001. 19639-19657 [doi]
- From Tower to Spire: Adding the Speech Modality to a Translation-Specialist LLMKshitij Ambilduke, Ben Peters, Sonal Sannigrahi, Anil Keshwani, Tsz Kin Lam, Bruno Martins 0001, André F. T. Martins, Marcely Zanon Boito. 19658-19673 [doi]
- LLM Agents at the Roundtable: A Multi-Perspective and Dialectical Reasoning Framework for Essay ScoringJinhee Jang, Ayoung Moon, Minkyoung Jung, Youngbin Kim, Seung-Jin Lee. 19674-19687 [doi]
- DeepNote: Note-Centric Deep Retrieval-Augmented GenerationRuobing Wang, Qingfei Zhao, Yukun Yan, Daren Zha, Yuxuan Chen, Shi Yu 0001, Zhenghao Liu 0001, Yixuan Wang, Shuo Wang 0013, Xu Han 0007, Zhiyuan Liu 0001, Maosong Sun 0001. 19688-19715 [doi]
- NormAL LoRA: What is the perfect size?Aastik, Topu Sai Meghana, Chinmay Prakash Kulkarni, Pragya Paramita Sahu. 19716-19731 [doi]
- Inclusive Leadership in the Age of AI: A Dataset and Comparative Study of LLMs vs. Real-Life Leaders in Workplace Action PlanningVindhya Singh, Sabine Schulte im Walde, Ksenia Keplinger. 19732-19753 [doi]
- Token Preference Optimization with Self-Calibrated Visual-Anchored Rewards for Hallucination MitigationJihao Gu, Yingyao Wang, Meng Cao, Pi Bu, Jun Song, Bo Zheng 0007, Yancheng He, Shilong Li. 19754-19767 [doi]
- EZ-VC: Easy Zero-shot Any-to-Any Voice ConversionAdvait Joglekar, Divyanshu Singh, Rooshil Rohit Bhatia, Srinivasan Umesh. 19768-19774 [doi]
- Length Representations in Large Language ModelsSangjun Moon, Dasom Choi, Jingun Kwon, Hidetaka Kamigaito, Manabu Okumura. 19775-19793 [doi]
- MultiLingPoT: Boosting Mathematical Reasoning in LLMs through Multilingual Program IntegrationNianqi Li, Zujie Liang, Siyu Yuan, Jiaqing Liang, Feng Wei, Yanghua Xiao. 19794-19811 [doi]
- Simulating Identity, Propagating Bias: Abstraction and Stereotypes in LLM-Generated TextPia Sommerauer, Giulia Rambelli, Tommaso Caselli. 19812-19831 [doi]
- Do LVLMs Know What They Know? A Systematic Study of Knowledge Boundary Perception in LVLMsZhikai Ding, Shiyu Ni, Keping Bi. 19832-19848 [doi]
- Benchmarking Large Language Models for Cryptanalysis and Side-Channel VulnerabilitiesUtsav Maskey, Chencheng Zhu, Usman Naseem. 19849-19865 [doi]
- MTabVQA: Evaluating Multi-Tabular Reasoning of Language Models in Visual SpaceAnshul Singh, Chris Biemann, Jan Strich. 19866-19891 [doi]
- TurnBench-MS: A Benchmark for Evaluating Multi-Turn, Multi-Step Reasoning in Large Language ModelsYiran Zhang, Mo Wang, Xiaoyang Li, Kaixuan Ren, Chencheng Zhu, Usman Naseem. 19892-19924 [doi]
- Assessing LLM Reasoning Steps via Principal Knowledge GroundingHyeon Hwang, Yewon Cho, Chanwoong Yoon, Yein Park, Minju Song, Kyungjae Lee 0002, Gangwoo Kim, Jaewoo Kang. 19925-19948 [doi]
- Stratified Selective Sampling for Instruction Tuning with Dedicated Scoring StrategyParamita Mirza, Lucas Weber, Fabian Küch. 19949-19974 [doi]
- CoTD-PO: Chain-of-Thought Distillation with Preference OptimizationLujie Niu, Haochen Sun, Fangkun Zhao, Sheng Chen, Zimeng Bai, Jiawei Zhang, Caixia Yuan, Xiaojie Wang. 19975-19986 [doi]
- Intelligent Document Parsing: Towards End-to-end Document Parsing via Decoupled Content Parsing and Layout GroundingHangdi Xing, Feiyu Gao, Qi Zheng 0002, Zhaoqing Zhu, Zirui Shao, Ming Yan 0008. 19987-19998 [doi]
- Feel the Difference? A Comparative Analysis of Emotional Arcs in Real and LLM-Generated CBT SessionsXiaoyi Wang, Jiwei Zhang, Guangtao Zhang, Honglei Guo. 19999-20017 [doi]
- Beyond Single-User Dialogue: Assessing Multi-User Dialogue State Tracking Capabilities of Large Language ModelsSangmin Song, Juhwan Choi, Jungmin Yun, Youngbin Kim. 20018-20029 [doi]
- All-in-one: Understanding and Generation in Multimodal Reasoning with the MAIA BenchmarkDavide Testa, Giovanni Bonetta, Raffaella Bernardi, Alessandro Bondielli, Alessandro Lenci, Alessio Miaschi, Lucia C. Passaro, Bernardo Magnini. 20030-20050 [doi]
- Triangulating LLM Progress through Benchmarks, Games, and Cognitive TestsFilippo Momentè, Alessandro Suglia, Mario Giulianelli, Ambra Ferrari, Alexander Koller, Oliver Lemon, David Schlangen, Raquel Fernández, Raffaella Bernardi. 20051-20072 [doi]
- Entity Profile Generation and Reasoning with LLMs for Entity AlignmentRumana Ferdous Munne, Md. Mostafizur Rahman, Yuji Matsumoto 0001. 20073-20086 [doi]
- Re-FRAME the Meeting Summarization SCOPE: Fact-Based Summarization and Personalization via QuestionsFrederic Kirstein, Sonu Kumar, Terry Ruas, Bela Gipp. 20087-20137 [doi]
- Attack as Defense: Safeguarding Large Vision-Language Models from Jailbreaking by Adversarial AttacksChongxin Li, Hanzhang Wang, Yuchun Fang. 20138-20152 [doi]
- Emphasising Structured Information: Integrating Abstract Meaning Representation into LLMs for Enhanced Open-Domain Dialogue EvaluationBohao Yang, Kun Zhao 0007, Dong Liu 0026, Chen Tang, Liang Zhan, Chenghua Lin. 20153-20169 [doi]
- Differentiated Vision: Unveiling Entity-Specific Visual Modality Requirements for Multimodal Knowledge GraphMinghang Liu, Yinghan Shen, Zihe Huang, Yuanzhuo Wang, Xuhui Jiang, Huawei Shen. 20170-20183 [doi]
- Post Persona Alignment for Multi-Session Dialogue GenerationYi-Pei Chen 0001, Noriki Nishida, Hideki Nakayama, Yuji Matsumoto 0001. 20184-20192 [doi]
- MASSIVE-Agents: A Benchmark for Multilingual Function-Calling in 52 LanguagesMayank Kulkarni, Vittorio Mazzia, Judith Gaspers, Chris Hench, Jack FitzGerald. 20193-20215 [doi]
- Crafting Customisable Characters with LLMs: A Persona-Driven Role-Playing Agent FrameworkBohao Yang, Dong Liu 0026, Chenghao Xiao, Kun Zhao 0007, Chen Tang, Chao Li, Lin Yuan, Yang Guang, Chenghua Lin. 20216-20240 [doi]
- Can LLMs Express Personality Across Cultures? Introducing CulturalPersonas for Evaluating Trait AlignmentPriyanka Dey, Aayush Bothra, Yugal Khanter, Jieyu Zhao 0001, Emilio Ferrara. 20241-20262 [doi]
- Exploring the Hidden Reasoning Process of Large Language Models by Misleading ThemGuanyu Chen, Peiyang Wang, Yizhou Jiang, Yuqian Liu, Chujie Zhao, Ying Fang, Tianren Zhang, Feng Chen 0007. 20263-20278 [doi]
- When Models Reason in Your Language: Controlling Thinking Language Comes at the Cost of AccuracyJirui Qi, Shan Chen, Zidi Xiong, Raquel Fernández, Danielle S. Bitterman, Arianna Bisazza. 20279-20296 [doi]
- The Role of Model Confidence on Bias Effects in Measured Uncertainties for Vision-Language ModelsXinyi Liu, Weiguang Wang, Hangfeng He 0001. 20297-20313 [doi]
- GAttention: Gated Attention for the Detection of Abusive LanguageHoracio Jesús Jarquín-Vásquez, Hugo Jair Escalante, Manuel Montes, Mario Ezra Aragón. 20314-20329 [doi]
- Towards Low-Resource Alignment to Diverse Perspectives with Sparse FeedbackChu Fei Luo, Samuel Dahan, Xiaodan Zhu. 20330-20339 [doi]
- ProtoXTM: Cross-Lingual Topic Modeling with Document-Level Prototype-based Contrastive LearningSeung-Won Seo, Soon-Sun Kwon. 20340-20354 [doi]
- One More Question is Enough, Expert Question Decomposition (EQD) Model for Domain Quantitative ReasoningMengyu Wang, Sotirios Sabanis, Miguel De Carvalho, Shay B. Cohen, Tiejun Ma. 20355-20369 [doi]
- When Punctuation Matters: A Large-Scale Comparison of Prompt Robustness Methods for LLMsMikhail Seleznyov, Mikhail Chaichuk, Gleb Ershov, Alexander Panchenko, Elena Tutubalina, Oleg Somov. 20370-20385 [doi]
- RAR²: Retrieval-Augmented Medical Reasoning via Thought-Driven RetrievalKaishuai Xu, Wenjun Hou, Yi Cheng, Wenjie Li 0002. 20386-20396 [doi]
- The Security Threat of Compressed Projectors in Large Vision-Language ModelsYudong Zhang 0008, Ruobing Xie, Xingwu Sun, Jiansheng Chen 0001, Zhanhui Kang, Di Wang 0052, Yu Wang 0002. 20397-20407 [doi]
- NarratEX Dataset: Explaining the Dominant Narratives in News TextsNuno Guimarães, Purificação Silvano, Ricardo Campos 0001, Alípio Mário Jorge, Ana Filipa Pacheco, Dimitar Iliyanov Dimitrov, Nikolaos Nikolaidis 0004, Roman Yangarber, Elisa Sartori, Nicolas Stefanovitch, Preslav Nakov, Jakub Piskorski, Giovanni Da San Martino. 20408-20434 [doi]
- Radical Allomorphy: Phonological Surface Forms without PhonologySalam Khalifa, Nizar Habash, Owen Rambow. 20435-20441 [doi]
- Model Calibration for Emotion DetectionMihaela Petre-Vlad, Cornelia Caragea, Florentina Hristea. 20442-20457 [doi]
- From Benchmark to Better Embeddings: Leveraging Synonym Substitution to Enhance Multimodal Models in UkrainianVolodymyr Mudryi, Yurii Laba. 20458-20468 [doi]
- Context Copying Modulation: The Role of Entropy Neurons in Managing Parametric and Contextual Knowledge ConflictsZineddine Tighidet, Andrea Mogini, Hédi Ben-Younes, Jiali Mei, Patrick Gallinari, Benjamin Piwowarski. 20469-20481 [doi]
- A Generalizable Rhetorical Strategy Annotation Model Using LLM-based Debate Simulation and LabellingShiyu Ji, Farnoosh Hashemi, Joice Chen, Juanwen Pan, Weicheng Ma, Hefan Zhang 0001, Sophia Pan, Ming Cheng 0004, Shubham Mohole, Saeed Hassanpour, Soroush Vosoughi, Michael Macy. 20482-20503 [doi]
- SecDecoding: Steerable Decoding for Safer LLM GenerationJiaYou Wang, Rundong Liu, Yue Hu, Huijia Wu, Zhaofeng He. 20504-20521 [doi]
- GENUINE: Graph Enhanced Multi-level Uncertainty Estimation for Large Language ModelsTuo Wang, Adithya Kulkarni, Tyler Cody, Peter A. Beling, Yujun Yan, Dawei Zhou 0003. 20522-20541 [doi]
- ReviewEval: An Evaluation Framework for AI-Generated ReviewsMadhav Krishan Garg, Tejash Prasad, Tanmay Singhal, Chhavi Kirtani, Murari Mandal, Dhruv Kumar 0001. 20542-20564 [doi]
- Overcoming Black-box Attack Inefficiency with Hybrid and Dynamic Select AlgorithmsAbhinay Shankar Belde, Rohit Ramkumar, Jonathan Rusert. 20565-20598 [doi]
- GmSLM: Generative Marmoset Spoken Language ModelingTalia Sternberg, Michael London, David Omer, Yossi Adi. 20599-20618 [doi]
- QA-LIGN: Aligning LLMs through Constitutionally Decomposed QAJacob Dineen, Aswin RRV, Qin Liu 0010, Zhikun Xu, Xiao-ye, Ming Shen 0006, Zhaonan Li, Shijie Lu, Chitta Baral, Muhao Chen 0001, Ben Zhou. 20619-20642 [doi]
- Characterizing Positional Bias in Large Language Models: A Multi-Model Evaluation of Prompt Order EffectsPatrick Schilcher, Dominik Karasin, Michael Schöpf, Haisam Saleh, Antonela Tommasel, Markus Schedl. 20643-20664 [doi]
- You Only Use Reactive Attention Slice When Retrieving From Long ContextYun Joon Soh, Hanxian Huang, Yuandong Tian, Jishen Zhao. 20665-20686 [doi]
- Fine-Tuned Thoughts: Leveraging Chain-of-Thought Reasoning for Industrial Asset Health MonitoringShuxin Lin, Dhaval C. Patel 0002, Christodoulos Constantinides. 20687-20700 [doi]
- CoViPAL: Layer-wise Contextualized Visual Token Pruning for Large Vision-Language ModelsZicong Tang, Ziyang Ma, Suqing Wang, Zuchao Li, Lefei Zhang, Hai Zhao 0001, Yun Li 0011, Qianren Wang. 20701-20714 [doi]
- Large Language Models with Temporal Reasoning for Longitudinal Clinical Summarization and PredictionMaya Kruse, Shiyue Hu, Nicholas Derby, Yifu Wu, Samantha Stonbraker, Bingsheng Yao, Dakuo Wang, Elizabeth M. Goldberg, Yanjun Gao. 20715-20735 [doi]
- TransAlign: Machine Translation Encoders are Strong Word Aligners, TooBenedikt Ebing, Christian Goldschmied, Goran Glavas. 20736-20749 [doi]
- Pruning Weights but Not Truth: Safeguarding Truthfulness While Pruning LLMsYao Fu, Runchao Li, Xianxuan Long, Haotian Yu, Xiaotian Han, Yu Yin, Pan Li 0001. 20750-20768 [doi]
- Augment before You Try: Knowledge-Enhanced Table Question Answering via Table ExpansionYujian Liu, Jiabao Ji, Tong Yu 0001, Ryan A. Rossi, SungChul Kim, Handong Zhao, Ritwik Sinha, Yang Zhang 0001, Shiyu Chang. 20769-20786 [doi]
- Evaluating Large Language Models for Belief Inference: Mapping Belief Networks at ScaleTrisevgeni Papakonstantinou, Antonina Zhiteneva, Ana Yutong Ma, Derek Powell, Zachary Horne. 20787-20795 [doi]
- Distinguishing fair from unfair compositional generalization tasksAhmad Jabbar, Cleo Condoravdi, Christopher Potts. 20796-20807 [doi]
- SA-CLIP: Language Guided Image Spatial and Action Feature LearningGuanlin Li, Wenhao Shao, Praboda Rajapaksha, Noël Crespi. 20808-20814 [doi]
- Inefficiencies of Meta Agents for Agent DesignBatu El, Mert Yüksekgönül, James Zou 0001. 20815-20824 [doi]
- SCoder: Progressive Self-Distillation for Bootstrapping Small-Scale Data Synthesizers to Empower Code LLMsXinyu Zhang, Changzhi Zhou, Linmei Hu, Luhao Zhang, Xiancai Chen, Haomin Fu, Yang Yang, Mengdi Zhang. 20825-20841 [doi]
- Linguistically-Controlled Paraphrase GenerationMohamed Elgaar, Hadi Amiri. 20842-20864 [doi]
- LAWCAT: Efficient Distillation from Quadratic to Linear Attention with Convolution across Tokens for Long Context ModelingZeyu Liu 0003, Souvik Kundu 0002, Lianghao Jiang, Anni Li, Srikanth Ronanki, Sravan Babu Bodapati, Gourav Datta, Peter Anthony Beerel. 20865-20881 [doi]
- Analyzing Dialectical Biases in LLMs for Knowledge and Reasoning BenchmarksEileen Pan, Anna Seo Gyeong Choi, Maartje ter Hoeve, Skyler Seto, Allison Koenecke. 20882-20893 [doi]
- TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N SamplingJiahao Qiu, Yifu Lu, Yifan Zeng, Jiacheng Guo, Jiayi Geng, Chenhao Zhu, Xinzhe Juan, Ling Yang 0006, Huazheng Wang, Kaixuan Huang, Yue Wu, Mengdi Wang 0001. 20894-20917 [doi]
- CulturalFrames: Assessing Cultural Expectation Alignment in Text-to-Image Models and Evaluation MetricsShravan Nayak, Mehar Bhatia, Xiaofeng Zhang, Verena Rieser, Lisa Anne Hendricks, Sjoerd van Steenkiste, Yash Goyal, Karolina Stanczak, Aishwarya Agrawal. 20918-20953 [doi]
- Decoupled Proxy Alignment: Mitigating Language Prior Conflict for Multimodal Alignment in MLLMsChenkun Tan, Pengyu Wang 0006, Shaojun Zhou, Botian Jiang, Zhaowei Li, Dong Zhang, Xinghao Wang, Yaqian Zhou 0001, Xipeng Qiu. 20954-20970 [doi]
- Riemannian Optimization for LoRA on the Stiefel ManifoldJuneyoung Park, Minjae Kang, Seongbae Lee, Haegang Lee, Seongwan Kim, Jaeho Lee. 20971-20985 [doi]
- How Real Are Synthetic Therapy Conversations? Evaluating Fidelity in Prolonged Exposure DialoguesSuhas BN, Dominik Mattioli, Andrew M. Sherrill, Rosa I. Arriaga, Christopher W. Wiese, Saeed Abdullah. 20986-20995 [doi]
- Large Language Models for Controllable Multi-property Multi-objective Molecule OptimizationVishal Dey, Xiao Hu, Xia Ning. 20996-21023 [doi]
- Measuring Lexical Diversity of Synthetic Data Generated through Fine-Grained Persona PromptingGauri Kambhatla, Chantal Shaib, Venkata Govindarajan. 21024-21033 [doi]
- Beyond Function-Level Search: Repository-Aware Dual-Encoder Code Retrieval with Adversarial VerificationAofan Liu, Shiyuan Song, Haoxuan Li, Cehao Yang, Yiyan Qi. 21034-21049 [doi]
- Watermark under Fire: A Robustness Evaluation of LLM WatermarkingJiacheng Liang, Zian Wang, Spencer Hong, Shouling Ji, Ting Wang 0006. 21050-21074 [doi]
- PEPE: Long-context Extension for Large Language Models via Periodic Extrapolation Positional EncodingsJikun Hu, Dongsheng Guo, Yuli Liu, Qingyao Ai, Lixuan Wang, Xuebing Sun, Qilei Zhang, Quan Zhou, Cheng Luo 0001. 21075-21085 [doi]
- Beyond Self-Reports: Multi-Observer Agents for Personality Assessment in Large Language ModelsYin Jou Huang, Rafik Hadfi. 21086-21101 [doi]
- Controlled Retrieval-augmented Context Evaluation for Long-form RAGJia-Huei Ju, Suzan Verberne, Maarten de Rijke, Andrew Yates. 21102-21121 [doi]
- Humanity's Last Code Exam: Can Advanced LLMs Conquer Human's Hardest Code Competition?Xiangyang Li 0004, Xiaopeng Li 0014, Kuicai Dong, Quanhu Zhang, Rongju Ruan, Xinyi Dai, Yasheng Wang, Ruiming Tang. 21122-21137 [doi]
- False Friends Are Not Foes: Investigating Vocabulary Overlap in Multilingual Language ModelsJulie Kallini, Dan Jurafsky, Christopher Potts, Martijn Bartelds. 21138-21154 [doi]
- Rule-Guided Extraction: A Hierarchical Rule Optimization Framework for Document-Level Event Argument ExtractionYue Zuo, Yuxiao Fei, Wanting Ning, Jiayi Huang, Yubo Feng, Lishuang Li. 21155-21171 [doi]
- SOPL: A Sequential Optimal Learning Approach to Automated Prompt Engineering in Large Language ModelsShuyang Wang, Somayeh Moazeni, Diego Klabjan. 21172-21185 [doi]
- CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse UpcyclingXinze Wang, Chen Chen, Yinfei Yang, Hong-You Chen, Bowen Zhang, Aditya Pal, XiangXin Zhu, Xianzhi Du. 21186-21200 [doi]
- A Category-Theoretic Approach to Neural-Symbolic Task Planning with Bidirectional SearchShuhui Qu, Jie Wang 0006, Kincho Law. 21201-21225 [doi]
- HEAL: An Empirical Study on Hallucinations in Embodied Agents Driven by Large Language ModelsTrishna Chakraborty, Udita Ghosh, Xiaopan Zhang, Fahim Faisal Niloy, Yue Dong 0002, Jiachen Li, Amit Roy-Chowdhury 0001, Chengyu Song. 21226-21243 [doi]
- Can LLMs Judge Debates? Evaluating Non-Linear Reasoning via Argumentation Theory SemanticsReza Sanayei, Srdjan Vesic, Eduardo Blanco 0002, Mihai Surdeanu. 21244-21262 [doi]
- How Jailbreak Defenses Work and Ensemble? A Mechanistic InvestigationZhuohan Long, Siyuan Wang, Shujun Liu, Yuhang Lai. 21263-21290 [doi]
- Visual Self-Refinement for Autoregressive ModelsJiamian Wang, Ziqi Zhou, Chaithanya Kumar Mummadi, Sohail A. Dianat, Majid Rabbani, Raghuveer Rao, Chen Qiu, Zhiqiang Tao. 21291-21300 [doi]
- Retrieval-Augmented Language Models are Mimetic Theorem ProversWenjie Yang 0006, Ruiyuan Huang, Jiaxing Guo, Zicheng Lyu, Tongshan Xu, Shengzhong Zhang, Lun Du, Da Zheng 0004, Zengfeng Huang. 21301-21313 [doi]
- LORE: Continual Logit Rewriting Fosters Faithful GenerationCharles Yu, Qingyun Wang 0005, Yuting Hu, Jinjun Xiong, Heng Ji 0001. 21314-21328 [doi]
- PRINCIPLES: Synthetic Strategy Memory for Proactive Dialogue AgentsNamyoung Kim, Kai Tzu-iunn Ong, Yeonjun Hwang, Minseok Kang, Iiseo Jihn, GaYoung Kim, Minju Kim, Jinyoung Yeo. 21329-21368 [doi]
- SLM-Bench: A Comprehensive Benchmark of Small Language Models on Environmental ImpactsNghiem Thanh Pham, Tung Kieu, Duc Manh Nguyen, Ha Xuan Son, Nghia Duong-Trung, Danh Le Phuoc. 21369-21392 [doi]
- A Decoupled Multi-Agent Framework for Complex Text Style TransferLingxi Zhang, Yu-Neng Chuang, Guanchu Wang, Ruixiang Tang, Xuanting Cai, Rajesh Shenoy, Xia Hu 0001. 21393-21403 [doi]
- Mamba Drafters for Speculative DecodingDaewon Choi, Seunghyuk Oh, Saket Dingliwal, Jihoon Tack, Kyuyoung Kim, Woomin Song, Seojin Kim, Insu Han, Jinwoo Shin, Aram Galstyan, Shubham Katiyar, Sravan Babu Bodapati. 21404-21418 [doi]
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid ArchitectureXidong Wang, Dingjie Song, Shunian Chen, Junying Chen, Zhenyang Cai, Chen Zhang 0020, Lichao Sun 0001, Benyou Wang. 21419-21436 [doi]
- Think Clearly: Improving Reasoning via Redundant Token PruningDaewon Choi, Jimin Lee, Jihoon Tack, Woomin Song, Saket Dingliwal, Sai Muralidhar Jayanthi, Bhavana Ganesh, Jinwoo Shin, Aram Galstyan, Sravan Babu Bodapati. 21437-21451 [doi]
- A Systematic Survey of Claim Verification: Corpora, Systems, and Case StudiesZhaxi Zerong, Chenxi Li, Xinyi Liu, Ju-hui Chen, Fei Xia. 21452-21474 [doi]
- Automated Creativity Evaluation for Large Language Models: A Reference-Based ApproachRuizhe Li, Chiwei Zhu, Benfeng Xu, Xiaorui Wang, Zhendong Mao 0001. 21475-21488 [doi]
- LangProBe: a Language Program BenchmarkShangyin Tan, Lakshya A. Agrawal, Arnav Singhvi, Liheng Lai, Michael J. Ryan, Daniel Klein 0001, Omar Khattab, Koushik Sen, Matei Zaharia. 21489-21509 [doi]
- Exploring and Detecting Self-disclosure in Multi-modal posts on Chinese Social MediaJingbao Luo, Ming Liu, Aoli Huo, Fujing Hu, Gang Li, Wu Peng. 21510-21527 [doi]
- MV-CLAM: Multi-View Molecular Interpretation with Cross-Modal Projection via Language ModelSumin Ha, Jun Hyeong Kim, Yinhua Piao, Changyun Cho, Sun Kim. 21528-21549 [doi]
- Mind the Style Gap: Meta-Evaluation of Style and Attribute Transfer MetricsAmalie Brogaard Pauli, Isabelle Augenstein, Ira Assent. 21550-21564 [doi]
- ExtremeAIGC: Benchmarking LMM Vulnerability to AI-Generated Extremist ContentBhavik Chandna, Mariam Aboujenane, Usman Naseem. 21565-21579 [doi]
- Data Augmentation for Maltese NLP using Transliterated and Machine Translated Arabic DataKurt Micallef, Nizar Habash, Claudia Borg. 21580-21590 [doi]
- Do LLMs Align Human Values Regarding Social Biases? Judging and Explaining Social Biases with LLMsYang Liu, Chenhui Chu. 21591-21628 [doi]
- CoEx - Co-evolving World-model and ExplorationMinsoo Kim, Seung-won Hwang. 21629-21651 [doi]
- BrainLoc: Brain Signal-Based Object Detection with Multi-modal AlignmentJiaqi Duan, Xiaoda Yang, Kaixuan Luan, Hongshun Qiu, Weicai Yan, Xueyi Zhang 0004, Youliang Zhang, Zhaoyang Li, Donglin Huang, Junyu Lu, Ziyue Jiang 0001, Xifeng Yang. 21652-21662 [doi]
- PVTNL: Prompting Vision Transformers with Natural Language for Generalizable Person Re-identificationNing Wang, Lei Xie 0004, Sanglu Lu, Shiwei Gan. 21663-21674 [doi]
- RingFormer: Rethinking Recurrent Transformer with Adaptive Level SignalsJaemu Heo, Eldor Fozilov, Hyunmin Song, Taehwan Kim. 21675-21686 [doi]
- TriSPrompt: A Hierarchical Soft Prompt Model for Multimodal Rumor Detection with Incomplete ModalitiesJiajun Chen, Yangyang Wu, Xiaoye Miao, Mengying Zhu, Meng Xi 0002. 21687-21699 [doi]
- Evaluating Uncertainty Quantification Methods in Argumentative Large Language ModelsKevin Zhou, Adam Dejl, Gabriel Freedman, Lihu Chen, Antonio Rago 0001, Francesca Toni. 21700-21711 [doi]
- CLAIMCHECK: How Grounded are LLM Critiques of Scientific Papers?Jiefu Ou, William Gantt Walden, Kate Sanders 0002, Zhengping Jiang, Kaiser Sun, Jeffrey Cheng, William Jurayj, Miriam Wanner, Shaobo Liang, Candice Morgan, Seunghoon Han, Weiqi Wang 0001, Chandler May, Hannah Recknor, Daniel Khashabi, Benjamin Van Durme. 21712-21735 [doi]
- From Noise to Clarity: Filtering Real and LLM-Generated Samples for Enhanced Intent DetectionJunbao Huang, Weizhen Li, Peijie Huang, Yuhong Xu. 21736-21746 [doi]
- Improving Language Model Personas via Rationalization with Psychological ScaffoldsBrihi Joshi, Xiang Ren 0001, Swabha Swayamdipta, Rik Koncel-Kedziorski, Tim Paek. 21747-21770 [doi]
- KBM: Delineating Knowledge Boundary for Adaptive Retrieval in Large Language ModelsZhen Zhang, Xinyu Wang 0013, Yong Jiang 0005, Zile Qiao, Zhuo Chen, Guangyu Li, Feiteng Mu, Mengting Hu, Pengjun Xie, Fei Huang 0002. 21771-21782 [doi]
- TABARD: A Novel Benchmark for Tabular Anomaly Analysis, Reasoning and DetectionManan Roy Choudhury, Anirudh Iyengar Kaniyar Narayana Iyengar, Shikhhar Siingh, Sugeeth Puranam, Vivek Gupta 0001. 21783-21817 [doi]
- Aspect-based Sentiment Analysis via Synthetic Image GenerationGe Chen, Zhongqing Wang, Guodong Zhou. 21818-21829 [doi]
- IntrEx: A Dataset for Modeling Engagement in Educational ConversationsXingwei Tan, Mahathi Parvatham, Chiara Gambi, Gabriele Pergola. 21830-21845 [doi]
- Bridging the Capability Gap: Joint Alignment Tuning for Harmonizing LLM-based Multi-Agent SystemsMinghang Zhu, Zhengliang Shi, Zhiwei Xu 0005, Shiguang Wu 0003, Lingjie Wang, Pengjie Ren, Zhaochun Ren, Zhumin Chen. 21846-21861 [doi]
- Safety Through Reasoning: An Empirical Study of Reasoning Guardrail ModelsMakesh Narsimhan Sreedhar, Traian Rebedea, Christopher Parisien. 21862-21880 [doi]
- Context-Aware Reasoning On Parametric Knowledge for Inferring Causal VariablesIvaxi Sheth, Sahar Abdelnabi, Mario Fritz. 21881-21918 [doi]
- LoRE-Merging: Exploring Low-Rank Estimation For Large Language Model MergingZehua Liu, Han Wu 0004, Yuxuan Yao, Xiaojin Fu, Ruifeng She, Xiongwei Han, Tao Zhong 0004, Mingxuan Yuan. 21919-21926 [doi]
- Benchmarking Foundation Models with Retrieval-Augmented Generation in Olympic-Level Physics Problem SolvingShunfeng Zheng, Yudi Zhang 0006, Meng Fang, Zihan Zhang, Zhitan Wu, Mykola Pechenizkiy, Ling Chen 0006. 21927-21956 [doi]
- FiRST: Finetuning Router-Selective Transformers for Input-Adaptive Latency ReductionAkriti Jain 0001, Saransh Sharma, Koyel Mukherjee, Soumyabrata Pal. 21957-21975 [doi]
- PolitiSky24: U.S. Political Bluesky Dataset with User Stance LabelsPeyman Rostami, Vahid Rahimzadeh, Ali Adibi, Azadeh Shakery. 21976-21993 [doi]
- From Ground Trust to Truth: Disparities in Offensive Language Judgments on Contemporary Korean Political DiscourseSeunguk Yu, Jungmin Yun, Jinhee Jang, Youngbin Kim. 21994-22014 [doi]
- Misalignment Attack on Text-to-Image Models via Text Embedding Optimization and InversionZhijie Du, Daizong Liu, Pan Zhou 0001. 22015-22032 [doi]
- Domain Pre-training Impact on RepresentationsCésar González-Gutiérrez, Ariadna Quattoni. 22033-22049 [doi]
- KoACD: The First Korean Adolescent Dataset for Cognitive Distortion Analysis via Role-Switching Multi-LLM NegotiationJun Seo Kim, Hye Hyeon Kim. 22050-22078 [doi]
- Refined Assessment for Translation Evaluation: Rethinking Machine Translation Evaluation in the Era of Human-Level SystemsDmitry Popov, Vladislav Negodin, Ekaterina Enikeeva, Iana Matrosova, Nikolay Karpachev, Max Ryabinin. 22079-22095 [doi]
- Pre-Storage Reasoning for Episodic Memory: Shifting Inference Burden to Memory for Personalized DialogueSangyeop Kim, Yohan Lee, Sanghwa Kim, Hyunjong Kim, Sungzoon Cho. 22096-22113 [doi]
- Temporal Consistency for LLM Reasoning Process Error IdentificationJiacheng Guo, Yue Wu, Jiahao Qiu, Kaixuan Huang, Xinzhe Juan, Ling Yang 0006, Mengdi Wang 0001. 22114-22129 [doi]
- Quantifying Compositionality of Classic and State-of-the-Art EmbeddingsZhijin Guo, Chenhao Xue, Zhaozhen Xu, Hongbo Bo, Yuxuan Ye, Janet B. Pierrehumbert, Martha Lewis. 22130-22146 [doi]
- Presumed Cultural Identity: How Names Shape LLM ResponsesSiddhesh Pawar, Arnav Arora, Lucie-Aimée Kaffee, Isabelle Augenstein. 22147-22172 [doi]
- I-GUARD: Interpretability-Guided Parameter Optimization for Adversarial DefenseMamta Mamta, Oana Cocarascu. 22173-22188 [doi]
- DecoupledESC: Enhancing Emotional Support Generation via Strategy-Response Decoupled Preference OptimizationChao Zhang, Xin Shi, Xueqiao Zhang, Yifan Zhu, Yi Yang 0001, Yawei Luo. 22189-22215 [doi]
- Local Normalization Distortion and the Thermodynamic Formalism of Decoding Strategies for Large Language ModelsTom Kempton, Stuart Burrell. 22216-22231 [doi]
- BRIT: Bidirectional Retrieval over Unified Image-Text GraphAinulla Khan, Moyuru Yamada, Akella Srinidhi. 22232-22248 [doi]
- ReTAG: Retrieval-Enhanced, Topic-Augmented Graph-Based Global SensemakingBoyoung Kim, Dosung Lee, Sumin An, Jinseong Jeong, Paul Hongsuck Seo. 22249-22277 [doi]
- Capturing Latent Modal Association For Multimodal Entity AlignmentYongquan Ji, Jingwei Cheng, Fu Zhang 0001, Chenglong Lu. 22278-22293 [doi]
- Explaining novel senses using definition generation with open language modelsMariia Fedorova, Andrey Kutuzov, Francesco Periti, Yves Scherrer. 22294-22302 [doi]
- Can Code-Switched Texts Activate a Knowledge Switch in LLMs? A Case Study on English-Korean Code-SwitchingSeoyeon Kim, Huiseo Kim, Chanjun Park, Jinyoung Yeo, Dongha Lee 0003. 22303-22327 [doi]
- Compositional Translation: A Novel LLM-based Approach for Low-resource Machine TranslationArmel Randy Zebaze, Benoît Sagot, Rachel Bawden. 22328-22357 [doi]
- TopXGen: Topic-Diverse Parallel Data Generation for Low-Resource Machine TranslationArmel Randy Zebaze, Benoît Sagot, Rachel Bawden. 22358-22381 [doi]
- Fast, Not Fancy: Rethinking G2P with Rich Data and Statistical ModelsMahta Fetrat Qharabagh, Zahra Dehghanian, Hamid R. Rabiee 0001. 22382-22408 [doi]
- Personalized open world plan generation for safety-critical human centered autonomous systems: A case study on Artificial PancreasAyan Banerjee 0001, Sandeep Gupta. 22409-22422 [doi]
- CaMMT: Benchmarking Culturally Aware Multimodal Machine TranslationEmilio Villa-Cueva, Sholpan Bolatzhanova, Diana Turmakhan, Kareem Elzeky, Henok Biadglign Ademtew, Alham Fikri Aji, Vladimir Araujo, Israel Abebe Azime, Jinheon Baek, Frederico Belcavello, Fermin Cristobal, Jan Christian Blaise Cruz, Mary Dabre, Raj Dabre, Toqeer Ehsan, Naome A. Etori, Fauzan Farooqui, Jiahui Geng, Guido Ivetta, Thanmay Jayakumar, Soyeong Jeong, Zheng Wei Lim, Aishik Mandal, Sofía Martinelli, Mihail Minkov Mihaylov, Daniil Orel, Aniket Pramanick, Sukannya Purkayastha, Israfel Salazar, Haiyue Song, Tiago Timponi Torrent, Debela Desalegn Yadeta, Injy Hamed, Atnafu Lambebo Tonja, Thamar Solorio. 22423-22441 [doi]
- Training Text-to-Molecule Models with Context-Aware TokenizationSeojin Kim, Hyeontae Song, Jaehyun Nam, Jinwoo Shin. 22442-22460 [doi]
- Challenging the Evaluator: LLM Sycophancy Under User RebuttalSung Won Kim, Daniel Khashabi. 22461-22478 [doi]
- Perspective-driven Preference Optimization with Entropy Maximization for Diverse Argument GenerationYilin Cao, Ruike Zhang, Penghui Wei, Qingchao Kong, Wenji Mao. 22479-22496 [doi]
- Spoken Document Retrieval for an Unwritten Language: A Case Study on GormatiSanjay Booshanam, Kelly Chen, Ondrej Klejch, Thomas Reitmaier, Dani Kalarikalayil Raju, Electra Wallington, Nina Markl, Jennifer Pearson 0001, Matt Jones 0001, Simon Robinson 0001, Peter Bell 0001. 22497-22509 [doi]
- M-Help: Using Social Media Data to Detect Mental Health Help-Seeking SignalsMSVPJ Sathvik, Zuhair Hasan Shaik, Vivek Gupta. 22510-22520 [doi]
- Brittle Minds, Fixable Activations: Understanding Belief Representations in Language ModelsMatteo Bortoletto, Constantin Ruhdorfer, Lei Shi 0032, Andreas Bulling. 22521-22543 [doi]
- Golden Touchstone: A Comprehensive Bilingual Benchmark for Evaluating Financial Large Language ModelsXiaojun Wu, Junxi Liu, Huan-Yi Su, Zhouchi Lin, Yiyan Qi, Chengjin Xu, Jiajun Su, Jiajie Zhong, Fuwei Wang, Saizhuo Wang, Fengrui Hua, Jia Li, Jian Guo 0016. 22544-22560 [doi]
- Quantifying the Risks of LLM- and Tool-assisted Rephrasing to Linguistic DiversityMengying Wang, Andreas Spitz. 22561-22574 [doi]
- NUMINA: A Natural Understanding Benchmark for Multi-dimensional Intelligence and Numerical Reasoning AbilitiesChangyu Zeng, Yifan Wang, Zimu Wang, Wei Wang 0042, Zhengni Yang, Muyi Bao, Jimin Xiao, Anh Nguyen 0003, Yutao Yue. 22575-22590 [doi]
- MoMentS: A Comprehensive Multimodal Benchmark for Theory of MindEmilio Villa-Cueva, S. M. Masrur Ahmed, Rendi Chevi, Jan Christian Blaise Cruz, Kareem Elzeky, Fermin Cristobal, Alham Fikri Aji, Skyler Wang, Rada Mihalcea, Thamar Solorio. 22591-22611 [doi]
- Code Like Humans: A Multi-Agent Solution for Medical CodingAndreas Geert Motzfeldt, Joakim Edin, Casper L. Christensen, Christian Hardmeier, Lars Maaløe, Anna Rogers. 22612-22627 [doi]
- Can Out-of-Distribution Evaluations Uncover Reliance on Prediction Shortcuts? A Case Study in Question AnsweringMichal Stefánik, Timothee Mickus, Michal Spiegel, Marek Kadlcík, Josef Kuchar. 22628-22635 [doi]
- MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert AggregationShoubin Yu, Yue Zhang, Ziyang Wang, Jaehong Yoon, Mohit Bansal. 22636-22652 [doi]
- Lifelong Knowledge Editing requires Better RegularizationAkshat Gupta, Phudish Prateepamornkul, Maochuan Lu, Ahmed M. Alaa, Thomas Hartvigsen, Gopala Anumanchipalli. 22653-22675 [doi]
- Lost in Embeddings: Information Loss in Vision-Language ModelsWenyan Li 0001, Raphael Tang, Chengzu Li, Caiqi Zhang, Ivan Vulic, Anders Søgaard. 22676-22693 [doi]
- Assessing the Role of Data Quality in Training Bilingual Language ModelsSkyler Seto, Maartje ter Hoeve, Maureen de Seyssel, David Grangier. 22694-22720 [doi]
- DORM: Preference Data Weights Optimization for Reward Modeling in LLM AlignmentRongzhi Zhang, Chenwei Zhang, Xinyang Zhang, Liang Qiu, Haoming Jiang, Yuchen Zhuang, Qingru Zhang, Hyokun Yun, Xian Li, Bing Yin, Tuo Zhao, Chao Zhang 0014. 22721-22739 [doi]
- Enhancing Domain-Specific Encoder Models with LLM-Generated Data: How to Leverage Ontologies, and How to Do Without ThemMarc Felix Brinner, Tarek Al Mustafa, Sina Zarrieß. 22740-22754 [doi]
- Aligning Dialogue Agents with Global Feedback via Large Language Model Multimodal Reward DecompositionDon (Dong Won) Lee, Hae Won Park, Cynthia Breazeal, Louis-Philippe Morency. 22755-22787 [doi]
- UrduFactCheck: An Agentic Fact-Checking Framework for Urdu with Evidence Boosting and BenchmarkingSarfraz Ahmad, Hasan Iqbal, Momina Ahsan, Numaan Naeem, Muhammad Ahsan Riaz Khan, Arham Riaz, Muhammad Arslan Manzoor, Yuxia Wang 0003, Preslav Nakov. 22788-22802 [doi]
- Echoes of Agreement: Argument Driven Sycophancy in Large Language modelsAvneet Kaur. 22803-22812 [doi]
- Rethinking NLP for Chemistry: A Critical Look at the USPTO BenchmarkDerin Ozer, Nicolas Gutowski, Benoit Da Mota, Thomas Cauchy, Sylvain Lamprier. 22813-22825 [doi]
- Investigating Dictionary Expansion for Video-based Sign Language DictionariesAashaka Desai, Daniela Massiceti, Richard E. Ladner, Hal Daumé III, Danielle Bragg, Alex Xijie Lu. 22826-22841 [doi]
- From Insight to Exploit: Leveraging LLM Collaboration for Adaptive Adversarial Text GenerationNajrin Sultana, Md. Rafi Ur Rashid, Kang Gu, Shagufta Mehnaz. 22842-22859 [doi]
- Beyond Contrastive Learning: Synthetic Data Enables List-wise Training with Multiple Levels of RelevanceReza Esfandiarpoor, George Zerveas, Ruochen Zhang 0001, Macton Mgonzo, Carsten Eickhoff, Stephen H. Bach. 22860-22882 [doi]
- Instability in Downstream Task Performance During LLM PretrainingYuto Nishida, Masaru Isonuma, Yusuke Oda. 22883-22895 [doi]
- A Comparison of Independent and Joint Fine-tuning Strategies for Retrieval-Augmented GenerationNeal Lawton, Alfy Samuel, Anoop Kumar, Daben Liu. 22896-22904 [doi]
- mrCAD: Multimodal Communication to Refine Computer-aided DesignsWilliam P. McCarthy, Saujas Vaduguru, Karl D. D. Willis, Justin Matejka, Judith E. Fan, Daniel Fried, Yewen Pu. 22905-22921 [doi]
- MOCHA: Are Code Language Models Robust Against Multi-Turn Malicious Coding Prompts?Muntasir Wahed, Xiaona Zhou, Kiet A. Nguyen, Tianjiao Yu, Nirav Diwan, Gang Wang 0011, Dilek Hakkani-Tür, Ismini Lourentzou. 22922-22948 [doi]
- How Can Input Reformulation Improve Tool Usage Accuracy in a Complex Dynamic Environment? A Study on tau-benchVenkatesh Mishra, Amir Saeidi, Satyam Raj, Mutsumi Nakamura, Gaowen Liu, Ali Payani, Jayanth Srinivasa, Chitta Baral. 22949-22972 [doi]
- Evaluating Fairness in Large Vision-Language Models Across Diverse Demographic Attributes and PromptsXuyang Wu 0002, Yuan Wang 0076, Hsin-Tai Wu, Zhiqiang Tao, Yi Fang 0008. 22973-22991 [doi]
- VIBE: Can a VLM Read the Room?Tania Chakraborty, Eylon Caplan, Dan Goldwasser. 22992-23008 [doi]
- LoRATK: LoRA Once, Backdoor Everywhere in the Share-and-Play EcosystemHongyi Liu, Shaochen Zhong, Xintong Sun, Minghao Tian, Mohsen Hariri, Zirui Liu 0001, Ruixiang Tang, Zhimeng Jiang, Jiayi Yuan 0001, Yu-Neng Chuang, Li Li 0035, Soo Hyun Choi, Rui Chen 0012, Vipin Chaudhary, Xia Hu 0001. 23009-23047 [doi]
- Pearl: A Multimodal Culturally-Aware Arabic Instruction DatasetFakhraddin Alwajih, Samar Mohamed Magdy, Abdellah El Mekki, Omer Nacar, Youssef Nafea, Safaa Taher Abdelfadil, Abdulfattah Mohammed Yahya, Hamzah Luqman, Nada AlMarwani, Samah Aloufi, Baraah Qawasmeh, Houdaifa Atou, Serry Sibaee, Hamzah A. Alsayadi, Walid Al-Dhabyani, Maged Saeed AlShaibani, Aya El aatar, Nour Qandos, Rahaf Alhamouri, Samar Ahmad, Mohammed Anwar Al-Ghrawi, Aminetou Yacoub, Ruwa AbuHweidi, Vatimetou Mohamed Lemin, Reem Abdel-Salam, Ahlam Bashiti, Adel Ammar, Aisha Alansari, Ahmed Ashraf, Nora Alturayeif, Alcides Alcoba Inciarte, AbdelRahim A. Elmadany, Mohamedou Cheikh Tourad, Ismail Berrada, Mustafa Jarrar, Shady Shehata, Muhammad Abdul-Mageed. 23048-23079 [doi]
- Protein Large Language Models: A Comprehensive SurveyYijia Xiao, Wanjia Zhao, Junkai Zhang, Yiqiao Jin, Han Zhang, Zhicheng Ren, Renliang Sun, Haixin Wang 0003, Guancheng Wan, Pan Lu, Xiao Luo 0001, Yu Zhang 0044, James Zou 0001, Yizhou Sun, Wei Wang 0010. 23080-23103 [doi]
- MAKIEval: A Multilingual Automatic WiKidata-based Framework for Cultural Awareness Evaluation for LLMsRaoyuan Zhao, Beiduo Chen, Barbara Plank, Michael A. Hedderich. 23104-23136 [doi]
- Looking Beyond the Pixels: Evaluating Visual Metaphor Understanding in VLMsManishit Kundu, Sumit Shekhar, Pushpak Bhattacharyya. 23137-23158 [doi]
- AGENTVIGIL: Automatic Black-Box Red-teaming for Indirect Prompt Injection against LLM AgentsZhun Wang, Vincent Siu, Zhe Ye, Tianneng Shi, Yuzhou Nie, Xuandong Zhao, Chenguang Wang 0001, Wenbo Guo 0002, Dawn Song. 23159-23172 [doi]
- Improving LLM-as-a-Judge Inference with the Judgment DistributionVictor Wang, Michael Jq Zhang, Eunsol Choi. 23173-23199 [doi]
- Learning Is Not A Race: Improving Retrieval in Language Models via Equal LearningWanqian Yang, Aahlad Manas Puli, Rajesh Ranganath. 23200-23211 [doi]
- The Prompt Makes the Person(a): A Systematic Evaluation of Sociodemographic Persona Prompting for Large Language ModelsMarlene Lutz, Indira Sen, Georg Ahnert, Elisa Rogers, Markus Strohmaier. 23212-23237 [doi]
- Spiral of Silence in Large Language Model AgentsMingze Zhong, Meng Fang, Zijing Shi, Yuxuan Huang, Shunfeng Zheng, Yali Du 0001, Ling Chen 0006, Jun Wang 0012. 23238-23253 [doi]
- Do We Know What LLMs Don't Know? A Study of Consistency in Knowledge ProbingRaoyuan Zhao, Abdullatif Köksal, Ali Modarressi, Michael A. Hedderich, Hinrich Schütze. 23254-23280 [doi]
- Context Length Alone Hurts LLM Performance Despite Perfect RetrievalYufeng Du, Minyang Tian, Srikanth Ronanki, Subendhu Rongali, Sravan Babu Bodapati, Aram Galstyan, Azton Wells, Roy Schwartz 0001, Eliu A. Huerta, Hao Peng 0009. 23281-23298 [doi]
- DebUnc: Improving Large Language Model Agent Communication With Uncertainty MetricsLuke Yoffe, Alfonso Amayuelas, William Yang Wang. 23299-23315 [doi]
- ProcVQA: Benchmarking the Effects of Structural Properties in Mined Process Visualizations on Vision-Language Model PerformanceKazi Tasnim Zinat, Saad Mohammad Abrar, Shoumik Saha, Sharmila Duppala, Saimadhav Naga Sakhamuri, Zhicheng Liu 0001. 23316-23348 [doi]
- Probing Political Ideology in Large Language Models: How Latent Political Representations Generalize Across TasksTianyi Zhang. 23349-23360 [doi]
- Understanding GUI Agent Localization Biases through Logit SharpnessXingjian Tao, Yiwei Wang 0001, Yujun Cai, Zhicheng Yang, Jing Tang 0004. 23361-23374 [doi]
- The Language of Interoception: Examining Embodiment and Emotion Through a Corpus of Body Part MentionsSophie Wu, Jan Philip Wahle, Saif M. Mohammad. 23375-23399 [doi]
- HomoGraphAdapter: A Homogeneous Graph Neural Network as an Effective Adapter for Vision-Language ModelsChuan He, Zhuozhao Li, Song Guo, Xiaocheng Lu, Jinxiang Lai. 23400-23414 [doi]
- No Black Boxes: Interpretable and Interactable Predictive Healthcare with Knowledge-Enhanced Agentic Causal DiscoveryXiaoxue Han, Pengfei Hu, Chang Lu 0004, Jun-En Ding, Feng Liu 0011, Yue Ning 0001. 23415-23427 [doi]
- PROOD: A Simple LLM Out-of-Distribution Guardrail Leveraging Response SemanticsJoshua Tint. 23428-23438 [doi]
- ICL-Bandit: Relevance Labeling in Advertisement Recommendation Systems via LLMLu Wang 0029, Chiming Duan, Pu Zhao 0004, Fangkai Yang, Yong Shi 0012, Xuefeng Luo, Bingjing Xu, Weiwei Deng, Qingwei Lin, Dongmei Zhang 0001. 23439-23449 [doi]
- Intent-aware Schema Generation and Refinement for Literature Review TablesVishakh Padmakumar, Joseph Chee Chang, Kyle Lo, Doug Downey, Aakanksha Naik. 23450-23472 [doi]
- NLP Needs Diversity outside of 'Diversity'Joshua Tint. 23473-23479 [doi]
- Anatomy of a Feeling: Narrating Embodied Emotions via Large Vision-Language ModelsMohammad Saim, Phan Anh Duong, Cat Luong, Aniket Bhanderi, Tianyu Jiang. 23480-23495 [doi]
- Towards Universal Debiasing for Language Models-based Tabular Data GenerationTianchun Li, Tianci Liu 0003, XingChen Wang, Rongzhe Wei, Pan Li 0005, Lu Su 0001, Jing Gao 0004. 23496-23512 [doi]
- Beyond Linear Steering: Unified Multi-Attribute Control for Language ModelsNarmeen Fatimah Oozeer, Luke Marks, Fazl Barez, Amir Abdullah. 23513-23557 [doi]
- Unequal Scientific Recognition in the Age of LLMsYixuan Liu, Abel Elekes, Jianglin Lu, Rodrigo Dorantes Gilardi, Albert-László Barabási. 23558-23568 [doi]
- Zero-Shot Fine-Grained Image Classification Using Large Vision-Language ModelsMd. Atabuzzaman, Andrew Zhang, Christopher Thomas 0004. 23569-23582 [doi]
- Using tournaments to calculate AUROC for zero-shot classification with LLMsWonjin Yoon, Ian Bulovic, Timothy A. Miller. 23583-23591 [doi]
- Exploration-Driven Reinforcement Learning for Expert Routing Improvement in Mixture-of-Experts Language ModelsGyunyeop Kim, Sangwoo Kang. 23592-23605 [doi]
- D2CS - Documents Graph Clustering using LLM supervisionYoel Ashkenazi, Etzion Harari, Regev Yehezkel Imra, Naphtali Abudarham, Dekel Cohen, Yoram Louzoun. 23606-23623 [doi]
- GeoChain: Multimodal Chain-of-Thought for Geographic ReasoningSahiti Yerramilli, Nilay Pande, Rynaa Grover, Jayant Sravan Tamarapalli. 23624-23639 [doi]
- SteerVLM: Robust Model Control through Lightweight Activation Steering for Vision Language ModelsAnushka Sivakumar, Andrew Zhang, Zaber Ibn Abdul Hakim, Christopher Thomas 0004. 23640-23665 [doi]
- FractalLLM: Lossless Self-Speculative Decoding with Layer Embedded Self-CompressionJuhyeong Kim, Sangyeon Yu, Gyunyeop Kim, Sangwoo Kang. 23666-23673 [doi]
- Saten: Sparse Augmented Tensor Networks for Post-Training Compression of Large Language ModelsRyan Solgi, Kai Zhen, Rupak Vignesh Swaminathan, Nathan Susanj, Athanasios Mouchtaris, Siegfried Kunzmann, Zheng Zhang 0005. 23674-23683 [doi]
- Third-Person Appraisal Agent: Simulating Human Emotional Reasoning in Text with Large Language ModelsSimin Hong, Jun Sun, Hongyang Chen. 23684-23701 [doi]
- Source-primed Multi-turn Conversation Helps Large Language Models Translate DocumentsHanxu Hu, Jannis Vamvas, Rico Sennrich. 23702-23712 [doi]
- Mitigating Spurious Correlations via Counterfactual Contrastive LearningFengxiang Cheng, Chuan Zhou 0013, Xiang Li, Alina Leidinger, Haoxuan Li 0001, Mingming Gong, Fenrong Liu, Robert van Rooij. 23713-23722 [doi]
- The RAG Paradox: A Black-Box Attack Exploiting Unintentional Vulnerabilities in Retrieval-Augmented Generation SystemsChanwoo Choi, Jinsoo Kim, Sukmin Cho, Soyeong Jeong, Buru Chang. 23723-23744 [doi]
- Guiding Large Language Models for Biomedical Entity Linking via Restrictive and Contrastive DecodingZhenxi Lin, Ziheng Zhang, Jian Wu, Yefeng Zheng 0001, Xian Wu 0001. 23745-23759 [doi]
- Cut the Deadwood Out: Backdoor Purification via Guided Module SubstitutionYao Tong, Weijun Li 0003, Xuanli He, Haolan Zhan, Qiongkai Xu. 23760-23783 [doi]
- RepoDebug: Repository-Level Multi-Task and Multi-Language Debugging Evaluation of Large Language ModelsJingjing Liu, Zeming Liu, Zihao Cheng, Mengliang He, Xiaoming Shi, Yuhang Guo 0001, Xiangrong Zhu 0002, Yuanfang Guo, Yunhong Wang 0001, Haifeng Wang 0001. 23784-23813 [doi]
- FaStFact: Faster, Stronger Long-Form Factuality Evaluations in LLMsYingjia Wan, Haochen Tan, Xiao Zhu, Xinyu Zhou, Zhiwei Li, Qingsong Lv, Changxuan Sun, Jiaqi Zeng, Yi Xu, Jianqiao Lu, Yinhong Liu, Zhijiang Guo. 23814-23854 [doi]
- PropXplain: Can LLMs Enable Explainable Propaganda Detection?Maram Hasanain, Md. Arid Hasan, Mohamed Bayan Kmainasi, Elisa Sartori, Ali Ezzat Shahroor, Giovanni Da San Martino, Firoj Alam. 23855-23863 [doi]
- EoT: Evolution of Thoughts for Complex Reasoning TasksQin Hua, Jiaqi Sun, Shiyou Qian, Dingyu Yang, Jian Cao 0001, Guangtao Xue. 23864-23886 [doi]
- Reveal and Release: Iterative LLM Unlearning with Self-generated DataLinxi Xie, Xin Teng, Shichang Ke, Hongyi Wen, Shenji Wan. 23887-23899 [doi]
- An Evaluation Resource for Grounding Translation ErrorsSujin Chen, Kang Wang, Zixuan Zhou, Xiangyu Duan, Wanqun Zhang, Hao Yang 0006, Jinsong Su, Min Zhang. 23900-23916 [doi]
- Enhancing Time Awareness in Generative RecommendationSunkyung Lee 0001, Seongmin Park 0002, JongHyo Kim, Mincheol Yoon, Jongwuk Lee. 23917-23933 [doi]
- Adaptive LLM Routing under Budget ConstraintsPranoy Panda, Raghav Magazine, Chaitanya Devaguptapu, Sho Takemori, Vishal Sharma. 23934-23949 [doi]
- Promptception: How Sensitive Are Large Multimodal Models to Prompts?Mohamed Insaf Ismithdeen, Muhammad Uzair Khattak, Salman Khan 0001. 23950-23985 [doi]
- Can Federated Learning Safeguard Private Data in LLM Training? Vulnerabilities, Attacks, and Defense EvaluationWenkai Guo, Xuefeng Liu 0001, Haolin Wang 0002, Jianwei Niu 0002, Shaojie Tang 0001, Jing Yuan 0002. 23986-24013 [doi]
- Runaway is Ashamed, But Helpful: On the Early-Exit Behavior of Large Language Model-based Agents in Embodied EnvironmentsQingyu Lu 0001, Liang Ding 0006, Siyi Cao, Xuebo Liu 0002, Kanjian Zhang, Jinxia Zhang, Dacheng Tao. 24014-24027 [doi]
- AutoMIR: Effective Zero-Shot Medical Information Retrieval without Relevance LabelsLei Li, Xiangxu Zhang, Xiao Zhou, Zheng Liu. 24028-24047 [doi]
- RG-VQA: Leveraging Retriever-Generator Pipelines for Knowledge Intensive Visual Question AnsweringSettaluri Lakshmi Sravanthi, Pulkit Agarwal, Debjyoti Mondal, Rituraj Singh, Subhadarshi Panda, Ankit Mishra, Kiran Pradeep, Srihari K. B, Godawari Sudhakar Rao, Pushpak Bhattacharyya. 24048-24060 [doi]
- Enhancing RAG Efficiency with Adaptive Context CompressionShuyu Guo, Shuo Zhang 0006, Zhaochun Ren. 24061-24076 [doi]
- Revealing the impact of synthetic native samples and multi-tasking strategies in Hindi-English code-mixed humour and sarcasm detectionDebajyoti Mazumder, Aakash Kumar, Jasabanta Patro. 24077-24107 [doi]
- CogAtom: From Cognitive Atoms to Olympiad-level Mathematical Reasoning in Large Language ModelsZhuofan Chen, Jiyuan He, Yichi Zhang, Xing Hu, Haoxing Wen, Jun Bai, Wenge Rong. 24108-24125 [doi]
- Efficient Latent Semantic Clustering for Scaling Test-Time Computation of LLMsSungjae Lee, Hoyoung Kim, Jeongyeon Hwang, Eunhyeok Park, Jungseul Ok. 24126-24144 [doi]
- BannerBench: Benchmarking Vision Language Models for Multi-Ad Selection with Human PreferencesHiroto Otake, Peinan Zhang, Yusuke Sakai 0010, Masato Mita, Hiroki Ouchi, Taro Watanabe. 24145-24159 [doi]
- DeKeyNLU: Enhancing Natural Language to SQL Generation through Task Decomposition and Keyword ExtractionJian Chen 0047, Zhenyan Chen, Xuming Hu, Peilin Zhou, Yining Hua, Han Fang, Cissy Hing Yee Choy, Xinmei Ke, Jingfeng Luo, Zixuan Yuan. 24160-24176 [doi]
- Facilitating Cross-lingual Transfer of Empathy through Language-independent Latent Diffusion: A Case Study in ChineseJunlin Li, Bo Peng, Yu-Yin Hsu. 24177-24192 [doi]
- Evaluating Compound AI Systems through Behaviors, Not BenchmarksPranav Bhagat, K. N. Ajay Shastry, Pranoy Panda, Chaitanya Devaguptapu. 24193-24222 [doi]
- SciCompanion: Graph-Grounded Reasoning for Structured Evaluation of Scientific ArgumentsJoshua Alan Flashner, Adithya Kulkarni, Dawei Zhou 0003. 24223-24244 [doi]
- From Generation to Detection: A Multimodal Multi-Task Dataset for Benchmarking Health MisinformationZhihao Zhang, Yiran Zhang, XiYue Zhou, Liting Huang, Imran Razzak, Preslav Nakov, Usman Naseem. 24245-24260 [doi]
- Estimating Machine Translation DifficultyLorenzo Proietti 0002, Stefano Perrella, Vilém Zouhar, Roberto Navigli, Tom Kocmi. 24261-24285 [doi]
- TIU-Bench: A Benchmark for Evaluating Large Multimodal Models on Text-rich Image UnderstandingKun Zhang 0041, Liqiang Niu, Zhen Cao, Fandong Meng, Jie Zhou 0016. 24286-24295 [doi]
- Breaking Token Into Concepts: Exploring Extreme Compression in Token Representation Via Compositional Shared SemanticsKavin R. V., Pawan Goyal 0002. 24296-24304 [doi]
- ExeSQL: Self-Taught Text-to-SQL Models with Execution-Driven Bootstrapping for SQL DialectsJipeng Zhang, Haolin Yang, Kehao Miao, Ruiyuan Zhang, Renjie Pi, Jiahui Gao, Xiaofang Zhou. 24305-24326 [doi]
- Under the Shadow of Babel: How Language Shapes Reasoning in LLMsChenxi Wang 0001, Yixuan Zhang, Lang Gao, Zixiang Xu, Zirui Song, Yanbo Wang 0005, Xiuying Chen. 24327-24344 [doi]
- Think Right, Not More: Test-Time Scaling for Numerical Claim VerificationPrimakov Chungkham, Venktesh V, Vinay Setty, Avishek Anand. 24345-24363 [doi]
- Nexus: Adaptive Upcycling to Efficiently Pretrain Mixture of ExpertsNikolas Gritsch, Qizhen Zhang 0002, Acyr Locatelli, Sara Hooker, Ahmet Üstün. 24364-24381 [doi]
- Exploring Context Strategies in LLMs for Discourse-Aware Machine TranslationRitvik Choudhary, Rem Hida, Masaki Hamada 0001, Hayato Futami, Toshiyuki Sekiya. 24382-24391 [doi]
- Insights into using temporal coordinated behaviour to explore connections between social media posts and influenceElisa Sartori, Serena Tardelli, Maurizio Tesconi, Mauro Conti, Alessandro Galeazzi, Stefano Cresci, Giovanni Da San Martino. 24392-24404 [doi]
- SpecCoT: Accelerating Chain-of-Thought Reasoning through Speculative ExplorationJunhan Shi, Yijia Zhu, Zhenning Shi, Dan Zhao 0003, Qing Li 0006, Yong Jiang 0001. 24405-24415 [doi]
- A Similarity Measure for Comparing Conversational DynamicsSang-Min Jung, Kaixiang Zhang, Cristian Danescu-Niculescu-Mizil. 24416-24447 [doi]
- AgentDrug: Utilizing Large Language Models in an Agentic Workflow for Zero-Shot Molecular OptimizationLe Huy Khiem, Ting Hua, Nitesh V. Chawla. 24448-24458 [doi]
- Improving Preference Alignment of LLM with Inference-Free Self-RefinementFukun Ma, Kaibin Tian, Jieting Xue, Xiaoyi Wang, Ye Ma, Quan Chen 0006, Peng Jiang 0002, Lijie Wen. 24459-24473 [doi]
- Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing GuaranteesAhmed Heakl, Sarim Hashmi, Chaimaa Abi, Celine Lee, Abdulrahman Mahmoud. 24474-24488 [doi]
- StructuThink: Reasoning with Task Transition Knowledge for Autonomous LLM-Based AgentsHaiyu Zhao, Zhenyu Guo, Chunhong Zhang, Ziyu Zhou, Zheng Hu. 24489-24506 [doi]
- Leveraging Unpaired Feedback for Long-Term LLM-based Recommendation TuningJizhi Zhang, Chongming Gao, Wentao Shi 0002, Xi-Lin Chen 0001, Jingang Wang, Xunliang Cai, Fuli Feng. 24507-24521 [doi]
- Investigating Multi-layer Representations for Dense Passage RetrievalZhongbin Xie, Thomas Lukasiewicz. 24522-24536 [doi]
- KELE: Residual Knowledge Erasure for Enhanced Multi-hop Reasoning in Knowledge EditingMengqi Zhang 0002, Bowen Fang, Qiang Liu 0006, Xiaotian Ye, Shu Wu, Pengjie Ren, Zhumin Chen, Liang Wang 0001. 24537-24552 [doi]
- Dissecting Persona-Driven Reasoning in Language Models via Activation PatchingAnsh Poonia, Maeghal Jain. 24553-24566 [doi]
- PUER: Boosting Few-shot Positive-Unlabeled Entity Resolution with Reinforcement LearningYaoshu Wang, Mengyi Yan, Wei Wang. 24567-24579 [doi]
- Toward the Automatic Detection of Word Meaning Negotiation Indicators in ConversationAina Garí Soler, Matthieu Labeau, Chloé Clavel. 24580-24596 [doi]
- Forget the Unneeded: Backdooring Large Language Models via Contrastive-enhanced Machine UnlearningShiji Yang, Shu Zhao 0005, Congyao Mei, Zhen Yang 0010, Jie Chen 0025, Fulan Qian, Zhen Duan, Yan-ping Zhang 0001. 24597-24607 [doi]
- Equipping Retrieval-Augmented Large Language Models with Document Structure AwarenessLingnan Xu, Chong Feng 0001, Kaiyuan Zhang, Liu Zhengyong, Wenqiang Xu, Fanqing Meng. 24608-24631 [doi]
- QEVA: A Reference-Free Evaluation Metric for Narrative Video Summarization with Multimodal Question AnsweringWoojun Jung, Junyeong Kim. 24632-24642 [doi]
- Thinking Before You Speak: A Proactive Test-time Scaling ApproachCong Liu 0001, Wenchang Chai, Hejun Wu, Yan Pan 0002, Pengxu Wei, Liang Lin. 24643-24650 [doi]
- Do Before You Judge: Self-Reference as a Pathway to Better LLM EvaluationWei-Hsiang Lin, Sheng-Lun Wei, Hen-Hsen Huang, Hsin-Hsi Chen. 24651-24672 [doi]
- Beyond Content: How Grammatical Gender Shapes Visual Representation in Text-to-Image ModelsMuhammed Saeed, Shaina Raza, Ashmal Vayani, Muhammad Abdul-Mageed, Ali Emami, Shady Shehata. 24673-24695 [doi]
- ToolHaystack: Stress-Testing Tool-Augmented Language Models in Realistic Long-Term InteractionsBeong-woo Kwak, Minju Kim, DongHa Lim, Hyungjoo Chae, Dongjin Kang, Sunghwan Kim, Dongil Yang, Jinyoung Yeo. 24696-24727 [doi]
- GraphCheck: Multipath Fact-Checking with Entity-Relationship GraphsHyewon Jeon, Jay Yoon Lee. 24728-24745 [doi]
- FLAMES: Improving LLM Math Reasoning via a Fine-Grained Analysis of the Data Synthesis PipelineParker Seegmiller, Kartik Mehta, Soumya Saha, Chenyang Tao, Shereen Oraby, Arpit Gupta, Tagyoung Chung, Mohit Bansal, Nanyun Peng 0001. 24746-24766 [doi]
- POW: Political Overton Windows of Large Language ModelsLeif Azzopardi, Yashar Moshfeghi. 24767-24773 [doi]
- Columbo: Expanding Abbreviated Column Names for Tabular Data Using Large Language ModelsTing Cai, Stephen Sheen, AnHai Doan. 24774-24792 [doi]
- RTTC: Reward-Guided Collaborative Test-Time ComputeJuan Pablo Muñoz, Jinjie Yuan. 24793-24809 [doi]
- AMANDA: Agentic Medical Knowledge Augmentation for Data-Efficient Medical Visual Question AnsweringZiqing Wang, Chengsheng Mao, Xiaole Wen, Yuan Luo, Kaize Ding. 24810-24832 [doi]
- Mixed Signals: Decoding VLMs' Reasoning and Underlying Bias in Vision-Language ConflictPouya Pezeshkpour, Moin Aminnaseri, Estevam Hruschka. 24833-24848 [doi]
- Mitigating Hallucination in Large Vision-Language Models through Aligning Attention Distribution to Information FlowJianfei Zhao, Feng Zhang, Xin Sun, Chong Feng 0001. 24849-24863 [doi]
- OptiSeq: Ordering Examples On-The-Fly for In-Context LearningRahul Atul Bhope, Praveen Venkateswaran, K. R. Jayaram, Vatche Isahagian, Vinod Muthusamy, Nalini Venkatasubramanian. 24864-24887 [doi]
- Dependency Parsing-Based Syntactic Enhancement of Relation Extraction in Scientific TextsDevvrat Joshi, Islem Rekik. 24888-24897 [doi]
- DIPLomA: Efficient Adaptation of Instructed LLMs to Low-Resource Languages via Post-Training Delta MergingIxak Sarasua, Ander Corral, Xabier Saralegi. 24898-24912 [doi]
- Reliability Crisis of Reference-free Metrics for Grammatical Error CorrectionTakumi Goto, Yusuke Sakai 0010, Taro Watanabe. 24913-24926 [doi]
- Who Speaks Matters: Analysing the Influence of the Speaker's Linguistic Identity on Hate ClassificationAnanya Malik, Kartik Sharma, Shaily Bhatt, Lynnette Hui Xian Ng. 24927-24937 [doi]
- Are LLMs Empathetic to All? Investigating the Influence of Multi-Demographic Personas on a Model's EmpathyAnanya Malik, Nazanin Sabri, Melissa Karnaze, Mai ElSherief. 24938-24959 [doi]
- Active Learning for Multidialectal Arabic POS TaggingDiyam Akra, Mohammed Khalilia, Mustafa Jarrar. 24960-24973 [doi]
- Embedding-Free RAGJessica Maghakian, Raunak Sinha, Max Schettewi, Gunkirat Kaur. 24974-24985 [doi]
- Rating Roulette: Self-Inconsistency in LLM-As-A-Judge FrameworksRajarshi Haldar, Julia Hockenmaier. 24986-25004 [doi]
- Quantifying Uncertainty in Natural Language Explanations of Large Language Models for Question AnsweringYangyi Li, Mengdi Huai. 25005-25013 [doi]
- Real-World Summarization: When Evaluation Reaches Its LimitsPatrícia Schmidtová, Ondrej Dusek, Saad Mahamood. 25014-25026 [doi]
- Open-DeBias: Toward Mitigating Open-Set Bias in Language ModelsArti Rani, Shweta Singh, Nihar Ranjan Sahoo, Gaurav Kumar Nayak. 25027-25051 [doi]
- SACL: Understanding and Combating Textual Bias in Code Retrieval with Semantic-Augmented Reranking and LocalizationDhruv Gupta, Gayathri Ganesh Lakshmy, Yiqing Xie. 25052-25065 [doi]
- Jailbreak Distillation: Renewable Safety BenchmarkingJingyu Zhang, Ahmed Elgohary, Xiawei Wang, A. S. M. Iftekhar, Ahmed Magooda, Benjamin Van Durme, Daniel Khashabi, Kyle Jackson. 25066-25089 [doi]
- Uncertainty-Aware Answer Selection for Improved Reasoning in Multi-LLM SystemsAakriti Agrawal, Rohith Aralikatti, Anirudh Satheesh, Souradip Chakraborty, Amrit Singh Bedi, Furong Huang. 25090-25098 [doi]
- GreekBarBench: A Challenging Benchmark for Free-Text Legal Reasoning and CitationsOdysseas S. Chlapanis, Dimitris Galanis, Nikolaos Aletras, Ion Androutsopoulos. 25099-25119 [doi]
- Pi-SQL: Enhancing Text-to-SQL with Fine-Grained Guidance from Pivot Programming LanguagesYongdong Chi, Hanqing Wang 0003, Yun Chen 0007, Yan Yang, Jian Yang, Zonghan Yang, Xiao Yan, Guanhua Chen 0001. 25120-25144 [doi]
- RAC: Efficient LLM Factuality Correction with Retrieval AugmentationChangmao Li, Jeffrey Flanigan. 25145-25159 [doi]
- Does It Run and Is That Enough? Revisiting Text-to-Chart Generation with a Multi-Agent ApproachJames Ford, Anthony Rios. 25160-25173 [doi]
- GeLoRA: Geometric Adaptive Ranks For Efficient LoRA Fine-tuningAbdessalam Ed-dib, Zhanibek Datbayev, Amine Mohamed Aboussalah. 25174-25196 [doi]
- Uncovering Scaling Laws for Large Language Models via Inverse ProblemsArun Verma, Zhaoxuan Wu, Zijian Zhou 0006, Xiaoqiang Lin, Zhiliang Chen, Rachael Hwee Ling Sim, Rui Qiao 0006, Jingtan Wang 0001, Nhung Bui, Xinyuan Niu 0001, Wenyang Hu, Gregory Kang Ruey Lau, Zi-Yu Khoo, Zitong Zhao, Xinyi Xu, Apivich Hemachandra, See-Kiong Ng, Bryan Kian Hsiang Low. 25197-25211 [doi]
- UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting TargetsWenyu Wang, Mengqi Zhang 0002, Xiaotian Ye, Zhaochun Ren, Pengjie Ren, Zhumin Chen. 25212-25227 [doi]
- FicSim: A Dataset for Multi-Faceted Semantic Similarity in Long-Form FictionNatasha Johnson, Amanda Bertsch, Maria-Emil Deal, Emma Strubell. 25228-25246 [doi]
- Masked Diffusion Captioning for Visual Feature LearningChao Feng, Zihao Wei, Andrew Owens. 25247-25263 [doi]
- Diverse Multi-tool Aggregation with Large Language Models for Enhanced Math ReasoningBohan Yao, Vikas Yadav. 25264-25282 [doi]
- Enhancing Goal-oriented Proactive Dialogue Systems via Dynamic Multi-dimensional Consistency OptimizationDidi Zhang, Yaxin Fan, Peifeng Li, Qiaoming Zhu. 25283-25296 [doi]
- Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive SurveyZirui Song, Bin Yan 0004, Yuhan Liu 0023, Miao Fang 0001, Mingzhe Li 0001, Rui Yan 0001, Xiuying Chen. 25297-25311 [doi]
- Who's the Author? How Explanations Impact User Reliance in AI-Assisted Authorship AttributionCalvin Bao, Connor Baumler, Hal Daumé III, Marine Carpuat. 25312-25330 [doi]
- UniSpeaker: A Unified Approach for Multimodality-driven Speaker GenerationZhengyan Sheng, Zhihao Du, Heng Lu 0002, Shiliang Zhang, Zhen-Hua Ling. 25331-25346 [doi]
- On the Fine-Grained Planning Abilities of VLM Web AgentsSurgan Jandial, Yinong Oliver Wang, Andrea Bajcsy, Fernando De la Torre. 25347-25380 [doi]
- InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models with Human FeedbackHenry Hengyuan Zhao, Wenqi Pei, Yifei Tao, Haiyang Mei, Mike Zheng Shou. 25381-25400 [doi]
- ReFLAIR: Enhancing Multimodal Reasoning via Structured Reflection and Reward-Guided LearningJiazhou Ji, Xinru Lu. 25401-25413 [doi]
- ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font AnnotationsBowen Jiang, Yuan Yuan, Xinyi Bai, Zhuoqun Hao, Alyson Yin, Yaojie Hu, Wenyu Liao, Lyle H. Ungar, Camillo Jose Taylor. 25414-25425 [doi]
- STA-CoT: Structured Target-Centric Agentic Chain-of-Thought for Consistent Multi-Image Geological ReasoningBeibei Yu, Tao Shen, Ling Chen. 25426-25444 [doi]
- Can Language Models Follow Multiple Turns of Entangled Instructions?Chi Han, Xin Liu 0039, Haodong Wang, Shiyang Li, Jingfeng Yang 0001, Haoming Jiang, Zhengyang Wang, Qingyu Yin, Liang Qiu, Changlong Yu, Yifan Gao 0001, Zheng Li 0018, Bing Yin, Jingbo Shang, Heng Ji 0001. 25445-25460 [doi]
- How to Generalize the Detection of AI-Generated Text: Confounding NeuronsClaudio Borile, Carlo Abrate. 25461-25476 [doi]
- SparsePO: Controlling Preference Alignment of LLMs via Sparse Token MasksFenia Christopoulou, Ronald Cardenas, Gerasimos Lampouras, Haitham Bou-Ammar, Jun Wang 0012. 25477-25503 [doi]
- We Argue to Agree: Towards Personality-Driven Argumentation-Based Negotiation Dialogue Systems for TourismPriyanshu Priya, Saurav Dudhate, Desai Yasheshbhai, Asif Ekbal. 25504-25536 [doi]
- Towards the Roots of the Negation Problem: A Multilingual NLI Dataset and Model Scaling AnalysisTereza Vrabcová, Marek Kadlcík, Petr Sojka, Michal Stefánik, Michal Spiegel. 25537-25551 [doi]
- Improving the Language Understanding Capabilities of Large Language Models Using Reinforcement LearningSai Ashish Somayajula, Bokai Hu, Qi Cao, Xin Pan, Pengtao Xie. 25552-25567 [doi]
- HATECAT-TR: A Hate Speech Span Detection and Categorization Dataset for TurkishHasan Kerem Seker, Gökçe Uludogan, Pelin Önal, Arzucan Özgür. 25568-25579 [doi]
- DM-Codec: Distilling Multimodal Representations for Speech TokenizationMd Mubtasim Ahasan, Md Fahim, Tasnim Mohiuddin, AKM Mahbubur Rahman, Aman Chadha, Tariq Iqbal, M. Ashraful Amin, Md Mofijul Islam, Amin Ahsan Ali. 25580-25602 [doi]
- LCAN: A Label-Aware Contrastive Attention Network for Multi-Intent Recognition and Slot Filling in Task-Oriented Dialogue SystemsShuli Zhang, Zhiqiang You, Xiao Xiang Qi, Peng Liu, Gaode Wu, Kan Xia, Shenguang Huang. 25603-25612 [doi]
- Low-Resource Languages LLM Disinformation is Within Reach: The Case of WalliserdeutschAndrei Kucharavy, Sherine Seppey, Cyril Vallez, Dimitri Percia David, Ljiljana Dolamic. 25613-25625 [doi]
- Exploring and Controlling Diversity in LLM-Agent ConversationKuanchao Chu, Yi-Pei Chen 0001, Hideki Nakayama. 25626-25644 [doi]
- Agentic-ToM: Cognition-Inspired Agentic Processing For Enhancing Theory of Mind ReasoningSneheel Sarangi, Chetan Talele, Hanan Salam. 25645-25661 [doi]
- Can We Edit LLMs for Long-Tail Biomedical Knowledge?Xinhao Yi, Jake Lever, Kevin Bryson 0001, Zaiqiao Meng. 25662-25679 [doi]
- GeoPQA: Bridging the Visual Perception Gap in MLLMs for Geometric ReasoningGuizhen Chen, Weiwen Xu, Hao Zhang 0048, Hou Pong Chan, Deli Zhao, Anh Tuan Luu, Yu Rong 0001. 25680-25688 [doi]
- CM-Align: Consistency-based Multilingual Alignment for Large Language ModelsXue Zhang, Yunlong Liang, Fandong Meng, Songming Zhang 0001, Yufeng Chen 0005, Jinan Xu, Jie Zhou 0016. 25689-25702 [doi]
- Cache Saver: A Modular Framework for Efficient, Affordable, and Reproducible LLM InferenceNearchos Potamitis, Lars Henning Klein, Bardia Mohammadi, Chongyang Xu, Attreyee Mukherjee, Niket Tandon, Laurent Bindschaedler, Akhil Arora 0001. 25703-25724 [doi]
- Evaluating Cultural Knowledge and Reasoning in LLMs Through Persian AllusionsMelika Nobakhtian, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar. 25725-25737 [doi]
- Evolving Stances on Reproducibility: A Longitudinal Study of NLP and ML Researchers' Views and Experience of ReproducibilityCraig Thomson, Ehud Reiter, João Sedoc, Anya Belz. 25738-25760 [doi]
- KAHAN: Knowledge-Augmented Hierarchical Analysis and Narration for Financial Data NarrationYajing Yang, Tony Deng, Min-Yen Kan. 25761-25785 [doi]