- HEARTS: A Holistic Framework for Explainable, Sustainable and Robust Text Stereotype Detection. Theo King, Zekun Wu 0003, Adriano S. Koshiyama, Emre Kazim, Philip C. Treleaven. 1-18
- Roles of MLLMs in Visually Rich Document Retrieval for RAG: A Survey. Xiantao Zhang. 19-36
- With Privacy, Size Matters: On the Importance of Dataset Size in Differentially Private Text Rewriting. Stephen Meisenbacher, Florian Matthes. 37-47
- From Anger to Joy: How Nationality Personas Shape Emotion Attribution in Large Language Models. Mahammed Kamruzzaman, Abdullah Al-Monsur, Gene Louis Kim, Anshuman Chhabra. 48-68
- REGULAR: A Framework for Relation-Guided Multi-Span Question Generation. Jiayi Lin 0008, Chenyang Zhang 0004, Bingxuan Hou, Dongyu Zhang 0003, Qingqing Hong, Junli Wang 0001. 69-85
- Feature Decomposition-Augmentation Network for Multimodal Sentiment Analysis. Dapeng Yin, Bingxuan Hou, Mengna Gao, Shuyue Zhu, Jun-li Wang 0001. 86-98
- CSPLADE: Learned Sparse Retrieval with Causal Language Models. Zhichao Xu 0001, Aosong Feng, Yijun Tian 0006, Haibo Ding, Lin Lee Cheong. 99-114
- Bias Amplification: Large Language Models as Increasingly Biased Media. Ze Wang, Zekun Wu 0003, Yichi Zhang, Xin Guan, Navya Jain, Qinyang Lu, Saloni Gupta, Adriano S. Koshiyama. 115-132
- STAR: Self-Automated Back-Querying for Production Data Generation. Kellen Tan Cheng, Anna Lisa Gentile, Chad DeLuca, Guang-Jie Ren. 133-148
- GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference. Chao Zeng, Songwei Liu, Shu Yang, Fangmin Chen, Xing Mei, Lean Fu. 149-165
- Explainable Ethical Assessment on Human Behaviors by Generating Conflicting Social Norms. Yuxi Sun 0011, Wei Gao 0001, Hongzhan Lin 0001, Jing Ma 0004, Wenxuan Zhang 0001. 166-184
- Enhancing ID and Text Fusion via Alternative Training in Session-based Recommendation. Juanhui Li, Haoyu Han 0001, Zhikai Chen, Harry Shomer, Wei Jin 0009, Amin Javari, Hui Liu 0031. 185-199
- Chain of Functions: A Programmatic Pipeline for Fine-Grained Chart Reasoning Data Generation. Zijian Li 0023, Jingjing Fu, Lei Song 0001, Jiang Bian 0002, Jun Zhang 0004, Rui Wang 0028. 200-234
- Topology-Aware Gated Graph Neural Network for Social Bot Detection. Pi Jiebin, Yantuan Xian, Yuxin Huang 0004, Yan Xiang, Ran Song 0002, Zhengtao Yu 0001. 235-245
- Minimizing Queries, Maximizing Impact: Adaptive Score-Based Attack and Defense for Sentiment Analysis. Yigit Efe Enhos, Shira Wein, Scott Alfeld. 246-258
- Generating Text from Uniform Meaning Representation. Emma Markle, Reihaneh Iranmanesh, Shira Wein. 259-271
- Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems? Kai Yan, Yufei Xu, Zhengyin Du, Xuesong Yao, Zheyu Wang, Xiaowen Guo, Jiecao Chen. 272-291
- Judging the Judges: A Systematic Study of Position Bias in LLM-as-a-Judge. Lin Shi, Chiyu Ma, Wenhua Liang, Xingjian Diao, Weicheng Ma, Soroush Vosoughi. 292-314
- Item-Language Model: Improving Large Language Model for Recommendation via Item-Language Representation Learning. Li Yang, Anushya Subbiah, Hardik Patel, Judith Yue Li, Yanwei Song, Reza Mirghaderi, Vikram Aggarwal, Fuli Feng, Zenglin Xu, Dongfang Liu, Qifan Wang 0001. 315-330
- Breaking Language Barriers or Reinforcing Bias? A Study of Gender and Racial Disparities in Multilingual Contrastive Vision Language Models. Zahraa Al Sahili, Ioannis Patras, Matthew Purver. 331-352
- A Scalable Pipeline for Estimating Verb Frame Frequencies Using Large Language Models. Adam M. Morgan, Adeen Flinker. 353-371
- Ibom NLP: A Step Toward Inclusive Natural Language Processing for Nigeria's Minority Languages. Oluwadara Kalejaiye, Luel Hagos Beyene, David Ifeoluwa Adelani, Mmekut-Mfon Gabriel Edet, Aniefon Daniel Akpan, Eno-Abasi Urua, Anietie Andy. 372-382
- An Analysis of the Impact of Problem Paraphrasing on LLM-Based Mathematical Problem Solving. Yerim Han, Hyein Seo, Hyuk Namgoong, Sangkeun Jung. 383-395
- Synthetic Singers: A Review of Deep-Learning-based Singing Voice Synthesis Approaches. Changhao Pan, Dongyu Yao, Yu Zhang 0126, Wenxiang Guo, Jingyu Lu, Zhiyuan Zhu, Zhou Zhao 0001. 396-416
- ASAudio: A Survey of Advanced Spatial Audio Research. Zhiyuan Zhu, Yu Zhang 0126, Wenxiang Guo, Changhao Pan, Zhou Zhao 0001. 417-442
- MossNet: Mixture of State-Space Experts is a Multi-Head Attention. Shikhar Tuli, James Seale Smith, Haris Jeelani, Chi-Heng Lin, Abhishek Patel, Vasili Ramanishka, Yen-Chang Hsu, Hongxia Jin. 443-458
- LLM-Based Behavior Prediction for Social Media Users with Continuous Memory. Kun Li, Chengwei Dai, Wei Zhou 0019, Songlin Hu 0001. 459-474
- Hassles and Uplifts Detection on Social Media Narratives. Jiyu Chen, Sarvnaz Karimi, Diego Mollá, Andreas Duenser, Maria Kangas, Cécile Paris. 475-489
- Role-Aware Language Models for Secure and Contextualized Access Control in Organizations. Saeed Almheiri, Yerulan Kongrat, Adrian Santosh, Ruslan Tasmukhanov, Josemaria Vera, Muhammad Dehan Al Kautsar, Fajri Koto. 490-511
- SEA-LION: Southeast Asian Languages in One Network. Raymond Ng, Thanh Ngan Nguyen, Yuli Huang, Ngee Chia Tai, Wai Yi Leong, Wei Qi Leong, Xianbin Yong, Jian Gang Ngui, Yosephine Susanto, Nicholas Cheng, Hamsawardhini Rengarajan, Peerat Limkonchotiwat, Adithya Venkatadri Hulagadri, Kok Wai Teng, Yeo Yeow Tong, Bryan Siow, Wei Yi Teo, Choon Meng Tan, Brandon Ong, Zhi Hao Ong, Jann Railey Montalan, Adwin Chan, Sajeban Antonyrex, Ren Lee, Esther Choa, David Ong Tat-Wee, Bing Jie Darius Liu, William-Chandra Tjhi, Erik Cambria, Leslie Teo. 512-526
- Do LLMs Need Inherent Reasoning Before Reinforcement Learning? A Study in Korean Self-Correction. Hongjin Kim, Jaewook Lee, KiYoung Lee, Jong-Hun Shin, Soojong Lim, Oh-Woog Kwon. 527-542
- Multilingual Iterative Model Pruning: What Matters? Haryo Akbarianto Wibowo, Haiyue Song, Hideki Tanaka, Masao Utiyama, Alham Fikri Aji, Raj Dabre. 543-571
- Counterfactual Evaluation for Blind Attack Detection in LLM-based Evaluation Systems. Lijia Liu, Takumi Kondo, Kyohei Atarashi, Koh Takeuchi 0001, Jiyi Li, Shigeru Saito, Hisashi Kashima. 572-584
- Hidden in Plain Text: Emergence & Mitigation of Steganographic Collusion in LLMs. Yohan Mathew, Ollie Matthews, Robert McCarthy, Joan Velja, Christian Schröder de Witt, Dylan Cope, Nandi Schoots. 585-624
- A Survey on LLM-Assisted Clinical Trial Recruitment. Shrestha Ghosh, Moritz Schneider, Carina Reinicke, Carsten Eickhoff. 625-646
- The Learning Dynamics of Subword Segmentation for Morphologically Diverse Languages. Francois Meyer, Jan Buys. 647-661
- PRALEKHA: Cross-Lingual Document Alignment for Indic Languages. Sanjay Suryanarayanan, Haiyue Song, Mohammed Safi Ur Rahman Khan, Anoop Kunchukuttan, Raj Dabre. 662-676
- Structured Document Translation via Format Reinforcement Learning. Haiyue Song, Johannes Eschbach-Dymanus, Hour Kaing, Sumire Honda, Hideki Tanaka, Bianka Buschbeck, Masao Utiyama. 677-697
- StuD: A Multimodal Approach for Stuttering Detection with RAG and Fusion Strategies. Pragya Khanna, Priyanka Kommagouni, Vamshi Raghu Simha Narasinga, Anil Vuppala. 698-707
- Deconstructing Attention: Investigating Design Principles for Effective Language Modeling. Huiyin Xue, Nafise Sadat Moosavi, Nikolaos Aletras. 708-727
- Fine-Tuning on Noisy Instructions: Effects on Generalization and Performance. Ahmed Alajrami, Xingwei Tan, Nikolaos Aletras. 728-742
- Captions Speak Louder than Images: Generalizing Foundation Models for E-commerce from High-quality Multimodal Instruction Data. Xinyi Ling, Hanwen Du, Bo Peng 0009, Zhihui Zhu, Xia Ning. 743-768
- EcomMMMU: Strategic Utilization of Visuals for Robust Multimodal E-commerce Models. Xinyi Ling, Hanwen Du, Zhihui Zhu, Xia Ning. 769-790
- Multilingual, Not Multicultural: Uncovering the Cultural Empathy Gap in LLMs through a Comparative Empathetic Dialogue Benchmark. Woojin Lee, Yujin Sim, Hongjin Kim, Harksoo Kim. 791-809
- HiPPO: Exploring A Novel Hierarchical Pronunciation Assessment Approach for Spoken Languages. Bi-Cheng Yan, Hsin-Wei Wang, Fu-An Chao, Tien-Hong Lo, Yung-Chang Hsu, Berlin Chen. 810-823
- Positional Bias in Long-Document Ranking: Impact, Assessment, and Mitigation. Leonid Boytsov, David Akinpelu, Nipun Katyal, Tianyi Lin, Fangwei Gao, Yutian Zhao, Jeffrey Huang, Eric Nyberg. 824-856
- How Reliable are Causal Probing Interventions? Marc E. Canby, Adam Davies, Chirag Rastogi, Julia Hockenmaier. 857-878
- wavCSE: Learning Fixed-size Unified Speech Embeddings via Feature-based Multi-Task Learning. Braveenan Sritharan, Uthayasanker Thayasivam. 879-887
- Unveiling Empathic Triggers in Online Interactions via Empathy Cause Identification. Calliope Chloe Bandera, Gyeongeun Lee, Natalie Parde. 888-899
- Assessing the Limits of In-Context Learning beyond Functions using Partially Ordered Relation. Debanjan Dutta, Faizanuddin Ansari, Swagatam Das. 900-918
- ProSwitch: Knowledge-Guided Instruction Tuning to Switch Between Professional and Non-Professional Responses. Chang Zong, Yuyan Chen, Weiming Lu 0001, Jian Shao 0001, Yongfeng Huang, Heng Chang, Yueting Zhuang. 919-935
- Reasoning Models Reason Well, Until They Don't. Revanth Rameshkumar, Jimson Huang, Yunxin Sun 0001, Fei Xia, Abulhair Saparov. 936-956
- Chain-of-Query: Unleashing the Power of LLMs in SQL-Aided Table Understanding via Multi-Agent Collaboration. Songyuan Sui, Hongyi Liu, Serena Liu, Li Li 0035, Soo Hyun Choi, Rui Chen 0012, Xia Hu 0001. 957-986
- Multimodal Language Models for Financial Forecasting from Interleaved Sequences of Text and Time Series. Ross Koval, Nicholas Andrews, Xifeng Yan. 987-1001
- Comparing Language Models of Different Scales for Security-Focused Tabular Query Generation and Reasoning. Varivashya Poladi, Sandipan Dandapat. 1002-1016
- Generate but Verify: Answering with Faithfulness in RAG-based Question Answering. Simone Filice, Elad Haramaty, Guy Horowitz, Zohar S. Karnin, Liane Lewin-Eytan, Alex Shtoff. 1017-1037
- Critical Thinking: Which Kinds of Complexity Govern Optimal Reasoning Length? Celine Lee, Alexander M. Rush, Keyon Vafa. 1038-1060
- MAJI: A Multi-Agent Workflow for Augmenting Journalistic Interviews. Kaiwen Guo, Yimeng Wu. 1061-1083
- How Aligned Are Unimodal Language and Graph Encodings of Chemical Molecules? Congfeng Cao, Zhi Zhang, Jelke Bloem, Khalil Sima'an. 1084-1097
- Interpreting Multi-Attribute Confounding through Numerical Attributes in Large Language Models. Hirohane Takagi, Gouki Minegishi, Shota Kizawa, Issey Sukeda, Hitomi Yanaka. 1098-1115
- EmplifAI: a Fine-grained Dataset for Japanese Empathetic Medical Dialogues in 28 Emotion Labels. Wan-Jou She, Lis Pereira, Fei Cheng 0002, Sakiko Yahata, Panote Siriaraya, Eiji Aramaki. 1116-1131
- Beyond Classification: Towards Speech Emotion Reasoning with Multitask AudioLLMs. Wenyu Zhang, Yingxu He, Geyu Lin, Zhuohan Liu, Shuo Sun, Bin Wang 0040, Xunlong Zou, Jeremy H. M. Wong, Qiongqiong Wang, Hardik Bhupendra Sailor, Nancy F. Chen, AiTi Aw. 1132-1148
- On the Convergence of Moral Self-Correction in Large Language Models. Guangliang Liu, Haitao Mao, Bochuan Cao, Xitong Zhang, Zhiyu Xue, Rongrong Wang, Kristen Marie Johnson. 1149-1165
- FastVLM: Self-Speculative Decoding for Fast Vision-Language Model Inference. Divya Jyoti Bajpai, Manjesh Kumar Hanawal. 1166-1183
- Beyond Memorization: Assessing Semantic Generalization in Large Language Models Using Phrasal Constructions. Wesley Scivetti, Melissa Torgbi, Mollie Shichman, Taylor Pellegrin, Austin Blodgett, Claire Bonial, Harish Tayyar Madabushi. 1184-1201
- Improving Sign Language Understanding with a Multi-Stream Masked Autoencoder Trained on ASL Videos. Junwen Mo, MinhDuc Vo, Hideki Nakayama. 1202-1218
- Quantifying Phonosemantic Iconicity Distributionally in 6 Languages. George Flint, Kaustubh Kislay. 1219-1237
- Fine-grained Confidence Estimation for Spurious Correctness Detection in Large Language Models. Ai Ishii, Naoya Inoue, Hisami Suzuki, Satoshi Sekine. 1238-1257
- Observing Micromotives and Macrobehavior of Large Language Models. Yuyang Cheng, Xingwei Qu, Tomas Goldsack, Chenghua Lin, Chung-Chi Chen. 1258-1276
- SciHallu: A Multi-Granularity Hallucination Detection Dataset for Scientific Writing. Adiba Ibnat Hossain, Sagnik Ray Choudhury, Hamed Alhoori. 1277-1304
- Enhancing Investment Opinion Ranking through Argument-Based Sentiment Analysis. Chung-Chi Chen 0001, Hen-Hsen Huang, Hsin-Hsi Chen, Hiroya Takamura, Ichiro Kobayashi, Yusuke Miyao. 1305-1315
- A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP. Shinnosuke Ono, Issey Sukeda, Takuro Fujii, Kosei Buma, Shunsuke Sasaki. 1316-1332
- Investigating Feasibility of Large Language Model Agent Collaboration in Minecraft and Comparison with Human-Human Collaboration. Yuki Hirota, Ryuichiro Higashinaka. 1333-1347
- Is Fine-Tuning an Effective Solution? Reassessing Knowledge Editing for Unstructured Data. Hao Xiong, Chuanyuan Tan, Wenliang Chen. 1348-1361
- Optimizing the Arrangement of Citations in Related Work Section. Masashi Oshika, Ryohei Sasano. 1362-1373
- Ability Transfer Through Language Mixing. Petr Hyner, Jan Mrógala, Jan Hula. 1374-1381
- Enhancing Training Data Quality through Influence Scores for Generalizable Classification: A Case Study on Sexism Detection. Rabiraj Bandyopadhyay, Dennis Assenmacher, Jose Maria Alonso-Moral, Claudia Wagner 0001. 1382-1403
- Mitigating Label Length Bias in Large Language Models. Mario Sanz-Guerrero, Katharina von der Wense. 1404-1420
- Social Bias in Popular Question-Answering Benchmarks. Angelie Kraft, Judith Simon 0001, Sonja Schimmler. 1421-1438
- ProofTeller: Exposing recency bias in LLM reasoning and its side effects on communication. Mayank Jobanputra, Alisa Kovtunova, Brisca Balthes, Fedor Grigoryevich Pogulskiy, Yifan Wang 0019, Stefan Borgwardt, Vera Demberg. 1439-1462
- Relation Extraction or Pattern Matching? Unravelling the Generalisation Limits of Language Models for Biographical RE. Varvara Arzt, Allan Hanbury, Michael Wiegand, Gábor Recski, Terra Blevins. 1463-1484
- Whose story is it? Personalizing story generation by inferring author styles. Nischal Ashok Kumar, Chau Minh Pham, Mohit Iyyer, Andrew Lan. 1485-1540
- The Translation Barrier Hypothesis: Multilingual Generation with Large Language Models Suffers from Implicit Translation Failure. Niyati Bafna, Tianjian Li, Kenton Murray, David R. Mortensen, David Yarowsky, Hale Sirin, Daniel Khashabi. 1541-1568
- TaCL-CoMoE: Task-adaptive Contrastive Learning with Cooperative Mixture of Experts for Multi-task Social Media Analysis. Xingren Wang, Hongde Liu 0002, Shanhong Liu, Feiyang Meng, Chenyuan He, Senbin Zhu, Li Zechen, Yuxiang Jia. 1569-1581
- Towards Generalizable Generic Harmful Speech Datasets for Implicit Hate Speech Detection. Saad Almohaimeed, Saleh Almohaimeed, Damla Turgut, Ladislau Bölöni. 1582-1592
- SEAGraph: Unveiling the Whole Story of Paper Review Comments. Jianxiang Yu 0001, Jiaqi Tan 0006, Zichen Ding 0002, Jiapeng Zhu 0002, Jiahao Li, Yao Cheng 0009, Qier Cui, Yunshi Lan, Yao Liu, Xiang Li 0067. 1593-1614
- A Comparative Study of Human-operated and AI-driven Guidance with a Teleoperated Mobile Robot. Ao Guo, Shota Mochizuki, Sanae Yamashita, Kenya Hoshimure, Jun Baba, Ryuichiro Higashinaka. 1615-1627
- R²-CoD: Understanding Text-Graph Complementarity in Relational Reasoning via Knowledge Co-Distillation. Zhen Wu, Ritam Dutt, Luke Breitfeller, Armineh Nourbakhsh, Siddharth Parekh, Carolyn P. Rosé. 1628-1652
- ThaiOCRBench: A Task-Diverse Benchmark for Vision-Language Understanding in Thai. Surapon Nonesung, Teetouch Jaknamon, Sirinya Chaiophat, Natapong Nitarach, Chanakan Wittayasakpan, Warit Sirichotedumrong, Adisai Na-Thalang, Kunat Pipatanakul. 1653-1675
- An Adversary-Resistant Multi-Agent LLM System via Credibility Scoring. Sana Ebrahimi, Mohsen Dehghankar, Abolfazl Asudeh. 1676-1693
- Are LLMs Rigorous Logical Reasoners? Empowering Natural Language Proof Generation by Stepwise Decoding with Contrastive Learning. Ying Su, Mingwen Liu, Zhijiang Guo. 1694-1708
- NyayaRAG: Realistic Legal Judgment Prediction with RAG under the Indian Common Law System. Shubham Kumar Nigam, Balaramamahanthi Deepak Patnaik, Shivam Mishra, Ajay Varghese Thomas, Noel Shallum, Kripabandhu Ghosh, Arnab Bhattacharya 0001. 1709-1726
- Exploring Working Memory Capacity in LLMs: From Stressors to Human-Inspired Strategies. Eunjin Hong, Sumin Cho, Juae Kim. 1727-1744
- CLASSER: Cross-lingual Annotation Projection enhancement through Script Similarity for Fine-grained Named Entity Recognition. Prachuryya Kaushik, Ashish Anand. 1745-1760
- On the Interplay between Positional Encodings, Morphological Complexity, and Word Order Flexibility. Kushal Tatariya, Wessel Poelman, Miryam de Lhoneux. 1761-1778
- Doppelganger-JC: Benchmarking the LLMs' Understanding of Cross-Lingual Homographs between Japanese and Chinese. Yuka Kitamura, Jiahao Huang, Akiko Aizawa. 1779-1794
- Don't Take it Literally! Idiom-aware Vietnamese Translation via In-context Learning. Luan Thanh Nguyen, Parisa KordJamshidi. 1795-1814
- LLMs Do Not See Age: Assessing Demographic Bias in Automated Systematic Review Synthesis. Favour Yahdii Aghaebe, Elizabeth Williams, Tanefa Apekey, Nafise Sadat Moosavi. 1815-1833
- KERLQA: Knowledge-Enhanced Reinforcement Learning for Question Answering in Low-resource Languages. Sello Ralethe, Jan Buys. 1834-1846
- The Visual Counter Turing Test (VCT²): A Benchmark for Evaluating AI-Generated Image Detection and the Visual AI Index (V_AI). Nasrin Imanpour, Abhilekh Borah, Shashwat Bajpai, Subhankar Ghosh, Sainath Reddy Sankepally, Hasnat Md Abdullah, Nishoak Kosaraju, Shreyas Dixit, Ashhar Aziz, Shwetangshu Biswas, Vinija Jain, Aman Chadha, Song Wang, Amit P. Sheth, Amitava Das 0001. 1847-1862
- Hildoc: Leveraging Hilbert Curve Representation for Accurate and Efficient Document Retrieval. Muhammad Al-Qurishi, Zhaozhi Qian, Faroq Al-Tam, Riad Souissi. 1863-1876
- Rethinking what matters: Effective and Robust Multilingual Realignment for Low-Resource Languages. Quang-Phuoc Nguyen, David Anugraha, Félix Gaschi, Jun Bin Cheng, En-Shiun Annie Lee. 1877-1905
- Decode Like a Clinician: Enhancing LLM Fine-Tuning with Temporal Structured Data Representation. Daniel Fadlon, David Dov, Aviya Bennett, Daphna Heller-Miron, Gad Levy, Kfir Bar, Ahuva Weiss-Meilik. 1906-1922
- Curved Worlds, Clear Boundaries: Generalizing Speech Deepfake Detection using Hyperbolic and Spherical Geometry Spaces. Farhan Sheth, Girish, Mohd Mujtaba Akhtar, Muskaan Singh. 1923-1932
- MELAC: Massive Evaluation of Large Language Models with Alignment of Culture in Persian Language. Farhan Farsi, Farnaz Aghababaloo, Shahriar Shariati Motlagh, Parsa Ghofrani, MohammadAli SadraeiJavaheri, Shayan Bali, Amirhossein Shabani, Farbod Bijary, Ghazal Zamaninejad, Amirmohammad Salehoof, Saeedeh Momtazi. 1933-1950
- Atomic Consistency Preference Optimization for Long-Form Question Answering. Jingfeng Chen, Raghuveer Thirukovalluru, Junlin Wang, Kaiwei Luo, Bhuwan Dhingra. 1951-1963
- Breaking Bad: Norms for Valence, Arousal, and Dominance for over 10k English Multiword Expressions. Saif M. Mohammad. 1964-1988
- A Multimodal Recaptioning Framework to Account for Perceptual Diversity Across Languages in Vision-Language Modeling. Kyle Buettner, Jacob Emmerson, Adriana Kovashka. 1989-2006
- Characterizing Mamba's Selective Memory using Auto-Encoders. Tamanna Hossain, Robert L. Logan IV, Ganesh Jagadeesan, Sameer Singh 0001, Joel R. Tetreault, Alejandro Jaimes. 2007-2022
- Task-Aligned Tool Recommendation for Large Language Models. Hang Gao, Yongfeng Zhang. 2023-2045
- INTERCHART: Benchmarking Visual Reasoning Across Decomposed and Distributed Chart Information. Anirudh Iyengar Kaniyar Narayana Iyengar, Srija Mukhopadhyay, Adnan Qidwai, Shubhankar Singh, Dan Roth 0001, Vivek Gupta 0001. 2046-2067
- LangCompress: Language-Aware Compression of Large Language Models. Dieu-Hien Nguyen, Nguyen-Khang Le, Truong Dinh Do, Le-Minh Nguyen 0001. 2068-2077
- The Confidence Paradox: Can LLM Know When It's Wrong? Sahil Tripathi, Md Tabrez Nafis, Imran Hussain, Jiechao Gao. 2078-2087
- DharmaBench: Evaluating Language Models on Buddhist Texts in Sanskrit and Tibetan. Kai Golan Hashiloni, Shay Cohen, Asaf Shina, Jingyi Yang, Orr Meir Zwebner, Nicola Bajetta, Guy Bilitski, Rebecca Sundén, Guy Maduel, Ryan Conlon, Ari Barzilai, Daniel Mass, Shanshan Jia, Aviv Naaman, Sonam Choden, Sonam Jamtsho, Yadi Qu, Harunaga Isaacson, Dorji Wangchuk, Shai Fine, Orna Almogi, Kfir Bar. 2088-2110
- BhashaSetu: Cross-Lingual Knowledge Transfer from High-Resource to Extreme Low-Resource Languages. Subhadip Maji, Arnab Bhattacharya 0001. 2111-2129
- Are Humans as Brittle as Large Language Models? Jiahui Li, Sean Papay, Roman Klinger. 2130-2155
- Large Temporal Models: Unlocking Temporal Understanding in LLMs for Temporal Relation Classification. Omri Homburger, Kfir Bar. 2156-2171
- What Are They Talking About? A Benchmark of Knowledge-Grounded Discussion Summarization. Weixiao Zhou, Junnan Zhu, Gengyao Li, Xianfu Cheng, Xinnian Liang, Feifei Zhai, Zhoujun Li 0001. 2172-2191
- CtrlShift: Steering Language Models for Dense Quotation Retrieval with Dynamic Prompts. Chuang Liang, Wei Li, Yanqiu Shao. 2192-2204
- PerCoR: Evaluating Commonsense Reasoning in Persian via Multiple-Choice Sentence Completion. Morteza Alikhani, Mohammadtaha Bagherifard, Erfan Zinvandi, Mehran Sarmadi. 2205-2224
- Q2E: Query-to-Event Decomposition for Zero-Shot Multilingual Text-to-Video Retrieval. Shubhashis Roy Dipta, Francis Ferraro. 2225-2245
- VideoChain: A Transformer-Based Framework for Multi-hop Video Question Generation. Arpan Phukan, Anupam Pandey, Deepjyoti Bodo, Asif Ekbal. 2246-2266
- Interpreting the Effects of Quantization on LLMs. Manpreet Singh, Hassan Sajjad 0001. 2267-2281
- Large Language Models Encode Semantics and Alignment in Linearly Separable Representations. Baturay Saglam, Paul Kassianik, Blaine Nelson, Sajana Weerawardhena, Yaron Singer, Amin Karbasi. 2282-2303
- Beyond statistical significance: Quantifying uncertainty and statistical variability in multilingual and multitask NLP evaluation. Jonne Sälevä, Duygu Ataman, Constantine Lignos. 2304-2321
- What Would You Ask When You First Saw a²+b²=c²? Evaluating LLM on Curiosity-Driven Question Generation. Shashidhar Reddy Javaji, Zining Zhu. 2322-2354
- Can AI Validate Science? Benchmarking LLMs on Claim →Evidence Reasoning in AI Papers. Shashidhar Reddy Javaji, Yupeng Cao, Haohang Li, Yangyang Yu, Nikhil Muralidhar, Zining Zhu 0005. 2355-2379
- More Than a Score: Probing the Impact of Prompt Specificity on LLM Code Generation. Yangtian Zi, Harshitha Menon, Arjun Guha. 2380-2402
- Integrating Video and Text: A Balanced Approach to Multimodal Summary Generation and Evaluation. Galann Pennec, Zhengyuan Liu, Nicholas Asher, Philippe Muller, Nancy F. Chen. 2403-2426
- PMPO: A Self-Optimizing Framework for Creating High-Fidelity Measurement Tools for Social Bias in Large Language Models. Zeqiang Wang, Yuqi Wang, Xinyue Wu, Chenxi Li, Yiran Liu, Linghan Ge, Zhan Yu, Jiaxin Shi, Suparna De. 2427-2440
- ELR-1000: A Community-Generated Dataset for Endangered Indic Indigenous Languages. Neha Joshi, Pamir Gogoi, Aasim Mirza, Aayush Jansari, Aditya Yadavalli, Ayushi Pandey, Arunima Shukla, Deepthi Sudharsan, Kalika Bali, Vivek Seshadri. 2441-2457
- Pragmatic Theories Enhance Understanding of Implied Meanings in LLMs. Takuma Sato, Seiya Kawano, Koichiro Yoshino. 2458-2477
- IndicClaimBuster: A Multilingual Claim Verification Dataset. Pritam Pal, Shyamal Krishna Jana, Dipankar Das 0001. 2478-2489
- IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization. Ahmed Frikha 0002, Nassim Walha, Krishna Kanth Nakka, Ricardo Mendes, Xue Jiang, Xuebing Zhou. 2490-2501
- Crypto-LLM: Two-Stage Language Model Pre-training with Ciphered and Natural Language Data. Yohei Kobashi, Fumiya Uchiyama, Takeshi Kojima, Andrew Gambardella, Qi Cao, Yusuke Iwasawa, Yutaka Matsuo. 2502-2520
- Not Just a Piece of Cake: Cross-Lingual Fine-Tuning for Idiom Identification. Ofri Hefetz, Kai Golan Hashiloni, Alon Mannor, Kfir Bar. 2521-2537
- Hallucinations in Code Change to Natural Language Generation: Prevalence and Evaluation of Detection Metrics. Chunhua Liu, Hong-Yi Lin, Patanamon Thongtanunam. 2538-2560
- Differential Mamba. Nadav Schneider, Itamar Zimerman, Eliya Nachmani. 2561-2575
- From Templates to Natural Language: Generalization Challenges in Instruction-Tuned LLMs for Spatial Reasoning. Kranti Chalamalasetti, Sherzod Hakimov, David Schlangen. 2576-2591
- Online Learning Defense against Iterative Jailbreak Attacks via Prompt Optimization. Masahiro Kaneko, Zeerak Talat, Timothy Baldwin. 2592-2609
- HARBOR: Exploring Persona Dynamics in Multi-Agent Competition. Kenan Jiang, Li Xiong, Fei Liu. 2610-2632
- A Diagnostic Framework for Auditing Reference-Free Vision-Language Metrics. Angeline Charles, Srikant Panda, Amit Agarwal, Hitesh Laxmichand Patel, Priyaranjan Pattnayak, Bhargava Kumar, Tejaswini Kumar. 2633-2644
- Small Changes, Large Consequences: Analyzing the Allocational Fairness of LLMs in Hiring Contexts. Preethi Seshadri, Hongyu Chen, Sameer Singh 0001, Seraphina Goldfarb-Tarrant. 2645-2665
- CultureGuard: Towards Culturally-Aware Dataset and Guard Model for Multilingual Safety Applications. Raviraj Joshi, Rakesh Paul, Kanishk Singla, Anusha Kamath, Michael Evans, Katherine Luna, Shaona Ghosh, Utkarsh Vaidya, Eileen Peters Long, Sanjay Singh Chauhan, Niranjan Wartikar. 2666-2685
- Revisiting Word Embeddings in the LLM Era. Yash Mahajan, Matthew Freestone, Naman Bansal, Sathyanarayanan N. Aakur, Shubhra Kanti Karmaker. 2686-2717
- Who Remembers What? Tracing Information Fidelity in Human-AI Chains. Suvojit Acharjee, Utathya Aich, Diptarka Mandal, Asfak Ali. 2718-2726
- QA-Noun: Representing Nominal Semantics via Natural Language Question-Answer Pairs. Maria Tseytlin, Paul Roit, Omri Abend, Ido Dagan, Ayal Klein. 2727-2741
- On Memorization of Large Language Models in Logical Reasoning. Chulin Xie, Yangsibo Huang, Chiyuan Zhang, Da Yu, Xinyun Chen, Bill Yuchen Lin, Bo Li 0026, Badih Ghazi, Ravi Kumar 0001. 2742-2785
- ControlMed: Adding Reasoning Control to Medical Language Model. Sung-Min Lee, Siyoon Lee, Juyeon Kim, Kyoungmin Roh. 2786-2799
- No Universal Prompt: Unifying Reasoning through Adaptive Prompting for Temporal Table Reasoning. Abhishek Rajgaria, Kushagra Dixit, Mayank Vyas, Harshavardhan Kalalbandi, Dan Roth 0001, Vivek Gupta 0001. 2800-2821
- ClinStructor: AI-Powered Structuring of Unstructured Clinical Texts. Karthikeyan K, Raghuveer Thirukovalluru, David E. Carlson. 2822-2836
- Adaptive Collaborative Labeling with MLLMs for Low-Resource Multimodal Emotion Recognition. Wenwen Zhuang, Lu Xiang, Shubei Tang, Yaping Zhang, Yu Zhou. 2837-2853
- Understanding and Controlling Repetition Neurons and Induction Heads in In-Context Learning. Nhi Hoai Doan, Tatsuya Hiraoka, Kentaro Inui. 2854-2876
- ReVision: A Dataset and Baseline VLM for Privacy-Preserving Task-Oriented Visual Instruction Rewriting. Abhijit Mishra, Mingda Li, Hsiang Fu, Richard Noh, Minji Kim. 2877-2889
- Quantifying Cognitive Bias Induction in LLM-Generated Content. Abeer Alessa, Param Somane, Akshaya Lakshminarasimhan, Julian Skirzynski, Julian J. McAuley, Jessica Maria Echterhoff. 2890-2910
- Language Arithmetics: Towards Systematic Language Neuron Identification and Manipulation. Daniil Gurgurov, Katharina Trinley, Yusser Al Ghussin, Tanja Baeumel, Josef van Genabith, Simon Ostermann 0002. 2911-2937
- FOCUS: A Benchmark for Targeted Socratic Question Generation via Source-Span Grounding. Surawat Pothong, Machi Shimmei, Naoya Inoue, Paul Reisert, Ana Brassard, Wenzhi Wang, Shoichi Naito, Jungmin Choi, Kentaro Inui. 2938-2958
- MedPath: Multi-Domain Cross-Vocabulary Hierarchical Paths for Biomedical Entity Linking. Nishant Mishra 0001, Wilker Aziz, Iacer Calixto. 2959-2978
- AURA-QG: Automated Unsupervised Replicable Assessment for Question Generation. Rajshekar K, Harshad Khadilkar, Pushpak Bhattacharyya. 2979-2992
- EFSA-CLC: Enhancing Zero-shot Entity-level Financial Sentiment Analysis with Cross-lingual Collaboration. Senbin Zhu, Hongde Liu 0002, Chenyuan He, Yuxiang Jia. 2993-3003
- Enhancing Low-Resource Text Classification with LLM-Generated Corpora : A Case Study on Olfactory Reference Extraction. Cédric Boscher, Shannon Bruderer, Christine Largeron, Véronique Eglin, Elöd Egyed-Zsigmond. 3004-3027
- Learning from *Sufficient* Rationales: Analysing the Relationship Between Explanation Faithfulness and Token-level Regularisation Strategies. Jonathan Kamp, Lisa Beinborn, Antske Fokkens. 3028-3044
- ContrastScore: Towards Higher Quality, Less Biased, More Efficient Evaluation Metrics with Contrastive Evaluation. Xiao Wang, Daniil Larionov, Siwei Wu, Yiqi Liu, Steffen Eger, Nafise Sadat Moosavi, Chenghua Lin. 3045-3060
- ModernBERT or DeBERTaV3? Examining Architecture and Data Influence on Transformer Encoder Models Performance. Wissam Antoun, Benoît Sagot, Djamé Seddah. 3061-3074
- Simplified Rewriting Improves Expert Summarization. Xingmeng Zhao, Tongnian Wang, Anthony Rios. 3075-3097
- RASTeR: Robust, Agentic, and Structured Temporal Reasoning. Daniel Schumacher, Fatemeh Haji, Tara Grey, Niharika Bandlamudi, Nupoor Karnik, Gagana Uday Kumar, Cho-Yu Jason Chiang, Peyman Najafirad, Nishant Vishwamitra, Anthony Rios. 3098-3123
- ReasoningWeekly: A General Knowledge and Verbal Reasoning Challenge for Large Language Models. Zixuan Wu, Francesca Lucchetti, Aleksander Boruch-Gruszecki, Jingmiao Zhao, Carolyn Jane Anderson, Joydeep Biswas, Federico Cassano, Arjun Guha. 3124-3140
- Noise May Drown Out Words but Foster Compositionality: The Advantage of the Erasure and Deletion Noisy Channels on Emergent Communication. Cezary Klamra, Francijn Keur, Raquel G. Alhama. 3141-3166
- Zero-Shot Grammar Competency Estimation Using Large Language Model Generated Pseudo Labels. Sourya Dipta Das, Shubham Kumar, Kuldeep Yadav. 3167-3179
- LLM-Guided Lifecycle-Aware Clustering of Multi-Turn Customer Support Conversations. Priyaranjan Pattnayak, Sanchari Chowdhuri, Amit Agarwal, Hitesh Laxmichand Patel. 3180-3206
- Cascaded Information Disclosure for Generalized Evaluation of Problem Solving Capabilities. Yunxiang Yan, Tomohiro Sawada, Kartik Goyal. 3207-3234
- Minority-Aware Satisfaction Estimation in Dialogue Systems via Preference-Adaptive Reinforcement Learning. Yahui Fu 0001, Zi Haur Pang, Tatsuya Kawahara. 3235-3249
- The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1. Kaiwen Zhou 0002, Chengzhi Liu, Xuandong Zhao, Shreedhar Jangam, Jayanth Srinivasa, Gaowen Liu, Dawn Song, Xin Eric Wang. 3250-3265
- Agnus LLM: Robust and Flexible Entity Disambiguation with decoder-only Language Models. Kristian Noullet, Ayoub Ourgani, Niklas Thomas Lakner, Lukas Kinder, Tobias Käfer. 3266-3284
- MuSciClaims: Multimodal Scientific Claim Verification. Yash Kumar Lal, Manikanta Bandham, Mohammad Saqib Hasan, Apoorva Kashi, Mahnaz Koupaee, Niranjan Balasubramanian. 3285-3307
- Program Synthesis Dialog Agents for Interactive Decision-Making. Matthew Toles, Nikhil Balwani, Rattandeep Singh, Valentina Giulia Sartori Rodriguez, Zhou Yu. 3308-3323
- Learning a Continue-Thinking Token for Enhanced Test-Time Scaling. Liran Ringel, Elad Tolochinsky, Yaniv Romano. 3324-3345
- Video-guided Machine Translation: A Survey of Models, Datasets, and Challenges. Pinaki Das, Virendra Singh, Pushpak Bhattacharyya, Gholamreza Haffari. 3346-3356
- ProST: Progressive Sub-task Training for Pareto-Optimal Multi-agent Systems Using Small Language Models. Biddut Sarker Bijoy, Mohammad Saqib Hasan, Pegah Alipoormolabashi, Avirup Sil, Aruna Balasubramanian, Niranjan Balasubramanian. 3357-3375
- Rethinking Large Language Model Architectures for Sequential Recommendations. Hanbing Wang, Xiaorui Liu, Wenqi Fan, Xiangyu Zhao 0001, Venkataramana Kini, Devendra Pratap Yadav, Fei Wang, Zhen Wen, Hui Liu 0031. 3376-3391
- DSBC : Data Science task Benchmarking with Context engineering. Ram Mohan Rao Kadiyala, Jebish Purbey, Siddhant Gupta, Giulio Martini, Suman Debnath, Hamza Farooq. 3392-3424
- Improving Document Retrieval Coherence for Semantically Equivalent Queries. Stefano Campese, Alessandro Moschitti, Ivano Lauriola. 3425-3441
- Referring Expressions as a Lens into Spatial Language Grounding in Vision-Language Models. Akshar Tumu, Varad Shinde, Parisa KordJamshidi. 3442-3455
- FINDR: A Fast Influential Data Selector for NL2Code Pretraining. Xinliang Frederick Zhang, Lu Wang. 3456-3476
- Found in Translation: Measuring Multilingual LLM Consistency as Simple as Translate then Evaluate. Ashim Gupta, Maitrey Mehta, Zhichao Xu 0001, Vivek Srikumar. 3477-3496
- Do Persona-Infused LLMs Affect Performance in a Strategic Reasoning Game? John Licato, Stephen Steinle. 3497-3528
- FarSense: A Comprehensive Commonsense Benchmark and Evaluation Framework for the Farsi Language. Kamyar Zeinalipour, Neda Jamshidi, Seyedehbahareh Hejazi, Marco Maggini, Monica Bianchini, Simone Paoletti, Marco Gori. 3529-3599
- GL-CLiC: Global-Local Coherence and Lexical Complexity for Sentence-Level AI-Generated Text Detection. Rizky Adi, Bassamtiano Renaufalgi Irnawan, Yoshimi Suzuki, Fumiyo Fukumoto. 3600-3617
- Improving Multilingual Capabilities with Cultural and Local Knowledge in Large Language Models While Enhancing Native Performance. Ram Mohan Rao Kadiyala, Siddartha Pullakhandam, Siddhant Gupta, Jebish Purbey, Drishti Sharma, Kanwal Mehreen, Muhammad Arham, Suman Debnath, Hamza Farooq. 3618-3641
- AfriSpeech-MultiBench: A Verticalized Multidomain Multicountry Benchmark Suite for African Accented English ASR. Gabrial Zencha Ashungafac, Mardhiyah Sanni, Busayo Awobade, Alex Gichamba, Tobi Olatunji. 3642-3653
- An Offline Mobile Conversational Agent for Mental Health Support: Learning from Emotional Dialogues and Psychological Texts with Student-Centered Evaluation. Vimaleswar A, Prabhu Nandan Sahu, Nilesh Kumar Sahu, Haroon R. Lone. 3654-3673
- Rethinking Information Synthesis in Multimodal Question Answering A Multi-Agent Perspective. Tejas Anvekar, Krishna Singh Rajput, Chitta Baral, Vivek Gupta 0001. 3674-3686
- SurveyGen-I: Consistent Scientific Survey Generation with Evolving Plans and Memory-Guided Writing. Jing Chen, Zhiheng Yang, Yixian Shen, Jie Liu, Adam Belloum, Paola Grosso, Chrysa Papagianni. 3687-3714
- VAGUE-Gate: Plug-and-Play Local-Privacy Shield for Retrieval-Augmented Generation. Arshia Hemmat, Matin Moqadas, Ali Mamanpoosh, Amirmasoud Rismanchian, Afsaneh Fatemi. 3715-3730
- PII-Scope: A Comprehensive Study on Training Data Privacy Leakage in Pretrained LLMs. Krishna Kanth Nakka, Ahmed Frikha 0002, Ricardo Mendes, Xue Jiang, Xuebing Zhou. 3731-3765
- Indic-S2ST: a Multilingual and Multimodal Many-to-Many Indic Speech-to-Speech Translation Dataset. Nivedita Sethiya, Puneet Walia, Chandresh Kumar Maurya. 3766-3775