- Frontmatter [doi]
- Named Entity Recognition Under Domain Shift via Metric Learning for Life SciencesHongyi Liu, Qingyun Wang 0005, Payam Karisani, Heng Ji. 1-21 [doi]
- Text Diffusion Model with Encoder-Decoder Transformers for Sequence-to-Sequence GenerationHongyi Yuan, Zheng Yuan, Chuanqi Tan, Fei Huang, Songfang Huang. 22-39 [doi]
- An Interactive Framework for Profiling News Media SourcesNikhil Mehta 0003, Dan Goldwasser. 40-58 [doi]
- Assessing Logical Puzzle Solving in Large Language Models: Insights from a Minesweeper Case StudyYinghao Li, Haorui Wang, Chao Zhang. 59-81 [doi]
- TelME: Teacher-leading Multimodal Fusion Network for Emotion Recognition in ConversationTaeyang Yun, Hyunkuk Lim, Jeonghwan Lee, Min Song. 82-95 [doi]
- Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text SummariesSeanie Lee, Jianpeng Cheng 0002, Joris Driesen, Alexandru Coca, Anders Johannsen. 96-111 [doi]
- Promptly Predicting Structures: The Return of InferenceMaitrey Mehta, Valentina Pyatkin, Vivek Srikumar. 112-130 [doi]
- On Linearizing Structured Data in Encoder-Decoder Language Models: Insights from Text-to-SQLYutong Shao, Ndapa Nakashole. 131-156 [doi]
- Extractive Summarization with Text GeneratorThang Le, Anh Tuan Luu. 157-174 [doi]
- Self-generated Replay Memories for Continual Neural Machine TranslationMichele Resta, Davide Bacciu. 175-191 [doi]
- Measuring and Improving Chain-of-Thought Reasoning in Vision-Language ModelsYangyi Chen, Karan Sikka, Michael Cogswell, Heng Ji, Ajay Divakaran. 192-210 [doi]
- Building Knowledge-Guided Lexica to Model Cultural VariationShreya Havaldar, Salvatore Giorgi, Sunny Rai, Thomas Talhelm, Sharath Chandra Guntuku, Lyle H. Ungar. 211-226 [doi]
- Adaptive Rank Selections for Low-Rank Approximation of Language ModelsShangqian Gao, Ting Hua, Yen-Chang Hsu, Yilin Shen, Hongxia Jin. 227-241 [doi]
- An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text TranslationPengzhi Gao, Ruiqing Zhang, Zhongjun He, Hua Wu 0003, Haifeng Wang 0001. 242-256 [doi]
- Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-CollaborationZhenhailong Wang, Shaoguang Mao, Wenshan Wu, Tao Ge 0001, Furu Wei, Heng Ji. 257-279 [doi]
- FPT: Feature Prompt Tuning for Few-shot Readability AssessmentZiyang Wang, Sanwoo Lee, Hsiu-Yuan Huang, Yunfang Wu. 280-295 [doi]
- Self-Prompting Large Language Models for Zero-Shot Open-Domain QAJunlong Li, Jinyuan Wang, Zhuosheng Zhang 0001, Hai Zhao 0001. 296-310 [doi]
- Head-to-Tail: How Knowledgeable are Large Language Models (LLMs)? A.K.A. Will LLMs Replace Knowledge Graphs?Kai Sun, Yifan Ethan Xu, Hanwen Zha, Yue Liu, Xin Luna Dong. 311-325 [doi]
- kNN-ICL: Compositional Task-Oriented Parsing Generalization with Nearest Neighbor In-Context LearningWenting Zhao 0006, Ye Liu 0006, Yao Wan 0001, Yibo Wang, Qingyang Wu, Zhongfen Deng, Jiangshu Du, Shuaiqi Liu 0002, Yunlong Xu, Philip S. Yu. 326-337 [doi]
- ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation SystemsJon Saad-Falcon, Omar Khattab, Christopher Potts, Matei Zaharia. 338-354 [doi]
- DEMO: A Statistical Perspective for Efficient Image-Text MatchingFan Zhang, Xian-Sheng Hua 0001, Chong Chen 0002, Xiao Luo 0001. 355-369 [doi]
- SeaEval for Multilingual Foundation Models: From Cross-Lingual Alignment to Cultural ReasoningBin Wang, Zhengyuan Liu, Xin Huang, Fangkai Jiao, Yang Ding, AiTi Aw, Nancy Chen. 370-390 [doi]
- Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided RevisionSeongyun Lee, Sue Hyun Park, Yongrae Jo, Minjoon Seo. 391-404 [doi]
- LLMs Are Few-Shot In-Context Low-Resource Language LearnersSamuel Cahyawijaya, Holy Lovenia, Pascale Fung. 405-433 [doi]
- Simple and effective data augmentation for compositional generalizationYuekun Yao, Alexander Koller. 434-449 [doi]
- Rethinking Tabular Data Understanding with Large Language ModelsTianyang Liu 0003, Fei Wang, Muhao Chen. 450-482 [doi]
- From Shortcuts to Triggers: Backdoor Defense with Denoised PoEQin Liu, Fei Wang, Chaowei Xiao, Muhao Chen. 483-496 [doi]
- BookSQL: A Large Scale Text-to-SQL Dataset for Accounting DomainRahul Kumar, Amar Raja Dibbu, Shrutendra Harsola, Vignesh Subrahmaniam, Ashutosh Modi. 497-516 [doi]
- FLAP: Flow-Adhering Planning with Constrained Decoding in LLMsShamik Roy, Sailik Sengupta, Daniele Bonadiman, Saab Mansour, Arshit Gupta. 517-539 [doi]
- DuRE: Dual Contrastive Self Training for Semi-Supervised Relation ExtractionYuxi Feng, Laks V. S. Lakshmanan. 540-555 [doi]
- Query-Efficient Textual Adversarial Example Generation for Black-Box AttacksZhen Yu, Zhenhua Chen, Kun He 0001. 556-569 [doi]
- Embrace Divergence for Richer Insights: A Multi-document Summarization Benchmark and a Case Study on Summarizing Diverse Information from News ArticlesKung-Hsiang Huang, Philippe Laban, Alexander R. Fabbri, Prafulla Kumar Choubey, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu. 570-593 [doi]
- AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples GenerationHaoyi Qiu, Kung-Hsiang Huang, Jingnong Qu, Nanyun Peng. 594-608 [doi]
- PILOT: Legal Case Outcome Prediction with Case LawLang Cao, Zifeng Wang 0010, Cao Xiao, Jimeng Sun 0001. 609-621 [doi]
- ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language ModelsZequan Liu, Jiawen Lyn, Wei Zhu, Xing Tian, Yvette Graham. 622-641 [doi]
- R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic PiecesHeng-Jui Chang, James R. Glass. 642-662 [doi]
- InsCL: A Data-efficient Continual Learning Paradigm for Fine-tuning Large Language Models with InstructionsYifan Wang, Yafei Liu, Chufan Shi, Haoling Li, Chen Chen 0015, Haonan Lu, Yujiu Yang. 663-677 [doi]
- Language Agnostic Code EmbeddingsSaiteja Utpala, Alex Gu, Pin-Yu Chen. 678-691 [doi]
- An Examination of the Compositionality of Large Generative Vision-Language ModelsTeli Ma, Rong Li, Junwei Liang 0008. 692-705 [doi]
- Two Heads are Better than One: Nested PoE for Robust Defense Against Multi-BackdoorsVictoria Graf, Qin Liu, Muhao Chen. 706-718 [doi]
- VertAttack: Taking Advantage of Text Classifiers' Horizontal VisionJonathan Rusert. 719-732 [doi]
- KDMCSE: Knowledge Distillation Multimodal Sentence Embeddings with Adaptive Angular margin Contrastive LearningCong-Duy Nguyen, Thong Nguyen, Xiaobao Wu, Anh Tuan Luu. 733-749 [doi]
- The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any languageJian Zhu, Changbing Yang, Farhan Samir, Jahurul Islam. 750-772 [doi]
- Think Before You Act: A Two-Stage Framework for Mitigating Gender Bias Towards Vision-Language TasksYunqi Zhang, Songda Li, Chunyuan Deng, Luyi Wang, Hui Zhao. 773-791 [doi]
- BeLLM: Backward Dependency Enhanced Large Language Model for Sentence EmbeddingsXianming Li, Jing Li. 792-804 [doi]
- Assessing Factual Reliability of Large Language Model KnowledgeWeixuan Wang, Barry Haddow, Alexandra Birch, Wei Peng. 805-819 [doi]
- Dial-MAE: ConTextual Masked Auto-Encoder for Retrieval-based Dialogue SystemsZhenpeng Su, Xing Wu 0002, Wei Zhou 0019, Guangyuan Ma, Songlin Hu. 820-830 [doi]
- Toolink: Linking Toolkit Creation and Using through Chain-of-Solving on Open-Source ModelCheng Qian, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu 0001. 831-854 [doi]
- Create! Don't Repeat: A Paradigm Shift in Multi-Label Augmentation through Label Creative GenerationLetian Wang, Xianggen Liu, Jiancheng Lv 0001. 855-869 [doi]
- Neurocache: Efficient Vector Retrieval for Long-range Language ModelingAli Safaya, Deniz Yuret. 870-883 [doi]
- Unveiling the Generalization Power of Fine-Tuned Large Language ModelsHaoran Yang, Yumeng Zhang, Jiaqi Xu, Hongyuan Lu, Pheng-Ann Heng, Wai Lam. 884-899 [doi]
- A Closer Look at the Self-Verification Abilities of Large Language Models in Logical ReasoningRuixin Hong, Hongming Zhang 0009, Xinyu Pang, Dong Yu 0001, Changshui Zhang. 900-925 [doi]
- Exploring Self-supervised Logic-enhanced Training for Large Language ModelsFangkai Jiao, Zhiyang Teng, Bosheng Ding, Zhengyuan Liu, Nancy F. Chen, Shafiq Joty. 926-941 [doi]
- MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical ReasoningDebrup Das, Debopriyo Banerjee, Somak Aditya, Ashish Kulkarni. 942-966 [doi]
- CoUDA: Coherence Evaluation via Unified Data AugmentationDawei Zhu, Wenhao Wu, Yifan Song, Fangwei Zhu, Ziqiang Cao, Sujian Li. 967-978 [doi]
- mEdIT: Multilingual Text Editing via Instruction TuningVipul Raheja, Dimitris Alikaniotis, Vivek Kulkarni, Bashar Alhafni, Dhruv Kumar 0005. 979-1001 [doi]
- Navigation as Attackers Wish? Towards Building Robust Embodied Agents under Federated LearningYunchao Zhang, Zonglin Di, Kaiwen Zhou, Cihang Xie, Xin Wang. 1002-1016 [doi]
- In-context Learning and Gradient Descent RevisitedGilad Deutch, Nadav Magar, Tomer Bar Natan, Guy Dar. 1017-1028 [doi]
- Corpus Considerations for Annotator Modeling and ScalingOlufunke Oluyemi Sarumi, Béla Neuendorf, Joan Plepi, Lucie Flek, Jörg Schlötterer, Charles Welch. 1029-1040 [doi]
- On Large Language Models' Hallucination with Regard to Known FactsChe Jiang, Biqing Qi, Xiangyu Hong, Dayuan Fu, Yang Cheng, Fandong Meng, Mo Yu, Bowen Zhou, Jie Zhou 0016. 1041-1053 [doi]
- "One-Size-Fits-All"? Examining Expectations around What Constitute "Fair" or "Good" NLG System BehaviorsLi Lucy, Su Lin Blodgett, Milad Shokouhi, Hanna M. Wallach, Alexandra Olteanu. 1054-1089 [doi]
- Language Models Hallucinate, but May Excel at Fact VerificationJian Guan 0002, Jesse Dodge, David Wadden, Minlie Huang, Hao Peng 0009. 1090-1111 [doi]
- A Rationale-centric Counterfactual Data Augmentation Method for Cross-Document Event Coreference ResolutionBowen Ding, Qingkai Min, Shengkun Ma, Yingjie Li, Linyi Yang, Yue Zhang. 1112-1140 [doi]
- TrojFSP: Trojan Insertion in Few-shot Prompt TuningMengxin Zheng, Jiaqi Xue, Xun Chen, Yanshan Wang, Qian Lou, Lei Jiang. 1141-1151 [doi]
- Ensuring Safe and High-Quality Outputs: A Guideline Library Approach for Language ModelsYi Luo, Zhenghao Lin, Yuhao Zhang, Jiashuo Sun, Chen Lin 0001, Chengjin Xu, Xiangdong Su, Yelong Shen, Jian Guo, Yeyun Gong. 1152-1197 [doi]
- X-PARADE: Cross-Lingual Textual Entailment and Information Divergence across ParagraphsJuan Diego Rodriguez, Katrin Erk, Greg Durrett. 1198-1222 [doi]
- Topics, Authors, and Institutions in Large Language Model Research: Trends from 17K arXiv PapersRajiv Movva, Sidhika Balachandar, Kenny Peng, Gabriel Agostini, Nikhil Garg 0001, Emma Pierson. 1223-1243 [doi]
- E⁵: Zero-shot Hierarchical Table Analysis using Augmented LLMs via Explain, Extract, Execute, Exhibit and ExtrapolateZhehao Zhang, Yan Gao 0002, Jian-Guang Lou. 1244-1258 [doi]
- S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language ModelFangyu Lei, Qian Liu, Yiming Huang, Shizhu He, Jun Zhao 0001, Kang Liu 0001. 1259-1286 [doi]
- MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction TuningFuxiao Liu, Xiaoyang Wang, Wenlin Yao, Jianshu Chen, Kaiqiang Song, Sangwoo Cho, Yaser Yacoob, Dong Yu 0001. 1287-1310 [doi]
- Visual Grounding Helps Learn Word Meanings in Low-Data RegimesChengxu Zhuang, Evelina Fedorenko, Jacob Andreas. 1311-1329 [doi]
- Accurate Knowledge Distillation via n-best RerankingHendra Setiawan. 1330-1345 [doi]
- AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question DecompositionZhaorun Chen, Zhuokai Zhao, Zhihong Zhu, Ruiqi Zhang, Xiang Li, Bhiksha Raj, Huaxiu Yao. 1346-1362 [doi]
- SEMQA: Semi-Extractive Multi-Source Question AnsweringTal Schuster, Ádám D. Lelkes, Haitian Sun, Jai Gupta, Jonathan Berant, William W. Cohen, Donald Metzler. 1363-1381 [doi]
- Fine-Tuning Language Models with Reward Learning on PolicyHao Lang, Fei Huang, Yongbin Li. 1382-1392 [doi]
- A Universal Dependencies Treebank for Highland Puebla NahuatlRobert Pugh, Francis M. Tyers. 1393-1403 [doi]
- COPAL-ID: Indonesian Language Reasoning with Local Culture and NuancesHaryo Akbarianto Wibowo, Erland Hilman Fuadi, Made Nindyatama Nityasya, Radityo Eko Prasojo, Alham Fikri Aji. 1404-1422 [doi]
- IterAlign: Iterative Constitutional Alignment of Large Language ModelsXiusi Chen, Hongzhi Wen, Sreyashi Nag, Chen Luo 0003, Qingyu Yin, Ruirui Li 0002, Zheng Li 0018, Wei Wang 0010. 1423-1433 [doi]
- OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State TrackingChia-Hsuan Lee 0001, Hao Cheng 0002, Mari Ostendorf. 1434-1445 [doi]
- Multi-Operational Mathematical Derivations in Latent SpaceMarco Valentino, Jordan Meadows, Lan Zhang, André Freitas. 1446-1458 [doi]
- Large Language Models Help Humans Verify Truthfulness - Except When They Are Convincingly WrongChenglei Si, Navita Goyal, Tongshuang Wu, Chen Zhao 0009, Shi Feng, Hal Daumé III, Jordan L. Boyd-Graber. 1459-1474 [doi]
- XferBench: a Data-Driven Benchmark for Emergent LanguageBrendon Boldt, David Mortensen. 1475-1489 [doi]
- Evaluating Large Language Models as Generative User Simulators for Conversational RecommendationSe-eun Yoon, Zhankui He, Jessica Maria Echterhoff, Julian J. McAuley. 1490-1504 [doi]
- A Symbolic Framework for Evaluating Mathematical Reasoning and Generalisation with TransformersJordan Meadows, Marco Valentino, Damien Teney, André Freitas. 1505-1523 [doi]
- Identifying Linear Relational Concepts in Large Language ModelsDavid Chanin, Anthony Hunter, Oana-Maria Camburu. 1524-1535 [doi]
- Benchmark Transparency: Measuring the Impact of Data on EvaluationVenelin Kovatchev, Matthew Lease. 1536-1551 [doi]
- JAMDEC: Unsupervised Authorship Obfuscation using Constrained Decoding over Small Language ModelsJillian Fisher, Ximing Lu, Jaehun Jung, Liwei Jiang, Zaïd Harchaoui, Yejin Choi 0001. 1552-1581 [doi]
- REST: Retrieval-Based Speculative DecodingZhenyu He 0012, Zexuan Zhong, Tianle Cai, Jason D. Lee, Di He 0001. 1582-1595 [doi]
- Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic RepresentationsSihao Chen, Hongming Zhang 0009, Tong Chen, Ben Zhou, Wenhao Yu 0011, Dian Yu 0001, Baolin Peng, Hongwei Wang 0010, Dan Roth, Dong Yu 0001. 1596-1609 [doi]
- MSciNLI: A Diverse Benchmark for Scientific Natural Language InferenceMobashir Sadat, Cornelia Caragea. 1610-1629 [doi]
- Causal Inference for Human-Language Model CollaborationBohan Zhang, Yixin Wang, Paramveer Dhillon. 1630-1647 [doi]
- SELF-GUARD: Empower the LLM to Safeguard ItselfZezhong Wang 0004, Fangkai Yang, Lu Wang 0008, Pu Zhao 0004, Hongru Wang 0003, Liang Chen 0001, Qingwei Lin, Kam-Fai Wong. 1648-1668 [doi]
- COSIGN: Contextual Facts Guided Generation for Knowledge Graph CompletionJinpeng Li, Hang Yu 0006, Xiangfeng Luo, Qian Liu. 1669-1682 [doi]
- Toward Informal Language Processing: Knowledge of Slang in Large Language ModelsZhewei Sun, Qian Hu, Rahul Gupta 0001, Richard S. Zemel, Yang Xu 0023. 1683-1701 [doi]
- Ghostbuster: Detecting Text Ghostwritten by Large Language ModelsVivek Verma, Eve Fleisig, Nicholas Tomlin, Dan Klein. 1702-1717 [doi]
- End-to-End Beam Retrieval for Multi-Hop Question AnsweringJiahao Zhang, Haiyang Zhang, Dongmei Zhang 0007, Yong Liu 0027, Shen Huang. 1718-1731 [doi]
- Leveraging Generative Large Language Models with Visual Instruction and Demonstration Retrieval for Multimodal Sarcasm DetectionBinghao Tang, Boda Lin, Haolong Yan, Si Li 0001. 1732-1742 [doi]
- Multi-Scale Prompt Memory-Augmented Model for Black-Box ScenariosXiaojun Kuang, C. L. Philip Chen, Shuzhen Li, Tong Zhang 0015. 1743-1757 [doi]
- Ungrammatical-syntax-based In-context Example Selection for Grammatical Error CorrectionChenming Tang, Fanyi Qu, Yunfang Wu. 1758-1770 [doi]
- BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual TransferAkari Asai, Sneha Kudugunta, Xinyan Yu 0001, Terra Blevins, Hila Gonen, Machel Reid, Yulia Tsvetkov, Sebastian Ruder, Hannaneh Hajishirzi. 1771-1800 [doi]
- TISE: A Tripartite In-context Selection Method for Event Argument ExtractionYanhe Fu, Yanan Cao, Qingyue Wang, Yi Liu 0067. 1801-1818 [doi]
- Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual TasksZhaofeng Wu, Linlu Qiu, Alexis Ross, Ekin Akyürek, Boyuan Chen 0003, Bailin Wang, Najoung Kim, Jacob Andreas, Yoon Kim. 1819-1862 [doi]
- TRUE-UIE: Two Universal Relations Unify Information Extraction TasksYucheng Wang, Bowen Yu 0002, Yilin Liu, Shudong Lu. 1863-1876 [doi]
- zrLLM: Zero-Shot Relational Learning on Temporal Knowledge Graphs with Large Language ModelsZifeng Ding, Heling Cai, Jingpei Wu, Yunpu Ma, Ruotong Liao, Bo Xiong, Volker Tresp. 1877-1895 [doi]
- Embodied Executable Policy Learning with Language-based Scene SummarizationJielin Qiu, Mengdi Xu, William Han, Seungwhan Moon, Ding Zhao. 1896-1913 [doi]
- Metacognitive Prompting Improves Understanding in Large Language ModelsYuqing Wang, Yun Zhao 0001. 1914-1926 [doi]
- MART: Improving LLM Safety with Multi-round Automatic Red-TeamingSuyu Ge, Chunting Zhou, Rui Hou, Madian Khabsa, Yi-Chia Wang, Qifan Wang, Jiawei Han 0001, Yuning Mao. 1927-1937 [doi]
- DialogCC: An Automated Pipeline for Creating High-Quality Multi-Modal Dialogue DatasetYoung-Jun Lee, ByungSoo Ko, Han-Gyu Kim, Jonghwan Hyeon, Ho-Jin Choi. 1938-1963 [doi]
- Routing to the Expert: Efficient Reward-guided Ensemble of Large Language ModelsKeming Lu, Hongyi Yuan, Runji Lin, Junyang Lin, Zheng Yuan 0002, Chang Zhou, Jingren Zhou. 1964-1974 [doi]
- Automatic Generation of Model and Data Cards: A Step Towards Responsible AIJiarui Liu 0004, Wenkai Li, Zhijing Jin, Mona T. Diab. 1975-1997 [doi]
- FUN with Fisher: Improving Generalization of Adapter-Based Cross-lingual Transfer with Scheduled UnfreezingChen Liu, Jonas Pfeiffer, Ivan Vulic, Iryna Gurevych. 1998-2015 [doi]
- Are Multilingual LLMs Culturally-Diverse Reasoners? An Investigation into Multicultural Proverbs and SayingsChen Liu, Fajri Koto, Timothy Baldwin, Iryna Gurevych. 2016-2039 [doi]
- The Colorful Future of LLMs: Evaluating and Improving LLMs as Emotional Supporters for Queer YouthShir Lissak, Nitay Calderon, Geva Shenkman, Yaakov Ophir, Eyal Fruchter, Anat Brunstein Klomek, Roi Reichart. 2040-2079 [doi]
- IPED: An Implicit Perspective for Relational Triple Extraction based on Diffusion ModelJianli Zhao, Changhao Xu, Bin Jiang. 2080-2092 [doi]
- QualEval: Qualitative Evaluation for Model ImprovementVishvak Murahari, Ameet Deshpande, Peter Clark, Tanmay Rajpurohit, Ashish Sabharwal, Karthik Narasimhan, Ashwin Kalyan. 2093-2111 [doi]
- Quantum-inspired Language Model with Lindblad Master Equation and Interference Measurement for Sentiment AnalysisKehuan Yan, Peichao Lai, Yilei Wang. 2112-2121 [doi]
- VisLingInstruct: Elevating Zero-Shot Learning in Multi-Modal Language Models with Autonomous Instruction OptimizationDongsheng Zhu, Daniel Tang, Weidong Han, Jinghui Lu, Yukun Zhao, Guoliang Xing, Junfeng Wang, Dawei Yin. 2122-2135 [doi]
- A Wolf in Sheep's Clothing: Generalized Nested Jailbreak Prompts can Fool Large Language Models EasilyPeng Ding, Jun Kuang, Dan Ma, Xuezhi Cao, Yunsen Xian, Jiajun Chen, Shujian Huang. 2136-2153 [doi]
- P³Sum: Preserving Author's Perspective in News Summarization with Diffusion Language ModelsYuhan Liu, Shangbin Feng, Xiaochuang Han, Vidhisha Balachandran, Chan Young Park, Sachin Kumar 0009, Yulia Tsvetkov. 2154-2173 [doi]
- Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math MistakesRose Wang, Qingyang Zhang, Carly Robinson, Susanna Loeb, Dorottya Demszky. 2174-2199 [doi]
- RST-LoRA: A Discourse-Aware Low-Rank Adaptation for Long Document Abstractive SummarizationDongqi Pu, Vera Demberg. 2200-2220 [doi]
- Strings from the Library of Babel: Random Sampling as a Strong Baseline for Prompt OptimisationYao Lu, Jiayi Wang, Raphael Tang, Sebastian Riedel 0001, Pontus Stenetorp. 2221-2231 [doi]
- ReTA: Recursively Thinking Ahead to Improve the Strategic Reasoning of Large Language ModelsJinhao Duan, Shiqi Wang 0002, James Diffenderfer, Lichao Sun 0001, Tianlong Chen, Bhavya Kailkhura, Kaidi Xu. 2232-2246 [doi]
- Fact Checking Beyond Training SetPayam Karisani, Heng Ji. 2247-2261 [doi]
- Program-Aided Reasoners (Better) Know What They KnowAnubha Kabra, Sanketh Rangreji, Yash Mathur, Aman Madaan, Emmy Liu, Graham Neubig. 2262-2278 [doi]
- The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human LabelsEve Fleisig, Su Lin Blodgett, Dan Klein, Zeerak Talat. 2279-2292 [doi]
- Principles from Clinical Research for NLP Model GeneralizationAparna Elangovan, Jiayuan He 0002, Yuan Li 0012, Karin Verspoor. 2293-2309 [doi]
- First Tragedy, then Parse: History Repeats Itself in the New Era of Large Language ModelsNaomi Saphra, Eve Fleisig, KyungHyun Cho, Adam Lopez. 2310-2326 [doi]
- Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language ModelsRaphael Tang, Xinyu Crystina Zhang, Xueguang Ma, Jimmy Lin, Ferhan Ture. 2327-2340 [doi]
- From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction TuningXuansheng Wu, Wenlin Yao, Jianshu Chen, Xiaoman Pan, Xiaoyang Wang, Ninghao Liu, Dong Yu 0001. 2341-2369 [doi]
- POLYIE: A Dataset of Information Extraction from Polymer Material Scientific LiteratureJerry Junyang Cheung, Yuchen Zhuang, Yinghao Li, Pranav Shetty, Wantian Zhao, Sanjeev Grampurohit, Rampi Ramprasad, Chao Zhang. 2370-2385 [doi]
- LLM-based Medical Assistant Personalization with Short- and Long-Term Memory CoordinationKai Zhang, Yangyang Kang, Fubang Zhao, Xiaozhong Liu. 2386-2398 [doi]
- SumTra: A Differentiable Pipeline for Few-Shot Cross-Lingual SummarizationJacob Parnell, Inigo Jauregi Unanue, Massimo Piccardi. 2399-2415 [doi]
- KTRL+F: Knowledge-Augmented In-Document SearchHanseok Oh, Haebin Shin, Miyoung Ko, Hyunji Lee, Minjoon Seo. 2416-2436 [doi]
- How Well Do Large Language Models Truly Ground?Hyunji Lee, Se June Joo, Chaeeun Kim, Joel Jang, Doyoung Kim, Kyoung-woon On, Minjoon Seo. 2437-2465 [doi]
- ALBA: Adaptive Language-Based Assessments for Mental HealthVasudha Varadarajan, Sverker Sikström, Oscar N. E. Kjell, H. Andrew Schwartz. 2466-2478 [doi]
- FREB-TQA: A Fine-Grained Robustness Evaluation Benchmark for Table Question AnsweringWei Zhou, Mohsen Mesgar, Heike Adel, Annemarie Friedrich. 2479-2497 [doi]
- MILL: Mutual Verification with Large Language Models for Zero-Shot Query ExpansionPengyue Jia, Yiding Liu, Xiangyu Zhao 0001, Xiaopeng Li, Changying Hao, Shuaiqiang Wang, Dawei Yin. 2498-2518 [doi]
- Efficient Benchmarking (of Language Models)Yotam Perlitz, Elron Bandel, Ariel Gera, Ofir Arviv, Liat Ein-Dor, Eyal Shnarch, Noam Slonim, Michal Shmueli-Scheuer, Leshem Choshen. 2519-2536 [doi]
- ReFACT: Updating Text-to-Image Models by Editing the Text EncoderDana Arad, Hadas Orgad, Yonatan Belinkov. 2537-2558 [doi]
- A Likelihood Ratio Test of Genetic Relationship among LanguagesV. S. D. S. Mahesh Akavarapu, Arnab Bhattacharya 0001. 2559-2570 [doi]
- PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuningXuekai Zhu, Biqing Qi, Kaiyan Zhang, Xinwei Long, Zhouhan Lin, Bowen Zhou. 2571-2597 [doi]
- MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and TasksSanchit Ahuja, Divyanshu Aggarwal, Varun Gumma, Ishaan Watts, Ashutosh Sathe, Millicent Ochieng, Rishav Hada, Prachi Jain, Mohamed Ahmed, Kalika Bali, Sunayana Sitaram. 2598-2637 [doi]
- Unlocking Emergent Modularity in Large Language ModelsZihan Qiu, Zeyu Huang, Jie Fu. 2638-2660 [doi]
- A School Student Essay Corpus for Analyzing Interactions of Argumentative Structure and QualityMaja Stahl, Nadine Michel, Sebastian Kilsbach, Julian Schmidtke, Sara Rezat, Henning Wachsmuth. 2661-2674 [doi]
- Adjusting Interpretable Dimensions in Embedding Space with Human JudgmentsKatrin Erk, Marianna Apidianaki. 2675-2686 [doi]
- PatentEval: Understanding Errors in Patent GenerationYou Zuo, Kim Gerdes, Éric de la Clergerie, Benoît Sagot. 2687-2710 [doi]
- Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-EditingSai Koneru, Miriam Exel, Matthias Huck, Jan Niehues. 2711-2725 [doi]
- Metaphor Detection with Context Enhancement and Curriculum LearningKaidi Jia, Rongsheng Li. 2726-2737 [doi]
- What Causes the Failure of Explicit to Implicit Discourse Relation Recognition?Wei Liu 0145, Stephen Wan 0001, Michael Strube 0001. 2738-2753 [doi]
- UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language InstructionsSiddhant Arora, Hayato Futami, Jee-weon Jung, Yifan Peng, Roshan S. Sharma, Yosuke Kashiwagi, Emiru Tsunoo, Karen Livescu, Shinji Watanabe 0001. 2754-2774 [doi]
- How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their VulnerabilitiesLingbo Mo, Boshi Wang, Muhao Chen, Huan Sun 0001. 2775-2792 [doi]
- Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language ModelsYue Zhou, Yada Zhu, Diego Antognini, Yoon Kim, Yang Zhang. 2793-2804 [doi]
- TriSum: Learning Summarization Ability from Large Language Models with Structured RationalePengcheng Jiang, Cao Xiao, Zifeng Wang 0010, Parminder Bhatia, Jimeng Sun 0001, Jiawei Han 0001. 2805-2819 [doi]
- GenRES: Rethinking Evaluation for Generative Relation Extraction in the Era of Large Language ModelsPengcheng Jiang, Jiacheng Lin, Zifeng Wang 0010, Jimeng Sun 0001, Jiawei Han 0001. 2820-2837 [doi]
- Curated Datasets and Neural Models for Machine Translation of Informal Registers between Mayan and Spanish VernacularsAndrés Lou, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez, Víctor M. Sánchez-Cartagena. 2838-2850 [doi]
- The Effect of Data Partitioning Strategy on Model Generalizability: A Case Study of Morphological SegmentationZoey Liu, Bonnie J. Dorr. 2851-2864 [doi]
- Measuring Entrainment in Spontaneous Code-switched SpeechDebasmita Bhattacharya, Siying Ding, Alayna Nguyen, Julia Hirschberg. 2865-2876 [doi]
- A Survey of Meaning Representations - From Theory to Practical UtilityZacchary Sadeddine, Juri Opitz, Fabian M. Suchanek. 2877-2892 [doi]
- Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-DistillationHaozhe Zhao, Zefan Cai, Shuzheng Si, Liang Chen 0024, Yufeng He, Kaikai An, Baobao Chang. 2893-2907 [doi]
- Evaluating In-Context Learning of Libraries for Code GenerationArkil Patel, Siva Reddy, Dzmitry Bahdanau, Pradeep Dasigi. 2908-2926 [doi]
- Visually-Aware Context Modeling for News Image CaptioningTingyu Qu, Tinne Tuytelaars, Marie-Francine Moens. 2927-2943 [doi]
- Regularized Conventions: Equilibrium Computation as a Model of Pragmatic ReasoningAthul Paul Jacob, Gabriele Farina, Jacob Andreas. 2944-2955 [doi]
- TopicGPT: A Prompt-based Topic Modeling FrameworkChau Pham, Alexander Miserlis Hoyle, Simeng Sun, Philip Resnik, Mohit Iyyer. 2956-2984 [doi]
- ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model TriggerJiazhao Li, Yijin Yang, Zhuofeng Wu 0001, V. G. Vinod Vydiswaran, Chaowei Xiao. 2985-3004 [doi]
- Social Meme-ing: Measuring Linguistic Variation in MemesNaitian Zhou, David Jurgens, David Bamman. 3005-3024 [doi]
- ExpertQA: Expert-Curated Questions and Attributed AnswersChaitanya Malaviya, Subin Lee, Sihao Chen, Elizabeth Sieber, Mark Yatskar, Dan Roth. 3025-3045 [doi]
- What if you said that differently?: How Explanation Formats Affect Human Feedback Efficacy and User PerceptionChaitanya Malaviya, Subin Lee, Dan Roth, Mark Yatskar. 3046-3065 [doi]
- When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses into Good LabelsWeiyan Shi, Emily Dinan, Kurt Shuster 0001, Jason Weston, Jing Xu. 3066-3082 [doi]
- Kreyòl-MT: Building MT for Latin American, Caribbean and Colonial African Creole LanguagesNathaniel R. Robinson, Raj Dabre, Ammon Shurtz, Rasul Dent, Onenamiyi Onesi, Claire Bizon Monroc, Loïc Grobol, Hasan Muhammad, Ashi Garg, Naome A. Etori, Vijay Murari Tiyyala, Olanrewaju Samuel, Matthew Dean Stutzman, Bismarck Bamfo Odoom, Sanjeev Khudanpur, Stephen D. Richardson, Kenton Murray. 3083-3110 [doi]
- Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language ModelsJiashu Xu, Mingyu Derek Ma, Fei Wang 0060, Chaowei Xiao, Muhao Chen. 3111-3126 [doi]
- Modeling Empathetic Alignment in ConversationJiamin Yang, David Jurgens. 3127-3148 [doi]
- Native Language Identification in Texts: A SurveyDhiman Goswami, Sharanya Thilagan, Kai North, Shervin Malmasi, Marcos Zampieri. 3149-3160 [doi]
- LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language ModelsYifan Yang, Jiajun Zhou, Ngai Wong, Zheng Zhang. 3161-3176 [doi]
- Which One? Leveraging Context Between Objects and Multiple Views for Language GroundingChancharik Mitra, Abrar Anwar, Rodolfo Corona, Dan Klein, Trevor Darrell, Jesse Thomason. 3177-3189 [doi]
- Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two BenchmarksTing-Yun Chang, Jesse Thomason, Robin Jia. 3190-3211 [doi]
- PromptFix: Few-shot Backdoor Removal via Adversarial Prompt TuningTianrong Zhang, Zhaohan Xi, Ting Wang, Prasenjit Mitra, Jinghui Chen. 3212-3225 [doi]
- Comparing Explanation Faithfulness between Multilingual and Monolingual Fine-tuned Language ModelsZhixue Zhao, Nikolaos Aletras. 3226-3244 [doi]
- A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & ToxicityShayne Longpre, Gregory Yauney, Emily Reif, Katherine Lee, Adam Roberts, Barret Zoph, Denny Zhou, Jason Wei, Kevin Robinson, David Mimno, Daphne Ippolito. 3245-3276 [doi]
- Instructional Fingerprinting of Large Language ModelsJiashu Xu, Fei Wang 0060, Mingyu Derek Ma, Pang Wei Koh, Chaowei Xiao, Muhao Chen. 3277-3306 [doi]
- Reinforced Multiple Instance Selection for Speaker Attribute PredictionAlireza Salkhordeh Ziabari, Ali Omrani, Parsa Hejabi, Preni Golazizian, Brendan Kennedy 0001, Payam Piray, Morteza Dehghani. 3307-3321 [doi]
- DynaMo: Accelerating Language Model Inference with Dynamic Multi-Token SamplingShikhar Tuli, Chi-Heng Lin, Yen-Chang Hsu, Niraj K. Jha, Yilin Shen, Hongxia Jin. 3322-3345 [doi]
- Few-shot Knowledge Graph Relational Reasoning via Subgraph AdaptationHaochen Liu, Song Wang, Chen Chen 0022, Jundong Li. 3346-3356 [doi]
- Uncertainty Quantification for In-Context Learning of Large Language ModelsChen Ling 0003, Xujiang Zhao, Xuchao Zhang, Wei Cheng 0002, Yanchi Liu, Yiyou Sun, Mika Oishi, Takao Osaki, Katsushi Matsuda, Jie Ji, Guangji Bai, Liang Zhao 0002, Haifeng Chen. 3357-3370 [doi]
- HelpSteer: Multi-attribute Helpfulness Dataset for SteerLMZhilin Wang, Yi Dong, Jiaqi Zeng, Virginia Adams, Makesh Narsimhan Sreedhar, Daniel Egert, Olivier Delalleau, Jane Polak Scowcroft, Neel Kant, Aidan Swope, Oleksii Kuchaiev. 3371-3384 [doi]
- A Preference-driven Paradigm for Enhanced Translation with Large Language ModelsDawei Zhu, Sony Trenous, Xiaoyu Shen 0001, Dietrich Klakow, Bill Byrne, Eva Hasler. 3385-3403 [doi]
- Fair Abstractive Summarization of Diverse PerspectivesYusen Zhang 0001, Nan Zhang, Yixin Liu 0003, Alexander R. Fabbri, Junru Liu, Ryo Kamoi, Xiaoxin Lu, Caiming Xiong, Jieyu Zhao, Dragomir Radev, Kathleen R. McKeown, Rui Zhang 0037. 3404-3426 [doi]
- What Are We Measuring When We Evaluate Large Vision-Language Models? An Analysis of Latent Factors and BiasesAnthony Meng Huat Tiong, Junqi Zhao, Boyang Li 0001, Junnan Li 0001, Steven C. H. Hoi, Caiming Xiong. 3427-3454 [doi]
- Show Your Work with Confidence: Confidence Bands for Tuning CurvesNicholas Lourie, KyungHyun Cho, He He 0001. 3455-3472 [doi]
- GRASP: A Disagreement Analysis Framework to Assess Group Associations in PerspectivesVinodkumar Prabhakaran, Christopher Homan, Lora Aroyo, Aida Mostafazadeh Davani, Alicia Parrish, Alex S. Taylor, Mark Diaz, Ding Wang, Gregory Serapio-García. 3473-3492 [doi]
- Event Causality Is Key to Computational Story UnderstandingYidan Sun, Qin Chao, Boyang Li. 3493-3511 [doi]
- Subspace Representations for Soft Set Operations and Sentence SimilaritiesYoichi Ishibashi, Sho Yokoi, Katsuhito Sudoh, Satoshi Nakamura 0001. 3512-3524 [doi]
- My Heart Skipped a Beat! Recognizing Expressions of Embodied Emotion in Natural LanguageYuan Zhuang, Tianyu Jiang, Ellen Riloff. 3525-3537 [doi]
- Low-Cost Generation and Evaluation of Dictionary Example SentencesBill Cai, Clarence Boon Liang Ng, Daniel Liang, Shelvia Hotama. 3538-3549 [doi]
- Making Language Models Better Tool Learners with Execution FeedbackShuofei Qiao, Honghao Gui, Chengfei Lv, Qianghuai Jia, Huajun Chen, Ningyu Zhang 0001. 3550-3568 [doi]
- Complex Claim Verification with Evidence Retrieved in the WildJifan Chen, Grace Kim, Aniruddh Sriram, Greg Durrett, Eunsol Choi. 3569-3587 [doi]
- Multimodal Multi-loss Fusion Network for Sentiment AnalysisZehui Wu, Ziwei Gong, Jaywon Koo, Julia Hirschberg. 3588-3602 [doi]
- Confronting LLMs with Traditional ML: Rethinking the Fairness of Large Language Models in Tabular ClassificationsYanchen Liu, Srishti Gautam, Jiaqi Ma, Himabindu Lakkaraju. 3603-3620 [doi]
- Analyzing the Use of Metaphors in News Editorials for Political FramingMeghdut Sengupta, Roxanne El Baff, Milad Alshomary, Henning Wachsmuth. 3621-3631 [doi]
- SharpSeq: Empowering Continual Event Detection through Sharpness-Aware Sequential-task LearningThanh-Thien Le, Viet Dao, Linh Nguyen, Thi-Nhung Nguyen, Linh Ngo, Thien Nguyen. 3632-3644 [doi]
- Dissecting Paraphrases: The Impact of Prompt Syntax and Supplementary Information on Knowledge Retrieval from Pretrained Language ModelsStephan Linzbach, Dimitar Dimitrov 0002, Laura Kallmeyer, Kilian Evang, Hajira Jabeen, Stefan Dietze. 3645-3655 [doi]
- Know When To Stop: A Study of Semantic Drift in Text GenerationAva Spataru, Eric Hambro, Elena Voita, Nicola Cancedda. 3656-3671 [doi]
- Curriculum Masking in Vision-Language Pretraining to Maximize Cross Modal InteractionKraig Tou, Zijun Sun. 3672-3688 [doi]
- Elote, Choclo and Mazorca: on the Varieties of SpanishCristina España-Bonet, Alberto Barrón-Cedeño. 3689-3711 [doi]
- Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarksChonghua Wang, Haodong Duan, Songyang Zhang, Dahua Lin, Kai Chen 0026. 3712-3724 [doi]
- A Zero-Shot Monolingual Dual Stage Information Retrieval System for Spanish Biomedical Systematic Literature ReviewsRegina Ofori-Boateng, Magaly Aceves-Martins, Nirmalie Wiratunga, Carlos Francisco Moreno-García. 3725-3736 [doi]
- LayoutPointer: A Spatial-Context Adaptive Pointer Network for Visual Information ExtractionSiyuan Huang, Yongping Xiong, Guibin Wu. 3737-3748 [doi]
- Long-form evaluation of model editingDomenic Rosati, Robie Gonzales, Jinkun Chen, Xuemin Yu, Yahya Kayani, Frank Rudzicz, Hassan Sajjad 0001. 3749-3780 [doi]
- Analyzing the Role of Semantic Representations in the Era of Large Language ModelsZhijing Jin, Yuen Chen, Fernando Gonzalez Adauto, Jiarui Liu 0004, Jiayi Zhang, Julian Michael, Bernhard Schölkopf, Mona T. Diab. 3781-3798 [doi]
- TRAQ: Trustworthy Retrieval Augmented Question Answering via Conformal PredictionShuo Li, Sangdon Park 0001, Insup Lee 0001, Osbert Bastani. 3799-3821 [doi]
- MapGuide: A Simple yet Effective Method to Reconstruct Continuous Language from Brain ActivitiesXinpei Zhao, Jingyuan Sun, Shaonan Wang, Jing Ye, Xiaohan Zhang, Chengqing Zong. 3822-3832 [doi]
- On-the-fly Definition Augmentation of LLMs for Biomedical NERMonica Munnangi, Sergey Feldman, Byron C. Wallace, Silvio Amir, Tom Hope, Aakanksha Naik. 3833-3854 [doi]
- This Land is {Your, My} Land: Evaluating Geopolitical Bias in Language Models through Territorial DisputesBryan Li, Samar Haider, Chris Callison-Burch. 3855-3871 [doi]
- Set-Aligning Framework for Auto-Regressive Event Temporal Graph GenerationXingwei Tan, Yuxiang Zhou, Gabriele Pergola, Yulan He 0001. 3872-3892 [doi]
- LanguageFlow: Advancing Diffusion Language Generation with Probabilistic FlowsShujian Zhang, Lemeng Wu, ChengYue Gong, Xingchao Liu. 3893-3905 [doi]
- Towards Improved Multi-Source Attribution for Long-Form Answer GenerationNilay Patel, Shivashankar Subramanian, Siddhant Garg, Pratyay Banerjee, Amita Misra. 3906-3919 [doi]
- Synthetic Query Generation for Privacy-Preserving Deep Retrieval Systems using Differentially Private Language ModelsAldo G. Carranza, Rezsa Farahani, Natalia Ponomareva 0001, Alexey Kurakin, Matthew Jagielski, Milad Nasr. 3920-3930 [doi]
- Okay, Let's Do This! Modeling Event Coreference with Generated Rationales and Knowledge DistillationAbhijnan Nath, Shadi Manafi Avari, Avyakta Chelle, Nikhil Krishnaswamy. 3931-3946 [doi]
- Can Knowledge Graphs Reduce Hallucinations in LLMs? : A SurveyGarima Agrawal, Tharindu Kumarage, Zeyad Alghamdi, Huan Liu 0001. 3947-3960 [doi]
- Pedagogically Aligned Objectives Create Reliable Automatic Cloze TestsBrian D. Ondov, Kush Attal, Dina Demner-Fushman. 3961-3972 [doi]
- Take One Step at a Time to Know Incremental Utility of Demonstration: An Analysis on Reranking for Few-Shot In-Context LearningKazuma Hashimoto, Karthik Raman 0001, Michael Bendersky. 3973-3990 [doi]
- LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language ModelsChi Han, Qifan Wang, Hao Peng 0009, Wenhan Xiong, Yu Chen 0022, Heng Ji, Sinong Wang. 3991-4008 [doi]
- CONSCENDI: A Contrastive and Scenario-Guided Distillation Approach to Guardrail Models for Virtual AssistantsAlbert Yu Sun, Varun Nair, Elliot Schumacher, Anitha Kannan. 4009-4030 [doi]
- Advancing Beyond Identification: Multi-bit Watermark for Large Language ModelsKiYoon Yoo, Wonhyuk Ahn, Nojun Kwak. 4031-4055 [doi]
- HTCCN: Temporal Causal Convolutional Networks with Hawkes Process for Extrapolation Reasoning in Temporal Knowledge GraphsTingxuan Chen, Jun Long, Liu Yang, Zidong Wang, Yongheng Wang, Xiongnan Jin. 4056-4066 [doi]
- SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text GenerationAbe Bohan Hou, Jingyu Zhang, Tianxing He, Yichen Wang, Yung-Sung Chuang, Hongwei Wang, Lingfeng Shen, Benjamin Van Durme, Daniel Khashabi, Yulia Tsvetkov. 4067-4082 [doi]
- Media Bias Detection Across Families of Language ModelsIffat Maab, Edison Marrese-Taylor, Sebastian Padó, Yutaka Matsuo. 4083-4098 [doi]
- Better Zero-Shot Reasoning with Role-Play PromptingAobo Kong, Shiwan Zhao, Hao Chen, Qicheng Li, Yong Qin, Ruiqi Sun, Xin Zhou, Enzhi Wang, Xiaohang Dong. 4099-4113 [doi]
- Event-Content-Oriented Dialogue Generation in Short VideoFenghua Cheng, Xue Li 0001, Zi Huang, Jinxiang Wang, Sen Wang 0001. 4114-4124 [doi]
- DoG-Instruct: Towards Premium Instruction-Tuning Data via Text-Grounded Instruction WrappingYongrui Chen 0002, Haiyun Jiang, Xinting Huang, Shuming Shi 0001, Guilin Qi. 4125-4135 [doi]
- Beyond Borders: Investigating Cross-Jurisdiction Transfer in Legal Case SummarizationT. Y. S. S. Santosh, Vatsal Venkatkrishna, Saptarshi Ghosh, Matthias Grabmair. 4136-4150 [doi]
- EDC: Effective and Efficient Dialog Comprehension For Dialog State TrackingQifan Lu, Bhaskar Ramasubramanian, Radha Poovendran. 4151-4165 [doi]
- Automatic Restoration of Diacritics for Speech Data SetsSara Abedalmonem Mohammad Shatnawi, Sawsan Alqahtani, Hanan Aldarmaki. 4166-4176 [doi]
- XNLIeu: a dataset for cross-lingual NLI in BasqueMaite Heredia, Julen Etxaniz, Muitze Zulaika, Xabier Saralegi, Jeremy Barnes, Aitor Soroa. 4177-4188 [doi]
- MDR: Model-Specific Demonstration Retrieval at Inference Time for In-Context LearningHuazheng Wang, Jinming Wu, Haifeng Sun 0001, Zixuan Xia, Daixuan Cheng, Jingyu Wang 0001, Qi Qi 0001, Jianxin Liao. 4189-4204 [doi]
- Exploring Cross-Cultural Differences in English Hate Speech Annotations: From Dataset Construction to AnalysisNayeon Lee, Chani Jung, Junho Myung, Jiho Jin, José Camacho-Collados, Juho Kim, Alice Oh. 4205-4224 [doi]
- Enhancing Contextual Understanding in Large Language Models through Contrastive DecodingZheng Zhao, Emilio Monti, Jens Lehmann 0001, Haytham Assem. 4225-4237 [doi]
- Generalizable Sarcasm Detection is Just Around the Corner, of Course!Hyewon Jang, Diego Frassinelli. 4238-4249 [doi]
- Encoding of lexical tone in self-supervised models of spoken languageGaofei Shen, Michaela Watkins, Afra Alishahi, Arianna Bisazza, Grzegorz Chrupala. 4250-4261 [doi]
- A Systematic Comparison of Contextualized Word Embeddings for Lexical Semantic ChangeFrancesco Periti, Nina Tahmasebi. 4262-4282 [doi]
- iACOS: Advancing Implicit Sentiment Extraction with Informative and Adaptive Negative ExamplesXiancai Xu, Jia-Dong Zhang, Lei Xiong, Zhishang Liu. 4283-4293 [doi]
- Rectifying Demonstration Shortcut in In-Context LearningJoonwon Jang, Sanghwan Jang, Wonbin Kweon, Minjin Jeon, Hwanjo Yu. 4294-4321 [doi]
- Universal NER: A Gold-Standard Multilingual Named Entity Recognition BenchmarkStephen Mayhew 0002, Terra Blevins, Shuheng Liu, Marek Suppa, Hila Gonen, Joseph Marvin Imperial, Börje Karlsson 0001, Peiqin Lin, Nikola Ljubesic, Lester James V. Miranda, Barbara Plank, Arij Riabi, Yuval Pinter. 4322-4337 [doi]
- ODD: A Benchmark Dataset for the Natural Language Processing Based Opioid Related Aberrant Behavior DetectionSunjae Kwon, Xun Wang, Weisong Liu, Emily Druhl, Minhee L. Sung, Joel I. Reisman, Wenjun Li, Robert D. Kerns, William Becker, Hong Yu 0001. 4338-4359 [doi]
- A Comprehensive Study of Gender Bias in Chemical Named Entity Recognition ModelsXingmeng Zhao, Ali Niazi, Anthony Rios. 4360-4374 [doi]
- The Promises and Pitfalls of Using Language Models to Measure Instruction Quality in EducationPaiheng Xu, Jing Liu, Nathan Jones, Julie Cohen, Wei Ai 0002. 4375-4389 [doi]
- Differentially Private Next-Token Prediction of Large Language ModelsJames Flemings, Meisam Razaviyayn, Murali Annavaram. 4390-4404 [doi]
- Improving Adversarial Data Collection by Supporting Annotators: Lessons from GAHD, a German Hate Speech DatasetJanis Goldzycher, Paul Röttger, Gerold Schneider. 4405-4424 [doi]
- Memory Augmented Language Models through Mixture of Word ExpertsCícero Nogueira dos Santos, James Lee-Thorp, Isaac Noble, Chung-Ching Chang, David C. Uthus. 4425-4438 [doi]
- Impossible Distillation for Paraphrasing and Summarization: How to Make High-quality Lemonade out of Small, Low-quality ModelJaehun Jung, Peter West, Liwei Jiang, Faeze Brahman, Ximing Lu, Jillian Fisher, Taylor Sorensen, Yejin Choi 0001. 4439-4454 [doi]
- TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue SummarizationLiyan Tang, Igor Shalyminov, Amy Wing-mei Wong, Jon Burnsky, Jake W. Vincent, Yuan Yang, Siffi Singh, Song Feng 0001, Hwanjun Song, Hang Su, Lijia Sun, Yi Zhang 0053, Saab Mansour, Kathleen McKeown. 4455-4480 [doi]
- MOKA: Moral Knowledge Augmentation for Moral Event ExtractionXinliang Frederick Zhang, Winston Wu, Nicholas Beauchamp, Lu Wang 0008. 4481-4502 [doi]
- Fixing Rogue Memorization in Many-to-One Multilingual Translators of Extremely-Low-Resource Languages by Rephrasing Training SamplesPaulo Cavalin, Pedro Henrique Domingues, Claudio S. Pinhanez, Julio Nogima. 4503-4514 [doi]
- Backdoor Attacks on Multilingual Machine TranslationJun Wang 0126, Qiongkai Xu, Xuanli He, Benjamin I. P. Rubinstein, Trevor Cohn. 4515-4534 [doi]
- Personalized Jargon Identification for Enhanced Interdisciplinary CommunicationYue Guo, Joseph Chee Chang, Maria Antoniak, Erin Bransom, Trevor Cohen, Lucy Lu Wang, Tal August. 4535-4550 [doi]
- Flames: Benchmarking Value Alignment of LLMs in ChineseKexin Huang, Xiangyang Liu, Qianyu Guo, Tianxiang Sun, Jiawei Sun, Yaru Wang, Zeyang Zhou, Yixu Wang, Yan Teng, Xipeng Qiu, Yingchun Wang, Dahua Lin. 4551-4591 [doi]
- Mitigating Bias for Question Answering Models by Tracking Bias InfluenceMingyu Derek Ma, Jiun-Yu Kao, Arpit Gupta, Yu-Hsiang Lin, Wenbo Zhao 0006, Tagyoung Chung, Wei Wang 0010, Kai-Wei Chang, Nanyun Peng. 4592-4610 [doi]
- Extending CLIP's Image-Text Alignment to Referring Image SegmentationSeoyeon Kim, Minguk Kang, Dongwon Kim, Jaesik Park, Suha Kwak. 4611-4628 [doi]
- Generating Attractive and Authentic Copywriting from Customer ReviewsYu-Xiang Lin, Wei-Yun Ma. 4629-4642 [doi]
- Effective Long-Context Scaling of Foundation ModelsWenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma 0001. 4643-4663 [doi]
- Empowering Diffusion Models on the Embedding Space for Text GenerationZhujin Gao, Junliang Guo, Xu Tan 0003, Yongxin Zhu, Fang Zhang, Jiang Bian 0002, Linli Xu. 4664-4683 [doi]
- Aligning as Debiasing: Causality-Aware Alignment via Reinforcement Learning with Interventional FeedbackYu Xia, Tong Yu 0001, Zhankui He, Handong Zhao, Julian J. McAuley, Shuai Li 0010. 4684-4695 [doi]
- Fake Alignment: Are LLMs Really Aligned Well?Yixu Wang, Yan Teng, Kexin Huang, Chengqi Lyu, Songyang Zhang, Wenwei Zhang, Xingjun Ma, Yu-Gang Jiang, Yu Qiao, Yingchun Wang. 4696-4712 [doi]
- Visually Guided Generative Text-Layout Pre-training for Document IntelligenceZhiming Mao, Haoli Bai, Lu Hou, Lifeng Shang, Xin Jiang 0002, Qun Liu 0001, Kam-Fai Wong. 4713-4730 [doi]
- HILL: Hierarchy-aware Information Lossless Contrastive Learning for Hierarchical Text ClassificationHe Zhu, Junran Wu, Ruomei Liu, Yue Hou, Ze Yuan, Shangzhe Li, Yicheng Pan 0001, Ke Xu 0001. 4731-4745 [doi]
- Investigating the Emergent Audio Classification Ability of ASR Foundation ModelsRao Ma, Adian Liusie, Mark J. F. Gales, Kate M. Knill. 4746-4760 [doi]
- In-context Learning Generalizes, But Not Always Robustly: The Case of SyntaxAaron Mueller, Albert Webson, Jackson Petty, Tal Linzen. 4761-4779 [doi]
- Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language PromptYongqi Wang, Ruofan Hu, Rongjie Huang, Zhiqing Hong, Ruiqi Li, Wenrui Liu, Fuming You, Tao Jin, Zhou Zhao. 4780-4794 [doi]
- Lost in Transcription: Identifying and Quantifying the Accuracy Biases of Automatic Speech Recognition Systems Against Disfluent SpeechDena F. Mujtaba, Nihar R. Mahapatra, Megan Arney, J. Scott Yaruss, Hope Gerlach-Houck, Caryn Herring, Jia-bin. 4795-4809 [doi]
- MAFALDA: A Benchmark and Comprehensive Study of Fallacy Detection and ClassificationChadi Helwe, Tom Calamai, Pierre-Henri Paris, Chloé Clavel, Fabian M. Suchanek. 4810-4845 [doi]
- Diffusion Glancing Transformer for Parallel Sequence-to-Sequence LearningLihua Qian, Mingxuan Wang, Yang Liu, Hao Zhou. 4846-4862 [doi]
- No Context Needed: Contextual Quandary In Idiomatic Reasoning With Pre-Trained Language ModelsKellen Cheng, Suma Bhat. 4863-4880 [doi]
- Multi-stage Retrieve and Re-rank Model for Automatic Medical Coding RecommendationXindi Wang, Robert E. Mercer, Frank Rudzicz. 4881-4891 [doi]
- Anisotropy is Not Inherent to TransformersAnemily Machina, Robert E. Mercer. 4892-4907 [doi]
- Finding Replicable Human Evaluations via Stable Ranking ProbabilityParker Riley, Daniel Deutsch, George F. Foster, Viresh Ratnakar, Ali Dabirmoghaddam, Markus Freitag. 4908-4919 [doi]
- Stealthy and Persistent Unalignment on Large Language Models via Backdoor InjectionsYuanpu Cao, Bochuan Cao, Jinghui Chen. 4920-4935 [doi]
- Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource TextsSai Ashish Somayajula, Youwei Liang, Li Zhang, Abhishek Singh, Pengtao Xie. 4936-4953 [doi]
- Detecting Bipolar Disorder from Misdiagnosed Major Depressive Disorder with Mood-Aware Multi-Task LearningDaeun Lee, Hyolim Jeon, Sejung Son, Chaewon Park, Ji Hyun An, Seungbae Kim, Jinyoung Han. 4954-4970 [doi]
- Leveraging Code to Improve In-Context Learning for Semantic ParsingBen Bogin, Shivanshu Gupta, Peter Clark, Ashish Sabharwal. 4971-5012 [doi]
- Improving Pre-trained Language Model Sensitivity via Mask Specific losses: A case study on Biomedical NERMicheal Abaho, Danushka Bollegala, Gary Leeming, Dan Joyce, Iain E. Buchan. 5013-5029 [doi]
- Language Models Implement Simple Word2Vec-style Vector ArithmeticJack Merullo, Carsten Eickhoff, Ellie Pavlick. 5030-5047 [doi]
- AutoLoRA: Automatically Tuning Matrix Ranks in Low-Rank Adaptation Based on Meta LearningRuiyi Zhang, Rushi Qiang, Sai Ashish Somayajula, Pengtao Xie. 5048-5060 [doi]
- SportQA: A Benchmark for Sports Understanding in Large Language ModelsHaotian Xia, Zhengbang Yang, Yuqing Wang, Rhys Tracy, Yun Zhao 0001, Dongdong Huang, Zezhi Chen, Yan Zhu, Yuan-Fang Wang, Weining Shen. 5061-5081 [doi]
- Revisiting subword tokenization: A case study on affixal negation in large language modelsThinh Truong, Yulia Otmakhova 0001, Karin Verspoor, Trevor Cohn, Timothy Baldwin. 5082-5095 [doi]
- Generating Mental Health Transcripts with SAPE (Spanish Adaptive Prompt Engineering)Daniel Cabrera Lozoya, Alejandro Berazaluce, Juan Perches, Eloy Lúa, Mike Conway, Simon D'Alfonso. 5096-5113 [doi]
- Where are you from? Geolocating Speech and Applications to Language IdentificationPatrick Foley, Matthew Wiesner, Bismarck Odoom, Leibny Paola García-Perera, Kenton Murray, Philipp Koehn. 5114-5126 [doi]
- Teaching Language Models to Self-Improve through Interactive DemonstrationsXiao Yu, Baolin Peng, Michel Galley, Jianfeng Gao 0001, Zhou Yu 0005. 5127-5149 [doi]
- MAGID: An Automated Pipeline for Generating Synthetic Multi-modal DatasetsHossein Aboutalebi, Hwanjun Song, Yusheng Xie, Arshit Gupta, Lijia Sun, Hang Su, Igor Shalyminov, Nikolaos Pappas 0004, Siffi Singh, Saab Mansour. 5150-5167 [doi]
- Zero-shot Generative Linguistic SteganographyKe Lin, Yiyang Luo, Zijian Zhang, Ping Luo. 5168-5182 [doi]
- Does GPT-4 pass the Turing test?Cameron R. Jones, Ben Bergen 0001. 5183-5210 [doi]
- Polarity Calibration for Opinion SummarizationYuanyuan Lei 0001, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Ruihong Huang, Dong Yu. 5211-5224 [doi]
- Sentence-level Media Bias Analysis with Event Relation GraphYuanyuan Lei 0001, Ruihong Huang. 5225-5238 [doi]
- EMONA: Event-level Moral Opinions in News ArticlesYuanyuan Lei 0001, Md Messal Monem Miah, Ayesha Qamar, Sai Ramana Reddy, Jonathan Tong, Haotian Xu, Ruihong Huang. 5239-5251 [doi]
- DLM: A Decoupled Learning Model for Long-tailed Polyphone Disambiguation in MandarinBeibei Gao, Yangsen Zhang, Ga Xiang, Yushan Jiang. 5252-5262 [doi]
- You don't need a personality test to know these models are unreliable: Assessing the Reliability of Large Language Models on Psychometric InstrumentsBangzhao Shu, LeChen Zhang, Minje Choi, Lavinia Dunagan, Lajanugen Logeswaran, Moontae Lee, Dallas Card, David Jurgens. 5263-5281 [doi]
- CASA: Causality-driven Argument Sufficiency AssessmentXiao Liu, Yansong Feng, Kai-Wei Chang. 5282-5302 [doi]
- MacGyver: Are Large Language Models Creative Problem Solvers?Yufei Tian, Abhilasha Ravichander, Lianhui Qin, Ronan Le Bras 0001, Raja Marjieh, Nanyun Peng, Yejin Choi 0001, Thomas L. Griffiths 0001, Faeze Brahman. 5303-5324 [doi]
- To Translate or Not to Translate: A Systematic Investigation of Translation-Based Cross-Lingual Transfer to Low-Resource LanguagesBenedikt Ebing, Goran Glavas. 5325-5344 [doi]
- Enhancing Large Language Models Against Inductive Instructions with Dual-critique PromptingRui Wang 0092, Hongru Wang 0003, Fei Mi, Boyang Xue, Yi Chen 0007, Kam-Fai Wong, Ruifeng Xu. 5345-5363 [doi]
- GLiNER: Generalist Model for Named Entity Recognition using Bidirectional TransformerUrchade Zaratiana, Nadi Tomeh, Pierre Holat, Thierry Charnois. 5364-5376 [doi]
- XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language ModelsPaul Röttger, Hannah Kirk, Bertie Vidgen, Giuseppe Attanasio, Federico Bianchi 0001, Dirk Hovy. 5377-5400 [doi]
- Carpe diem: On the Evaluation of World Knowledge in Lifelong Language ModelsYujin Kim, Jaehong Yoon, Seonghyeon Ye, Sangmin Bae, Namgyu Ho, Sung Ju Hwang, Se-Young Yun. 5401-5415 [doi]
- Fine-grained Gender Control in Machine Translation with Large Language ModelsMinwoo Lee, Hyukhun Koh, MinSung Kim, Kyomin Jung. 5416-5430 [doi]
- DialogVCS: Robust Natural Language Understanding in Dialogue System UpgradeZefan Cai, Xin Zheng, Tianyu Liu 0001, Haoran Meng, Jiaqi Han, Gang Yuan, Binghuai Lin, Baobao Chang, Yunbo Cao. 5431-5452 [doi]
- LLatrieval: LLM-Verified Retrieval for Verifiable GenerationXiaonan Li, Changtai Zhu, Linyang Li, Zhangyue Yin, Tianxiang Sun, Xipeng Qiu. 5453-5471 [doi]
- Mapping Long-term Causalities in Psychiatric Symptomatology and Life Events from Social MediaSiyuan Chen, Meilin Wang, Minghao Lv, Zhiling Zhang, Juqianqian Juqianqian, Dejiyangla Dejiyangla, Yujia Peng, Kenny Q. Zhu, Mengyue Wu. 5472-5487 [doi]
- Multimodal Chart Retrieval: A Comparison of Text, Table and Image Based ApproachesAveri Nowak, Francesco Piccinno, Yasemin Altun. 5488-5505 [doi]
- Retrieval Helps or Hurts? A Deeper Dive into the Efficacy of Retrieval Augmentation to Language ModelsSeiji Maekawa, Hayate Iso, Sairam Gurajada, Nikita Bhutani. 5506-5521 [doi]
- AudioChatLlama: Towards General-Purpose Speech Abilities for LLMsYassir Fathullah, Chunyang Wu, Egor Lakomkin, Ke Li, Junteng Jia, Yuan Shangguan, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer. 5522-5532 [doi]
- Whispers of Doubt Amidst Echoes of Triumph in NLP RobustnessAshim Gupta, Rishanth Rajendhran, Nathan Stringham, Vivek Srikumar, Ana Marasovic. 5533-5590 [doi]
- Sequential Compositional Generalization in Multimodal ModelsSemih Yagcioglu, Osman Batur Ince, Aykut Erdem, Erkut Erdem, Desmond Elliott, Deniz Yuret. 5591-5611 [doi]
- Generating Uncontextualized and Contextualized Questions for Document-Level Event Argument ExtractionMd Nayem Uddin, Enfa Rose George, Eduardo Blanco 0002, Steven R. Corman. 5612-5627 [doi]
- Evidence-Driven Retrieval Augmented Response Generation for Online MisinformationZhenrui Yue, Huimin Zeng, Yimeng Lu, Lanyu Shang, Yang Zhang 0031, Dong Wang 0002. 5628-5643 [doi]
- Open-Vocabulary Federated Learning with Multimodal PrototypingHuimin Zeng, Zhenrui Yue, Dong Wang. 5644-5656 [doi]
- Exploring Key Point Analysis with Pairwise Generation and Graph PartitioningXiao Li, Yong Jiang, Shen Huang, Pengjun Xie, Gong Cheng, Fei Huang. 5657-5667 [doi]
- Understanding the Capabilities and Limitations of Large Language Models for Cultural CommonsenseSiqi Shen, Lajanugen Logeswaran, Moontae Lee, Honglak Lee, Soujanya Poria, Rada Mihalcea. 5668-5680 [doi]
- Code Models are Zero-shot Precondition ReasonersLajanugen Logeswaran, Sungryull Sohn, Yiwei Lyu, Anthony Z. Liu, Dong Ki Kim, Dongsub Shim, Moontae Lee, Honglak Lee. 5681-5697 [doi]
- Contrastive and Consistency Learning for Neural Noisy-Channel Model in Spoken Language UnderstandingSuyoung Kim, Jiyeon Hwang, Ho-Young Jung. 5698-5711 [doi]
- Do Large Language Models Rank Fairly? An Empirical Study on the Fairness of LLMs as RankersYuan Wang, Xuyang Wu 0002, Hsin-Tai Wu, Zhiqiang Tao, Yi Fang 0008. 5712-5724 [doi]
- TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table DecompositionMd Nahid Hasan Nahid, Davood Rafiei. 5725-5737 [doi]
- Contextual Label Projection for Cross-Lingual Structured PredictionTanmay Parekh, I-Hung Hsu, Kuan-Hao Huang, Kai-Wei Chang, Nanyun Peng. 5738-5757 [doi]
- Event Detection from Social Media for Epidemic PredictionTanmay Parekh, Anh Mac, Jiarui Yu, Yuxuan Dong, Syed Shahriar, Bonnie Liu, Eric Yang, Kuan-Hao Huang, Wei Wang 0010, Nanyun Peng, Kai-Wei Chang. 5758-5783 [doi]
- RESPROMPT: Residual Connection Prompting Advances Multi-Step Reasoning in Large Language ModelsSong Jiang 0002, Zahra Shakeri, Aaron Chan, Maziar Sanjabi, Hamed Firooz, Yinglong Xia, Bugra Akyildiz, Yizhou Sun, Jinchao Li, Qifan Wang, Asli Celikyilmaz. 5784-5809 [doi]
- BPE-knockout: Pruning Pre-existing BPE Tokenisers with Backwards-compatible Morphological Semi-supervisionThomas Bauwens, Pieter Delobelle. 5810-5832 [doi]
- How are Prompts Different in Terms of Sensitivity?Sheng Lu, Hendrik Schuff, Iryna Gurevych. 5833-5856 [doi]
- LSTDial: Enhancing Dialogue Generation via Long- and Short-Term Measurement FeedbackGuanghui Ye, Huan Zhao, Zixing Zhang 0001, Xupeng Zha, Zhihua Jiang. 5857-5871 [doi]
- The ART of LLM Refinement: Ask, Refine, and TrustKumar Shridhar, Koustuv Sinha, Andrew Cohen, Tianlu Wang, Ping Yu, Ramakanth Pasunuru, Mrinmaya Sachan, Jason Weston, Asli Celikyilmaz. 5872-5883 [doi]
- Modularized Multilingual NMT with Fine-grained InterlinguaSungjun Lim, Yoonjung Choi, Sangha Kim 0002. 5884-5899 [doi]
- ParallelPARC: A Scalable Pipeline for Generating Natural-Language AnalogiesOren Sultan, Yonatan Bitton, Ron Yosef, Dafna Shahaf. 5900-5924 [doi]
- AWESOME: GPU Memory-constrained Long Document Summarization using Memory Mechanism and Global Salient ContentShuyang Cao, Lu Wang 0008. 5925-5941 [doi]
- NLP Systems That Can't Tell Use from Mention Censor Counterspeech, but Teaching the Distinction HelpsKristina Gligoric, Myra Cheng, Lucia Zheng, Esin Durmus, Dan Jurafsky. 5942-5959 [doi]
- Debiasing with Sufficient Projection: A General Theoretical Framework for Vector RepresentationsEnze Shi, Lei Ding 0013, Linglong Kong, Bei Jiang. 5960-5975 [doi]
- Semi-Supervised Dialogue Abstractive Summarization via High-Quality Pseudolabel SelectionJianfeng He, Hang Su, Jason Cai, Igor Shalyminov, Hwanjun Song, Saab Mansour. 5976-5996 [doi]
- AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African LanguagesJiayi Wang, David Ifeoluwa Adelani, Sweta Agrawal, Marek Masiak, Ricardo Rei, Eleftheria Briakou, Marine Carpuat, Xuanli He, Sofia Bourhim, Andiswa Bukula, Muhidin Mohamed, Temitayo Olatoye, Tosin P. Adewumi, Hamam Mokayed, Christine Mwase, Wangui Kimotho, Foutse Yuehgoh, Anuoluwapo Aremu, Jessica Ojo, Shamsuddeen Hassan Muhammad, Salomey Osei, Abdul-Hakeem Omotayo, Chiamaka Chukwuneke, Perez Ogayo, Oumaima Hourrane, Salma El Anigri, Lolwethu Ndolela, Thabiso Mangwana, Shafie Abdi Mohamed, Ayinde Hassan, Oluwabusayo Olufunke Awoyomi, Lama Alkhaled, Sana Sabah Al-azzawi, Naome A. Etori, Millicent Ochieng, Clemencia Siro, Njoroge Kiragu, Eric Muchiri, Wangari Kimotho, Sakayo Toadoum Sari, Lyse Naomi Wamba Momo, Daud Abolade, Simbiat Ajao, Iyanuoluwa Shode, Ricky Macharm, Ruqayya Nasir Iro, Saheed S. Abdullahi, Stephen E. Moore, Bernard Opoku, Zainab Akinjobi, Afolabi Abeeb, Nnaemeka C. Obiefuna, Onyekachi Raphael Ogbu, Sam Ochieng', Verrah Otiende, Chinedu E. Mbonu, Yao Lu, Pontus Stenetorp. 5997-6023 [doi]
- TableLlama: Towards Open Large Generalist Models for TablesTianshu Zhang, Xiang Yue, Yifei Li, Huan Sun 0001. 6024-6044 [doi]
- PEMA: An Offsite-Tunable Plug-in External Memory Adaptation for Language ModelsHyunjin Kim, Young-Jin Kim, JinYeong Bak. 6045-6064 [doi]
- Backdooring Instruction-Tuned Large Language Models with Virtual Prompt InjectionJun Yan, Vikas Yadav, Shiyang Li, Lichang Chen, Zheng Tang, Hai Wang, Vijay Srinivasan, Xiang Ren, Hongxia Jin. 6065-6086 [doi]
- Exploring the Factual Consistency in Dialogue Comprehension of Large Language ModelsShuaijie She, Shujian Huang, Xingyun Wang, Yanke Zhou, Jiajun Chen. 6087-6100 [doi]
- Multilingual Pretraining and Instruction Tuning Improve Cross-Lingual Knowledge Alignment, But Only ShallowlyChangjiang Gao, Hongda Hu, Peng Hu, Jiajun Chen, Jixing Li, Shujian Huang. 6101-6117 [doi]
- A Study on the Calibration of In-context LearningHanlin Zhang, Yifan Zhang, Yaodong Yu, Dhruv Madeka, Dean Foster, Eric P. Xing, Himabindu Lakkaraju, Sham M. Kakade. 6118-6136 [doi]
- DialogBench: Evaluating LLMs as Human-like Dialogue SystemsJiao Ou, Junda Lu, Che Liu, Yihong Tang, Fuzheng Zhang, Di Zhang, Kun Gai. 6137-6170 [doi]
- GINopic: Topic Modeling with Graph Isomorphism NetworkSuman Adhya, Debarshi Kumar Sanyal. 6171-6183 [doi]
- CMB: A Comprehensive Medical Benchmark in ChineseXidong Wang, Guiming Chen, Dingjie Song, Zhiyi Zhang 0007, Zhihong Chen, Qingying Xiao, Junying Chen, Feng Jiang, Jianquan Li, Xiang Wan, Benyou Wang, Haizhou Li 0001. 6184-6205 [doi]
- Massive End-to-end Speech Recognition Models with Time ReductionWeiran Wang, Rohit Prabhavalkar, Haozhe Shan, Zhong Meng, Dongseong Hwang, Qiujia Li, Khe Chai Sim, Bo Li 0028, James Qin, Xingyu Cai, Adam Stooke, Chengjian Zheng, Yanzhang He, Tara N. Sainath, Pedro Moreno Mengibar. 6206-6217 [doi]
- SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training DynamicsArash Ardakani, Altan Haan, Shangyin Tan, Doru-Thom Popovici, Alvin Cheung, Costin Iancu, Koushik Sen. 6218-6236 [doi]
- Effective Large Language Model Adaptation for Improved Grounding and Citation GenerationXi Ye, Ruoxi Sun 0002, Sercan Ö. Arik, Tomas Pfister. 6237-6251 [doi]
- Assisting in Writing Wikipedia-like Articles From Scratch with Large Language ModelsYijia Shao, Yucheng Jiang, Theodore A. Kanell, Peter Xu 0003, Omar Khattab, Monica S. Lam. 6252-6278 [doi]
- Grounding Gaps in Language Model GenerationsOmar Shaikh, Kristina Gligoric, Ashna Khetan, Matthias Gerstgrasser, Diyi Yang, Dan Jurafsky. 6279-6296 [doi]
- When Does Monolingual Data Help Multilingual Translation: The Role of Domain and Model ScaleChristos Baziotis, Biao Zhang 0006, Alexandra Birch, Barry Haddow. 6297-6324 [doi]
- ContraSim - Analyzing Neural Representations Based on Contrastive LearningAdir Rahamim, Yonatan Belinkov. 6325-6339 [doi]
- Universal Prompt Optimizer for Safe Text-to-Image GenerationZongyu Wu, Hongcheng Gao, Yueze Wang, Xiang Zhang 0001, Suhang Wang. 6340-6354 [doi]
- Language Model Based Unsupervised Dependency Parsing with Conditional Mutual Information and Grammatical ConstraintsJunjie Chen, Xiangheng He, Yusuke Miyao. 6355-6366 [doi]
- The Bias Amplification Paradox in Text-to-Image GenerationPreethi Seshadri, Sameer Singh 0001, Yanai Elazar. 6367-6384 [doi]
- Grammar-based Data Augmentation for Low-Resource Languages: The Case of Guarani-Spanish Neural Machine TranslationAgustín Lucas, Alexis Baladón, Victoria Pardiñas, Marvin M. Agüero-Torales, Santiago Góngora, Luis Chiruzzo. 6385-6397 [doi]
- Global Gallery: The Fine Art of Painting Culture Portraits through Multilingual Instruction Tuning. Anjishnu Mukherjee, Aylin Caliskan, Ziwei Zhu 0001, Antonios Anastasopoulos. 6398-6415
- Toward Interactive Regional Understanding in Vision-Large Language Models. Jungbeom Lee, Sanghyuk Chun, Sangdoo Yun. 6416-6429
- ScriptMix: Mixing Scripts for Low-resource Language Parsing. Jaeseong Lee 0002, Dohyeon Lee, Seung-won Hwang. 6430-6444
- MT-PATCHER: Selective and Extendable Knowledge Distillation from Large Language Models for Machine Translation. Jiahuan Li, Shanbo Cheng, Shujian Huang, Jiajun Chen. 6445-6459
- ToXCL: A Unified Framework for Toxic Speech Detection and Explanation. Nhat M. Hoang, Xuan Long Do, Duc Anh Do, Duc Anh Vu 0002, Anh Tuan Luu. 6460-6472
- LinkPrompt: Natural and Universal Adversarial Attacks on Prompt-based Language Models. Yue Xu, Wenjie Wang. 6473-6486
- CoE-SQL: In-Context Learning for Multi-Turn Text-to-SQL with Chain-of-Editions. Hanchong Zhang, Ruisheng Cao, Hongshen Xu, Lu Chen 0002, Kai Yu 0004. 6487-6508
- ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models. Jierui Li, Vipul Raheja, Dhruv Kumar 0005. 6509-6523
- Entity Disambiguation via Fusion Entity Decoding. Junxiong Wang, Ali Mousavi 0003, Omar Attia, Ronak Pradeep, Saloni Potdar, Alexander M. Rush, Umar Farooq Minhas, Yunyao Li 0001. 6524-6536
- PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers. Myeonghwa Lee, Seonho An, Min-Soo Kim. 6537-6555
- GPTScore: Evaluate as You Desire. JinLan Fu, See-Kiong Ng, Zhengbao Jiang, Pengfei Liu 0003. 6556-6576
- A Survey of Confidence Estimation and Calibration in Large Language Models. Jiahui Geng, Fengyu Cai, Yuxia Wang, Heinz Koeppl, Preslav Nakov, Iryna Gurevych. 6577-6595
- Not All Metrics Are Guilty: Improving NLG Evaluation by Diversifying References. Tianyi Tang, Hongyuan Lu, Yuchen Jiang, Haoyang Huang, Dongdong Zhang 0001, Wayne Xin Zhao, Tom Kocmi, Furu Wei. 6596-6610
- Separation and Fusion: A Novel Multiple Token Linking Model for Event Argument Extraction. Jing Xu, Dandan Song, Siu Hui, Zhijing Wu 0001, Meihuizi Jia, Hao Wang 0163, Yanru Zhou, Changzhi Zhou, Ziyi Yang. 6611-6624
- The Integration of Semantic and Structural Knowledge in Knowledge Graph Entity Typing. Muzhi Li, Minda Hu, Irwin King, Ho-Fung Leung. 6625-6638
- ComCLIP: Training-Free Compositional Image and Text Matching. Kenan Jiang, Xuehai He, Ruize Xu, Xin Wang. 6639-6659
- ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications. Sotaro Takeshita, Tommaso Green, Ines Reinig, Kai Eckert 0001, Simone Paolo Ponzetto. 6660-6675
- XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners. Yun Luo, Zhen Yang, Fandong Meng, Yingjie Li, Fang Guo, Qinglin Qi, Jie Zhou 0016, Yue Zhang 0004. 6676-6698
- LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation? YuChi Wang, Shuhuai Ren, Rundong Gao, Linli Yao, Qingyan Guo, Kaikai An, Jianhong Bai, Xu Sun 0001. 6699-6715
- Intent-conditioned and Non-toxic Counterspeech Generation using Multi-Task Instruction Tuning with RLAIF. Amey Hengle, Aswini Padhi, Sahajpreet Singh, Anil Bandhakavi, Md. Shad Akhtar, Tanmoy Chakraborty 0002. 6716-6733
- Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey. Zhichen Dong, Zhanhui Zhou, Chao Yang, Jing Shao, Yu Qiao. 6734-6747
- Mind's Mirror: Distilling Self-Evaluation Capability and Comprehensive Thinking from Large Language Models. Weize Liu, Guocong Li, Kai Zhang 0039, Bang Du, Qiyuan Chen, Xuming Hu, Hongxia Xu, Jintai Chen, Jian Wu 0001. 6748-6763
- Divergent Token Metrics: Measuring degradation to prune away LLM components - and optimize quantization. Björn Deiseroth, Max Meuer, Nikolas Gritsch, Constantin Eichenberg, Patrick Schramowski, Matthias Aßenmacher, Kristian Kersting. 6764-6783
- Beyond Performance: Quantifying and Mitigating Label Bias in LLMs. Yuval Reif, Roy Schwartz 0001. 6784-6798
- Instructing Large Language Models to Identify and Ignore Irrelevant Conditions. Zhenyu Wu 0004, Chao Shen, Meng Jiang 0001. 6799-6819
- Lower Bounds on the Expressivity of Recurrent Neural Language Models. Anej Svete, Franz Nowak, Anisha Mohamed Sahabdeen, Ryan Cotterell. 6820-6844
- Transformers Can Represent n-gram Language Models. Anej Svete, Ryan Cotterell. 6845-6881
- The Role of n-gram Smoothing in the Age of Neural Networks. Luca Malagutti, Andrius Buinovskij, Anej Svete, Clara Meister, Afra Amini, Ryan Cotterell. 6882-6899
- Reliability Estimation of News Media Sources: Birds of a Feather Flock Together. Sergio Burdisso, Dairazalia Sanchez-Cortes, Esaú Villatoro-Tello, Petr Motlícek. 6900-6918
- On the Multilingual Ability of Decoder-based Pre-trained Language Models: Finding and Controlling Language-Specific Neurons. Takeshi Kojima, Itsuki Okimura, Yusuke Iwasawa, Hitomi Yanaka, Yutaka Matsuo. 6919-6971
- NLP Progress in Indigenous Latin American Languages. Atnafu Lambebo Tonja, Fazlourrahman Balouchzahi, Sabur Butt, Olga Kolesnikova, Hector G. Ceballos, Alexander F. Gelbukh, Thamar Solorio. 6972-6987
- On the Effectiveness of Adversarial Robustness for Abuse Mitigation with Counterspeech. Yi-Ling Chung, Jonathan Bright. 6988-7002
- Leveraging the Structure of Pre-trained Embeddings to Minimize Annotation Effort. César Gonzalez-Gutiérrez, Ariadna Quattoni. 7003-7017
- UniArk: Improving Generalisation and Consistency for Factual Knowledge Extraction through Debiasing. Yijun Yang, Jie He 0004, Pinzhen Chen, Víctor Gutiérrez-Basulto, Jeff Z. Pan. 7018-7035
- Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity. Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, Jong Park. 7036-7050
- Knowing What LLMs DO NOT Know: A Simple Yet Effective Self-Detection Method. Yukun Zhao, Lingyong Yan, Weiwei Sun 0001, Guoliang Xing, Chong Meng, Shuaiqiang Wang, Zhicong Cheng, Zhaochun Ren, Dawei Yin. 7051-7063
- Are Large Language Models Temporally Grounded? Yifu Qiu, Zheng Zhao 0005, Yftah Ziser, Anna Korhonen, Edoardo Maria Ponti, Shay B. Cohen. 7064-7083
- Document Image Machine Translation with Dynamic Multi-pre-trained Models Assembling. Yupu Liang, Yaping Zhang, Cong Ma, Zhiyang Zhang, Yang Zhao, Lu Xiang, Chengqing Zong, Yu Zhou. 7084-7095
- Elastic Weight Removal for Faithful and Abstractive Dialogue Generation. Nico Daheim, Nouha Dziri, Mrinmaya Sachan, Iryna Gurevych, Edoardo M. Ponti. 7096-7112
- R-Tuning: Instructing Large Language Models to Say 'I Don't Know'. Hanning Zhang, Shizhe Diao, Yong Lin, Yi R. Fung 0001, Qing Lian, Xingyao Wang 0002, Yangyi Chen, Heng Ji, Tong Zhang 0001. 7113-7139
- Bridging the Gap between Different Vocabularies for LLM Ensemble. Yangyifan Xu, Jinliang Lu, Jiajun Zhang. 7140-7152
- KnowLA: Enhancing Parameter-efficient Finetuning with Knowledgeable Adaptation. Xindi Luo, Zequn Sun, Jing Zhao, Zhe Zhao, Wei Hu 0007. 7153-7166
- Extremely Weakly-supervised Text Classification with Wordsets Mining and Sync-Denoising. Lysa Xiao. 7167-7179
- F-MALLOC: Feed-forward Memory Allocation for Continual Learning in Neural Machine Translation. Junhong Wu, Yuchen Liu 0007, Chengqing Zong. 7180-7192
- Towards Reducing Diagnostic Errors with Interpretable Risk Prediction. Denis Jered McInerney, William Dickinson, Lucy C. Flynn, Andrea Young, Geoffrey Young, Jan-Willem van de Meent, Byron C. Wallace. 7193-7210
- Generalizable Multilingual Hate Speech Detection on Low Resource Indian Languages using Fair Selection in Federated Learning. Akshay Singh, Rahul Thakur. 7211-7221
- Key ingredients for effective zero-shot cross-lingual knowledge transfer in generative tasks. Nadezhda Chirkova, Vassilina Nikoulina. 7222-7238
- The Impact of Depth on Compositional Generalization in Transformer Language Models. Jackson Petty, Sjoerd van Steenkiste, Ishita Dasgupta 0001, Fei Sha, Dan Garrette, Tal Linzen. 7239-7252
- Pregnant Questions: The Importance of Pragmatic Awareness in Maternal Health Question Answering. Neha Srikanth, Rupak Sarkar, Heran Mane, Elizabeth Aparicio, Quynh C. Nguyen, Rachel Rudinger, Jordan L. Boyd-Graber. 7253-7268
- Towards Explainability in Legal Outcome Prediction Models. Josef Valvoda, Ryan Cotterell. 7269-7289
- The steerability of large language models toward data-driven personas. Junyi Li, Charith Peris, Ninareh Mehrabi, Palash Goyal, Kai-Wei Chang, Aram Galstyan, Richard S. Zemel, Rahul Gupta 0001. 7290-7305
- CCSum: A Large-Scale and High-Quality Dataset for Abstractive News Summarization. Xiang Jiang, Markus Dreyer. 7306-7336
- Capturing Perspectives of Crowdsourced Annotators in Subjective Learning Tasks. Negar Mokhberian, Myrl G. Marmarelis, Frederic R. Hopp, Valerio Basile, Fred Morstatter, Kristina Lerman. 7337-7349
- Improving Factual Accuracy of Neural Table-to-Text Output by Addressing Input Problems in ToTTo. Barkavi Sundararajan, Yaji Sripada, Ehud Reiter. 7350-7376
- CERET: Cost-Effective Extrinsic Refinement for Text Generation. Jason Cai, Hang Su, Monica Sunkara, Igor Shalyminov, Saab Mansour. 7377-7390
- Parameter-Efficient Instruction Tuning of Large Language Models For Extreme Financial Numeral Labelling. Subhendu Khatuya, Rajdeep Mukherjee, Akash Ghosh, Manjunath Hegde, Koustuv Dasgupta, Niloy Ganguly, Saptarshi Ghosh 0001, Pawan Goyal 0002. 7391-7403
- Analysis of State-Level Legislative Process in Enhanced Linguistic and Nationwide Network Contexts. Maryam Davoodi, Dan Goldwasser. 7404-7422
- DeMuX: Data-efficient Multilingual Learning. Simran Khanuja, Srinivas Gowriraj, Lucio M. Dery, Graham Neubig. 7423-7436
- DUQGen: Effective Unsupervised Domain Adaptation of Neural Rankers by Diversifying Synthetic Query Generation. Ramraj Chandradevan, Kaustubh D. Dhole, Eugene Agichtein. 7437-7451
- How did we get here? Summarizing conversation dynamics. Yilun Hua, Nicholas Chernogor, Yuzhe Gu, Seoyeon Julie Jeong, Miranda Luo, Cristian Danescu-Niculescu-Mizil. 7452-7477
- Can Language Model Moderators Improve the Health of Online Discourse? Hyundong Cho, Shuai Liu, Taiwei Shi, Darpan Jain, Basem Rizk, Yuyang Huang, Zixun Lu, Nuan Wen, Jonathan Gratch, Emilio Ferrara, Jonathan May. 7478-7496
- LeanReasoner: Boosting Complex Logical Reasoning with Lean. Dongwei Jiang, Marcio Fonseca, Shay B. Cohen. 7497-7510
- UICoder: Finetuning Large Language Models to Generate User Interface Code through Automated Feedback. Jason Wu 0001, Eldon Schoop, Alan Leung, Titus Barik, Jeffrey P. Bigham, Jeffrey Nichols 0001. 7511-7525
- Measuring Cross-lingual Transfer in Bytes. Leandro Rodrigues de Souza, Thales Sales Almeida, Roberto de Alencar Lotufo, Rodrigo Frassetto Nogueira. 7526-7537
- MisgenderMender: A Community-Informed Approach to Interventions for Misgendering. Tamanna Hossain, Sunipa Dev, Sameer Singh 0001. 7538-7558
- Interplay of Machine Translation, Diacritics, and Diacritization. Wei-Rui Chen, Ife Adebara, Muhammad Abdul-Mageed. 7559-7601
- From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning. Ming Li, Yong Zhang, Zhitao Li, Jiuhai Chen, Lichang Chen, Ning Cheng 0001, Jianzong Wang, Tianyi Zhou 0001, Jing Xiao 0006. 7602-7635
- Safer-Instruct: Aligning Language Models with Automated Preference Data. Taiwei Shi, Kai Chen, Jieyu Zhao. 7636-7651
- PELMS: Pre-training for Effective Low-Shot Multi-Document Summarization. Joseph Peper, Wenzhao Qiu, Lu Wang 0008. 7652-7674
- Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination? Bangzheng Li, Ben Zhou, Fei Wang 0060, Xingyu Fu, Dan Roth, Muhao Chen. 7675-7688
- IndiSentiment140: Sentiment Analysis Dataset for Indian Languages with Emphasis on Low-Resource Languages using Machine Translation. Saurabh Kumar, Sanasam Ranbir Sanasam, Sukumar Nandi. 7689-7698
- Leveraging LLMs for Synthesizing Training Data Across Many Languages in Multilingual Dense Retrieval. Nandan Thakur, Jianmo Ni, Gustavo Hernández Ábrego, John Wieting, Jimmy Lin, Daniel Cer. 7699-7724
- SCANNER: Knowledge-Enhanced Approach for Robust Multi-modal Named Entity Recognition of Unseen Entities. Hyunjong Ok, Taeho Kil, Sukmin Seo, Jaeho Lee 0001. 7725-7737
- A Theory Guided Scaffolding Instruction Framework for LLM-Enabled Metaphor Reasoning. Yuan Tian, Nan Xu, Wenji Mao. 7738-7755
- Learning to Compress Prompt in Natural Language Formats. Yu-Neng Chuang, Tianwei Xing, Chia-Yuan Chang, Zirui Liu, Xun Chen, Xia Ben Hu. 7756-7767
- Automatic, Meta and Human Evaluation for Multimodal Summarization with Multimodal Output. Haojie Zhuang, Wei Emma Zhang, Leon Xie, Weitong Chen 0001, Jian Yang 0001, Quan Sheng. 7768-7790
- Naive Bayes-based Context Extension for Large Language Models. Jianlin Su, Murtadha H. M. Ahmed, Bo Wen, Luo Ao, Mingren Zhu, Yunfeng Liu. 7791-7807
- Leitner-Guided Memory Replay for Cross-lingual Continual Learning. Meryem M'hamdi, Jonathan May. 7808-7821
- Multilingual Nonce Dependency Treebanks: Understanding how Language Models Represent and Process Syntactic Structure. David Arps, Laura Kallmeyer, Younes Samih, Hassan Sajjad 0001. 7822-7844
- Actively Learn from LLMs with Uncertainty Propagation for Generalized Category Discovery. Jinggui Liang, Lizi Liao, Hao Fei 0001, Bobo Li, Jing Jiang 0001. 7845-7858
- Explaining Text Similarity in Transformer Models. Alexandros Vasileiou 0002, Oliver Eberle. 7859-7873
- Large Language Models can Contrastively Refine their Generation for Better Sentence Representation Learning. Huiming Wang, Zhaodonghui Li, LiYing Cheng, De Wen Soh, Lidong Bing. 7874-7891
- HIL: Hybrid Isotropy Learning for Zero-shot Performance in Dense retrieval. Jaeyoung Kim, Dohyeon Lee, Seung-won Hwang. 7892-7903
- SuperGLEBer: German Language Understanding Evaluation Benchmark. Jan Pfister, Andreas Hotho. 7904-7923
- "You are an expert annotator": Automatic Best-Worst-Scaling Annotations for Emotion Intensity Modeling. Christopher Bagdon, Prathamesh Karmalkar, Harsha Gurulingappa, Roman Klinger. 7924-7936
- What Matters in Training a GPT4-Style Language Model with Multimodal Inputs? Yan Zeng, Hanbo Zhang, Jiani Zheng, Jiangnan Xia, Guoqiang Wei, Yang Wei, Yuchen Zhang, Tao Kong, Ruihua Song. 7937-7964
- Defining and Detecting Vulnerability in Human Evaluation Guidelines: A Preliminary Study Towards Reliable NLG Evaluation. Jie Ruan, Wenqing Wang, Xiaojun Wan 0001. 7965-7989
- MOSAICo: a Multilingual Open-text Semantically Annotated Interlinked Corpus. Simone Conia, Edoardo Barba, Abelardo Carlos Martinez Lorenzo, Pere-Lluís Huguet Cabot, Riccardo Orlando, Luigi Procopio, Roberto Navigli. 7990-8004
- SemRoDe: Macro Adversarial Training to Learn Representations that are Robust to Word-Level Attacks. Brian Formento, Wenjie Feng 0001, Chuan-Sheng Foo, Anh Tuan Luu, See-Kiong Ng. 8005-8028
- BUST: Benchmark for the evaluation of detectors of LLM-Generated Text. Joseph Cornelius, Oscar Lithgow-Serrano, Sandra Mitrovic, Ljiljana Dolamic, Fabio Rinaldi 0001. 8029-8057
- Improving In-context Learning of Multilingual Generative Language Models with Cross-lingual Alignment. Chong Li, Shaonan Wang, Jiajun Zhang, Chengqing Zong. 8058-8076
- MaCSC: Towards Multimodal-augmented Pre-trained Language Models via Conceptual Prototypes and Self-balancing Calibration. Xianwei Zhuang, Zhichang Wang, Xuxin Cheng, Yuxin Xie, Liming Liang, Yuexian Zou. 8077-8090
- Does Pre-trained Language Model Actually Infer Unseen Links in Knowledge Graph Completion? Yusuke Sakai 0010, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe. 8091-8106
- Discovering Lobby-Parliamentarian Alignments through NLP. Aswin Suresh, Lazar Radojevic, Francesco Salvi, Antoine Magron, Victor Kristof, Matthias Grossglauser. 8107-8120
- IterCQR: Iterative Conversational Query Reformulation with Retrieval Guidance. Yunah Jang, Kang Il Lee, Hyunkyung Bae, Hwanhee Lee, Kyomin Jung. 8121-8138
- AceGPT, Localizing Large Language Models in Arabic. Huang Huang, Fei Yu, Jianqing Zhu, Xuening Sun, Hao Cheng, Dingjie Song, Zhihong Chen, Mosen Alharthi, Bang An, Juncai He, Ziche Liu, Junying Chen, Jianquan Li, Benyou Wang, Lian Zhang, Ruoyu Sun 0001, Xiang Wan, Haizhou Li 0001, Jinchao Xu. 8139-8163
- Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model. Zhiwei He 0002, Xing Wang 0007, Wenxiang Jiao, Zhuosheng Zhang 0001, Rui Wang 0015, Shuming Shi 0001, Zhaopeng Tu. 8164-8180
- Depression Detection in Clinical Interviews with LLM-Empowered Structural Element Graph. Zhuang Chen 0002, Jiawen Deng, Jinfeng Zhou, Jincenzi Wu, Tieyun Qian, Minlie Huang. 8181-8194
- SQATIN: Supervised Instruction Tuning Meets Question Answering for Improved Dialogue NLU. Evgeniia Razumovskaia, Goran Glavas, Anna Korhonen, Ivan Vulic. 8195-8211
- Enhancing Argument Summarization: Prioritizing Exhaustiveness in Key Point Generation and Introducing an Automatic Coverage Evaluation Metric. Mohammad Khosravani, Chenyang Huang 0001, Amine Trabelsi. 8212-8224
- ARM: Alignment with Residual Energy-Based Model. Bo Pang, Caiming Xiong, Yingbo Zhou. 8225-8236
- HumanRankEval: Automatic Evaluation of LMs as Conversational Assistants. Milan Gritta, Gerasimos Lampouras, Ignacio Iacobacci. 8237-8249
- FAMuS: Frames Across Multiple Sources. Siddharth Vashishtha, Alexander Martin 0006, William Gantt, Benjamin Van Durme, Aaron Steven White. 8250-8273
- Rationale-based Opinion Summarization. Haoyuan Li, Snigdha Chaturvedi. 8274-8292
- Mustango: Toward Controllable Text-to-Music Generation. Jan Melechovský, Zixun Guo, Deepanway Ghosal, Navonil Majumder, Dorien Herremans, Soujanya Poria. 8293-8316
- Adaptive Cross-lingual Text Classification through In-Context One-Shot Demonstrations. Emilio Villa-Cueva, Adrián Pastor López-Monroy, Fernando Sánchez-Vega, Thamar Solorio. 8317-8335
- CNER: Concept and Named Entity Recognition. Giuliano Martinelli, Francesco Molfese 0001, Simone Tedeschi, Alberte Fernández-Castro, Roberto Navigli. 8336-8351
- Branch-Solve-Merge Improves Large Language Model Evaluation and Generation. Swarnadeep Saha, Omer Levy, Asli Celikyilmaz, Mohit Bansal, Jason Weston, Xian Li. 8352-8370
- REPLUG: Retrieval-Augmented Black-Box Language Models. Weijia Shi, Sewon Min, Michihiro Yasunaga, Minjoon Seo, Richard James 0001, Mike Lewis, Luke Zettlemoyer, Wen-tau Yih. 8371-8384
- David helps Goliath: Inference-Time Collaboration Between Small Specialized and Large General Diffusion LMs. Xiaochuang Han, Sachin Kumar 0009, Yulia Tsvetkov, Marjan Ghazvininejad. 8385-8400
- Efficient End-to-End Visual Document Understanding with Rationale Distillation. Wang Zhu 0001, Alekh Agarwal, Mandar Joshi, Robin Jia, Jesse Thomason, Kristina Toutanova. 8401-8424
- A Systematic Comparison of Syllogistic Reasoning in Humans and Language Models. Tiwalayo Eisape, Michael Henry Tessler, Ishita Dasgupta 0001, Fei Sha, Sjoerd van Steenkiste, Tal Linzen. 8425-8444
- AnchorAL: Computationally Efficient Active Learning for Large and Imbalanced Datasets. Pietro Lesci, Andreas Vlachos 0001. 8445-8464
- ICLE++: Modeling Fine-Grained Traits for Holistic Essay Scoring. Shengjie Li 0002, Vincent Ng 0001. 8465-8486
- UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations. Wenting Zhao, Justin T. Chiu, Jena D. Hwang, Faeze Brahman, Jack Hessel, Sanjiban Choudhury, Yejin Choi 0001, Xiang Li 0069, Alane Suhr. 8487-8505
- To Tell The Truth: Language of Deception and Language Models. Sanchaita Hazra, Bodhisattwa Prasad Majumder. 8506-8520
- Multilingual Models for ASR in Chibchan Languages. Rolando Coto-Solano, Tai-Wan Kim, Alexander Jones, Sharid Loáiciga. 8521-8535
- LegalDiscourse: Interpreting When Laws Apply and To Whom. Alexander Spangher, Zihan Xue, Te-Lin Wu, Mark Hansen, Jonathan May. 8536-8559
- X-Eval: Generalizable Multi-aspect Text Evaluation via Augmented Instruction Tuning with Auxiliary Evaluation Aspects. Minqian Liu, Ying Shen, Zhiyang Xu, Yixin Cao 0002, Eunah Cho, Vaibhav Kumar, Reza Ghanadan, Lifu Huang. 8560-8579
- Is Reference Necessary in the Evaluation of NLG Systems? When and Where? Shuqian Sheng, Yi Xu 0004, Luoyi Fu, Jiaxin Ding, Lei Zhou, Xinbing Wang, Chenghu Zhou. 8580-8596
- Semi-Structured Chain-of-Thought: Integrating Multiple Sources of Knowledge for Improved Language Model Reasoning. Xin Su, Tiep Le, Steven Bethard, Phillip Howard. 8597-8613
- Evaluating the Deductive Competence of Large Language Models. S. M. Seals, Valerie L. Shalin. 8614-8630
- Large Human Language Models: A Need and the Challenges. Nikita Soni 0002, H. Andrew Schwartz, João Sedoc, Niranjan Balasubramanian. 8631-8646
- On Learning to Summarize with Large Language Models as References. Yixin Liu 0003, Kejian Shi, Katherine He, Longtian Ye, Alexander R. Fabbri, Pengfei Liu 0003, Dragomir Radev, Arman Cohan. 8647-8664
- Hallucination Diversity-Aware Active Learning for Text Summarization. Yu Xia, Xu Liu, Tong Yu 0001, SungChul Kim, Ryan A. Rossi, Anup B. Rao, Tung Mai, Shuai Li 0010. 8665-8677
- Keep it Private: Unsupervised Privatization of Online Text. Calvin Bao, Marine Carpuat. 8678-8693
- Tied-LoRA: Enhancing parameter efficiency of LoRA with Weight Tying. Adithya Renduchintala, Tugrul Konuk, Oleksii Kuchaiev. 8694-8705
- Investigating Data Contamination in Modern Benchmarks for Large Language Models. Chunyuan Deng, Yilun Zhao 0001, Xiangru Tang, Mark Gerstein, Arman Cohan. 8706-8719
- Pre-trained Language Models for Entity Blocking: A Reproducibility Study. Runhui Wang, Yongfeng Zhang. 8720-8730
- RE²: Region-Aware Relation Extraction from Visually Rich Documents. Pritika Ramu, Sijia Wang, Lalla Mouatadid, Joy Rimchala, Lifu Huang. 8731-8747
- Mix-Initiative Response Generation with Dynamic Prefix Tuning. Yuxiang Nie, Heyan Huang, Xian-Ling Mao, Lizi Liao. 8748-8761
- Value FULCRA: Mapping Large Language Models to the Multidimensional Spectrum of Basic Human Value. Jing Yao, Xiaoyuan Yi, Yifan Gong 0001, Xiting Wang, Xing Xie 0001. 8762-8785
- IndiBias: A Benchmark Dataset to Measure Social Biases in Language Models for Indian Context. Nihar R. Sahoo, Pranamya Prashant Kulkarni, Arif Ahmad, Tanu Goyal, Narjis Asad, Aparna Garimella, Pushpak Bhattacharyya. 8786-8806