Vulnerability of Large Language Models to Output Prefix Jailbreaks: Impact of Positions on Safety


Yiwei Wang, Muhao Chen, Nanyun Peng, Kai-Wei Chang. Vulnerability of Large Language Models to Output Prefix Jailbreaks: Impact of Positions on Safety. In Luis Chiruzzo, Alan Ritter, Lu Wang, editors, Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29 - May 4, 2025. Pages 3939-3952, Association for Computational Linguistics, 2025.

