Visual Adversarial Examples Jailbreak Aligned Large Language Models

Xiangyu Qi, Kaixuan Huang, Ashwinee Panda, Peter Henderson 0002, Mengdi Wang, Prateek Mittal. Visual Adversarial Examples Jailbreak Aligned Large Language Models. In Michael J. Wooldridge, Jennifer G. Dy, Sriraam Natarajan, editors, Thirty-Eigth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence, IAAI 2024, Fourteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2014, February 20-27, 2024, Vancouver, Canada. pages 21527-21536, AAAI Press, 2024. [doi]

Abstract

Abstract is missing.