PromptAttack: Prompt-Based Attack for Language Models via Gradient Search

Yundi Shi, Piji Li, Changchun Yin, Zhaoyang Han, Lu Zhou, Zhe Liu. PromptAttack: Prompt-Based Attack for Language Models via Gradient Search. In Wei Lu, Shujian Huang, Yu Hong, Xiabing Zhou, editors, Natural Language Processing and Chinese Computing - 11th CCF International Conference, NLPCC 2022, Guilin, China, September 24-25, 2022, Proceedings, Part I. Volume 13551 of Lecture Notes in Computer Science, pages 682-693, Springer, 2022.
