Extracting Latent Steering Vectors from Pretrained Language Models

Nishant Subramani, Nivedita Suresh, Matthew E. Peters. Extracting Latent Steering Vectors from Pretrained Language Models. In Smaranda Muresan, Preslav Nakov, Aline Villavicencio, editors, Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland, May 22-27, 2022, pages 566-581. Association for Computational Linguistics, 2022.
