Eliciting Language Model Behaviors with Investigator Agents

Xiang Lisa Li, Neil Chowdhury, Daniel D. Johnson, Tatsunori Hashimoto, Percy Liang, Sarah Schwettmann, Jacob Steinhardt. Eliciting Language Model Behaviors with Investigator Agents. In Forty-second International Conference on Machine Learning, ICML 2025, Vancouver, BC, Canada, July 13-19, 2025. OpenReview.net, 2025.

Abstract

Abstract is missing.