The Internal State of an LLM Knows When It's Lying

Amos Azaria, Tom M. Mitchell. The Internal State of an LLM Knows When It's Lying. In Houda Bouamor, Juan Pino, Kalika Bali, editors, Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023. pages 967-976, Association for Computational Linguistics, 2023.