The Internal State of an LLM Knows When It's Lying

Amos Azaria, Tom M. Mitchell. The Internal State of an LLM Knows When It's Lying. In Houda Bouamor, Juan Pino, Kalika Bali, editors, Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023, pages 967-976. Association for Computational Linguistics, 2023.

@inproceedings{AzariaM23,
  title = {The Internal State of an {LLM} Knows When It's Lying},
  author = {Amos Azaria and Tom M. Mitchell},
  year = {2023},
  url = {https://aclanthology.org/2023.findings-emnlp.68},
  researchr = {https://researchr.org/publication/AzariaM23},
  pages = {967-976},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023},
  editor = {Houda Bouamor and Juan Pino and Kalika Bali},
  publisher = {Association for Computational Linguistics},
  isbn = {979-8-89176-061-5},
}