Defending LLMs against jailbreak attacks through representation offset detection


Shuo Liu, Xiang Cheng 0003, ZhenZhong Zheng, Sen Su. Defending LLMs against jailbreak attacks through representation offset detection. Inf. Process. Manage., 63(5):104662, 2026.

