The following publications are possibly variants of this publication:
- Alignment of SOTIF and Scenario-Based Safety Evaluation Framework. Ikue Warren, Kenji Taguchi 0001, Sou Kitajima, Hiroki Nakamura, Tomoyoshi Murata. safecomp 2026: 99-114
- On the Vulnerability of Safety Alignment in Open-Access LLMs. Jingwei Yi, Rui Ye, Qisi Chen, Bin Zhu, Siheng Chen, Defu Lian, Guangzhong Sun, Xing Xie 0001, Fangzhao Wu. acl 2024: 9236-9260
- Safety Alignment via Constrained Knowledge Unlearning. Zesheng Shi, Yucheng Zhou 0001, Jing Li 0034, Yuxin Jin, Yu Li 0007, Daojing He, Fangming Liu, Saleh Alharbi, Jun Yu, Min Zhang 0005. acl 2025: 25515-25529
- SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation. Mingjie Li 0007, Wai Man Si, Michael Backes 0001, Yang Zhang 0016, Yisen Wang 0001. iclr 2025
- Course-Correction: Safety Alignment Using Synthetic Preferences. Rongwu Xu, Yishuo Cai, Zhenhong Zhou, Renjie Gu, Haiqin Weng, Liu Yan, Tianwei Zhang 0004, Wei Xu, Han Qiu 0001. emnlp 2024: 1622-1649