Counterfactual Evaluation for Blind Attack Detection in LLM-based Evaluation Systems

Lijia Liu, Takumi Kondo, Kyohei Atarashi, Koh Takeuchi, Jiyi Li, Shigeru Saito, Hisashi Kashima. Counterfactual Evaluation for Blind Attack Detection in LLM-based Evaluation Systems. In Kentaro Inui, Sakriani Sakti, Haofen Wang, Derek F. Wong, Pushpak Bhattacharyya, Biplab Banerjee, Asif Ekbal, Tanmoy Chakraborty, Dhirendra Pratap Singh, editors, Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, IJCNLP-AACL 2025, Mumbai, India, December 20-24, 2025. Pages 572-584, The Asian Federation of Natural Language Processing and The Association for Computational Linguistics, 2025.

