Jieun Park, Kyungtae Lim, Joon-Ho Lim. Beyond Accuracy: Alignment and Error Detection across Languages in the Bi-GSM8K Math-Teaching Benchmark. In Vera Demberg, Kentaro Inui, Lluís Marquez, editors, Findings of the Association for Computational Linguistics: EACL 2026, Rabat, Morocco, March 24-29, 2026. pages 1678-1704, Association for Computational Linguistics, 2026.