Ken Shi, Gerald Penn. Semantic Masking in a Needle-in-a-haystack Test for Evaluating Large Language Model Long-Text Capabilities. In Proceedings of the 31st International Conference on Computational Linguistics, COLING 2025 - Workshops, Abu Dhabi, UAE, January 19-24, 2025. Pages 16-23, Association for Computational Linguistics, 2025.