The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation

Dung Nguyen Manh, Le Nam Hai, Anh T. V. Dau, Anh-Minh Nguyen, Khanh Nghiem, Jin Guo, Nghi D. Q. Bui. The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation. In Houda Bouamor, Juan Pino, Kalika Bali, editors, Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023. Pages 4763-4788, Association for Computational Linguistics, 2023.
