The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation

Dung Nguyen Manh, Le Nam Hai, Anh T. V. Dau, Anh-Minh Nguyen, Khanh Nghiem, Jin Guo, Nghi D. Q. Bui. The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation. In Houda Bouamor, Juan Pino, Kalika Bali, editors, Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023. Pages 4763-4788, Association for Computational Linguistics, 2023.

Authors

Dung Nguyen Manh

Le Nam Hai

Anh T. V. Dau

Anh-Minh Nguyen

Khanh Nghiem

Jin Guo

Nghi D. Q. Bui
