MedEval: A Multi-Level, Multi-Task, and Multi-Domain Medical Benchmark for Language Model Evaluation

Zexue He, Yu Wang, An Yan, Yao Liu, Eric Y. Chang, Amilcare Gentili, Julian J. McAuley, Chun-Nan Hsu. MedEval: A Multi-Level, Multi-Task, and Multi-Domain Medical Benchmark for Language Model Evaluation. In Houda Bouamor, Juan Pino, Kalika Bali, editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, December 6-10, 2023. Pages 8725-8744. Association for Computational Linguistics, 2023.

Authors

Zexue He

Yu Wang

An Yan

Yao Liu

Eric Y. Chang

Amilcare Gentili

Julian J. McAuley

Chun-Nan Hsu
