Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models

Boxin Wang, Chejian Xu, Shuohang Wang, Zhe Gan, Yu Cheng, Jianfeng Gao, Ahmed Hassan Awadallah, Bo Li. Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models. In Joaquin Vanschoren, Sai Kit Yeung, editors, Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, NeurIPS Datasets and Benchmarks 2021, December 2021, virtual. 2021.
