Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models

Boxin Wang, Chejian Xu, Shuohang Wang, Zhe Gan, Yu Cheng, Jianfeng Gao, Ahmed Hassan Awadallah, Bo Li. Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models. In Joaquin Vanschoren, Sai Kit Yeung, editors, Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, NeurIPS Datasets and Benchmarks 2021, December 2021, virtual. 2021.

@inproceedings{WangXWG0GA021,
  title = {Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models},
  author = {Boxin Wang and Chejian Xu and Shuohang Wang and Zhe Gan and Yu Cheng and Jianfeng Gao and Ahmed Hassan Awadallah and Bo Li},
  year = {2021},
  url = {https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/335f5352088d7d9bf74191e006d8e24c-Abstract-round2.html},
  researchr = {https://researchr.org/publication/WangXWG0GA021},
  booktitle = {Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, NeurIPS Datasets and Benchmarks 2021, December 2021, virtual},
  editor = {Joaquin Vanschoren and Sai Kit Yeung},
}