DocumentNet: Bridging the Data Gap in Document Pre-training

Lijun Yu, Jin Miao, Xiaoyu Sun, Jiayi Chen, Alexander G. Hauptmann, Hanjun Dai, Wei Wei. DocumentNet: Bridging the Data Gap in Document Pre-training. In Mingxuan Wang and Imed Zitouni, editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: EMNLP 2023 - Industry Track, Singapore, December 6-10, 2023, pages 707-722. Association for Computational Linguistics, 2023.

Abstract

Abstract is missing.