DocumentNet: Bridging the Data Gap in Document Pre-training

Lijun Yu, Jin Miao, Xiaoyu Sun, Jiayi Chen, Alexander G. Hauptmann, Hanjun Dai, Wei Wei. DocumentNet: Bridging the Data Gap in Document Pre-training. In Mingxuan Wang, Imed Zitouni, editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: EMNLP 2023 - Industry Track, Singapore, December 6-10, 2023, pages 707-722. Association for Computational Linguistics, 2023.

@inproceedings{YuMSCHD023,
  title = {DocumentNet: Bridging the Data Gap in Document Pre-training},
  author = {Lijun Yu and Jin Miao and Xiaoyu Sun and Jiayi Chen and Alexander G. Hauptmann and Hanjun Dai and Wei Wei},
  year = {2023},
  url = {https://aclanthology.org/2023.emnlp-industry.66},
  researchr = {https://researchr.org/publication/YuMSCHD023},
  cites = {0},
  citedby = {0},
  pages = {707--722},
  booktitle = {Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: EMNLP 2023 - Industry Track, Singapore, December 6-10, 2023},
  editor = {Mingxuan Wang and Imed Zitouni},
  publisher = {Association for Computational Linguistics},
}