The following publications are possibly variants of this publication:
- One-Shot Transformer-Based Framework for Visually-Rich Document Understanding. Huynh The Vu, Van Pham Hoai, Jeff Yang. icdar 2024: 244-261
- XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document Understanding. Zhangxuan Gu, Changhua Meng, Ke Wang, Jun Lan, Weiqiang Wang, Ming Gu, Liqing Zhang 0001. cvpr 2022: 4573-4582
- LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding. Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui 0001, Furu Wei, Guoxin Wang, Yijuan Lu, Dinei A. F. Florêncio, Cha Zhang, Wanxiang Che, Min Zhang, Lidong Zhou. acl 2021: 2579-2591
- LayoutGCN: A Lightweight Architecture for Visually Rich Document Understanding. Dengliang Shi, Siliang Liu, Jintao Du, Huijia Zhu. icdar 2023: 149-165
- ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding. Qiming Peng, Yinxu Pan, Wenjin Wang 0003, Bin Luo, Zhenyu Zhang 0006, Zhengjie Huang, Yuhui Cao, Weichong Yin, Yongfeng Chen, Yin Zhang, Shikun Feng, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang. emnlp 2022: 3744-3756