The following publications are possibly variants of this publication:
- Neural Knowledge Bank for Pretrained Transformers. Damai Dai, Wenbin Jiang, Qingxiu Dong, Yajuan Lyu, Zhifang Sui. NLPCC 2023: 772-783 [doi]
- BEVT: BERT Pretraining of Video Transformers. Rui Wang, Dongdong Chen, Zuxuan Wu, Yinpeng Chen, Xiyang Dai, Mengchen Liu, Yu-Gang Jiang, Luowei Zhou, Lu Yuan. CVPR 2022: 14713-14723 [doi]
- What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers. Boseop Kim, Hyoungseok Kim, Sang-Woo Lee, Gichang Lee, Dong-Hyun Kwak, Dong Hyeon Jeon, Sunghyun Park, Sungju Kim, Seonhoon Kim, Dongpil Seo, Heungsub Lee, Minyoung Jeong, Sungjae Lee, Minsub Kim, SukHyun Ko, Seokhun Kim, Taeyong Park, Jinuk Kim, Soyoung Kang, Na-Hyeon Ryu, Kang Min Yoo, Minsuk Chang, Soobin Suh, Sookyo In, Jinseong Park, Kyungduk Kim, Hiun Kim, Jisu Jeong, Yong Goo Yeo, DongHoon Ham, Dongju Park, Min-Young Lee, Jaewook Kang, Inho Kang, Jung-Woo Ha, Woo-Myoung Park, Nako Sung. EMNLP 2021: 3405-3424 [doi]