Ting Pan, Lulu Tang, Xinlong Wang, Xin Liu, Shiguang Shan. Consistent multimodal pre-training for visual tokenization. Science in China Series F: Information Sciences, 68(10), 2025.