FashionViL: Fashion-Focused Vision-and-Language Representation Learning

Xiao Han, Licheng Yu, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang. FashionViL: Fashion-Focused Vision-and-Language Representation Learning. In Shai Avidan, Gabriel J. Brostow, Moustapha Cissé, Giovanni Maria Farinella, Tal Hassner, editors, Computer Vision - ECCV 2022 - 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part XXXV. Volume 13695 of Lecture Notes in Computer Science, pages 634-651, Springer, 2022.
