Applying Positional Encoding to Enhance Vision-Language Transformers

Xuehao Liu, Sarah Jane Delany, Susan McKeever. Applying Positional Encoding to Enhance Vision-Language Transformers. In Petia Radeva, Giovanni Maria Farinella, Kadi Bouatouch, editors, Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2023, Volume 5: VISAPP, Lisbon, Portugal, February 19-21, 2023, pages 838-845. SCITEPRESS, 2023.
