Jules Samaran, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima. Attending Self-Attention: A Case Study of Visually Grounded Supervision in Vision-and-Language Transformers. In Jad Kabbara, Haitao Lin, Amandalynne Paullada, Jannis Vamvas, editors, Proceedings of the ACL-IJCNLP 2021 Student Research Workshop, ACL 2021, Online, July 5-10, 2021. Pages 81-86, Association for Computational Linguistics, 2021.