Honglie Chen, Weidi Xie, Triantafyllos Afouras, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman. Localizing Visual Sounds the Hard Way. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, virtual, June 19-25, 2021. Pages 16867-16876, Computer Vision Foundation / IEEE, 2021.