RILA: Reflective and Imaginative Language Agent for Zero-Shot Semantic Audio-Visual Navigation

Zeyuan Yang, Jiageng Lin, Peihao Chen, Anoop Cherian, Tim K. Marks, Jonathan Le Roux, Chuang Gan. RILA: Reflective and Imaginative Language Agent for Zero-Shot Semantic Audio-Visual Navigation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024, Seattle, WA, USA, June 16-22, 2024. pages 16251-16261, IEEE, 2024.
