Natural-Language-Driven Multimodal Representation Learning for Audio-Visual Scene-Aware Dialog System

Yoonseok Heo, Sangwoo Kang, Jungyun Seo. Natural-Language-Driven Multimodal Representation Learning for Audio-Visual Scene-Aware Dialog System. Sensors, 23(18):7875, September 2023.
