Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer

Yanpeng Zhao, Jack Hessel, Youngjae Yu, Ximing Lu, Rowan Zellers, Yejin Choi. Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer. In Marine Carpuat, Marie-Catherine de Marneffe, Iván Vladimir Meza Ruíz, editors, Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022, Seattle, WA, United States, July 10-15, 2022, pages 4492-4507. Association for Computational Linguistics, 2022.

Abstract

Abstract is missing.