Synthesizing Spoken Descriptions of Images

Xinsheng Wang, Justin van der Hout, Jihua Zhu, Mark Hasegawa-Johnson, Odette Scharenborg. Synthesizing Spoken Descriptions of Images. IEEE Transactions on Audio, Speech & Language Processing, 29:3242-3254, 2021.