Xinsheng Wang, Justin van der Hout, Jihua Zhu, Mark Hasegawa-Johnson, Odette Scharenborg. Synthesizing Spoken Descriptions of Images. IEEE Transactions on Audio, Speech & Language Processing, 29:3242-3254, 2021.