SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs

Lijun Yu, Yong Cheng, Zhiruo Wang, Vivek Kumar, Wolfgang Macherey, Yanping Huang, David A. Ross, Irfan Essa, Yonatan Bisk, Ming-Hsuan Yang 0001, Kevin P. Murphy, Alexander G. Hauptmann, Lu Jiang 0004. SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs. In Alice Oh, Tristan Naumann, Amir Globerson, Kate Saenko, Moritz Hardt, Sergey Levine, editors, Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10-16, 2023.

Authors

Lijun Yu

Yong Cheng

Zhiruo Wang

Vivek Kumar

Wolfgang Macherey

Yanping Huang

David A. Ross

Irfan Essa

Yonatan Bisk

Ming-Hsuan Yang 0001

Kevin P. Murphy

Alexander G. Hauptmann

Lu Jiang 0004
