Show, Think, and Tell: Thought-Augmented Fine-Tuning of Large Language Models for Video Captioning

Byoungjip Kim, Dasol Hwang, Sungjun Cho, Youngsoo Jang, Honglak Lee, Moontae Lee. Show, Think, and Tell: Thought-Augmented Fine-Tuning of Large Language Models for Video Captioning. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024 - Workshops, Seattle, WA, USA, June 17-18, 2024, pages 1808-1817. IEEE, 2024.

Authors

Byoungjip Kim

Dasol Hwang

Sungjun Cho

Youngsoo Jang

Honglak Lee

Moontae Lee
