Polos: Multimodal Metric Learning from Human Feedback for Image Captioning

Yuiga Wada, Kanta Kaneda, Daichi Saito, Komei Sugiura. Polos: Multimodal Metric Learning from Human Feedback for Image Captioning. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024, Seattle, WA, USA, June 16-22, 2024, pages 13559-13568. IEEE, 2024.

Authors

Yuiga Wada

Kanta Kaneda

Daichi Saito

Komei Sugiura
