Language-guided Multi-Modal Fusion for Video Action Recognition

Jenhao Hsiao, Yikang Li, Chiuman Ho. Language-guided Multi-Modal Fusion for Video Action Recognition. In IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2021, Montreal, BC, Canada, October 11-17, 2021, pages 3151-3155. IEEE, 2021.

Authors

Jenhao Hsiao

Yikang Li

Chiuman Ho
