Leveraging Text Representation and Face-head Tracking for Long-form Multimodal Semantic Relation Understanding

Raksha Ramesh, Vishal Anand, Zifan Chen, Yifei Dong, Yun Chen, Ching-Yung Lin. Leveraging Text Representation and Face-head Tracking for Long-form Multimodal Semantic Relation Understanding. In João Magalhães, Alberto Del Bimbo, Shin'ichi Satoh, Nicu Sebe, Xavier Alameda-Pineda, Qin Jin, Vincent Oria, Laura Toni, editors, MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10-14, 2022. Pages 7215-7219, ACM, 2022.

Authors

Raksha Ramesh

Vishal Anand

Zifan Chen

Yifei Dong

Yun Chen

Ching-Yung Lin
