HSVLT: Hierarchical Scale-Aware Vision-Language Transformer for Multi-Label Image Classification

Shuyi Ouyang, Hongyi Wang, Ziwei Niu, Zhenjia Bai, Shiao Xie, Yingying Xu, Ruofeng Tong, Yen-Wei Chen, Lanfen Lin. HSVLT: Hierarchical Scale-Aware Vision-Language Transformer for Multi-Label Image Classification. In Abdulmotaleb El-Saddik, Tao Mei, Rita Cucchiara, Marco Bertini, Diana Patricia Tobon Vallejo, Pradeep K. Atrey, M. Shamim Hossain, editors, Proceedings of the 31st ACM International Conference on Multimedia, MM 2023, Ottawa, ON, Canada, 29 October 2023 - 3 November 2023, pages 4768-4777. ACM, 2023.

Authors

Shuyi Ouyang

Hongyi Wang

Ziwei Niu

Zhenjia Bai

Shiao Xie

Yingying Xu

Ruofeng Tong

Yen-Wei Chen

Lanfen Lin
