Transformer vision-language tracking via proxy token guided cross-modal fusion

Haojie Zhao, Xiao Wang, Dong Wang, Huchuan Lu, Xiang Ruan. Transformer vision-language tracking via proxy token guided cross-modal fusion. Pattern Recognition Letters, 168:10-16, April 2023.
