Multi-Granularity Contrastive Cross-Modal Collaborative Generation for End-to-End Long-Term Video Question Answering

Ting Yu, Kunhao Fu, Jian Zhang, Qingming Huang, Jun Yu. Multi-Granularity Contrastive Cross-Modal Collaborative Generation for End-to-End Long-Term Video Question Answering. IEEE Transactions on Image Processing, 33:3115-3129, 2024.
