The following publications are possibly variants of this publication:
- Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning. Jingwen Chen, Yingwei Pan, Yehao Li, Ting Yao, Hongyang Chao, Tao Mei. AAAI 2019: 8167-8174 [doi]
- Retrieval-Augmented Egocentric Video Captioning. Jilan Xu, Yifei Huang, Junlin Hou, Guo Chen, Yuejie Zhang, Rui Feng, Weidi Xie. CVPR 2024: 13525-13536 [doi]
- Interaction augmented transformer with decoupled decoding for video captioning. Tao Jin, Zhou Zhao, Peng Wang, Jun Yu 0002, Fei Wu 0001. Neurocomputing, 492:496-507, 2022. [doi]
- Retrieval-augmented Image Captioning. Rita Ramos, Desmond Elliott, Bruno Martins 0001. EACL 2023: 3648-3663 [doi]