The following publications may be variants of this publication:
- Spatiotemporal module for video saliency prediction based on self-attention. Yuhao Wang, Zhuoran Liu, Yibo Xia, Chunbo Zhu, Danpei Zhao. ivc, 112:104216, 2021. [doi]
- Residual attention-based LSTM for video captioning. XiangPeng Li, Zhilong Zhou, Lijiang Chen, Lianli Gao. www, 22(2):621-636, 2019. [doi]
- An attention based dual learning approach for video captioning. Wanting Ji, Ruili Wang, Yan Tian, Xun Wang. asc, 117:108332, 2022. [doi]
- Hierarchical attention-based multimodal fusion for video captioning. Chunlei Wu, Yiwei Wei, Xiaoliang Chu, Weichen Sun, Fei Su, Leiquan Wang. ijon, 315:362-370, 2018. [doi]
- Video Captioning With Attention-Based LSTM and Semantic Consistency. Lianli Gao, Zhao Guo, Hanwang Zhang, Xing Xu, Heng Tao Shen. tmm, 19(9):2045-2055, 2017. [doi]
- Dense video captioning based on local attention. Yong Qian, Yingchi Mao, Zhihao Chen, Chang Li, Olano Teah Bloh, Qian Huang. iet-ipr, 17(9):2673-2685, 2023. [doi]
- STAM: A SpatioTemporal Attention Based Memory for Video Prediction. Zheng Chang 0002, Xinfeng Zhang 0001, Shanshe Wang, Siwei Ma, Wen Gao 0001. tmm, 25:2354-2367, 2023. [doi]
- A novel spatiotemporal attention enhanced discriminative network for video salient object detection. Bing Liu 0022, Kezhou Mu, Mingzhu Xu, Fangyuan Wang, Lei Feng. apin, 52(6):5922-5937, 2022. [doi]