Fully-attentive iterative networks for region-based controllable image and video captioning

Marcella Cornia, Lorenzo Baraldi, Ayellet Tal, Rita Cucchiara. Fully-attentive iterative networks for region-based controllable image and video captioning. Computer Vision and Image Understanding, 237:103857, December 2023. doi:10.1016/j.cviu.2023.103857

@article{CorniaBTC23,
  title = {Fully-attentive iterative networks for region-based controllable image and video captioning},
  author = {Marcella Cornia and Lorenzo Baraldi and Ayellet Tal and Rita Cucchiara},
  year = {2023},
  month = {December},
  doi = {10.1016/j.cviu.2023.103857},
  url = {https://doi.org/10.1016/j.cviu.2023.103857},
  researchr = {https://researchr.org/publication/CorniaBTC23},
  cites = {0},
  citedby = {0},
  journal = {Computer Vision and Image Understanding},
  volume = {237},
  pages = {103857},
}