Multilayer Vision and Language Augmented Transformer for Image Captioning

Qiang Su, Zhixin Li. Multilayer Vision and Language Augmented Transformer for Image Captioning. In Fenrong Liu, Arun Anand Sadanandan, Duc Nghia Pham, Petrus Mursanto, Dickson Lukose, editors, PRICAI 2023: Trends in Artificial Intelligence - 20th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2023, Jakarta, Indonesia, November 15-19, 2023, Proceedings, Part II. Volume 14326 of Lecture Notes in Computer Science, pages 210-222, Springer, 2023. doi:10.1007/978-981-99-7022-3_19

@inproceedings{SuL23-6,
  title = {Multilayer Vision and Language Augmented Transformer for Image Captioning},
  author = {Qiang Su and Zhixin Li},
  year = {2023},
  doi = {10.1007/978-981-99-7022-3_19},
  url = {https://doi.org/10.1007/978-981-99-7022-3_19},
  researchr = {https://researchr.org/publication/SuL23-6},
  pages = {210--222},
  booktitle = {PRICAI 2023: Trends in Artificial Intelligence - 20th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2023, Jakarta, Indonesia, November 15-19, 2023, Proceedings, Part II},
  editor = {Fenrong Liu and Arun Anand Sadanandan and Duc Nghia Pham and Petrus Mursanto and Dickson Lukose},
  volume = {14326},
  series = {Lecture Notes in Computer Science},
  publisher = {Springer},
  isbn = {978-981-99-7022-3},
}