CAT-ViL: Co-attention Gated Vision-Language Embedding for Visual Question Localized-Answering in Robotic Surgery

Long Bai, Mobarakol Islam, Hongliang Ren. CAT-ViL: Co-attention Gated Vision-Language Embedding for Visual Question Localized-Answering in Robotic Surgery. In Hayit Greenspan, Anant Madabhushi, Parvin Mousavi, Septimiu Salcudean, James Duncan, Tanveer F. Syeda-Mahmood, Russell H. Taylor, editors, Medical Image Computing and Computer Assisted Intervention - MICCAI 2023 - 26th International Conference, Vancouver, BC, Canada, October 8-12, 2023, Proceedings, Part IX. Volume 14228 of Lecture Notes in Computer Science, pages 397-407, Springer, 2023. doi:10.1007/978-3-031-43996-4_38

@inproceedings{BaiIR23a,
  title = {CAT-ViL: Co-attention Gated Vision-Language Embedding for Visual Question Localized-Answering in Robotic Surgery},
  author = {Long Bai and Mobarakol Islam and Hongliang Ren},
  year = {2023},
  doi = {10.1007/978-3-031-43996-4_38},
  url = {https://doi.org/10.1007/978-3-031-43996-4_38},
  researchr = {https://researchr.org/publication/BaiIR23a},
  cites = {0},
  citedby = {0},
  pages = {397-407},
  booktitle = {Medical Image Computing and Computer Assisted Intervention - MICCAI 2023 - 26th International Conference, Vancouver, BC, Canada, October 8-12, 2023, Proceedings, Part IX},
  editor = {Hayit Greenspan and Anant Madabhushi and Parvin Mousavi and Septimiu Salcudean and James Duncan and Tanveer F. Syeda-Mahmood and Russell H. Taylor},
  volume = {14228},
  series = {Lecture Notes in Computer Science},
  publisher = {Springer},
  isbn = {978-3-031-43996-4},
}