Towards Multilingual spoken Visual Question Answering system using Cross-Attention

Amartya Roy Chowdhury, Tonmoy Rajkhowa, Sanjeev Sharma. Towards Multilingual spoken Visual Question Answering system using Cross-Attention. In Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert, editors, Proceedings of the 31st International Conference on Computational Linguistics, COLING 2025, Abu Dhabi, UAE, January 19-24, 2025, pages 9165-9175. Association for Computational Linguistics, 2025.

@inproceedings{ChowdhuryRS25,
  title = {Towards Multilingual spoken Visual Question Answering system using Cross-Attention},
  author = {Amartya Roy Chowdhury and Tonmoy Rajkhowa and Sanjeev Sharma},
  year = {2025},
  url = {https://aclanthology.org/2025.coling-main.615/},
  researchr = {https://researchr.org/publication/ChowdhuryRS25},
  pages = {9165--9175},
  booktitle = {Proceedings of the 31st International Conference on Computational Linguistics, COLING 2025, Abu Dhabi, UAE, January 19-24, 2025},
  editor = {Owen Rambow and Leo Wanner and Marianna Apidianaki and Hend Al-Khalifa and Barbara Di Eugenio and Steven Schockaert},
  publisher = {Association for Computational Linguistics},
  isbn = {979-8-89176-196-4},
}