VICTR: Visual Information Captured Text Representation for Text-to-Vision Multimodal Tasks

Soyeon Caren Han, Siqu Long, Siwen Luo, Kunze Wang, Josiah Poon. VICTR: Visual Information Captured Text Representation for Text-to-Vision Multimodal Tasks. In Donia Scott, Núria Bel, Chengqing Zong, editors, Proceedings of the 28th International Conference on Computational Linguistics, COLING 2020, Barcelona, Spain (Online), December 8-13, 2020. pages 3107-3117, International Committee on Computational Linguistics, 2020.

@inproceedings{HanLLWP20,
  title = {VICTR: Visual Information Captured Text Representation for Text-to-Vision Multimodal Tasks},
  author = {Soyeon Caren Han and Siqu Long and Siwen Luo and Kunze Wang and Josiah Poon},
  year = {2020},
  url = {https://www.aclweb.org/anthology/2020.coling-main.277/},
  researchr = {https://researchr.org/publication/HanLLWP20},
  cites = {0},
  citedby = {0},
  pages = {3107--3117},
  booktitle = {Proceedings of the 28th International Conference on Computational Linguistics, COLING 2020, Barcelona, Spain (Online), December 8-13, 2020},
  editor = {Donia Scott and Núria Bel and Chengqing Zong},
  publisher = {International Committee on Computational Linguistics},
  isbn = {978-1-952148-27-9},
}