Generating Natural Video Descriptions via Multimodal Processing

Qin Jin, Junwei Liang, Xiaozhu Lin. Generating Natural Video Descriptions via Multimodal Processing. In Nelson Morgan, editor, Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016. Pages 570-574, ISCA, 2016. doi:10.21437/Interspeech.2016-380

@inproceedings{JinLL16-1,
  title = {Generating Natural Video Descriptions via Multimodal Processing},
  author = {Qin Jin and Junwei Liang and Xiaozhu Lin},
  year = {2016},
  doi = {10.21437/Interspeech.2016-380},
  url = {http://dx.doi.org/10.21437/Interspeech.2016-380},
  researchr = {https://researchr.org/publication/JinLL16-1},
  cites = {0},
  citedby = {0},
  pages = {570--574},
  booktitle = {Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016},
  editor = {Nelson Morgan},
  publisher = {ISCA},
}