Deep multimodal semantic embeddings for speech and images

David F. Harwath, James R. Glass. Deep multimodal semantic embeddings for speech and images. In 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015. Pages 237-244, IEEE, 2015.

@inproceedings{HarwathG15,
  title = {Deep multimodal semantic embeddings for speech and images},
  author = {David F. Harwath and James R. Glass},
  year = {2015},
  doi = {10.1109/ASRU.2015.7404800},
  url = {http://dx.doi.org/10.1109/ASRU.2015.7404800},
  researchr = {https://researchr.org/publication/HarwathG15},
  cites = {0},
  citedby = {0},
  pages = {237--244},
  booktitle = {2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015},
  publisher = {IEEE},
  isbn = {978-1-4799-7291-3},
}