SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations

Paul-Ambroise Duquenne, Hongyu Gong, Ning Dong, Jingfei Du, Ann Lee, Vedanuj Goswami, Changhan Wang, Juan Pino, Benoît Sagot, Holger Schwenk. SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations. In Anna Rogers, Jordan L. Boyd-Graber, Naoaki Okazaki, editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023. Pages 16251-16269, Association for Computational Linguistics, 2023.

@inproceedings{DuquenneGDD0GW023,
  title = {SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations},
  author = {Paul-Ambroise Duquenne and Hongyu Gong and Ning Dong and Jingfei Du and Ann Lee and Vedanuj Goswami and Changhan Wang and Juan Pino and Benoît Sagot and Holger Schwenk},
  year = {2023},
  url = {https://aclanthology.org/2023.acl-long.899},
  researchr = {https://researchr.org/publication/DuquenneGDD0GW023},
  cites = {0},
  citedby = {0},
  pages = {16251--16269},
  booktitle = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023},
  editor = {Anna Rogers and Jordan L. Boyd-Graber and Naoaki Okazaki},
  publisher = {Association for Computational Linguistics},
  isbn = {978-1-959429-72-2},
}