Separate in the Speech Chain: Cross-Modal Conditional Audio-Visual Target Speech Extraction

Zhaoxi Mu, Xinyu Yang. Separate in the Speech Chain: Cross-Modal Conditional Audio-Visual Target Speech Extraction. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI 2024, Jeju, South Korea, August 3-9, 2024. pages 6415-6423, ijcai.org, 2024.

@inproceedings{MuY24,
  title = {Separate in the Speech Chain: Cross-Modal Conditional Audio-Visual Target Speech Extraction},
  author = {Zhaoxi Mu and Xinyu Yang},
  year = {2024},
  url = {https://www.ijcai.org/proceedings/2024/709},
  researchr = {https://researchr.org/publication/MuY24},
  cites = {0},
  citedby = {0},
  pages = {6415--6423},
  booktitle = {Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI 2024, Jeju, South Korea, August 3-9, 2024},
  publisher = {ijcai.org},
  isbn = {978-1-956792-04-1},
}