CLAPSep: Leveraging Contrastive Pre-Trained Model for Multi-Modal Query-Conditioned Target Sound Extraction

Hao Ma, Zhiyuan Peng, Xu Li, Mingjie Shao, Xixin Wu, Ju Liu. CLAPSep: Leveraging Contrastive Pre-Trained Model for Multi-Modal Query-Conditioned Target Sound Extraction. IEEE Transactions on Audio, Speech & Language Processing, 32:4945-4960, 2024.

@article{MaPLSWL24,
  title = {CLAPSep: Leveraging Contrastive Pre-Trained Model for Multi-Modal Query-Conditioned Target Sound Extraction},
  author = {Hao Ma and Zhiyuan Peng and Xu Li and Mingjie Shao and Xixin Wu and Ju Liu},
  year = {2024},
  doi = {10.1109/TASLP.2024.3497586},
  url = {https://doi.org/10.1109/TASLP.2024.3497586},
  researchr = {https://researchr.org/publication/MaPLSWL24},
  cites = {0},
  citedby = {0},
  journal = {IEEE Transactions on Audio, Speech \& Language Processing},
  volume = {32},
  pages = {4945--4960},
}