TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining

Paul Primus, Florian Schmid, Gerhard Widmer. TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2025, Tahoe City, CA, USA, October 12-15, 2025. Pages 1-5, IEEE, 2025. doi:10.1109/WASPAA66052.2025.11230997

@inproceedings{PrimusSW25,
  title = {TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining},
  author = {Paul Primus and Florian Schmid and Gerhard Widmer},
  year = {2025},
  doi = {10.1109/WASPAA66052.2025.11230997},
  url = {https://doi.org/10.1109/WASPAA66052.2025.11230997},
  researchr = {https://researchr.org/publication/PrimusSW25},
  pages = {1--5},
  booktitle = {IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2025, Tahoe City, CA, USA, October 12-15, 2025},
  publisher = {IEEE},
  isbn = {979-8-3315-3745-6},
}