Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events

Wim Boes, Hugo Van Hamme. Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events. In Laurent Amsaleg, Benoit Huet, Martha Larson, Guillaume Gravier, Hayley Hung, Chong-Wah Ngo, Wei Tsang Ooi, editors, Proceedings of the 27th ACM International Conference on Multimedia, MM 2019, Nice, France, October 21-25, 2019. Pages 1961-1969, ACM, 2019. doi:10.1145/3343031.3350873

@inproceedings{Boesh19,
  title = {Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events},
  author = {Wim Boes and Hugo Van Hamme},
  year = {2019},
  doi = {10.1145/3343031.3350873},
  url = {https://doi.org/10.1145/3343031.3350873},
  researchr = {https://researchr.org/publication/Boesh19},
  cites = {0},
  citedby = {0},
  pages = {1961--1969},
  booktitle = {Proceedings of the 27th ACM International Conference on Multimedia, MM 2019, Nice, France, October 21-25, 2019},
  editor = {Laurent Amsaleg and Benoit Huet and Martha Larson and Guillaume Gravier and Hayley Hung and Chong-Wah Ngo and Wei Tsang Ooi},
  publisher = {ACM},
  isbn = {978-1-4503-6889-6},
}