DEJA-VU: Double Feature Presentation and Iterated Loss in Deep Transformer Networks

Andros Tjandra, Chunxi Liu, Frank Zhang, Xiaohui Zhang, Yongqiang Wang, Gabriel Synnaeve, Satoshi Nakamura, Geoffrey Zweig. DEJA-VU: Double Feature Presentation and Iterated Loss in Deep Transformer Networks. In 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2020, Barcelona, Spain, May 4-8, 2020, pages 6899-6903. IEEE, 2020.

@inproceedings{TjandraLZZWS0Z20,
  title = {DEJA-VU: Double Feature Presentation and Iterated Loss in Deep Transformer Networks},
  author = {Andros Tjandra and Chunxi Liu and Frank Zhang and Xiaohui Zhang and Yongqiang Wang and Gabriel Synnaeve and Satoshi Nakamura and Geoffrey Zweig},
  year = {2020},
  doi = {10.1109/ICASSP40776.2020.9052964},
  url = {https://doi.org/10.1109/ICASSP40776.2020.9052964},
  researchr = {https://researchr.org/publication/TjandraLZZWS0Z20},
  cites = {0},
  citedby = {0},
  pages = {6899--6903},
  booktitle = {2020 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2020, Barcelona, Spain, May 4-8, 2020},
  publisher = {IEEE},
  isbn = {978-1-5090-6631-5},
}