DEJA-VU: Double Feature Presentation and Iterated Loss in Deep Transformer Networks

Andros Tjandra, Chunxi Liu, Frank Zhang, Xiaohui Zhang, Yongqiang Wang, Gabriel Synnaeve, Satoshi Nakamura 0001, Geoffrey Zweig. DEJA-VU: Double Feature Presentation and Iterated Loss in Deep Transformer Networks. In 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2020, Barcelona, Spain, May 4-8, 2020. Pages 6899-6903, IEEE, 2020.

Authors

Andros Tjandra

Chunxi Liu

Frank Zhang

Xiaohui Zhang

Yongqiang Wang

Gabriel Synnaeve

Satoshi Nakamura 0001

Geoffrey Zweig
