Applying Segment-Level Attention on Bi-Modal Transformer Encoder for Audio-Visual Emotion Recognition

Jia-Hao Hsu, Chung-Hsien Wu. Applying Segment-Level Attention on Bi-Modal Transformer Encoder for Audio-Visual Emotion Recognition. IEEE Transactions on Affective Computing, 14(4):3231-3243, October-December 2023.
