- Towards Fine-Grained Prosody Control for Voice Conversion. Zheng Lian, Rongxiu Zhong, Zhengqi Wen, Bin Liu, Jianhua Tao. 1-5
- Prosody and Dialogue Act: A Perceptual Study on Chinese Interrogatives. Gan Huang, Aijun Li, Sichen Zhang, Liang Zhang. 1-5
- Tone Realization in Mandarin Speech: A Large Corpus Based Study of Disyllabic Words. Yaru Wu, Lori Lamel, Martine Adda-Decker. 1-5
- Automatic Speaker-level Pronunciation Assessment of L2 Speech Using Posterior Probabilities from Multiple Utterances. Guolei Jiang, Chunhong Liao, Kun Li 0003, Pengfei Liu, Linying Jiang, Helen Meng. 1-5
- Accent and Speaker Disentanglement in Many-to-many Voice Conversion. Zhichao Wang, Wenshuo Ge, Xiong Wang, Shan Yang, Wendong Gan, Haitao Chen, Hai Li, Lei Xie, Xiulin Li. 1-5
- Exploring Cross-lingual Singing Voice Synthesis Using Speech Data. Yuewen Cao, Songxiang Liu, Shiyin Kang, Na Hu, Peng Liu, Xunying Liu, Dan Su 0002, Dong Yu 0001, Helen Meng. 1-5
- Acoustical Characteristics of the Cantonese Vowels and Tones Produced by Hearing Impaired Speakers. Wai-Sum Lee, Irene Ching-Yin Tsoi. 1-5
- Speaker Charisma Analyzed through the Cultural Lens. Anna Gutnyk, Oliver Niebuhr, Wentao Gu. 1-5
- An Eye-tracking Study of Transposed-letter Effect in English Word Recognition by Mandarin Speakers. Huan Lei, Jianwu Dang, Yu Chen. 1-5
- Non-autoregressive Deliberation-Attention based End-to-End ASR. Changfeng Gao, Gaofeng Cheng, Jun Zhou, Pengyuan Zhang, Yonghong Yan 0002. 1-5
- A New Method for Improving Generative Adversarial Networks in Speech Enhancement. Fan Yang, Junfeng Li, Yonghong Yan 0002. 1-5
- Non-parallel Sequence-to-Sequence Voice Conversion for Arbitrary Speakers. Ying Zhang, Hao Che, Xiaorui Wang. 1-5
- Improves Neural Acoustic Word Embeddings Query by Example Spoken Term Detection with Wav2vec Pretraining and Circle Loss. Zhaoqi Li, Long Wu, Ta Li, Yonghong Yan 0002. 1-5
- An Attention-augmented Fully Convolutional Neural Network for Monaural Speech Enhancement. Zezheng Xu, Ting Jiang, Chao Li, JiaCheng Yu. 1-5
- GAN-Based Inter-Channel Amplitude Ratio Decoding in Multi-Channel Speech Coding. Jinru Zhu, Changchun Bao. 1-5
- Context-dependent Label Smoothing Regularization for Attention-based End-to-End Code-Switching Speech Recognition. Zheying Huang, Peng Li, Ji Xu, Pengyuan Zhang, Yonghong Yan 0002. 1-5
- Leveraging Text Data Using Hybrid Transformer-LSTM Based End-to-End ASR in Transfer Learning. Zhiping Zeng, Van Tung Pham, Haihua Xu, Yerbolat Khassanov, Eng Siong Chng, Chongjia Ni, Bin Ma. 1-5
- Order-aware Pairwise Intoxication Detection. Meng Ge, Ruixiong Zhang, Wei Zou, Xiangang Li, Cheng Gong, Longbiao Wang, Jianwu Dang. 1-5
- A Practical Way to Improve Automatic Phonetic Segmentation Performance. Wenjie Peng, Yingming Gao, Binghuai Lin, Jinsong Zhang. 1-5
- Capsule Network based End-to-end System for Detection of Replay Attacks. Meidan Ouyang, Rohan Kumar Das, Jichen Yang, Haizhou Li 0001. 1-5
- Adversarial Training for Multi-domain Speaker Recognition. Qing Wang, Wei Rao, Pengcheng Guo, Lei Xie. 1-5
- A Comparison Study on the Alignment of Prosodic and Semantic Units and Its Effects on F0 Shifting in L1 and L2 English Spontaneous Speech. Yuqing Zhang, Zhu Li, Jinsong Zhang. 1-5
- LDA-based Speaker Verification in Multi-Enrollment Scenario using Expected Vector Approach. Meet H. Soni, Ashish Panda. 1-5
- Unsupervised Cross-Lingual Speech Emotion Recognition Using Domain Adversarial Neural Network. Xiong Cai, Zhiyong Wu, Kuo Zhong, Bin Su, Dongyang Dai, Helen Meng. 1-5
- Towards Realizing Sign Language to Emotional Speech Conversion by Deep Learning. Weizhe Wang, Hongwu Yang. 1-5
- Multi-Scale Model for Mandarin Tone Recognition. Linkai Peng, Wang Dai, Dengfeng Ke, Jinsong Zhang. 1-5
- ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders. Yu Gu, Xiang Yin, Yonghui Rao, Yuan Wan, Benlai Tang, Yang Zhang, Jitong Chen, Yuxuan Wang, Zejun Ma. 1-5
- Comparing the Rhythm of Instrumental Music and Vocal Music in Mandarin and English. Lujia Yang, Hongwei Ding. 1-5
- Transformer-based Empathetic Response Generation Using Dialogue Situation and Advanced-Level Definition of Empathy. Yi Hsuan Wang, Jia-Hao Hsu, Chung-Hsien Wu, Tsung-Hsien Yang. 1-5
- Production of Tone 3 Sandhi by Advanced Korean Learners of Mandarin. Xin Li, Yin Huang, Yunheng Xu, Linxin Yi, Yuming Yuan, Min Xiang. 1-5
- Improved End-to-End Dysarthric Speech Recognition via Meta-learning Based Model Re-initialization. Disong Wang, Jianwei Yu, Xixin Wu, Lifa Sun, Xunying Liu, Helen Meng. 1-5
- Controllable Emotion Transfer For End-to-End Speech Synthesis. Tao Li, Shan Yang, Liumeng Xue, Lei Xie. 1-5
- Age-Invariant Speaker Embedding for Diarization of Cognitive Assessments. Sean Shensheng Xu, Man-Wai Mak, Ka-Ho Wong, Helen Meng, Timothy C. Y. Kwok. 1-5
- Rapid Word Learning of Children with Cochlear Implants: Phonological Structure and Mutual Exclusivity. Yu-Chen Hung, Tzu-Hui Lin. 1-4
- Low-complexity Post-processing Method for Speech Enhancement. Feng Bao, Yuepeng Li, Shidong Shang. 1-5
- Articulatory and Acoustic Features of Mandarin /ɹ/: A Preliminary Study. Shu-Wen Chen, Peggy Pik Ki Mok. 1-5
- Hierarchically Attending Time-Frequency and Channel Features for Improving Speaker Verification. Chenglong Wang, Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengkun Tian. 1-5
- Consonantal Effects of Aspiration on Onset F0 in Cantonese. Xinran Ren, Peggy Mok. 1-5
- Phase Spectrum Recovery for Enhancing Low-Quality Speech Captured by Laser Microphones. Chang Liu, Yang Ai, Zhenhua Ling. 1-5
- Acoustic Word Embedding System for Code-Switching Query-by-example Spoken Term Detection. Murong Ma, Haiwei Wu, Xuyang Wang, Lin Yang, Junjie Wang, Ming Li. 1-5
- An Experimental Research on Tonal Errors in Monosyllables of Standard Spoken Chinese Language Produced by Uyghur Learners. Qiuyuan Li, Yuan Jia. 1-5
- A Model Ensemble Approach for Sound Event Localization and Detection. Qing Wang, Huaxin Wu, Zijun Jing, Feng Ma, Yi Fang, Yuxuan Wang, Tairan Chen, Jia Pan, Jun Du, Chin-Hui Lee. 1-5
- Dialogue Act Recognition using Branch Architecture with Attention Mechanism for Imbalanced Data. Mengfei Wu, Longbiao Wang, Yuke Si, Jianwu Dang. 1-5
- The Acoustic Correlates and Time Span of the Non-modal Phonation in Kunshan Wu Chinese. Wenwei Xu, Peggy Mok. 1-5
- Automatic Detection of Word-Level Reading Errors in Non-native English Speech Based on ASR Output. Ying Qin, Yao Qian, Anastassia Loukina, Patrick L. Lange, Abhinav Misra, Keelan Evanini, Tan Lee. 1-5
- Improving Attention-based End-to-end ASR by Incorporating an N-gram Neural Network. Junyi Ao, Tom Ko. 1-5
- Audio Caption in a Car Setting with a Sentence-Level Loss. Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu 0004. 1-5
- Speech Emotion Recognition Based on Acoustic Segment Model. SiYuan Zheng, Jun Du, Hengshun Zhou, Xue Bai, Chin-Hui Lee, Shipeng Li. 1-5
- Spoken Language Understanding with Sememe Knowledge as Domain Knowledge. Sixia Li, Jianwu Dang, Longbiao Wang. 1-5
- UNet++-Based Multi-Channel Speech Dereverberation and Distant Speech Recognition. Tuo Zhao, Yunxin Zhao, Shaojun Wang, Mei Han. 1-5
- Age-Related Decline of Classifier Usage in Southwestern Mandarin. Yun-feng, Yan Feng, Chenwei Xie, William Shi-Yuan Wang. 1-5
- Speaker Embedding Augmentation with Noise Distribution Matching. Xun Gong, Zhengyang Chen, Yexin Yang, Shuai Wang, Lan Wang, Yanmin Qian. 1-5
- Channel Interdependence Enhanced Speaker Embeddings for Far-Field Speaker Verification. Ling-Jun Zhao, Man-Wai Mak. 1-5
- Automatic Extraction of Semantic Patterns in Dialogs using Convex Polytopic Model. Jingyan Zhou, Xiaoying Zhang, Xiaohan Feng, King Keung Wu, Helen Meng. 1-5
- An Investigation of Positional Encoding in Transformer-based End-to-end Speech Recognition. Fengpeng Yue, Tom Ko. 1-5
- Rnn-transducer With Language Bias For End-to-end Mandarin-English Code-switching Speech Recognition. Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Jianhua Tao, Ye Bai. 1-5
- Usability And Practicality of Speech Recording by Mobile Phones for Phonetic Analysis. Yihan Guan, Bin Li. 1-5
- Text Enhancement for Paragraph Processing in End-to-End Code-switching TTS. Chunyu Qiang, Jianhua Tao, Ruibo Fu, Zhengqi Wen, Jiangyan Yi, Tao Wang, Shiming Wang. 1-5
- Context-aware RNNLM Rescoring for Conversational Speech Recognition. Kun Wei, Pengcheng Guo, Hang Lv 0001, Zhen Tu, Lei Xie. 1-5
- Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning. Cunhang Fan, Bin Liu 0041, Jianhua Tao, Jiangyan Yi, Zhengqi Wen, Leichao Song. 1-5
- Impact of Mismatched Spectral Amplitude Levels on Vowel Identification in Simulated Electric-acoustic Hearing. Changjie Pan, Fei Chen 0011. 1-5
- Sams-Net: A Sliced Attention-based Neural Network for Music Source Separation. Tingle Li, Jiawei Chen, Haowen Hou, Ming Li. 1-5
- MoEVC: A Mixture of Experts Voice Conversion System With Sparse Gating Mechanism for Online Computation Acceleration. Yu-Tao Chang, Yuan-Hong Yang, Yu-Huai Peng, Syu-Siang Wang, Tai-Shih Chi, Yu Tsao 0001, Hsin-Min Wang. 1-5
- Prosodic Profiles of the Mandarin Speech Conveying Ironic Compliment. Shanpeng Li, Wentao Gu. 1-5
- Complex Patterns of Tonal Realization in Taifeng Chinese. Xiaoyan Zhang, Aijun Li, Zhiqiang Li. 1-5
- Frequency-specific Brain Network Dynamics during Perceiving Real Words and Pseudowords. Taiyang Guo, Jianwu Dang, Gaoyan Zhang, Bin Zhao, Masashi Unoki. 1-5
- Estimating Mutual Information in Prosody Representation for Emotional Prosody Transfer in Speech Synthesis. Guangyan Zhang, Shirong Qiu, Ying Qin, Tan Lee. 1-5
- Effects of Mandarin Tones on Acoustic Cue Weighting Patterns for Prominence. Wei Zhang, Meghan Clayards, Jinsong Zhang. 1-5
- Approaches to Improving Recognition of Underrepresented Named Entities in Hybrid ASR Systems. Tingzhi Mao, Yerbolat Khassanov, Van Tung Pham, Haihua Xu, Hao Huang, Eng Siong Chng. 1-5
- Revisiting the Statistics Pooling Layer in Deep Speaker Embedding Learning. Shuai Wang, Yexin Yang, Yanmin Qian, Kai Yu 0004. 1-5
- Syllable-Based Acoustic Modeling With Lattice-Free MMI for Mandarin Speech Recognition. Jie Li, Zhiyun Fan, Xiaorui Wang, Yan Li. 1-5
- On Adaptive LASSO-based Sparse Time-Varying Complex AR Speech Analysis. Keiichi Funaki. 1-5