Journal: IEEE Transactions on Audio, Speech & Language Processing

Volume 27, Issue 12

1852 -- 1867 Natsuki Ueno, Shoichi Koyama, Hiroshi Saruwatari. Three-Dimensional Sound Field Reproduction Based on Weighted Mode-Matching Method
1868 -- 1879 Lijun Wu, Xu Tan, Tao Qin, Jianhuang Lai, Tie-Yan Liu. Beyond Error Propagation: Language Branching Also Affects the Accuracy of Sequence Generation
1880 -- 1892 Amit Das, Jinyu Li, Guoli Ye, Rui Zhao, Yifan Gong. Advancing Acoustic-to-Word CTC Model With Attention and Mixed-Units
1893 -- 1905 Niccolo Antonello, Enzo De Sena, Marc Moonen, Patrick A. Naylor, Toon van Waterschoot. Joint Acoustic Localization and Dereverberation Through Plane Wave Decomposition and Sparse Regularization
1906 -- 1918 Federico Borra, Alberto Bernardini, Fabio Antonacci, Augusto Sarti. Uniform Linear Arrays of First-Order Steerable Differential Microphones
1919 -- 1931 Li Chai, Jun Du, Qing-Feng Liu, Chin-Hui Lee. Using Generalized Gaussian Distributions to Improve Regression Error Modeling for Deep Learning-Based Speech Enhancement
1932 -- 1943 Jun Qi, Jun Du, Sabato Marco Siniscalchi, Chin-Hui Lee. A Theory on Deep Neural Network Based Vector-to-Vector Regression With an Illustration of Its Expressive Power in Speech Enhancement
1944 -- 1956 Xudong Dang, Qi Cheng, Hongyan Zhu. Indoor Multiple Sound Source Localization via Multi-Dimensional Assignment Data Association
1957 -- 1969 Martin Schneider 0009, Emanuel A. P. Habets. Iterative DFT-Domain Inverse Filter Optimization Using a Weighted Least-Squares Criterion
1970 -- 1984 Kehai Chen, Rui Wang 0015, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao. Neural Machine Translation With Sentence-Level Topic Context
1985 -- 1999 Alejandro Gómez Alanís, Antonio M. Peinado, José A. González 0001, Angel M. Gomez. A Gated Recurrent Convolutional Neural Network for Robust Spoofing Detection
2000 -- 2011 Siyuan Feng, Tan Lee. Exploiting Cross-Lingual Speaker and Phonetic Diversity for Unsupervised Subword Modeling
2012 -- 2024 Wei Li, Nancy F. Chen, Sabato Marco Siniscalchi, Chin-Hui Lee. Improving Mispronunciation Detection of Mandarin Tones for Non-Native Learners With Soft-Target Tone Labels and BLSTM-Based Deep Tone Models
2025 -- 2040 Quansheng Tu, Huawei Chen. On Mainlobe Orientation of the First- and Second-Order Differential Microphone Arrays
2041 -- 2053 Jan Chorowski, Ron J. Weiss, Samy Bengio, Aäron Van Den Oord. Unsupervised Speech Representation Learning Using WaveNet Autoencoders
2054 -- 2066 Vishnuvardhan Varanasi, Ayushya Agarwal, Rajesh M. Hegde. Near-Field Acoustic Source Localization Using Spherical Harmonic Features
2067 -- 2079 Yibin Zheng, Jianhua Tao, Zhengqi Wen, Jiangyan Yi. Forward-Backward Decoding Sequence for Regularizing End-to-End TTS
2080 -- 2091 Yanhui Tu, Jun Du, Chin-Hui Lee. Speech Enhancement Based on Teacher-Student Deep Learning Using Improved Speech Presence Probability for Noise-Robust Speech Recognition
2092 -- 2102 Yuzhou Liu, DeLiang Wang. Divide and Conquer: A Deep CASA Approach to Talker-Independent Monaural Speaker Separation
2103 -- 2112 Xuebo Liu, Derek F. Wong, Lidia S. Chao, Yang Liu 0005. Latent Attribute Based Hierarchical Decoder for Neural Machine Translation
2113 -- 2126 Jingyi Hu, Ning Chen. Enhanced Feature Summarizing for Effective Cover Song Identification
2127 -- 2139 Qianli Ma, Liuhong Yu, Shuai Tian, Enhuan Chen, Wing W. Y. Ng. Global-Local Mutual Attention Model for Text Classification
2140 -- 2149 Vesa Välimäki, Jussi Rämö. Neurally Controlled Graphic Equalizer
2150 -- 2161 Sean U. N. Wood, Johannes Stahl 0003, Pejman Mowlaee. Binaural Codebook-Based Speech Enhancement With Atomic Speech Presence Probability
2162 -- 2172 Lukas Pfeifenberger, Matthias Zöhrer, Franz Pernkopf. Eigenvector-Based Speech Mask Estimation for Multi-Channel Speech Enhancement
2173 -- 2182 Marc Arnela, Saeed Dabbaghchian, Oriol Guasch, Olov Engwall. MRI-Based Vocal Tract Representations for the Three-Dimensional Finite Element Synthesis of Diphthongs
2183 -- 2196 Varun Srivastava, Mayank Mishra. Adversarial Approximate Inference for Speech to Electroglottograph Conversion
2197 -- 2212 Kouhei Sekiguchi, Yoshiaki Bando, Aditya Arie Nugraha, Kazuyoshi Yoshii, Tatsuya Kawahara. Semi-Supervised Multichannel Speech Enhancement With a Deep Speech Prior
2213 -- 2222 Qipeng Guo, Xipeng Qiu, Xiangyang Xue, Zheng Zhang. Low-Rank and Locality Constrained Self-Attention for Sequence Modeling
2223 -- 2233 Jun Yu 0001, Qiang Ling, Changwei Luo, Chang Wen Chen. Synthesizing 3D Trump: Predicting and Visualizing the Relationship Between Text, Speech, and Articulatory Movements
2234 -- 2248 Ryosuke Sugiura, Yutaka Kamamoto, Takehiro Moriya. Shape Control of Discrete Generalized Gaussian Distributions for Frequency-Domain Audio Coding
2249 -- 2262 Zamir Ben-Hur, David Lou Alon, Ravish Mehra, Boaz Rafaely. Efficient Representation and Sparse Sampling of Head-Related Transfer Functions Using Phase-Correction Based on Ear Alignment
2263 -- 2277 Luca Remaggi, Philip J. B. Jackson, Wenwu Wang. Modeling the Comb Filter Effect and Interaural Coherence for Binaural Source Separation
2278 -- 2287 Biao Zhang 0002, Deyi Xiong, Jinsong Su, Jiebo Luo. Future-Aware Knowledge Distillation for Neural Machine Translation
2288 -- 2300 Randall Ali, Toon van Waterschoot, Marc Moonen. Integration of a Priori and Estimated Constraints Into an MVDR Beamformer for Speech Enhancement
2301 -- 2312 Nitya Tiwari, Prem C. Pandey. Speech Enhancement Using Noise Estimation With Dynamic Quantile Tracking
2313 -- 2325 Junwen Duan, Xiao Ding, Yue Zhang, Ting Liu 0001. TEND: A Target-Dependent Representation Learning Framework for News Document
2326 -- 2335 Lujun Zhao, Xipeng Qiu, Qi Zhang 0001, Xuanjing Huang. Sequence Labeling With Deep Gated Dual Path CNN
2336 -- 2349 Akihiro Kato, Tomi H. Kinnunen. Statistical Regression Models for Noise Robust F0 Estimation Using Recurrent Deep Neural Networks
2350 -- 2361 Dayiheng Liu, Jie Fu, Qian Qu, Jiancheng Lv. BFGAN: Backward and Forward Generative Adversarial Networks for Lexically Constrained Sentence Generation
2362 -- 2372 Andrés Marafioti, Nathanaël Perraudin, Nicki Holighaus, Piotr Majdak. A Context Encoder For Audio Inpainting
2373 -- 2384 Jichen Yang, Rohan Kumar Das, Nina Zhou. Extraction of Octave Spectra Information for Spoofing Attack Detection
2385 -- 2396 Oren Barkan, David Tsiris, Ori Katz, Noam Koenigstein. InverSynth: Deep Estimation of Synthesizer Parameter Configurations From Audio Signals