Journal: IEEE Transactions on Audio, Speech & Language Processing

Volume 27, Issue 9

1349 -- 1364Randall Ali, Giuliano Bernardi, Toon van Waterschoot, Marc Moonen. Methods of Extending a Generalized Sidelobe Canceller With External Microphones
1365 -- 1377Xiaofei Li, Laurent Girin, Sharon Gannot, Radu Horaud. Multichannel Online Dereverberation Based on Spectral Magnitude Inverse Filtering
1378 -- 1391Lu Chen, Zhi Chen, Bowen Tan, Sishan Long, Milica Gasic, Kai Yu 0004. AgentGraph: Toward Universal Dialogue Management With Structured Deep Reinforcement Learning
1392 -- 1404Luoqin Li, Jiabing Wang, Jichang Li, Qianli Ma, Jia Wei. Relation Classification via Keyword-Attentive Sentence Mechanism and Synthetic Stimulation Loss
1405 -- 1418Martin Bo Møller, Jesper Kjær Nielsen, Efren Fernandez-Grande, Søren Krarup Olesen. On the Influence of Transfer Function Noise on Sound Zone Control in a Room
1419 -- 1431Zhen Xu, Chengjie Sun, Yinong Long, Bingquan Liu, Baoxun Wang, Mingjiang Wang, Min Zhang 0005, Xiaolong Wang 0001. Dynamic Working Memory for Context-Aware Response Generation
1432 -- 1443Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo. ACVAE-VC: Non-Parallel Voice Conversion With Auxiliary Classifier Variational Autoencoder
1444 -- 1454Xie Chen, Xunying Liu, Yu Wang 0027, Anton Ragni, Jeremy H. M. Wong, Mark J. F. Gales. Exploiting Future Word Contexts in Neural Network Language Models for Speech Recognition
1455 -- 1468Rui Wang, Zhe Chen 0005, Fuliang Yin. DOA-Based Three-Dimensional Node Geometry Calibration in Acoustic Sensor Networks and Its Cramér-Rao Bound and Sensitivity Analysis
1469 -- 1480Chia-Hsuan Lee, Hung-yi Lee, Szu-Lin Wu, Chi-Liang Liu, Wei Fang, Juei-Yang Hsu, Bo-Hsiang Tseng. Machine Comprehension of Spoken Content: TOEFL Listening Test and Spoken SQuAD
1481 -- 1493Yi-Chen Chen, Sung-Feng Huang, Hung-yi Lee, Yu-Hsuan Wang, Chia-Hao Shen. Audio Word2vec: Sequence-to-Sequence Autoencoding for Unsupervised Learning of Audio Segmentation and Representation

Volume 27, Issue 8

1216 -- 1228Teng Zhang, Ji Wu. Constrained Learned Feature Extraction for Acoustic Scene Classification
1229 -- 1240Leonardo Gabrielli, Stefano Tomassetti, Stefano Squartini, Carlo Zinato, Stefano Guaiana. A Multi-Stage Algorithm for Acoustic Physical Model Parameters Estimation
1241 -- 1255Bing Yang, Hong Liu 0008, Cheng Pang, Xiaofei Li. Multiple Sound Source Counting and Localization Based on TF-Wise Spatial Spectrum Clustering
1256 -- 1266Yi Luo 0004, Nima Mesgarani. Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
1267 -- 1279Achintya Kumar Sarkar, Zheng-Hua Tan, Hao Tang, Suwon Shon, James Glass. Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification
1280 -- 1294Jiawen Chua, W. Bastiaan Kleijn. A Low Latency Approach for Blind Source Separation
1295 -- 1307Chao Pan, Jingdong Chen, Jacob Benesty, Guangming Shi. On the Design of Target Beampatterns for Differential Microphone Arrays
1308 -- 1320Aqil M. Azmi, Manal N. Almutery, Hatim A. Aboalsamh. Real-Word Errors in Arabic Texts: A Better Algorithm for Detection and Correction
1321 -- 1334Mandy Korpusik, James Glass. Deep Learning for Database Mapping and Asking Clarification Questions in Dialogue Systems
1335 -- 1345Junhyeong Pak, Jong Won Shin. Sound Localization Based on Phase Difference Enhancement Using Deep Neural Networks

Volume 27, Issue 7

1112 -- 1125Jan-Hendrik Flesner, Thomas Biberger, Stephan Dieter Ewert. Subjective and Objective Assessment of Monaural and Binaural Aspects of Audio Quality
1126 -- 1135Bolaji Yusuf, Batuhan Gündogdu, Murat Saraclar. Low Resource Keyword Search With Synthesized Crosslingual Exemplars
1136 -- 1150Andreas I. Koutrouvelis, Richard C. Hendriks, Richard Heusdens, Jesper Jensen 0001. Robust Joint Estimation of Multimicrophone Signal Model Parameters
1151 -- 1163Benjamin Cauchi, Kai Siedenburg, João Felipe Santos, Tiago H. Falk, Simon Doclo, Stefan Goetze. Non-Intrusive Speech Quality Prediction Using Modulation Energies and LSTM-Network
1164 -- 1178Yike Zhang, Pengyuan Zhang, Yonghong Yan 0002. Tailoring an Interpretable Neural Language Model
1179 -- 1188Ashutosh Pandey, DeLiang Wang. A New Framework for CNN-Based Speech Enhancement in the Time Domain
1189 -- 1200Vikram C. M., Nagaraj Adiga, S. R. Mahadeva Prasanna. Detection of Nasalized Voiced Stops in Cleft Palate Speech Using Epoch-Synchronous Features
1201 -- 1212Huaishao Luo, Tianrui Li, Bing Liu, Bin Wang, Herwig Unger. Improving Aspect Term Extraction With Bidirectional Dependency Tree Representation

Volume 27, Issue 6

992 -- 1006Annamaria Mesaros, Aleksandr Diment, Benjamin Elizalde, Toni Heittola, Emmanuel Vincent, Bhiksha Raj, Tuomas Virtanen. Sound Event Detection in the DCASE 2017 Challenge
1007 -- 1018Srikanth Raj Chetupalli, Thippur V. Sreenivas. Late Reverberation Cancellation Using Bayesian Estimation of Multi-Channel Linear Predictors and Student's t-Source Prior
1019 -- 1030Lauri Juvela, Bajibabu Bollepalli, Vassilis Tsiaras, Paavo Alku. GlotNet - A Raw Waveform Model for the Glottal Excitation in Statistical Parametric Speech Synthesis
1031 -- 1046Fiete Winter, Frank Schultz, Gergely Firtha, Sascha Spors. A Geometric Model for Prediction of Spatial Aliasing in 2.5D Sound Field Synthesis
1047 -- 1059Yuanyuan Liu, Tan Lee, Thomas K. T. Law, Kathy Yuet-Sheung Lee. Acoustical Assessment of Voice Disorder With Continuous Speech Using ASR Posterior Features
1060 -- 1071Christoph Pörschmann, Johannes M. Arend, Fabian Brinkmann. Directional Equalization of Sparse Head-Related Transfer Function Sets for Spatial Upsampling
1072 -- 1084Shreyas Srikanth Payal, V. John Mathews, Douglas J. Button, Ajay Iyer, Russell H. Lambert, Jeffrey Hutchings, Luis Antonio Azpicueta-Ruiz. Equalization of Nonlinear Propagation Distortion in Cylindrical Waveguides
1085 -- 1097Berrak Sisman, Mingyang Zhang, Haizhou Li 0001. Group Sparse Representation With WaveNet Vocoder Adaptation for Spectrum and Prosody Conversion
1098 -- 1109JinKyu Lee, Hong-Goo Kang. A Joint Learning Algorithm for Complex-Valued T-F Masks in Deep Learning-Based Single-Channel Speech Enhancement Systems

Volume 27, Issue 5

881 -- 891Francisco Javier Ibarrola, Ruben Daniel Spies, Leandro Ezequiel Di Persia. Switching Divergences for Spectral Learning in Blind Speech Dereverberation
892 -- 902Israel Cohen, Jacob Benesty, Jingdong Chen. Differential Kronecker Product Beamforming
903 -- 918Camelia Elisei-Iliescu, Constantin Paleologu, Jacob Benesty, Cristian Stanciu, Cristian Anghel, Silviu Ciochina. Recursive Least-Squares Algorithms for the Identification of Low-Rank Systems
919 -- 931Anurendra Kumar, Tanaya Guha, Prasanta Kumar Ghosh. Dirichlet Latent Variable Model: A Dynamic Model Based on Dirichlet Prior for Audio Processing
932 -- 947Peter Jancovic, Münevver Köküer. Bird Species Recognition Using Unsupervised Modeling of Individual Vocalization Elements
948 -- 959Tomoki Koriyama, Takao Kobayashi. Statistical Parametric Speech Synthesis Using Deep Gaussian Processes
960 -- 971Kazuki Shimada, Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara. Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition
972 -- 987Simon Widmark. Causal MSE-Optimal Filters for Personal Audio Subject to Constrained Contrast

Volume 27, Issue 4

663 -- 678Ziyue Zhao, Huijun Liu, Tim Fingscheidt. Convolutional Neural Networks to Enhance Coded Speech
679 -- 691Henning F. Schepker, Sven Erik Nordholm, Linh Thi Thuc Tran, Simon Doclo. Null-Steering Beamformer-Based Feedback Cancellation for Multi-Microphone Hearing Aids With Incoming Signal Preservation
692 -- 703Zengxi Li, Yan Song, Li-Rong Dai, Ian Vince McLoughlin. Listening and Grouping: An Online Autoregressive Approach for Monaural Speech Separation
704 -- 718Dong Deng, Liping Jing, Jian Yu, Shaolong Sun, Michael K. Ng. Sentiment Lexicon Construction With Hierarchical Supervision Topic Model
719 -- 729Mantong Zhou, Minlie Huang, Xiaoyan Zhu. Story Ending Selection by Finding Hints From Pairwise Candidate Endings
730 -- 741Jan-Gerrit Richter, Janina Fels. On the Influence of Continuous Subject Rotation During High-Resolution Head-Related Transfer Function Measurements
742 -- 752Jianguo Yu, Konstantin Markov, Tomoko Matsui. Articulatory and Spectrum Information Fusion Based on Deep Recurrent Neural Networks
753 -- 764Fábio P. Itturriet, Márcio Holsbach Costa. Perceptually Relevant Preservation of Interaural Time Differences in Binaural Hearing Aids
765 -- 776Johannes Abel, Tim Fingscheidt. Sinusoidal-Based Lowband Synthesis for Artificial Speech Bandwidth Extension
777 -- 787Qiuqiang Kong, Yong Xu 0004, Iwona Sobieraj, Wenwu Wang, Mark D. Plumbley. Sound Event Detection and Time-Frequency Segmentation from Weakly Labelled Data
788 -- 798Yi-Lin Tuan, Hung-yi Lee. Improving Conditional Sequence Generative Adversarial Networks by Stepwise Evaluation
799 -- 814Nikolaos Dionelis, Mike Brookes. Modulation-Domain Kalman Filtering for Monaural Blind Speech Denoising and Dereverberation
815 -- 826Reza Lotfian, Carlos Busso. Curriculum Learning for Speech Emotion Recognition From Crowdsourced Labels
827 -- 841Shoufeng Lin. Robust Pitch Estimation and Tracking For Speakers Based on Subband Encoding and The Generalized Labeled Multi-Bernoulli Filter
842 -- 852Xianghui Wang, Israel Cohen, Jingdong Chen, Jacob Benesty. On Robust and High Directive Beamforming With Small-Spacing Microphone Arrays for Scattered Sources
853 -- 865Zhe Quan, Zhi-jie Wang, Yuquan Le, Bin Yao 0002, Kenli Li, Jian Yin. An Efficient Framework for Sentence Similarity Modeling
866 -- 877Nurul Lubis, Sakriani Sakti, Koichiro Yoshino, Satoshi Nakamura 0001. Positive Emotion Elicitation in Chat-Based Dialogue Systems

Volume 27, Issue 3

472 -- 481Mohsen Zareian Jahromi, Adel Zahedi, Jesper Jensen 0001, Jan Østergaard. Information Loss in the Human Auditory System
482 -- 495Yaakov Buchris, Alon Amar, Jacob Benesty, Israel Cohen. Incoherent Synthesis of Sparse Arrays for Frequency-Invariant Beamforming
496 -- 506Yogachandran Rahulamathavan, Kunaraj R. Sutharsini, Indranil Ghosh Ray, Rongxing Lu, Muttukrishnan Rajarajan. Privacy-Preserving iVector-Based Speaker Verification
507 -- 518Jiajun Zhang, Yang Zhao, Haoran Li, Chengqing Zong. Attention With Sparsity Regularization for Neural Machine Translation and Summarization
519 -- 530Alastair H. Moore, Wei Xue, Patrick A. Naylor, Mike Brookes. Noise Covariance Matrix Estimation for Rotating Microphone Arrays
531 -- 543Guang Yang, Haibo He, Qian Chen. Emotion-Semantic-Enhanced Neural Network
544 -- 558Thomas Dietzen, Ann Spriet, Wouter Tirry, Simon Doclo, Marc Moonen, Toon van Waterschoot. Comparative Analysis of Generalized Sidelobe Cancellation and Multi-Channel Linear Prediction for Speech Dereverberation and Noise Reduction
559 -- 571Jianqing Gao, Jun Du, Enhong Chen. Mixed-Bandwidth Cross-Channel Speech Recognition via Joint Optimization of DNN-Based Bandwidth Expansion and Acoustic Modeling
572 -- 582Salil Deena, Madina Hasan, Mortaza Doulaty, Oscar Saz, Thomas Hain. Recurrent Neural Network Language Model Adaptation for Multi-Genre Broadcast Speech Recognition and Alignment
583 -- 594Femke B. Gelderblom, Tron V. Tronstad, Erlend Magnus Viggen. Subjective Evaluation of a Noise-Reduced Training Target for Deep Neural Network-Based Speech Enhancement
595 -- 609Maria Luis Valero, Emanuel A. P. Habets. Low-Complexity Multi-Microphone Acoustic Echo Control in the Short-Time Fourier Transform Domain
610 -- 620Qiaoxi Zhu, Philip Coleman, Xiaojun Qiu, Ming Wu, Jun Yang 0004, Ian S. Burnett. Robust Personal Audio Geometry Optimization in the SVD-Based Modal Domain
621 -- 630Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Ye Bai. Language-Adversarial Transfer Learning for Low-Resource Speech Recognition
631 -- 644Jing-Xuan Zhang, Zhen-Hua Ling, Li-juan Liu, Yuan Jiang, Li-Rong Dai. Sequence-to-Sequence Acoustic Modeling for Voice Conversion
645 -- 659Xiaofei Li, Laurent Girin, Sharon Gannot, Radu Horaud. Multichannel Speech Separation and Enhancement Using the Convolutive Transfer Function

Volume 27, Issue 2

244 -- 254Toru Nakashika, Shinji Takaki, Junichi Yamagishi. Complex-Valued Restricted Boltzmann Machine for Speaker-Dependent Speech Parameterization From Complex Spectra
255 -- 267Feifei Xiong, Stefan Goetze, Birger Kollmeier, Bernd T. Meyer. Joint Estimation of Reverberation Time and Early-To-Late Reverberation Ratio From Single-Channel Speech Signals
268 -- 282Fabian-Robert Stöter, Soumitro Chakrabarty, Bernd Edler, Emanuel A. P. Habets. CountNet: Estimating the Number of Concurrent Speakers Using Supervised Learning
283 -- 295Morten Kolbaek, Zheng-Hua Tan, Jesper Jensen 0001. On the Relationship Between Short-Time Objective Intelligibility and Short-Time Spectral-Amplitude Mean-Square Error for Speech Enhancement
296 -- 310Martin Weiss Hansen, Jesper Rindom Jensen, Mads Græsbøll Christensen. Estimation of Fundamental Frequencies in Stereophonic Music Mixtures
311 -- 320Jun-Wei Bao, Duyu Tang, Nan Duan, Zhao Yan, Ming Zhou 0001, Tiejun Zhao. Text Generation From Tables
321 -- 331Andreas I. Koutrouvelis, Richard C. Hendriks, Richard Heusdens, Jesper Jensen 0001. A Convex Approximation of the Relaxed Binaural Beamforming Optimization Problem
332 -- 341Tetsuya Hashimoto, Daisuke Saito, Nobuaki Minematsu. Many-to-Many and Completely Parallel-Data-Free Voice Conversion Based on Eigenspace DNN
342 -- 354Fatemeh Pishdadian, Bryan Pardo. Multi-Resolution Common Fate Transform
355 -- 366Yiming Wu, Wei Li. Automatic Audio Chord Recognition With MIDI-Trained Deep Feature and BLSTM-CRF Sequence Decoding Model
367 -- 382Keisuke Imoto, Nobutaka Ono. Acoustic Topic Model for Scene Analysis With Intermittently Missing Observations
383 -- 391Ke Xiao, Supin Wang, Mingxi Wan, Liang Wu 0004. Reconstruction of Mandarin Electrolaryngeal Fricatives With Hybrid Noise Source
392 -- 403Lakshmi Krishnan, Terence Betlehem, Paul D. Teal. Fast Algorithms for Acoustic Impulse Response Shaping
404 -- 414Vahid Zakeri, Antony J. Hodgson. Automatic Identification of Hard and Soft Bone Tissues by Analyzing Drilling Sounds
415 -- 428Stefan Bilbao, Brian Hamilton. Directional Sources in Wave-Based Acoustic Simulation
429 -- 441Yichi Zhang, Bryan Pardo, Zhiyao Duan. Siamese Style Convolutional Neural Networks for Sound Search by Vocal Imitation
442 -- 456Fangchen Feng, Matthieu Kowalski. Underdetermined Reverberant Blind Source Separation: Sparse Approaches for Multiplicative and Convolutive Narrowband Approximation
457 -- 468Zhong-qiu Wang, DeLiang Wang. Combining Spectral and Spatial Features for Deep Learning Based Blind Speaker Separation

Volume 27, Issue 12

1852 -- 1867Natsuki Ueno, Shoichi Koyama, Hiroshi Saruwatari. Three-Dimensional Sound Field Reproduction Based on Weighted Mode-Matching Method
1868 -- 1879Lijun Wu, Xu Tan, Tao Qin, Jianhuang Lai, Tie-Yan Liu. Beyond Error Propagation: Language Branching Also Affects the Accuracy of Sequence Generation
1880 -- 1892Amit Das, Jinyu Li, Guoli Ye, Rui Zhao, Yifan Gong. Advancing Acoustic-to-Word CTC Model With Attention and Mixed-Units
1893 -- 1905Niccolo Antonello, Enzo De Sena, Marc Moonen, Patrick A. Naylor, Toon van Waterschoot. Joint Acoustic Localization and Dereverberation Through Plane Wave Decomposition and Sparse Regularization
1906 -- 1918Federico Borra, Alberto Bernardini, Fabio Antonacci, Augusto Sarti. Uniform Linear Arrays of First-Order Steerable Differential Microphones
1919 -- 1931Li Chai, Jun Du, Qing-Feng Liu, Chin-Hui Lee. Using Generalized Gaussian Distributions to Improve Regression Error Modeling for Deep Learning-Based Speech Enhancement
1932 -- 1943Jun Qi, Jun Du, Sabato Marco Siniscalchi, Chin-Hui Lee. A Theory on Deep Neural Network Based Vector-to-Vector Regression With an Illustration of Its Expressive Power in Speech Enhancement
1944 -- 1956Xudong Dang, Qi Cheng, Hongyan Zhu. Indoor Multiple Sound Source Localization via Multi-Dimensional Assignment Data Association
1957 -- 1969Martin Schneider 0009, Emanuel A. P. Habets. Iterative DFT-Domain Inverse Filter Optimization Using a Weighted Least-Squares Criterion
1970 -- 1984Kehai Chen, Rui Wang 0015, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao. Neural Machine Translation With Sentence-Level Topic Context
1985 -- 1999Alejandro Gómez Alanís, Antonio M. Peinado, José A. González 0001, Angel M. Gomez. A Gated Recurrent Convolutional Neural Network for Robust Spoofing Detection
2000 -- 2011Siyuan Feng, Tan Lee. Exploiting Cross-Lingual Speaker and Phonetic Diversity for Unsupervised Subword Modeling
2012 -- 2024Wei Li, Nancy F. Chen, Sabato Marco Siniscalchi, Chin-Hui Lee. Improving Mispronunciation Detection of Mandarin Tones for Non-Native Learners With Soft-Target Tone Labels and BLSTM-Based Deep Tone Models
2025 -- 2040Quansheng Tu, Huawei Chen. On Mainlobe Orientation of the First- and Second-Order Differential Microphone Arrays
2041 -- 2053Jan Chorowski, Ron J. Weiss, Samy Bengio, Aäron Van Den Oord. Unsupervised Speech Representation Learning Using WaveNet Autoencoders
2054 -- 2066Vishnuvardhan Varanasi, Ayushya Agarwal, Rajesh M. Hegde. Near-Field Acoustic Source Localization Using Spherical Harmonic Features
2067 -- 2079Yibin Zheng, Jianhua Tao, Zhengqi Wen, Jiangyan Yi. Forward-Backward Decoding Sequence for Regularizing End-to-End TTS
2080 -- 2091Yanhui Tu, Jun Du, Chin-Hui Lee. Speech Enhancement Based on Teacher-Student Deep Learning Using Improved Speech Presence Probability for Noise-Robust Speech Recognition
2092 -- 2102Yuzhou Liu, DeLiang Wang. Divide and Conquer: A Deep CASA Approach to Talker-Independent Monaural Speaker Separation
2103 -- 2112Xuebo Liu, Derek F. Wong, Lidia S. Chao, Yang Liu 0005. Latent Attribute Based Hierarchical Decoder for Neural Machine Translation
2113 -- 2126Jingyi Hu, Ning Chen. Enhanced Feature Summarizing for Effective Cover Song Identification
2127 -- 2139Qianli Ma, Liuhong Yu, Shuai Tian, Enhuan Chen, Wing W. Y. Ng. Global-Local Mutual Attention Model for Text Classification
2140 -- 2149Vesa Välimäki, Jussi Rämö. Neurally Controlled Graphic Equalizer
2150 -- 2161Sean U. N. Wood, Johannes Stahl 0003, Pejman Mowlaee. Binaural Codebook-Based Speech Enhancement With Atomic Speech Presence Probability
2162 -- 2172Lukas Pfeifenberger, Matthias Zöhrer, Franz Pernkopf. Eigenvector-Based Speech Mask Estimation for Multi-Channel Speech Enhancement
2173 -- 2182Marc Arnela, Saeed Dabbaghchian, Oriol Guasch, Olov Engwall. MRI-Based Vocal Tract Representations for the Three-Dimensional Finite Element Synthesis of Diphthongs
2183 -- 2196Varun Srivastava, Mayank Mishra. Adversarial Approximate Inference for Speech to Electroglottograph Conversion
2197 -- 2212Kouhei Sekiguchi, Yoshiaki Bando, Aditya Arie Nugraha, Kazuyoshi Yoshii, Tatsuya Kawahara. Semi-Supervised Multichannel Speech Enhancement With a Deep Speech Prior
2213 -- 2222Qipeng Guo, Xipeng Qiu, Xiangyang Xue, Zheng Zhang. Low-Rank and Locality Constrained Self-Attention for Sequence Modeling
2223 -- 2233Jun Yu 0001, Qiang Ling, Changwei Luo, Chang Wen Chen. Synthesizing 3D Trump: Predicting and Visualizing the Relationship Between Text, Speech, and Articulatory Movements
2234 -- 2248Ryosuke Sugiura, Yutaka Kamamoto, Takehiro Moriya. Shape Control of Discrete Generalized Gaussian Distributions for Frequency-Domain Audio Coding
2249 -- 2262Zamir Ben-Hur, David Lou Alon, Ravish Mehra, Boaz Rafaely. Efficient Representation and Sparse Sampling of Head-Related Transfer Functions Using Phase-Correction Based on Ear Alignment
2263 -- 2277Luca Remaggi, Philip J. B. Jackson, Wenwu Wang. Modeling the Comb Filter Effect and Interaural Coherence for Binaural Source Separation
2278 -- 2287Biao Zhang 0002, Deyi Xiong, Jinsong Su, Jiebo Luo. Future-Aware Knowledge Distillation for Neural Machine Translation
2288 -- 2300Randall Ali, Toon van Waterschoot, Marc Moonen. Integration of a Priori and Estimated Constraints Into an MVDR Beamformer for Speech Enhancement
2301 -- 2312Nitya Tiwari, Prem C. Pandey. Speech Enhancement Using Noise Estimation With Dynamic Quantile Tracking
2313 -- 2325Junwen Duan, Xiao Ding, Yue Zhang, Ting Liu 0001. TEND: A Target-Dependent Representation Learning Framework for News Document
2326 -- 2335Lujun Zhao, Xipeng Qiu, Qi Zhang 0001, Xuanjing Huang. Sequence Labeling With Deep Gated Dual Path CNN
2336 -- 2349Akihiro Kato, Tomi H. Kinnunen. Statistical Regression Models for Noise Robust F0 Estimation Using Recurrent Deep Neural Networks
2350 -- 2361Dayiheng Liu, Jie Fu, Qian Qu, Jiancheng Lv. BFGAN: Backward and Forward Generative Adversarial Networks for Lexically Constrained Sentence Generation
2362 -- 2372Andrés Marafioti, Nathanaël Perraudin, Nicki Holighaus, Piotr Majdak. A Context Encoder For Audio Inpainting
2373 -- 2384Jichen Yang, Rohan Kumar Das, Nina Zhou. Extraction of Octave Spectra Information for Spoofing Attack Detection
2385 -- 2396Oren Barkan, David Tsiris, Ori Katz, Noam Koenigstein. InverSynth: Deep Estimation of Synthesizer Parameter Configurations From Audio Signals

Volume 27, Issue 11

1664 -- 1674Zhuosheng Zhang, Hai Zhao, Kangwei Ling, Jiangtong Li, Zuchao Li, Shexia He, Guohong Fu. Effective Subword Segmentation for Text Comprehension
1675 -- 1685Yue Xie, Ruiyu Liang, Zhenlin Liang, Chengwei Huang, Cairong Zou, Björn W. Schuller. Speech Emotion Classification Using Attention-Based LSTM
1686 -- 1696Shuai Wang, Zili Huang, Yanmin Qian, Kai Yu 0004. Discriminative Neural Embedding Learning for Short-Duration Text-Independent Speaker Verification
1697 -- 1712Rui Lu, Zhiyao Duan, Changshui Zhang. Audio-Visual Deep Clustering for Speech Separation
1713 -- 1724Tetiana Parshakova, François Rameau, Andriy Serdega, In-So Kweon, Dae-Shik Kim. Latent Question Interpretation Through Variational Adaptation
1725 -- 1736Jeremy Heng Meng Wong, Mark John Francis Gales, Yu Wang 0027. General Sequence Teacher-Student Learning
1737 -- 1751Liming Shi, Jesper Kjær Nielsen, Jesper Rindom Jensen, Max A. Little, Mads Græsbøll Christensen. Robust Bayesian Pitch Tracking Based on the Harmonic Model
1752 -- 1762Yan Yang, Changchun Bao. RS-CAE-Based AR-Wiener Filtering and Harmonic Recovery for Speech Enhancement
1763 -- 1776Alberto Bernardini, Paolo Maffezzoni, Augusto Sarti. Linear Multistep Discretization Methods With Variable Step-Size in Nonlinear Wave Digital Structures for Virtual Analog Modeling
1777 -- 1790Dong Deng, Liping Jing, Jian Yu, Shaolong Sun. Sparse Self-Attention LSTM for Sentiment Lexicon Construction
1791 -- 1802Qiuqiang Kong, Changsong Yu, Yong Xu 0004, Turab Iqbal, Wenwu Wang, Mark D. Plumbley. Weakly Labelled AudioSet Tagging With Attention Neural Networks
1803 -- 1814Samy Elshamy, Tim Fingscheidt. DNN-Based Cepstral Excitation Manipulation for Speech Enhancement
1815 -- 1825Nooshin Maghsoodi, Hossein Sameti, Hossein Zeinali, Themos Stafylakis. Speaker Recognition With Random Digit Strings Using Uncertainty Normalized HMM-Based i-Vectors
1826 -- 1838Sining Sun, Pengcheng Guo, Lei Xie 0001, Mei-Yuh Hwang. Adversarial Regularization for Attention Based End-to-End Robust Speech Recognition
1839 -- 1848Masood Delfarah, DeLiang Wang. Deep Learning for Talker-Dependent Reverberant Speaker Separation: An Empirical Study

Volume 27, Issue 10

1497 -- 1506Pairui Li, Chuan Chen, Wujie Zheng, Yuetang Deng, Fanghua Ye, Zibin Zheng. STD: An Automatic Evaluation Metric for Machine Translation Based on Word Embeddings
1507 -- 1519Jie Zhang 0042, Richard Heusdens, Richard Christian Hendriks. Relative Acoustic Transfer Function Estimation in Wireless Acoustic Sensor Networks
1520 -- 1534Jihwan Park, Joon-Hyuk Chang. State-Space Microphone Array Nonlinear Acoustic Echo Cancellation Using Multi-Microphone Near-End Speech Covariance
1535 -- 1548Zhaojie Luo, Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki. Emotional Voice Conversion Using Dual Supervised Adversarial Networks With Continuous Wavelet Transform F0 Features
1549 -- 1563Hala As'ad, Martin Bouchard, Homayoun Kamkar-Parsi. A Robust Target Linearly Constrained Minimum Variance Beamformer With Spatial Cues Preservation for Binaural Hearing Aids
1564 -- 1576Yijun Wang, Yingce Xia, Li Zhao, Jiang Bian, Tao Qin, Enhong Chen, Tie-Yan Liu. Semi-Supervised Neural Machine Translation via Marginal Distribution Estimation
1577 -- 1589Arindam Jati, Panayiotis G. Georgiou. Neural Predictive Coding Using Convolutional Neural Networks Toward Unsupervised Learning of Speaker Characteristics
1590 -- 1600Federico Fontana, Enrico Bozzo. Newton-Raphson Solution of Nonlinear Delay-Free Loop Filter Networks
1601 -- 1615Naoki Makishima, Shinichi Mogami, Norihiro Takamune, Daichi Kitamura, Hayato Sumino, Shinnosuke Takamichi, Hiroshi Saruwatari, Nobutaka Ono. Independent Deeply Learned Matrix Analysis for Determined Audio Source Separation
1616 -- 1628Jeena J. Prakash, Hema A. Murthy. Analysis of Inter-Pausal Units in Indian Languages and Its Application to Text-to-Speech Synthesis
1629 -- 1638Yunshi Lan, Shuohang Wang, Jing Jiang 0001. Knowledge Base Question Answering With a Matching-Aggregation Model and Question-Specific Contextual Relations
1639 -- 1648Xuefeng Bai, Hailong Cao, Kehai Chen, Tiejun Zhao. A Bilingual Adversarial Autoencoder for Unsupervised Bilingual Lexicon Induction
1649 -- 1660Guanlong Zhao, Ricardo Gutierrez-Osuna. Using Phonetic Posteriorgram Based Frame Pairing for Segmental Accent Conversion

Volume 27, Issue 1

5 -- 6Dilek Hakkani-Tür. Inaugural Editorial Innovations in an Era of Ubiquitous Audio, Speech, and Language Processing
7 -- 19Feng Bao 0003, Waleed H. Abdulla. A New Ratio Mask Representation for CASA-Based Speech Enhancement
20 -- 31Paul Magron, Tuomas Virtanen. Complex ISNMF: A Phase-Aware Model for Monaural Audio Source Separation
32 -- 43Thanh Thi Hien Duong, Ngoc Q. K. Duong, Cong Phuong Nguyen, Quoc Cuong Nguyen. Gaussian Modeling-Based Multichannel Audio Source Separation Exploiting Generic Source Spectral Model
44 -- 52Guoqiang Zhang 0003, Jiancheng Tao, Xiaojun Qiu, Ian S. Burnett. Decentralized Two-Channel Active Noise Control for Single Frequency by Shaping Matrix Eigenvalues
53 -- 62Yan Zhao, Zhong-qiu Wang, DeLiang Wang. Two-Stage Deep Learning for Noisy-Reverberant Speech Enhancement
63 -- 76Naijun Zheng, Xiao-lei Zhang. Phase-Aware Speech Enhancement Based on Deep Neural Networks
77 -- 88Takafumi Moriya, Tomohiro Tanaka, Takahiro Shinozaki, Shinji Watanabe, Kevin Duh. Evolution-Strategy-Based Automation of System Development for High-Performance Speech Recognition
89 -- 98Herman Kamper, Gregory Shakhnarovich, Karen Livescu. Semantic Speech Retrieval With a Visually Grounded Model of Untranscribed Speech
99 -- 113Mathew Shaji Kavalekalam, Jesper Kjær Nielsen, Jesper Bünsow Boldt, Mads Græsbøll Christensen. Model-Based Speech Enhancement for Intelligibility Improvement in Binaural Hearing Aids
114 -- 124Achuth Rao MV, Prasanta Kumar Ghosh. Glottal Inverse Filtering Using Probabilistic Weighted Linear Prediction
125 -- 139Yang Sun, Wenwu Wang, Jonathon A. Chambers, Syed Mohsen Naqvi. Two-Stage Monaural Source Separation in Reverberant Room Environments Using Deep Neural Networks
140 -- 153Luciana Ferrer, Mahesh Kumar Nandwana, Mitchell McLaren, Diego Castán, Aaron Lawson. Toward Fail-Safe Speaker Recognition: Trial-Based Calibration With a Reject Option
154 -- 167Jamal Amini, Richard C. Hendriks, Richard Heusdens, Meng Guo, Jesper Jensen 0001. Asymmetric Coding for Rate-Constrained Noise Reduction in Binaural Hearing Aids
168 -- 177Jianfei Yu, Jing Jiang, Rui Xia. Global Inference for Aspect and Opinion Terms Co-Extraction Based on Multi-Task Neural Networks
178 -- 188Zhong-qiu Wang, Xueliang Zhang, DeLiang Wang. Robust Speaker Localization Guided by Deep Learning-Based Time-Frequency Masking
189 -- 198Ke Tan, Jitong Chen, DeLiang Wang. Gated Residual Networks With Dilated Convolutions for Monaural Speech Enhancement
199 -- 211Hoang Gia Ngo, Minh Nguyen, Nancy F. Chen. Phonology-Augmented Statistical Framework for Machine Transliteration Using Limited Linguistic Resources
212 -- 224Yuma Koizumi, Shoichiro Saito, Hisashi Uematsu, Yuta Kawachi, Noboru Harada. Unsupervised Detection of Anomalous Sound Based on Deep Learning and the Neyman-Pearson Lemma
225 -- 239Yaron Laufer, Sharon Gannot. A Bayesian Hierarchical Model for Speech Enhancement With Time-Varying Audio Channel