Journal: IEEE Transactions on Audio, Speech & Language Processing

Volume 26, Issue 9

1457 -- 1483 Chih-Wei Wu, Christian Dittmar, Carl Southall, Richard Vogl, Gerhard Widmer, Jason Hockman, Meinard Müller, Alexander Lerch. A Review of Automatic Drum Transcription
1484 -- 1498 Christine Evers, Patrick A. Naylor. Acoustic SLAM
1499 -- 1511 Clement Laroche, Matthieu Kowalski, Hélène Papadopoulos, Gaël Richard. Hybrid Projective Nonnegative Matrix Factorization With Drum Dictionaries for Harmonic/Percussive Source Separation
1512 -- 1527 Julio J. Carabias-Orti, Joonas Nikunen, Tuomas Virtanen, Pedro Vera-Candeas. Multichannel Blind Sound Source Separation Using Spatial Covariance Model With Level and Time Differences and Nonnegative Matrix Factorization
1528 -- 1538 Meishan Zhang, Nan Yu, Guohong Fu. A Simple and Effective Neural Model for Joint Word Segmentation and POS Tagging
1539 -- 1548 Dylan Menzies, Filippo Maria Fazi. A Complex Panning Method for Near-Field Imaging
1549 -- 1558 Abhinav Misra, John H. L. Hansen. Maximum-Likelihood Linear Transformation for Unsupervised Domain Adaptation in Speaker Verification
1559 -- 1569 Yukoh Wakabayashi, Takahiro Fukumori, Masato Nakayama, Takanobu Nishiura, Yoichi Yamashita. Single-Channel Speech Enhancement With Phase Reconstruction Based on Phase Distortion Averaging
1570 -- 1584 Szu-Wei Fu, Taowei Wang, Yu Tsao, Xugang Lu, Hisashi Kawai. End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks
1585 -- 1593 Ke Xiao, Supin Wang, Mingxi Wan, Liang Wu 0004. Radiated Noise Suppression for Electrolarynx Speech Based on Multiband Time-Domain Amplitude Modulation
1594 -- 1607 Abdullah Fahim, Prasanga N. Samarasinghe, Thushara D. Abhayapala. PSD Estimation and Source Separation in a Noisy Reverberant Environment Using a Spherical Microphone Array
1608 -- 1619 Hongsen He, Jingdong Chen, Jacob Benesty, Tao Yang. Noise Robust Frequency-Domain Adaptive Blind Multichannel Identification With ℓp-Norm Constraint
1620 -- 1632 Weiwei Zhang, Zhe Chen, Fuliang Yin, Qiaoling Zhang. Melody Extraction From Polyphonic Music Using Particle Filter and Dynamic Programming
1633 -- 1644 Chunlei Zhang, Kazuhito Koishida, John H. L. Hansen. Text-Independent Speaker Verification Based on Triplet Convolutional Neural Network Embeddings
1645 -- 1657 M. V. Achuth Rao, Prasanta Kumar Ghosh. PSFM - A Probabilistic Source Filter Model for Noise Robust Glottal Closure Instant Detection
1658 -- 1670 Manu Airaksinen, Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, Paavo Alku. A Comparison Between STRAIGHT, Glottal, and Sinusoidal Vocoding in Statistical Parametric Speech Synthesis
1671 -- 1683 Gaël Mahé, Meriem Jaïdane. Perceptually Controlled Reshaping of Sound Histograms
1684 -- 1697 Qinghua Huang, Lin Zhang, Yong Fang. Two-Step Spherical Harmonics ESPRIT-Type Algorithms and Performance Analysis

Volume 26, Issue 8

1307 -- 1335 Zafar Rafii, Antoine Liutkus, Fabian-Robert Stöter, Stylianos Ioannis Mimilakis, Derry Fitzgerald, Bryan Pardo. An Overview of Lead and Accompaniment Separation in Music
1336 -- 1351 Chien-Yao Wang, Jia-Ching Wang, Andri Santoso, Chin-Chin Chiang, Chung-Hsien Wu. Sound Event Recognition Using Auditory-Receptive-Field Binary Pattern and Hierarchical-Diving Deep Belief Network
1352 -- 1358 Liner Yang, Meishan Zhang, Yang Liu, Maosong Sun, Nan Yu, Guohong Fu. Joint POS Tagging and Dependence Parsing With Transition-Based Neural Networks
1359 -- 1368 Kai Yu 0004, Zijian Zhao, Xueyang Wu, Hongtao Lin, Xuan Liu. Rich Short Text Conversation Using Semantic-Key-Controlled Sequence Generation
1369 -- 1380 Bernhard Lehner, Jan Schlüter, Gerhard Widmer. Online, Loudness-Invariant Vocal Detection in Mixed Music Signals
1381 -- 1392 Simon Stone, Michael Marxen, Peter Birkholz. Construction and Evaluation of a Parametric One-Dimensional Vocal Tract Model
1393 -- 1405 Tian Tan 0002, Yanmin Qian, Hu Hu, Ying Zhou, Wen Ding, Kai Yu 0004. Adaptive Very Deep Convolutional Residual Network for Noise Robust Speech Recognition
1406 -- 1419 Xin Wang, Shinji Takaki, Junichi Yamagishi. Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis
1420 -- 1433 Cassia Valentini-Botinhao, Junichi Yamagishi. Speech Enhancement of Noisy and Reverberant Speech for Text-to-Speech
1434 -- 1448 Andreas I. Koutrouvelis, Thomas W. Sherson, Richard Heusdens, Richard C. Hendriks. A Low-Cost Robust Distributed Linearly Constrained Beamformer for Wireless Acoustic Sensor Networks With Arbitrary Topology

Volume 26, Issue 7

1173 -- 1180 Takenori Yoshimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda. Mel-Cepstrum-Based Quantization Noise Shaping Applied to Neural-Network-Based Speech Waveform Synthesis
1181 -- 1193 Qing Wang, Jun Du, Li-Rong Dai, Chin-Hui Lee. A Multiobjective Learning and Ensembling Approach to High-Performance Speech Enhancement With Compact Neural Network Architectures
1194 -- 1202 Miguel Ángel del Agua, Adrià Giménez, Alberto Sanchís, Jorge Civera, Alfons Juan. Speaker-Adapted Confidence Measures for ASR Using Deep Bidirectional Recurrent Neural Networks
1203 -- 1215 Jorge Proença, Carla Lopes, Michael Tjalve, Andreas Stolcke, Sara Candeias, Fernando Perdigão. Mispronunciation Detection in Children's Reading of Sentences
1216 -- 1231 Ljubisa Stankovic, Milos Brajovic. Analysis of the Reconstruction of Sparse Signals in the DCT Domain Applied to Audio Signals
1232 -- 1242 João Felipe Santos, Tiago H. Falk. Speech Dereverberation With Context-Aware Recurrent Neural Networks
1243 -- 1256 Michele Geronazzo, Simone Spagnol, Federico Avanzini. Do We Need Individual Head-Related Transfer Functions for Vertical Localization? The Case Study of a Spectral Notch Distance Metric
1257 -- 1270 Daniel Marquardt, Simon Doclo. Interaural Coherence Preservation for Binaural Noise Reduction Using Partial Noise Estimation and Spectral Postfiltering
1271 -- 1285 Mojtaba Farmani, Michael Syskind Pedersen, Zheng-Hua Tan, Jesper Jensen 0001. Bias-Compensated Informed Sound Source Localization Using Relative Transfer Functions
1286 -- 1298 Fei Tao, Carlos Busso. Gating Neural Network for Large Vocabulary Audiovisual Speech Recognition

Volume 26, Issue 6

1025 -- 1036 Hirokazu Kameoka, Takuya Higuchi, Mikihiro Tanaka, Li Li. Nonnegative Matrix Factorization With Basis Clustering Using Cepstral Distance Regularization
1037 -- 1051 Jacob Donley, Christian Ritz, W. Bastiaan Kleijn. Multizone Soundfield Reproduction With Privacy- and Quality-Based Speech Masking Filters
1052 -- 1067 Sebastian Braun, Adam Kuklasinski, Ofer Schwartz, Oliver Thiergart, Emanuel A. P. Habets, Sharon Gannot, Simon Doclo, Jesper Jensen 0001. Evaluation and Comparison of Late Reverberation Power Spectral Density Estimators
1068 -- 1078 Elie-Laurent Benaroya, Nicolas Obin, Marco Liuni, Axel Roebel, Wilson Raumel, Sylvain Argentieri. Binaural Localization of Multiple Sound Sources by Non-Negative Tensor Factorization
1079 -- 1090 Nathanael Perraudin, Nicki Holighaus, Piotr Majdak, Péter Balázs. Inpainting of Long Audio Segments With Similarity Graphs
1091 -- 1101 Paul Magron, Roland Badeau, Bertrand David. Model-Based STFT Phase Recovery for Audio Source Separation
1102 -- 1114 Ina Kodrasi, Simon Doclo. Analysis of Eigenvalue Decomposition-Based Late Reverberation Power Spectral Density Estimation
1115 -- 1125 Sebastian Braun, Emanuel A. P. Habets. Linear Prediction-Based Online Dereverberation and Noise Reduction Using Alternating Kalman Filters
1126 -- 1139 Dhananjay Ram, Afsaneh Asaei, Hervé Bourlard. Sparse Subspace Modeling for Query by Example Spoken Term Detection
1140 -- 1149 Martin Krawczyk-Becker, Timo Gerkmann. On Speech Enhancement Under PSD Uncertainty
1150 -- 1164 Simon Leglaive, Roland Badeau, Gaël Richard. Student's t Source and Mixing Models for Multichannel Audio Source Separation

Volume 26, Issue 5

857 -- 872 Youssef El Baba, Andreas Walther, Emanuel A. P. Habets. 3D Room Geometry Inference Based on Room Impulse Response Stacks
873 -- 882 Qian Zhang, John H. L. Hansen. Language/Dialect Recognition Based on Unsupervised Deep Learning
883 -- 894 Zhen-Hua Ling, Yang Ai, Yu Gu, Li-Rong Dai. Waveform Modeling and Generation Using Hierarchical Recurrent Neural Networks for Speech Bandwidth Extension
895 -- 908 Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Christian Huemmer, Tomohiro Nakatani. Context Adaptive Neural Network Based Acoustic Models for Rapid Adaptation
909 -- 923 Linh Thi Thuc Tran, Sven Erik Nordholm, Henning F. Schepker, Hai Huyen Dam, Simon Doclo. Two-Microphone Hearing Aids Using Prediction Error Method for Adaptive Feedback Control
924 -- 936 Jiho Chang, Marton Marschall. Periphony-Lattice Mixed-Order Ambisonic Scheme for Spherical Microphone Arrays
937 -- 950 Nikolaos Dionelis, Mike Brookes. Phase-Aware Single-Channel Speech Enhancement With Modulation-Domain Kalman Filtering
951 -- 966 Chengshi Zheng, Antoine Deleforge, Xiaodong Li, Walter Kellermann. Statistical Analysis of the Multichannel Wiener Filter Using a Bivariate Normal Distribution for Sample Covariance Matrices
967 -- 980 Colin Vaz, Vikram Ramanarayanan, Shrikanth Narayanan. Acoustic Denoising Using Dictionary Learning With Spectral and Temporal Regularization
981 -- 994 Lin Wang, Andrea Cavallaro. Pseudo-Determined Blind Source Separation for Ad-hoc Microphone Networks
995 -- 1009 Sandro Cumani, Pietro Laface. Scoring Heterogeneous Speaker Vectors Using Nonlinear Transformations and Tied PLDA Models
1010 -- 1024 Giuliano Bernardi, Toon van Waterschoot, Jan Wouters, Marc Moonen. Subjective and Objective Sound-Quality Evaluation of Adaptive Feedback Cancellation Algorithms

Volume 26, Issue 4

700 -- 712 Zhili Tan, Man-Wai Mak, Brian Kan-Wing Mak. DNN-Based Score Calibration With Multitask Learning for Noise Robust Speaker Verification
713 -- 724 Ya-Jun Hu, Zhen-Hua Ling. Extracting Spectral Features Using Deep Autoencoders With Binary Distributed Hidden Units for Statistical Parametric Speech Synthesis
725 -- 735 Bracha Laufer-Goldshtein, Ronen Talmon, Sharon Gannot. A Hybrid Approach for Speaker Tracking Based on TDOA and Data-Driven Models
736 -- 748 Sandro Cumani, Pietro Laface. Speaker Recognition Using e-Vectors
749 -- 759 Longting Xu, Kong-Aik Lee, Haizhou Li, Zhen Yang. Generalizing I-Vector Estimation for Rapid Speaker Recognition
760 -- 773 Yaakov Buchris, Israel Cohen, Jacob Benesty. Frequency-Domain Design of Asymmetric Circular Differential Microphone Arrays
774 -- 786 Jihui Zhang, Thushara D. Abhayapala, Wen Zhang 0002, Prasanga N. Samarasinghe, Shouda Jiang. Active Noise Control Over Space: A Wave Domain Approach
787 -- 796 Yi Luo, Zhuo Chen, Nima Mesgarani. Speaker-Independent Speech Separation With Deep Attractor Network
797 -- 805 Neethu Mariam Joy, Sandeep Reddy Kothinti, Srinivasan Umesh. FMLLR Speaker Normalization With i-Vector: In Pseudo-FMLLR and Distillation Framework
806 -- 819 Swati Chandna, Wenwu Wang. Bootstrap Averaging for Model-Based Source Separation in Reverberant Conditions
820 -- 830 Zhili Tan, Man-Wai Mak, Brian Kan-Wing Mak, Yingke Zhu. Denoised Senone I-Vectors for Robust Speaker Verification
831 -- 846 Kousuke Itakura, Yoshiaki Bando, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara. Bayesian Multichannel Audio Source Separation Based on Integrated Source and Spatial Models

Volume 26, Issue 3

461 -- 474 Cesar Salvador, Shuichi Sakamoto, Jorge Trevino, Yôiti Suzuki. Boundary Matching Filters for Spherical Microphone and Loudspeaker Arrays
475 -- 484 Ahmed Hussen Abdelaziz. Comparing Fusion Models for DNN-Based Audiovisual Continuous Speech Recognition
485 -- 500 Satoru Emura. Residual Echo Reduction for Multichannel Acoustic Echo Cancelers With a Complex-Valued Residual Echo Estimate
501 -- 514 Van Hai Do, Nancy F. Chen, Boon Pang Lim, Mark A. Hasegawa-Johnson. Multitask Learning for Phone Recognition of Underresourced Languages Using Mismatched Transcription
515 -- 528 Mehdi Zohourian, Gerald Enzner, Rainer Martin. Binaural Speaker Localization Integrated Into an Adaptive Beamformer for Hearing Aids
529 -- 539 Yong Xiang, Iynkaran Natgunanathan, Dezhong Peng, Guang Hua, Bo Liu. Spread Spectrum Audio Watermarking Using Multiple Orthogonal PN Sequences and Variable Embedding Strengths and Polarities
540 -- 549 Chuanqi Tan, Furu Wei, Qingyu Zhou, Nan Yang, Bowen Du, Weifeng Lv, Ming Zhou 0001. Context-Aware Answer Sentence Selection With Hierarchical Gated Recurrent Neural Networks
550 -- 563 Jie Zhang, Sundeep Prabhakar Chepuri, Richard Christian Hendriks, Richard Heusdens. Microphone Subset Selection for MVDR Beamformer Based Noise Reduction
564 -- 579 Syu-Siang Wang, Payton Lin, Yu Tsao, Jeih-Weih Hung, Borching Su. Suppression by Selecting Wavelets for Feature Compression in Distributed Speech Recognition
580 -- 594 Yu Wang, Mike Brookes. Model-Based Speech Enhancement in the Modulation Domain
595 -- 608 Christian Huemmer, Christian Hofmann, Roland Maas, Walter Kellermann. Estimating Parameters of Nonlinear Systems Using the Elitist Particle Filter Based on Evolutionary Strategies
609 -- 622 Daniele Salvati, Carlo Drioli, Gian Luca Foresti. A Low-Complexity Robust Beamforming Using Diagonal Unloading for Acoustic Source Localization
623 -- 632 Jinsong Su, Jiali Zeng, Deyi Xiong, Yang Liu 0005, Mingxuan Wang, Jun Xie. A Hierarchy-to-Sequence Attentional Neural Machine Translation Model
633 -- 645 Waad Ben Kheder, Driss Matrouf, Moez Ajili, Jean-François Bonastre. A Unified Joint Model to Deal With Nuisance Variabilities in the i-Vector Space
646 -- 656 Gregory Gelly, Jean-Luc Gauvain. Optimization of RNN-Based Speech Activity Detection
657 -- 670 Maja Taseska, Emanuel A. P. Habets. Blind Source Separation of Moving Sources Using Sparsity-Based Source Detection and Tracking
671 -- 681 Liang-Chih Yu, Jin Wang, K. Robert Lai, Xue-Jie Zhang. Refining Word Embeddings Using Intensity Scores for Sentiment Analysis
682 -- 695 Yuval Dorfan, Axel Plinge, Gershon Hazan, Sharon Gannot. Distributed Expectation-Maximization Algorithm for Speaker Localization in Reverberant Environments

Volume 26, Issue 2

215 -- 230 Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Tatsuya Kawahara, Hiroshi G. Okuno. Speech Enhancement Based on Bayesian Low-Rank and Sparse Decomposition of Multichannel Magnitude Spectrograms
231 -- 242 Yu-Ping Ruan, Qian Chen, Zhen-Hua Ling. A Sequential Neural Encoder With Latent Structured Description for Modeling Sentences
243 -- 255 Amelia J. Gully, Helena Daffern, Damian T. Murphy. Diphthong Synthesis Using the Dynamic 3D Digital Waveguide Mesh
256 -- 265 Chunyang Wu, Mark J. F. Gales, Anton Ragni, Penny Karanasou, Khe Chai Sim. Improving Interpretability and Regularization in Deep Learning
266 -- 280 Kehai Chen, Tiejun Zhao, Muyun Yang, Lemao Liu, Akihiro Tamura, Rui Wang 0015, Masao Utiyama, Eiichiro Sumita. A Neural Approach to Source Dependence Based Context Model for Statistical Machine Translation
281 -- 295 Joonas Nikunen, Aleksandr Diment, Tuomas Virtanen. Separation of Moving Sound Sources Using Multichannel NMF and Acoustic Tracking
296 -- 303 Johan Sward, Hongbin Li, Andreas Jakobsson. Off-Grid Fundamental Frequency Estimation
304 -- 317 Dylan Menzies, Marcos F. Simón Gálvez, Filippo Maria Fazi. A Low-Frequency Panning Method With Compensation for Head Rotation
318 -- 329 Branimir Dropuljic, Igor Mijic, Davor Petrinovic, Tanja Jovanovic, Kresimir Cosic. Vocal Analysis of Acoustic Startle Responses
330 -- 341 Philipp Aichinger, Martin Hagmüller, Berit Schneider-Stickler, Jean Schoentgen, Franz Pernkopf. Tracking of Multiple Fundamental Frequencies in Diplophonic Voices
342 -- 356 Anastasios Alexandridis, Athanasios Mouchtaris. Multiple Sound Source Location Estimation in Wireless Acoustic Sensor Networks Using DOA Estimates: The Data-Association Problem
357 -- 366 Robert Rehr, Timo Gerkmann. On the Importance of Super-Gaussian Speech Priors for Machine-Learning Based Speech Enhancement
367 -- 378 Sonia Djaziri Larbi, Gaël Mahé, Imen Marrakchi-Mezghani, Monia Turki, Meriem Jaïdane. Watermark-Driven Acoustic Echo Cancellation
379 -- 393 Annamaria Mesaros, Toni Heittola, Emmanouil Benetos, Peter Foster, Mathieu Lagrange, Tuomas Virtanen, Mark D. Plumbley. Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge
394 -- 405 Cheng-Tao Chung, Lin-Shan Lee. Unsupervised Discovery of Structured Acoustic Tokens With Applications to Spoken Term Detection
406 -- 414 Tobias May. Robust Speech Dereverberation With a Neural Network-Based Post-Filter That Exploits Multi-Conditional Training of Binaural Cues
415 -- 421 Majid Mirbagheri, Les Atlas, Adrian K. C. Lee. Regression Factor Analysis With an Application to Continuous HRIR Measurement
422 -- 435 Jen-Tzung Chien. Bayesian Nonparametric Learning for Hierarchical and Sparse Topics
436 -- 450 Johannes Stahl, Pejman Mowlaee. A Pitch-Synchronous Simultaneous Detection-Estimation Framework for Speech Enhancement

Volume 26, Issue 12

2255 -- 2266 Xing Wang, Zhaopeng Tu, Min Zhang. Incorporating Statistical Machine Translation Word Knowledge Into Neural Machine Translation
2267 -- 2276 Yunxin Zhao, Mili Kuruvilla-Dugdale, Minguang Song. Structured Sparse Spectral Transforms and Structural Measures for Voice Conversion
2277 -- 2288 Haniyeh Salehi, David Suelzle, Paula Folkeard, Vijay Parsa. Learning-Based Reference-Free Speech Quality Measures for Hearing Aid Applications
2289 -- 2304 Gerald Enzner, Philipp Thüne. Bayesian MMSE Filtering of Noisy Speech by SNR Marginalization With Global PSD Priors
2305 -- 2318 Gongping Huang, Jingdong Chen, Jacob Benesty. Insights Into Frequency-Invariant Beamforming With Concentric Circular Microphone Arrays
2319 -- 2327 Shiqi Shen, Yun Chen, Cheng Yang, Zhiyuan Liu 0001, Maosong Sun. Zero-Shot Cross-Lingual Neural Headline Generation
2328 -- 2340 Sudeep Surendran, T. Kishore Kumar. Oblique Projection and Cepstral Subtraction in Signal Subspace Speech Enhancement for Colored Noise Reduction
2341 -- 2354 Qiang Li 0022, Derek F. Wong, Lidia S. Chao, Muhua Zhu, Tong Xiao, Jingbo Zhu, Min Zhang. Linguistic Knowledge-Aware Neural Machine Translation
2355 -- 2370 Wen Zhang 0002, Christian Hofmann, Michael Buerger, Thushara Dheemantha Abhayapala, Walter Kellermann. Spatial Noise-Field Control With Online Secondary Path Modeling: A Wave-Domain Approach
2371 -- 2380 Adrien Meynard, Bruno Torrésani. Spectral Analysis for Nonstationary Audio
2381 -- 2392 Irene Martin-Morato, Maximo Cobos, Francesc J. Ferri. Adaptive Mid-Term Representations for Robust Audio Event Classification
2393 -- 2403 Gergely Firtha, Peter Fiala, Frank Schultz, Sascha Spors. On the General Relation of Wave Field Synthesis and Spectral Division Method for Linear Arrays
2404 -- 2411 Peter Birkholz, Simon Stone, Klaus Wolf, Dirk Plettemeier. Non-Invasive Silent Phoneme Recognition Using Microwave Signals
2412 -- 2422 Wei-Wei Lin, Man-Wai Mak, Jen-Tzung Chien. Multisource I-Vectors Domain Adaptation Using Maximum Mean Discrepancy Based Autoencoders
2423 -- 2435 Mohammed Abdel-Wahab, Carlos Busso. Domain Adversarial for Acoustic Emotion Recognition
2436 -- 2446 Dalia El Badawy, Ivan Dokmanic. Direction of Arrival With One Microphone, a Few LEGOs, and Non-Negative Matrix Factorization
2447 -- 2459 Hung-yi Lee, Pei-Hung Chung, Yen-Chen Wu, Tzu-Hsiang Lin, Tsung-Hsien Wen. Interactive Spoken Content Retrieval by Deep Reinforcement Learning
2460 -- 2474 Samy Elshamy, Nilesh Madhu, Wouter Tirry, Tim Fingscheidt. DNN-Supported Speech Enhancement With Cepstral Estimation of Both Excitation and Envelope
2475 -- 2488 Yu Bao, Huawei Chen. A Chance-Constrained Programming Approach to the Design of Robust Broadband Beamformers With Microphone Mismatches
2489 Haizhou Li 0001. Farewell Editorial

Volume 26, Issue 11

1949 -- 1961 Hossein Hadian, Hossein Sameti, Daniel Povey, Sanjeev Khudanpur. Flat-Start Single-Stage Discriminatively Trained HMM-Based Models for ASR
1962 -- 1975 Fabrice Katzberg, Radoslaw Mazur, Marco Maaß, Philipp Koch, Alfred Mertins. A Compressed Sensing Framework for Dynamic Sound-Field Measurements
1976 -- 1990 Sundar Harshavardhan, Thippur V. Sreenivas, Chandra Sekhar Seelamantula. TDOA-Based Multiple Acoustic Source Localization Without Association Ambiguity
1991 -- 2001 Reza Sahraeian, Dirk Van Compernolle. Cross-Entropy Training of DNN Ensemble Acoustic Models for Low-Resource ASR
2002 -- 2014 Heinrich Dinkel, Yanmin Qian, Kai Yu 0004. Investigating Raw Wave Deep Neural Networks for End-to-End Speaker Spoofing Detection
2015 -- 2026 Jie Zhang 0042, Richard Heusdens, Richard Christian Hendriks. Rate-Distributed Spatial Filtering Based Noise Reduction in Wireless Acoustic Sensor Networks
2027 -- 2042 Michael Heck, Sakriani Sakti, Satoshi Nakamura 0001. Dirichlet Process Mixture of Mixtures Model for Unsupervised Subword Modeling
2043 -- 2055 Shuai Nie, Shan Liang, Wenju Liu, Xueliang Zhang, Jianhua Tao. Deep Learning Based Speech Separation via NMF-Style Reconstructions
2056 -- 2071 Harishchandra Dubey, Abhijeet Sangwan, John H. L. Hansen. Leveraging Frequency-Dependent Kernel and DIP-Based Clustering for Robust Speech Activity Detection in Naturalistic Audio Streams
2072 -- 2082 Youngsoo Jang, Jiyeon Ham, Byung Jun Lee, Kee-Eung Kim. Cross-Language Neural Dialog State Tracker for Large Ontologies Using Hierarchical Attention
2083 -- 2097 Gellért Weisz, Pawel Budzianowski, Pei-hao Su, Milica Gasic. Sample Efficient Deep Reinforcement Learning for Dialogue Systems With Large Action Spaces
2098 -- 2111 Shoufeng Lin. Reverberation-Robust Localization of Speakers Using Distinct Speech Onsets and Multichannel Cross Correlations
2112 -- 2121 Shamsiah Abidin, Roberto Togneri, Ferdous Ahmed Sohel. Spectrotemporal Analysis Using Local Binary Pattern Variants for Acoustic Scene Classification
2122 -- 2131 Ning Ma, José A. González 0001, Guy J. Brown. Robust Binaural Localization of a Target Sound Source by Combining Spectral Source Models and Deep Neural Networks
2132 -- 2141 Shuangzhi Wu, DongDong Zhang, Zhirui Zhang, Nan Yang 0002, Mu Li, Ming Zhou 0001. Dependency-to-Dependency Neural Machine Translation
2142 -- 2152 Jingjing Xu, Hangfeng He, Xu Sun 0001, Xuancheng Ren, Sujian Li. Cross-Domain and Semisupervised Named Entity Recognition in Chinese Social Media: A Unified Model
2153 -- 2166 Steven Van Kuyk, W. Bastiaan Kleijn, Richard Christian Hendriks. An Evaluation of Intrusive Instrumental Intelligibility Metrics
2167 -- 2179 Xi Ouyang, Kang Gu, Pan Zhou. Spatial Pyramid Pooling Mechanism in 3D Convolutional Network for Sentence-Level Classification
2180 -- 2193 Brian McFee, Justin Salamon, Juan Pablo Bello. Adaptive Pooling Operators for Weakly Labeled Sound Event Detection
2194 -- 2203 Isabel Barbancho, George Tzanetakis, Ana M. Barbancho, Lorenzo J. Tardón. Discrimination Between Ascending/Descending Pitch Arpeggios
2204 -- 2214 Younggwan Kim, Myung Jong Kim, Jahyun Goo, Hoirin Kim. Learning Self-Informed Feature Contribution for Deep Learning-Based Acoustic Modeling
2215 -- 2229 Mert Burkay Çöteli, Orhun Olgun, Hüseyin Hacihabiboglu. Multiple Sound Source Localization With Steered Response Power Density and Hierarchical Grid Refinement
2230 -- 2239 Jun-Wei Bao, Yeyun Gong, Nan Duan, Ming Zhou 0001, Tiejun Zhao. Question Generation With Doubly Adversarial Nets
2240 -- 2250 Bing Bu, Changchun Bao, Mao-shen Jia. Design of a Planar First-Order Loudspeaker Array for Global Active Noise Control

Volume 26, Issue 10

1702 -- 1726 DeLiang Wang, Jitong Chen. Supervised Speech Separation Based on Deep Learning: An Overview
1727 -- 1741 Rui Wang 0015, Masao Utiyama, Andrew M. Finch, Lemao Liu, Kehai Chen, Eiichiro Sumita. Sentence Selection and Weighting for Neural Machine Translation Domain Adaptation
1742 -- 1754 Faheem Khan, Ben P. Milner, Thomas Le Cornu. Using Visual Speech Information in Masking Methods for Audio Speaker Separation
1755 -- 1768 Xiaofei Li, Sharon Gannot, Laurent Girin, Radu Horaud. Multichannel Identification and Nonnegative Equalization for Dereverberation and Noise Reduction Based on Convolutive Transfer Function
1769 -- 1779 Lutfi Kerem Senel, Ihsan Utlu, Veysel Yücesoy, Aykut Koc, Tolga Çukur. Semantic Structure and Interpretability of Word Embeddings
1780 -- 1792 Yuma Koizumi, Kenta Niwa, Yusuke Hioka, Kazunori Kobayashi, Yoichi Haneda. DNN-Based Source Enhancement to Increase Objective Sound Quality Assessment Score
1793 -- 1808 Constantin Paleologu, Jacob Benesty, Silviu Ciochina. Linear System Identification Based on a Kronecker Product Decomposition
1809 -- 1820 Feifei Xiong, Stefan Goetze, Birger Kollmeier, Bernd T. Meyer. Exploring Auditory-Inspired Acoustic Features for Room Acoustic Parameter Estimation From Monaural Speech
1821 -- 1832 Gaël Le Lan, Delphine Charlet, Anthony Larcher, Sylvain Meignier. An Adaptive Method for Cross-Recording Speaker Diarization
1833 -- 1847 Wei Xue, Alastair H. Moore, Mike Brookes, Patrick A. Naylor. Modulation-Domain Multichannel Kalman Filtering for Speech Enhancement
1848 -- 1859 Kai Wu 0005, Vaninirappuputhenpurayil Gopalan Reju, Andy W. H. Khong. Multisource DOA Estimation in a Reverberant Environment Using a Single Acoustic Vector Sensor
1860 -- 1872 Jizhou Huang, Yaming Sun, Wei Zhang 0088, Haifeng Wang, Ting Liu. Entity Highlight Generation as Statistical and Neural Machine Translation
1873 -- 1883 Quoc Truong Do, Sakriani Sakti, Satoshi Nakamura 0001. Sequence-to-Sequence Models for Emphasis Speech Translation
1884 -- 1896 Federico Fontana, Enrico Bozzo. Explicit Fixed-Point Computation of Nonlinear Delay-Free Loop Filter Networks
1897 -- 1912 Simon Widmark. Causal IIR Audio Precompensator Filters Subject to Quadratic Constraints
1913 -- 1924 Fiete Winter, Hagen Wierstorf, Christoph Hold, Frank Kruger, Alexander Raake, Sascha Spors. Colouration in Local Wave Field Synthesis
1925 -- 1939 Asger Heidemann Andersen, Jan Mark de Haan, Zheng-Hua Tan, Jesper Jensen 0001. Nonintrusive Speech Intelligibility Prediction Using Convolutional Neural Networks

Volume 26, Issue 1

5 -- 18 Dianna Yee, A. Homayoun Kamkar-Parsi, Rainer Martin, Henning Puder. A Noise Reduction Postfilter for Binaurally Linked Single-Microphone Hearing Aids Utilizing a Nearby External Microphone
19 -- 30 Tomas Bäckström, Johannes Fischer 0002. Fast Randomization for Distributed Low-Bitrate Coding of Speech and Audio
31 -- 43 Jun Deng, Xinzhou Xu, Zixing Zhang 0001, Sascha Frühholz, Björn W. Schuller. Semisupervised Autoencoders for Speech Emotion Recognition
44 -- 56 Md. Sahidullah, Dennis Alexander Lehmann Thomsen, Rosa González Hautamäki, Tomi Kinnunen, Zheng-Hua Tan, Robert Parts, Martti Pitkänen. Robust Voice Liveness Detection and Speaker Verification Using Throat Microphones
57 -- 70 Gilles Degottex, Pierre Lanchantin, Mark J. F. Gales. A Log Domain Pulse Model for Parametric Speech Synthesis
71 -- 83 Johannes Abel, Tim Fingscheidt. Artificial Speech Bandwidth Extension Using Deep Neural Networks for Wideband Spectral Envelope Estimation
84 -- 96 Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari. Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks
97 -- 107 Kristian Timm Andersen, Marc Moonen. Robust Speech-Distortion Weighted Interframe Wiener Filters for Single-Channel Noise Reduction
108 -- 121 Chen-Yu Chiang. Cross-Dialect Adaptation Framework for Constructing Prosodic Models for Chinese Dialect Text-to-Speech Systems
122 -- 133 Bingquan Liu, Zhen Xu, Chengjie Sun, Baoxun Wang, Xiaolong Wang, Derek F. Wong, Min Zhang. Content-Oriented User Modeling for Personalized Response Ranking in Chatbots
134 -- 144 Zhiyuan Tang, Dong Wang, Yixiang Chen, Lantian Li, Andrew Abel. Phonetic Temporal Neural Model for Language Identification
145 -- 160 Soumitro Chakrabarty, Emanuel A. P. Habets. A Bayesian Approach to Informed Spatial Filtering With Robustness Against DOA Estimation Errors
161 -- 170 Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang. An Information Distillation Framework for Extractive Summarization
171 -- 183 Ma Jin, Yan Song, Ian Vince McLoughlin, Li-Rong Dai. LID-Senones and Their Statistics for Language Identification
184 -- 196 Zhehuai Chen, Jasha Droppo, Jinyu Li, Wayne Xiong. Progressive Joint Modeling in Unsupervised Single-Channel Overlapped Speech Recognition
197 -- 210 Shivesh Ranjan, John H. L. Hansen. Curriculum Learning Based Approaches for Noise Robust Speaker Recognition