IEEE Transactions on Audio, Speech & Language Processing

1741	--	1750	Jakob Abeßer, Gerald Schuller. Instrument-Centered Music Transcription of Solo Bass Guitar Recordings
1751	--	1761	Thomas Le Cornu, Ben Milner. Generating Intelligible Audio Speech From Visual Speech
1762	--	1772	Lemao Liu, Atsushi Fujita, Masao Utiyama, Andrew M. Finch, Eiichiro Sumita. Translation Quality Estimation Using Only Bilingual Corpora
1773	--	1783	Emad M. Grais, Gerard Roma, Andrew J. R. Simpson, Mark D. Plumbley. Two-Stage Single-Channel Audio Source Separation Using Deep Neural Networks
1784	--	1798	Giuliano Bernardi, Toon van Waterschoot, Jan Wouters, Marc Moonen. Adaptive Feedback Cancellation Using a Partitioned-Block Frequency-Domain Kalman Filter Approach With PEM-Based Signal Prewhitening
1799	--	1808	Vinal Patel, Jordan Cheer, Nithin V. George. Modified Phase-Scheduled-Command FxLMS Algorithm for Active Sound Profiling
1809	--	1820	Killian Janod, Mohamed Morchid, Richard Dufour, Georges Linarès, Renato de Mori. Denoised Bottleneck Features From Deep Autoencoders for Telephone Conversation Analysis
1821	--	1835	Nikolaos Stefanakis, Despoina Pavlidi, Athanasios Mouchtaris. Perpendicular Cross-Spectra Fusion for Sound Source Localization With a Planar Microphone Array
1836	--	1845	Takenori Yoshimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda. Simultaneous Optimization of Multiple Tree-Based Factor Analyzed HMM for Speech Synthesis
1846	--	1858	Eita Nakamura, Kazuyoshi Yoshii, Simon Dixon. Note Value Recognition for Piano Transcription Using Markov Random Fields

1566	--	1578	Francis Stevens, Damian T. Murphy, Lauri Savioja, Vesa Välimäki. Modeling Sparsely Reflecting Outdoor Acoustic Scenes Using the Waveguide Web
1579	--	1591	Ferdinando Olivieri, Filippo Maria Fazi, Simone Fontana, Dylan Menzies, Philip Arthur Nelson. Generation of Private Sound With a Circular Loudspeaker Array and the Weighted Pressure Matching Method
1592	--	1605	Samy Elshamy, Nilesh Madhu, Wouter Tirry, Tim Fingscheidt. Instantaneous A Priori SNR Estimation by Cepstral Excitation Manipulation
1606	--	1617	Paavo Alku, Rahim Saeidi. The Linear Predictive Modeling of Speech From Higher-Lag Autocorrelation Coefficients Applied to Noise-Robust Speaker Recognition
1618	--	1632	Cheng Pang, Hong Liu, Jie Zhang, Xiaofei Li. Binaural Sound Localization Based on Reverberation Weighting and Generalized Parametric Mapping
1633	--	1643	Somanath Pradhan, Vinal Patel, Dipen Somani, Nithin V. George. An Improved Proportionate Delayless Multiband-Structured Subband Adaptive Feedback Canceller for Digital Hearing Aids
1644	--	1656	Szymon Drgas, Tuomas Virtanen, Jörg Lücke, Antti Hurmalainen. Binary Non-Negative Matrix Deconvolution for Audio Dictionary Learning
1657	--	1667	Fatemeh Saki, Nasser Kehtarnavaz. Real-Time Unsupervised Classification of Environmental Noise Signals
1668	--	1679	Lakshmish Kaushik, Abhijeet Sangwan, John H. L. Hansen. Automatic Sentiment Detection in Naturalistic Audio
1680	--	1693	Ofer Schwartz, Sharon Gannot, Emanuel A. P. Habets. Cramér-Rao Bound Analysis of Reverberation Level Estimators for Dereverberation and Noise Reduction
1694	--	1708	Seyran Khademi, Richard C. Hendriks, W. Bastiaan Kleijn. Intelligibility Enhancement Based on Mutual Information
1709	--	1717	Yuta Hatano, Chuang Shi, Yoshinobu Kajikawa. Compensation for Nonlinear Distortion of the Frequency Modulation-Based Parametric Array Loudspeaker
1718	--	1730	Yu-Ren Chien, Daryush D. Mehta, Jón Guðnason, Matías Zanartu, Thomas F. Quatieri. Evaluation of Glottal Inverse Filtering Algorithms Using a Physiologically Based Articulatory Speech Synthesizer

1409	--	1420	Yu-An Chen, Ju-Chiang Wang, Yi-Hsuan Yang, Homer H. Chen. Component Tying for Mixture Model Adaptation in Personalization of Music Emotion Recognition
1421	--	1435	Hossein Zeinali, Hossein Sameti, Lukás Burget. HMM-Based Phrase-Independent i-Vector Extractor for Text-Dependent Speaker Verification
1436	--	1449	Xinzhou Xu, Jun Deng, Nicholas Cummins, Zixing Zhang, Chen Wu, Li Zhao, Björn W. Schuller. A Two-Dimensional Framework of Multiple Kernel Subspace Learning for Recognizing Emotion in Speech
1450	--	1461	Mandy Korpusik, James R. Glass. Spoken Language Understanding for a Nutrition Dialogue System
1462	--	1476	Mahmoud Fakhry, Piergiorgio Svaizer, Maurizio Omologo. Audio Source Separation in Reverberant Environments Using β-Divergence-Based Nonnegative Factorization
1477	--	1491	Bracha Laufer-Goldshtein, Ronen Talmon, Sharon Gannot. Semi-Supervised Source Localization on Multiple Manifolds With Distributed Microphones
1492	--	1501	Donald S. Williamson, DeLiang Wang. Time-Frequency Masking in the Complex Domain for Speech Dereverberation and Denoising
1502	--	1511	Liang Lu, Steve Renals. Small-Footprint Highway Deep Neural Networks for Speech Recognition
1512	--	1525	Ina Kodrasi, Simon Doclo. Signal-Dependent Penalty Functions for Robust Acoustic Multi-Channel Equalization
1526	--	1534	Jung Hee Kim, Jin Kim, Jae Hyeon Jeon, Sang-Won Nam. Delayless Individual-Weighting-Factors Sign Subband Adaptive Filter With Band-Dependent Variable Step-Sizes
1535	--	1546	Yannan Wang, Jun Du, Li-Rong Dai, Chin-Hui Lee. A Gender Mixture Detection Approach to Unsupervised Single-Channel Speech Separation Based on Deep Neural Networks
1547	--	1561	Giacomo Vairetti, Enzo De Sena, Michael Catrysse, Søren Holdt Jensen, Marc Moonen, Toon van Waterschoot. A Scalable Algorithm for Physically Motivated and Sparse Approximation of Room Impulse Responses With Orthonormal Basis Functions

1169	--	1171	Gaël Richard, Tuomas Virtanen, Juan Pablo Bello, Nobutaka Ono, Hervé Glotin. Introduction to the Special Section on Sound Scene and Event Analysis
1172	--	1182	Hector A. Sanchez-Hevia, David Ayllón, Roberto Gil-Pita, Manuel Rosa-Zurera. Maximum Likelihood Decision Fusion for Weapon Classification in Wireless Acoustic Sensor Networks
1183	--	1192	Nithin Rao Koluguri, G. Nisha Meenakshi, Prasanta Kumar Ghosh. Spectrogram Enhancement Using Multiple Window Savitzky-Golay (MWSG) Filter for Robust Bird Sound Detection
1193	--	1206	Dan Stowell, Emmanouil Benetos, Lisa F. Gill. On-Bird Sound Recordings: Automatic Acoustic Recognition of Activities and Contexts
1207	--	1215	Brandon T. Carroll, Bradley M. Whitaker, Wayne Daley, David V. Anderson. Outlier Learning via Augmented Frozen Dictionaries
1216	--	1229	Victor Bisot, Romain Serizel, Slim Essid, Gaël Richard. Feature Learning With Matrix Factorization Applied to Acoustic Scene Classification
1230	--	1241	Yong Xu, Qiang Huang, Wenwu Wang, Peter Foster, Siddharth Sigtia, Philip J. B. Jackson, Mark D. Plumbley. Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging
1242	--	1252	Rene Grzeszick, Axel Plinge, Gernot A. Fink. Bag-of-Features Methods for Acoustic Event Detection and Classification
1253	--	1265	Alain Rakotomamonjy. Supervised Representation Learning for Audio Scene Classification
1266	--	1277	Emmanouil Benetos, Grégoire Lafay, Mathieu Lagrange, Mark D. Plumbley. Polyphonic Sound Event Tracking Using Linear Dynamical Systems
1278	--	1290	Huy Phan, Lars Hertel, Marco Maaß, Philipp Koch, Radoslaw Mazur, Alfred Mertins. Improved Audio Scene Classification Based on Label-Tree Embeddings and Convolutional Neural Networks
1291	--	1303	Emre Çakir, Giambattista Parascandolo, Toni Heittola, Heikki Huttunen, Tuomas Virtanen. Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection
1304	--	1314	Jens Schröder, Niko Moritz, Jörn Anemüller, Stefan Goetze, Birger Kollmeier. Classifier Architectures for Acoustic Scenes and Events: Implications for DNNs, TDNNs, and Perceptual Features from DCASE 2016
1315	--	1321	Wenjun Yang, Sridhar Krishnan. Combining Temporal Features by Local Binary Pattern for Acoustic Scene Classification
1322	--	1334	David Dov, Ronen Talmon, Israel Cohen. Multimodal Kernel Method for Activity Detection of Sound Sources
1335	--	1343	Keisuke Imoto, Nobutaka Ono. Spatial Cepstrum as a Spatial Feature Using a Distributed Microphone Array for Acoustic Scene Analysis
1344	--	1356	Ivo Trowitzsch, Johannes Mohr, Youssef Kashef, Klaus Obermayer. Robust Detection of Environmental Sounds in Binaural Auditory Scenes
1357	--	1370	Abu Shafin Mohammad Mahdee Jameel, Shaikh Anowarul Fattah, Rajib Goswami, Wei-Ping Zhu, M. Omair Ahmad. Noise Robust Formant Frequency Estimation Method Based on Spectral Model of Repeated Autocorrelation of Speech
1371	--	1383	Na Li, Man-Wai Mak, Jen-Tzung Chien. DNN-Driven Mixture of PLDA for Robust Speaker Verification
1384	--	1397	Kai Wu, Vaninirappuputhenpurayil Gopalan Reju, Andy W. H. Khong, Shu Ting Goh. Swarm Intelligence Based Particle Filter for Alternating Talker Localization and Tracking Using Microphone Arrays

929	--	939	Manu Airaksinen, Tomas Bäckström, Paavo Alku. Quadratic Programming Approach to Glottal Inverse Filtering by Joint Norm-1 and Norm-2 Optimization
940	--	951	Ofer Schwartz, Sharon Gannot, Emanuel A. P. Habets. Multispeaker LCMV Beamformer and Postfilter for Source Separation and Noise Reduction
952	--	964	Dongmei Wang, Chengzhu Yu, John H. L. Hansen. Robust Harmonic Features for Classification-Based Pitch Estimation
965	--	979	Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Bo Li, Arun Narayanan, Ehsan Variani, Michiel Bacchiani, Izhak Shafran, Andrew W. Senior, Kean K. Chin, Ananya Misra, Chanwoo Kim. Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition
980	--	995	Hanieh Khalilian, Ivan V. Bajic, Rodney G. Vaughan. A Simulation Study of a Three-Dimensional Sound Field Reproduction System for Immersive Communication
996	--	1010	Andreas Franck, Wenwu Wang, Filippo Maria Fazi. Sparse ℓ1-Optimal Multiloudspeaker Panning and Its Relation to Vector Base Amplitude Panning
1011	--	1022	Songbin Li, Yizhen Jia, C. C. Jay Kuo. Steganalysis of QIM Steganography in Low-Bit-Rate Speech Signals
1023	--	1034	Naoyuki Kanda, Xugang Lu, Hisashi Kawai. Maximum-a-Posteriori-Based Decoding for End-to-End Acoustic Models
1035	--	1047	Navid Shokouhi, John H. L. Hansen. Teager-Kaiser Energy Operators for Overlapped Speech Detection
1048	--	1060	Yi-Chin Huang, Chung-Hsien Wu, Yan-You Chen, Ming-Ge Shie, Jhing-Fa Wang. Personalized Spontaneous Speech Synthesis Using a Small-Sized Unsegmented Semispontaneous Speech
1061	--	1074	Jeongsoo Park, Jaeyoung Shin, Kyogu Lee. Exploiting Continuity/Discontinuity of Basis Vectors in Spectrogram Decomposition for Harmonic-Percussive Sound Separation
1075	--	1084	Xueliang Zhang, DeLiang Wang. Deep Learning Based Binaural Speech Separation in Reverberant Environments
1085	--	1094	Masood Delfarah, DeLiang Wang. Features for Masking-Based Monaural Speech Separation in Reverberant Conditions
1095	--	1106	Feiran Yang, Gerald Enzner, Jun Yang. Statistical Convergence Analysis for Optimal Control of DFT-Domain Adaptive Echo Canceler
1107	--	1116	Takashi Nose, Yusuke Arao, Takao Kobayashi, Komei Sugiura, Yoshinori Shiga. Sentence Selection Based on Extended Entropy Using Phonetic and Prosodic Contexts for Statistical Parametric Speech Synthesis
1117	--	1127	Gergely Firtha, Peter Fiala, Frank Schultz, Sascha Spors. Improved Referencing Schemes for 2.5D Wave Field Synthesis Driving Functions
1128	--	1139	Esteban Maestre, Gary P. Scavone, Julius O. Smith. Joint Modeling of Bridge Admittance and Body Radiativity for Efficient Synthesis of String Instrument Sound by Digital Waveguides
1140	--	1153	Gongping Huang, Jacob Benesty, Jingdong Chen. On the Design of Frequency-Invariant Beampatterns With Uniform Circular Microphone Arrays
1154	--	1164	Zdenek Prusa, Péter Balázs, Peter L. Søndergaard. A Noniterative Method for Reconstruction of Phase From STFT Magnitude

692	--	730	Sharon Gannot, Emmanuel Vincent, Shmulik Markovich Golan, Alexey Ozerov. A Consolidated Perspective on Multimicrophone Speech Enhancement and Source Separation
731	--	744	Dongwen Ying, Ruohua Zhou, Junfeng Li, Yonghong Yan. Window-Dominant Signal Subspace Methods for Multiple Short-Term Speech Source Localization
745	--	755	Sean U. N. Wood, Jean Rouat, Stéphane Dupont, Gueorgui Pironkov. Blind Speech Separation and Enhancement With GCC-NMF
756	--	767	Constantin Spille, Birger Kollmeier, Bernd T. Meyer. Combining Binaural and Cortical Features for Robust Speech Recognition
768	--	779	Yuma Koizumi, Kenta Niwa, Yusuke Hioka, Kazunori Kobayashi, Hitoshi Ohmuro. Informative Acoustic Feature Selection to Maximize Mutual Information for Collecting Target Sources
780	--	793	Takuya Higuchi, Nobutaka Ito, Shoko Araki, Takuya Yoshioka, Marc Delcroix, Tomohiro Nakatani. Online MVDR Beamformer Based on Complex Gaussian Mixture Model With Spatial Prior for Noise Robust ASR
794	--	806	Eita Nakamura, Kazuyoshi Yoshii, Shigeki Sagayama. Rhythm Transcription of Polyphonic Piano Music Based on Merged-Output HMM for Multiple Voices
807	--	817	Omid Ghahabi, Javier Hernando. Deep Learning Backend for Single and Multisession i-Vector Speaker Recognition
818	--	828	Penny Karanasou, Chunyang Wu, Mark J. F. Gales, Philip C. Woodland. I-Vectors and Structured Neural Networks for Rapid Adaptation of Acoustic Models
829	--	838	G. Aneeja, B. Yegnanarayana. Extraction of Fundamental Frequency From Degraded Speech Using Temporal Envelopes at High SNR Frequencies
839	--	851	Seyyed Saeed Sarfjoo, Cenk Demiroglu, Simon King. Using Eigenvoices and Nearest-Neighbors in HMM-Based Cross-Lingual Speaker Adaptation With Limited Data
852	--	862	Yung-Yue Chen, Jia-Hao Zhang. Background Noise Reduction Design for Dual Microphone Cellular Phones: Robust Approach
863	--	870	Liner Yang, Xinxiong Chen, Zhiyuan Liu, Maosong Sun. Improving Word Representations with Document Labels
871	--	884	Shiliang Zhang, Cong Liu, Hui Jiang, Si Wei, Li-Rong Dai, Yu Hu. Nonrecurrent Neural Structure for Long-Term Dependence
885	--	894	Xuefeng Yang, Kezhi Mao. Task Independent Fine Tuning for Word Embeddings
895	--	907	Huawei Chen. Design of Robust Broadband Beamformers Using Worst-Case Performance Optimization: A Semidefinite Programming Approach
908	--	919	Sandro Cumani, Pietro Laface. Nonlinear I-Vector Transformations for PLDA-Based Speaker Recognition

457	--	468	Qi He, Feng Bao, Changchun Bao. Multiplicative Update of Auto-Regressive Gains for Codebook-Based Speech Enhancement
469	--	480	Zhongqing Wang, Sophia Yat Mei Lee, Shoushan Li, Guodong Zhou. Emotion Analysis in Code-Switching Text With Joint Factor Graph Model
481	--	492	Ashwin Bellur, Mounya Elhilali. Feedback-Driven Sensory Mapping Adaptation for Robust Speech Activity Detection
493	--	504	Zhiyuan Tang, Lantian Li, Dong Wang, Ravichander Vipperla. Collaborative Joint Training With Multitask Recurrent Model for Speech and Speaker Recognition
505	--	518	Bidisha Sharma, S. R. Mahadeva Prasanna. Sonority Measurement Using System, Source, and Suprasegmental Information
519	--	530	Hung-yi Lee, Bo-Hsiang Tseng, Tsung-Hsien Wen, Yu Tsao. Personalizing Recurrent-Neural-Network-Based Language Model by Social Network
531	--	543	Ji Ming, Danny Crookes. Speech Enhancement Based on Full-Sentence Correlation and Clean Speech Recognition
544	--	556	Quoc Truong Do, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura. Preserving Word-Level Emphasis in Speech-to-Speech Translation
557	--	571	Zhenghua Li, Jiayuan Chao, Min Zhang, Wenliang Chen, Meishan Zhang, Guohong Fu. Coupled POS Tagging on Heterogeneous Annotations
572	--	587	Clement S. J. Doire, Mike Brookes, Patrick A. Naylor, Christopher M. Hicks, Dave Betts, Mohammad A. Dmour, Søren Holdt Jensen. Single-Channel Online Enhancement of Speech Corrupted by Reverberation and Noise
588	--	597	Aleksandr Sizov, Kong-Aik Lee, Tomi Kinnunen. Direct Optimization of the Detection Cost for I-Vector-Based Spoken Language Recognition
598	--	610	Imran A. Sheikh, Dominique Fohr, Irina Illina, Georges Linarès. Modelling Semantic Context of OOV Words in Large Vocabulary Continuous Speech Recognition
611	--	623	Mojtaba Farmani, Michael Syskind Pedersen, Zheng-Hua Tan, Jesper Jensen. Informed Sound Source Localization Using Relative Transfer Functions for Hearing Aid Applications
624	--	636	Vikram C. M., S. R. Mahadeva Prasanna. Epoch Extraction From Telephone Quality Speech Using Single Pole Filter
637	--	650	Motoi Omachi, Tetsuji Ogawa, Tetsunori Kobayashi. Associative Memory Model-Based Linear Filtering and Its Application to Tandem Connectionist Blind Source Separation
651	--	661	Dani Cherkassky, Sharon Gannot. Blind Synchronization in Wireless Acoustic Sensor Networks
662	--	673	Laurent Girin, Thomas Hueber, Xavier Alameda-Pineda. Extending the Cascaded Gaussian Mixture Regression Framework for Cross-Speaker Acoustic-Articulatory Mapping
674	--	686	Mohamad Hasan Bahari, Alexander Bertrand, Marc Moonen. Blind Sampling Rate Offset Estimation for Wireless Acoustic Sensor Networks Through Weighted Least-Squares Coherence Drift Estimation
687	--	687	Adam Kuklasinski, Simon Doclo, Søren Holdt Jensen, Jesper Rindom Jensen. Correction to "Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise"

226	--	237	Hanchi Chen, Thushara Dheemantha Abhayapala, Prasanga N. Samarasinghe, Wen Zhang. Direct-to-Reverberant Energy Ratio Estimation Using a First-Order Microphone
238	--	247	Peter Bell, Pawel Swietojanski, Steve Renals. Multitask Learning of Context-Dependent Targets in Deep Neural Network Acoustic Models
248	--	260	Rui Zhao, Kezhi Mao. Topic-Aware Deep Compositional Models for Sentence Classification
261	--	272	Dalia El Badawy, Ngoc Q. K. Duong, Alexey Ozerov. On-the-Fly Audio Source Separation - A Novel User-Friendly Framework
273	--	284	Filip Elvander, Johan Sward, Andreas Jakobsson. Online Estimation of Multiple Harmonic Signals
285	--	295	Vincent Renkens, Hugo Van Hamme. Weakly Supervised Learning of Hidden Markov Models for Spoken Language Acquisition
296	--	309	Luca Remaggi, Philip J. B. Jackson, Philip Coleman, Wenwu Wang. Acoustic Reflector Localization: Novel Image Source Reversion and Direct Localization Methods
310	--	319	Prasanga N. Samarasinghe, Thushara D. Abhayapala, Hanchi Chen. Estimating the Direct-to-Reverberant Energy Ratio Using a Spherical Harmonics-Based Spatial Correlation Model
320	--	332	Shmulik Markovich Golan, Sharon Gannot, Walter Kellermann. Combined LCMV-TRINICON Beamforming for Separating Multiple Speech Sources in Noisy and Reverberant Environments
333	--	343	Shakeel Ahmed, Muhammad Tahir Akhtar. Gain Scheduling of Auxiliary Noise and Variable Step-Size for Online Acoustic Feedback Cancellation in Narrow-Band Active Noise Control Systems
344	--	358	Gabriel Sargent, Frédéric Bimbot, Emmanuel Vincent. Estimating the Structural Segmentation of Popular Music Pieces Under Regularity Constraints
359	--	373	Jordan Cheer, Stephen Daley. An Investigation of Delayless Subband Adaptive Filtering for Multi-Input Multi-Output Active Noise Control Applications
374	--	383	Sebastian J. Schlecht, Emanuel A. P. Habets. Feedback Delay Networks: Echo Density and Mixing Time
384	--	396	Johannes Abel, Magdalena Kaniewska, Cyril Guillaume, Wouter Tirry, Tim Fingscheidt. An Instrumental Quality Measure for Artificially Bandwidth-Extended Speech Signals
397	--	408	Robert Rehr, Timo Gerkmann. An Analysis of Adaptive Recursive Smoothing with Applications to Noise PSD Estimation
409	--	419	Emilio Granell, Carlos D. Martínez-Hinarejos. Multimodal Crowdsourcing for Transcribing Handwritten Documents
420	--	434	Yaping Ma, Yegui Xiao. A New Strategy for Online Secondary-Path Modeling of Narrowband Active Noise Control
435	--	447	Jose A. Belloch, Alberto González, Enrique S. Quintana-Ortí, Miguel Ferrer, Vesa Välimäki. GPU-Based Dynamic Wave Field Synthesis Using Fractional Delay Filters and Room Compensation

2254	--	2256	Tanja Schultz, Thomas Hueber, Dean J. Krusienski, J. S. Brumberg. Introduction to the Special Issue on Biosignal-Based Spoken Communication
2257	--	2271	Tanja Schultz, Michael Wand, Thomas Hueber, Dean J. Krusienski, Christian Herff, Jonathan S. Brumberg. Biosignal-Based Spoken Communication: A Survey
2272	--	2280	Christopher Dromey, Katherine M. Black. Effects of Laryngeal Activity on Articulation
2281	--	2291	Michal Borsky, Daryush D. Mehta, Jarrad H. Van Stan, Jón Guðnason. Modal and Nonmodal Voice Quality Classification Using Acoustic and Electroglottographic Features
2292	--	2300	Alborz Rezazadeh Sereshkeh, Robert Trott, Aurélien Bricout, Tom Chau. EEG Classification of Covert Speech Using Regularized Neural Networks
2301	--	2312	Reza Sahraeian, Dirk Van Compernolle. Crosslingual and Multilingual Speech Recognition Based on the Speech Manifold
2313	--	2322	Dorde T. Grozdic, Slobodan T. Jovicic. Whispered Speech Recognition Using Deep Denoising Autoencoder and Inverse Filtering
2323	--	2336	Myung Jong Kim, Beiming Cao, Ted Mau, Jun Wang. Speaker-Independent Silent Speech Recognition From Flesh-Point Articulatory Movements Using an LSTM Neural Network
2337	--	2350	Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Toda. Articulatory Controllable Speech Modification Based on Statistical Inversion and Production Mappings
2351	--	2361	Ingmar Steiner, Sébastien Le Maguer, Alexander Hewer. Synthesis of Tongue Motion and Acoustics From Text Using a Multimodal Articulatory Database
2362	--	2374	José A. González, Lam Aun Cheah, Angel M. Gomez, Phil D. Green, James M. Gilbert, Stephen R. Ell, Roger K. Moore, Ed Holdsworth. Direct Speech Reconstruction From Articulatory Sensor Data by Machine Learning
2375	--	2385	Matthias Janke, Lorenz Diener. EMG-to-Speech: Direct Generation of Speech From Facial Electromyographic Signals
2386	--	2398	Geoffrey S. Meltzner, James T. Heaton, Yunbin Deng, Gianluca De Luca, Serge H. Roy, Joshua C. Kline. Silent Speech Recognition as an Alternative Communication Device for Persons With Laryngectomy
2399	--	2409	Fei Chen, Lan Wang, Hui Chen, Gang Peng. Investigations on Mandarin Aspiratory Animations Using an Airflow Model
2410	--	2423	Wayne Xiong, Jasha Droppo, Xuedong Huang, Frank Seide, Michael L. Seltzer, Andreas Stolcke, Dong Yu, Geoffrey Zweig. Toward Human Parity in Conversational Speech Recognition
2424	--	2432	Biao Zhang, Deyi Xiong, Jinsong Su, Hong Duan. A Context-Aware Recurrent Encoder for Neural Machine Translation
2433	--	2443	Afsaneh Asaei, Milos Cernak, Hervé Bourlard. Perceptual Information Loss due to Impaired Speech Production
2444	--	2453	Ning Ma, Tobias May, Guy J. Brown. Exploiting Deep Neural Networks and Head Movements for Robust Binaural Localization of Multiple Sources in Reverberant Environments

2045	--	2058	Qinghua Huang, Lin Zhang, Yong Fang. Two-Stage Decoupled DOA Estimation Based on Real Spherical Harmonics for Spherical Arrays
2059	--	2070	Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Takaaki Hori, Jonathan Le Roux, Kazuya Takeda. Duration-Controlled LSTM for Polyphonic Sound Event Detection
2071	--	2084	Monisankha Pal, Goutam Saha. Spectral Mapping Using Prior Re-Estimation of i-Vectors and System Fusion for Voice Conversion
2085	--	2097	Seppo Enarvi, Peter Smit, Sami Virpioja, Mikko Kurimo. Automatic Speech Recognition With Very Large Conversational Finnish and Estonian Vocabularies
2098	--	2111	Hannah Muckenhirn, Pavel Korshunov, Mathew Magimai-Doss, Sébastien Marcel. Long-Term Spectral Statistics for Voice Presentation Attack Detection
2112	--	2124	Brian Hamilton, Stefan Bilbao. FDTD Methods for 3-D Room Acoustics Simulation With High-Order Accuracy in Space and Time
2125	--	2137	Pejman Mowlaee, Martin Blass, W. Bastiaan Kleijn. New Results in Modulation-Domain Single-Channel Speech Enhancement
2138	--	2151	Dylan Menzies, Filippo Maria Fazi. Decoding and Compression of Channel and Scene Objects for Spatial Audio
2152	--	2161	Eunwoo Song, Frank K. Soong, Hong-Goo Kang. Effective Spectral and Excitation Modeling Techniques for LSTM-RNN-Based Speech Synthesis Systems
2162	--	2175	Pulkit Sharma, Vinayak Abrol, Anil Kumar Sao. Deep-Sparse-Representation-Based Features for Speech Recognition
2176	--	2187	Iynkaran Natgunanathan, Yong Xiang, Guang Hua, Gleb Beliakov, John Yearwood. Patchwork-Based Multilayer Audio Watermarking
2188	--	2198	Chengzhu Yu, John H. L. Hansen. Active Learning Based Constrained Clustering For Speaker Diarization
2199	--	2208	Emil Solsbæk Ottosen, Monika Dörfler. A Phase Vocoder Based on Nonstationary Gabor Frames
2209	--	2222	Boaz Schwartz, Sharon Gannot, Emanuel A. P. Habets. Two Model-Based EM Algorithms for Blind Source Separation in Noisy Environments
2223	--	2236	Maja Taseska, Emanuel A. P. Habets. Nonstationary Noise PSD Matrix Estimation for Multichannel Blind Speech Extraction
2237	--	2250	Bruno Di Giorgi, Simon Dixon, Massimiliano Zanoni, Augusto Sarti. A Data-Driven Model of Tonal Chord Sequence Complexity
2251	--	2251	Nikolaos Stefanakis, Despoina Pavlidi, Athanasios Mouchtaris. Corrections to "Perpendicular Cross-Spectra Fusion for Sound Source Localization With a Planar Microphone Array"

1863	--	1876	Xiaohai Tian, Siu Wa Lee, Zhizheng Wu, Eng Siong Chng, Haizhou Li. An Exemplar-Based Approach to Frequency Warping for Voice Conversion
1877	--	1889	Siying Wang, Sebastian Ewert, Simon Dixon. Identifying Missing and Extra Notes in Piano Recordings Using Score-Informed Dictionary Learning
1890	--	1900	Sandro Cumani, Pietro Laface. Joint Estimation of PLDA and Nonlinear Transformations of Speaker Vectors
1901	--	1913	Morten Kolbaek, Dong Yu, Zheng-Hua Tan, Jesper Jensen. Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks
1914	--	1928	Cheng-Tao Chung, Cheng-Yu Tsai, Chia-Hsiang Liu, Lin-Shan Lee. Unsupervised Iterative Deep Learning of Speech Features and Acoustic Tokens with Applications to Spoken Term Detection
1929	--	1941	Niccolo Antonello, Enzo De Sena, Marc Moonen, Patrick A. Naylor, Toon van Waterschoot. Room Impulse Response Interpolation Using a Sparse Spatio-Temporal Representation of the Sound Field
1942	--	1955	Yanmin Qian, Nanxin Chen, Heinrich Dinkel, Zhizheng Wu. Deep Feature Engineering for Noise Robust Spoofing Detection
1956	--	1968	Sina Hafezi, Alastair H. Moore, Patrick A. Naylor. Augmented Intensity Vectors for Direction of Arrival Estimation in the Spherical Harmonic Domain
1969	--	1984	Byeongho Jo, Jung-Woo Choi. Spherical Harmonic Smoothing for Localizing Coherent Sound Sources
1985	--	1996	Emma Jokinen, Ulpu Remes, Paavo Alku. Intelligibility Enhancement of Telephone Speech Using Gaussian Process Regression for Normal-to-Lombard Spectral Tilt Conversion
1997	--	2012	Xiaofei Li, Laurent Girin, Radu Horaud, Sharon Gannot. Multiple-Speaker Localization Based on Direct-Path Features and Likelihood Maximization With Spatial Sparsity Regularization
2013	--	2023	Marc Arnela, Oriol Guasch. Finite Element Synthesis of Diphthongs Using Tuned Two-Dimensional Vocal Tracts
2024	--	2035	Deepak Baby, Hugo Van Hamme. Joint Denoising and Dereverberation Using Exemplar-Based Sparse Representations and Decaying Norm Constraint

1	--	14	Jin Chu Wu, Alvin F. Martin, Craig S. Greenberg, Raghu N. Kacker. The Impact of Data Dependence on Speaker Recognition Evaluation
15	--	30	Hélène Papadopoulos, George Tzanetakis. Models for Music Analysis From a Markov Logic Networks Perspective
31	--	45	Ahmed Al-Tmeme, Wai Lok Woo, Satnam Singh Dlay, Bin Gao. Underdetermined Convolutive Source Separation Using GEM-MU With Variational Approximated Optimum Model Order NMF2D
46	--	59	Mark A. Hasegawa-Johnson, Preethi Jyothi, Daniel McCloy, Majid Mirbagheri, Giovanni M. di Liberto, Amit Das, Bradley Ekin, Chunxi Liu, Vimal Manohar, Hao Tang, Edmund C. Lalor, Nancy F. Chen, Paul Hager, Tyler Kekona, Rose Sloan, Adrian K. C. Lee. ASR for Under-Resourced Languages From Probabilistic Transcription
60	--	71	Zhen Huang, Sabato Marco Siniscalchi, Chin-Hui Lee. Bayesian Unsupervised Batch and Online Speaker Adaptation of Activation Function Parameters in Deep Models for Automatic Speech Recognition
72	--	85	Simon Durand, Juan Pablo Bello, Bertrand David, Gaël Richard. Robust Downbeat Tracking Using an Ensemble of Convolutional Networks
86	--	97	Zhehuai Chen, Yimeng Zhuang, Yanmin Qian, Kai Yu. Phone Synchronous Speech Recognition With CTC Lattices
98	--	107	Bo Wu, Kehuang Li, Minglei Yang, Chin-Hui Lee. A Reverberation-Time-Aware Approach to Speech Dereverberation Based on Deep Neural Networks
108	--	119	Hongjie Chen, Lei Xie, Cheung Chi Leung, Xiaoming Lu, Bin Ma, Haizhou Li. Modeling Latent Topics and Temporal Distance for Story Segmentation of Broadcast News
120	--	132	Hua Xing, John H. L. Hansen. Single Sideband Frequency Offset Estimation and Correction for Quality Enhancement and Speaker Recognition
133	--	148	Andreas I. Koutrouvelis, Richard Christian Hendriks, Richard Heusdens, Jesper Jensen. Relaxed Binaural LCMV Beamforming
149	--	163	Morten Kolbæk, Zheng-Hua Tan, Jesper Jensen. Speech Intelligibility Potential of General and Specialized Deep Neural Network Based Speech Enhancement Systems
168	--	177	Jakob Abeßer, Klaus Frieler, Estefanía Cano, Martin Pfleiderer, Wolf-Georg Zaddach. Score-Informed Analysis of Tuning, Intonation, Pitch Modulation, and Dynamics in Jazz Solos
178	--	192	Alastair H. Moore, Christine Evers, Patrick A. Naylor. Direction of Arrival Estimation in the Spherical Harmonic Domain Using Subspace Pseudointensity Vectors
193	--	207	Kun Li, Xiaojun Qian, Helen M. Meng. Mispronunciation Detection and Diagnosis in L2 English Speech Using Multidistribution Deep Neural Networks
208	--	221	Yoonchang Han, Jaehun Kim, Kyogu Lee. Deep Convolutional Neural Networks for Predominant Instrument Recognition in Polyphonic Music

External Links

Journal: IEEE Transactions on Audio, Speech & Language Processing

Volume 25, Issue 9

Volume 25, Issue 8

Volume 25, Issue 7

Volume 25, Issue 6

Volume 25, Issue 5

Volume 25, Issue 4

Volume 25, Issue 3

Volume 25, Issue 2

Volume 25, Issue 12

Volume 25, Issue 11

Volume 25, Issue 10

Volume 25, Issue 1
