Journal: IEEE Transactions on Audio, Speech & Language Processing

Volume 15, Issue 8

2177 -- 2189Javier Ramírez, José C. Segura, Juan Manuel Górriz, L. Garcia. Improved Voice Activity Detection Using Contextual Multiple Hypothesis Testing for Robust Speech Recognition
2190 -- 2201Dagen Wang, S. S. Narayanan. Robust Speech Rate Estimation for Spontaneous Speech
2202 -- 2212Seung Seop Park, Nam Soo Kim. On Using Multiple Models for Automatic Speech Segmentation
2213 -- 2221Robert I. Damper, Tasanawan Soonklang. Subjective Evaluation of Techniques for Proper Name Pronunciation
2222 -- 2235Tomoki Toda, Alan W. Black, Keiichi Tokuda. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory
2236 -- 2248Te Li, Susanto Rahardja, Rongshan Yu, Soo Ngee Koh. On Integer MDCT for Perceptual Audio Coding
2249 -- 2256Enrique Alexandre, Lucas Cuadra, M. Rosa, Francisco López-Ferreras. Feature Selection for Sound Classification in Hearing Aids Through Restricted Search Driven by Genetic Algorithms
2257 -- 2269Hari Krishna Maganti, Daniel Gatica-Perez, Iain McCowan. Speech Enhancement and Recognition in Meetings With an Audio-Visual Sensor Array
2270 -- 2277Xiangyang Wang, Wei Qi, Panpan Niu. A New Adaptive Digital Audio Watermarking Based on Support Vector Regression
2278 -- 2286L. S. Smith, S. Collins. Determining ITDs Using Two Microphones on a Flat Panel During Onset Intervals With a Biologically Inspired Spike-Based Technique
2287 -- 2298H. I. K. Rao, V. J. Mathews, Young-Cheol Park. A Minimax Approach for the Joint Design of Acoustic Crosstalk Cancellation Filters
2299 -- 2310Mohammad H. Radfar, Richard M. Dansereau. Single-Channel Speech Separation Using Soft Mask Filtering
2311 -- 2330Jingyi Zhang, Wai Lok Woo, Satnam Singh Dlay. Blind Source Separation of Postnonlinear Convolutive Mixture
2331 -- 2347C. Busso, S. S. Narayanan. Interrelation Between Speech and Facial Gestures in Emotional Utterances: A Single Subject Study
2348 -- 2359A. Abramson, I. Cohen. Simultaneous Detection and Estimation Approach for Speech Enhancement
2360 -- 2372Zohra Yermeche, Nedelko Grbic, Ingvar Claesson. Blind Subband Beamforming With Time-Delay Constraints for Moving Source Speech Enhancement
2373 -- 2382Joseph Keshet, Shai Shalev-Shwartz, Yoram Singer, D. Chazan. A Large Margin Algorithm for Speech-to-Phoneme and Music-to-Score Alignment
2383 -- 2392Xinwei Li, Hui Jiang. Solving Large-Margin Hidden Markov Model Estimation via Semidefinite Programming
2393 -- 2404Jinyu Li, Ming Yuan, Chin-Hui Lee. Approximate Test Risk Bound Minimization Through Soft Margin Estimation
2405 -- 2417Mohamed Afify, Xinwei Li, Hui Jiang. Statistical Analysis of Minimum Classification Error Learning for Gaussian and Hidden Markov Model Classifiers
2418 -- 2430S. Umesh, Rohit Sinha. A Study of Filter Bank Smoothing in MFCC Features for Recognition of Children's Speech
2431 -- 2443Haitian Xu, Paul Dalsgaard, Zheng-Hua Tan, Børge Lindberg. Noise Condition-Dependent Training Based on Noise Classification and SNR Estimation
2444 -- 2453Rongqing Huang, John H. L. Hansen. Unsupervised Discriminative Training With Application to Dialect Classification
2454 -- 2464Shizhen Wang, Xiaodong Cui, Abeer Alwan. Speaker Adaptation With Limited Data Using Regression-Tree-Based Spectral Peak Alignment
2465 -- 2475J. Louradour, K. Daoudi, F. Bach. Feature Space Mahalanobis Sequence Kernels: Application to SVM Speaker Verification
2476 -- 2484Minho Jin, F. K. Soong, Chang D. Yoo. A Syllable Lattice Approach to Speaker Verification
2485 -- 2495M. Chibani, R. Lefebvre, P. Gournay. Fast Recovery for a CELP-Like Speech Codec After a Frame Erasure
2496 -- 2509B. Geiser, Peter Jax, Peter Vary, H. Taddei, S. Schandl, M. Gartner, C. Guillaume, S. Ragot. Bandwidth Extension for Hierarchical Speech and Audio Coding in ITU-T Rec. G.729.1
2510 -- 2526Jacek Dmochowski, Jacob Benesty, Sofiène Affes. A Generalized Steered Response Power Method for Computationally Viable Source Localization
2527 -- 2541Ken'ichi Kumatani, Tobias Gehrig, Uwe Mayer, Emilian Stoimenov, John W. McDonough, Matthias Wölfel. Adaptive Beamforming With a Minimum Mutual Information Criterion
2542 -- 2550K. C. Ho, Ming Sun. An Accurate Algebraic Closed-Form Solution for Energy-Based Source Localization
2551 -- 2560Chien-Lin Huang, Chung-Hsien Wu. Spoken Document Retrieval Using Multilevel Knowledge and Semantic Verification
2561 -- 2565Toon van Waterschoot, Marc Moonen. A Pole-Zero Placement Technique for Designing Second-Order IIR Parametric Equalizer Filters

Volume 15, Issue 7

1951 -- 1959Mark A. Przybocki, Alvin F. Martin, A. N. Le. NIST Speaker Recognition Evaluations Utilizing the Mixer Corpora - 2004, 2005, 2006
1960 -- 1968B. G. B. Fauve, D. Matrouf, N. Scheffer, Jean-François Bonastre, John S. D. Mason. State-of-the-Art Performance in Text-Independent Speaker Verification Through Open-Source Software
1969 -- 1978F. Castaldo, D. Colibro, E. Dalmasso, Pietro Laface, C. Vair. Compensation of Nuisance Factors for Speaker and Language Recognition
1979 -- 1986Lukas Burget, Pavel Matejka, Petr Schwarz, Ondrej Glembek, Jan Cernocký. Analysis of Feature Extraction and Channel Compensation in a GMM Speaker Recognition System
1987 -- 1998Andreas Stolcke, Sachin S. Kajarekar, Luciana Ferrer, E. Shriberg. Speaker Recognition With Session Variability Normalization Based on MLLR Adaptation Transforms
1999 -- 2010Shou-Chun Yin, R. Rose, Patrick Kenny. A Joint Factor Analysis Approach to Progressive Model Adaptation in Text-Independent Speaker Verification
2011 -- 2022Xavier Anguera, Chuck Wooters, Javier Hernando. Acoustic Beamforming for Speaker Diarization of Meetings
2023 -- 2032Qin Jin, Tanja Schultz, Alex Waibel. Far-Field Speaker Recognition
2033 -- 2043Hagai Aronowitz, David Burshtein. Efficient Speaker Recognition Using Approximated Cross Entropy (ACE)
2044 -- 2052V. Prakash, John H. L. Hansen. In-Set/Out-of-Set Speaker Recognition Under Sparse Enrollment
2053 -- 2062Bin Ma, Haizhou Li, Rong Tong. Spoken Language Recognition Using Ensemble Classifiers
2063 -- 2071Yosef A. Solewicz, Moshe Koppel. Using Post-Classifiers to Enhance Fusion of Low- and High-Level Speaker Recognition
2072 -- 2084N. Brummer, Lukas Burget, Jan Cernocký, Ondrej Glembek, Frantisek Grézl, Martin Karafiát, D. A. van Leeuwen, Pavel Matejka, Petr Schwarz, A. Strasheim. Fusion of Heterogeneous Speaker Recognition Systems in the STBU Submission for the NIST Speaker Recognition Evaluation 2006
2085 -- 2094William M. Campbell, Joseph P. Campbell, T. P. Gleason, Douglas A. Reynolds, Wade Shen. Speaker Verification Using Support Vector Machines and High-Level Features
2095 -- 2103N. Dehak, Pierre Dumouchel, Patrick Kenny. Modeling Prosodic Features With Joint Factor Analysis for Speaker Verification
2104 -- 2115Joaquin Gonzalez-Rodriguez, P. Rose, Daniel Ramos, Doroteo Torre Toledano, Javier Ortega-Garcia. Emulating DNA: Rigorous Quantification of Evidential Weight in Transparent and Testable Forensic Speaker Recognition
2116 -- 2129J. D. Williams, S. Young. Scaling POMDPs for Spoken Dialog Management
2130 -- 2140S. Srinivasan, DeLiang Wang. Transforming Binary Uncertainties for Robust Speech Recognition
2141 -- 2150J. Usher, Jacob Benesty. Enhancement of Spatial Sound Quality: A New Reverberation-Extraction Audio Upmixer
2151 -- 2159Cheng-Yuan Lin, Jyh-Shing Roger Jang. Automatic Phonetic Segmentation by Score Predictive Model for the Corpora of Mandarin Singing Voices
2160 -- 2168Rusheng Hu, Yunxin Zhao. Knowledge-Based Adaptive Decision Tree State Tying for Conversational Speech Recognition

Volume 15, Issue 6

1741 -- 1752Jan S. Erkelens, Richard C. Hendriks, Richard Heusdens, Jesper Jensen. Minimum Mean-Square Error Estimation of Discrete Fourier Coefficients With Generalized Gamma Priors
1753 -- 1765Chang Huai You, Susanto Rahardja, Soo Ngee Koh. Audible Noise Reduction in Eigendomain for Speech Enhancement
1766 -- 1776A. M. Reddy, B. Raj. Soft Mask Methods for Single-Channel Speaker Separation
1777 -- 1790Ann Spriet, Geert Rombouts, Marc Moonen, Jan Wouters. Combined Feedback and Noise Suppression in Hearing Aids
1791 -- 1801Marc Delcroix, Takafumi Hikichi, Masato Miyoshi. Dereverberation and Denoising Using Multichannel Linear Prediction
1802 -- 1817Woojay Jeon, Biing-Hwang Juang. Speech Analysis in a Model of the Central Auditory System
1818 -- 1832Nikolaos Mitianoudis, Tania Stathaki. Batch and Online Underdetermined Source Separation Using Laplacian Mixture Models
1833 -- 1841Maurizio Mancini, Roberto Bresin, Catherine Pelachaud. A Virtual Head Driven by Music Expressivity
1842 -- 1849Shantanu Chakrabartty, Yunbin Deng, Gert Cauwenberghs. Robust Speech Feature Extraction by Growth Transformation in Reproducing Kernel Hilbert Space
1850 -- 1858Bertrand Mesot, David Barber. Switching Linear Dynamical Systems for Noise Robust Speech Recognition
1859 -- 1869Amit S. Malegaonkar, Aladdin M. Ariyaeeinia, P. Sivakumaran. Efficient Speaker Change Detection Using Adapted Gaussian Mixture Models
1870 -- 1883Yuan-Fu Liao, Zi-He Chen, Yau-Tarng Juang. Latent Prosody Analysis for Robust Speaker Identification
1884 -- 1892Wai Nang Chan, Nengheng Zheng, Tan Lee. Discrimination Power of Vocal Source and Vocal Tract Related Features for Speaker Segmentation
1893 -- 1903Wei Wu, Thomas Fang Zheng, Ming-Xing Xu, Frank K. Soong. A Cohort-Based Speaker Model Synthesis for Mismatched Channels in Speaker Verification
1904 -- 1911Jean-Luc Rouas. Automatic Prosodic Variations Modeling for Language and Dialect Discrimination
1912 -- 1921P. Taraba. Kneser-Ney Smoothing With a Correcting Transformation for Small Data Sets
1922 -- 1931Darko Kirovski, Fabien A. P. Petitcolas, Zeph Landau. The Replacement Attack
1932 -- 1943Kai Yu, M. J. F. Gales. Bayesian Adaptive Inference and Adaptive Training

Volume 15, Issue 5

1511 -- 1520Scott C. Douglas, Malay Gupta, Hiroshi Sawada, Shoji Makino. Spatio-Temporal FastICA Algorithms for the Blind Separation of Convolutive Mixtures
1521 -- 1528Intae Lee, Te-Won Lee. On the Assumption of Spherical Symmetry and Sparseness for the Frequency-Domain Speech Model
1529 -- 1539E. Warsitz, M. R. Haeb-Umbach. Blind Acoustic Beamforming Based on Generalized Eigenvalue Decomposition
1540 -- 1550Abdeldjalil Aïssa-El-Bey, Karim Abed-Meraim, Yves Grenier. Blind Separation of Underdetermined Convolutive Mixtures Using Their Time-Frequency Representation
1551 -- 1563Zhaoshui He, Shengli Xie, Shuxue Ding, Andrzej Cichocki. Convolutive Blind Source Separation in the Frequency Domain Based on Sparse Representation
1564 -- 1578Alexey Ozerov, P. Philippe, Frédéric Bimbot, Rémi Gribonval. Adaptation of Bayesian Models for Single-Channel Source Separation and its Application to Voice/Music Separation in Popular Songs
1579 -- 1591Ken'ichi Furuya, Akitoshi Kataoka. Robust Speech Dereverberation Using Multichannel Blind Deconvolution With Spectral Subtraction
1592 -- 1604Hiroshi Sawada, Shoko Araki, Ryo Mukai, Shoji Makino. Grouping Separated Frequency Components by Estimating Propagation Model Parameters in Frequency-Domain Blind Source Separation
1605 -- 1616Oscal T.-C. Chen, Chia-Hsiung Liu. Content-Dependent Watermarking Scheme in Compressed Speech With Identifying Manner and Location of Attacks
1617 -- 1624Vesa Siivola, Teemu Hirsimäki, Sami Virpioja. On Growing and Pruning Kneser-Ney Smoothed N-Gram Models
1625 -- 1634M. Lagrange, S. Marchand, J.-B. Rault. Enhancing the Tracking of Partials for the Sinusoidal Modeling of Polyphonic Sounds
1635 -- 1644Mads Græsbøll Christensen, Andreas Jakobsson, Søren Holdt Jensen. Joint High-Resolution Fundamental Frequency and Order Estimation
1645 -- 1653Xinglei Zhu, Gerald Beauregard, Lonce L. Wyse. Real-Time Signal Estimation From Modified Short-Time Fourier Transform Magnitude Spectra
1654 -- 1664Anders Meng, P. Ahrendt, Jan Larsen, Lars Kai Hansen. Temporal Feature Integration for Music Genre Classification
1665 -- 1680Masahiro Yukawa, Konstantinos Slavakis, Isao Yamada. Adaptive Parallel Quadratic-Metric Projection Algorithms
1681 -- 1695A. W. H. Khong, Patrick A. Naylor. Selective-Tap Adaptive Filtering With Performance Analysis for Identification of Time-Varying Systems
1696 -- 1710Guillaume Lathoud, Jean-Marc Odobez. Short-Term Spatio-Temporal Clustering Applied to Multiple Moving Speakers
1711 -- 1723Ji Ming, Timothy J. Hazen, James R. Glass, Douglas A. Reynolds. Robust Speaker Recognition in Noisy Conditions
1724 -- 1730Mark D. Skowronski, John G. Harris. Noise-Robust Automatic Speech Recognition Using a Predictive Echo State Network
1731 -- 1732Mohamed Afify, Olivier Siohan. Comments on "Vocal Tract Length Normalization Equals Linear Transformation in Cepstral Space"

Volume 15, Issue 4

1129 -- 1134Rasool Tahmasbi, Sadegh Rezaei. A Soft Voice Activity Detection Using GARCH Filter and Variance Gamma Distribution
1135 -- 1145Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Alain de Cheveigné, Shigeki Sagayama. Single and Multiple F0 Contour Estimation Through Parametric Spectrogram Modeling of Speech in Noisy Environments
1146 -- 1155Thomas Eriksson, Frank Norden. Memory-Based Vector Quantization of LSF Parameters by a Power Series Approximation
1156 -- 1166Bengt J. Borgstrom, Mihaela van der Schaar, A. Alwan. Rate Allocation for Noncollaborative Multiuser Speech Communication Systems Based on Bargaining Theory
1167 -- 1179M. Jelinek, R. Salami. Wideband Speech Coding Advances in VMR-WB Standard
1180 -- 1193Athanasios Mouchtaris, Jan Van der Spiegel, Paul Mueller, Panagiotis Tsakalides. A Spectral Conversion Approach to Single-Channel Speech Enhancement
1194 -- 1203Esfandiar Zavarehei, Saeed Vaseghi, Qin Yan. Noisy Speech Enhancement Using Harmonic-Noise Model and Codebook-Based Post-Processing
1204 -- 1217Xuechuan Wang, D. O'Shaughnessy. Environmental Independent ASR Model Adaptation/Compensation by Bayesian Parametric Representation
1218 -- 1226Peter Birkholz, D. Jackel, Bernd J. Kröger. Simulation of Losses Due to Turbulence in the Time-Varying Vocal System
1227 -- 1235Chung-Hsien Wu, Chi-Chun Hsia, Jiun-Fu Chen, Jhing-Fa Wang. Variable-Length Unit Selection in TTS Using Structural Syntactic Cost
1236 -- 1246Karthikeyan Umapathy, Sridhar Krishnan, R. K. Rao. Audio Signal Feature Extraction and Classification Using Local Discriminant Bases
1247 -- 1256Graham E. Poliner, Daniel P. W. Ellis, A. F. Ehmann, E. Gomez, S. Streich, Beesuan Ong. Melody Transcription From Music Audio: Approaches and Evaluation
1257 -- 1272Harvey D. Thornburg, Randal J. Leistikow, J. Berger. Melody Extraction and Musical Onset Detection via Probabilistic Models of Framewise STFT Peak Data
1273 -- 1282E. Vincent, M. D. Plumbley. Low Bit-Rate Object Coding of Musical Audio Using Bayesian Harmonic Models
1283 -- 1295C. Dubois, M. Davy. Joint Detection and Tracking of Time-Varying Harmonic Components: A Flexible Bayesian Approach
1296 -- 1304H. M. A. Malik, Rashid Ansari, Ashfaq A. Khokhar. Robust Data Hiding in Audio Using Allpass Filters
1305 -- 1319Y. Avargel, I. Cohen. System Identification in the Short-Time Fourier Transform Domain With Crossband Filtering
1320 -- 1326Fredric Lindström, Christian Schüldt, Ingvar Claesson. An Improvement of the Two-Path Algorithm Transfer Logic for Acoustic Echo Cancellation
1327 -- 1339Jacek Dmochowski, Jacob Benesty, Sofiène Affes. Direction of Arrival Estimation Using the Parameterized Spatial Correlation Matrix
1340 -- 1351Wolfgang Herbordt, Herbert Buchner, S. Nakamura, Walter Kellermann. Multichannel Bin-Wise Robust Frequency-Domain Adaptive Filtering and Its Application to Adaptive Beamforming
1352 -- 1365Takaaki Hori, C. Hori, Yasuhiro Minami, Atsushi Nakamura. Efficient WFST-Based One-Pass Decoding With On-The-Fly Hypothesis Rescoring in Extremely Large Vocabulary Continuous Speech Recognition
1366 -- 1376Xiaodong Cui, Yifan Gong. A Study of Variable-Parameter Gaussian Mixture Hidden Markov Modeling for Noisy Speech Recognition
1377 -- 1390M. De Wachter, M. Matton, Kris Demuynck, Patrick Wambacq, R. Cools, Dirk Van Compernolle. Template-Based Continuous Speech Recognition
1391 -- 1403Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg. Exploiting Temporal Correlation of Speech for Error Robust and Bandwidth Flexible Distributed Speech Recognition
1404 -- 1413Paris Smaragdis, Madhusudana V. S. Shashanka. A Framework for Secure Speech Recognition
1414 -- 1424Xunying Liu, Mark J. F. Gales. Automatic Model Complexity Control Using Marginalized Discriminative Growth Functions
1425 -- 1434Y. Han, Johan de Veth, Lou Boves. Trajectory Clustering for Solving the Trajectory Folding Problem in Automatic Speech Recognition
1435 -- 1447Patrick Kenny, Gilles Boulianne, Pierre Ouellet, Pierre Dumouchel. Joint Factor Analysis Versus Eigenchannels in Speaker Recognition
1448 -- 1460Patrick Kenny, Gilles Boulianne, Pierre Ouellet, Pierre Dumouchel. Speaker and Session Variability in GMM-Based Speaker Verification
1461 -- 1474Wei-Ho Tsai, Shih-Sian Cheng, Hsin-Min Wang. Automatic Speaker Clustering Using a Voice Characteristic Reference Space and Maximum Purity Estimation
1475 -- 1487Yipeng Li, DeLiang Wang. Separation of Singing Voice From Music Accompaniment for Monaural Recordings
1488 -- 1495S. Bilbao, L. Savioja, J. O. Smith. Parameterized Finite Difference Schemes for Plates: Stability, the Reduction of Directional Dispersion and Frequency Warping
1496 -- 1499Angel M. Gomez, Antonio M. Peinado, Victoria E. Sánchez, Antonio J. Rubio. On the Ramsey Class of Interleavers for Robust Speech Recognition in Burst-Like Packet Loss

Volume 15, Issue 3

749 -- 755Pradeepa Yahampath, Paul Rondeau. Multiple-Description Predictive-Vector Quantization With Applications to Low Bit-Rate Speech Coding Over Networks
756 -- 769Ethan R. Duni, Bhaskar D. Rao. High-Rate Optimized Recursive Vector Quantization Structures Using Hidden Markov Models
770 -- 783Ethan R. Duni, Bhaskar D. Rao. A High-Rate Optimal Transform Coder With Gaussian Mixture Companders
784 -- 795Brian Kan-Wing Mak, Roger Wend-Huu Hsiao. Kernel Eigenspace-Based MLLR Adaptation
796 -- 802Bertrand Rivet, Laurent Girin, Christian Jutten. Log-Rayleigh Distribution: A Simple and Efficient Statistical Representation of Log-Spectral Coefficients
803 -- 812Patricia Scanlon, Daniel P. W. Ellis, Richard B. Reilly. Using Broad Phonetic Group Experts for Improved Speech Recognition
813 -- 822Barbara Resch, Mattias Nilsson, Anders Ekman, W. Bastiaan Kleijn. Estimation of the Instantaneous Pitch of Speech
823 -- 837Francesco Gianfelici, Giorgio Biagetti, Paolo Crippa, Claudio Turchetti. Multicomponent AM-FM Representations: An Asymptotically Exact Approach
838 -- 850Dima Ruinskiy, Y. Lavner. An Effective Algorithm for Automatic Detection and Exact Demarcation of Breath Sounds in Speech and Song Signals
851 -- 861Laurent Girin, Mohammad Firouzmand, Sylvain Marchand. Perceptual Long-Term Variable-Rate Sinusoidal Modeling of Speech
862 -- 872Jesper Jensen, Richard Heusdens. Improved Subspace-Based Single-Channel Speech Enhancement Using Generalized Super-Gaussian Priors
873 -- 881Juho Kontio, Laura Laaksonen, Paavo Alku. Neural Network-Based Artificial Bandwidth Expansion of Speech
882 -- 892David Y. Zhao, W. Bastiaan Kleijn. HMM-Based Gain Modeling for Enhancement of Speech in Noise
893 -- 900M. Khademul Islam Molla, Keikichi Hirose. Single-Mixture Audio Source Separation by Subspace Decomposition of Hilbert Spectrum
901 -- 917Karsten Vandborg Sorensen, Søren Vang Andersen. Rayleigh Mixture Model-Based Hidden Markov Modeling and Estimation of Noise in Noisy Speech Signals
918 -- 927Richard C. Hendriks, Rainer Martin. MAP Estimators for Speech Enhancement Under Normal and Rayleigh Inverse Gaussian Distributions
928 -- 938Nikos Chatzichrisafis, Vassilios Diakoloukas, Vassilios Digalakis, Costas Harizakis. Gaussian Mixture Clustering and Language Adaptation for the Development of a New Language Speech Recognition System
939 -- 948Ghinwa F. Choueiter, James R. Glass. An Implementation of Rational Wavelets and Filter Design for Phonetic Classification
949 -- 956Esther Klabbers, Jan P. H. van Santen, Alexander Kain. The Contribution of Various Sources of Spectral Mismatch to Audible Discontinuities in a Diphone Database
957 -- 965Jerome R. Bellegarda. Globally Optimal Training of Unit Boundaries in Unit Selection Text-to-Speech Synthesis
966 -- 981Pim Korten, Jesper Jensen, Richard Heusdens. High-Resolution Spherical Quantization of Sinusoidal Parameters
982 -- 994Hirokazu Kameoka, Takuya Nishimoto, Shigeki Sagayama. A Multipitch Analyzer Based on Harmonic Temporal Structured Clustering
995 -- 1008Johannes Nix, Volker Hohmann. Combined Estimation of Spectral Envelopes and Sound Source Direction of Concurrent Voices by Multidimensional Statistical Filtering
1009 -- 1020Matthew E. P. Davies, Mark D. Plumbley. Context-Dependent Beat Tracking of Musical Audio
1021 -- 1029Leevi Peltola, Cumhur Erkut, Perry R. Cook, Vesa Välimäki. Synthesis of Hand Clapping Sounds
1030 -- 1034Jean-Marc Valin. On Adjusting the Learning Rate in Frequency Domain Echo Cancellation With Double-Talk
1035 -- 1043James D. Gordy, Rafik A. Goubran. Statistical Analysis of Doubletalk Detection for Calibration and Performance Evaluation
1044 -- 1052Felix Albu, Martin Bouchard, Yuriy V. Zakharov. Pseudo-Affine Projection Algorithms for Multichannel Active Noise Control
1053 -- 1065Jacob Benesty, Jingdong Chen, Yiteng Huang, Jacek Dmochowski. On Microphone-Array Beamforming From a MIMO Acoustic Signal Processing Perspective
1066 -- 1074Tuomas Virtanen. Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria
1075 -- 1086Carlos Busso, Zhigang Deng, Michael Grimm, Ulrich Neumann, Shrikanth Narayanan. Rigid Head Motion in Expressive Speech Animation: Analysis and Synthesis
1087 -- 1097Chen Yang, Frank K. Soong, Tan Lee. Static and Dynamic Spectral Features: Their Noise Robustness and Optimal Weights for ASR
1098 -- 1113Luis Buera, Eduardo Lleida, A. Miguel, Alfonso Ortega, O. Saz. Cepstral Vector Normalization Based on Stereo Data for Robust Speech Recognition
1114 -- 1122Xianyu Zhao, Zhijian Ou. Closely Coupled Array Processing and Model-Based Compensation for Microphone Array Speech Recognition

Volume 15, Issue 2

377 -- 386Y. Agiomyrgiannakis, Yannis Stylianou. Conditional Vector Quantization for Speech Coding
387 -- 395Sorin Dusan, James L. Flanagan, A. Karve, M. Balaraman. Speech Compression by Polynomial Approximation
396 -- 405G. Hu, D. Wang. Auditory Segmentation Based on Onset and Offset Analysis
406 -- 415Richard C. Hendriks, Richard Heusdens, Jesper Jensen. An MMSE Estimator for Speech Enhancement Under a Combined Stochastic-Deterministic Speech Model
416 -- 429Y. Nagata, T. Fujioka, M. Abe. Two-Dimensional DOA Estimation of Sound Sources Based on Weighted Wiener Gain Exploiting Two-Directional Microphones
430 -- 440Marc Delcroix, Takafumi Hikichi, Masato Miyoshi. Precise Dereverberation Using Multichannel Linear Prediction
441 -- 452S. Srinivasan, J. Samuelsson, W. Bastiaan Kleijn. Codebook-Based Bayesian Speech Enhancement for Nonstationary Environments
453 -- 464Rongqing Huang, John H. L. Hansen, Pongtep Angkititrakul. Dialect/Accent Classification Using Unrestricted Audio
465 -- 477M. Akbacak, John H. L. Hansen. Environmental Sniffing: Noise Knowledge Estimation for Robust Speech Systems
478 -- 488J. Wu, Q. Huo. A Study of Minimum Classification Error (MCE) Linear Regression for Supervised Adaptation of MCE-Trained Continuous-Density Hidden Markov Models
489 -- 497P. D. Teal. Tracking Wide-Band Targets Having Significant Doppler Shift
498 -- 508Pongtep Angkititrakul, John H. L. Hansen. Discriminative In-Set/Out-of-Set Speaker Recognition
509 -- 518Darko Kirovski, Zeph Landau. Generalized Lempel-Ziv Compression for Audio
519 -- 530T. L. Nwe, H. Li. Exploring Vibrato-Motivated Acoustic Features for Singer Identification
531 -- 541N. Laurenti, G. De Poli, D. Montagner. A Nonlinear Method for Stochastic Spectrum Estimation in the Modeling of Musical Sounds
542 -- 551Sunil Bharitkar, Chris Kyriakakis. Visualization of Multiple Listener Room Acoustic Equalization With the Sammon Map
552 -- 564D. T. Murphy, M. Beeson. The KW-Boundary Hybrid Digital Waveguide Mesh for Room Acoustics Applications
565 -- 576Ramani Duraiswami, Dmitry N. Zotkin, Nail A. Gumerov. Fast Evaluation of the Room Transfer Function Using Multipole Expansion
577 -- 585J. Mullen, D. M. Howard, D. T. Murphy. Real-Time Dynamic Articulations in the 2-D Waveguide Mesh Vocal Tract Model
586 -- 592X. Sun, S. M. Kuo. Active Narrowband Noise Control Systems Using Cascading Adaptive Filters
593 -- 600Muhammad Tahir Akhtar, Masahide Abe, Masayuki Kawamata. On Active Noise Control Systems With Online Acoustic Feedback Path Modeling
601 -- 616Daniel Gatica-Perez, Guillaume Lathoud, Jean-Marc Odobez, Iain McCowan. Audiovisual Probabilistic Tracking of Multiple Speakers in Meetings
617 -- 631Simon Doclo, Marc Moonen. Superdirective Beamforming Robust Against Microphone Mismatch
632 -- 640C.-H. Lee, S.-K. Jung, H.-G. Kang. Applying a Speaker-Dependent Speech Compression Technique to Concatenative TTS Synthesizers
641 -- 651K.-S. Lee. Statistical Approach for Voice Personality Transformation
652 -- 660Xiaodong Cui, Abeer Alwan. Robust Speaker Adaptation by Weighted Model Averaging Based on the Minimum Description Length Criterion
661 -- 675M.-Y. Tsai, F.-C. Chou, L. S. Lee. Pronunciation Modeling With Reduced Confusion for Mandarin Chinese Using a Three-Stage Framework
676 -- 689Qin Yan, Saeed Vaseghi, D. Rentzos, C.-H. Ho. Analysis and Synthesis of Formant Spaces of British, Australian, and American Accents
690 -- 701D. Wang, S. Narayanan. An Acoustic Measure for Word Prominence in Spontaneous Speech
702 -- 714Zhiyun Li, Ramani Duraiswami. Flexible and Optimal Design of Spherical Microphone Arrays for Beamforming
715 -- 726M. Knaak, Shoko Araki, Shoji Makino. Geometrically Constrained Independent Component Analysis
727 -- 732I. Balmages, Boaz Rafaely. Open-Sphere Designs for Spherical Microphone Arrays
732 -- 734Peter Jancovic. Fast Algorithm for Calculation of the Union-Based Probability
734 -- 743Y. I. Kim, R. M. Kil. Estimation of Interaural Time Differences Based on Zero-Crossings in Noisy Multisource Environments

Volume 15, Issue 1

1 -- 12Paris Smaragdis. Convolutive Speech Bases and Their Application to Supervised Speech Separation
13 -- 23Li Deng, Leo J. Lee, Hagai Attias, Alex Acero. Adaptive Kalman Filtering and Smoothing for Tracking Vocal Tract Resonances Using a Continuous-Valued Hidden Dynamic Model
24 -- 33Ben Milner, Xu Shao. Prediction of Fundamental Frequency and Voicing From Mel-Frequency Cepstral Coefficients for Unconstrained Speech Reconstruction
34 -- 43Patrick A. Naylor, Anastasis Kounoudes, Jon Gudnason, Mike Brookes. Estimation of Glottal Closure Instants in Voiced Speech Using the DYPSA Algorithm
44 -- 56Farshad Lahouti, Amir K. Khandani. Soft Reconstruction of Speech in the Presence of Noise and Packet Loss
57 -- 69Sean A. Ramprashad. Sparse Bit-Allocations Based on Partial Ordering Schemes With Application to Speech and Audio Coding
70 -- 79Taesu Kim, Hagai Thomas Attias, Soo-Young Lee, Te-Won Lee. Blind Source Separation Exploiting Higher-Order Frequency Dependencies
80 -- 95Tomohiro Nakatani, Keisuke Kinoshita, Masato Miyoshi. Harmonicity-Based Blind Dereverberation for Single-Channel Speech Signals
96 -- 108Bertrand Rivet, Laurent Girin, Christian Jutten. Mixing Audiovisual Speech Processing and Blind Source Separation for the Extraction of Speech Signals From Convolutive Mixtures
109 -- 118Guangji Shi, Parham Aarabi, Hui Jiang. Phase-Based Dual-Microphone Speech Enhancement Using A Prior Speech Model
119 -- 134Gwo-Hwa Ju, Lin-Shan Lee. A Perceptually Constrained GSVD-Based Approach for Enhancing Speech Corrupted by Colored Noise
135 -- 149Steven J. Rennie, Parham Aarabi, Brendan J. Frey. Variational Probabilistic Speech Separation Using Microphone Arrays
150 -- 161Ian R. Lane, Tatsuya Kawahara, Tomoko Matsui, Satoshi Nakamura. Out-of-Domain Utterance Detection Using Classification Confidences of Multiple Topics
162 -- 171Christian Raymond, Frédéric Béchet, Nathalie Camelin, Renato de Mori, Géraldine Damnati. Sequential Decision Strategies for Machine Interpretation of Speech
172 -- 189Scott Axelrod, Vaibhava Goel, Ramesh A. Gopinath, Peder A. Olsen, Karthik Visweswariah. Discriminative Estimation of Subspace Constrained Gaussian Mixture Models for Speech Recognition
190 -- 202Rajesh M. Hegde, Hema A. Murthy, Venkata Ramana Rao Gadde. Significance of the Modified Group Delay Feature in Speech Recognition
203 -- 223Erik McDermott, Timothy J. Hazen, Jonathan Le Roux, Atsushi Nakamura, Shigeru Katagiri. Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error
224 -- 234Satya Dharanipragada, Umit H. Yapanel, Bhaskar D. Rao. Robust Feature Extraction for Continuous Speech Recognition Using the MVDR Spectrum Estimation Method
235 -- 245Michael L. Seltzer, Alex Acero. Training Wideband Acoustic Models Using Mixed-Bandwidth Training Data for Speech Recognition
246 -- 256Joe Frankel, Simon King. Speech Recognition Using Linear Dynamic Models
257 -- 270Chia-Ping Chen, Jeff A. Bilmes. MVA Processing of Speech Features
271 -- 284Haizhou Li, Bin Ma, Chin-Hui Lee. A Vector Space Modeling Approach to Spoken Language Identification
285 -- 295Peter Day, Asoke K. Nandi. Robust Text-Independent Speaker Verification Using Genetic Programming
296 -- 309Youngim Jung, Ae-sun Yoon, Hyuk-Chul Kwon. Grapheme-to-Phoneme Conversion of Arabic Numeral Expressions for Embedded TTS Systems
310 -- 319Jan H. Plasberg, W. Bastiaan Kleijn. The Sensitivity Matrix: Using Advanced Auditory Models in Speech and Audio Processing
320 -- 332Ixone Arroabarren, Alfonso Carlosena. Voice Production Mechanisms of Vocal Vibrato in Male Singers
333 -- 345Kazuyoshi Yoshii, Masataka Goto, Hiroshi G. Okuno. Drum Sound Recognition for Polyphonic Audio Signals by Adaptation and Matching of Spectrogram Templates With Harmonic Structure Suppression
346 -- 357Kishan Thambiratnam, Sridha Sridharan. Rapid Yet Accurate Speech Indexing Using Dynamic Match Lattice Spotting
358 -- 368Paris Smaragdis, Petros Boufounos. Position and Trajectory Learning for Microphone Arrays