Journal: IEEE Transactions on Audio, Speech & Language Processing

Volume 19, Issue 8

2249 -- 2259 Laurent Oudre, Cédric Févotte, Yves Grenier. Probabilistic Template-Based Chord Recognition
2260 -- 2272 Jacob Benesty, Jingdong Chen, Yiteng Huang. Binaural Noise Reduction in the Time Domain With a Stereo Setup
2273 -- 2284 Zengli Yang, Yahong Rosa Zheng, Steven L. Grant. Proportionate Affine Projection Sign Algorithms for Network Echo Cancellation
2285 -- 2293 J. Du, Q. Huo. A Feature Compensation Approach Using High-Order Vector Taylor Series Approximation of an Explicit Distortion Model for Noisy Speech Recognition
2294 -- 2303 J. Reed, C.-H. Lee. Preference Music Ratings Prediction Using Tokenization and Minimum Classification Error Training
2304 -- 2313 H.-W. Hsu, C. M. Liu. Decimation-Whitening Filter in Spectral Band Replication
2314 -- 2327 Theodore Petsatodis, Christos Boukis, Fotios Talantzis, Zheng-Hua Tan, Ramjee Prasad. Convex Combination of Multiple Statistical Models With Application to VAD
2328 -- 2337 Z. Jin, D. Wang. Reverberant Speech Segregation Based on Multipitch Tracking and Classification
2338 -- 2347 Dogan Can, Murat Saraclar. Lattice Indexing for Spoken Term Detection
2348 -- 2363 Mikel Peñagarikano, Amparo Varona, Luis Javier Rodríguez-Fuentes, Germán Bordel. Improved Modeling of Cross-Decoder Phone Co-Occurrences in SVM-Based Phonotactic Language Recognition
2364 -- 2373 Trevor Burton, Rafik A. Goubran. A Generalized Proportionate Subband Adaptive Second-Order Volterra Filter for Acoustic Echo Cancellation in Changing Environments
2374 -- 2384 Yusuke Hioka, Kenta Niwa, Sumitaka Sakauchi, Ken'ichi Furuya, Youichi Haneda. Estimating Direct-to-Reverberant Energy Ratio Using D/R Spatial Correlation Matrix Model
2385 -- 2397 Cyril Joder, Slim Essid, Gaël Richard. A Conditional Random Field Framework for Robust and Scalable Audio-to-Score Matching
2398 -- 2411 Richard E. Turner, Maneesh Sahani. Demodulation as Probabilistic Inference
2412 -- 2417 Giovanni L. Sicuranza, Alberto Carini. A Generalized FLANN Filter for Nonlinear Active Noise Control
2418 -- 2429 Qun Feng Tan, Panayiotis G. Georgiou, Shrikanth Narayanan. Enhanced Sparse Imputation Techniques for a Robust Speech Recognition Front-End
2430 -- 2438 Boaz Rafaely. Bessel Nulls Recovery in Spherical Microphone Arrays for Time-Limited Signals
2439 -- 2450 Fabio Valente, Mathew Magimai-Doss, Christian Plahl, Suman V. Ravuri, Wen Wang. Transcribing Mandarin Broadcast Speech Using Multi-Layer Perceptron Acoustic Features
2451 -- 2460 Timothy J. Hazen. MCE Training Techniques for Topic Identification of Spoken Audio Documents
2461 -- 2473 Dong Yu, Jinyu Li, Li Deng. Calibration of Confidence Measures in Speech Recognition
2474 -- 2485 Cong Liu, Yu Hu 0003, Li-Rong Dai, Hui Jiang 0001. Trust Region-Based Optimization for Maximum Mutual Information Estimation of HMMs in Speech Recognition
2486 -- 2493 Simo Särkkä, Antti Huovilainen. Accurate Discretization of Analog Audio Filters With Application to Parametric Equalizer Design
2494 -- 2505 Deyi Xiong, Min Zhang, Haizhou Li 0001. A Maximum-Entropy Segmentation Model for Statistical Machine Translation
2506 -- 2515 Magnus Berggren, Markus Borgh, Christian Schüldt, Fredric Lindström, Ingvar Claesson. Low-Complexity Network Echo Cancellation Approach for Systems Equipped With External Memory
2516 -- 2526 Leonardo O. Nunes, Luiz W. P. Biscainho, Bowon Lee, Amir Said, Ton Kalker, Ronald W. Schafer. Degradation Type Classifier for Full Band Speech Contaminated With Echo, Broadband Noise, and Reverberation
2527 -- 2537 Jorge I. Marin-Hurtado, David V. Anderson. FFT-Based Block Processing in Speech Enhancement: Potential Artifacts and Solutions
2538 -- 2551 Sree Hari Krishnan Parthasarathi, Daniel Gatica-Perez, Hervé Bourlard, Mathew Magimai-Doss. Privacy-Sensitive Audio Features for Speech/Nonspeech Detection
2552 -- 2565 S. R. Mahadeva Prasanna, Gayadhar Pradhan. Significance of Vowel-Like Regions for Speaker Verification Under Degraded Conditions
2566 -- 2578 Julio Vargas, Steve McLaughlin 0001. Speech Analysis and Synthesis Based on Dynamic Modes
2579 -- 2590 Bengt Jonas Borgstrom, Abeer Alwan. A Unified Framework for Designing Optimal STSA Estimators Assuming Maximum Likelihood Phase Equivalence of Speech and Noise
2591 -- 2597 Brian King, Les Atlas. Single-Channel Source Separation Using Complex Matrix Factorization
2598 -- 2613 Tara N. Sainath, Bhuvana Ramabhadran, Michael Picheny, David Nahamoo, Dimitri Kanevsky. Exemplar-Based Sparse Representation Features: From TIMIT to LVCSR
2614 -- 2623 Huijun Ding, Ing Yann Soon, Chai Kiat Yeo. A DCT-Based Speech Enhancement System With Pitch Synchronous Analysis
2624 -- 2633 Dongwen Ying, Yonghong Yan 0002, Jianwu Dang, Frank K. Soong. Voice Activity Detection Based on an Unsupervised Learning Framework

Volume 19, Issue 7

1853 -- 1864 G. Seshadri, B. Yegnanarayana. Performance of an Event-Based Instantaneous Fundamental Frequency Estimator for Distant Speech Signals
1865 -- 1874 Yegui Xiao. A New Efficient Narrowband Active Noise Control System and its Performance Analysis
1875 -- 1889 Sheng-yi Kong, Lin-Shan Lee. Semantic Analysis and Organization of Spoken Documents Based on Parameters Derived From Latent Topics
1890 -- 1899 Taufiq Hasan, John H. L. Hansen. A Study on Universal Background Model Training in Speaker Verification
1900 -- 1912 Nilesh Madhu, Rainer Martin. A Versatile Framework for Speaker Separation Using a Model-Based Speaker Localization Approach
1913 -- 1924 Vikramjit Mitra, Hosung Nam, Carol Y. Espy-Wilson, Elliot Saltzman, Louis Goldstein. Articulatory Information for Noise Robust Speech Recognition
1925 -- 1937 Qiang Huang, Stephen J. Cox. Inferring the Structure of a Tennis Game Using Audio Information
1938 -- 1948 Maria E. Markaki, Yannis Stylianou. Voice Pathology Detection and Discrimination Based on Modulation Spectral Features
1949 -- 1961 Eleftheria Georganti, Tobias May, Steven van de Par, Aki Härmä, John Mourjopoulos. Speaker Distance Detection Using a Single Microphone
1962 -- 1974 Tacksung Choi, Young-Cheol Park, Dae Hee Youn, Seokpil Lee. Virtual Sound Rendering in a Stereophonic Loudspeaker Setup
1975 -- 1985 Gil Dobry, Ron M. Hecht, Mireille Avigal, Yaniv Zigel. Supervector Dimension Reduction for Efficient Speaker Age Estimation Based on the Acoustic Speech Signal
1986 -- 1998 Jesper Kjær Nielsen, Mads Græsbøll Christensen, Ali Taylan Cemgil, Simon J. Godsill, Søren Holdt Jensen. Bayesian Interpolation and Parameter Estimation in a Dynamic Sinusoidal Model
1999 -- 2012 Sungwoong Kim, Sungrack Yun, Chang D. Yoo. Large Margin Discriminative Semi-Markov Model for Phonetic Recognition
2013 -- 2025 Juan Pablo Bello. Measuring Structural Similarity in Music
2026 -- 2038 Iain McCowan, David Dean, Mitchell McLaren, Robert Vogt, Sridha Sridharan. The Delta-Phase Spectrum With Application to Voice Activity Detection and Speaker Recognition
2039 -- 2045 Chris Hummersone, Russell Mason, Tim Brookes. Ideal Binary Mask Ratio: A Novel Metric for Assessing Binary-Mask-Based Sound Source Separation Algorithms
2046 -- 2057 Valentin Emiya, Emmanuel Vincent, Niklas Harlander, Volker Hohmann. Subjective and Objective Quality Assessment of Audio Source Separation
2058 -- 2066 Muhammad Tahir Akhtar, Wataru Mitsuhashi. Improving Performance of Hybrid Active Noise Control Systems for Uncorrelated Narrowband Disturbances
2067 -- 2080 Jort F. Gemmeke, Tuomas Virtanen, Antti Hurmalainen. Exemplar-Based Sparse Representations for Noise Robust Automatic Speech Recognition
2081 -- 2090 Brian Roark, Margaret Mitchell, John-Paul Hosom, Kristy Hollingshead, Jeffrey Kaye. Spoken Language Derived Measures for Detecting Mild Cognitive Impairment
2091 -- 2100 Jun Du, Yu Hu, Hui Jiang. Boosted Mixture Learning of Gaussian Mixture Hidden Markov Models Based on Maximum Likelihood for Speech Recognition
2101 -- 2110 Nobutaka Ito, Hikaru Shimizu, Nobutaka Ono, Shigeki Sagayama. Diffuse Noise Suppression Using Crystal-Shaped Microphone Arrays
2111 -- 2124 Serajul Haque, Roberto Togneri, Anthony Zaknich. An Auditory Motivated Asymmetric Compression Technique for Speech Recognition
2125 -- 2136 Cees H. Taal, Richard C. Hendriks, Richard Heusdens, Jesper Jensen. An Algorithm for Intelligibility Prediction of Time-Frequency Weighted Noisy Speech
2137 -- 2145 Ryouichi Nishimura, Parham Mokhtari, Hironori Takemoto, Hiroaki Kato. An Attempt to Calibrate Headphones for Reproduction of Sound Pressure at the Eardrum
2146 -- 2158 Stefano Papetti, Federico Avanzini, Davide Rocchesso. Numerical Methods for a Nonlinear Impact Model: A Comparative Study With Closed-Form Corrections
2159 -- 2169 Mehrez Souden, Jingdong Chen, Jacob Benesty, Sofiène Affes. An Integrated Solution for Online Multichannel Noise Tracking and Reduction
2170 -- 2183 Hannu Pulakka, Paavo Alku. Bandwidth Extension of Telephone Speech Using a Neural Network and a Filter Bank Implementation for Highband Mel Spectrum
2184 -- 2196 Yi-Hsuan Yang, Homer H. Chen. Prediction of the Distribution of Perceived Music Emotions Using Discrete Samples
2197 -- 2209 Behnaz Ghoraani, Sridhar Krishnan. Time-Frequency Matrix Feature Extraction and Classification of Environmental Audio Signals
2210 -- 2221 Amitai Koretz, Joseph Tabrikian. Maximum A Posteriori Probability Multiple-Pitch Tracking Using the Harmonic Model
2222 -- 2233 Laurent Oudre, Yves Grenier, Cédric Févotte. Chord Recognition by Fitting Rescaled Chroma Vectors to Chord Templates
2234 -- 2238 Boaz Rafaely, Dima Khaykin. Optimal Model-Based Beamforming and Independent Steering for Spherical Loudspeaker Arrays
2239 -- 2244 Mads Græsbøll Christensen, Søren Holdt Jensen. New Results on Perceptual Distortion Minimization and Nonlinear Least-Squares Frequency Estimation

Volume 19, Issue 6

1457 -- 1466 Hiroshi Saruwatari, Y. Ishikawa, Y. Takahashi, Takayuki Inoue, Kiyohiro Shikano, Kazunobu Kondo. Musical Noise Controllable Algorithm of Channelwise Spectral Subtraction and Adaptive Beamforming Based on Higher Order Statistics
1467 -- 1475 A. Ando. Conversion of Multichannel Sound Signal Maintaining Physical Properties of Sound in Reproduced Sound Field
1476 -- 1489 Antonio Miguel, Alfonso Ortega, Luis Buera, Eduardo Lleida. Bayesian Networks for Discrete Observation Distributions in Speech Recognition
1490 -- 1503 Anthony Lombard, Yuanhang Zheng, Herbert Buchner, Walter Kellermann. TDOA Estimation for Multiple Sound Sources in Noisy and Reverberant Environments Using Broadband Independent Component Analysis
1504 -- 1516 Dimitrios Dimitriadis, Petros Maragos, Alexandros Potamianos. On the Effects of Filterbank Design and Energy Computation on Robust Speech Recognition
1517 -- 1529 Ciira Wa Maina, John MacLaren Walsh. Joint Speech Enhancement and Speaker Identification Using Approximate Bayesian Inference
1530 -- 1539 Stefania Cecchi, Laura Romoli, Paolo Peretti, Francesco Piazza. A Combined Psychoacoustic Approach for Stereo Acoustic Echo Cancellation
1540 -- 1555 A. Levy, Sharon Gannot, Emanuel A. P. Habets. Multiple-Hypothesis Extended Particle Filter for Acoustic Source Localization in Reverberant Environments
1556 -- 1568 H. D. Tran, H. Li. Sound Event Recognition With Probabilistic Distance SVMs
1569 -- 1583 Stefan Hahn, Marco Dinarelli, Christian Raymond, Fabrice Lefevre, Patrick Lehnen, Renato de Mori, Alessandro Moschitti, Hermann Ney, Giuseppe Riccardi. Comparing Stochastic Approaches to Spoken Language Understanding in Multiple Languages
1584 -- 1599 Ronen Talmon, Israel Cohen, Sharon Gannot. Transient Noise Reduction Using Nonlocal Diffusion Filters
1600 -- 1609 Ke Hu, DeLiang Wang. Unvoiced Speech Segregation From Nonspeech Interference via CASA and Spectral Subtraction
1610 -- 1630 Fabrizio Argenti, Paolo Nesi, Gianni Pantaleo. Automatic Transcription of Polyphonic Music Based on the Constant-Q Bispectral Analysis
1631 -- 1641 Hyunson Seo, Chi-Sang Jung, Hong-Goo Kang. Robust Session Variability Compensation for SVM Speaker Verification
1642 -- 1651 Ibrahim Almajai, Ben Milner. Visually Derived Wiener Filters for Speech Enhancement
1652 -- 1664 Dalei Wu, Yan Yin, Hui Jiang 0001. Large-Margin Estimation of Hidden Markov Models With Second-Order Cone Programming for Speech Recognition
1665 -- 1676 Haitian Xu, Mark J. F. Gales, K. K. Chin. Joint Uncertainty Decoding With Predictive Methods for Noise Robust Speech Recognition
1677 -- 1687 Roy Wallace, Brendan Baker, Robbie Vogt, Sridha Sridharan. Discriminative Optimization of the Figure of Merit for Phonetic Spoken Term Detection
1688 -- 1701 Peter Grosche, Meinard Müller. Extracting Predominant Local Pulse Information From Music Recordings
1702 -- 1710 Yao Qian, Zhizheng Wu, Boyang Gao, Frank K. Soong. Improved Prosody Generation by Maximizing Joint Probability of State and Longer Units
1711 -- 1720 Yan Jennifer Wu, Thushara D. Abhayapala. Spatial Multizone Soundfield Reproduction: Theory and Design
1721 -- 1733 Mathieu Parvaix, Laurent Girin. Informed Source Separation of Linear Instantaneous Under-Determined Audio Mixtures by Source Index Embedding
1734 -- 1742 Jacob Benesty, Constantin Paleologu, Silviu Ciochina. On Regularization in Adaptive Filtering
1743 -- 1753 Mehdi Bekrani, Andy W. H. Khong, Mojtaba Lotfizad. A Linear Neural Network-Based Approach to Stereophonic Acoustic Echo Cancellation
1754 -- 1769 Geoffroy Peeters, Hélène Papadopoulos. Simultaneous Beat and Downbeat-Tracking Using a Probabilistic Framework: Theory and Large-Scale Evaluation
1770 -- 1779 Takayuki Inoue, Hiroshi Saruwatari, Yu Takahashi, Kiyohiro Shikano, Kazunobu Kondo. Theoretical Analysis of Musical Noise in Generalized Spectral Subtraction Based on Higher Order Statistics
1780 -- 1790 Ki-Seung Lee, Seok-Pil Lee. A Relevant Distance Criterion for Interpolation of Head-Related Transfer Functions
1791 -- 1801 Qi Li, Yan Huang. An Auditory-Based Feature Extraction Algorithm for Robust Speaker Identification Under Mismatched Conditions
1802 -- 1812 Pasi Saari, Tuomas Eerola, Olivier Lartillot. Generalizability and Simplicity as Criteria in Feature Selection: Application to Mood Classification in Music
1813 -- 1825 Saikat Chatterjee, W. Bastiaan Kleijn. Auditory Model-Based Design and Optimization of Feature Vectors for Automatic Speech Recognition
1826 -- 1836 Mehdi Bekrani, Andy W. H. Khong, Mojtaba Lotfizad. A Clipping-Based Selective-Tap Adaptive Filtering Approach to Stereophonic Acoustic Echo Cancellation
1837 -- 1842 Hélène Lachambre, Régine André-Obrecht, Julien Pinquier. Distinguishing Monophonies From Polyphonies Using Weibull Bivariate Distributions
1843 -- 1848 Joshua D. Reiss. Design of Audio Parametric Equalizer Filters Directly in the Digital Domain

Volume 19, Issue 5

1057 -- 1070 Emily Mower, Maja J. Mataric, Shrikanth S. Narayanan. A Framework for Automatic Human Emotion Classification Using Emotion Profiles
1071 -- 1079 Kai Yu, Steve Young. Continuous F0 Modeling for HMM Based Statistical Parametric Speech Synthesis
1080 -- 1090 Gilles Degottex, Axel Röbel, Xavier Rodet. Phase Minimization for Glottal Model Estimation
1091 -- 1102 Zhaozhang Jin, DeLiang Wang. HMM-Based Multipitch Tracking for Noisy and Reverberant Speech
1103 -- 1112 Ralf Schlüter, Markus Nußbaum-Thom, Hermann Ney. On the Relationship Between Bayes Risk and Word Error Rate in ASR
1113 -- 1122 Jerome R. Bellegarda. A Data-Driven Affective Analysis Framework Toward Naturally Expressive Speech Synthesis
1123 -- 1137 Yang Lu, Philipos C. Loizou. Estimators of the Magnitude-Squared Spectrum and Methods for Incorporating SNR Uncertainty
1138 -- 1148 Georg Heigold, Hermann Ney, Patrick Lehnen, Tobias Gass, Ralf Schlüter. Equivalence of Generative and Log-Linear Models
1149 -- 1159 Aastha Gupta, Thushara D. Abhayapala. Three-Dimensional Sound Field Reproduction Using Multiple Circular Loudspeaker Arrays
1160 -- 1169 Shasha Xie, Yang Liu. Using N-Best Lists and Confusion Networks for Meeting Summarization
1170 -- 1179 W. Charoenruengkit, Nurgun Erdol. The Effect of Spectral Estimation on Speech Enhancement Performance
1180 -- 1195 I. Yücel Özbek, Mark Hasegawa-Johnson, Mübeccel Demirekler. Estimation of Articulatory Trajectories Based on Gaussian Mixture Model (GMM) With Audio-Visual Information Fusion and Dynamic Kalman Smoothing
1196 -- 1205 Wei-Ho Tsai, Hao-Ping Lin. Background Music Removal Based on Cepstrum Transformation for Popular Singer Identification
1206 -- 1220 José A. González, Antonio M. Peinado, Angel M. Gomez, José L. Carmona. Efficient MMSE Estimation and Uncertainty Processing for Multienvironment Robust Speech Recognition
1221 -- 1230 Shefeng Yan, Haohai Sun, Xiaochuan Ma, U. Peter Svensson, Chaohuan Hou. Time-Domain Implementation of Broadband Beamformer in Spherical Harmonics Domain
1231 -- 1241 Vladimir Britanak. On Properties, Relations, and Simplified Implementation of Filter Banks in the Dolby Digital (Plus) AC-3 Audio Coding Standards
1242 -- 1252 Geoffroy Peeters. Spectral and Temporal Periodicity Representations of Rhythm for the Automatic Classification of Music Audio Signal
1265 -- 1277 Pejman Mowlaee, Mads Græsbøll Christensen, Søren Holdt Jensen. New Results on Single-Channel Speech Separation Using Sinusoidal Modeling
1278 -- 1288 Stas Tiomkin, David Malah, Slava Shechtman, Zvi Kons. A Hybrid Text-to-Speech System That Combines Concatenative and Statistical Synthesis Units
1289 -- 1300 H.-P. Shen, J.-F. Yeh, C. H. Wu. Speaker Clustering Using Decision Tree-Based Phone Cluster Models With Multi-Space Probability Distributions
1301 -- 1315 T. Etame, Régine Le Bouquin-Jeannès, Catherine Quinquis, Lætitia Gros, Gérard Faucon. Towards a New Reference Impairment System in the Subjective Evaluation of Speech Codecs
1316 -- 1327 C. Ma, C.-H. Lee. A Regularized Maximum Figure-of-Merit (rMFoM) Approach to Supervised and Semi-Supervised Learning
1328 -- 1342 Fabien Ringeval, J. Demouy, György Szaszak, Mohamed Chetouani, L. Robel, J. Xavier, David Cohen, M. Plaza. Automatic Intonation Recognition for the Prosodic Assessment of Language-Impaired Children
1343 -- 1359 Emanuele Coviello, Antoni B. Chan, Gert R. G. Lanckriet. Time Series Models for Semantic Music Annotation
1360 -- 1367 Maider Lehr, Izhak Shafran. Learning a Discriminative Weighted Finite-State Transducer for Speech Recognition
1368 -- 1381 Bram Cornelis, Marc Moonen, Jan Wouters. Performance Analysis of Multichannel Wiener Filter-Based Noise Reduction in Hearing Aids Under Second Order Statistics Estimation Errors
1382 -- 1395 Anthony Griffin, Toni Hirvonen, Christos Tzagkarakis, Athanasios Mouchtaris, Panagiotis Tsakalides. Single-Channel and Multi-Channel Sinusoidal Audio Coding Using Compressed Sensing
1396 -- 1407 Jibran Yousafzai, Peter Sollich, Zoran Cvetkovic, B. Yu. Combined Features and Kernel Design for Noise Robust Phoneme Classification Using Support Vector Machines
1408 -- 1421 X. Fan, J. H. L. Hansen. Speaker Identification Within Whispered Speech Audio Streams
1422 -- 1433 Peter Birkholz, Bernd J. Kröger, Christiane Neuschaefer-Rube. Model-Based Reproduction of Articulatory Trajectories for Consonant-Vowel Sequences
1434 -- 1443 W. Kim, J. H. L. Hansen. A Novel Mask Estimation Method Employing Posterior-Based Representative Mean Estimate for Missing-Feature Speech Recognition
1444 -- 1449 Kishore Prahallad, Alan W. Black. Segmentation of Monologues in Audio Books for Building Synthetic Voices

Volume 19, Issue 4

661 -- 676 Ivan Himawan, Iain McCowan, Sridha Sridharan. Clustered Blind Beamforming From Ad-Hoc Microphone Arrays
677 -- 687 Guilin Ma, Fredrik Gran, Finn Jacobsen, Finn T. Agerkvist. Adaptive Feedback Cancellation With Band-Limited LPC Vocoder in Digital Hearing Aids
688 -- 698 Dong Wang, Simon King, Joe Frankel. Stochastic Pronunciation Modeling for Out-of-Vocabulary Spoken Term Detection
699 -- 710 Ashutosh Pandey, V. John Mathews. Low-Delay Signal Processing for Digital Hearing Aids
711 -- 720 Charles D. Creusere, Joseph C. Hardin. Assessing the Quality of Audio Containing Temporally Varying Distortions
721 -- 732 Evgeny Matusov, Hermann Ney. Lattice-Based ASR-MT Interface for Speech Translation
733 -- 743 Rogier C. van Dalen, Mark J. F. Gales. Extended VTS for Noise-Robust Speech Recognition
744 -- 753 Romain Hennequin, Roland Badeau, Bertrand David. NMF With Time-Frequency Activations to Model Nonstationary Audio Events
754 -- 761 Roberto Barra-Chicote, José Manuel Pardo, Javier Ferreiros, Juan Manuel Montero. Speaker Diarization Based on Intensity Channel Contribution
762 -- 774 Yi-Hsuan Yang, Homer H. Chen. Ranking-Based Emotion Recognition for Music Organization and Retrieval
775 -- 787 M. A. Haque, Toufiqul Islam, M. K. Hasan. Robust Speech Dereverberation Based on Blind Adaptive Estimation of Acoustic Channels
788 -- 798 Najim Dehak, Patrick Kenny, Réda Dehak, Pierre Dumouchel, Pierre Ouellet. Front-End Factor Analysis for Speaker Verification
799 -- 810 Michael Wohlmayr, Michael Stark, Franz Pernkopf. A Probabilistic Interaction Model for Multipitch Tracking With Factorial Hidden Markov Models
811 -- 821 Perry Groot, Tom Heskes, Tjeerd Dijkstra, James M. Kates. Predicting Preference Judgments of Individual Normal and Hearing-Impaired Listeners With Gaussian Processes
822 -- 836 Ji Ming, Ramji Srinivasan, Danny Crookes. A Corpus-Based Approach to Speech Enhancement From Nonstationary Noise
837 -- 846 Arshia Cont, Shlomo Dubnov, Gérard Assayag. On the Information Geometry of Audio Streams With Applications to Similarity Computing
847 -- 860 Hayley Hung, Yan Huang, Gerald Friedland, Daniel Gatica-Perez. Estimating Dominance in Multi-Party Meetings Using Speaker Diarization
861 -- 870 Kong-Aik Lee, Chang Huai You, Haizhou Li 0001, Tomi Kinnunen, Khe Chai Sim. Using Discrete Probabilities With Bhattacharyya Measure for SVM-Based Speaker Verification
871 -- 882 Shih-Hsiang Lin, Yao-Ming Yeh, Berlin Chen. Leveraging Kullback-Leibler Divergence Measures and Information-Rich Cues for Speech Summarization
883 -- 894 Chi Zhang, John H. L. Hansen. Whisper-Island Detection Based on Unsupervised Segmentation With Entropy-Based Speech Feature Processing
895 -- 904 Matthew Gibson, William Byrne. Unsupervised Intralingual and Cross-Lingual Speaker Adaptation for HMM-Based Speech Synthesis Using Two-Pass Decision Tree Construction
905 -- 915 Min-Seok Choi, Hong-Goo Kang. A Two-Channel Noise Estimator for Speech Enhancement in a Highly Nonstationary Environment
916 -- 926 Saman Mousazadeh, Israel Cohen. AR-GARCH in Presence of Noise: Parameter Estimation and Its Application to Voice Activity Detection
927 -- 936 Qiang Wu 0009, Liqing Zhang 0001, Guangchuan Shi. Robust Multifactor Speech Feature Extraction Based on Gabor Analysis
937 -- 946 Peifeng Ji, Ee-Leng Tan, Woon-Seng Gan, Jun Yang 0004. A Comparative Analysis of Preprocessing Methods for the Parametric Loudspeaker Based on the Khokhlov-Zabolotskaya-Kuznetsov Equation for Speech Reproduction
947 -- 960 Frank Rudzicz. Articulatory Knowledge in the Recognition of Dysarthric Speech
961 -- 976 Bin Gao 0003, Wai Lok Woo, Satnam Singh Dlay. Single-Channel Source Separation Using EMD-Subband Variable Regularized Sparse Features
977 -- 989 Daniel Rudoy, Thomas F. Quatieri, Patrick J. Wolfe. Time-Varying Autoregressions in Speech: Detection Theory and Applications
990 -- 1002 Hyeon-Jin Jeon, Tae-Gyu Chang, Sungwook Yu, Sen M. Kuo. A Narrowband Active Noise Control System With Frequency Corrector
1003 -- 1014 Emiru Tsunoo, George Tzanetakis, Nobutaka Ono, Shigeki Sagayama. Beyond Timbral Statistics: Improving Music Classification Using Percussive Patterns and Bass Lines
1015 -- 1028 Matthew P. Black, Joseph Tepperman, Shrikanth S. Narayanan. Automatic Prediction of Children's Reading Ability for High-Level Literacy Assessment
1029 -- 1040 DongHo Kim, Jin H. Kim, Kee-Eung Kim. Robust Performance Evaluation of POMDP-Based Dialogue Systems
1041 -- 1044 Lifu Wu, Hongsen He, Xiaojun Qiu. An Active Impulsive Noise Control Algorithm With Logarithmic Transformation
1045 -- 1051 Haohai Sun, Shefeng Yan, U. Peter Svensson. Robust Minimum Sidelobe Beamforming for Spherical Microphone Arrays

Volume 19, Issue 3

445 -- 457 Mohammad A. Dmour, Mike E. Davies. A New Framework for Underdetermined Speech Extraction Using Mixture of Beamformers
458 -- 467 Miroslav Zivanovic, Johan Schoukens. On The Polynomial Approximation for Time-Variant Harmonic Signal Modeling
468 -- 481 Ana I. García-Moral, Rubén Solera-Ureña, Carmen Peláez-Moreno, Fernando Díaz-de-María. Data Balancing for Efficient Training of Hybrid ANN/HMM Automatic Speech Recognition Systems
482 -- 495 Jen-Tzung Chien, Chuang-Hua Chueh. Dirichlet Class Language Models for Speech Recognition
496 -- 504 Feipeng Li, J. B. Allen. Manipulation of Consonants in Natural Speech
505 -- 515 Donglai Zhu, Bin Ma, Haizhou Li. Speaker Verification With Feature-Space MAPLR Parameters
516 -- 527 Hiroshi Sawada, Shoko Araki, Shoji Makino. Underdetermined Convolutive Blind Source Separation via Frequency Bin-Wise Clustering and Permutation Alignment
528 -- 537 Konrad Kowalczyk, Maarten van Walstijn, Damian T. Murphy. A Phase Grating Approach to Modeling Surface Diffusion in FDTD Room Acoustics Simulations
538 -- 548 Fei Liu, Feifan Liu, Yang Liu. A Supervised Framework for Keyword Extraction From Meeting Transcripts
549 -- 557 Lin Wang, Heping Ding, Fuliang Yin. A Region-Growing Permutation Alignment Approach in Frequency-Domain Blind Source Separation of Speech Mixtures
558 -- 569 L. Anders Ekman, Volodya Grancharov, W. Bastiaan Kleijn. Double-Ended Quality Assessment System for Super-Wideband Speech
570 -- 582 Jia Jia, Shen Zhang, Fanbo Meng, Yongxin Wang, Lianhong Cai. Emotional Audio-Visual Speech Synthesis Based on PAD
583 -- 599 Francesco Nesta, Ted S. Wada, Biing-Hwang Juang. Batch-Online Semi-Blind Source Separation Applied to Multi-Channel Acoustic Echo Cancellation
600 -- 613 Prasanta Kumar Ghosh, Andreas Tsiartas, Shrikanth S. Narayanan. Robust Voice Activity Detection Using Long-Term Signal Variability
614 -- 623 Sheng Wu, Xiaojun Qiu, Ming Wu. Stereo Acoustic Echo Cancellation Employing Frequency-Domain Preprocessing and Adaptive Filter
624 -- 639 Francesco Nesta, Piergiorgio Svaizer, Maurizio Omologo. Convolutive BSS of Short Mixtures by ICA Recursively Regularized Across Frequencies
640 -- 651 Juan Andres Morales-Cordovilla, Antonio M. Peinado, Victoria E. Sánchez, José A. González. Feature Extraction Based on Pitch-Synchronous Averaging for Robust Speech Recognition
652 -- 657 Miguel Ferrer, Alberto Gonzalez, Maria de Diego, Gema Piñero. Transient Analysis of the Conventional Filtered-x Affine Projection Algorithm for Active Noise Control

Volume 19, Issue 2

225 -- 241 Joel Pinto, Garimella S. V. S. Sivaram, Mathew Magimai-Doss, Hynek Hermansky, Hervé Bourlard. Analysis of MLP-Based Hierarchical Phoneme Posterior Probability Estimator
242 -- 255 Michael Stark, Michael Wohlmayr, Franz Pernkopf. Source-Filter-Based Single-Channel Speech Separation Using Pitch Information
256 -- 265 Etan Fisher, Boaz Rafaely. Near-Field Spherical Microphone Array Processing With Radial Filtering
266 -- 276 Weiqiang Zhang, Liang He, Yan Deng, Jia Liu, M. T. Johnson. Time-Frequency Cepstral Features and Heteroscedastic Linear Discriminant Analysis for Language Recognition
277 -- 289 Colin Breithaupt, Rainer Martin. Analysis of the Decision-Directed SNR Estimator for Speech Enhancement With Respect to Low-SNR and Transient Conditions
290 -- 300 Yannis Pantazis, Olivier Rosec, Yannis Stylianou. Adaptive AM-FM Signal Decomposition With Application to Speech Analysis
301 -- 314 Mitsuko Aramaki, Mireille Besson, Richard Kronland-Martinet, Sølvi Ystad. Controlling the Perceived Material in an Impact Sound Synthesizer
315 -- 325 D. K. Kim, M. J. F. Gales. Noisy Constrained Maximum-Likelihood Linear Regression for Noise-Robust Speech Recognition
326 -- 337 Namgook Cho, C. C. Jay Kuo. Sparse Music Representation With Source-Specific Dictionaries and Its Application to Signal Separation
338 -- 347 Ben Milner, Jonathan Darch. Robust Acoustic Speech Feature Prediction From Noisy Mel-Frequency Cepstral Coefficients
348 -- 360 Joseph Tepperman, Sungbok Lee, Shrikanth Narayanan, Abeer Alwan. A Generative Student Model for Scoring Word Reading Skills
361 -- 371 Shefeng Yan, Haohai Sun, U. Peter Svensson, Xiaochuan Ma, J. M. Hovem. Optimal Modal Beamforming for Spherical Microphone Arrays
372 -- 384 Marco Kühne, Roberto Togneri, Sven Nordholm. A New Evidence Model for Missing Data Speech Recognition With Applications in Reverberant Multi-Source Environments
385 -- 395 Dinh-Quy Nguyen, Woon-Seng Gan, Andy W. H. Khong. Time-Reversal Approach to the Stereophonic Acoustic Echo Cancellation Problem
396 -- 405 F. Menzer, Christof Faller, H. Lissek. Obtaining Binaural Room Impulse Responses From B-Format Impulse Responses Using Frequency-Dependent Coherence Matching
406 -- 416 Zbynek Koldovský, Petr Tichavský. Time-Domain Blind Separation of Audio Sources on the Basis of a Complete ICA Decomposition of an Observation Space
417 -- 430 Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda. Continuous Stochastic Feature Mapping Based on Trajectory HMMs
431 -- 438 Deepu Vijayasenan, Fabio Valente, Hervé Bourlard. An Information Theoretic Combination of MFCC and TDOA Features for Speaker Diarization

Volume 19, Issue 1

1 -- 13 T. May, Steven van de Par, Armin Kohlrausch. A Probabilistic Model for Robust Localization Based on a Binaural Auditory Front-End
14 -- 23 S. Strahl, H. Hansen, Alfred Mertins. A Dynamic Fine-Grain Scalable Compression Scheme With Application to Progressive Audio Coding
24 -- 33 Albertus C. den Brinker, Harish Krishnamoorthi, E. A. Verbitskiy. Similarities and Differences Between Warped Linear Prediction and Laguerre Linear Prediction
34 -- 46 Konrad Kowalczyk, Maarten van Walstijn. Room Acoustics Simulation Using 3-D Compact Explicit FDTD Schemes
47 -- 56 Philipos C. Loizou, Gibak Kim. Reasons why Current Speech-Enhancement Algorithms do not Improve Speech Intelligibility and Suggested Solutions
57 -- 68 Mikel Gainza, Eugene Coyle. Tempo Detection Using a Hybrid Multiband Approach
69 -- 84 Takuya Yoshioka, Tomohiro Nakatani, Masato Miyoshi, Hiroshi G. Okuno. Blind Separation and Dereverberation of Speech Mixtures by Joint Optimization
85 -- 96 Yun Lei, John H. L. Hansen. Dialect Classification via Text-Independent Training and Testing for Arabic, Spanish, and Chinese
97 -- 110 Luis Antonio Azpicueta-Ruiz, Marcus Zeller, Aníbal R. Figueiras-Vidal, Jerónimo Arenas-García, Walter Kellermann. Adaptive Combination of Volterra Kernels and Its Application to Nonlinear Acoustic Echo Cancellation
111 -- 122 Jayme G. A. Barbedo, George Tzanetakis. Musical Instrument Classification Using Individual Partials
123 -- 137 Maarten Van Segbroeck, Hugo Van Hamme. Advances in Missing Feature Techniques for Robust Large-Vocabulary Continuous Speech Recognition
138 -- 152 Helene Papadopoulos, Geoffroy Peeters. Joint Estimation of Chords and Downbeats From an Audio Signal
153 -- 165 T. Raitio, Antti Suni, J. Yamagishi, H. Pulakka, J. Nurminen, M. Vainio, Paavo Alku. HMM-Based Speech Synthesis Utilizing Glottal Inverse Filtering
166 -- 175 Mohsen Rashwan, Mohamed Al-Badrashiny, Mohamed Attia, Sherif Abdou, A. Rafea. A Stochastic Arabic Diacritizer Based on a Hybrid of Factorized and Unfactorized Textual Features
176 -- 185 Andre Holzapfel, Yannis Stylianou. Scale Transform in Rhythmic Similarity of Music
186 -- 195 S. Suhadi, C. Last, Tim Fingscheidt. A Data-Driven Approach to A Priori SNR Estimation
196 -- 205 Ning Wang, P. C. Ching, Nengheng Zheng, Tan Lee. Robust Speaker Recognition Using Denoised Vocal Source and Vocal Tract Features
206 -- 219 Alexander Krueger, Ernst Warsitz, Reinhold Haeb-Umbach. Speech Enhancement With a GSC-Like Structure Employing Eigenvector-Based Transfer Function Ratios Estimation