Journal: IEEE Transactions on Audio, Speech, and Language Processing

Volume 17, Issue 8

1457 -- 1470 Aren Jansen, Partha Niyogi. Point Process Models for Spotting Keywords in Continuous Speech
1471 -- 1482 Viet Bac Le, Laurent Besacier. Automatic Speech Recognition for Under-Resourced Languages: Application to Vietnamese Language
1483 -- 1497 Christos Tzagkarakis, Athanasios Mouchtaris, Panagiotis Tsakalides. A Multichannel Sinusoidal Model Applied to Spot Microphone Signals for Immersive Audio
1498 -- 1507 Sampo Vesa. Binaural Sound Source Distance Learning in Rooms
1508 -- 1517 Ioannis Andrianakis, Paul R. White. A Speech Enhancement Algorithm Based on a Chi MRF Model of the Speech STFT Amplitudes
1518 -- 1532 Emre Özkan, I. Yücel Özbek, Mübeccel Demirekler. Dynamic Speech Spectrum Representation and Tracking Variable Number of Vocal Tract Resonance Frequencies With Time-Varying Dirichlet Process Mixture Models
1533 -- 1546 Ilana Heintz, Eric Fosler-Lussier, Chris Brew. Discriminative Input Stream Combination for Conditional Random Field Phone Recognition
1547 -- 1556 Zhi-Sheng Chen, J.-S. R. Jang. On the Use of Anti-Word Models for Audio Music Annotation and Retrieval
1557 -- 1566 Mark R. P. Thomas, Patrick A. Naylor. The SIGMA Algorithm: A Glottal Activity Detector for Electroglottographic Signals
1567 -- 1576 Zhiyong Wu, Helen M. Meng, Hongwu Yang, Lianhong Cai. Modeling the Expressivity of Input Text Semantics for Chinese Text-to-Speech Synthesis in a Spoken Dialog System
1577 -- 1590 Stefan Windmann, Reinhold Haeb-Umbach. Parameter Estimation of a State-Space Model of Noise for Robust Speech Recognition
1591 -- 1601 P. Loganathan, Andy W. H. Khong, Patrick A. Naylor. A Class of Sparseness-Controlled Algorithms for Echo Cancellation
1602 -- 1611 Bo Shao, Mitsunori Ogihara, Dingding Wang, Tao Li. Music Recommendation Based on Acoustic Features and User Access Patterns
1612 -- 1623 Chung-Hsien Wu, Chia-Hsin Hsieh. Story Segmentation and Topic Classification of Broadcast News via a Topic-Based Segmental Model and a Genetic Algorithm
1624 -- 1637 Konrad Hofbauer, Gernot Kubin, W. Bastiaan Kleijn. Speech Watermarking for Analog Flat-Fading Bandpass Channels

Volume 17, Issue 7

1253 -- 1262 Mei-Yuh Hwang, Gang Peng, Mari Ostendorf, Wen Wang, Arlo Faria, Aaron Heidel. Building a Highly Accurate Mandarin Speech Recognizer With Language-Independent Technologies and Language-Dependent Modules
1263 -- 1278 Che-Kuang Lin, Lin-Shan Lee. Improved Features and Models for Detecting Edit Disfluencies in Transcribing Spontaneous Mandarin Speech
1279 -- 1291 Jen-Tzung Chien, Chuan-Wei Ting. Acoustic Factor Analysis for Streamed Hidden Markov Modeling
1292 -- 1304 Wooil Kim, John H. L. Hansen. Time-Frequency Correlation-Based Missing-Feature Reconstruction for Robust Speech Recognition in Band-Restricted Conditions
1305 -- 1315 Hung-Yu Su, Chung-Hsien Wu. Improving Structural Statistical Machine Translation for Sign Language With Small Corpus Using Thematic Role Templates as Translation Memory
1316 -- 1324 Engin Erzin. Improving Throat Microphone Speech Recognition by Joint Analysis of Throat and Acoustic Microphone Recordings
1325 -- 1334 Mohamed Afify, Xiaodong Cui, Yuqing Gao. Stereo-Based Stochastic Mapping for Robust Speech Recognition
1335 -- 1347 Rong Tong, Bin Ma, Haizhou Li, Chng Eng Siong. A Target-Oriented Phonotactic Front-End for Spoken Language Recognition
1348 -- 1360 Dong Yu, Li Deng, Yifan Gong, Alex Acero. A Novel Framework and Training Algorithm for Variable-Parameter Hidden Markov Models
1361 -- 1371 Yipeng Li, John Woodruff, DeLiang Wang. Monaural Musical Sound Separation Based on Pitch and Common Amplitude Modulation
1372 -- 1381 Brian Kan-Wing Mak, Tsz-Chung Lai, Ivor W. Tsang, James Tin-Yau Kwok. Maximum Penalized Likelihood Kernel Regression for Fast Adaptation
1382 -- 1393 Deepu Vijayasenan, Fabio Valente, Hervé Bourlard. An Information Theoretic Approach to Speaker Diarization of Meeting Data
1394 -- 1407 Nitish Krishnamurthy, John H. L. Hansen. Babble Noise: Modeling, Analysis, and Applications
1408 -- 1419 Giso Grimm, Volker Hohmann, Birger Kollmeier. Increase and Subjective Evaluation of Feedback Stability in Hearing Aids by a Binaural Coherence-Based Noise Reduction Scheme
1420 -- 1434 Ronen Talmon, Israel Cohen, Sharon Gannot. Convolutive Transfer Function Generalized Sidelobe Canceler
1435 -- 1444 J. Barbedo, A. Lopes, P. J. Wolfe. Empirical Methods to Determine the Number of Sources in Single-Channel Musical Signals

Volume 17, Issue 6

1061 -- 1070 Nikolay D. Gaubitch, Patrick A. Naylor. Equalization of Multichannel Acoustic Systems in Oversampled Subbands
1071 -- 1086 Shmulik Markovich, Sharon Gannot, Israel Cohen. Multichannel Eigenspace Beamforming in a Reverberant Noisy Environment With Multiple Interfering Speech Signals
1087 -- 1098 Jerónimo Arenas-García, Aníbal R. Figueiras-Vidal. Adaptive Combination of Proportionate Filters for Sparse Echo Cancellation
1099 -- 1108 Tiemin Mei, Fuliang Yin, Jun Wang. Blind Source Separation Based on Cumulants With Time and Frequency Non-Properties
1109 -- 1123 Jacob Benesty, Jingdong Chen, Yiteng Arden Huang. Noise Reduction Algorithms in a Generalized Transform Domain
1124 -- 1132 Tianshu Qu, Zheng Xiao, Mei Gong, Ying Huang, Xiaodong Li, Xihong Wu. Distance-Dependent Head-Related Transfer Functions Measured With High Spatial Resolution Using a Spark Gap
1133 -- 1141 Nima Khademi Kalantari, Mohammad Ali Akhaee, Seyed Mohammad Ahadi, Hamidreza Amindavar. Robust Multiplicative Patchwork Method for Audio Watermarking
1142 -- 1158 Selina Chu, Shrikanth S. Narayanan, C. C. Jay Kuo. Environmental Sound Recognition With Time-Frequency Audio Features
1159 -- 1170 Jouni Paulus, Anssi Klapuri. Music Structure Analysis Using a Probabilistic Fitness Measure and a Greedy Search Algorithm
1171 -- 1185 Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi, Ren-Hua Wang. Integrating Articulatory Features Into HMM-Based Parametric Speech Synthesis
1186 -- 1195 Patricia Henriquez, Jesús B. Alonso, Miguel A. Ferrer, Carlos M. Travieso, Juan Ignacio Godino-Llorente, Fernando Díaz-de-María. Characterization of Healthy and Pathological Voice Through Measures Based on Nonlinear Dynamics
1196 -- 1207 B. Yegnanarayana, R. Kumara Swamy, K. S. R. Murty. Determining Mixing Parameters From Multispeaker Data Using Speech-Specific Information
1208 -- 1230 Junichi Yamagishi, Takashi Nose, Heiga Zen, Zhen-Hua Ling, Tomoki Toda, Keiichi Tokuda, Simon King, Steve Renals. Robust Speaker-Adaptive HMM-Based Text-to-Speech Synthesis
1231 -- 1239 Yao Qian, Hui Liang, Frank K. Soong. A Cross-Language State Sharing and Mapping Approach to Bilingual (Mandarin-English) TTS

Volume 17, Issue 5

861 -- 862 Ruhi Sarikaya, Katrin Kirchhoff, Tanja Schultz, Dilek Z. Hakkani-Tür. Introduction to the Special Issue on Processing Morphologically Rich Languages
863 -- 873 Thomas Pellegrini, Lori Lamel. Automatic Word Decompounding for ASR in a Morphologically Rich Language: Application to Amharic
874 -- 883 Ebru Arisoy, Dogan Can, Siddika Parlak, Hasim Sak, Murat Saraclar. Turkish Broadcast News Transcription and Retrieval
884 -- 894 Hagen Soltau, George Saon, Brian Kingsbury, Hong-Kwang Jeff Kuo, Lidia Mangu, Daniel Povey, Ahmad Emami. Advances in Arabic Speech Transcription at IBM Under the DARPA GALE Program
895 -- 903 U. Guz, Benoît Favre, Dilek Z. Hakkani-Tür, Gökhan Tür. Generative and Discriminative Methods Using Morphological Information for Sentence Segmentation of Turkish
904 -- 915 Xabier Artola, Arantza Díaz de Ilarraza Sánchez, Aitor Soroa, A. Sologaistoa. Dealing With Complex Linguistic Annotations Within a Language Processing Framework
916 -- 925 Mohamed Attia, Mohsen Rashwan, Mohamed Al-Badrashiny. Fassieh®, a Semi-Automatic Visual Interactive Tool for Morphological, PoS-Tags, Phonetic, and Semantic Annotation of Arabic Text Corpora
926 -- 934 Yassine Benajiba, Mona T. Diab, Paolo Rosso. Arabic Named Entity Recognition: A Feature-Driven Study
935 -- 944 Imed Zitouni, Xiaoqiang Luo, Radu Florian. A Cascaded Approach to Mention Detection and Chaining in Arabic
945 -- 955 Do-Gil Lee, Hae-Chang Rim. Probabilistic Modeling of Korean Morphology
956 -- 965 Kseniya B. Shalonova, Bruno Golenia, Peter Flach. Towards Learning Morphology for Under-Resourced Fusional and Agglutinating Languages
966 -- 973 Paris Smaragdis. Dynamic Range Extension Using Interleaved Gains
974 -- 984 Stefan Windmann, Reinhold Haeb-Umbach. Approaches to Iterative Speech Feature Enhancement and Recognition
985 -- 993 Gerald Friedland, Oriol Vinyals, Yan Huang, Christian Müller. Prosodic and Other Long-Term Features for Speaker Diarization
994 -- 1008 Ken'ichi Kumatani, John W. McDonough, Barbara Rauch, Dietrich Klakow, Philip N. Garner, Weifeng Li. Beamforming With a Maximum Negentropy Criterion
1009 -- 1024 Ozlem Kalinli, Shrikanth S. Narayanan. Prominence Detection Using Auditory Attention Cues and Task-Dependent High Level Information
1025 -- 1037 Yu Tsao, Chin-Hui Lee. An Ensemble Speaker and Speaking Environment Modeling Approach to Robust Speech Recognition
1038 -- 1045 Ali A. Milani, Issa M. S. Panahi, Philipos C. Loizou. A New Delayless Subband Adaptive Filtering Algorithm for Active Noise Control Systems
1046 -- 1051 Arie Livshin, Xavier Rodet. Purging Musical Instrument Sample Databases Using Automatic Musical Instrument Recognition Methods

Volume 17, Issue 4

521 -- 533 A. Homayoun Kamkar-Parsi, Martin Bouchard. Improved Noise Power Spectrum Density Estimation for Binaural Hearing Aids Operating in a Diffuse Noise Field Environment
534 -- 545 Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Masato Miyoshi. Suppression of Late Reverberation Effect on Speech Signal Using Long-Term Multiple-Step Linear Prediction
546 -- 555 Ronen Talmon, Israel Cohen, Sharon Gannot. Relative Transfer Function Identification Using Convolutive Transfer Function Approximation
556 -- 565 S. R. Mahadeva Prasanna, B. V. Sandeep Reddy. Vowel Onset Point Detection Using Source, Spectral Peaks, and Modulation Spectrum Energies
566 -- 571 Liang Wang, Woon-Seng Gan. Convergence Analysis of Narrowband Active Noise Equalizer System Under Imperfect Secondary Path Estimation
572 -- 581 Leonardo Rey Vega, Hernan Rey, Jacob Benesty, Sara Tressens. A Family of Robust Algorithms Exploiting Sparsity in Adaptive Filters
582 -- 596 Carlos Busso, Sungbok Lee, Shrikanth Narayanan. Analysis of Emotionally Salient Aspects of Fundamental Frequency for Emotion Detection
597 -- 606 M. Kuster. Multichannel Room Impulse Response Generation With Coherence Control
607 -- 613 S. Sen, Arye Nehorai. Performance Analysis of 3-D Direction Estimation Based on Head-Related Transfer Function
614 -- 624 B. Yegnanarayana, K. S. R. Murty. Event-Based Instantaneous Fundamental Frequency Estimation From Speech Signals
625 -- 638 Zhaozhang Jin, DeLiang Wang. A Supervised Learning Approach to Monaural Segregation of Reverberant Speech
639 -- 649 Hiroko Kato Solvang, Y. Nagahara, Shoko Araki, Hiroshi Sawada, Shoji Makino. Frequency-Domain Pearson Distribution Approach for Independent Component Analysis (FD-Pearson-ICA) in Blind Source Separation
650 -- 664 Yu Takahashi, Tomoya Takatani, Keiichi Osako, Hiroshi Saruwatari, Kiyohiro Shikano. Blind Spatial Subtraction Array for Speech Enhancement in Noisy Environment
665 -- 681 Huawei Chen, Wee Ser. Design of Robust Broadband Beamformers With Passband Shaping Characteristics Using Tikhonov Regularization
682 -- 692 Jwu-Sheng Hu, Wei-Han Liu. Location Classification of Nonstationary Sound Sources Using Binaural Room Distribution Patterns
693 -- 703 Jesper Højvang Jensen, Mads Græsbøll Christensen, D. P. W. Ellis, Søren Holdt Jensen. Quantitative Analysis of a Common Audio Similarity Measure
704 -- 713 Hong Kook Kim, R. C. Rose. Cepstrum-Domain Model Combination Based on Decomposition of Speech and Noise Using MMSE-LSA for ASR in Noisy Environments
714 -- 723 Kai Yu, M. J. F. Gales, Philip C. Woodland. Unsupervised Adaptation With Discriminative Mapping Transforms
724 -- 732 Teemu Hirsimäki, Janne Pylkkönen, Mikko Kurimo. Importance of High-Order N-Gram Models in Morph-Based Speech Recognition
733 -- 747 Jost Schatzmann, S. Young. The Hidden Agenda User Simulation Model
748 -- 757 C. Longworth, M. J. F. Gales. Combining Derivative and Parametric Kernels for Speaker Verification
758 -- 774 Nicolás Morales, Doroteo T. Toledano, John H. L. Hansen, Javier Garrido. Feature Compensation Techniques for ASR on Band-Limited Speech
775 -- 786 Y. Agiomyrgiannakis, Yannis Stylianou. Wrapped Gaussian Mixture Models for Modeling and High-Rate Quantization of Phase Data of Speech
787 -- 802 Jingdong Chen, Jacob Benesty, Yiteng Huang. Study of the Noise-Reduction Problem in the Karhunen-Loève Expansion Domain
803 -- 818 Klaus Macherey, Oliver Bender, Hermann Ney. Applications of Statistical Machine Translation Approaches to Spoken Language Understanding
819 -- 829 Wen Zhang, Rodney A. Kennedy, Thushara D. Abhayapala. Efficient Continuous HRTF Model Using Data Independent Basis Functions: Experimentally Guided Approach
830 -- 839 Malay Gupta, Scott C. Douglas. A Spatio-Temporal Speech Enhancement Technique Based on Generalized Eigenvalue Decomposition
840 -- 847 Pavel Ircing, Josef V. Psutka, Josef Psutka. Using Morphological Information for Robust Language Modeling in Czech ASR System
848 -- 853 V. R. Apsingekar, P. L. De Leon. Speaker Model Clustering for Efficient Speaker Identification in Large Population Applications

Volume 17, Issue 3

411 -- 422 A. Katsamanis, George Papandreou, Petros Maragos. Face Active Appearance Modeling and Speech Acoustic Information to Recover Articulation
423 -- 435 George Papandreou, A. Katsamanis, V. Pitsikalis, Petros Maragos. Adaptive Multimodal Fusion by Uncertainty Compensation With Application to Audiovisual Speech Recognition
436 -- 445 E. Sanchez-Soto, A. Potamianos, K. Daoudi. Unsupervised Stream-Weights Computation in Classification and Recognition Tasks
446 -- 458 Jon Barker, Xu Shao. Energetic and Informational Masking Effects in an Audiovisual Speech Recognition System
459 -- 468 Javier Melenchón, Elisa Martínez, Fernando De la Torre, José A. Montero. Emphatic Visual Speech Synthesis
469 -- 477 Jianhua Tao, Le Xin, Panrong Yin. Realistic Visual Speech Synthesis Based on Hybrid Concatenation Method
478 -- 485 Peng Liu, Frank K. Soong. Graph-Based Partial Hypothesis Fusion for Pen-Aided Speech Input
486 -- 500 Pui-Yu Hui, Helen M. Meng. Cross-Modality Semantic Integration With Hypothesis Rescoring for Robust Interpretation of Multimodal User Interactions
501 -- 513 Dinesh Babu Jayagopi, Hayley Hung, Chuohao Yeo, Daniel Gatica-Perez. Modeling Dominance in Group Conversations Using Nonverbal Activity Cues

Volume 17, Issue 2

205 -- 220 Chang-Wen Hsu, Lin-Shan Lee. Higher Order Cepstral Moment Normalization for Improved Robust Speech Recognition
221 -- 230 Rade Kutil. Optimized Sinusoid Synthesis via Inverse Truncated Fourier Transform
231 -- 246 T. Yoshioka, T. Nakatani, M. Miyoshi. Integrated Speech Enhancement Method Using Noise Suppression and Dereverberation
247 -- 252 György Wersényi. Effect of Emulated Head-Tracking for Reducing Localization Errors in Virtual Audio Simulation
253 -- 266 P. Krishnamoorthy, S. Prasanna. Reverberant Speech Enhancement by Temporal and Spectral Processing
267 -- 276 Jen-Tzung Chien, Meng-Sung Wu. Minimum Rank Error Language Modeling
277 -- 286 P. C. Pandey, M. S. Shah. Estimation of Place of Articulation During Stop Closures of Vowel-Consonant-Vowel Utterances
287 -- 298 George Almpanidis, Margarita Kotti, Constantine Kotropoulos. Robust Detection of Phone Boundaries Using Model Selection Criteria With Few Observations
299 -- 311 Leandro E. Di Persia, Diego H. Milone, Masuzo Yanagida. Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources
312 -- 323 Matthias Wölfel. Enhanced Speech Features by Single-Channel Joint Compensation of Noise and Reverberation
324 -- 334 Marc Delcroix, Tomohiro Nakatani, S. Watanabe. Static and Dynamic Variance Compensation for Recognition of Reverberant Speech With Dereverberation Preprocessing
335 -- 343 François Pachet, Pierre Roy. Improving Multilabel Analysis of Music Titles: A Large-Scale Validation of the Correction Approach
344 -- 353 R. Saeidi, H. R. S. Mohammadi, T. Ganchev, R. D. Rodman. Particle Swarm Optimization for Sorted Adapted Gaussian Mixture Models
354 -- 365 Yasser Hifny, Steve Renals. Speech Recognition Using Augmented Conditional Random Fields
366 -- 378 J. Hansen, V. Varadarajan. Analysis and Compensation of Lombard Speech Across Noise Type and Levels With Application to In-Set/Out-of-Set Speaker Recognition
379 -- 391 S. Subasingha, M. N. Murthi, Søren Vang Andersen. Gaussian Mixture Kalman Predictive Coding of Line Spectral Frequencies
392 -- 401 Jacek Dmochowski, Jacob Benesty, Sofiène Affes. An Information-Theoretic View of Array Processing

Volume 17, Issue 1

2 -- 12 Serdar Yildirim, Shrikanth Narayanan. Automatic Detection of Disfluency Boundaries in Spontaneous Speech of Children Using Audio-Visual Information
13 -- 23 Abhinav Sethy, Panayiotis G. Georgiou, Bhuvana Ramabhadran, Shrikanth S. Narayanan. An Iterative Relative Entropy Minimization-Based Data Selection Approach for n-Gram Model Adaptation
24 -- 37 Jiucang Hao, Hagai Attias, Srikantan S. Nagarajan, Te-Won Lee, Terrence J. Sejnowski. Speech Enhancement, Gain, and Noise Spectrum Adaptation Using Approximate Bayesian Estimation
38 -- 51 Simon Doclo, Marc Moonen, Tim Van den Bogaert, Jan Wouters. Reduced-Bandwidth and Distributed MWF-Based Noise Reduction Algorithms for Binaural Hearing Aids
52 -- 65 Y. Nagata, S. Iwasaki, T. Hariyama, T. Fujioka, T. Obara, T. Wakatake, M. Abe. Binaural Localization Based on Weighted Wiener Gain Improved by Incremental Source Attenuation
66 -- 83 J. Yamagishi, T. Kobayashi, Y. Nakano, K. Ogata, J. Isogai. Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm
84 -- 94 Shih-Hsiang Lin, Berlin Chen, Yao-Ming Yeh. Exploring the Use of Speech Features and Their Corresponding Distribution Characteristics for Robust Speech Recognition
95 -- 106 Yi-Ting Chen, Berlin Chen, Hsin-Min Wang. A Probabilistic Generative Framework for Extractive Broadcast News Speech Summarization
107 -- 116 Yan Jennifer Wu, Thushara D. Abhayapala. Theory and Design of Soundfield Reproduction Using Continuous Loudspeaker Concept
117 -- 126 Radoslaw Mazur, Alfred Mertins. An Approach for Solving the Permutation Problem of Convolutive Blind Source Separation Based on Statistical Signal Models
127 -- 137 Chung-Hsien Wu, Chung-Han Lee, Chung-Hau Liang. Idiolect Extraction and Generation for Personalized Speaking Style Modeling
138 -- 149 S. Ananthakrishnan, S. Narayanan. Unsupervised Adaptation of Categorical Prosody Models for Prosody Labeling and Speech Recognition
150 -- 162 R. M. M. Derkx, K. Janse. Theoretical Analysis of a First-Order Azimuth-Steerable Superdirective Microphone Array
163 -- 173 Ebru Arisoy, Murat Saraclar. Lattice Extension and Vocabulary Adaptation for Turkish LVCSR
174 -- 186 Cyril Joder, Slim Essid, Gaël Richard. Temporal Integration for Audio Classification With Application to Musical Instrument Classification
187 -- 197 Man-Hung Siu, Xi Yang, Herbert Gish. Discriminatively Trained GMMs for Language Classification Using Boosting Methods