INTERSPEECH 2006 - ICSLP, Ninth International Conference on Spoken Language Processing, Pittsburgh, PA, USA, September 17-21, 2006

INTERSPEECH 2006 - ICSLP, Ninth International Conference on Spoken Language Processing, Pittsburgh, PA, USA, September 17-21, 2006. ISCA, 2006.

Conference: interspeech2006

Comparison of Slovak and Czech speech recognition based on grapheme and phoneme acoustic models. Slavomír Lihan, Jozef Juhar, Anton Cizmar. [doi]

A multi-space distribution (MSD) approach to speech recognition of tonal languages. Huanliang Wang, Yao Qian, Frank K. Soong, Jian-Lai Zhou, Jiqing Han. [doi]

A text-prompted distributed speaker verification system implemented on a cellular phone and a mobile terminal. Tsuneo Kato, Hisashi Kawai. [doi]

Developing consistent pronunciation models for phonemic variants. Marelie H. Davel, Etienne Barnard. [doi]

The segmentation of multi-channel meeting recordings for automatic speech recognition. John Dines, Jithendra Vepa, Thomas Hain. [doi]

Low complexity LID using pruned pattern tables of LZW. S. V. Basavaraja, T. V. Sreenivas. [doi]

Interleaving and MMSE estimation with VQ replicas for distributed speech recognition over lossy packet networks. Angel M. Gomez, Antonio M. Peinado, Victoria E. Sánchez, José L. Carmona, Antonio J. Rubio. [doi]

Automatic acoustic identification of insects inspired by the speaker recognition paradigm. Ilyas Potamitis, Todor Ganchev, Nikos Fakotakis. [doi]

Unsupervised detection of whispered speech in the presence of normal phonation. Michael A. Carlin, Brett Y. Smolenski, Stanley J. Wenndt. [doi]

A multilingual embodied conversational agent for tutoring speech and language learning. Dominic W. Massaro, Ying Liu, Trevor H. Chen, Charles Perfetti. [doi]

Investigating automatic decomposition for ASR in less represented languages. Thomas Pellegrini, Lori Lamel. [doi]

Online speaker change detection by combining BIC with microphone array beamforming. Joerg Schmalenstroeer, Reinhold Haeb-Umbach. [doi]

Unsupervised language model adaptation based on automatic text collection from WWW. Motoyuki Suzuki, Yasutomo Kajiura, Akinori Ito, Shozo Makino. [doi]

Acoustic characterization of children with speech delay. H. Timothy Bunnell, James B. Polikoff. [doi]

Learning from errors in grapheme-to-phoneme conversion. Tatyana Polyakova, Antonio Bonafonte. [doi]

The ICSI+ multilingual sentence segmentation system. M. Zimmerman, Dilek Hakkani-Tür, James G. Fung, Nikki Mirghafori, L. Gottlieb, Elizabeth Shriberg, Yang Liu. [doi]

Pronunciation variation modeling for Mandarin with accent. Chi Zhang, Ji Wu, Xi Xiao, Zuoying Wang. [doi]

On the correlation between energy and pitch accent in read English speech. Andrew Rosenberg, Julia Hirschberg. [doi]

Acoustic modeling for spoken dialogue systems based on unsupervised utterance-based selective training. Tobias Cincarek, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano. [doi]

Is ASR accurate enough for automated reading tutors, and how can we tell? Jack Mostow. [doi]

Issues with uncertainty decoding for noise robust speech recognition. H. Liao, M. J. F. Gales. [doi]

Identification of confusion and surprise in spoken dialog using prosodic features. Rohit Kumar, Carolyn Penstein Rosé, Diane J. Litman. [doi]

Modelling aspiration noise during phonation using the LF voice source model. Christer Gobl. [doi]

Conceptual decoding from word lattices: application to the spoken dialogue corpus MEDIA. Christophe Servan, Christian Raymond, Frédéric Béchet, Pascal Nocera. [doi]

An ERB loudness pattern based objective speech quality measure. Guo Chen, Vijay Parsa, Susan Scollie. [doi]

Automatic language identification using wavelets. Ana Lilia Reyes-Herrera, Luis Villaseñor Pineda, Manuel Montes-y-Gómez. [doi]

Multi-layered summarization of spoken document archives by information extraction and semantic structuring. Lin-Shan Lee, Sheng-yi Kong, Yi-Cheng Pan, Yi-Sheng Fu, Yu-tsun Huang. [doi]

Estimation of the quality dimension directness/frequency content for the instrumental assessment of speech quality. Kirstin Scholz, Marcel Wältermann, Lu Huo, Alexander Raake, Sebastian Möller, Ulrich Heute. [doi]

From pre-recorded prompts to corporate voices: on the migration of interactive voice response applications. Volker Fischer, Siegfried Kunzmann. [doi]

Efficient Gaussian mixture model evaluation in voice conversion. Jilei Tian, Jani Nurminen, Victor Popa. [doi]

A robust fusion method for multilingual spoken document retrieval systems employing tiered resources. Murat Akbacak, John H. L. Hansen. [doi]

Performance analysis of various single channel speech enhancement algorithms for automatic speech recognition. Myung-Suk Song, Chang-Heon Lee, Hong-Goo Kang. [doi]

Automatic alignment and error correction of human generated transcripts for long speech recordings. Timothy J. Hazen. [doi]

Boosting HMM performance with a memory upgrade. Mathias De Wachter, Kris Demuynck, Dirk Van Compernolle. [doi]

Robust automatic speech recognition for accented Mandarin in car environments. Pei Ding, Lei He, Xiang Yan, Jie Hao. [doi]

An integrated solution for error concealment in DSR systems over wireless channels. Antonio M. Peinado, Angel M. Gomez, Victoria E. Sánchez, José L. Pérez-Córdoba, Antonio J. Rubio. [doi]

Word order and tonal shape in the production of focus in short Finnish utterances. Martti Vainio, Juhani Järvikivi, Stefan Werner. [doi]

Minimum generation error criterion for tree-based clustering of context dependent HMMs. Yi-Jian Wu, Wu Guo, Ren-Hua Wang. [doi]

Improvement speaker clustering using global similarity features. Konstantin Biatov, Joachim Köhler. [doi]

A constrained Baum-Welch algorithm for improved phoneme segmentation and efficient training. David Huggins-Daines, Alexander I. Rudnicky. [doi]

Acoustic analysis and automatic recognition of spontaneous children's speech. Matteo Gerosa, Diego Giuliani, Shrikanth Narayanan. [doi]

Distance measure between Gaussian distributions for discriminating speaking styles. Goshu Nagino, Makoto Shozakai. [doi]

Speaker localization based on oriented global coherence field. Alessio Brutti, Maurizio Omologo, Piergiorgio Svaizer. [doi]

Physiologically-motivated synchrony-based processing for robust automatic speech recognition. Chanwoo Kim, Yu-Hsiang Bosco Chiu, Richard M. Stern. [doi]

Frequency warping based on mapping formant parameters. Zhiwei Shuang, Raimo Bakis, Slava Shechtman, Dan Chazan, Yong Qin. [doi]

Language, gender, speaking style and language proficiency as factors influencing the autonomous vocalic filler production in spontaneous speech. Ioana Vasilescu, Martine Adda-Decker. [doi]

A time-synchronous phonetic decoder for a long-contextual-span hidden trajectory model. Xiaolong Li, Li Deng, Dong Yu, Alex Acero. [doi]

Phoneme-to-grapheme mapping for spoken inquiries to the semantic web. Axel Horndasch, Elmar Nöth, Anton Batliner, Volker Warnke. [doi]

Sloparl - Slovenian parliamentary speech and text corpus for large vocabulary continuous speech recognition. Andrej Zgank, Tomaz Rotovnik, Matej Grasic, Marko Kos, Damjan Vlaj, Zdravko Kacic. [doi]

Novel entropy based moving average refiners for HMM landmarks. Rahul Chitturi, Mark Hasegawa-Johnson. [doi]

Discriminating speech and non-speech with regularized least squares. Ryan Rifkin, Nima Mesgarani. [doi]

Pitch range and pause duration as markers of discourse hierarchy: perception experiments. Jörg Mayer, Ekaterina Jasinskaja, Ulrike Kölsch. [doi]

A new dual-microphone speech enhancement method for oriented noises. H. R. Abutalebi, M. Pourahmadi, M. R. Aghabozorgi. [doi]

Multilingual non-native speech recognition using phonetic confusion-based acoustic model modification and graphemic constraints. Ghazi Bouselmi, Dominique Fohr, Irina Illina, Jean-Paul Haton. [doi]

Automatic grammar correction for second-language learners. John Lee, Stephanie Seneff. [doi]

Dynamic extension of a grammar-based dialogue system: constructing an all-recipes knowing robot. Petra Gieselmann, Alex Waibel. [doi]

Spontaneous Thai speech recognition. Monika Woszczyna, Paisarn Charoenpornsawat, Tanja Schultz. [doi]

The role of positional probability in the segmentation of Cantonese speech. Michael C. W. Yip. [doi]

Improving body transmitted unvoiced speech with statistical voice conversion. Mikihiro Nakagiri, Tomoki Toda, Hideki Kashioka, Kiyohiro Shikano. [doi]

Enhanced dynamic codebook reordering for advanced quantizer structures. Jani Nurminen. [doi]

Automatic speech recognition of Cantonese-English code-mixing utterances. Joyce Y. C. Chan, P. C. Ching, Tan Lee, Houwei Cao. [doi]

Comparison of prediction based LSF quantization methods using split VQ. Saikat Chatterjee, T. V. Sreenivas. [doi]

HMM-based unit selection using frame sized speech segments. Zhen-Hua Ling, Ren-Hua Wang. [doi]

System- versus user-initiative dialog strategy for driver information systems. Chantal Ackermann, Marion Libossek. [doi]

Software architectures for incremental understanding of human speech. Gregory Aist, James F. Allen, Ellen Campana, Lucian Galescu, Carlos Gómez Gallo, Scott C. Stoness, Mary D. Swift, Michael K. Tanenhaus. [doi]

A TextTiling based approach to topic boundary detection in meetings. Satanjeev Banerjee, Alexander I. Rudnicky. [doi]

Significance of formants from difference spectrum for speaker identification. Kishore Prahallad, Varanasi Sudhakar, Veluru Ranganatham, Krishna M. Bharat, S. Roy Debashish. [doi]

Investigations of issues for using multiple acoustic models to improve continuous speech recognition. Rong Zhang, Alexander I. Rudnicky. [doi]

A technique for controlling voice quality of synthetic speech using multiple regression HSMM. Makoto Tachibana, Takashi Nose, Junichi Yamagishi, Takao Kobayashi. [doi]

Developing speech dialogs for multimodal HMIs using finite state machines. Silke Goronzy, Raquel Mochales, Nicole Beringer. [doi]

Improving Arabic HMM based speech synthesis quality. Ossama Abdel Hamid, Sherif Mahdy Abdou, Mohsen Rashwan. [doi]

Feature analysis for emotion recognition from Mandarin speech considering the special characteristics of Chinese language. Yi-Hao Kao, Lin-Shan Lee. [doi]

Example-based grapheme-to-phoneme conversion for Thai. Paisarn Charoenpornsawat, Tanja Schultz. [doi]

Perplexity based linguistic model adaptation for speech summarisation. Pierre Chatain, Edward W. D. Whittaker, Joanna Mrozinski, Sadaoki Furui. [doi]

A syllable based continuous speech recognizer for Tamil. A. Lakshmi, Hema A. Murthy. [doi]

"Yeah right": sarcasm recognition for spoken dialogue systems. Joseph Tepperman, David R. Traum, Shrikanth Narayanan. [doi]

Lingua machinae - an unorthodox proposal. Florian Schiel, Christoph Draxler, Marion Libossek. [doi]

Tone recognition of continuous speech of standard Chinese using neural network and tone nucleus model. Keikichi Hirose, Hui Hu, Xiaodong Wang, Nobuaki Minematsu. [doi]

Saliency parsing for automated directory assistance. Issac Alphonso, Shuangyu Chang. [doi]

Lost speech reconstruction method using speech recognition based on missing feature theory and HMM-based speech synthesis. Shingo Kuroiwa, Satoru Tsuge, Fuji Ren. [doi]

Discriminative models for spoken language understanding. Ye-Yi Wang, Alex Acero. [doi]

Generating complementary systems for speech recognition. Catherine Breslin, Mark J. F. Gales. [doi]

Linguistic tuple segmentation in n-gram-based statistical machine translation. Adrià de Gispert, José B. Mariño. [doi]

Soft decision combining for dual channel noise reduction. Timo Gerkmann, Rainer Martin. [doi]

Bayesian networks for phonetic classification using time-scale features. Franz Pernkopf, Tuan Van Pham. [doi]

Automatic metadata generation and video editing based on speech and image recognition for medical education contents. Satoshi Tamura, Koji Hashimoto, Jiong Zhu, Satoru Hayamizu, Hirotsugu Asai, Hideki Tanahashi, Makoto Kanagawa. [doi]

Dynamic evidence models in a DBN phone recognizer. William Schuler, Tim Miller, Stephen Wu, Andrew Exley. [doi]

CENSREC2: corpus and evaluation environments for in car continuous digit speech recognition. Satoshi Nakamura, Masakiyo Fujimoto, Kazuya Takeda. [doi]

Cross-language evaluation of voice-to-phoneme conversions for voice-tag application in embedded platforms. Yan Ming Cheng, Changxue Ma, Lynette Melnar. [doi]

A study on detection based automatic speech recognition. Chengyuan Ma, Yu Tsao, Chin-Hui Lee. [doi]

Modeling the precedence effect for binaural sound source localization in noisy and echoic environments. Martin Heckmann, Tobias Rodemann, Björn Schölling, Frank Joublin, Christian Goerick. [doi]

Improved performance evaluation of speech event detectors. Carla Lopes, Fernando Perdigão. [doi]

Single-channel speech separation using sparse non-negative matrix factorization. Mikkel N. Schmidt, Rasmus Kongsgaard Olsson. [doi]

Pitch-scale modification using the modulated aspiration noise source. Daryush Mehta, Thomas F. Quatieri. [doi]

A comparison of inter-transcriber reliability for two systems of prosodic annotation: RaP (rhythm and pitch) and ToBI (tones and break indices). Laura Dilley, Mara Breen, Marti Bolivar, John Kraemer, Edward Gibson. [doi]

Improved tone modeling for Mandarin broadcast news speech recognition. Xin Lei, Man-Hung Siu, Mei-Yuh Hwang, Mari Ostendorf, Tan Lee. [doi]

Latent prosodic modeling (LPM) for speech with applications in recognizing spontaneous Mandarin speech with disfluencies. Che-Kuang Lin, Lin-Shan Lee. [doi]

A stochastic approach for dialog management based on neural networks. Lluís F. Hurtado, David Griol, Encarna Segarra, Emilio Sanchis. [doi]

Adaptive multimodal fusion by uncertainty compensation. Vassilis Pitsikalis, Athanassios Katsamanis, George Papandreou, Petros Maragos. [doi]

Semi-automatic extraction of vocal tract movements from cineradiographic data. Julie Fontecave, Frédéric Berthommier. [doi]

Improving tone recognition with combined frequency and amplitude modelling. Siwei Wang, Gina-Anne Levow. [doi]

Generating time-constrained audio presentations of structured information. Brian Langner, Rohit Kumar, Arthur Chan, Lingyun Gu, Alan W. Black. [doi]

Open-vocabulary spoken document retrieval based on new subword models and subword phonetic similarity. Kohei Iwata, Yoshiaki Itoh, Kazunori Kojima, Masaaki Ishigame, Kazuyo Tanaka, Shi-wook Lee. [doi]

A multilingual expectations model for contextual utterances in mixed-initiative spoken dialogue. Hartwig Holzapfel, Alex Waibel. [doi]

The IBM 2006 speech transcription system for European parliamentary speeches. Bhuvana Ramabhadran, Olivier Siohan, Lidia Mangu, Geoffrey Zweig, Martin Westphal, Henrik Schulz, Alvaro Soneiro. [doi]

Development and evaluation of speech database in automotive environments for practical speech recognition systems. Yasunari Obuchi, Nobuo Hataoka. [doi]

Evaluation of a spoken dialogue system with usability tests and long-term pilot studies: similarities and differences. Markku Turunen, Jaakko Hakulinen, Anssi Kainulainen. [doi]

Towards continuous speech recognition using surface electromyography. Szu-Chen Stan Jou, Tanja Schultz, Matthias Walliczek, Florian Kraft, Alex Waibel. [doi]

Modeling of speech signals based on Bessel-like orthogonal transform. Giorgio Biagetti, Paolo Crippa, Claudio Turchetti. [doi]

On the relation between maximum spectral transition positions and phone boundaries. Sorin Dusan, Lawrence R. Rabiner. [doi]

Multi-stream ASR: an oracle perspective. Hemant Misra, Jithendra Vepa, Hervé Bourlard. [doi]

Feature extraction for spectral continuity measures in concatenative speech synthesis. Barry Kirkpatrick, Darragh O'Brien, Ronan Scaife. [doi]

Adaptive filtering for attenuating musical noise caused by spectral subtraction. Takahiro Murakami, Yoshihisa Ishida. [doi]

Recent advances in phonotactic language recognition using binary-decision trees. Jiri Navratil. [doi]

A case study in the identification of prosodic cues to turn-taking: back-channeling in Arabic. Nigel G. Ward, Yaffa Al Bayyari. [doi]

Tracking of involuntary formant frequency variations and application to parkinsonian speech. Laurence Cnockaert, Jean Schoentgen, Pascal Auzou, Canan Ozsancak, Francis Grenez. [doi]

Speech technology for minority languages: the case of Irish (Gaelic). Ailbhe Ní Chasaide, John Wogan, Brian Ó Raghallaigh, Áine Ní Bhriain, Eric Zoerner, Harald Berthelsen, Christer Gobl. [doi]

Classroom success of an intelligent tutoring system for lexical practice and reading comprehension. Michael Heilman, Kevyn Collins-Thompson, Jamie Callan, Maxine Eskenazi. [doi]

Cues for hesitation in speech synthesis. Rolf Carlson, Kjell Gustafson, Eva Strangert. [doi]

Automatic generation of statistical language models for interactive voice response applications. Mithun Balakrishna, Cyril Cerovic, Dan I. Moldovan, Ellis Cave. [doi]

The use of Bayesian network for incorporating accent, gender and wide-context dependency information. Sakriani Sakti, Konstantin Markov, Satoshi Nakamura. [doi]

Using speech recognition technique for constructing a phonetically transcribed Taiwanese (Min-nan) text corpus. Min-Siong Liang, Ren-Yuan Lyu, Yuang-Chin Chiang. [doi]

Multi-source far-distance microphone selection and combination for automatic transcription of lectures. Matthias Wölfel, Christian Fügen, Shajith Ikbal, John W. McDonough. [doi]

Infinite models for speaker clustering. Fabio Valente. [doi]

Non-intrusive speech quality assessment with low computational complexity. Volodya Grancharov, David Y. Zhao, Jonas Lindblom, W. Bastiaan Kleijn. [doi]

Improvements to bucket box intersection algorithm for fast GMM computation in embedded speech recognition systems. Min Tang, Aravind Ganapathiraju. [doi]

Automatic transcription of Somali language. Abdillahi Nimaan, Pascal Nocera, Jean-François Bonastre. [doi]

Learning multi-goal dialogue strategies using reinforcement learning with reduced state-action spaces. Heriberto Cuayáhuitl, Steve Renals, Oliver Lemon, Hiroshi Shimodaira. [doi]

Using SVM and error-correcting codes for multiclass dialog act classification in meeting corpus. Yang Liu 0004. [doi]

Comparison of the ITU-T P.85 standard to other methods for the evaluation of text-to-speech systems. Dmitry Sityaev, Katherine Knill, Tina Burrows. [doi]

Speaker clustered regression-class trees for MLLR adaptation. Arindam Mandal, Mari Ostendorf, Andreas Stolcke. [doi]

An MRI based study of the acoustic effects of sinus cavities and its application to speaker recognition. Tarun Pruthi, Carol Y. Espy-Wilson. [doi]

Finding the gaps: applying a connectionist model of word segmentation to noisy phone-recognized speech data. C. Anton Rytting. [doi]

Robust interpretation in dialogue by combining confidence scores with contextual features. Matthew Purver, Florin Ratiu, Lawrence Cavedon. [doi]

Distant-talking continuous speech recognition based on a novel reverberation model in the feature domain. Armin Sehr, Marcus Zeller, Walter Kellermann. [doi]

Monitoring of the natural voice variations in open and closed phases with frequency warped ARMA modeling. Pedro J. Quintana-Morales, Juan L. Navarro-Mesa, Antonio G. Ravelo-Garcia, Fernando D. Lorenzo-Garcia. [doi]

Phonetic research on accented Chinese in three dialectal regions: Shanghai, Wuhan and Xiamen. Aijun Li, Qiang Fang, Ziyu Xiong. [doi]

Recent advances of IBM's handheld speech translation system. Weizhong Zhu, Bowen Zhou, Charles Prosser, Pavel Krbec, Yuqing Gao. [doi]

Phone recognition analysis for trajectory HMM. Le Zhang, Steve Renals. [doi]

Personality factors in human deception detection: comparing human to machine performance. Frank Enos, Stefan Benus, Robin L. Cautin, Martin Graciarena, Julia Hirschberg, Elizabeth Shriberg. [doi]

Segment connection networks for corpus-based speech synthesis. Geert Coorman. [doi]

Optimization of class weights for LDA feature transformations. Andrej Ljolje. [doi]

Perceptive and acoustic measurement of average speaking pitch of female and male speakers in German radio news. Sven Grawunder, Ines Bose, Birgit Hertha, Franziska Trauselt, Lutz Christian Anders. [doi]

Lattice LP filtering for noise reduction in speech signals. Erhard Rank, Gernot Kubin. [doi]

New considerations for vowel nasalization based on separate mouth-nose recording. Gang Feng, Cyril Kotenkoff. [doi]

Hypothesis-based feature combination of multiple speech inputs for robust speech recognition in automotive environments. Yasunari Obuchi, Nobuo Hataoka. [doi]

Pronunciation dependent language models. Andrej Ljolje. [doi]

Characterization of cued speech vowels from the inner lip contour. Noureddine Aboutabit, Denis Beautemps, Laurent Besacier. [doi]

Study on speaker verification on emotional speech. Wei Wu, Thomas Fang Zheng, Ming-Xing Xu, Huanjun Bao. [doi]

Fusion of phonotactic and prosodic knowledge for language identification. Chi-Yueh Lin, Hsiao-Chuan Wang. [doi]

Prompt selection with reinforcement learning in an AT&T call routing application. Charles Lewis, Giuseppe Di Fabbrizio. [doi]

Nasality perception of vowels in different language background. Shahina Haque, Tomio Takara. [doi]

Continuous time-frequency masking method for blind speech separation with adaptive choice of threshold parameter using ICA. Zbynek Koldovský, Jan Nouza, Jan Kolorenc. [doi]

Identify language origin of personal names with normalized appearance number of web pages. Jia-Li You, Yining Chen, Min Chu, Yong Zhao, Jin-Lin Wang. [doi]

Intra-speaker variability compensation in speaker verification with limited enrolling data. Claudio Garretón, Néstor Becerra Yoma, Carlos Molina, Fernando Huenupán. [doi]

Automatic detection of irregular phonation in continuous speech. Srikanth Vishnubhotla, Carol Y. Espy-Wilson. [doi]

Friends and enemies: a novel initialization for speaker diarization. Xavier Anguera, Chuck Wooters, Javier Hernando. [doi]

Speech enhancement using modified phase opponency model. Om Deshmukh, Carol Y. Espy-Wilson. [doi]

Thesaurus expansion using similar word pairs from patent documents. Yoshimi Suzuki, Fumiyo Fukumoto. [doi]

Grapheme-to-phoneme conversion using automatically extracted associative rules for Korean TTS system. Jinsik Lee, Seungwon Kim, Gary Geunbae Lee. [doi]

An unified unit-selection framework for ultra low bit-rate speech coding. V. Ramasubramanian, D. Harish. [doi]

Topic-based language modeling with dynamic Bayesian networks. Pascal Wiggers, Léon J. M. Rothkrantz. [doi]

Synthesizing breathiness in natural speech with sinusoidal modelling. Brett Matthews, Raimo Bakis, Ellen Eide. [doi]

A weight estimation method using LDA for multi-band speech recognition. Koji Iwano, Kaname Kojima, Sadaoki Furui. [doi]

Colloquial Iraqi ASR for speech translation. Shirin Saleem, Rohit Prasad, Prem Natarajan. [doi]

Soundbite detection in broadcast news domain. Sameer Maskey, Julia Hirschberg. [doi]

An online adaptive filtering algorithm for the vocal joystick. Xiao Li, Jonathan Malkin, Susumu Harada, Jeff A. Bilmes, Richard Wright, James A. Landay. [doi]

A successive state and mixture splitting for optimizing the size of models in speech recognition. Soo-Young Suk, Seong-Jun Hahm, Ho-Youl Jung, Hyun-Yeol Chung. [doi]

Memo: towards automatic usability evaluation of spoken dialogue services by user error simulations. Sebastian Möller, Roman Englert, Klaus-Peter Engelbrecht, Verena Hafner, Anthony Jameson, Antti Oulasvirta, Alexander Raake, Norbert Reithinger. [doi]

Unfilled pauses in Japanese sentences read aloud by non-native learners. Hiroko Hirano, Goh Kawai, Keikichi Hirose, Nobuaki Minematsu. [doi]

Feature and model space speaker adaptation with full covariance Gaussians. Daniel Povey, George Saon. [doi]

Individual on-line variance adaptation of frequency filtered parameters for robust ASR. Jesús Vicente-Peña, Fernando Díaz-de-María, W. Bastiaan Kleijn. [doi]

Stochastic vector mapping-based feature enhancement using prior model and environment adaptation for noisy speech recognition. Chia-Hsin Hsieh, Chung-Hsien Wu, Jun-Yu Lin. [doi]

Signal modification incorporating perceptual weighting filter. Joon-Hyuk Chang, Woohyung Lim, Nam Soo Kim. [doi]

Noise robust model-based voice activity detection. Ángel de la Torre, Javier Ramírez, M. Carmen Benítez, José C. Segura, Luz García, Antonio J. Rubio. [doi]

Generating German intonation with a trainable prosodic model. Gérard Bailly, Jan Gorisch. [doi]

Training native English speakers to identify Japanese vowel length with fast rate sentences. Yukari Hirata, Elizabeth Whitehurst, Emily Cullings, Jacob Whiton, Carol Glenn. [doi]

Quality improvement of telephone speech by artificial bandwidth expansion - listening tests in three languages. Hannu Pulakka, Laura Laaksonen, Paavo Alku. [doi]

Single frame selection for phoneme classification. Tingyao Wu, Dirk Van Compernolle, Jacques Duchateau, Hugo Van Hamme. [doi]

Multi-modal system ICANDO: intellectual computer assistant for disabled operators. Alexey Karpov, Andrey Ronzhin, Alexandre Cadiou. [doi]

Two-microphone voice activity detection in the presence of coherent interference. Gibak Kim, Nam Ik Cho. [doi]

Totally data-driven intonation prediction model using a novel F0 contour parametric representation. Lifu Yi, Jian Li, Xiaoyan Lou, Jie Hao. [doi]

An improved mel-Wiener filter for mel-LPC based speech recognition. Md. Babul Islam, Hiroshi Matsumoto, Kazumasa Yamamoto. [doi]

Detection of a third speaker in telephone conversations. Uchechukwu O. Ofoegbu, Ananth N. Iyer, Robert E. Yantorno, Stanley J. Wenndt. [doi]

Improving the performance of HMM-based voice conversion using context clustering decision tree and appropriate regression matrix format. Long Qin, Yi-Jian Wu, Zhen-Hua Ling, Ren-Hua Wang. [doi]

Building an English-Iraqi Arabic machine translation system for spoken utterances with limited resources. Jason Riesa, Behrang Mohit, Kevin Knight, Daniel Marcu. [doi]

Manifold HLDA and its application to robust speech recognition. Toshiaki Kubo, Tetsuji Ogawa, Tetsunori Kobayashi. [doi]

Comparative analysis of formants of British, American and Australian accents. Seyed Ghorshi, Saeed Vaseghi, Qin Yan. [doi]

Comparison of acoustic modeling techniques for Vietnamese and Khmer ASR. Viet Bac Le, Laurent Besacier. [doi]

New measures to chart toddlers' speech perception and language development: a test of the lexical restructuring hypothesis. Iris-Corinna Schwarz, Denis Burnham. [doi]

A new HMM adaptation approach for the case of a hands-free speech input in reverberant rooms. Hans-Günter Hirsch, Harald Finster. [doi]

Recent advances in speech fragment decoding techniques. Jon Barker, André Coy, Ning Ma, Martin Cooke. [doi]

Infants' ability to extract verbs from continuous speech. Ellen Marklund, Francisco Lacerda. [doi]

Speaker adaptation using evolutionary-based linear transform. Sid-Ahmed Selouani, Douglas D. O'Shaughnessy. [doi]

Native and nonnative audio-visual perception of English fricatives in quiet and cafe-noise backgrounds. Yue Wang, Dawn M. Behne, Haisheng Jiang, Chad Danyluck. [doi]

An integrated approach to improve speech recognition rate for non-native speakers. Y. Deng, X. Li, C. Kwan, Roger Xu, Bhiksha Raj, Richard M. Stern, D. Williamson. [doi]

Improving speech recognition accuracy with multi-confidence thresholding. Shuangyu Chang. [doi]

Voting for two speaker segmentation. Narayanaswamy Balakrishnan, Rashmi Gangadharaiah, Richard M. Stern. [doi]

Speaking faces for face-voice speaker identity verification. Girija Chetty, Michael Wagner. [doi]

TDA: a new trainable trajectory formation system for facial animation. Oxana Govokhina, Gérard Bailly, Gaspard Breton, Paul C. Bagshaw. [doi]

Improved speech activity detection using cross-channel features for recognition of multiparty meetings. Kofi Boakye, Andreas Stolcke. [doi]

A simulated-data adaptation technique for robust speech recognition. Nattanun Thatphithakkul, Boontee Kruatrachue, Chai Wutiwiwatchai, Sanparith Marukatat, Vataya Boonpiam. [doi]

Fast and effective retraining on contrastive vocal characteristics with bidirectional long short-term memory nets. Nicole Beringer. [doi]

Improved topic classification over maximum entropy model using k-norm based new objectives. Xiang Li, Ea-Ee Jan, Cheng Wu, David Lubensky. [doi]

Gammatone auditory filterbank and independent component analysis for speaker identification. Yushi Zhang, Waleed H. Abdulla. [doi]

Cross-system adaptation and combination for continuous speech recognition: the influence of phoneme set and acoustic front-end. Sebastian Stüker, Christian Fügen, Susanne Burger, Matthias Wölfel. [doi]

Reducing speech coding distortion for speaker identification. Alan McCree. [doi]

Improved source modeling and predictive classification for channel robust speech recognition. Valentin Ion, Reinhold Haeb-Umbach. [doi]

Intonational cues to student questions in tutoring dialogs. Jennifer J. Venditti, Julia Hirschberg, Jackson Liscombe. [doi]

Highly noise robust text-dependent speaker recognition based on hypothesized Wiener filtering. V. Ramasubramanian, Deepak Vijaywargiay, Kumar V. Praveen. [doi]

Further developments in LSM-based boundary training for unit selection TTS. Jerome R. Bellegarda. [doi]

Emotion recognition in spontaneous speech using GMMs. Daniel Neiberg, Kjell Elenius, Kornel Laskowski. [doi]

Decision tree-based training of probabilistic concatenation models for corpus-based speech synthesis. Shinsuke Sakai, Tatsuya Kawahara. [doi]

GMM-based acoustic modeling for embedded speech recognition. Christophe Lévy, Georges Linarès, Jean-François Bonastre. [doi]

Unsupervised language model adaptation for Mandarin broadcast conversation transcription. David Mrva, Philip C. Woodland. [doi]

Unsupervised language model adaptation using latent semantic marginals. Yik-Cheung Tam, Tanja Schultz. [doi]

Automatic recognition of speakers' age and gender on the basis of empirical studies. Christian Müller. [doi]

Examining knowledge sources for human error correction. Yongmei Shi, Lina Zhou. [doi]

Prosodic features for speaker verification. Leena Mary, B. Yegnanarayana. [doi]

Evaluation of voice activity detection by combining multiple features with weight adaptation. Yusuke Kida, Tatsuya Kawahara. [doi]

A new framework for system combination based on integrated hypothesis space. I-Fan Chen, Lin-Shan Lee. [doi]

An annotation scheme for complex disfluencies. Peter A. Heeman, Andy McMillin, J. Scott Yaruss. [doi]

A multipitch tracker for monaural speech segmentation. André Coy, Jon Barker. [doi]

Independent components for acoustic modeling. Jan Trmal, Jan Vanek, Ludek Müller, Jan Zelinka. [doi]

Handling convolutional noise in missing data automatic speech recognition. Maarten Van Segbroeck, Hugo Van Hamme. [doi]

Measuring and comparing vowel qualities in a Dutch spontaneous speech corpus. Irene Jacobi, Louis C. W. Pols, Jan Stroop. [doi]

Question answering with discriminative learning algorithms. Junlan Feng. [doi]

Extracting formants from short segments of speech using group delay functions. Joseph M. Anand, S. Guruprasad, B. Yegnanarayana. [doi]

A study of emotional speech articulation using a fast magnetic resonance imaging technique. Sungbok Lee, Erik Bresch, Jason Adams, Abe Kazemzadeh, Shrikanth Narayanan. [doi]

A DTW-based dissimilarity measure for left-to-right hidden Markov models and its application to word confusability analysis. Qiang Huo, Wei Li. [doi]

MMSE estimation of complex-valued discrete Fourier coefficients with generalized gamma priors. Jesper Jensen, Richard C. Hendriks, Jan S. Erkelens, Richard Heusdens. [doi]

A simulation based parameter optimization for a coarticulation model. Jianguo Wei, Xugang Lu, Jianwu Dang. [doi]

A comparative study of Gaussian selection methods in large vocabulary continuous speech recognition. Dirk Gehrig, Thomas Schaaf. [doi]

Role of phase estimation in speech enhancement. Benjamin J. Shannon, Kuldip K. Paliwal. [doi]

On the use of morphological analysis for dialectal Arabic speech recognition. Mohamed Afify, Ruhi Sarikaya, Hong-Kwang Jeff Kuo, Laurent Besacier, Yuqing Gao. [doi]

Improving the performance of out-of-vocabulary word rejection by using support vector machines. Shilei Huang, Xiang Xie, Jingming Kuang. [doi]

Towards an integrated understanding of speaking rate in conversation. Jiahong Yuan, Mark Liberman, Christopher Cieri. [doi]

Missing data mask models with global frequency and temporal constraints. Sébastien Demange, Christophe Cerisara, Jean-Paul Haton. [doi]

Feature combination using linear discriminant analysis and its pitfalls. Ralf Schlüter, András Zolnay, Hermann Ney. [doi]

Comparison of keyword spotting methods for searching in speech. Lubos Smídl, Josef V. Psutka. [doi]

Dialogue act compression via pitch contour preservationGabriel Murray, Steve Renals. [doi]

Missing feature theory with soft spectral subtraction for speaker verificationMichael T. Padilla, Thomas F. Quatieri, Douglas A. Reynolds. [doi]

Perception of fundamental frequency in cochlear implant patientsÁngel de la Torre, Cristina Roldán, Manuel Sainz. [doi]

Pronunciation verification of children's speech for automatic literacy assessmentJoseph Tepperman, Jorge Silva, Abe Kazemzadeh, Hong You, Sungbok Lee, Abeer Alwan, Shrikanth Narayanan. [doi]

Real-life emotions detection with lexical and paralinguistic cues on human-human call center dialogsLaurence Devillers, Laurence Vidrascu. [doi]

Automatic speech segmentation with multiple statistical modelsSeung Seop Park, Jong Won Shin, Nam Soo Kim. [doi]

Detection and separation of speech events in meeting recordingsFutoshi Asano, Jun Ogata. [doi]

Sentence boundary detection using sequential dependency analysis combined with CRF-based chunkingTakanobu Oba, Takaaki Hori, Atsushi Nakamura. [doi]

Multi-domain text-to-speech synthesis by automatic text classificationFrancesc Alías, Joan Claudi Socoró, Xavier Sevillano, Ignasi Iriondo Sanz, Xavier Gonzalvo. [doi]

Data-driven design of front-end filter bank for Lombard speech recognitionHynek Boril, Petr Fousek, Petr Pollák. [doi]

Building an English speech synthesis system from a Japanese ALS patient's voiceAkemi Iida, Jun Ito, Shimpei Kajima, Tsutomu Sugawara. [doi]

Incremental learning of MAP context-dependent edit operations for spoken phone number recognition in an embedded platformHahn Koo, Yan Ming Cheng. [doi]

On the sufficiency of automatic phonetic transcriptions for pronunciation variation researchChristophe Van Bael, Hans van Halteren. [doi]

Speaker cluster based GMM tokenization for speaker recognitionBin Ma, Donglai Zhu, Rong Tong, Haizhou Li. [doi]

Totally data-driven duration modeling based on generalized linear model for Mandarin TTSLifu Yi, Jian Li, Xiaoyan Lou, Jie Hao. [doi]

Using latent semantic indexing for morph-based spoken document retrievalVille T. Turunen, Mikko Kurimo. [doi]

An annotation scheme for agreement analysisSiew Leng Toh, Fan Yang, Peter A. Heeman. [doi]

An acoustic and articulatory study of Lombard speech: global effects on the utteranceMaeva Garnier, Lucie Bailly, Marion Dohen, Pauline Welby, Hélène Loevenbruck. [doi]

A clustering approach to semantic decodingHui Ye, Steve Young. [doi]

Prosody of interrogative and affirmative sentences in Vietnamese language: analysis and perceptive resultsMinh-Quang Vu, Do Dat Tran, Eric Castelli. [doi]

Call analysis with classification using speech and non-speech featuresYun-Cheng Ju, Ye-Yi Wang, Alex Acero. [doi]

Missing-feature reconstruction for band-limited speech recognition in spoken document retrievalWooil Kim, John H. L. Hansen. [doi]

Emotion detection in infants' cries based on a maximum likelihood approachShoichi Matsunaga, S. Sakaguchi, Masaru Yamashita, Sueharu Miyahara, S. Nishitani, K. Shinohara. [doi]

Sequence classification for machine translationSrinivas Bangalore, Patrick Haffner, Stephan Kanthak. [doi]

Robust phone lattice decodingKris Demuynck, Dirk Van Compernolle, Hugo Van Hamme. [doi]

Assessing the reading level of web pagesSarah E. Petersen, Mari Ostendorf. [doi]

Feature normalization using smoothed mixture transformationsPatrick Kenny, Vishwa Gupta, Gilles Boulianne, Pierre Ouellet, Pierre Dumouchel. [doi]

Frame based system combination and a comparison with weighted ROVER and CNCBjörn Hoffmeister, Tobias Klein, Ralf Schlüter, Hermann Ney. [doi]

On designing context sensitive language models for spoken dialog systemsVaibhava Goel, Ramesh A. Gopinath. [doi]

Tracking and beamforming for multiple simultaneous speakers with probabilistic data association filtersTobias Gehrig, Ulrich Klee, John W. McDonough, Shajith Ikbal, Matthias Wölfel, Christian Fügen. [doi]

Spoken language technologies applied to digital talking booksIsabel Trancoso, Carlos Duarte, António Joaquim Serralheiro, Diamantino Caseiro, Luís Carriço, Céu Viana. [doi]

Discriminative named entity recognition of speech data using speech recognition confidenceKatsuhito Sudoh, Hajime Tsukada, Hideki Isozaki. [doi]

Combining missing-feature theory, speech enhancement and speaker-dependent/-independent modeling for speech separationJi Ming, Timothy J. Hazen, James R. Glass. [doi]

Multimodal authentication using qualitative support vector machinesFawaz Alsaade, Aladdin M. Ariyaeeinia, L. Meng, Amit S. Malegaonkar. [doi]

Evaluation of objective measures for speech enhancementYi Hu, Philipos C. Loizou. [doi]

Farsbayan: a unit selection based Farsi speech synthesizerM. Mehdi Homayounpour, Majid Namnabat. [doi]

Effects of familiarity with faces and voices on second-language speech processing: components of memory tracesDebra M. Hardison. [doi]

Integration of a CELP coder in the ARDOR universal sound codecBalázs Kövesi, Dominique Massaloux, David Virette, Julien Bensa. [doi]

Vocal emotion recognition with cochlear implantsXin Luo, Qian-Jie Fu, John J. Galvin III. [doi]

Analysis of nonmodal phonation using minimum entropy deconvolutionNicolas Malyska, Thomas F. Quatieri. [doi]

Use of incrementally regulated discriminative margins in MCE training for speech recognitionDong Yu, Li Deng, Xiaodong He, Alex Acero. [doi]

The target cost formulation in unit selection speech synthesisPaul Taylor. [doi]

Disentangling gestural and auditory contrast accounts of compensation for coarticulationNavin Viswanathan, James S. Magnuson, Carol A. Fowler. [doi]

Redundancy and productivity in the speech technology lexicon - can we do better?Susan Fitt, Korin Richmond. [doi]

Speech and speech recognition during dictation correctionsKeith Vertanen. [doi]

Developing an automatic assessment tool for children's oral readingLeen Cleuren, Jacques Duchateau, Alain Sips, Pol Ghesquière, Hugo Van Hamme. [doi]

Analysis and detection of speech under sleep deprivationTin Lay Nwe, Haizhou Li, Minghui Dong. [doi]

A maximum likelihood training approach to irrelevant variability compensation based on piecewise linear transformationsQiang Huo, Donglai Zhu. [doi]

Automatic phonetic transcription of large speech corpora: a comparative studyChristophe Van Bael, Lou Boves, Henk van den Heuvel, Helmer Strik. [doi]

On the fusion of prosody, voice spectrum and face features for multimodal person verificationM. Farrús, Ainara Garde, Pascual Ejarque, Jordi Luque, Javier Hernando. [doi]

Sentence boundary detection of spontaneous Japanese using statistical language model and support vector machinesYuya Akita, Masahiro Saikou, Hiroaki Nanjo, Tatsuya Kawahara. [doi]

A spoken language understanding approach using successive learnersWei-Lin Wu, Ruzhan Lu, Hui Liu, Feng Gao. [doi]

Discriminant linear processing of time-frequency planeFabio Valente, Hynek Hermansky. [doi]

Evaluating a virtual speech cuerGuillaume Gibert, Gérard Bailly, Frédéric Elisei. [doi]

Exploiting polynomial-fit histogram equalization and temporal average for robust speech recognitionShih-Hsiang Lin, Yao-Ming Yeh, Berlin Chen. [doi]

Improving speech recognition of two simultaneous speech signals by integrating ICA BSS and automatic missing feature mask generationRyu Takeda, Shun'ichi Yamamoto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. [doi]

A hybrid phrase-based/statistical speech translation systemDavid Stallard, Fred Choi, Kriste Krstovski, Prem Natarajan, Rohit Prasad, Shirin Saleem. [doi]

Nonlinear dynamical invariants for speech recognitionS. Prasad, Sundar Srinivasan, M. Pannuri, Georgios Y. Lazarou, Joseph Picone. [doi]

Category formation and the role of spectral quality in the perception and production of English front vowelsRicardo Augusto Hoffmann Bion, Paola Escudero, Andréia S. Rauber, Barbara O. Baptista. [doi]

A joint intention-based dialogue engineRajah Annamalai Subramanian, Philip R. Cohen. [doi]

Speaker independent voiced-unvoiced detection evaluated in different speaking stylesMartin Heckmann, Marco Moebus, Frank Joublin, Christian Goerick. [doi]

Opinion mining in a telephone survey corpusNathalie Camelin, Géraldine Damnati, Frédéric Béchet, Renato de Mori. [doi]

Accident - execute: increased activation in nonnative listeningMirjam Broersma. [doi]

Constructing stylistic synthesis databases from audio booksYong Zhao, Di Peng, Lijuan Wang, Min Chu, Yining Chen, Peng Yu, Jun Guo. [doi]

Effects of featural similarity and overlap position on lexical confusions and overt similarity judgmentsSarah C. Creel, Delphine Dahan, Daniel Swingley. [doi]

Subspace modeling and selection for noisy speech recognitionJen-Tzung Chien, Chuan-Wei Ting. [doi]

Word structure and tone perception in MandarinHansjörg Mixdorff, Yu Hu. [doi]

Unsupervised Spanish dialect classificationRongqing Huang, John H. L. Hansen. [doi]

Conversational help desk: vague callers and context switchOsamuyimen Stewart, Juan M. Huerta, Ea-Ee Jan, Cheng Wu, Xiang Li, David Lubensky. [doi]

On a greedy learning algorithm for dPLRM with applications to phonetic feature detectionTor André Myrvoll, Tomoko Matsui. [doi]

Minimum divergence based discriminative trainingJun Du, Peng Liu, Frank K. Soong, Jian-Lai Zhou, Ren-Hua Wang. [doi]

The vocal joystick data collection effort and vowel corpusKelley Kilanski, Jonathan Malkin, Xiao Li, Richard Wright, Jeff A. Bilmes. [doi]

Local transformation models for speech recognitionAntonio Miguel, Eduardo Lleida, Alfons Juan, Luis Buera, Alfonso Ortega, Oscar Saz. [doi]

Real vs. acted emotional speechJanneke Wilting, Emiel Krahmer, Marc Swerts. [doi]

Corpus design based on the Kullback-Leibler divergence for text-to-speech synthesis applicationAleksandra Krul, Géraldine Damnati, François Yvon, Thierry Moudenc. [doi]

Joint prosodic and segmental unit selection speech synthesisRobert A. J. Clark, Simon King. [doi]

Phoneme recognition based on Fisher weight map to higher-order local auto-correlationYasuo Ariki, Shunsuke Kato, Tetsuya Takiguchi. [doi]

Robust feature space adaptation for telephony speech recognitionXin Lei, Jon Hamaker, Xiaodong He. [doi]

Weighted codebook mapping for noisy speech enhancement using harmonic-noise modelEsfandiar Zavarehei, Saeed Vaseghi, Qin Yan. [doi]

Voice conversion based on mixtures of factor analyzersYosuke Uto, Yoshihiko Nankaku, Tomoki Toda, Akinobu Lee, Keiichi Tokuda. [doi]

Dynamic help generation by estimating user's mental model in spoken dialogue systemsYuichiro Fukubayashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. [doi]

High-rate data embedding in unvoiced speechKonrad Hofbauer, Gernot Kubin. [doi]

Solving large margin estimation of HMMs via semidefinite programmingXinwei Li, Hui Jiang. [doi]

Quick individual fitting methods of simplified hearing compensation for elderly peopleKengo Fujita, Tsuneo Kato, Hisashi Kawai. [doi]

Auto-segmentation based VAD for robust ASRYu Shi, Frank K. Soong, Jian-Lai Zhou. [doi]

A bootstrapping approach for developing language model of new spoken dialogue systems by selecting web textsTeruhisa Misu, Tatsuya Kawahara. [doi]

Speaker diarization for multiple distant microphone meetings: mixing acoustic features and inter-channel time differencesJosé M. Pardo, Xavier Anguera, Chuck Wooters. [doi]

Analysis of HMM temporal evolution for automatic speech recognition and utterance verificationMarta Casar, José A. R. Fonollosa. [doi]

Using genetic algorithms to weight acoustic features for speaker recognitionMaider Zamalloa, Germán Bordel, Luis Javier Rodríguez, Mikel Peñagarikano, Juan Pedro Uribe. [doi]

Voice activity detector based on enhanced cumulant of LPC residual and on-line EM algorithmDavid Cournapeau, Tatsuya Kawahara, Kenji Mase, Tomoji Toriyama. [doi]

A spectral clustering approach to speaker diarizationHuazhong Ning, Ming Liu, Hao Tang, Thomas S. Huang. [doi]

Computer aided pronunciation learning system using speech recognition techniquesSherif Mahdy Abdou, Salah Eldeen Hamid, Mohsen Rashwan, Abdurrahman Samir, Ossama Abdel Hamid, Mostafa Shahin, Waleed Nazih. [doi]

Automatic speech recognition experiments with articulatory dataEsmeralda Uraga, Thomas Hain. [doi]

Assessment of articulatory sub-systems of dysarthric speech using an isolated-style phoneme recognition systemP. Vijayalakshmi, M. RamasubbaReddy, Douglas D. O'Shaughnessy. [doi]

Timing levels in segment-based speech emotion recognitionBjörn Schuller, Gerhard Rigoll. [doi]

Phone vector DHMM to decode a phone recognizer's outputBong-Wan Kim, Dae-Lim Choi, Yongnam Um, Yong-Ju Lee. [doi]

Discriminative adaptation for speaker verificationChris Longworth, Mark J. F. Gales. [doi]

Moving speech recognition from software to silicon: the in silico vox projectEdward C. Lin, Kai Yu, Rob A. Rutenbar, Tsuhan Chen. [doi]

Conversion from phoneme based to grapheme based acoustic models for speech recognitionAndrej Zgank, Zdravko Kacic. [doi]

Automatic syllable-pattern induction in statistical Thai text-to-phone transcriptionAusdang Thangthai, Chatchawarn Hansakunbuntheung, Rungkarn Siricharoenchai, Chai Wutiwiwatchai. [doi]

An improved affine projection algorithm based crosstalk resistant adaptive noise cancellerGuo Chen, Vijay Parsa. [doi]

Voice source correlates of prosodic features in American English: a pilot studyMarkus Iseli, Yen-Liang Shue, Melissa A. Epstein, Patricia A. Keating, Jody Kreiman, Abeer Alwan. [doi]

New 20-word lists for word intelligibility test in JapaneseShuichi Sakamoto, Tadahiro Yoshikawa, Shigeaki Amano, Yôiti Suzuki, Tadahisa Kondo. [doi]

An assessment of automatic speech recognition as speech intelligibility estimation in the context of additive noiseWei M. Liu, John S. D. Mason, Nicholas W. D. Evans, Keith A. Jellyman. [doi]

Noisy speech recognition based on selection of multiple noise suppression methods using noise GMMsNorihide Kitaoka, Souta Hamaguchi, Seiichi Nakagawa. [doi]

Robust speech recognition by modifying clean and telephone feature vectors using bidirectional neural networkMansoor Vali, Seyyed Ali Seyyed Salehi, Kazem Karimi. [doi]

New improvements in decoding speed and latency for automatic captioningJian Xue, Rusheng Hu, Yunxin Zhao. [doi]

Low-complexity and efficient classification of voiced/unvoiced/silence for noisy environmentsTuan Van Pham, Gernot Kubin. [doi]

Detecting question-bearing turns in spoken tutorial dialoguesJackson Liscombe, Jennifer J. Venditti, Julia Hirschberg. [doi]

A vector space approach to environment modeling for robust speech recognitionYu Tsao, Chin-Hui Lee. [doi]

Dialog act tagging with support vector machines and hidden Markov modelsDinoj Surendran, Gina-Anne Levow. [doi]

Perceptual identification and phonetic analysis of 6 foreign accents in FrenchBianca Vieru-Dimulescu, Philippe Boula de Mareüil. [doi]

Multi-stream speaker diarization systems for the meetings domainAscensión Gallardo-Antolín, Xavier Anguera, Chuck Wooters. [doi]

Speech analyzer using a joint estimation model of spectral envelope and fine structureHirokazu Kameoka, Jonathan Le Roux, Nobutaka Ono, Shigeki Sagayama. [doi]

Discriminative MLE training using a product of Gaussian likelihoodsT. Nagarajan, Douglas D. O'Shaughnessy. [doi]

Wavelet ridge track interpretation in terms of formantsSalma Chaari, Kaïs Ouni, Noureddine Ellouze. [doi]

Enhancement of harmonic content of speech based on a dynamic programming pitch tracking algorithmMark R. Every, Philip J. B. Jackson. [doi]

Prosodic features for a maximum entropy language modelOscar Chan, Roberto Togneri. [doi]

Normalization of the inter-frame information using smoothing filteringLuz García, José C. Segura, M. Carmen Benítez, Javier Ramírez, Ángel de la Torre. [doi]

Highly directional multi-beam audio loudspeakerDirk Olszewski, Klaus Linhard. [doi]

Speech recognition using factorial hidden Markov models for separation in the feature spaceTuomas Virtanen. [doi]

Combining multiple-sized sub-word units in a speech recognition system using baseform selectionT. Nagarajan, P. Vijayalakshmi, Douglas D. O'Shaughnessy. [doi]

Novel method for data clustering and mode selection with application in voice conversionJani Nurminen, Jilei Tian, Victor Popa. [doi]

Visual correlates to prominence in several expressive modesJonas Beskow, Björn Granström, David House. [doi]

Rapid speaker adaptation using regression-tree based spectral peak alignmentShizhen Wang, Xiaodong Cui, Abeer Alwan. [doi]

A pitch marks filtering algorithm based on restricted dynamic programmingFrancesc Alías, Carlos Monzo, Joan Claudi Socoró. [doi]

Geometrically constrained permutation-free source separation in an undercomplete speech unmixing scenarioErik Visser. [doi]

Automatic assignment of anchoring points on vowel templates for defining correspondence between time-frequency representations of speech samplesToru Takahashi, Masashi Nishi, Toshio Irino, Hideki Kawahara. [doi]

Detection of word fragments in Mandarin telephone conversationCheng-Tao Chu, Yun-Hsuan Sung, Yuan Zhao, Daniel Jurafsky. [doi]

Design and performance analysis of a factoid question answering system for spontaneous speech transcriptionsMihai Surdeanu, David Dominguez-Sal, Pere Comas. [doi]

Text-independent speaker identification in birdsE. J. S. Fox, J. D. Roberts, M. Bennamoun. [doi]

Development of a program for self assessment of Japanese pronunciation by English learnersChiharu Tsurutani, Yutaka Yamauchi, Nobuaki Minematsu, Dean Luo, Kazutaka Maruyama, Keikichi Hirose. [doi]

Two-step unsupervised speaker adaptation based on speaker and gender recognition and HMM combinationPetr Cerva, Jan Nouza, Jan Silovský. [doi]

Modeling the acoustic correlates of expressive elements in text genres for expressive text-to-speech synthesisHongwu Yang, Helen M. Meng, Lianhong Cai. [doi]

Time-dependent cross-probability model for multi-environment model based LInear normalizationLuis Buera, Eduardo Lleida, Juan Arturo Nolazco-Flores, Antonio Miguel, Alfonso Ortega. [doi]

Further investigations on the relationship between objective measures of speech quality and speech recognition rates in noisy environmentsFrancisco José Fraga, Carlos Alberto Ynoguti, André Godoi Chiovato. [doi]

Underlying quality dimensions of modern telephone connectionsMarcel Wältermann, Kirstin Scholz, Alexander Raake, Ulrich Heute, Sebastian Möller. [doi]

Factors affecting speakers' choice of fillers in Japanese presentationsMichiko Watanabe, Yasuharu Den, Keikichi Hirose, Shusaku Miwa, Nobuaki Minematsu. [doi]

A model of the regularities underlying speaker variation: evidence from hybrid synthesisSusan R. Hertz. [doi]

Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitationYamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano. [doi]

Conversational quality estimation model for wideband IP-telephony servicesHitoshi Aoki, Atsuko Kurashima, Akira Takahashi. [doi]

How to handle gender and number agreement in statistical language models?Caroline Lavecchia, Kamel Smaïli, Jean-Paul Haton. [doi]

Six approaches to limited domain concatenative speech synthesisRobert J. Utama, Ann K. Syrdal, Alistair Conkie. [doi]

Reducing computation on parallel decoding using frame-wise confidence scoresTomohiro Hakamata, Akinobu Lee, Yoshihiko Nankaku, Keiichi Tokuda. [doi]

Prosodic boundaries in Czech: an experiment based on delexicalized speechTomás Dubeda. [doi]

A new state-dependent phonetic tied-mixture model with head-body-tail structured HMM for real-time continuous phoneme recognition systemJunho Park, Hanseok Ko. [doi]

Study of time and frequency variability in pathological speech and error reduction methods for automatic speech recognitionOscar Saz, Antonio Miguel, Eduardo Lleida, Alfonso Ortega, Luis Buera. [doi]

Productions in bilinguism, early foreign language learning and monolinguism: a prosodic comparisonRanka Bijeljac-Babic, Christelle Dodane, Sabine Metta, Claire Gerard. [doi]

A novel environment-dependent speech enhancement method with optimized memory footprintSuhadi Suhadi, Sorel Stan, Tim Fingscheidt. [doi]

Evaluation of content presentation strategies for an in-car spoken dialogue systemHeather Pon-Barry, Fuliang Weng, Sebastian Varges. [doi]

Steady-state suppression in reverberation: a comparison of native and nonnative speech perceptionNao Hodoshima, Dawn M. Behne, Takayuki Arai. [doi]

50 years late: repeating Miller-Nicely 1955Andrew Lovitt, Jont B. Allen. [doi]

Unifying unit selection and hidden Markov model speech synthesisPaul Taylor. [doi]

An HMM-based singing voice synthesis systemKeijiro Saino, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda. [doi]

Automatic removal of typed keystrokes from speech signalsAmarnag Subramanya, Michael L. Seltzer, Alex Acero. [doi]

Summarization evaluation for text and speech: issues and approachesAni Nenkova. [doi]

Expressive prosody for unit-selection speech synthesisVolker Strom, Robert A. J. Clark, Simon King. [doi]

Effects of frequency shifts on perceived naturalness and gender information in speechPeter F. Assmann, Sophia Dembling, Terrance M. Nearey. [doi]

Expanding phonetic coverage in unit selection synthesis through unit substitution from a donor voiceAlistair Conkie, Ann K. Syrdal. [doi]

Unit selection and its relation to symbolic prosody: a new approachDaniel Tihelka, Jindrich Matousek. [doi]

High-quality speech translation in the flight domainChao Wang, Stephanie Seneff. [doi]

Unsupervised adaptation for acoustic language identificationEkaterina Timoshenko, Josef G. Bauer. [doi]

Exploiting semantic relations for a spoken language understanding applicationCatherine Kobus, Géraldine Damnati, Lionel Delphin-Poulat, Renato de Mori. [doi]

A framework for robust MFCC feature extraction using SNR-dependent compression of enhanced mel filter bank energiesBabak Nasersharif, Ahmad Akbari. [doi]

Improving the characterization of the alternative hypothesis via kernel discriminant analysis for likelihood ratio-based speaker verificationYi-Hsiang Chao, Wei-Ho Tsai, Hsin-Min Wang, Ruei-Chuan Chang. [doi]

Forward-backwards training of hybrid HMM/BN acoustic modelsKonstantin Markov, Satoshi Nakamura. [doi]

The importance of different facial areas for signalling visual prominenceMarc Swerts, Emiel Krahmer. [doi]

HMM-based MAP prediction of voiced and unvoiced formant frequencies from noisy MFCC vectorsJonathan Darch, Ben Milner. [doi]

Combining phonetic attributes using conditional random fieldsJeremy Morris, Eric Fosler-Lussier. [doi]

An effective and efficient utterance verification technology using word n-gram filler modelsDong Yu, Yun-Cheng Ju, Alex Acero. [doi]

Specificity and generalizability of spontaneous phonetic imitationKuniko Y. Nielsen. [doi]

A computational auditory scene analysis system for robust speech recognitionSoundararajan Srinivasan, Yang Shao, Zhaozhang Jin, DeLiang Wang. [doi]

Two-stage vocabulary-free spoken document retrieval - subword identification and re-recognition of the identified sectionsYoshiaki Itoh, Takayuki Otake, Kohei Iwata, Kazunori Kojima, Masaaki Ishigame, Kazuyo Tanaka, Shi-wook Lee. [doi]

Coupling particle filters with automatic speech recognition for speech feature enhancementFriedrich Faubel, Matthias Wölfel. [doi]

Automatic English stop consonants classification using wavelet analysis and hidden Markov modelsMarco Kühne, Roberto Togneri. [doi]

Edge-splitting in a cumulative multimodal system, for a no-wait temporal threshold on information fusion, combined with an under-specified displayEdward C. Kaiser, Paulo Barthelmess. [doi]

Unsupervised segmentation of words into morphemes - Morpho Challenge 2005 application to automatic speech recognitionMikko Kurimo, Mathias Creutz, Matti Varjokallio, Ebru Arisoy, Murat Saraclar. [doi]

Robust feature extraction based on spectral peaks of group delay and autocorrelation function and phase domain analysisG. Farahani, Seyed Mohammad Ahadi, Mohammad Mehdi Homayounpour. [doi]

MAP-based adaptation for speech conversion using adaptation data selection and non-parallel trainingChung-Han Lee, Chung-Hsien Wu. [doi]

Development of advanced dialog systems with PATENorbert Pfleger, Jan Schehl. [doi]

Noise-robust speech recognition of conversational telephone speechGang Chen, Hesham Tolba, Douglas D. O'Shaughnessy. [doi]

Pitch resynchronization while recovering from a late frame in a predictive speech decoderKyle D. Anderson, Philippe Gournay. [doi]

User responses to prosodic variation in fragmentary grounding utterances in dialogGabriel Skantze, David House, Jens Edlund. [doi]

Effect of genre, speaker, and word class on the realization of given and new informationAgustín Gravano, Julia Hirschberg. [doi]

Testing the effect of audiovisual cues to prominence via a reaction-time experimentEmiel Krahmer, Marc Swerts. [doi]

Segmental duration modeling in TurkishÖzlem Öztürk, Tolga Çiloglu. [doi]

An information theoretic tool for investigating speech perceptionBryce E. Lobdell, Jont B. Allen. [doi]

Detecting anger in automated voice portal dialogsFelix Burkhardt, J. Ajmera, Roman Englert, Joachim Stegmann, W. Burleson. [doi]

Speaker verification with non-audible murmur segmentsMariko Kojima, Tomoko Matsui, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano. [doi]

Optimizing components for handheld two-way speech translation for an English-Iraqi Arabic systemRoger Hsiao, Ashish Venugopal, Thilo Köhler, Ying Zhang, Paisarn Charoenpornsawat, Andreas Zollmann, Stephan Vogel, Alan W. Black, Tanja Schultz, Alex Waibel. [doi]

An investigation of manifold learning for speech analysisAndrew Errity, John McKenna. [doi]

Performance evaluation of three features for model-based single channel speech separation problemMohammad H. Radfar, Richard M. Dansereau, Abolghasem Sayadiyan. [doi]

Modified phase opponency based solution to the speech separation challengeOm Deshmukh, Carol Y. Espy-Wilson. [doi]

Emovoice: a system to generate emotions in speechJoão P. Cabral, Luís C. Oliveira. [doi]

Summarization of spontaneous conversationsXiaodan Zhu, Gerald Penn. [doi]

Speech recognition with phonological features: some issues to attendFrederik Stouten, Jean-Pierre Martens. [doi]

Lattice extension and rescoring based approaches for LVCSR of TurkishEbru Arisoy, Murat Saraclar. [doi]

Discourse structure and speech recognition problemsMihai Rotaru, Diane J. Litman. [doi]

Corpus-based generation of fundamental frequency contours using generation process model and considering emotional focusesKeikichi Hirose, Yasufumi Asano, Nobuaki Minematsu. [doi]

Radiobot-CFF: a spoken dialogue system for military trainingAntonio Roque, Anton Leuski, Vivek Kumar Rangarajan Sridhar, Susan Robinson, Ashish Vaswani, Shrikanth Narayanan, David R. Traum. [doi]

Observations of the spoken language acquisition process based on a multimodal infant behavior corpusRyo Tsuji, Tomohiko Kasami, Shogo Ishikawa, Shinya Kiriyama, Yoichi Takebayashi, Shigeyoshi Kitazawa. [doi]

Efficient interactive retrieval of spoken documents with key terms ranked by reinforcement learningYi-Cheng Pan, Jia-Yu Chen, Yen-shin Lee, Yi-Sheng Fu, Lin-Shan Lee. [doi]

CLUSTERGEN: a statistical parametric synthesizer using trajectory modelingAlan W. Black. [doi]

Clean speech feature estimation based on soft spectral maskingYoung-Joon Kim, Woohyung Lim, Nam Soo Kim. [doi]

Using posterior-based features in template matching for speech recognitionGuillermo Aradilla, Jithendra Vepa, Hervé Bourlard. [doi]

BINSEG: an efficient speaker-based segmentation techniqueJindrich Zdánský. [doi]

SPAM and full covariance for speech recognitionDaniel Povey. [doi]

Speaking aid system for total laryngectomees using voice conversion of body transmitted artificial speechKeigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano. [doi]

A multi-pass error detection and correction framework for Mandarin LVCSRZhengyu Zhou, Helen M. Meng, Wai Kit Lo. [doi]

Investigation on rescoring using minimum verification error (MVE) detectorsQiang Fu, Biing-Hwang Juang. [doi]

Rapid simulation-driven reinforcement learning of multimodal dialog strategies in human-robot interactionThomas Prommer, Hartwig Holzapfel, Alex Waibel. [doi]

Prototyping a CALL system for students of Japanese using dynamic diagram generation and interactive hintsChristopher J. Waple, Yasushi Tsubota, Masatake Dantsuji, Tatsuya Kawahara. [doi]

Enhancing the performance of a GMM-based speaker identification system in a multi-microphone setupAndreas Stergiou, Aristodemos Pnevmatikakis, Lazaros C. Polymenakos. [doi]

Evaluating prosody of Mandarin speech for language learningMinghui Dong, Haizhou Li, Tin Lay Nwe. [doi]

User simulation for spoken dialogue systems: learning and evaluationKallirroi Georgila, James Henderson, Oliver Lemon. [doi]

Silence energy normalization for robust speech recognition in additive noise environmentChung-fu Tai, Jeih-Weih Hung. [doi]

Using a differential microphone array to estimate the direction of arrival of two acoustic sourcesFotios Talantzis, Anthony G. Constantinides, Lazaros C. Polymenakos. [doi]

ASR-based corrective feedback on pronunciation: does it really work?Ambra Neri, Catia Cucchiarini, Helmer Strik. [doi]

Statistical analysis and performance of DFT domain noise reduction filters for robust speech recognitionColin Breithaupt, Rainer Martin. [doi]

A wavelet-based parameterization for speech/music segmentationE. Didiot, Irina Illina, Odile Mella, Dominique Fohr, Jean-Paul Haton. [doi]

Using system and user performance features to improve emotion detection in spoken tutoring dialogsHua Ai, Diane J. Litman, Katherine Forbes-Riley, Mihai Rotaru, Joel R. Tetreault, Amruta Purandare. [doi]

Recognition of interest in human conversational speechBjörn Schuller, Niels Köhler, Ronald Müller, Gerhard Rigoll. [doi]

Cross-lingual dialog model for speech to speech translationEmil Ettelaie, Panayiotis G. Georgiou, Shrikanth Narayanan. [doi]

Fast SVM training based on the choice of effective samples for audio classificationShilei Zhang, Hongchen Jiang, Shuwu Zhang, Bo Xu. [doi]

An efficient bispectrum phase entropy-based algorithm for VADJ. M. Górriz, Javier Ramírez, Carlos García Puntonet, José C. Segura. [doi]

Computer-assisted closed-captioning of live TV broadcasts in FrenchGilles Boulianne, Jean-Francois Beaumont, Maryse Boisvert, Julie Brousseau, Patrick Cardinal, Claude Chapdelaine, Michel Comeau, Pierre Ouellet, Frédéric Osterrath. [doi]

Analyzing dialogue data for real-world emotional speech classificationRyuichi Nisimura, Souji Omae, Hideki Kawahara, Toshio Irino. [doi]

Combining acoustic, lexical, and syntactic evidence for automatic unsupervised prosody labelingSankaranarayanan Ananthakrishnan, Shrikanth Narayanan. [doi]

Syllable-length path mixture hidden Markov models with trajectory clustering for continuous speech recognitionYan Han, Lou Boves. [doi]

A style control technique for speech synthesis using multiple regression HSMMTakashi Nose, Junichi Yamagishi, Takao Kobayashi. [doi]

Language model adaptation for tiny adaptation corporaDietrich Klakow. [doi]

Super-human multi-talker speech recognition: the IBM 2006 speech separation challenge systemTrausti T. Kristjansson, John R. Hershey, Peder A. Olsen, Steven J. Rennie, Ramesh A. Gopinath. [doi]

Unsupervised learning of HMM topology for text-dependent speaker verificationMing Liu, Thomas S. Huang. [doi]

Max-Gabor analysis and synthesis of spectrogramsTony Ezzat, Jake V. Bouvrie, Tomaso Poggio. [doi]

Prosodic modeling in large vocabulary Mandarin speech recognitionJui-Ting Huang, Lin-Shan Lee. [doi]

Speech enhancement based on spectral estimation from higher-lag autocorrelationBenjamin J. Shannon, Kuldip K. Paliwal, Climent Nadeu. [doi]

Soft margin estimation of hidden Markov model parametersJinyu Li, Ming Yuan, Chin-Hui Lee. [doi]

Towards automatic parameter extraction of command-response model for CantoneseRaymond W. M. Ng, Tan Lee, Wentao Gu. [doi]

Measuring the acceptable word error rate of machine-generated webcast transcriptsCosmin Munteanu, Gerald Penn, Ronald Baecker, Elaine G. Toms, David James. [doi]

Improved hybrid microphone array post-filter by integrating a robust speech absence probability estimator for speech enhancementJunfeng Li, Masato Akagi, Yôiti Suzuki. [doi]

An incremental algorithm for signal reconstruction from short-time Fourier transform magnitudeJake V. Bouvrie, Tony Ezzat. [doi]

Recognition of classroom lectures in European PortugueseIsabel Trancoso, Ricardo Nunes, Luís Neves, Céu Viana, Helena Moniz, Diamantino Caseiro, Ana Isabel Mata. [doi]

How auditory and visual prosody is used in end-of-utterance detectionPashiera Barkhuysen, Emiel Krahmer, Marc Swerts. [doi]

Adaptive speech enhancement for speech separation in diffuse noiseRong Hu, Yunxin Zhao. [doi]

Phonetically enriched labeling in unit selection TTS synthesisYeon-Jun Kim, Ann K. Syrdal, Alistair Conkie, Marc C. Beutnagel. [doi]

A new single-ended measure for assessment of speech qualityTimothy Murphy, Dorel Picovici, Abdulhussain E. Mahdi. [doi]

Language model adaptation with a word list and a raw corpusShinsuke Mori. [doi]

Automatic detection of voice onset time contrasts for use in pronunciation assessmentAbe Kazemzadeh, Joseph Tepperman, Jorge Silva, Hong You, Sungbok Lee, Abeer Alwan, Shrikanth Narayanan. [doi]

Reconstructing tongue movements from audio and videoHedvig Kjellström, Olov Engwall, Olle Bälter. [doi]

Analysis of correlation between audio and visual speech features for clean audio feature prediction in noiseIbrahim Almajai, Ben Milner, Jonathan Darch. [doi]

Maximum entropy modeling for diacritization of Arabic textRuhi Sarikaya, Ossama Emam, Imed Zitouni, Yuqing Gao. [doi]

Conditional random fields for hierarchical segment selection in text-to-speech synthesisChristian Weiss, Wolfgang Hess. [doi]

Prominent words as anchors for TRP projectionRob van Son, Wieneke Wesseling, Louis C. W. Pols. [doi]

Development of Slovak GALAXY/VoiceXML based spoken language dialogue system to retrieve information from the internetJozef Juhar, Stanislav Ondás, Anton Cizmar, Milan Rusko, Gregor Rozinaj, Roman Jarina. [doi]

HMM-based continuous sign language recognition using a fast optical flow parameterization of visual informationGuillermo Cortés, Luz García, M. Carmen Benítez, José C. Segura. [doi]

Sparseness and speech perception in noiseGuoping Li, Mark E. Lutman. [doi]

User expectations and real experience on a multimodal interactive systemKristiina Jokinen, Topi Hurtig. [doi]

Integrating phonetic boundary discrimination explicitly into HMM systemsYu Wang, Eric Fosler-Lussier. [doi]

Eigenvoice conversion based on Gaussian mixture modelTomoki Toda, Yamato Ohtani, Kiyohiro Shikano. [doi]

Joint interpretation of input speech and pen gestures for multimodal human-computer interactionPui-Yu Hui, Helen M. Meng. [doi]

Text-independent cross-language voice conversionDavid Sündermann, Harald Höge, Antonio Bonafonte, Hermann Ney, Julia Hirschberg. [doi]

Low-resource autodiacritization of abjads for speech keyword searchPatrick Schone. [doi]

A new set of features for text-independent speaker identificationCarol Y. Espy-Wilson, Sandeep Manocha, Srikanth Vishnubhotla. [doi]

Integrating spoken dialog and question answering: the Ritel projectSophie Rosset, Olivier Galibert, Gabriel Illouz, Aurélien Max. [doi]

Analysis of overlaps in meetings by dialog factors, hot spots, speakers, and collection site: insights for automatic speech recognitionÖzgür Çetin, Elizabeth Shriberg. [doi]

An automatic singing skill evaluation method for unknown melodies using pitch interval accuracy and vibrato featuresTomoyasu Nakano, Masataka Goto, Yuzuru Hiraga. [doi]

Cooperation between global and local methods for the automatic segmentation of speech synthesis corporaSafaa Jarifi, Dominique Pastor, Olivier Rosec. [doi]

Formant-based English vowel assessment for Chinese in TaiwanJiang-Chun Chen, Wei-Tang Hsu, Jyh-Shing Roger Jang, Ren-Yuan Lyu, Yuang-Chin Chiang. [doi]

Modeling sensory-to-motor mappings using neural nets and a 3D articulatory speech synthesizerBernd J. Kröger, Peter Birkholz, Jim Kannampuzha, Christiane Neuschaefer-Rube. [doi]

An user-centered development of an intuitive dialog control for speech-controlled music selection in carsStefan Schulz, Hilko Donker. [doi]

CASA based speech separation for robust speech recognitionRunqiang Han, Pei Zhao, Qin Gao, Zhiping Zhang, Hao Wu, Xihong Wu. [doi]

State-level variable modeling for phoneme classificationHao-Zheng Li, Douglas D. O'Shaughnessy. [doi]

Selective-LPC based representation of STRAIGHT spectrum and its applications in spectral smoothingHeng Kang, Wenju Liu. [doi]

Is voice quality enough? - study on how the situation and user's awareness influence the utterance featuresShinya Yamada, Toshihiko Itoh, Kenji Araki. [doi]

A model for the f0 reset in corpus-based intonation approachesFrancisco Campillo Díaz, Jan P. H. van Santen, Eduardo Rodríguez Banga. [doi]

Incorporating second-order information into two-step major phrase break prediction for KoreanSeungwon Kim, Jinsik Lee, Byeongchang Kim, Gary Geunbae Lee. [doi]

Development of prototype text-to-speech systems for Northern SothoH. J. Oosthuizen, S. T. Phihlela, M. J. D. Manamela. [doi]

Tracking of visible vocal tract resonances (VVTR) based on Kalman filteringI. Yücel Özbek, Mübeccel Demirekler. [doi]

Substitute sounds for ventriloquism and speech disordersJörg Metzner, Marcel Schmittfull, Karl Schnell. [doi]

A speaker adaptation algorithm using principal curves in noisy environmentsJingying Wang, Zuoying Wang. [doi]

A quality measure method using Gaussian mixture models and divergence measure for speaker identificationRong Zheng, Shuwu Zhang, Bo Xu. [doi]

Posterior based keyword spotting with a priori thresholdsHamed Ketabdar, Jithendra Vepa, Samy Bengio, Hervé Bourlard. [doi]

Voice activity detection in personal audio recordings using autocorrelogram compensationKeansub Lee, Daniel P. W. Ellis. [doi]

Acoustic cues for the classification of regular and irregular phonationKushan Surana, Janet Slifka. [doi]

Language modeling of Chinese personal names based on character units for continuous Chinese speech recognitionXinhui Hu, Hirofumi Yamamoto, Gen-ichiro Kikui, Yoshinori Sagisaka. [doi]

A user simulator based on VoiceXML for evaluation of spoken dialog systemsAkinori Ito, Keisuke Shimada, Motoyuki Suzuki, Shozo Makino. [doi]

A Spanish speech to sign language translation system for assisting deaf-mute peopleRubén San Segundo, Roberto Barra-Chicote, Luis Fernando D'Haro, Juan Manuel Montero, Ricardo de Córdoba, Javier Ferreiros. [doi]

Real-time synthesis of Chinese visual speech and facial expressions using MPEG-4 FAP features in a three-dimensional avatarZhiyong Wu, Shen Zhang, Lianhong Cai, Helen M. Meng. [doi]

Speaker adaptation of trajectory HMMs using feature-space MLLRHeiga Zen, Yoshihiko Nankaku, Keiichi Tokuda, Tadashi Kitamura. [doi]

LDA based feature estimation methods for LVCSRJanne Pylkkönen. [doi]

Automatic phonetic segmentation by using a SPM-based approach for a Mandarin singing voice corpusCheng-Yuan Lin, Jyh-Shing Roger Jang. [doi]

Doing research on a deployed spoken dialogue system: one year of let's go! experienceAntoine Raux, Dan Bohus, Brian Langner, Alan W. Black, Maxine Eskenazi. [doi]

A robust feature extraction based on the MTF concept for speech recognition in reverberant environmentXugang Lu, Masashi Unoki, Masato Akagi. [doi]

Detection of quotations and inserted clauses and its application to dependency structure analysis in spontaneous JapaneseRyoji Hamabe, Kiyotaka Uchimoto, Tatsuya Kawahara, Hitoshi Isahara. [doi]

Vector Taylor series based joint uncertainty decodingHaitian Xu, Luca Rigazio, David Kryze. [doi]

An adaptive sampling procedure for speech perception experimentsGeoffrey Stewart Morrison. [doi]

Analysis of prosodic and linguistic cues of phrase finals for turn-taking and dialog actsCarlos Toshinori Ishi, Hiroshi Ishiguro, Norihiro Hagita. [doi]

Within-class covariance normalization for SVM-based speaker recognitionAndrew O. Hatch, Sachin S. Kajarekar, Andreas Stolcke. [doi]

Visual speech segmentation and speaker recognition for transcription of TV newsJosef Chaloupka. [doi]

Two stage transform vector quantization of LSFs for wideband speech codingSaikat Chatterjee, T. V. Sreenivas. [doi]

Automatic emotion recognition of speech signal in MandarinSheng Zhang, P. C. Ching, Fanrang Kong. [doi]

Multi-microphone periodicity function for robust F0 estimation in real noisy and reverberant environmentsFederico Flego, Maurizio Omologo. [doi]

Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesisYuji Nakano, Makoto Tachibana, Junichi Yamagishi, Takao Kobayashi. [doi]

Online speech detection and dual-gender speech recognition for captioning broadcast newsToru Imai, Shoei Sato, Akio Kobayashi, Kazuo Onoe, Shinichi Homma. [doi]

Improved warping-invariant features for automatic speech recognitionJan Rademacher, Matthias Wächter, Alfred Mertins. [doi]

Phrase break prediction using logistic generalized linear modelLifu Yi, Jian Li, Xiaoyan Lou, Jie Hao. [doi]

A spectral-temporal method for pitch trackingStephen A. Zahorian, Princy Dikshit, Hongbing Hu. [doi]

A multiclass framework for speaker verification within an acoustic event sequence systemNicolas Scheffer, Jean-François Bonastre. [doi]

Automatic Mandarin pronunciation scoring for native learners with dialect accentSi Wei, Qing-Sheng Liu, Yu Hu, Ren-Hua Wang. [doi]

CHAT: a conversational helper for automotive tasksFuliang Weng, Sebastian Varges, Badri Raghunathan, Florin Ratiu, Heather Pon-Barry, Brian Lathrop, Qi Zhang, Harry Bratt, Tobias Scheideck, Kui Xu, Matthew Purver, Rohit Mishra, Annie Lien, Madhuri Raya, Stanley Peters, Yao Meng, J. Russell, Lawrence Cavedon, Elizabeth Shriberg, Hauke Schmidt, R. Prieto. [doi]

Multivariate analysis of frame-based acoustic cues of dysperiodicities in connected speechAbdellah Kacha, Francis Grenez, Jean Schoentgen. [doi]

Compact n-gram models by incremental growing and clustering of historiesSami Virpioja, Mikko Kurimo. [doi]

Single channel speech enhancement by frequency domain constrained optimization and temporal maskingWen Jin, Michael S. Scordilis. [doi]

Glottal closure and opening detection for flexible parametric voice codingPamornpol Jinachitra. [doi]

Voice GMM modelling for FESTIVAL/MBROLA emotive TTS synthesisMauro Nicolao, Carlo Drioli, Piero Cosi. [doi]

Effect of dynamic information of formants on discrimination of English vowels in consonantal contexts by Japanese listenersAkiyo Joto. [doi]

Minimum boundary error training for automatic phonetic segmentationJen-Wei Kuo, Hsin-Min Wang. [doi]

Efficient VQ techniques and general noise shaping in noise feedback codingJes Thyssen, Juin-Hwey Chen. [doi]

Discriminative kernel-based phoneme sequence recognitionJoseph Keshet, Shai Shalev-Shwartz, Samy Bengio, Yoram Singer, Dan Chazan. [doi]

Improving phrase-based Korean-English statistical machine translationJonghoon Lee, Donghyeon Lee, Gary Geunbae Lee. [doi]

Lexical stress in continuous speech recognitionRogier C. van Dalen, Pascal Wiggers, Léon J. M. Rothkrantz. [doi]

Training of coarticulation models using dominance functions and visual unit selection methods for audio-visual speech synthesisZdenek Krnoul, Milos Zelezný, Ludek Müller, Jakub Kanis. [doi]

The 2006 RWTH parliamentary speeches transcription systemJonas Lööf, Maximilian Bisani, Christian Gollan, Georg Heigold, Björn Hoffmeister, Christian Plahl, Ralf Schlüter, Hermann Ney. [doi]

Comparative study on contributions of pitch-synchronization and peak-amplitude towards robustness issue of ASRMuhammad Ghulam, Junsei Horikawa, Tsuneo Nitta. [doi]

Audio-visual speech recognition in the presence of a competing speakerXu Shao, Jon Barker. [doi]

/nailon/ - software for online analysis of prosodyJens Edlund, Mattias Heldner. [doi]

An optimum microphone array post-filter for speech applicationsStamatios Lefkimmiatis, Dimitrios Dimitriadis, Petros Maragos. [doi]

On speaker-specific prosodic models for automatic dialog act segmentation of multi-party meetingsJáchym Kolár, Elizabeth Shriberg, Yang Liu. [doi]

Towards a comprehensive investigation of factors relevant to peak alignment using a unit selection corpusMatthias Jilka, Bernd Möbius. [doi]

Audio person tracking in a smart-room environmentAlberto Abad, Carlos Segura, Dusan Macho, Javier Hernando, Climent Nadeu. [doi]

Evolving emotional prosodyCecilia Ovesdotter Alm, Xavier Llorà. [doi]

Word intelligibility estimation of noise-reduced speechTakeshi Yamada, Masakazu Kumakura, Nobuhiko Kitawaki. [doi]

Robust speaker diarization for meetings: ICSI RT06s evaluation systemXavier Anguera, Chuck Wooters, José M. Pardo. [doi]

Speaker identification under noisy environments by using harmonic structure extraction and reliable frame weightingHiromasa Fujihara, Tetsuro Kitahara, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. [doi]

Realizations and representations of Thai tones in monomoraic syllablesRattima Nitisaroj. [doi]

Improving glottal waveform estimation through rank-based glottal quality assessmentElliot Moore II, Juan Torres. [doi]

The role of prosody in the perception of US native English accentsAyako Ikeno, John H. L. Hansen. [doi]

Have we met? MDP based speaker ID for robot dialogueFilip Krsmanovic, Curtis Spencer, Daniel Jurafsky, Andrew Y. Ng. [doi]

A comparison of singing evaluation algorithmsPartha Lal. [doi]

A noninvasive, low-cost device to study the velopharyngeal port during speech and some preliminary resultsXiaochuan Niu, Alexander Kain, Jan P. H. van Santen. [doi]

Effects of word frequency on the acoustic durations of affixesMark Pluymaekers, Mirjam Ernestus, R. Harald Baayen. [doi]

Articulatory features for meeting speech recognitionFlorian Metze. [doi]

On the use of Jacobian adaptation in real speaker verification applicationsJan Anguita, Javier Hernando. [doi]

Exploiting dendritic autocorrelogram structure to identify spectro-temporal regions dominated by a single sound sourceNing Ma, Phil Green, André Coy. [doi]

Effects of midline tongue piercing on spectral centroid frequencies of sibilantsTom Kovacs, Donald S. Finan. [doi]

Chinese input method based on reduced Mandarin phonetic alphabetChun-Han Tseng, Chia-Ping Chen. [doi]

A trajectory mixture density network for the acoustic-articulatory inversion mappingKorin Richmond. [doi]

A novel framework of text-independent speaker verification based on utterance transform and iterative cohort modelingMing Liu, Huazhong Ning, Thomas S. Huang, Zhengyou Zhang. [doi]

Improved language identification using support vector machines for language modelingXi Yang, Lu-Feng Zhai, Man-Hung Siu, Herbert Gish. [doi]

Bayesian decision tree state tying for conversational speech recognitionRusheng Hu, Yunxin Zhao. [doi]

Integrating Festival and WindowsRhys James Jones, Ambrose Choy, Briony Williams. [doi]

Amharic speech synthesis using cepstral method with stress generation ruleTadesse Anberbir, Tomio Takara. [doi]

Imperfect transcript driven speech recognitionBenjamin Lecouteux, Georges Linarès, Pascal Nocera, Jean-François Bonastre. [doi]

On speech variation and word type differentiation by articulatory feature representationsLouis ten Bosch, R. Harald Baayen, Mirjam Ernestus. [doi]

Multistage convolutive blind source separation for speech mixtureYanxue Liang, Ichiro Hagiwara. [doi]

A phrase-level machine translation approach for disfluency detection using weighted finite state transducersSameer Maskey, Bowen Zhou, Yuqing Gao. [doi]

A tone recognition framework for continuous Mandarin speechLei He, Jie Hao. [doi]

Vector-based spoken language recognition using output codingHaizhou Li, Bin Ma, Rong Tong. [doi]

Hypothesis spaces for minimum Bayes risk training in large vocabulary speech recognitionMatthew Gibson, Thomas Hain. [doi]

Automatic initial/final generation for dialectal Chinese speech recognitionLinquan Liu, Thomas Fang Zheng, Wenhu Wu. [doi]

Limitations of MLLR adaptation with Spanish-accented English: an error analysisConstance Clarke, Daniel Jurafsky. [doi]

Cluster-based user simulations for learning dialogue strategiesVerena Rieser, Oliver Lemon. [doi]

Recent progress on the discriminative region-dependent transform for speech feature extractionBing Zhang, Spyros Matsoukas, Richard M. Schwartz. [doi]

A study on lattice rescoring with knowledge scores for automatic speech recognitionSabato Marco Siniscalchi, Jinyu Li, Chin-Hui Lee. [doi]

Continual on-line monitoring of Czech spoken broadcast programsJan Nouza, Jindrich Zdánský, Petr Cerva, Jan Kolorenc. [doi]

A discriminative method for speaker verification using the difference informationZhenchun Lei, Yingchun Yang, Zhaohui Wu. [doi]

Classified comfort noise generation for efficient voice transmissionYasheng Qian, Wei-Shou Hsu, Peter Kabal. [doi]

Potential relevance of audio-visual integration in mammals for computational modelingEeva Klintfors, Francisco Lacerda. [doi]

Analysis of Lombard effect under different types and levels of noise with application to in-set speaker ID systemsVaishnevi S. Varadarajan, John H. L. Hansen. [doi]

QASR: question answering using semantic roles for speech interfaceSvetlana Stenchikova, Dilek Hakkani-Tür, Gökhan Tür. [doi]

Prosodic feature generation for back-channel predictionThamar Solorio, Olac Fuentes, Nigel G. Ward, Yaffa Al Bayyari. [doi]

Advances in lecture recognition: the ISL RT-06s evaluation systemChristian Fügen, Matthias Wölfel, John W. McDonough, Shajith Ikbal, Florian Kraft, Kornel Laskowski, Mari Ostendorf, Sebastian Stüker, Ken'ichi Kumatani. [doi]

Towards a multimodal topic tracking system for a mobile robotJan F. Maas, Britta Wrede, Gerhard Sagerer. [doi]

Identification of regional accents in French: perception and categorizationCécile Woehrling, Philippe Boula de Mareüil. [doi]

On the sufficiency and redundancy of pitch for TRP projectionWieneke Wesseling, Rob van Son, Louis C. W. Pols. [doi]

Multi-flow block interleaving applied to distributed speech recognition over IP networksAngel M. Gomez, Juan J. Ramos-Muñoz, Antonio M. Peinado, Victoria E. Sánchez. [doi]

A cohort-UBM approach to mitigate data sparseness for in-set/out-of-set speaker recognitionVinod Prakash, John H. L. Hansen. [doi]

From reaction to prediction: experiments with computational models of turn-takingDavid Schlangen. [doi]

Speech/non-speech discrimination combining advanced feature extraction and SVM learningJavier Ramírez, Pablo Yélamos, J. M. Górriz, José C. Segura, Luz García. [doi]

Locating phone boundaries from acoustic discontinuities using a two-staged approachPairote Leelaphattarakij, Proadpran Punyabukkana, Atiwong Suchato. [doi]

Improving perplexity measures to incorporate acoustic confusabilityAmit Anil Nanavati, Nitendra Rajput. [doi]

Basque-Spanish language identification using phone-based methodsVíctor G. Guijarrubia, M. Inés Torres. [doi]

Consonant and vowel confusions in speech-weighted noiseSandeep Phatak, Jont B. Allen. [doi]

Experiments on Chinese speech recognition with tonal models and pitch estimation using the Mandarin speecon dataYing Sun, Daniel Willett, Raymond Brueckner, Rainer Gruhn, Dirk Bühler. [doi]

Decision directed constrained iterative speech enhancementAmit Das, John H. L. Hansen. [doi]

Speech recognition of foreign out-of-vocabulary words using a hierarchical language modelHirofumi Yamamoto, Gen-ichiro Kikui, Satoshi Nakamura, Yoshinori Sagisaka. [doi]

Sub-word unit based non-audible speech recognition using surface electromyographyMatthias Walliczek, Florian Kraft, Szu-Chen Stan Jou, Tanja Schultz, Alex Waibel. [doi]

An efficient segment-based speech compression technique for hand-held TTS systemsChang-Heon Lee, Sung-Kyo Jung, Thomas Eriksson, Won-Suk Jun, Hong-Goo Kang. [doi]

Multi-accent Chinese speech recognitionYi Liu, Pascale Fung. [doi]

Generalization of the minimum classification error (MCE) training based on maximizing generalized posterior probability (GPP)Qiang Fu, Antonio Moreno-Daniel, Biing-Hwang Juang, Jian-Lai Zhou, Frank K. Soong. [doi]

Noise update modeling for speech enhancement: when do we do enough?Nitish Krishnamurthy, John H. L. Hansen. [doi]

Exploring the unknown - collecting 1000 speakers over the internet for the ph@ttsessionz database of adolescent speakersChristoph Draxler. [doi]

Robust acoustic-based syllable detectionZhimin Xie, Partha Niyogi. [doi]

Frequency warping by linear transformation of standard MFCCSankaran Panchapagesan. [doi]

Pronunciation variant-based multi-path HMMs for syllablesAnnika Hämäläinen, Louis ten Bosch, Lou Boves. [doi]

Influence of pause length on listeners' impressions in simultaneous interpretationHitomi Tohyama, Shigeki Matsubara. [doi]

Bootstrapping language models for dialogue systemsKarl Weilhammer, Matthew N. Stuttle, Steve Young. [doi]

Mapping neural networks for bandwidth extension of narrowband speechA. Shahina, B. Yegnanarayana. [doi]

A probabilistic graphical model for microphone array source separation using rich pre-trained source modelsHagai Thomas Attias. [doi]

Robust speech recognition over mobile networks using combined weighted Viterbi decoding and subvector based error concealmentZheng-Hua Tan, Paul Dalsgaard, Børge Lindberg. [doi]

Pitch determination using aligned AMDFM. Shahidur Rahman, Hirobumi Tanaka, Tetsuya Shimamura. [doi]

All-pole model estimation of vocal tract on the frequency domainLuis Weruaga, Amar Al-Khayat. [doi]

Powered cepstral normalization (p-CN) for robust features in speech recognitionChang-Wen Hsu, Lin-Shan Lee. [doi]

Unsupervised model adaptation for speaker verificationAlexandre Preti, Jean-François Bonastre. [doi]

Objective estimation of suicidal risk using vocal output characteristicsT. Yingthawornsuk, H. Kaymaz Keskinpala, D. France, D. M. Wilkes, R. G. Shiavi, R. M. Salomon. [doi]

Respiratory/laryngeal interactions during sustained vowel production in childrenDonald S. Finan, Carol A. Boliek. [doi]

Speech enhancement based on residual noise shapingJong Won Shin, Seung Yeol Lee, Hwan Sik Yun, Nam Soo Kim. [doi]

LINTest: a development tool for testing dialogue systemsLars Degerstedt, Arne Jönsson. [doi]

Evaluation of perceptual quality of control point reduction in rule-based synthesisKimmo Pärssinen, Marko Moberg. [doi]

Investigation on Mandarin broadcast news speech recognitionMei-Yuh Hwang, Xin Lei, Wen Wang, Takahiro Shinozaki. [doi]

Acoustic model training based on linear transformation and MAP modification for HSMM-based speech synthesisKatsumi Ogata, Makoto Tachibana, Junichi Yamagishi, Takao Kobayashi. [doi]

Performance improvement of dialog speech translation by rejecting unreliable utterancesToshiyuki Takezawa, Tohru Shimizu. [doi]

Extension and further analysis of higher order cepstral moment normalization (HOCMN) for robust features in speech recognitionChang-Wen Hsu, Lin-Shan Lee. [doi]

Scalable and portable web-based multimodal dialogue interaction with geographical databasesAlexander Gruenstein, Stephanie Seneff, Chao Wang. [doi]

Intelligibility of machine translation output in speech synthesisLaura Mayfield Tomokiyo, Kay Peterson, Alan W. Black, Kevin A. Lenzo. [doi]

Minimum classification error training of hidden Markov models for acoustic language identificationJosef G. Bauer, Ekaterina Timoshenko. [doi]

Analyzing reusability of speech corpus based on statistical multidimensional scaling methodGoshu Nagino, Makoto Shozakai. [doi]

Novel time domain multi-class SVMs for landmark detectionRahul Chitturi, Mark Hasegawa-Johnson. [doi]
