7th International Conference on Spoken Language Processing, ICSLP2002 - INTERSPEECH 2002, Denver, Colorado, USA, September 16-20, 2002

John H. L. Hansen, Bryan L. Pellom, editors, 7th International Conference on Spoken Language Processing, ICSLP2002 - INTERSPEECH 2002, Denver, Colorado, USA, September 16-20, 2002. ISCA, 2002.

Conference: interspeech2002

The evolution of spoken language: a comparative approachW. Tecumseh Fitch. 1-8 [doi]

Talking to machines (statistically speaking)Steve Young. 9-16 [doi]

Evaluation of a noise-robust DSR front-end on Aurora databasesDuncan Macho, Laurent Mauuary, Bernhard Noé, Yan Ming Cheng, Douglas Ealey, Denis Jouvet, Holly Kelleher, David Pearce, Fabien Saadoun. 17-20 [doi]

Qualcomm-ICSI-OGI features for ASRAndré Gustavo Adami, Lukás Burget, Stéphane Dupont, Harinath Garudadri, Frantisek Grézl, Hynek Hermansky, Pratibha Jain, Sachin S. Kajarekar, Nelson Morgan, Sunil Sivadas. 21-24 [doi]

Improving word accuracy with Gabor feature extractionMichael Kleinschmidt, David Gelbart. 25-28 [doi]

Evaluation of SPLICE on the Aurora 2 and 3 tasksJasha Droppo, Li Deng, Alex Acero. 29-32 [doi]

Performance of discriminatively trained auditory features on Aurora2 and Aurora3Brian Kan-Wing Mak, Yik-Cheung Tam. 33-36 [doi]

Evidence for efficiency in vowel productionR. J. J. H. van Son, Louis C. W. Pols. 37-40 [doi]

Stochastic suprasegmentals: relationship between the spectral characteristics of vowels, redundancy and prosodic structureMatthew P. Aylett. 41-44 [doi]

Motor specifications of a baby robot via the analysis of infants' vocalizationsJ. Serkhane, Jean-Luc Schwartz, Louis-Jean Boë, B. Davis, C. Matyear. 45-48 [doi]

Oral-laryngeal control patterns for fricatives in 5-year-olds and adultsLaura L. Koenig, Jorge C. Lucero. 49-52 [doi]

French nasal vowels: acoustic and articulatory propertiesVéronique Delvaux, Thierry Metens, Alain Soquet. 53-56 [doi]

Maximum likelihood estimation of eigenvoices and residual variances for large vocabulary speech recognition tasksPatrick Kenny, Gilles Boulianne, Pierre Dumouchel. 57-60 [doi]

Rapid speaker adaptation using speaker clusteringErnest Pusateri, Timothy J. Hazen. 61-64 [doi]

Adaptive model combination for dynamic speaker selection trainingChao Huang, Tao Chen, Eric Chang. 65-68 [doi]

Unsupervised n-best based model adaptation using model-level confidence measuresKa-Yan Kwan, Tan Lee, Chen Yang. 69-72 [doi]

LU factorization for feature transformationPatrick Nguyen, Luca Rigazio, Christian Wellekens, Jean-Claude Junqua. 73-76 [doi]

Same talker, different language: a replicationVerna Stockmal, Zinny S. Bond. 77-80 [doi]

Automatic language identification using acoustic sub-word unitsA. K. V. Sai Jayram, V. Ramasubramanian, T. V. Sreenivas. 81-84 [doi]

Factors in human language identificationIan Maddieson, Ioana Vasilescu. 85-88 [doi]

Approaches to language identification using Gaussian mixture models and shifted delta cepstral featuresPedro A. Torres-Carrasquillo, Elliot Singer, Mary A. Kohler, Richard J. Greene, Douglas A. Reynolds, John R. Deller Jr.. 89-92 [doi]

Methods to improve Gaussian mixture model based language identification systemEddie Wong, Sridha Sridharan. 93-96 [doi]

Part-of-speech tagging in French text-to-speech synthesis: experiments in tagset selectionHongyan Jing, Evelyne Tzoukermann. 97-100 [doi]

Grapheme-to-phoneme conversion using pseudo-morphological unitsUlla Uebler. 101-104 [doi]

Investigations on joint-multigram models for grapheme-to-phoneme conversionMaximilian Bisani, Hermann Ney. 105-108 [doi]

Pronunciation of proper names with a joint n-gram model for bi-directional grapheme-to-phoneme conversionLucian Galescu, James F. Allen. 109-112 [doi]

The AT&T German text-to-speech system: realistic linguistic descriptionMatthias Jilka, Ann K. Syrdal. 113-116 [doi]

Generating script using statistical information of the context variation unit vectorHaiping Li, Fangxin Chen, Liqin Shen. 117-120 [doi]

Efficient and scalable methods for text script generation in corpus-based TTS designChih-Chung Kuo, Jing-Yi Huang. 121-124 [doi]

A statistically motivated database pruning technique for unit selection synthesisPeter Rutten, Matthew P. Aylett, Justin Fackrell, Paul Taylor. 125-128 [doi]

A new method of building decision tree based on target informationYi-Jian Wu, Yu Hu, Xiaoru Wu, Ren-Hua Wang. 129-132 [doi]

A context clustering technique for average voice model in HMM-based speech synthesisJunichi Yamagishi, Masatsune Tamura, Takashi Masuko, Keiichi Tokuda, Takao Kobayashi. 133-136 [doi]

Feature extraction for unit selection in concatenative speech synthesis: comparison between AIM, LPC, and MFCCMinoru Tsuzaki, Hisashi Kawai. 137-140 [doi]

Combined prosody and candidate unit selections for corpus-based text-to-speech systemsFrancisco Campillo Díaz, Eduardo Rodríguez Banga. 141-144 [doi]

Automatic segmentation combining an HMM-based approach and spectral boundary correctionYeon-Jun Kim, Alistair Conkie. 145-148 [doi]

Refined speech segmentation for concatenative speech synthesisAbhinav Sethy, Shrikanth S. Narayanan. 149-152 [doi]

Refocussing on the text normalisation process in text-to-speech systemsAndrew P. Breen, Barry Eggleton, Peter Dion, Steve Minnis. 153-156 [doi]

A text-to-speech synthesis system for TeluguJithendra Vepa, Jahnavi Ayachitam, K. V. K. Kalpana Reddy. 157-160 [doi]

Towards an intonation module for a Portuguese TTS systemDiamantino Freitas, Daniela Braga. 161-164 [doi]

Applying a hybrid intonation model to a seamless speech synthesizerTakashi Saito, Masaharu Sakamoto. 165-168 [doi]

Flexible multimodal human-machine interaction in mobile environmentsDirk Bühler, Wolfgang Minker, Jochen Häußler, Sven Krüger. 169-172 [doi]

Implementation testing of a hybrid symbolic/statistical multimodal architectureEdward C. Kaiser, Philip R. Cohen. 173-176 [doi]

Belief network based disambiguation of object reference in spoken dialogue system for robotYoko Yamakata, Tatsuya Kawahara, Hiroshi G. Okuno. 177-180 [doi]

Specification and realisation of multimodal output in dialogue systemsJonas Beskow, Jens Edlund, Magnus Nordstrand. 181-184 [doi]

Gestural trajectory symmetries and discourse segmentationFrancis K. H. Quek, Yingen Xiong, David McNeill. 185-188 [doi]

Gestural spatialization in natural discourse segmentationFrancis K. H. Quek, David McNeill, Robert K. Bryll, Mary P. Harper. 189-192 [doi]

Real-time sound source localization and separation for robot auditionKazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano. 193-196 [doi]

CU animate tools for enabling conversations with animated charactersJiyong Ma, Jie Yan, Ronald Cole. 197-200 [doi]

Multiparty multimodal interaction: a preliminary analysisPhilip R. Cohen, Rachel Coulston, Kelly Krout. 201-204 [doi]

Distributed audio-visual speech synchronizationPeter Poller, Jochen Müller. 205-208 [doi]

Lip-reading based on a fully automatic statistical modelPhilippe Daubias, Paul Deléglise. 209-212 [doi]

Audio-visual continuous speech recognition using a coupled hidden Markov modelXiaoxing Liu, Yibao Zhao, Xiaobo Pi, Luhong Liang, Ara V. Nefian. 213-216 [doi]

Data, annotation schemes and coding tools for natural interactivityLaila Dybkjær, Niels Ole Bernsen. 217-220 [doi]

VisSTA: a tool for analyzing multimodal discourse dataFrancis K. H. Quek, Yang Shi, Cemil Kirbas, Shunguang Wu. 221-224 [doi]

Feature extraction combining spectral noise reduction and cepstral histogram equalization for robust ASRJosé C. Segura, M. Carmen Benítez, Ángel de la Torre, Antonio J. Rubio. 225-228 [doi]

Bell Labs approach to Aurora evaluation on connected digit recognitionJingdong Chen, Dimitris Dimitriadis, Hui Jiang, Qi Li, Tor André Myrvoll, Olivier Siohan, Frank K. Soong. 229-232 [doi]

Algorithms for distributed speech recognition in a noisy automobile environmentHong Kook Kim, Richard C. Rose. 233-236 [doi]

Quantile based histogram equalization for online applicationsFlorian Hilger, Sirko Molau, Hermann Ney. 237-240 [doi]

Frontend post-processing and backend model enhancement on the Aurora 2.0/3.0 databasesChia-Ping Chen, Karim Filali, Jeff A. Bilmes. 241-244 [doi]

The influence of identification training on identification and production of the American English mid and low vowels by native speakers of JapaneseStephen Lambacher, William Martens, Kazuhiko Kakehi. 245-248 [doi]

Perceptual learning of second-language syllable rhythm by elderly listenersKeiichi Tajima, Reiko Akahane-Yamada, Tsuneo Yamada. 249-252 [doi]

Absolute pitch and lexical tones: tone perception by non-musician, musician, and absolute pitch non-tonal language speakersDenis K. Burnham, Ron Brooker. 257-260 [doi]

Comprehension of non-native speech: inaccurate phoneme processing and activation of lexical competitorsMirjam Broersma. 261-264 [doi]

Overview on recent activities in speech understanding and dialogue systems evaluationWolfgang Minker. 265-268 [doi]

DARPA Communicator: cross-system results for the 2001 evaluationMarilyn A. Walker, Alexander I. Rudnicky, Rashmi Prasad, John S. Aberdeen, Elizabeth Owen Bratt, John S. Garofolo, Helen Wright Hastie, Audrey N. Le, Bryan L. Pellom, Alexandros Potamianos, Rebecca J. Passonneau, Salim Roukos, Gregory A. Sanders, Stephanie Seneff, David Stallard. 269-272 [doi]

DARPA Communicator evaluation: progress from 2000 to 2001Marilyn A. Walker, Alexander I. Rudnicky, John S. Aberdeen, Elizabeth Owen Bratt, John S. Garofolo, Helen Wright Hastie, Audrey N. Le, Bryan L. Pellom, Alexandros Potamianos, Rebecca J. Passonneau, Rashmi Prasad, Salim Roukos, Gregory A. Sanders, Stephanie Seneff, David Stallard. 273-276 [doi]

Effects of word error rate in the DARPA Communicator data during 2000 and 2001Gregory A. Sanders, Audrey N. Le, John S. Garofolo. 277-280 [doi]

Subset languages for conversing with collaborative interface agentsCandace L. Sidner, Clifton Forlines. 281-284 [doi]

Transformation of spectral envelope for voice conversion based on radial basis function networksTomomi Watanabe, Takahiro Murakami, Munehiro Namba, Tetsuya Hoya, Yoshihisa Ishida. 285-288 [doi]

Subband based voice conversionOytun Türk, Levent M. Arslan. 289-292 [doi]

Evaluation of cross-language voice conversion using bilingual and non-bilingual databasesMikiko Mashimo, Tomoki Toda, Hiromichi Kawanami, Hideki Kashioka, Kiyohiro Shikano, Nick Campbell. 293-296 [doi]

Voice transformations for improving children's speech recognition in a publicly available dialogue systemJoakim Gustafson, Kåre Sjölander. 297-300 [doi]

The ISL meeting corpus: the impact of meeting type on speech styleSusanne Burger, Victoria MacLaren, Hua Yu. 301-304 [doi]

A new method for testing dialogue systems based on simulations of real-world conditionsRamón López-Cózar, Ángel de la Torre, José C. Segura, Antonio J. Rubio, Juan M. López-Soler. 305-308 [doi]

Comfort noise detection and GSM-FR-codec detection for speech-quality evaluations in telephone networksThorsten Ludwig. 309-312 [doi]

Validation and improvement of automatic phonetic transcriptionsCatia Cucchiarini, Diana Binnenpoorte. 313-316 [doi]

Development of Japanese infant speech database and speaking rate analysisShigeaki Amano, Kazumi Kato, Tadahisa Kondo. 317-320 [doi]

Automatic prosodic break labeling for Mandarin Chinese speech dataMinghui Dong, Kim-Teng Lua. 321-324 [doi]

Orientel: speech-based interactive communication applications for the Mediterranean and the Middle EastImed Zitouni, Joseph P. Olive, Dorota J. Iskra, Khalid Choukri, Ossama Emam, Oren Gedge, Emmanuel Maragoudakis, Herbert S. Tropf, Asunción Moreno, Albino Nogueiras Rodriguez, Barbara Heuft, Rainer Siemund. 325-328 [doi]

The reliability of the ITU-T P.85 standard for the evaluation of text-to-speech systemsYolanda Vazquez-Alvarez, Mark Huckvale. 329-332 [doi]

Automatic generation of phonetic transcriptions for large speech corporaKris Demuynck, Tom Laureys, Steven Gillis. 333-336 [doi]

Overview on recent activities in speech understanding and dialogue systems evaluationWolfgang Minker. 337-340 [doi]

The Carnegie Mellon Communicator corpusChristina L. Bennett, Alexander I. Rudnicky. 341-344 [doi]

Globalphone: a multilingual speech and text database developed at Karlsruhe UniversityTanja Schultz. 345-348 [doi]

On developing new text and audio corpora and speech recognition tools for the Turkish languageÖzgül Salor, Bryan L. Pellom, Tolga Çiloglu, Kadri Hacioglu, Mübeccel Demirekler. 349-352 [doi]

FORM: an extensible, kinematically-based gesture annotation schemeCraig Martell. 353-356 [doi]

Automatic phoneme alignment based on acoustic-phonetic modelingJohn-Paul Hosom. 357-360 [doi]

Extracting clauses for spoken language understanding in conversational systemsNarendra K. Gupta, Srinivas Bangalore, Mazin G. Rahim. 361-364 [doi]

Issues in the development of a stochastic speech understanding systemF. Lefèvre, Hélène Bonneau-Maynard. 365-368 [doi]

10 years of phondat-II: a reassessmentHartmut R. Pfitzinger. 369-372 [doi]

Risk based lattice cutting for segmental minimum Bayes-risk decodingShankar Kumar, William Byrne. 373-376 [doi]

Dynamic search-space pruning for time-constrained speech recognitionSascha Wendt, Gernot A. Fink, Franz Kummert. 377-380 [doi]

A Gaussian selection method for multi-mixture HMM based continuous speech recognitionRaymond H. Lee, Eric H. C. Choi. 381-384 [doi]

On use of duration modeling for continuous digits speech recognitionRong Dong, Jie Zhu. 385-388 [doi]

Arc minimization in finite state decoding graphs with cross-word acoustic contextGeoffrey Zweig, George Saon, François Yvon. 389-392 [doi]

Fast hierarchical grammar optimization algorithm toward time and space efficiencyJing Zheng, Horacio Franco. 393-396 [doi]

Dynamic tuning of language model score in speech recognition using a confidence measureSherif Abdou, Michael S. Scordilis. 397-400 [doi]

Minimum perfect hashing for fast n-gram language model lookupXiao Zhang, Yunxin Zhao. 401-404 [doi]

Combining search spaces of heterogeneous recognizers for improved speech recognitionXiang Li, Rita Singh, Richard M. Stern. 405-408 [doi]

Transmission characteristics of outer ear canalKarel Pellant, Jan Mejzlík, Karel Prikryl, Zdenek Skvor. 409-412 [doi]

Hearing-aid benefits and limitations: predictions from a cochlear modelJames M. Kates. 413-416 [doi]

A psychoacoustic basis for spectral sharpeningPeggy B. Nelson, Jeffrey J. DiGiovanni, Robert S. Schlauch. 417-420 [doi]

Model-based predictions of intensity discrimination for normal- and impaired-hearing listenersLisa G. Huettel, Leslie M. Collins. 421-424 [doi]

Modeling the perception of frequency-shifted vowelsPeter F. Assmann, Terrance M. Nearey, Jack M. Scott. 425-428 [doi]

A phoneme recognizer for the hearing impairedMathias Johansson, Mats Blomberg, Kjell Elenius, Lars-Erik Hoffsten, Anders Torberger. 433-436 [doi]

HMM composition-based rapid model adaptation using a priori noise GMM adaptation evaluation on Aurora2 corpusMasaki Ida, Satoshi Nakamura. 437-440 [doi]

Data-driven temporal filters obtained via different optimization criteria evaluated on Aurora2 databaseJeih-Weih Hung, Lin-Shan Lee. 441-444 [doi]

Efficient additive and convolutional noise reduction proceduresBojan Kotnik, Damjan Vlaj, Zdravko Kacic, Bogomir Horvat. 445-448 [doi]

Progress with the Philips continuous ASR system on the Aurora 2 noisy digits databaseMarkus Lieb, Alexander Fischer. 449-452 [doi]

An environment compensated minimum classification error training approach and its evaluation on Aurora2 databaseJian Wu, Qiang Huo. 453-456 [doi]

Evaluation of a noise adaptive speech recognition system on the Aurora 3 databaseKaisheng Yao, Donglai Zhu, Satoshi Nakamura. 457-460 [doi]

Distributed speech recognition over IP networks on the Aurora 3 databaseLaura Docío Fernández, Carmen García-Mateo. 461-464 [doi]

Evaluation of noisy speech recognition based on noise reduction and acoustic model adaptation on the Aurora2 tasksMasakiyo Fujimoto, Yasuo Ariki. 465-468 [doi]

Improvements to the IBM Aurora 2 multi-condition systemGeorge Saon, Juan M. Huerta. 469-472 [doi]

Distributed speech recognition using noise-robust MFCC and TRAPS-estimated manner featuresPratibha Jain, Hynek Hermansky, Brian Kingsbury. 473-476 [doi]

Evaluation of spectral subtraction with smoothing of time direction on the Aurora 2 taskNorihide Kitaoka, Seiichi Nakagawa. 477-480 [doi]

Evaluation of noise robust features on the Aurora databasesXiaodong Cui, Markus Iseli, Qifeng Zhu, Abeer Alwan. 481-484 [doi]

Computationally efficient noise compensation for robust automatic speech recognition assessed under the Aurora 2/3 frameworkNicholas W. D. Evans, John S. D. Mason. 485-488 [doi]

Likelihood combination and recognition output voting for the decoding of non-native speech with multilingual HMMsVolker Fischer, Eric Janke, Siegfried Kunzmann. 489-492 [doi]

Stochastic trajectory model analysis for accent classificationPongtep Angkititrakul, John H. L. Hansen. 493-496 [doi]

Multilingual pronunciation modeling for improving multilingual speech recognitionJilei Tian, Juha Häkkinen, Olli Viikki. 497-500 [doi]

On text-based language identification for multilingual speech recognition systemsJilei Tian, Juha Häkkinen, Søren Riis, Kåre Jean Jensen. 501-504 [doi]

Multilingual speech recognition with language identificationBin Ma, Cuntai Guan, Haizhou Li, Chin-Hui Lee. 505-508 [doi]

Robust HMM training for unified Dutch and German speech recognitionRathi Chengalvarayan. 509-512 [doi]

Using cross-language cues for story-specific language modelingSanjeev Khudanpur, Woosung Kim. 513-516 [doi]

Full-text story alignment models for Chinese-English bilingual news corporaBing Zhao, Stephan Vogel. 517-520 [doi]

Comparison of acoustic distance measures for automatic cross-language phoneme mappingJayren J. Sooful, Elizabeth C. Botha. 521-524 [doi]

Maximum expected likelihood based model selection and adaptation for nonnative English speakersXiaodong He, Yunxin Zhao. 525-528 [doi]

Integration of MLLR adaptation with pronunciation proficiency adaptation for non-native speech recognitionNobuaki Minematsu, Gakuto Kurata, Keikichi Hirose. 529-532 [doi]

Native and Vietnamese production of compound and phrasal stress patternsThu Nguyen, John Ingram. 533-536 [doi]

On the function of the late rise and the early fall in Dutch dialogue: a perception experimentJohanneke Caspers. 537-540 [doi]

Holds as gestural correlates to empty and filled speech pausesAnna Esposito, Susan Duncan, Francis K. H. Quek. 541-544 [doi]

Linguistic and acoustic changes of user's utterances caused by different dialogue situationsToshihiko Itoh, Atsuhiko Kai, Tatsuhiro Konishi, Yukihiro Itoh. 545-548 [doi]

Automatic user-adaptive speaking rate selection for information deliveryNigel Ward, Satoshi Nakagawa. 549-552 [doi]

Coordination of referring expressions in multimodal human-computer dialogueGabriel Skantze. 553-556 [doi]

A comparison between feedback strategies in human-to-human and human-machine communicationLoredana Cerrato. 557-560 [doi]

Adaptation of users' spoken dialogue patterns in a conversational interfaceCourtney Darves, Sharon L. Oviatt. 561-564 [doi]

Unsupervised speaker segmentation of telephone conversationsAaron E. Rosenberg, Allen L. Gorin, Zhu Liu, S. Parthasarathy. 565-568 [doi]

An effective unsupervised scheme for multiple-speaker-change detectionP. Sivakumaran, Aladdin M. Ariyaeeinia, J. Fortuna. 569-572 [doi]

Unknown-multiple speaker clustering using HMMJitendra Ajmera, Hervé Bourlard, I. Lapidot, Iain McCowan. 573-576 [doi]

Speaker utterances tying among speaker segmented audio documents using hierarchical classification: towards speaker indexing of audio databasesSylvain Meignier, Jean-François Bonastre, Ivan Magrin-Chagnolleau. 577-580 [doi]

A comparative study of adaptation methods for speaker verificationJohnny Mariéthoz, Samy Bengio. 581-584 [doi]

Speaker verification with data fusion and model adaptationKevin R. Farrell. 585-588 [doi]

An adaptive speaker verification system with speaker dependent a priori decision thresholdsNikki Mirghafori, Larry P. Heck. 589-592 [doi]

A trainable spoken language understanding system for visual object selectionDeb Roy, Peter Gorniak, Niloy Mukherjee, Joshua Juster. 593-596 [doi]

Named entity extraction from spontaneous speech in How May I Help You?Frédéric Béchet, Allen L. Gorin, Jerry H. Wright, Dilek Z. Hakkani-Tür. 597-600 [doi]

Recognition error processing for speech understandingCaroline Bousquet-Vernhettes, Nadine Vigouroux. 601-604 [doi]

Using part-of-speech tags, context thresholding, and trigram contexts to improve the auto-induction of semantic classesAndrew N. Pargellis, Eric Fosler-Lussier, Augustine Tsai. 605-608 [doi]

Combination of statistical and rule-based approaches for spoken language understandingYe-Yi Wang, Alex Acero, Ciprian Chelba, Brendan J. Frey, Leon Wong. 609-612 [doi]

Chinese spoken language analyzing based on combination of statistical and rule methodsGuodong Xie, Chengqing Zong, Bo Xu. 613-616 [doi]

A maximum entropy semantic parser using word classesNorbert Pfannerer. 617-620 [doi]

Speech watermarking through parametric modelingAparna Gurijala, John R. Deller Jr., Michael S. Seadle, John H. L. Hansen. 621-624 [doi]

An education software in teaching automatic speech recognition (ASR)Hong Kai Sze, Sh-Hussain Salleh. 625-628 [doi]

Multimodal integration patterns in childrenBenfang Xiao, Cynthia Girand, Sharon L. Oviatt. 629-632 [doi]

ASR in a human word recognition model: generating phonemic input for ShortlistOdette Scharenborg, Lou Boves, Johan de Veth. 633-636 [doi]

Sign language translation using an error tolerant retrieval algorithmChung-Hsien Wu, Yu-Hsien Chiu, Kung-Wei Cheng. 637-640 [doi]

A sound source classification system based on subband processingOytun Türk, Omer Sayli, Helin Dutagaci, Levent M. Arslan. 641-644 [doi]

Automatic sign translationYing Zhang, Bing Zhao, Jie Yang, Alex Waibel. 645-648 [doi]

A study on the classification of whispered and normally phonated speechStanley J. Wenndt, Edward J. Cupples, Richard M. Floyd. 649-652 [doi]

Experiments on recognition of lavalier microphone speech and whispered speech in real world environmentsKiyoshi Tatara, Taisuke Ito, Parham Zolfaghari, Kazuya Takeda, Fumitada Itakura. 653-656 [doi]

An effect of amplitude modulation on perceptual segregation of tone sequencesMamoru Iwaki, Hiromi Seki. 657-660 [doi]

Automatic recognition of Dutch dysarthric speech: a pilot studyEric Sanders, Marina B. Ruiter, Lilian Beijer, Helmer Strik. 661-664 [doi]

Evaluation of a system for concatenative articulatory visual speech synthesisOlov Engwall. 665-668 [doi]

Intrasyllabic articulatory control constraints in verbal working memoryMarc Sato, Jean-Luc Schwartz, Marie-Agnès Cathiard, Christian Abry, Hélène Loevenbruck. 669-672 [doi]

Towards a grammar of spoken language: incorporating paralinguistic informationNick Campbell. 673-676 [doi]

State clustering improvements for continuous HMMs in a Spanish large vocabulary recognition systemRicardo de Córdoba, Javier Macías Guarasa, Javier Ferreiros, Juan Manuel Montero, José Manuel Pardo. 677-680 [doi]

A comparison of HTK, ISIP and Julius in Slovenian large vocabulary continuous speech recognitionTomaz Rotovnik, Mirjam Sepesy Maucec, Bogomir Horvat, Zdravko Kacic. 681-684 [doi]

Parametric trajectory segment model for LVCSRLei Jia, Bo Xu. 685-688 [doi]

Efficient precalculation of LM contexts for large vocabulary continuous speech recognitionJavier Dieguez-Tirado, Antonio Cardenal López. 689-692 [doi]

Integrating multiple pronunciations during MCE-based acoustic model training for large vocabulary speech recognitionRathi Chengalvarayan. 693-696 [doi]

A hybrid approach to compounds in LVCSRTom Laureys, Vincent Vandeghinste, Jacques Duchateau. 697-700 [doi]

A confidence measure based on agreement among multiple LVCSR models - correlation between pair of acoustic models and confidenceTakehito Utsuro, Tetsuji Harada, Hiromitsu Nishizaki, Seiichi Nakagawa. 701-704 [doi]

Combining lexical and morphological knowledge in language model for inflectional (Czech) languageJan Nouza, Jindra Drabkova. 705-708 [doi]

Modeling frequent allophones in Japanese speech recognitionLong Nguyen, Xuefeng Guo, John Makhoul. 709-712 [doi]

The structure and its implementation of hidden dynamic HMM for Mandarin speech recognitionFeili Chen, Jie Zhu, Wentao Song. 713-716 [doi]

A new lexicon optimization method for LVCSR based on linguistic and acoustic characteristics of wordsTakahiro Shinozaki, Sadaoki Furui. 717-720 [doi]

Retrieving phrases by selecting the history: application to automatic speech recognitionDavid Langlois, Kamel Smaïli, Jean-Paul Haton. 721-724 [doi]

Compact subnetwork-based large vocabulary continuous speech recognitionDong-Hoon Ahn, Minhwa Chung. 725-728 [doi]

A comparison of four language models for large vocabulary Turkish speech recognitionHelin Dutagaci, Levent M. Arslan. 729-732 [doi]

Speech recognition for language teaching and evaluating: a study of existing commercial productsRebecca Hincks. 733-736 [doi]

Automatic intelligibility assessment and diagnosis of critical pronunciation errors for computer-assisted pronunciation learningAntoine Raux, Tatsuya Kawahara. 737-740 [doi]

Effects of production training with visual feedback on the acquisition of Japanese pitch and durational contrastsYukari Hirata. 741-744 [doi]

Acoustic modeling of sentence stress using differential features between syllables for English rhythm learning system developmentNobuaki Minematsu, Satoshi Kobashikawa, Keikichi Hirose, Donna Erickson. 745-748 [doi]

Modeling and automatic detection of English sentence stress for computer-assisted English prosody learning systemKazunori Imoto, Yasushi Tsubota, Antoine Raux, Tatsuya Kawahara, Masatake Dantsuji. 749-752 [doi]

Perception of tone and vowel quantity in ThaiHansjörg Mixdorff, Sudaporn Luksaneeyanawin, Hiroya Fujisaki, Patavee Charnvivit. 753-756 [doi]

Duration and F0 as perceptual cues to Japanese vowel quantityKeisuke Kinoshita, Dawn M. Behne, Takayuki Arai. 757-760 [doi]

Effects of intra-phrase position on acceptability of changes in segmental duration in sentence speechMakiko Muto, Hiroaki Kato, Minoru Tsuzaki, Yoshinori Sagisaka. 761-764 [doi]

Processing of temporal cues marking phrasal boundaries in individuals with brain damageWendi A. Aasland, Shari R. Baum. 769-772 [doi]

A real-time acoustic human-machine front-end for multimedia applications integrating robust adaptive beamforming and stereophonic acoustic echo cancellationWolfgang Herbordt, J. Ying, Herbert Buchner, Walter Kellermann. 773-776 [doi]

Enhancement of single channel speech using perception-based wavelet transformChing-Ta Lu, Hsiao-Chuan Wang. 777-780 [doi]

Speech enhancement based on a perceptual modification of Wiener filteringL. Lin, W. H. Holmes, Eliathamby Ambikairajah. 781-784 [doi]

A new approach to speech enhancement by a microphone array using EM and mixture modelsHagai Attias, Li Deng. 785-788 [doi]

Acoustic echo cancellation based on m-channel IIR cosine-modulated filter bankSang-Gyun Kim, Chang D. Yoo. 789-792 [doi]

High performance digit recognition in real car environmentsUmit H. Yapanel, Xianxian Zhang, John H. L. Hansen. 793-796 [doi]

Multiple regression of log-spectra for in-car speech recognitionTetsuya Shinde, Kazuya Takeda, Fumitada Itakura. 797-800 [doi]

Experiments on speaker-independent voice command recognition using in-vehicle hands free speechYifan Gong, Lorin Netsch. 801-804 [doi]

Application of over-complete blind source separation for robust automatic speech recognitionShubha Kadambe. 805-808 [doi]

Porting channel robustness across languagesFrançoise Beaufays, Daniel Boies, Mitch Weintraub. 809-812 [doi]

An efficient dialogue control method using decision tree-based estimation of out-of-vocabulary word attributesYasuhiro Takahashi, Kohji Dohsaka, Kiyoaki Aikawa. 813-816 [doi]

Semantic inference: a data-driven solution for NL interactionJerome R. Bellegarda. 817-820 [doi]

Unified task knowledge for spoken language understanding and dialog managementJerry H. Wright, Alicia Abella, Allen L. Gorin. 821-824 [doi]

Distributed Chinese keyword spotting and verification for spoken dialogues under wireless environmentYun-Tien Lee, Cheng-Huang Wu, Yumin Lee, Lin-Shan Lee. 825-828 [doi]

A method for evaluating incremental utterance understanding in spoken dialogue systemsRyuichiro Higashinaka, Noboru Miyazaki, Mikio Nakano, Kiyoaki Aikawa. 829-832 [doi]

Detection and recognition of repaired speech on misrecognized utterances for speech input of car navigation systemNaoko Kakutani, Norihide Kitaoka, Seiichi Nakagawa. 833-836 [doi]

Ingressive speech as an indication that humans are talking to humans (and not to machines)Robert Eklund. 837-840 [doi]

Compensating for hyperarticulation by modeling articulatory propertiesHagen Soltau, Florian Metze, Alex Waibel. 841-844 [doi]

Forms of introduction in map task dialogues: case of L2 Russian speakersOlga Goubanova. 845-848 [doi]

Bridges: regions between discourse segmentsNanette Veilleux. 849-852 [doi]

Robust semantic confidence scoringDidier Guillevic, Simona Gandrabur, Yves Normandin. 853-856 [doi]

Statistically based approach to rejection of incorrectly recognized wordsLudek Müller, Tomás Bartos. 857-860 [doi]

Learning decision trees to determine turn-taking by spoken dialogue systemsRyo Sato, Ryuichiro Higashinaka, Masafumi Tamoto, Mikio Nakano, Kiyoaki Aikawa. 861-864 [doi]

Integration of phonetic length properties in the acoustic models of false starts and out-of-vocabulary wordsH. Hamimed, Géraldine Damnati. 865-868 [doi]

N-word-sequence frequency noise mitigation for SLM based on binomial distributionYibao Zhao, Guojun Zhou. 869-872 [doi]

Combining acoustic and language information for emotion recognitionChul-Min Lee, Shrikanth S. Narayanan, Roberto Pieraccini. 873-876 [doi]

A figure of merit for the analysis of spoken dialog systemsKadri Hacioglu, Wayne Ward. 877-880 [doi]

Selective back-off smoothing for incorporating grammatical constraints into the n-gram language modelTomoyosi Akiba, Katunobu Itou, Atsushi Fujii, Tetsuya Ishikawa. 881-884 [doi]

Backoff hierarchical class n-gram language modelling for automatic speech recognition systemsImed Zitouni, Olivier Siohan, Hong-Kwang Jeff Kuo, Chin-Hui Lee. 885-888 [doi]

Constructing small language models from grammarsFrancis Picard, Dominique Boucher, Guy Lapalme. 889-892 [doi]

Improve latent semantic analysis based language model by integrating multiple level knowledgeRong Zhang, Alexander I. Rudnicky. 893-896 [doi]

Individual word language models and the frequency approachElvira I. Sicilia-Garcia, Ji Ming, F. Jack Smith. 897-900 [doi]

Efficient construction of long-range language models using log-linear interpolationEdward W. D. Whittaker, Dietrich Klakow. 905-908 [doi]

Integration of two stochastic context-free grammarsAnna Corazza. 909-912 [doi]

Grammar specialisation meets language modellingManny Rayner, Beth Ann Hockey, John Dowding. 913-916 [doi]

Maximum entropy model for punctuation annotation from speechJing Huang, Geoffrey Zweig. 917-920 [doi]

An automatic sentence boundary detector based on a structured language modelShinsuke Mori. 921-924 [doi]

Improved Katz smoothing for language modeling in speech recognitionGenqing Wu, Fang Zheng, Wenhu Wu, Mingxing Xu, Ling Jin. 925-928 [doi]

On the use of structures in language models for dialogueRenato de Mori, Yannick Estève, Christian Raymond. 929-932 [doi]

Semantic structured language modelsHakan Erdogan, Ruhi Sarikaya, Yuqing Gao, Michael Picheny. 933-936 [doi]

Statistical language modeling with prosodic boundaries and its use for continuous speech recognitionKeikichi Hirose, Nobuaki Minematsu, Makoto Terao. 937-940 [doi]

Noise robust speech recognition using F0 contour extracted by Hough transformKoji Iwano, Takahiro Seki, Sadaoki Furui. 941-944 [doi]

Sharing relative stress of cross-word syllables and lexical stress to spontaneous speech recognitionFarshad Almasganj, Farhad D. Dehnavi, Mahmood Bijankhan. 945-948 [doi]

Automatic punctuation and disfluency detection in multi-party meetings using prosodic and lexical cuesDon Baron, Elizabeth Shriberg, Andreas Stolcke. 949-952 [doi]

Pitch accent prediction using ensemble machine learningXuejing Sun. 953-956 [doi]

Kymographic imaging of the vocal fold oscillationsJan G. Svec, Frantisek Sram. 957-960 [doi]

Assessment of consonant articulation in glossectomee speech by dynamic MRIKatalin Mády, Robert Sader, A. Zimmermann, P. Hoole, A. Beer, Hans-Florian Zeilhofer, Ch. Hannig. 961-964 [doi]

An EPG therapy protocol for remediation and assessment of articulation disordersAlan Wrench, Fiona Gibbon, Alison M. McNeill, Sara Wood. 965-968 [doi]

How speakers with and without speech impairment mark the question statement contrastRupal Patel. 969-972 [doi]

Vowel classification for computer-based visual feedback for speech training for the hearing impairedStephen A. Zahorian, A. Matthew Zimmer, Fansheng Meng. 973-976 [doi]

All-pole modeling of wide-band speech using weighted sum of the LSP polynomialsPaavo Alku, Tomas Bäckström. 977-980 [doi]

Analysis and synthesis of the phonatory excitation signal by means of a pair of polynomial shaping functionsJean Schoentgen. 981-984 [doi]

Optimal speech signal partition into one-quasiperiodical segmentsTaras K. Vintsiuk. 985-988 [doi]

Sparse and independent representations of speech signals based on parametric modelsHugo Leonardo Rufiner, Luis F. Rocha, John Goddard Close. 989-992 [doi]

Improvement of the ELS-based time-varying complex speech analysisKeiichi Funaki. 993-996 [doi]

Maximum mutual information training of hidden Markov models with vector linear predictorsK. K. Chin, Philip C. Woodland. 997-1000 [doi]

A sparse modeling approach to speech recognition based on relevance vector machinesJ. E. Hamaker, Joseph Picone, Aravind Ganapathiraju. 1001-1004 [doi]

Mutual information phone clustering for decision tree inductionCiprian Chelba, Rachel Morton. 1005-1008 [doi]

Rethinking derived acoustic features in speech recognitionKevin S. Van Horn. 1009-1012 [doi]

Modeling HMM state distributions with Bayesian networksKonstantin Markov, Satoshi Nakamura. 1013-1016 [doi]

Mel-scaled wavelet filter based features for noisy unvoiced phoneme recognitionOmar Farooq, S. Datta. 1017-1020 [doi]

Filter bank subtraction for robust speech recognitionKazuo Onoe, Hiroyuki Segi, Takeshi Kobayakawa, Shoei Sato, Toru Imai, Akio Ando. 1021-1024 [doi]

Low cost duration modelling for noise robust speech recognitionAndrew C. Morris, Simon Payne, Hervé Bourlard. 1025-1028 [doi]

A comparative study of approximations for parallel model combination of static and dynamic parametersYifan Gong. 1029-1032 [doi]

Noise estimation for efficient speech enhancement and robust speech recognitionPetr Motlícek, Lukás Burget. 1033-1036 [doi]

The 2001 GMTK-based SPINE ASR systemÖzgür Çetin, Harriet J. Nock, Katrin Kirchhoff, Jeff A. Bilmes, Mari Ostendorf. 1037-1040 [doi]

Using adaptive signal limiter together with weighting techniques for noisy speech recognitionWei-Wen Hung. 1041-1044 [doi]

Spectral subtraction in noisy environments applied to speaker adaptation based on HMM sufficient statisticsShingo Yamade, Kanako Matsunami, Akira Baba, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano. 1045-1048 [doi]

Robust speech recognition against short-time noiseMan-Hung Siu, Yu-Chung Chan. 1049-1052 [doi]

Word endpoints detection in the presence of non-stationary noiseMario Toma, Andrea Lodi, Roberto Guerrieri. 1053-1056 [doi]

Comparison and combination of RASTA-PLP and FF features in a hybrid HMM/MLP speech recognition systemPere Pujol Marsal, Susagna Pol, Astrid Hagen, Hervé Bourlard, Climent Nadeu. 1057-1060 [doi]

Robust MMSE-FW-LAASR scheme at low SNRsTao Xu, Zhigang Cao. 1061-1064 [doi]

Robust speech recognition using a voiced-unvoiced featureAndrás Zolnay, Ralf Schlüter, Hermann Ney. 1065-1068 [doi]

Accumulated Kullback divergence for analysis of ASR performance in the presence of noiseFebe de Wet, Johan de Veth, Bert Cranen, Lou Boves. 1069-1072 [doi]

A hybrid HMM/TRAPS model for robust voice activity detectionBrian Kingsbury, Pratibha Jain, André Gustavo Adami. 1073-1076 [doi]

Run time information fusion in speech recognitionChengyi Zheng, YongHong Yan. 1077-1080 [doi]

Laryngoscopic analysis of Tibetan chanting modes and their relationship to register in Sino-TibetanJohn H. Esling. 1081-1084 [doi]

A corpus-based study of Danish laryngealizationKathleen Murray, Betina Simonsen. 1085-1088 [doi]

Variability in direction of dorsal movement during production of /l/Natasha Warner, Allard Jongman, Doris Mücke. 1089-1092 [doi]

Segmentation of glides with tonal alignment as referenceYi Xu, Fang Liu. 1093-1096 [doi]

Variability in the production of glottalized sonorants: data from YapeseIan Maddieson, Julie Larson. 1097-1100 [doi]

A phonetic study of Vietnamese tones: acoustic and electroglottographic measurementsVu Ngoc Tuan, Christophe d'Alessandro, Sophie Rosset. 1101-1104 [doi]

Segment duration in spoken KoreanHyunsong Chung. 1105-1108 [doi]

Pause duration and variability in read textsElena Zvonik, Fred Cummins. 1109-1112 [doi]

Intrinsic phone durations are speaker-specificHartmut R. Pfitzinger. 1113-1116 [doi]

Preaspirated stops in southern SwedishMechtild Tronnier. 1117-1120 [doi]

Stop epenthesis at syllable boundariesNatasha Warner, Andrea Weber. 1121-1124 [doi]

An analysis of transcription consistency in spontaneous speech from the Buckeye corpusWilliam D. Raymond, Mark A. Pitt, Keith Johnson, Elizabeth Hume, Matthew Makashay, Robin Dautricourt, Craig Hilts. 1125-1128 [doi]

Contextual effects on voicing judgment of stop consonants in JapaneseMakiko Aoyagi. 1129-1132 [doi]

Discrimination of English vowels in consonantal contexts by native speakers of Japanese and its relations to dynamic information of formantsAkiyo Joto, Motohisa Imaishi, Yoshiki Nagase, Seiya Funatsu. 1133-1136 [doi]

Improving spoken language understanding using word confusion networksGökhan Tür, Jerry H. Wright, Allen L. Gorin, Giuseppe Riccardi, Dilek Z. Hakkani-Tür. 1137-1140 [doi]

Improving latent semantic indexing based classifier with information gainLi Li, Wu Chou. 1141-1144 [doi]

Discriminative training for call classification and routingHong-Kwang Jeff Kuo, Chin-Hui Lee, Imed Zitouni, Eric Fosler-Lussier, Egbert Ammicht. 1145-1148 [doi]

Speech and language processing for a constrained speech translation systemStephen Cox. 1149-1152 [doi]

Automatic concept identification in goal-oriented conversationsAnanlada Chotimongkol, Alexander I. Rudnicky. 1153-1156 [doi]

Using EM-trained string-edit distances for approximate matching of acoustic morphemesMichael Levit, Elmar Nöth, Allen L. Gorin. 1157-1160 [doi]

Speech-enabled natural language call routing: BBN call directorPremkumar Natarajan, Rohit Prasad, Bernhard Suhm, Daniel McCarthy. 1161-1164 [doi]

Quantitative evaluation of relevant prosodic factors for text-to-speech synthesis in SpanishDavid Escudero Mancebo, César González Ferreras, Valentín Cardeñoso-Payo. 1165-1168 [doi]

Tone recognition in Thai continuous speech based on coarticulation, intonation and stress effectsNuttakorn Thubthong, Boonserm Kijsirikul, Sudaporn Luksaneeyanawin. 1169-1172 [doi]

Combination of pause and F0 information in dependency analysis of Japanese sentencesKazuyuki Takagi, Hajime Kubota, Kazuhiko Ozeki. 1173-1176 [doi]

Estimating syntactic structure from F0 contour and pause duration in Japanese speechYasuo Horiuchi, Tomoko Ohsuga, Akira Ichikawa. 1177-1180 [doi]

Extraction of important sentences using F0 information for speech summarizationYoichi Yamashita, Akira Inoue. 1181-1184 [doi]

Influence of prosody, context, and word order in the identification of focus in Japanese dialogueTatsuya Kitamura, Kayo Itoh, Toshihiko Itoh, Shigeyoshi Kitazawa. 1185-1188 [doi]

Influence of different dialogue situations on user's behavior in spoken correctionsAtsuhiko Kai, Yukari Nonomura, Toshihiko Itoh, Tatsuhiro Konishi, Yukihiro Itoh. 1189-1192 [doi]

Interpreting meaning from context: modeling the prosody of discourse markers in speechLi-chiung Yang. 1193-1196 [doi]

Prosodic parameter for speaker identificationKatarina Bartkova, David Le Gac, Delphine Charlet, Denis Jouvet. 1197-1200 [doi]

Juncture segmentation of Japanese prosodic unit based on the spectrographic featuresShigeyoshi Kitazawa, Toshihiko Itoh, Tatsuya Kitamura. 1201-1204 [doi]

Recognition and verification of English by Japanese students for computer-assisted language learning systemYasushi Tsubota, Tatsuya Kawahara, Masatake Dantsuji. 1205-1208 [doi]

Feedback in computer assisted pronunciation training: technology push or demand pull?Ambra Neri, Catia Cucchiarini, Helmer Strik. 1209-1212 [doi]

Corpus-based analysis of English spoken by Japanese students in view of the entire phonemic system of EnglishNobuaki Minematsu, Gakuto Kurata, Keikichi Hirose. 1213-1216 [doi]

Computer-assisted second-language speech learning: generalization of prosody-focused trainingDebra M. Hardison. 1217-1220 [doi]

Predicting oral reading miscuesJack Mostow, Joseph Beck, S. Vanessa Winter, Shaojun Wang, Brian Tobin. 1221-1224 [doi]

Implementation of an intonational quality assessment systemChanwoo Kim, Wonyong Sung. 1225-1228 [doi]

English CALL system with functions of speech segmentation and pronunciation evaluation using speech recognition technologyYasuo Ariki, Jun Ogata. 1229-1232 [doi]

Weighted graph based decision tree optimization for high accuracy acoustic modelingSheng Gao, Jin-Song Zhang, Satoshi Nakamura, Chin-Hui Lee, Tat-Seng Chua. 1233-1236 [doi]

Speech recognition using syllable patternsLi Zhang, William H. Edmondson. 1237-1240 [doi]

A comparison of L1 and African-mother-tongue acoustic models for South African English speech recognitionJanus D. Brink, Elizabeth C. Botha. 1241-1244 [doi]

Speech modeling using variational Bayesian mixture of GaussiansPanu Somervuo. 1245-1248 [doi]

On the use of Gaussian mixture model for speaker variability analysisTao Chen, Chao Huang, Eric Chang, Jingchun Wang. 1249-1252 [doi]

Models of speech dynamics in a segmental-HMM recognizer using intermediate linear representationsPhilip J. B. Jackson, Martin J. Russell. 1253-1256 [doi]

Decision tree distribution tying based on a dimensional split techniqueHeiga Zen, Keiichi Tokuda, Tadashi Kitamura. 1257-1260 [doi]

Speech synthesis, speech simulation and speech scienceMark Huckvale. 1261-1264 [doi]

Expressive speech synthesis using a concatenative synthesizerMurtaza Bulut, Shrikanth S. Narayanan, Ann K. Syrdal. 1265-1268 [doi]

Eigenvoices for HMM-based speech synthesisKengo Shichiri, Atsushi Sawabe, Takayoshi Yoshimura, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura. 1269-1272 [doi]

Combining information sources for memory-based pitch accent placementErwin Marsi, Bertjan Busser, Walter Daelemans, Véronique Hoste, Martin Reynaert, Antal van den Bosch. 1273-1276 [doi]

Eye-fixation as a measure of real-time processing of synthesized wordsMary D. Swift, Ellen Campana, James F. Allen, Michael K. Tanenhaus. 1277-1280 [doi]

User-tailored generation for spoken dialogue: an experimentAmanda Stent, Marilyn A. Walker, Steve Whittaker, Preetam Maloor. 1281-1284 [doi]

A system that learns to describe objects in visual scenesDeb Roy. 1285-1288 [doi]

Integration of supra-lexical linguistic models with speech recognition using shallow parsing and finite state transducersXiaolong Mou, Stephanie Seneff, Victor Zue. 1289-1292 [doi]

EM training of finite-state transducers and its application to pronunciation modelingHan Shu, I. Lee Hetherington. 1293-1296 [doi]

Finite-state transducer based Hungarian LVCSR with explicit modeling of phonological changesMate Szarvas, Sadaoki Furui. 1297-1300 [doi]

Using dynamic WFST composition for recognizing broadcast newsDiamantino Caseiro, Isabel Trancoso. 1301-1304 [doi]

Transducer search space modelings for large-vocabulary speech recognitionHans J. G. A. Dolfing. 1305-1308 [doi]

A comparison of two LVR search optimization techniquesStephan Kanthak, Hermann Ney, Michael Riley, Mehryar Mohri. 1309-1312 [doi]

An efficient algorithm for the n-best-strings problemMehryar Mohri, Michael Riley. 1313-1316 [doi]

Structural Gaussian mixture models for efficient text-independent speaker verificationBing Xiang, Toby Berger. 1317-1320 [doi]

Text-dependent speaker verification using Lyapunov exponentsAdriano Petry, Dante Augusto Couto Barone. 1321-1324 [doi]

User-customized password speaker verification based on HMM/ANN and GMM modelsMohamed Faouzi BenZeghiba, Hervé Bourlard. 1325-1328 [doi]

Exploiting support vector machines in hidden Markov models for speaker verificationDong Xin, Zhaohui Wu, Yingchun Yang. 1329-1332 [doi]

Speaker identification by location in an optimal space of anchor modelsYassine Mami, Delphine Charlet. 1333-1336 [doi]

ASR dependent techniques for speaker identificationAlex Park, Timothy J. Hazen. 1337-1340 [doi]

Factor analyzed Gaussian mixture models for speaker identificationPeng Ding, Yang Liu, Bo Xu. 1341-1344 [doi]

Phonetic speaker identificationQin Jin, Tanja Schultz, Alex Waibel. 1345-1348 [doi]

DETAC: a discriminative criterion for speaker verificationJiri Navratil, Ganesh N. Ramaswamy. 1349-1352 [doi]

Hierarchical Gaussian mixture model for speaker verificationMing Liu, Eric Chang, Bei-qian Dai. 1353-1356 [doi]

A reverse Turing test using speechGreg Kochanski, Daniel P. Lopresti, Chilin Shih. 1357-1360 [doi]

On effective speaker verification based on subword modelSungjoo Ahn, Sunmee Kang, Hanseok Ko. 1361-1364 [doi]

Speaker verification using Gaussian component strings in dynamic trajectory spaceBing Xiang. 1365-1368 [doi]

Combining speaker and speech recognition systemsLarry P. Heck, Dominique Genoud. 1369-1372 [doi]

Automatic enrollment for speaker authenticationQi Li, Hui Jiang, Qiru Zhou, Jinsong Zheng. 1373-1376 [doi]

Experiments in confidence scoring for word and sentence verificationM. Andorno, Pietro Laface, Roberto Gemello. 1377-1380 [doi]

Confidence metrics for speaker identificationMark C. Huggins, John J. Grieco. 1381-1384 [doi]

Characteristics of a low reject mode speaker verification systemDaniel Elenius, Mats Blomberg. 1385-1388 [doi]

Implementing vocal tract length normalization in the MLLR frameworkGuo-Hong Ding, Yi-fei Zhu, Chengrong Li, Bo Xu. 1389-1392 [doi]

Markov models based on speaker space model evolutionDong Kook Kim, Nam Soo Kim. 1393-1396 [doi]

Robust speech recognition using inter-speaker and intra-speaker adaptationBaojie Li, Keikichi Hirose, Nobuaki Minematsu. 1397-1400 [doi]

Continuous environmental adaptation of a speech recogniser in telephone line conditionsCarlos Lima, Luís B. Almeida, João L. Monteiro. 1401-1404 [doi]

Tree-structured maximum a posteriori adaptation for a segment-based speech recognition systemIrina Illina. 1405-1408 [doi]

Robust time-synchronous environmental adaptation for continuous speech recognition systemsThomas Plötz, Gernot A. Fink. 1409-1412 [doi]

Unsupervised language model adaptation for lecture speech transcriptionThomas Niesler, Daniel Willett. 1413-1416 [doi]

Incremental on-line feature space MLLR adaptation for telephony speech recognitionYongxin Li, Hakan Erdogan, Yuqing Gao, Etienne Marcheret. 1417-1420 [doi]

Enhanced histogram normalization in the acoustic feature spaceSirko Molau, Florian Hilger, Daniel Keysers, Hermann Ney. 1421-1424 [doi]

Blind normalization of speech from different channels and speakersDavid N. Levin. 1425-1428 [doi]

Unsupervised acoustic model adaptation based on phoneme error minimizationJun Ogata, Yasuo Ariki. 1429-1432 [doi]

Improved structural maximum likelihood eigenspace mapping for rapid speaker adaptationBowen Zhou, John H. L. Hansen. 1433-1436 [doi]

Statistical adaptation of acoustic models to noise conditions for robust speech recognitionÁngel de la Torre, Dominique Fohr, Jean-Paul Haton. 1437-1440 [doi]

Issues in automatic transcription of historical audio dataFabio Brugnara, Mauro Cettolo, Marcello Federico, Diego Giuliani. 1441-1444 [doi]

Special session: issues in audiovisual spoken language processing (when, where, and how?)Lynne E. Bernstein, Denis Burnham, Jean-Luc Schwartz. 1445-1448 [doi]

Audio-visual speech enhancement with AVCDCN (audio-visual codebook dependent cepstral normalization)Sabine Deligne, Gerasimos Potamianos, Chalapathy Neti. 1449-1452 [doi]

Audiovisual speech synthesis: from ground truth to modelsGérard Bailly. 1453-1456 [doi]

The stimulus as basis for audiovisual integrationEric Vatikiotis-Bateson, Harold Hill, Miyuki Kamachi, Karen Lander, Kevin G. Munhall. 1457-1460 [doi]

The perceptual basis for audiovisual speech integrationLawrence D. Rosenblum. 1461-1464 [doi]

Sources of variability in the perceptual training of /r/ and /l/: interaction of adjacent vowel, word position, talkers' visual and acoustic cuesDebra M. Hardison. 1465-1468 [doi]

Can confidence scores help users post-editing speech recognizer output?Taku Endo, Nigel Ward, Minoru Terada. 1469-1472 [doi]

Information retrieval based on speech recognition resultsMasatoshi Watanabe, Masahide Sugiyama. 1473-1476 [doi]

Efficient combination of type-in and Wizard-of-Oz tests in speech interface development processSaija-Maaria Lemmelä, Péter Pál Boda. 1477-1480 [doi]

Probabilistic retrieval based on document representationsWolfgang Macherey, Hans Jörg Viechtbauer, Hermann Ney. 1481-1484 [doi]

Radiodoc: a voice-accessible document systemTakuya Nishimoto, Masahiro Araki, Yasuhisa Niimi. 1485-1488 [doi]

Speech completion: on-demand completion assistance using filled pauses for speech input interfacesMasataka Goto, Katunobu Itou, Satoru Hayamizu. 1489-1492 [doi]

Design of system-initiated digressive proposals for automated banking dialoguesJenny Wilkie, Mervyn A. Jack, Peter J. Littlewood. 1493-1496 [doi]

Towards every-citizen's speech interface: an application generator for speech interfaces to databasesArthur R. Toth, Thomas K. Harris, James Sanders, Stefanie Shriver, Roni Rosenfeld. 1497-1500 [doi]

Training topic classifiers for conversational speech with limited dataRukmini Iyer, Jeffrey Ma, Herbert Gish, Owen Kimball. 1501-1504 [doi]

Comparing isolately spoken keywords with spontaneously spoken queries for Japanese spoken document retrievalHiromitsu Nishizaki, Seiichi Nakagawa. 1505-1508 [doi]

Choosing speech or touchtone modality for navigation within a telephony natural language systemJennifer C. Lai, Kwan Min Lee. 1509-1512 [doi]

Multi-scale and multi-model integration for improved performance in Chinese spoken document retrievalWai Kit Lo, Helen M. Meng, P. C. Ching. 1513-1516 [doi]

Development of a GUI-based articulatory speech synthesis systemKohichi Ogata, Yorinobu Sonoda. 1517-1520 [doi]

Investigation of coarticulation based on electromagnetic articulographic dataJianwu Dang, Masaaki Honda, Kiyoshi Honda. 1521-1524 [doi]

Frequency dependence of vocal-tract lengthTakuya Niikawa, Takanori Ando, Masafumi Matsumura. 1525-1528 [doi]

Functional modeling of face movements during speechShinji Maeda, Martine Toda, Andreas J. Carlen, Lyes Meftahi. 1529-1532 [doi]

Control system for talking robot to replicate articulatory movement of natural speechTakemi Mochida, Masaaki Honda, Kouki Hayashi, Toshiharu Kuwae, Kunihiro Tanahashi, Kazufumi Nishikawa, Atsuo Takanishi. 1533-1536 [doi]

Feed the tiger: a method for evoking reliable jaw stretch reflexes in childrenDonald S. Finan, Anne Smith, Michael Ho. 1537-1540 [doi]

Using time-stretched pulses for accurate splitting of speech utterances played back in noisy reverberant environmentsDorothea Kolossa, Qiang Huo. 1541-1544 [doi]

X-JToBI: an extended J-ToBI for spontaneous speechKikuo Maekawa, Hideaki Kikuchi, Yosuke Igarashi, Jennifer J. Venditti. 1545-1548 [doi]

Dutch HLT resources: from BLARK to priority listsHelmer Strik, Walter Daelemans, Diana Binnenpoorte, Janienke Sturm, Folkert de Vriend, Catia Cucchiarini. 1549-1552 [doi]

ACT: a graphical dialogue annotation comparison toolFan Yang, Susan E. Strayer, Peter A. Heeman. 1553-1556 [doi]

A training prompts generation algorithm for connected spoken word recognitionHa-Jin Yu, Jin Suk Kim. 1557-1560 [doi]

Using observation uncertainty in HMM decodingJon A. Arrowood, Mark A. Clements. 1561-1564 [doi]

Combining a Gaussian mixture model front end with MFCC parametersMatthew N. Stuttle, M. J. F. Gales. 1565-1568 [doi]

Noise from corrupted speech log mel-spectral energiesJasha Droppo, Alex Acero, Li Deng. 1569-1572 [doi]

Improving the role of unvoiced speech segments by spectral normalisation in robust speech recognitionCarlos Lima, Luís B. Almeida, João L. Monteiro. 1573-1576 [doi]

Building an ASR system for noisy environments: SRI's 2001 SPINE evaluation systemVenkata Ramana Rao Gadde, Andreas Stolcke, Dimitra Vergyri, Jing Zheng, M. Kemal Sönmez, Anand Venkataraman. 1577-1580 [doi]

A low-resource, miniature implementation of the ETSI distributed speech recognition front-endEtienne Cornu, Hamid Sheikhzadeh, Robert L. Brennan. 1581-1584 [doi]

Memory space reduction for hidden Markov models in low-resource speech recognition systemsSergey Astrov. 1585-1588 [doi]

Low complexity Mandarin speaker-independent isolated word recognitionXia Wang, Juha Iso-Sipilä. 1589-1592 [doi]

Low complexity techniques for embedded ASR systemsImre Kiss, Marcel Vasilache. 1593-1596 [doi]

Optimization of hidden Markov models for embedded systemsKlaus Reinhard, Jochen Junkawitsch, Andreas Kießling, Stefan Dobler. 1597-1600 [doi]

Data-driven vector clustering for low-memory footprint ASRKarim Filali, Xiao Li, Jeff A. Bilmes. 1601-1604 [doi]

Utterance verification based on neighborhood information and Bayes factorsHui Jiang, Chin-Hui Lee. 1605-1608 [doi]

Vocabulary independent OOV detection using support vector machinesTommi Lahti, Janne Suontausta. 1609-1612 [doi]

A multi-class approach for modelling out-of-vocabulary wordsIssam Bazzi, James R. Glass. 1613-1616 [doi]

Unconstrained versus constrained acoustic normalisation in confidence scoringJacques Duchateau, Patrick Wambacq. 1617-1620 [doi]

Acoustic and word lattice based algorithms for confidence scoresDaniele Falavigna, Roberto Gretter, Giuseppe Riccardi. 1621-1624 [doi]

Error-tolerant spoken language understanding with confidence measuringHuei-Ming Wang, Yi-Chung Lin. 1625-1628 [doi]

Comparing intelligibility of several non-native accent classes in noiseShawn A. Weil. 1629-1632 [doi]

Effect of F0 fluctuation and amplitude modulation of natural vowels on vowel identification in noisy environmentsKentaro Ishizuka, Kiyoaki Aikawa. 1633-1636 [doi]

Similarities of words in noise in JapaneseKiyoko Yoneyama. 1637-1640 [doi]

The effects of F0 manipulation on the perceived distance of speechDouglas Brungart, Alexander J. Kordik, Koel Das, Arnab K. Shaw. 1641-1644 [doi]

Time-compressing natural and synthetic speechEsther Janse. 1645-1648 [doi]

Accounting for perceptual identification of consonants and vowels through acoustic dissimilarityJianxia Xue, Sumiko Takayanagi, Lynne E. Bernstein. 1649-1652 [doi]

Modeling recognition of speech sounds with MINERVA2Travis Wade, Deborah K. Eakin, Russell Webb, Arvin Agah, Frank Brown, Allard Jongman, John Gauch, Thomas A. Schreiber, Joan Sereno. 1653-1656 [doi]

Syllable processing in EnglishRuth Kearns, Dennis Norris, Anne Cutler. 1657-1660 [doi]

Perceptual effects of assimilation-induced violation of final devoicing in DutchCecile T. L. Kuijpers, Wilma van Donselaar, Anne Cutler. 1661-1664 [doi]

Access to homophonic meanings during spoken language comprehension: effects of context and neighborhood densityMichael C. W. Yip. 1665-1668 [doi]

Intelligibility of reverse speech in French: a perceptual studyIvan Magrin-Chagnolleau, Melissa Barkat, Fanny Meunier. 1669-1672 [doi]

Contextual effects in the perception of fricative place of articulation: a rotational hypothesisWilly Serniclaes, René Carré. 1673-1676 [doi]

What relationship between protrusion anticipation and auditory perception?Rudolph Sock, Béatrice Vaxelaire, Véronique Hecker, Fabrice Hirsch. 1677-1680 [doi]

On the role of the schwa in the perception of plosive consonantsRené Carré, Jean-Sylvain Liénard, Egidio Marsico, Willy Serniclaes. 1681-1684 [doi]

Audiovisual perception in L2 learnersValérie Hazan, Anke Sennema, Andrew Faulkner. 1685-1688 [doi]

Audiovisual integration of speech by children and adults with cochlear implantsKaren Iler Kirk, David B. Pisoni, Lorin Lachs. 1689-1692 [doi]

Auditory-visual speech perception examined by brain imaging and reaction timeKaoru Sekiyama, Yoichi Sugita. 1693-1696 [doi]

Neurocognitive basis for audiovisual speech perception: evidence from event-related potentialsCurtis W. Ponton, Edward T. Auer, Lynne E. Bernstein. 1697-1700 [doi]

Perception and integration of audiovisual speech in human infantsDavid J. Lewkowicz. 1701-1704 [doi]

Design for a speech-to-speech translator for field useDavid Stallard, Premkumar Natarajan, Mohammed Noamany, Richard M. Schwartz, John Makhoul. 1705-1708 [doi]

Rapid development of speech-to-speech translation systemsAlan W. Black, Ralf D. Brown, Robert E. Frederking, Kevin A. Lenzo, John Moody, Alexander I. Rudnicky, Rita Singh, Eric Steinbrecher. 1709-1712 [doi]

Bilingual corpus cleaning focusing on translation literalityKenji Imamura, Eiichiro Sumita. 1713-1716 [doi]

Speech to speech translation system for monologues-data driven approachHideki Tanaka, Stephen Nightingale, Hideki Kashioka, Kenji Matsumoto, Masamchi Nishiwaki, Tadashi Kumano, Takehiko Maruyama. 1717-1720 [doi]

Separation of voiced source characteristics and vocal tract transfer function characteristics for speech sounds by iterative analysis based on AR-HMM modelNobuyuki Nishizawa, Keikichi Hirose, Nobuaki Minematsu. 1721-1724 [doi]

Automatic extraction of model parameters from fundamental frequency contours of English utterancesShuichi Narusawa, Nobuaki Minematsu, Keikichi Hirose, Hiroya Fujisaki. 1725-1728 [doi]

Pitch extraction of speech signals using an eigen-based subspace methodTakahiro Murakami, Munehiro Namba, Tetsuya Hoya, Yoshihisa Ishida. 1729-1732 [doi]

Robust fundamental frequency estimation against background noise and spectral distortionTomohiro Nakatani, Toshio Irino. 1733-1736 [doi]

2-d processing of speech with application to pitch estimationThomas F. Quatieri. 1737-1740 [doi]

Towards automatic closed captioning: low latency real time broadcast news transcriptionMurat Saraclar, Michael Riley, Enrico Bocchieri, Vincent Goffin. 1741-1744 [doi]

Automatic transcription of courtroom speechRohit Prasad, Long Nguyen, Richard M. Schwartz, John Makhoul. 1745-1748 [doi]

Japanese broadcast news transcriptionLong Nguyen, Xuefeng Guo, Richard M. Schwartz, John Makhoul. 1749-1752 [doi]

German broadcast news transcriptionRobert Hecht, Jürgen Riedler, Gerhard Backfried. 1753-1756 [doi]

Speech recognition with a re-speak method for subtitling live broadcastsToru Imai, Atsushi Matsui, Shinichi Homma, Takeshi Kobayakawa, Kazuo Onoe, Shoei Sato, Akio Ando. 1757-1760 [doi]

Evaluation of the method to detect Japanese local speech rate deceleration applying the variable threshold with a constant termKeiichi Takamaru, Makoto Hiroshige, Kenji Araki, Koji Tochinai. 1761-1764 [doi]

Modeling durational variability in reading aloud a connected textCaroline L. Smith. 1769-1772 [doi]

Duration modeling for Arabic text to speech synthesisYasser Hifny, Mohsen Rashwan. 1773-1776 [doi]

Learning syllable duration and intonation of Mandarin ChineseOliver Jokisch, Hongwei Ding, Hans Kruschke, Guntram Strecha. 1777-1780 [doi]

Speech enhancement in car environment using blind source separationHiroshi Saruwatari, Katsuyuki Sawai, Akinobu Lee, Kiyohiro Shikano, Atsunobu Kaminuma, Masao Sakata. 1781-1784 [doi]

Speech enhancement based on combining perceptual enhancement and short-time spectral attenuationIlyas Potamitis, Nikos Fakotakis, George Kokkinakis. 1785-1788 [doi]

Suitable design of adaptive beamformer based on average speech spectrum for noisy speech recognitionTakanobu Nishiura, Satoshi Nakamura, Yuka Okada, Takeshi Yamada, Kiyohiro Shikano. 1789-1792 [doi]

Highly oversampled subband adaptive filters for noise cancellation on a low-resource DSP systemKing Tam, Hamid Sheikhzadeh, Todd Schneider. 1793-1796 [doi]

A perceptually motivated subspace approach for speech enhancementYi Hu, Philipos C. Loizou. 1797-1800 [doi]

Speech enhancement based on generalized singular value decomposition approachGwo-Hwa Ju, Lin-Shan Lee. 1801-1804 [doi]

Subspace speech enhancement using subband whitening filterJong-Uk Kim, Chang D. Yoo. 1805-1808 [doi]

Speech enhancement using wavelet packet transformSungwook Chang, Sung-il Jung, Younghun Kwon, Sung-il Yang. 1809-1812 [doi]

Sequential MAP noise estimation and a phase-sensitive model of the acoustic environmentLi Deng, Jasha Droppo, Alex Acero. 1813-1816 [doi]

Auditory fovea based speech enhancement and its application to human-robot dialog systemKazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano. 1817-1820 [doi]

A spatio-temporal speech enhancement scheme for robust speech recognitionErik M. Visser, Manabu Otsuka, Te-Won Lee. 1821-1824 [doi]

Comparative evaluation of CASA and BSS models for subband cocktail-party speech separationFrédéric Berthommier, Seungjin Choi. 1825-1828 [doi]

Speech enhancement in non-stationary noise environmentsHyoung-Gook Kim, Dietmar Ruwisch. 1829-1832 [doi]

The 2ch hybrid subtractive beamformer applied to line sound sourcesMitsunori Mizumachi, Satoshi Nakamura. 1833-1836 [doi]

Controlling perceived degradation in spectrum envelope modeling via predistortionPushkar Patwardhan, Preeti Rao. 1837-1840 [doi]

Benefit and cost analysis of using the improved vector quantizer design algorithm for glottal source waveform compressionPeter Veprek, Alan B. Bradley. 1841-1844 [doi]

Speech coding and transmission for improved automatic recognitionXin Zhong, Jon A. Arrowood, Mark A. Clements. 1845-1848 [doi]

Coding speech at very low rates using STRAIGHT and temporal decompositionPhu Chien Nguyen, Takao Ochi, Masato Akagi. 1849-1852 [doi]

On improving the performance of analysis-by-synthesis coding using a multi-magnitude algebraic code-book excitation signalOmar Halmi, Hesham Tolba, Driss Guerchi, Douglas D. O'Shaughnessy. 1857-1860 [doi]

Improved performance speech codec for mobile communicationsK. Humphreys, R. Lawlor. 1861-1864 [doi]

Fixed-length segment coding of LSF parametersEvgeni Yakhnich, Yuval Bistritz. 1865-1868 [doi]

Interaction of voice over internet protocol speech coders and disordered speech samplesVijay Parsa, Donald G. Jamieson. 1869-1872 [doi]

Speech recognition performance comparison between DSR and AMR transcoded speechHolly Kelleher, David Pearce, Douglas Ealey, Laurent Mauuary. 1873-1876 [doi]

The influence of speech coding on recognition performance in telecommunication networksHans-Günter Hirsch. 1877-1880 [doi]

Spectral enhancement preprocessing for the HNM coding of noisy speechGautam Moharir, Pushkar Patwardhan, Preeti Rao. 1881-1884 [doi]

Using x-grams for speech-to-speech translationAdrià de Gispert, José B. Mariño. 1885-1888 [doi]

Statistical machine translation decoder based on phraseTaro Watanabe, Eiichiro Sumita. 1889-1892 [doi]

Reliability measures for translation qualityEiichiro Sumita, Yasuhiro Akiba, Kenji Imamura. 1893-1896 [doi]

Statistical natural language generation for speech-to-speech machine translation systemsBowen Zhou, Yuqing Gao, Jeffrey S. Sorensen, Zijian Diao, Michael Picheny. 1897-1900 [doi]

Improving statistical machine translation for a speech-to-speech translation taskStephan Vogel, Alicia Tribble. 1901-1904 [doi]

Speech-to-speech translation system evaluation: results for French for the NESPOLE! project first showcaseSolange Rossato, Hervé Blanchon, Laurent Besacier. 1905-1908 [doi]

Interlingua based statistical machine translationManuel Kauers, Stephan Vogel, Christian Fügen, Alex Waibel. 1909-1912 [doi]

Seeing tongue movements from outsideGérard Bailly, Pierre Badin. 1913-1916 [doi]

An audio-visual corpus for multimodal speech recognition in Dutch languageJacek C. Wojdel, Pascal Wiggers, Léon J. M. Rothkrantz. 1917-1920 [doi]

Medium vocabulary continuous audio-visual speech recognitionPascal Wiggers, Jacek C. Wojdel, Léon J. M. Rothkrantz. 1921-1924 [doi]

DCT-based video features for audio-visual speech recognitionMartin Heckmann, Kristian Kroschel, Christophe Savariaux, Frédéric Berthommier. 1925-1928 [doi]

The effect of auditory-visual information and orthographic background in L2 acquisitionV. Dogu Erdener, Denis Burnham. 1929-1932 [doi]

Perceptual evaluation of audiovisual cues for prominenceEmiel Krahmer, Zsófia Ruttkay, Marc Swerts, Wieger Wesselink. 1933-1936 [doi]

Audio-visual scene analysis: evidence for a very-early integration process in audio-visual speech perceptionJean-Luc Schwartz, Frédéric Berthommier, Christophe Savariaux. 1937-1940 [doi]

Design of an audio-visual speech corpus for the Czech audio-visual speech synthesisMilos Zelezný, Petr Císar, Zdenek Krnoul, Jan Novák. 1941-1944 [doi]

Coordination of hand and orofacial movements for CV sequences in French cued speechVirginie Attina, Denis Beautemps, Marie-Agnès Cathiard. 1945-1948 [doi]

Controlling anticipatory behavior for rounding in French cued speechVirginie Attina, Marie-Agnès Cathiard, Denis Beautemps. 1949-1952 [doi]

Audio-visual speech sources separation: a new approach exploiting the audio-visual coherence of speech stimuliDavid Sodoyer, Laurent Girin, Christian Jutten, Jean-Luc Schwartz. 1953-1956 [doi]

Intonational and visual cues in the perception of interrogative mode in SwedishDavid House. 1957-1960 [doi]

A link between cepstral shrinking and the weighted product rule in audio-visual speech recognitionSimon Lucey, Sridha Sridharan, Vinod Chandran. 1961-1964 [doi]

Contribution to topic identification by using word similarityArmelle Brun, Kamel Smaïli, Jean-Paul Haton. 1965-1968 [doi]

SpeechFind: an experimental on-line spoken document retrieval system for historical audio archivesBowen Zhou, John H. L. Hansen. 1969-1972 [doi]

Topic tracking using subject templatesYoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi. 1973-1976 [doi]

Topic detection of an utterance for speech dialogue processingKatsushi Asami, Toshiyuki Takezawa, Gen-ichiro Kikui. 1977-1980 [doi]

Real-time rich-content transcription of Chinese broadcast newsDaben Liu, Jeffrey Ma, Dongxin Xu, Amit Srivastava, Francis Kubala. 1981-1984 [doi]

Improved Chinese spoken document retrieval with hybrid modeling and data-driven indexing featuresChun-Jen Wang, Berlin Chen, Lin-Shan Lee. 1985-1988 [doi]

Exploring sub-word features and linear support vector machines for German spoken document classificationMartha Larson, Stefan Eickeler, Gerhard Paaß, Edda Leopold, Jörg Kindermann. 1989-1992 [doi]

Goal-directed ASR in a multimedia indexing and searching environment (MUMIS)Mirjam Wester, Judith M. Kessens, Helmer Strik. 1993-1996 [doi]

Confusion-based query expansion for OOV words in spoken document retrievalBeth Logan, Jean-Manuel Van Thong. 1997-2000 [doi]

Cluster identification for speaker-environment trackingJ. T. Wickramaratna, Philip C. Woodland. 2001-2004 [doi]

Robust speech / music classification in audio documentsJulien Pinquier, Jean-Luc Rouas, Régine André-Obrecht. 2005-2008 [doi]

Speech, music and songs discrimination in the context of handsets variabilityHassan Ezzaidi, Jean Rouat. 2013-2016 [doi]

Acoustic correlates of task load and stressKlaus R. Scherer, Didier Grandjean, Tom Johnstone, Gudrun Klasmeyer, Thomas Bänziger. 2017-2020 [doi]

Frequency band analysis for stress detection using a Teager energy operator based featureMandar A. Rahurkar, John H. L. Hansen, James Meyerhoff, George Saviolakis, Michael Koenig. 2021-2024 [doi]

The acoustic realization of anger, fear, joy and sadness in ChineseJiahong Yuan, Liqin Shen, Fangxin Chen. 2025-2028 [doi]

Emotional space improves emotion recognitionRaquel Tato, Rocío Santos, Ralf Kompe, J. M. Pardo. 2029-2032 [doi]

Emotion recognition from textual input using an emotional semantic networkZe-Jing Chuang, Chung-Hsien Wu. 2033-2036 [doi]

Prosody-based automatic detection of annoyance and frustration in human-computer dialogJeremy Ang, Rajdip Dhillon, Ashley Krupski, Elizabeth Shriberg, Andreas Stolcke. 2037-2040 [doi]

RUSLANA: a database of Russian emotional utterancesVeronika Makarova, Valery A. Petrushin. 2041-2044 [doi]

A pragmatic confirmation mechanism for an object-based spoken dialogue managerIan M. O'Neill, Michael F. McTear. 2045-2048 [doi]

Serving complex user wishes with an enhanced spoken dialogue systemSunna Torge, Stefan Rapp, Ralf Kompe. 2049-2052 [doi]

Integrating speech with keypad input for automatic entry of spelling and pronunciation of new wordsGrace Chung, Stephanie Seneff. 2053-2056 [doi]

Reference resolution by human partners in a natural interactive problem-solving taskEllen Campana, Sarah Brown-Schmidt, Michael K. Tanenhaus. 2057-2060 [doi]

Is the speaker done yet? faster and more accurate end-of-utterance detection using prosodyLuciana Ferrer, Elizabeth Shriberg, Andreas Stolcke. 2061-2064 [doi]

Adding intelligent help to mixed-initiative spoken dialogue systemsGenevieve Gorrell, Ian Lewin, Manny Rayner. 2065-2068 [doi]

Analysis of user behavior under error conditions in spoken dialogsJongho Shin, Shrikanth S. Narayanan, Laurie Gerber, Abe Kazemzadeh, Dani Byrd. 2069-2072 [doi]

Production based pitch modification of voiced speechYinglong Jiang, Peter Murphy. 2073-2076 [doi]

F0 generation for speech synthesis using a multi-tier approachXuejing Sun. 2077-2080 [doi]

From text to prosody without ToBIVolker Strom. 2081-2084 [doi]

Improved corpus-based synthesis of fundamental frequency contours using generation process modelKeikichi Hirose, Masaya Eto, Nobuaki Minematsu. 2085-2088 [doi]

Intonation modelling for the synthesis of structured documentsJeska Buhmann, Jean-Pierre Martens, Lieve Macken, Bert Van Coile. 2089-2092 [doi]

Applying fallback to prosodic unit selection from a small imitation databaseJoram Meron. 2093-2096 [doi]

Clustering and feature learning based F0 prediction for Chinese speech synthesisJianhua Tao, Lianhong Cai. 2097-2100 [doi]

Evaluation of formant-like features for ASRKatrin Weber, Febe de Wet, Bert Cranen, Lou Boves, Samy Bengio, Hervé Bourlard. 2101-2104 [doi]

Entropy of energy operator as feature for large vocabulary Mandarin speaker independent speech recognitionFadhil H. T. Al-Dulaimy, Zuoying Wang. 2105-2108 [doi]

Improving parametric trajectory modeling by integration of pitch and tone informationYiyan Zhang, Wenju Liu, Bo Xu, Huayun Zhang. 2109-2112 [doi]

Comparative experiments to evaluate the use of auditory-based acoustic distinctive features and formant cues for automatic speech recognition using a multi-stream paradigmHesham Tolba, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy. 2113-2116 [doi]

Speech recognition using combined acoustic and articulatory information with retraining of acoustic model parametersKa-Yee Leung, Man-Hung Siu. 2117-2120 [doi]

Improved phone recognition on TIMIT using formant frequency data and confidence measuresN. J. Wilkinson, Martin J. Russell. 2121-2124 [doi]

Speaker independent speech recognition using features based on glottal sound sourceNorihide Kitaoka, Daisuke Yamada, Seiichi Nakagawa. 2125-2128 [doi]

An evaluation of using mutual information for selection of acoustic-features representation of phonemes for speech recognitionMohamed Kamal Omar, Ken Chen, Mark Hasegawa-Johnson, Yigal Brandman. 2129-2132 [doi]

A flexible stream architecture for ASR using articulatory featuresFlorian Metze, Alex Waibel. 2133-2136 [doi]

Speech recognition using fundamental frequency and voicing in acoustic modelingAndrej Ljolje. 2137-2140 [doi]

A comparison of front-end analyses for Thai speech recognitionMontri Karnjanadecha, Patimakorn Kimsawad. 2141-2144 [doi]

New model for speech residual signal shaping with static nonlinearityJari Turunen, Juha T. Tanttu, Pekka Loula. 2145-2148 [doi]

Formant model estimation and transformation for voice morphingChing-Hsiang Ho, Dimitrios Rentzos, Saeed Vaseghi. 2149-2152 [doi]

Production and perception of pauses and their linguistic context in read and spontaneous speech in SwedishBeáta Megyesi, Sofia Gustafson-Capková. 2153-2156 [doi]

Non-linear techniques for dysphonic voice analysis and correctionClaudia Manfredi, Lorenzo Matassini. 2157-2160 [doi]

Adaptive estimation of time-varying features from high-pitched speech based on an excitation source HMMAkira Sasou, Kazuyo Tanaka. 2161-2164 [doi]

Lip gestures in English sibilants: articulatory - acoustic relationshipMartine Toda, Shinji Maeda, Andreas J. Carlen, Lyes Meftahi. 2165-2168 [doi]

Bark resolution from speech dataNaren Malayath, Hynek Hermansky. 2169-2172 [doi]

Noise-robust speech recognition in car environments using genetic algorithms and a mel-cepstral subspace approachSid-Ahmed Selouani, Douglas D. O'Shaughnessy. 2173-2176 [doi]

Modeling with a subspace constraint on inverse covariance matricesScott Axelrod, Ramesh Gopinath, Peder A. Olsen. 2177-2180 [doi]

Improving speech recognition performance of small microphone arrays using missing data techniquesIain McCowan, Andrew C. Morris, Hervé Bourlard. 2181-2184 [doi]

Double the trouble: handling noise and reverberation in far-field automatic speech recognitionDavid Gelbart, Nelson Morgan. 2185-2188 [doi]

Model-based independent component analysis for robust multi-microphone automatic speech recognitionLaurent Couvreur, Christophe Ris. 2189-2192 [doi]

Compensation of channel effect on line spectrum frequenciesAn-Tze Yu, Hsiao-Chuan Wang. 2193-2196 [doi]

Codebook dependent dynamic channel estimation for Mandarin speech recognition over telephoneHuayun Zhang, Zhaobing Han, Bo Xu. 2197-2200 [doi]

Robust multiple resolution analysis for automatic speech recognitionRoberto Gemello, Franco Mana, Paolo Pegoraro, Renato de Mori. 2201-2204 [doi]

HMM-based methods for channel error mitigation in distributed speech recognitionAntonio M. Peinado, Victoria E. Sánchez, José L. Pérez-Córdoba, José C. Segura, Antonio J. Rubio. 2205-2208 [doi]

Network-based vs. distributed speech recognition in adaptive multi-rate wireless systemsTim Fingscheidt, Stefanie Aalburg, Sorel Stan, Christophe Beaugeant. 2209-2212 [doi]

Channel noise robustness for low-bitrate remote speech recognitionAlexis Bernard, Abeer Alwan. 2213-2216 [doi]

Influence of transmission errors on ASR systemsCarmen Peláez-Moreno, Ascensión Gallardo-Antolín, Jesús Vicente-Peña, Fernando Díaz-de-María. 2217-2220 [doi]

Robust feature extraction in a variety of input devices on the basis of ETSI standard DSR front-endSatoru Tsuge, Shingo Kuroiwa, Masami Shishibori, Fuji Ren, Kenji Kita. 2221-2224 [doi]

Channel error protection scheme for distributed speech recognitionZheng-Hua Tan, Paul Dalsgaard. 2225-2228 [doi]

The effects of speech compression on speech recognition and text-to-speech synthesisYeshwant K. Muthusamy, Yifan Gong, Roshan Gupta. 2229-2232 [doi]

Transform-based feature vector compression for distributed speech recognitionBen Milner, Xu Shao. 2233-2236 [doi]

Multimodal language processing for mobile information accessMichael Johnston, Srinivas Bangalore, Amanda Stent, Gunaranjan Vasireddy, Patrick Ehlen. 2237-2240 [doi]

SALT: a spoken language interface for web-based multimodal dialog systemsKuansan Wang. 2241-2244 [doi]

Building VoiceXML-based applicationsChristina L. Bennett, Ariadna Font Llitjós, Stefanie Shriver, Alexander I. Rudnicky, Alan W. Black. 2245-2248 [doi]

Operations for context-based multimodal interpretation in conversational systemsJoyce Yue Chai. 2249-2252 [doi]

A distributed multimodal dialogue system based on dialogue system and web convergenceFeng Liu, Antoine Saad, Li Li, Wu Chou. 2253-2256 [doi]

An acoustic comparison between American English and Australian English vowelsKimiko Tsukada. 2257-2260 [doi]

A case study of Portuguese and English bilingualityLuis M. T. Jesus, Christine H. Shadle. 2261-2264 [doi]

An IPA vowel diagram approach to analysing L1 effects on vowel production and perceptionOlga I. Dioubina, Hartmut R. Pfitzinger. 2265-2268 [doi]

Phonological norms in Faroese speech synthesisPétur Helgason, Sjúrður Gullbein. 2269-2272 [doi]

Studying pronunciation variants in French by using alignment techniquesPhilippe Boula de Mareüil, Martine Adda-Decker. 2273-2276 [doi]

Perceived boundary strengthPetra Hansson. 2277-2280 [doi]

Syntax over focusSun-Ah Jun. 2281-2284 [doi]

Duration related phase realignment of Thai tonesJohn J. Ohala, Rungpat Roengpitya. 2285-2288 [doi]

Probabilistic ranking of constraintsLouis ten Bosch. 2289-2292 [doi]

Multi-dimensional analysis of sonority: perception, acoustics, and phonologyMasahiko Komatsu, Shinichi Tokuma, Won Tokuma, Takayuki Arai. 2293-2296 [doi]

Three-dimensional electromagnetic articulograph based on a nonparametric representation of the magnetic fieldTokihiko Kaburagi, Kohei Wakamiya, Masaaki Honda. 2297-2300 [doi]

Introduction of constraints in an acoustic-to-articulatory inversion method based on a hypercubic articulatory tableYves Laprie, Slim Ouni. 2301-2304 [doi]

Acoustic-to-articulatory inverse mapping using an HMM-based speech production modelSadao Hiroya, Masaaki Honda. 2305-2308 [doi]

Modeling articulatory dynamics in autoregressive linear systemKiyoshi Hashimoto. 2309-2312 [doi]

A study of the two-mass model in terms of acoustic parametersDenisse Sciamarella, Christophe d'Alessandro. 2313-2316 [doi]

On the relevance of bandwidth extension for speaker verificationMarcos Faúndez-Zanuy, Mattias Nilsson, W. Bastiaan Kleijn. 2317-2320 [doi]

Speaker recognition using discriminative features selectionBogdan Sabac. 2321-2324 [doi]

Designing a speaker-discriminative adaptive filter bank for speaker recognitionTomi Kinnunen. 2325-2328 [doi]

Divergence-based out-of-class rejection for telephone handset identificationChi-Leung Tsang, Man-Wai Mak, Sun-Yuan Kung. 2329-2332 [doi]

A handset identifier using support vector machinesPurdy Ho. 2333-2336 [doi]

An analysis of the causes of increased error rates in children's speech recognitionQun Li, Martin J. Russell. 2337-2340 [doi]

A new computer-based analytical speech perception test for prelingually deaf children and children with speech disordersAnne-Marie Öster. 2341-2344 [doi]

Vocalization age as a clinical toolHarriet J. Fell, Joel MacAuslan, Linda J. Ferrier, Susan G. Worst, Karen Chenausky. 2345-2348 [doi]

Baldini: Baldi speaks Italian!Piero Cosi, Michael M. Cohen, Dominic W. Massaro. 2349-2352 [doi]

Eyebrow movements and voice variations in dialogue situations: an experimental investigationChristian Cavé, Isabelle Guaïtella, Serge Santi. 2353-2356 [doi]

Using start/end timings of spectral transitions between phonemes in concatenative speech synthesisToshio Hirai, Seiichi Tenpaku, Kiyohiro Shikano. 2357-2360 [doi]

Design of a Mandarin sentence set for corpus-based speech synthesis by use of a multi-tier algorithm taking account of the varied prosodic and spectral characteristicsJinfu Ni, Hisashi Kawai. 2361-2364 [doi]

A data-driven approach to source-formant type text-to-speech systemHiroki Mori, Takahiro Ohtsuka, Hideki Kasuya. 2365-2368 [doi]

Power spectral density based channel equalization of large speech database for concatenative TTS systemYu Shi, Eric Chang, Hu Peng, Min Chu. 2369-2372 [doi]

CU VOCAL: corpus-based syllable concatenation for Chinese speech synthesis across domains and dialectsHelen M. Meng, Chi-Kin Keung, Kai-Chung Siu, Tien Ying Fung, P. C. Ching. 2373-2376 [doi]

Perceptual evaluation of naturalness due to substitution of Chinese syllable for concatenative speech synthesisJinlin Lu, Hisashi Kawai. 2377-2380 [doi]

Reducing the footprint of the IBM trainable speech synthesis systemDan Chazan, Ron Hoory, Zvi Kons, Dorel Silberstein, Alexander Sorin. 2381-2384 [doi]

Computationally efficient time-scale modification of speech using 3 level clippingSung Joo Lee, Hyung Soon Kim. 2385-2388 [doi]

A miniature Chinese TTS system based on tailored corpusZhiwei Shuang, Yu Hu, Zhen-Hua Ling, Ren-Hua Wang. 2389-2392 [doi]

Phonetic normalization using z-score in segmental prosody estimation for corpus-based TTS systemHoeun Song, Jaein Kim, Kyongrok Lee, Jinyoung Kim. 2393-2396 [doi]

On F0 trajectory optimization for very high-quality speech manipulationHideki Kawahara, Parham Zolfaghari, Alain de Cheveigné. 2397-2400 [doi]

Modeling tones in continuous Cantonese speechTan Lee, Greg Kochanski, Chilin Shih, Yujia Li. 2401-2404 [doi]

Pitch contour model for Chinese text-to-speech using CART and statistical modelMinghui Dong, Kim-Teng Lua. 2405-2408 [doi]

Basque intonation modelling for text to speech conversionEva Navas, Inmaculada Hernáez, Juan María Sánchez. 2409-2412 [doi]

Application of microprosody models in text to speech synthesisPhuay Hui Low, Saeed Vaseghi. 2413-2416 [doi]

Prosodic phrasing with inductive learningSheng Zhao, Jianhua Tao, Lianhong Cai. 2417-2420 [doi]

Speech reconstruction from mel-frequency cepstral coefficients using a source-filter modelBen Milner, Xu Shao. 2421-2424 [doi]

Designing Japanese speech database covering wide range in prosody for hybrid speech synthesizerHiromichi Kawanami, Tsuyoshi Masuda, Tomoki Toda, Kiyohiro Shikano. 2425-2428 [doi]

Towards the question: why has speaking rate such an impact on speech recognition performance?Robert Faltlhauser, Günther Ruske, Matthias Thomae. 2429-2432 [doi]

Robust voiced-unvoiced decision associated to continuous pitch tracking in noisy telephone speechMijail Arcienega, Andrzej Drygajlo. 2433-2436 [doi]

Noise adaptive speech recognition with acoustic models trained from noisy speech evaluated on Aurora-2 databaseKaisheng Yao, Kuldip K. Paliwal, Satoshi Nakamura. 2437-2440 [doi]

Recognition of noisy speech using normalized momentsJingdong Chen, Yiteng Huang, Qi Li, Frank K. Soong. 2441-2444 [doi]

Low-resource noise-robust feature post-processing on Aurora 2.0Chia-Ping Chen, Jeff A. Bilmes, Katrin Kirchhoff. 2445-2448 [doi]

Exploiting variances in robust feature extraction based on a parametric model of speech distortionLi Deng, Jasha Droppo, Alex Acero. 2449-2452 [doi]

Improving performance of an HMM-based ASR system by using monophone-level normalized confidence measureMuhammad Ghulam, Takashi Fukuda, Takaharu Sato, Tsuneo Nitta. 2453-2456 [doi]

Model partial pronunciation variations for spontaneous Mandarin speech recognitionYi Liu, Pascale Fung. 2457-2460 [doi]

Reducing pronunciation lexicon confusion and using more data without phonetic transcription for pronunciation modelingThomas Fang Zheng, Zhanjiang Song, Pascale Fung, William Byrne. 2461-2464 [doi]

Classification error from the theoretical Bayes classification riskErik McDermott, Shigeru Katagiri. 2465-2468 [doi]

Combined binary classifiers with applications to speech recognitionAldebaro Klautau, Nikola Jevtic, Alon Orlitsky. 2469-2472 [doi]

Optimal selection of speech data for automatic speech recognition systemsArkadiusz Nagórski, Lou Boves, Herman J. M. Steeneken. 2473-2476 [doi]

Hypophonia in Parkinson disease: neural correlates of voice treatment with LSVT revealed by PETMario Liotti, Lorraine O. Ramig, Deanie Vogel, Pamela New, Chris Cook, Peter Fox. 2477-2480 [doi]

Preliminary data on effects of behavioral and levodopa therapies on speech-accompanying gesture in Parkinson's diseaseSusan Duncan. 2481-2484 [doi]

Speech pauses and gestural holds in Parkinson's diseaseFrancis K. H. Quek, Mary P. Harper, Yonca Haciahmetoglu, Lei Chen, Lorraine O. Ramig. 2485-2488 [doi]

Oro-facial changes in Parkinson's disease following intensive voice therapy (LSVT)Jennifer L. Spielman, Lorraine O. Ramig, Joan C. Borod. 2489-2492 [doi]

Application of the Lee Silverman voice treatment (LSVT) to individuals with multiple sclerosis, ataxic dysarthria, and strokeLeslie Will, Lorraine O. Ramig, Jennifer L. Spielman. 2497-2500 [doi]

On the estimation of signal-to-noise ratio in continuous speech for abnormal voicesVijay Parsa, Donald G. Jamieson, Karen Stenning, Herbert A. Leeper. 2505-2508 [doi]

Computationally efficient method of speech enhancement based on block representation of signal in state space and vector quantizationV. Semenov, Alexander Kovtonyuk, Alexander Kalyuzhny. 2509-2512 [doi]

Active speech cancellation for cellular speechKazuhiro Kondo, Kiyoshi Nakagawa. 2513-2516 [doi]

Warped-LP residual resampling using DCT for pitch modificationR. Muralishankar, A. G. Ramakrishnan, P. Prathibha. 2517-2520 [doi]

Application of real-time AMDF pitch-detection in a voice gender normalisation systemE. Jung, A. Schwarzbacher, K. Humphreys, R. Lawlor. 2521-2524 [doi]

A copy synthesis method to pilot the Klatt synthesiserYves Laprie, Anne Bonneau. 2525-2528 [doi]

Speaker recognizability evaluation of a voicefont-based text-to-speech systemMasaharu Sakamoto, Takashi Saito. 2529-2532 [doi]

Time-frequency transforms and beamforming for speaker recognitionAntonio Satué-Villar, Juan Fernández-Rubio. 2533-2536 [doi]

Speaker change detection using a new weighted distance measureSoonil Kwon, Shrikanth S. Narayanan. 2537-2540 [doi]

FPGA hardware for speech recognition using hidden Markov modelsJosé L. Gómez-Cipriano, Roger P. Nunes, Dante A. C. Barone. 2541-2544 [doi]

Evaluation of a speech recognition / generation method based on HMM and STRAIGHTToshio Irino, Yasuhiro Minami, Tomohiro Nakatani, Minoru Tsuzaki, H. Tagawa. 2545-2548 [doi]

A modality-independent MMI system architectureKouichi Katsurada, Yoshihiko Ootani, Yusaku Nakamura, Satoshi Kobayashi, Hirobumi Yamada, Tsuneo Nitta. 2549-2552 [doi]

An architecture for a multi-modal web browserCristiana Armaroli, Ivano Azzini, Lorenza Ferrario, Toni Giorgino, Luca Nardelli, Marco Orlandi, Carla Rognoni. 2553-2556 [doi]

Collecting mobile multimodal data for MATCHPatrick Ehlen, Michael Johnston, Gunaranjan Vasireddy. 2557-2560 [doi]

ISIS: a multi-modal, trilingual, distributed spoken dialog system developed with CORBA, Java, XML and KQMLHelen M. Meng, P. C. Ching, Yee Fong Wong, Cheong Chat Chan. 2561-2564 [doi]

The perception of stop consonant sequences in dyslexic and normal childrenNoël Nguyen, Ludovic Jankowski, Michel Habib. 2565-2568 [doi]

Submoraic awareness by Japanese school children: evidence from a novel gameTakashi Otake, Akemi Iijima. 2569-2572 [doi]

Speaker intelligibility of adults and childrenD. Markham, Valérie Hazan. 2573-2576 [doi]

Acoustical correlates to SD ratings of speaker characteristics in two speaking stylesYasuki Yamashita, Hiroshi Matsumoto. 2577-2580 [doi]

Subjective assessment of frequency bands for perception of speaker identityEda Ormanci, U. Hakan Nikbay, Oytun Türk, Levent M. Arslan. 2581-2584 [doi]

Discriminative linear transforms for feature normalization and speaker adaptation in HMM estimationStavros Tsakalidis, Vlasios Doumpiotis, William Byrne. 2585-2588 [doi]

Speaking rate compensation based on likelihood criterion in acoustic model training and decodingKozo Okuda, Tatsuya Kawahara, Satoshi Nakamura. 2589-2592 [doi]

Combining maximum likelihood and maximum a posteriori estimation for detailed acoustic modeling of context dependencyMichiel Bacchiani. 2593-2596 [doi]

Large vocabulary conversational speech recognition with the extended maximum likelihood linear transformation (EMLLT) modelJing Huang, Vaibhava Goel, Ramesh Gopinath, Brian Kingsbury, Peder A. Olsen, Karthik Visweswariah. 2597-2600 [doi]

Modeling varying pauses to develop robust acoustic models for recognizing noisy conversational speechJin-Song Zhang, Satoshi Nakamura. 2601-2604 [doi]

Objective distance measures for spectral discontinuities in concatenative speech synthesisJithendra Vepa, Simon King, Paul Taylor. 2605-2608 [doi]

Data-driven segment preselection in the IBM trainable speech synthesis systemWael Hamza, Robert Donovan. 2609-2612 [doi]

Perpetually optimizing the cost function for unit selection in a TTS system with one single run of MOS evaluationHu Peng, Yong Zhao, Min Chu. 2613-2616 [doi]

Information-theoretic criteria for unit selection synthesisJon R. W. Yi, James R. Glass. 2617-2620 [doi]

Acoustic measures vs. phonetic features as predictors of audible discontinuity in concatenative speech synthesisHisashi Kawai, Minoru Tsuzaki. 2621-2624 [doi]

Improving phone-level discrimination in LDA with subphone-level classesHwa Jeon Song, Hyung Soon Kim. 2625-2628 [doi]

A combined model of statics-dynamics of speech optimized using maximum mutual informationZhijian Ou, Zuoying Wang. 2629-2632 [doi]

Syllable recognition using syllable-segment statistics and syllable-based HMMNobutoshi Takahashi, Seiichi Nakagawa. 2633-2636 [doi]

Recurrent neural network-enhanced HMM speech recognition systemsJ. W. F. Thirion, Elizabeth C. Botha. 2637-2640 [doi]

Sharing trend information of trajectory in segmental-feature HMMYoung-Sun Yun. 2641-2644 [doi]

Framewise phone classification using support vector machinesSimon King, Jesper Salomon. 2645-2648 [doi]

A state-tying approach to building syllable HMMsDarryl Stewart, Ming Ji, Philip Hanna, F. Jack Smith. 2649-2652 [doi]

Recognition of continuous speech segments of monophone units using support vector machinesWeifeng Lee, C. Chandra Sekhar, Kazuya Takeda, Fumitada Itakura. 2653-2656 [doi]

Construction of decision tree from data driven clusteringJunho Park, Hanseok Ko. 2657-2660 [doi]

Selective multi-path acoustic model based on database likelihoodsAkinobu Lee, Yuichiro Mera, Hiroshi Saruwatari, Kiyohiro Shikano. 2661-2664 [doi]

Auxiliary variables in conditional Gaussian mixtures for automatic speech recognitionTodd A. Stephenson, Mathew Magimai-Doss, Hervé Bourlard. 2665-2668 [doi]

Constructing shared-state hidden Markov models based on a Bayesian approachShinji Watanabe, Yasuhiro Minami, Atsushi Nakamura, Naonori Ueda. 2669-2672 [doi]

Generalization of state-observation-dependency in partly hidden Markov modelsTetsuji Ogawa, Tetsunori Kobayashi. 2673-2676 [doi]

A study of multi-speaker dialogue system for mobile information retrievalHsien-Chang Wang, Chieh-Yi Huang, Chung-Hsien Yang, Jhing-Fa Wang. 2677-2680 [doi]

AT&T help deskGiuseppe Di Fabbrizio, Dawn Dutton, Narendra K. Gupta, Barbara Hollister, Mazin G. Rahim, Giuseppe Riccardi, Robert E. Schapire, Juergen Schroeter. 2681-2684 [doi]

Basurde[lite], a machine-driven dialogue system for accessing railway timetable informationRoger Trias-Sanz, José B. Mariño. 2685-2688 [doi]

Amplitude convergence in children's conversational speech with animated personasRachel Coulston, Sharon L. Oviatt, Courtney Darves. 2689-2692 [doi]

Flexible dialogue management in the Talk'n'Travel systemDavid Stallard. 2693-2696 [doi]

E-mail goes mobile: the design and implementation of a spoken language interface to e-mailDaniela Oria, Esa Koskinen. 2697-2700 [doi]

Wizard of Oz evaluation of a dialogue with communicator system in ChileNéstor Becerra Yoma, Angela Cortés, Mauricio Hormazábal, Enrique López. 2701-2704 [doi]

A portable, server-side dialog framework for VoiceXMLBob Carpenter, Sasha Caskey, Krishna Dayanidhi, Caroline Drouin, Roberto Pieraccini. 2705-2708 [doi]

Spoken dialogue system for home health careShinya Takahashi, Tsuyoshi Morimoto, Sakashi Maeda, Naoyuki Tsuruta. 2709-2712 [doi]

ACIMET: access to meteorological information by telephoneJaume Padrell, Javier Hernando. 2713-2716 [doi]

SPIN: language understanding for spoken dialogue systems using a production system approachRalf Engel. 2717-2720 [doi]
