INTERSPEECH 2004 - ICSLP, 8th International Conference on Spoken Language Processing, Jeju Island, Korea, October 4-8, 2004

INTERSPEECH 2004 - ICSLP, 8th International Conference on Spoken Language Processing, Jeju Island, Korea, October 4-8, 2004. ISCA, 2004.

Conference: interspeech2004

From X-ray or MRI data to sounds through articulatory synthesis: towards an integrated view of the speech communication processJacqueline Vaissière. [doi]

From decoding-driven to detection-based paradigms for automatic speech recognitionChin-Hui Lee. [doi]

Stochastic gradient adaptation of front-end parametersSreeram Balakrishnan, Karthik Visweswariah, Vaibhava Goel. 1-4 [doi]

Maximum-likelihood adaptation of semi-continuous HMMs by latent variable decomposition of state distributionsAntoine Raux, Rita Singh. 5-8 [doi]

Transformation and combination of hidden Markov models for speaker selection trainingChao Huang, Tao Chen, Eric Chang. 9-12 [doi]

Improving eigenspace-based MLLR adaptation by kernel PCABrian Kan-Wing Mak, Roger Wend-Huu Hsiao. 13-16 [doi]

Rapid acoustic model development using Gaussian mixture clustering and language adaptationNikos Chatzichrisafis, Vassilios Digalakis, Vassilios Diakoloukas, Costas Harizakis. 17-20 [doi]

Adaptation of front end parameters in a speech recognizerKarthik Visweswariah, Ramesh A. Gopinath. 21-24 [doi]

Language recognition using phone latticesJean-Luc Gauvain, Abdelkhalek Messaoudi, Holger Schwenk. 25-28 [doi]

ACCDIST: a metric for comparing speakers' accentsMark Huckvale. 29-32 [doi]

Aspects of named entity processingMichael Levit, Allen L. Gorin, Patrick Haffner, Hiyan Alshawi, Elmar Nöth. 33-36 [doi]

Finite-state-based and phrase-based statistical machine translationJosep Maria Crego, José B. Mariño, Adrià de Gispert. 37-40 [doi]

Using word lattice information for a tighter coupling in speech translation systemsTanja Schultz, Szu-Chen Stan Jou, Stephan Vogel, Shirin Saleem. 41-44 [doi]

Confirmation strategy for document retrieval systems with spoken dialog interfaceTeruhisa Misu, Tatsuya Kawahara, Kazunori Komatani. 45-48 [doi]

Correlation between VOT and F0 in the perception of Korean stops and affricatesMidam Kim. 49-52 [doi]

The development of anticipatory labial coarticulation in French: a pioneering studyAude Noiray, Lucie Ménard, Marie-Agnès Cathiard, Christian Abry, Christophe Savariaux. 53-56 [doi]

Data-driven approaches for automatic detection of syllable boundariesJilei Tian. 61-64 [doi]

Phonemic repertoire and similarity within the vocabularyAnne Cutler, Dennis Norris, Núria Sebastián-Gallés. 65-68 [doi]

Bootstrapping phonetic lexicons for new languagesSameer Maskey, Alan W. Black, Laura Tomokiya. 69-72 [doi]

Biomechanical parameter fingerprint in the mucosal wave power spectral densityJuan Ignacio Godino-Llorente, María Victoria Rodellar Biarge, Pedro Gómez Vilda, Francisco Díaz Pérez, Agustín Álvarez Marquina, Rafael Martínez-Olalla. 73-76 [doi]

Classification of pathological voice including severely noisy casesCheolwoo Jo, Soo-Geon Wang, Byung-Gon Yang, Hyung Soon Kim, Tao Li. 77-80 [doi]

A robust glottal source model estimation techniqueQiang Fu, Peter Murphy. 81-84 [doi]

F0 and formant frequency distribution of dysarthric speech - a comparative studyHiroki Mori, Yasunori Kobayashi, Hideki Kasuya, Hajime Hirose, Noriko Kobayashi. 85-88 [doi]

Procedure senza vibrato: a key component for morphing singingHideki Kawahara, Yumi Hirachi, Masanori Morise, Hideki Banno. 89-92 [doi]

Thyroplastic medialisation in unilateral vocal fold paralysis: assessing voice quality recoveringClaudia Manfredi, Giorgio Peretti, Laura Magnoni, Fabrizio Dori, Ernesto Iadanza. 93-96 [doi]

Evaluation of universal compensation on Aurora 2 and 3 and beyondMing Ji, Baochun Hou. 97-100 [doi]

Accounting for the uncertainty of speech estimates in the context of model-based feature enhancementHugo Van Hamme, Patrick Wambacq, Veronique Stouten. 105-108 [doi]

Applying the Aurora feature extraction schemes to a phoneme based recognition taskHans-Günter Hirsch, Harald Finster. 109-112 [doi]

Evaluation of tree-structured piecewise linear transformation-based noise adaptation on AURORA2 databaseZhipeng Zhang, Tomoyuki Ohya, Sadaoki Furui. 113-116 [doi]

Online minimum mean square error filtering of noisy cepstral coefficients using a sequential EM algorithmTor André Myrvoll, Satoshi Nakamura. 117-120 [doi]

HMM-based feature compensation method: an evaluation using the AURORA2Akira Sasou, Kazuyo Tanaka, Satoshi Nakamura, Futoshi Asano. 121-124 [doi]

Noise adaptation for robust AURORA 2 noisy digit recognition using statistical data mappingXuechuan Wang, Douglas D. O'Shaughnessy. 125-128 [doi]

MFCC computation from magnitude spectrum of higher lag autocorrelation coefficients for robust speech recognitionBenjamin J. Shannon, Kuldip K. Paliwal. 129-132 [doi]

A noise-robust feature extraction method based on pitch-synchronous ZCPA for ASRMuhammad Ghulam, Takashi Fukuda, Junsei Horikawa, Tsuneo Nitta. 133-136 [doi]

Including uncertainty of speech observations in robust speech recognitionJosé C. Segura, Ángel de la Torre, Javier Ramírez, Antonio J. Rubio, M. Carmen Benítez. 137-140 [doi]

Integration of n-best recognition results obtained by multiple noise reduction algorithmsTakeshi Yamada, Jiro Okada, Nobuhiko Kitawaki. 141-144 [doi]

Revisiting some model-based and data-driven denoising algorithms in Aurora 2 contextPanji Setiawan, Sorel Stan, Tim Fingscheidt. 145-148 [doi]

Exploring high-performance speech recognition in noisy environments using high-order Taylor series expansionGuo-Hong Ding, Bo Xu. 149-152 [doi]

A robust training algorithm based on neighborhood informationWing-Hei Au, Man-Hung Siu. 153-156 [doi]

In-phase feature induction: an effective compensation technique for robust speech recognitionSiu Wa Lee, Pak-Chung Ching. 157-160 [doi]

Improved performance of Aurora 4 using HTK and unsupervised MLLR adaptationJeff Siu-Kei Au-Yeung, Man-Hung Siu. 161-164 [doi]

A new feature extraction front-end for robust speech recognition using progressive histogram equalization and multi-eigenvector temporal filteringShang-Nien Tsai, Lin-Shan Lee. 165-168 [doi]

Tight coupling of speech recognition and dialog management - dialog-context dependent grammar weighting for speech recognitionChristian Fügen, Hartwig Holzapfel, Alex Waibel. 169-172 [doi]

Noise robust real world spoken dialogue system using GMM based rejection of unintended inputsAkinobu Lee, Keisuke Nakamura, Ryuichi Nisimura, Hiroshi Saruwatari, Kiyohiro Shikano. 173-176 [doi]

Speech interface for name input based on combination of recognition methods using syllable-based n-gram and word dictionaryHironori Oshikawa, Norihide Kitaoka, Seiichi Nakagawa. 177-180 [doi]

Constrained minimization technique for topic identification using discriminative training and support vector machinesImed Zitouni, Minkyu Lee, Hui Jiang. 181-184 [doi]

Characterizing task-oriented dialog using a simulated ASR channelJason D. Williams, Steve Young. 185-188 [doi]

A spoken dialog system based on automatic grammar generation and template-based weighting for autonomous mobile robotsTakashi Konashi, Motoyuki Suzuki, Akinori Ito, Shozo Makino. 189-192 [doi]

Noise adaptive spoken dialog system based on selection of multiple dialog strategiesAkinori Ito, Takanobu Oba, Takashi Konashi, Motoyuki Suzuki, Shozo Makino. 193-196 [doi]

Flexible dialogue management using distributed and dynamic dialogue controlMikko Hartikainen, Markku Turunen, Jaakko Hakulinen, Esa-Pekka Salonen, J. Adam Funk. 197-200 [doi]

Contextual revision in information seeking conversation systemsKeith Houck. 201-204 [doi]

Cross domain dialogue modelling: an object-based approachIan M. O'Neill, Philip Hanna, Xingkun Liu, Michael F. McTear. 205-208 [doi]

A comparison of confirmation styles for error handling in a speech dialog systemHirohiko Sagawa, Teruko Mitamura, Eric Nyberg. 209-212 [doi]

Using computer simulation to compare two models of mixed-initiativeFan Yang, Peter A. Heeman. 213-216 [doi]

Towards understanding mixed-initiative in task-oriented dialoguesFan Yang, Peter A. Heeman, Kristy Hollingshead. 217-220 [doi]

Spokenquery: an alternate approach to choosing items with speechPeter Wolf, Joseph Woelfel, Jan van Gemert, Bhiksha Raj, David Wong. 221-224 [doi]

Mining customer care dialogs for daily news Shona Douglas, Deepak Agarwal, Tirso Alonso, Robert M. Bell, Mazin G. Rahim, Deborah F. Swayne, Chris Volinsky. 225-228 [doi]

Higgins - a spoken dialogue system for investigating error handling techniquesJens Edlund, Gabriel Skantze, Rolf Carlson. 229-232 [doi]

A conversational dialogue system for cognitively overloaded usersFuliang Weng, Lawrence Cavedon, Badri Raghunathan, Danilo Mirkovic, Hua Cheng, Hauke Schmidt, Harry Bratt, Rohit Mishra, Stanley Peters, Sandra Upson, Elizabeth Shriberg, Carsten Bergmann, Lin Zhao. 233-236 [doi]

Modeling generic dialog applications for embedded systemsGerhard Hanrieder, Stefan W. Hamerich. 237-240 [doi]

A framework for dialogue data collection with a simulated ASR channelMatthew N. Stuttle, Jason D. Williams, Steve Young. 241-244 [doi]

A multi-layer conversation management approach for information seeking applicationsShimei Pan. 245-248 [doi]

A universal speech interface for appliancesThomas K. Harris, Roni Rosenfeld. 249-252 [doi]

Speech understanding, dialogue management and response generation in corpus-based spoken dialogue systemKeita Hayashi, Yuki Irie, Yukiko Yamaguchi, Shigeki Matsubara, Nobuo Kawaguchi. 253-256 [doi]

Implementation of dialog applications in an open-source voiceXML platformFernando Fernandez, Valentín Sama, Luis Fernando D'Haro, Rubén San Segundo, Ricardo de Córdoba, Juan Manuel Montero. 257-260 [doi]

Fuzzy logic decision fusion in a multimodal biometric systemChun Wai Lau, Bin Ma, Helen Mei-Ling Meng, Yiu Sang Moon, Yeung Yam. 261-264 [doi]

A state model for the realization of visual perceptive feedback in SmartKomPeter Poller, Norbert Reithinger. 265-268 [doi]

A vector-based method for efficiently representing multivariate environmental informationAkemi Iida, Yoshito Ueno, Ryohei Matsuura, Kiyoaki Aikawa. 269-272 [doi]

A multi-modal dialog system for a mobile robotIoannis Toptsis, Shuyin Li, Britta Wrede, Gernot A. Fink. 273-276 [doi]

Structured interview-based evaluation of spoken multimodal conversation with H.C. AndersenNiels Ole Bernsen, Laila Dybkjær. 277-280 [doi]

Memory efficient decoding graph compilation with wide cross-word acoustic contextMiroslav Novak, Vladimír Bergl. 281-284 [doi]

Dynamic beam pruning strategy using adaptive controlDongbin Zhang, Limin Du. 285-288 [doi]

Fast on-the-fly composition for weighted finite-state transducers in 1.8 million-word vocabulary continuous speech recognitionTakaaki Hori, Chiori Hori, Yasuhiro Minami. 289-292 [doi]

A hybrid word/phoneme-based approach for improved vocabulary-independent search in spontaneous speechPeng Yu, Frank Torsten Bernd Seide. 293-296 [doi]

Keyword spotting for highly inflectional languagesLubos Smídl, Ludek Müller. 297-300 [doi]

Optimizing an engine network that allows dynamic maskingFrédéric Tendeau. 301-304 [doi]

Topic structure extraction for meeting indexingKatsutoshi Ohtsuki, Nobuaki Hiroshima, Yoshihiko Hayashi, Katsuji Bessho, Shoichi Matsunaga. 305-308 [doi]

Automatic detection of dialog acts based on multilevel informationSophie Rosset, Lori Lamel. 309-312 [doi]

Identifying local corrections in human-computer dialogueGina-Anne Levow. 313-316 [doi]

Hot discussion or frosty dialogue? towards a temperature metric for conversational interactivityPeter Reichl, Florian Hammer. 317-320 [doi]

A dynamic vocabulary spoken dialogue interfaceStephanie Seneff, Chao Wang, I. Lee Hetherington, Grace Chung. 321-324 [doi]

Learning dialogue policies using state aggregation in reinforcement learningMatthias Denecke, Kohji Dohsaka, Mikio Nakano. 325-328 [doi]

A compensation method for word-familiarity difference with SNR control in intelligibility testShuichi Sakamoto, Yôiti Suzuki, Shigeaki Amano, Tadahisa Kondo, Naoki Iwaoka. 333-336 [doi]

Phoneme-based word activation in spoken-word recognition: evidence from Japanese school childrenTakashi Otake, Yoko Sakamoto, Yasuyuki Konomi. 337-340 [doi]

Role of segmental and suprasegmental cues in the perception of Maghrebian-accented FrenchBelynda Brahimi, Philippe Boula de Mareüil, Cédric Gendrot. 341-344 [doi]

Effect of speaking rate on the acceptability of change in segment durationHiroaki Kato, Yoshinori Sagisaka, Minoru Tsuzaki, Makiko Muto. 345-348 [doi]

A cross-linguistic study of diphthongs in spoken word processing in Japanese and EnglishKiyoko Yoneyama. 349-352 [doi]

Speech translation: past, present and futureAlex Waibel. 353-356 [doi]

Multilingual corpora for speech-to-speech translation researchGen-ichiro Kikui, Toshiyuki Takezawa, Seiichi Yamamoto. 357-360 [doi]

Statistical machine translation and its challengesHermann Ney. 361-364 [doi]

Translingual grammar inductionJohn Lee, Stephanie Seneff. 365-368 [doi]

Usability considerations of speech-to-speech translation systemYoungJik Lee, Jun Park, Seung-Shin Oh. 369-372 [doi]

Worldwide ongoing activities on multilingual speech to speech translationGianni Lazzari, Alex Waibel, Chengqing Zong. 373-376 [doi]

The automatic news transcription system: ANTS, some real time experimentsDominique Fohr, Odile Mella, Christophe Cerisara, Irina Illina. 377-380 [doi]

Use of metadata to improve recognition of spontaneous speech and named entitiesBhuvana Ramabhadran, Olivier Siohan, Geoffrey Zweig. 381-384 [doi]

Duration modeling techniques for continuous speech recognitionJanne Pylkkönen, Mikko Kurimo. 385-388 [doi]

Large vocabulary continuous speech recognition for Estonian using morpheme classesTanel Alumäe. 389-392 [doi]

Combining agglomerative and tree-based state clustering for high accuracy acoustic modelingZhaobing Han, Shuwu Zhang, Bo Xu. 393-396 [doi]

Parallel tone score association method for tone language speech recognitionWilliam S.-Y. Wang, Gang Peng. 397-400 [doi]

Effective acoustic modeling for rate-of-speech variation in large vocabulary conversational speech recognitionJing Zheng, Horacio Franco, Andreas Stolcke. 401-404 [doi]

Automatic transcription of continuous speech using unsupervised and incremental trainingG. L. Sarada Ghadiyaram, N. Hemalatha Nagarajan, T. Nagarajan Thangavelu, Hema A. Murthy. 405-408 [doi]

Very large vocabulary speech recognition system for automatic transcription of Czech broadcast programsJan Nouza, Dana Nejedlová, Jindrich Zdánský, Jan Kolorenc. 409-412 [doi]

Speech recognition error analysis on the English MALACH corpusOlivier Siohan, Bhuvana Ramabhadran, Geoffrey Zweig. 413-416 [doi]

A frame level boosting training scheme for acoustic modelingRong Zhang, Alexander I. Rudnicky. 417-420 [doi]

Optimizing boosting with discriminative criteriaRong Zhang, Alexander I. Rudnicky. 421-424 [doi]

Restructuring HMM states for speaker adaptation in Mandarin speech recognitionXianghua Xu, Qiang Guo, Jie Zhu. 425-428 [doi]

A discriminative locally weighted distance measure for speaker independent template based speech recognitionMike Matton, Mathias De Wachter, Dirk Van Compernolle, Ronald Cools. 429-432 [doi]

Deterministic annealing EM algorithm in parameter estimation for acoustic modelYohei Itaya, Heiga Zen, Yoshihiko Nankaku, Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura. 433-436 [doi]

TRAP based features for LVCSR of meeting dataFrantisek Grézl, Martin Karafiát, Jan Cernocký. 437-440 [doi]

Optimal acoustic and language model weights for minimizing word verification errorsFrank K. Soong, Wai Kit Lo, Satoshi Nakamura. 441-444 [doi]

Structuring of baseball live games based on speech recognition using task dependant knowledgeAtsushi Sako, Yasuo Ariki. 445-448 [doi]

A two-level schema for detecting recognition errorsZhengyu Zhou, Helen M. Meng. 449-452 [doi]

Large vocabulary continuous speech recognition based on cross-morpheme phonetic informationIn-Jeong Choi, Nam-Hoon Kim, Su-Youn Yoon. 453-456 [doi]

Automatic phonetic base form generation based on maximum context treeChangxue Ma. 457-460 [doi]

Temporal variables in parkinsonian speechDanielle Duez. 461-464 [doi]

Speaker adaptation of a three-dimensional tongue modelOlov Engwall. 465-468 [doi]

Perception of non-native phonemes in noiseNicole Cooper, Anne Cutler. 469-472 [doi]

Intelligibility of degraded speech from smeared STRAIGHT spectrumHideki Kawahara, Hideki Banno, Toshio Irino, Jiang Jin. 473-476 [doi]

Sound source localization based on zero-crossing peak-amplitude codingYoung Ik Kim, Rhee Man Kil. 477-480 [doi]

Adult and infant sensitivity to phonotactic features in spoken JapaneseKajikawa Sachiyo, Fais Laurel, Shigeaki Amano, Werker Janet. 481-484 [doi]

Revisiting dysarthria assessment intelligibility metricsPhil Green, James Carmichael. 485-488 [doi]

The effect of intonation on perception of Cantonese lexical tonesValter Ciocca, Tara L. Whitehill, Joan K. Y. Ma. 489-492 [doi]

Maximum short quantity in Japanese and Finnish in two perception tests with F0 and dB variantsToshiko Isei-Jaakkola. 493-496 [doi]

Evaluation of an inverse filtering technique using physical modeling of voice productionPaavo Alku, Matti Airas, Brad Story. 497-500 [doi]

Positional and phonotactic effects on the realization of Taiwan Mandarin tone 2Hui-ju Hsu, Janice Fon. 501-504 [doi]

Speech production based on lossy tube models: unit concatenation and sound transitionsKarl Schnell, Arild Lacroix. 505-508 [doi]

Modelling and ranking of differences across formants of British, Australian and American accentsQin Yan, Saeed Vaseghi, Dimitrios Rentzos, Ching-Hsiang Ho. 509-512 [doi]

An experimental method for measuring transfer functions of acoustic tubesTatsuya Kitamura, Satoru Fujita, Kiyoshi Honda, Hironori Nishimoto. 513-516 [doi]

Estimation of the vocal tract spectrum from articulatory movements using phoneme-dependent neural networksTakuya Tsuji, Tokihiko Kaburagi, Kohei Wakamiya, Jiji Kim. 517-520 [doi]

Computation of the acoustic characteristics of vocal-tract models with geometrical perturbationKunitoshi Motoki, Hiroki Matsuzaki. 521-524 [doi]

Analysis of hypernasality by synthesisP. Vijayalakshmi, M. RamasubbaReddy. 525-528 [doi]

Adaptive long-term predictive analysis of disordered speechAbdellah Kacha, Francis Grenez, Frédéric Bettens, Jean Schoentgen. 529-532 [doi]

Phoneme restoration in degraded speech communicationSlobodan Jovicic, Sandra Antesevic, Zoran Saric. 533-536 [doi]

Automatic detection of vocal fold paralysis and edemaMaria Marinaki, Constantine Kotropoulos, Ioannis Pitas, Nikolaos Maglaveras. 537-540 [doi]

Voice enhancement of male speakers with laryngeal neoplasmGernot Kubin, Martin Hagmüller. 541-544 [doi]

A comparison of the perturbation analysis between PRAAT and computerized speech labJong Min Choi, Myung-Whun Sung, Kwang Suk Park, Jeong-Hun Hah. 545-548 [doi]

A theoretical analysis of speech recognition based on feature trajectory modelsYasuhiro Minami, Erik McDermott, Atsushi Nakamura, Shigeru Katagiri. 549-552 [doi]

Discriminative combination of multiple linear predictions for speech recognitionZhijian Ou, Zuoying Wang. 553-556 [doi]

Use of formants in stressed and unstressed continuous speech recognitionDavood Gharavian, Seyed Mohammad Ahadi. 557-560 [doi]

Integration of articulatory dynamic parameters in HMM/BN based speech recognition systemKonstantin Markov, Satoshi Nakamura, Jianwu Dang. 561-564 [doi]

ASR on speech reconstructed from short-time Fourier phase spectraLeigh David Alsteris, Kuldip K. Paliwal. 565-568 [doi]

Estimation of semantic confidences on lattice hierarchiesRobert Lieb, Tibor Fábián, Günther Ruske, Matthias Thomae. 569-572 [doi]

Learning subject drift for topic trackingFumiyo Fukumoto, Yoshimi Suzuki. 573-576 [doi]

The ICSI-SRI-UW metadata extraction systemElizabeth Shriberg, Andreas Stolcke, Dustin Hillard, Mari Ostendorf, Barbara Peskin, Mary P. Harper, Yang Liu. 577-580 [doi]

Automatic detection of contrast for speech understandingMark Hasegawa-Johnson, Stephen E. Levinson, Tong Zhang. 581-584 [doi]

Integrating layer concept information into n-gram modeling for spoken language understandingNick Jui Chang Wang, Jia-lin Shen, Ching-Ho Tsai. 585-588 [doi]

A robust understanding model for spoken dialoguesJunyan Chen, Ji Wu, Zuoying Wang. 589-592 [doi]

Scoring unknown speaker clustering: VB vs. BICFabio Valente, Christian Wellekens. 593-596 [doi]

Speaker segmentation and clustering in meetingsQin Jin, Tanja Schultz. 597-600 [doi]

Speaker diarization from speech transcriptsLori Lamel, Jean-Luc Gauvain, Leonardo Canseco-Rodriguez. 601-604 [doi]

Evolutive speaker segmentation using a repository systemXavier Anguera Miró, Javier Hernando Pericas. 605-608 [doi]

Speaker indexing in audio archives using test utterance Gaussian mixture modelingHagai Aronowitz, David Burshtein, Amihood Amir. 609-612 [doi]

Automated lexical adaptation and speaker clustering based on pronunciation habits for non-native speech recognitionAntoine Raux. 613-616 [doi]

Scalable distributed speech recognition using multi-frame GMM-based block quantizationKuldip K. Paliwal, Stephen So. 617-620 [doi]

Robust speech recognition over packet networks: an overviewNaveen Srinivasamurthy, Kyu Jeong Han, Shrikanth Narayanan. 621-624 [doi]

Theory for speaker recognition over IPThomas Eriksson, Samuel Kim, Hong-Goo Kang, Chungyong Lee. 625-628 [doi]

Voice portal services in packet network and VoIP environmentWu Chou, Feng Liu. 629-632 [doi]

Synchronization of speaker selection for centralized tandem free VoIP conferencingPeter Kabal, Colm Elliott. 633-636 [doi]

Measuring the perceived importance of time- and frequency-divided speech blocks for transmitting over packet networksAkitoshi Kataoka, Yusuke Hiwasaki, Toru Morinaga, Jotaro Ikedo. 637-640 [doi]

Comparison of transmitter-based packet-loss recovery techniques for voice transmissionMoo-young Kim, W. Bastiaan Kleijn. 641-644 [doi]

Context dependent long units for speech recognitionDenis Jouvet, Ronaldo O. Messina. 645-648 [doi]

Rapid EM training based on model-integrationShinichi Yoshizawa, Kiyohiro Shikano. 649-652 [doi]

Experiments on the accuracy of phone models and liaison processing in a French broadcast news transcription systemDominique Fohr, Odile Mella, Irina Illina, Christophe Cerisara. 653-656 [doi]

A statistical discrimination measure for hidden Markov models based on divergenceJorge Silva, Shrikanth Narayanan. 657-660 [doi]

A hybrid SVM/HMM acoustic modeling approach to automatic speech recognitionJan Stadermann, Gerhard Rigoll. 661-664 [doi]

Data driven number-of-states selection in HMM topologiesDirk Knoblauch. 665-668 [doi]

Hybrid model using subspace distribution clustering hidden Markov models and semi-continuous hidden Markov models for embedded speech recognizersYoungkyu Cho, Sung-a Kim, Dongsuk Yook. 669-672 [doi]

Fast clustering of Gaussians and the virtue of representing Gaussians in exponential model formatPeder A. Olsen, Karthik Visweswariah. 673-676 [doi]

Feature-based pronunciation modeling with trainable asynchrony probabilitiesKaren Livescu, James R. Glass. 677-680 [doi]

Maximum entropy direct model as a unified model for acoustic modeling in speech recognitionHong-Kwang Jeff Kuo, Yuqing Gao. 681-684 [doi]

Explicit duration modeling for Cantonese connected-digit recognitionYu Zhu, Tan Lee. 685-688 [doi]

Four-layer categorization scheme of fast GMM computation techniques in large vocabulary continuous speech recognition systemsArthur Chan, Mosur Ravishankar, Alexander I. Rudnicky, Jahanzeb Sherwani. 689-692 [doi]

Compact acoustic model for embedded implementationJunho Park, Hanseok Ko. 693-696 [doi]

Increasing the mixture components of non-uniform HMM structures based on a variational Bayesian approachTakatoshi Jitsuhiro, Satoshi Nakamura. 697-700 [doi]

Comparison of ML, MAP, and VB based acoustic models in large vocabulary speech recognitionPanu Somervuo. 701-704 [doi]

Discriminative training with tied covariance matricesWolfgang Macherey, Ralf Schlüter, Hermann Ney. 705-708 [doi]

Acoustic phonetic modeling using local codebook featuresFrank Diehl, Asunción Moreno. 709-712 [doi]

An efficient codebook design in SDCHMM for mobile communication environmentsGue Jun Jung, Su-Hyun Kim, Yung-Hwan Oh. 713-716 [doi]

Analysis of speaking styles by two-dimensional visualization of aggregate of acoustic modelsMakoto Shozakai, Goshu Nagino. 717-720 [doi]

Context dependent phoneme duration modeling with tree-based state tyingMyoung-Wan Koo, Ho-Hyun Jeon, Sang-Hong Lee. 721-724 [doi]

Chinese prosody phrase break prediction based on maximum entropy modelJian-Feng Li, Guoping Hu, Ren-Hua Wang. 729-732 [doi]

Intonation modeling for Indian languagesKrothapalli Sreenivasa Rao, Bayya Yegnanarayana. 733-736 [doi]

Using multiple linguistic features for Mandarin phrase break prediction in maximum-entropy classification frameworkYu Zheng, Gary Geunbae Lee, Byeongchang Kim. 737 [doi]

Using part-of-speech for predicting phrase breaksIan Read, Stephen Cox. 741-744 [doi]

A proposal to quantitatively select the right intonation unit in data-driven intonation modelingDavid Escudero Mancebo, Valentín Cardeñoso-Payo. 745-748 [doi]

Formulating contextual tonal variations in MandarinJinfu Ni, Hisashi Kawai, Keikichi Hirose. 749-752 [doi]

Automatic adaptation of the momel F0 stylisation algorithm to new corporaSalma Mouline, Olivier Boëffard, Paul C. Bagshaw. 753-756 [doi]

Joint extraction and prediction of Fujisaki's intonation model parametersPablo Daniel Agüero, Klaus Wimmer, Antonio Bonafonte. 757-760 [doi]

Evaluation of corpus based tone prediction in mismatched environments for Greek TTS synthesisPanagiotis Zervas, Nikos Fakotakis, George K. Kokkinakis, Georgios Kouroupetroglou, Gerasimos Xydas. 761-764 [doi]

The duration of pitch transition phase and its relative factorsZiyu Xiong, Juanwen Chen. 765-768 [doi]

Polynomial regression model for duration prediction in MandarinYu Hu, Ren-Hua Wang, Lu Sun. 769-772 [doi]

Prediction of the glottal LF parameters using regression treesMichelle Tooher, John G. McKenna. 773-776 [doi]

Bonntempo-corpus and bonntempo-tools: a database for the study of speech rhythm and rateVolker Dellwo, Bianca Aschenberner, Petra Wagner, Jana Dancovicova, Ingmar Steiner. 777-780 [doi]

Analysis of F0 contours of Cantonese utterances based on the command-response modelWentao Gu, Keikichi Hirose, Hiroya Fujisaki. 781-784 [doi]

Pre-focal rephrasing, focal enhancement and postfocal deaccentuation in FrenchMarion Dohen, Hélène Loevenbruck. 785-788 [doi]

Duration modeling for Hindi text-to-speech synthesis systemSridhar Krishna Nemala, Partha Pratim Talukdar, Kalika Bali, A. G. Ramakrishnan. 789-792 [doi]

A new prosodic phrasing model for Indian language TeluguNemala Sridhar Krishna, Hema A. Murthy. 793-796 [doi]

Evolutionary optimization of an adaptive prosody modelOliver Jokisch, Michael Hofmann. 797-800 [doi]

An intonation model for embedded devices based on natural F0 samplesGerasimos Xydas, Georgios Kouroupetroglou. 801-804 [doi]

Prosodic characteristics of Czech contrastive topicKaterina Vesela, Nino Peterek, Eva Hajicová. 805-808 [doi]

Combination of standard and throat microphones for robust speech recognition in highly noisy environmentsMartin Graciarena, Federico Cesari, Horacio Franco, Gregory K. Myers, Cregg Cowan, Victor Abrash. 809-812 [doi]

Noise robust digit recognition using a glottal radar sensor for voicing detectionCenk Demiroglu, David V. Anderson. 813-816 [doi]

A cepstral domain maximum likelihood beamformer for speech recognitionDominik Raub, John W. McDonough, Matthias Wölfel. 817-820 [doi]

Recognition of three simultaneous utterance of speech by four-line directivity microphone mounted on head of robotNaoya Mochiki, Tetsunori Kobayashi, Toshiyuki Sekiya, Tetsuji Ogawa. 821-824 [doi]

Complex spectrum circle centroid for microphone-array-based noisy speech recognitionShigeki Sagayama, Okajima Takashi, Kamamoto Yutaka, Takuya Nishimoto. 825-828 [doi]

Automatic speech recognition of co-channel speech: integrated speaker and speech recognition approachLarry P. Heck, Mark Mao. 829-832 [doi]

A first experience on multilingual acoustic modeling of the languages spoken in MoroccoJosé B. Mariño, Asunción Moreno, Albino Nogueiras. 833-836 [doi]

Data driven multidialectal phone set for Spanish dialectsMónica Caballero, Asunción Moreno, Albino Nogueiras. 837-840 [doi]

Multilingual e-mail text processing for speech synthesisDaniela Oria, Akos Vetek. 841-844 [doi]

Multi-context rules for phonological processing in polyglot TTS synthesisHarald Romsdorfer, Beat Pfister. 845-848 [doi]

A general approach to TTS reading of mixed-language textsLeonardo Badino, Claudia Barolo, Silvia Quazza. 849-852 [doi]

Context dependent statistical augmentation of Persian transcriptsPanayiotis G. Georgiou, Shrikanth S. Narayanan, Hooman Shirani-Mehr. 853-856 [doi]

A soft decision MMSE amplitude estimator as a noise preprocessor to speech coders using a glottal sensorCenk Demiroglu, David V. Anderson. 857-860 [doi]

Single acoustic-channel speech enhancement based on glottal correlation using non-acoustic sensorRongqiang Hu, David V. Anderson. 861-864 [doi]

In-vehicle based speech processing for hearing impaired subjectsXianxian Zhang, John H. L. Hansen, Kathryn Hoberg Arehart, Jessica Rossi-Katz. 865-868 [doi]

Speech enhancement using adaptive time-domain segmentationSriram Srinivasan, W. Bastiaan Kleijn. 869-872 [doi]

Harmonicity based monaural speech dereverberation with time warping and F0 adaptive windowTomohiro Nakatani, Keisuke Kinoshita, Masato Miyoshi, Parham Zolfaghari. 873-876 [doi]

Dereverberation of speech signals based on linear predictionMarc Delcroix, Takafumi Hikichi, Masato Miyoshi. 877-880 [doi]

Perception of affect in speech - towards an automatic processing of paralinguistic information in spoken conversationNick Campbell. 881-884 [doi]

Analysis of emotional speech in voice mail messages: the influence of speakers' genderNoël Chateau, Valérie Maffiolo, Christophe Blouin. 885-888 [doi]

Emotion recognition based on phoneme classesChul-Min Lee, Serdar Yildirim, Murtaza Bulut, Abe Kazemzadeh, Carlos Busso, Zhigang Deng, Sungbok Lee, Shrikanth Narayanan. 889-892 [doi]

Visualizing dynamic features of expressions in speechPeter Robinson, Tal Sobol Shikler. 893-896 [doi]

Friendly speech analysis and perception in standard ChineseAijun Li, Haibo Wang. 897-900 [doi]

Decomposing linguistic and affective components of phonatory qualityAilbhe Ní Chasaide, Christer Gobl. 901-904 [doi]

Continuous speech recognition using joint features derived from the modified group delay function and MFCCRajesh Mahanand Hegde, Hema A. Murthy, Venkata Ramana Rao Gadde. 905-908 [doi]

Phase-space representation of speechHua Yu. 909-912 [doi]

The modified group delay feature: a new spectral representation of speechHema A. Murthy, Rajesh Mahanand Hegde, Venkata Ramana Rao Gadde. 913-916 [doi]

ICA-based feature extraction for phoneme recognitionOh-Wook Kwon, Te-Won Lee. 917-920 [doi]

On using MLP features in LVCSRQifeng Zhu, Barry Y. Chen, Nelson Morgan, Andreas Stolcke. 921-924 [doi]

Learning long-term temporal features in LVCSR using neural networksBarry Y. Chen, Qifeng Zhu, Nelson Morgan. 925-928 [doi]

Neural spike rate spectrum as a noise robust, speaker invariant feature for automatic speech recognitionT. V. Sreenivas, G. V. Kiran, A. G. Krishna. 929-932 [doi]

An adaptive MEL-LPC analysis for speech recognitionYoshihisa Nakatoh, Makoto Nishizaki, Shinichi Yoshizawa, Maki Yamada. 933-936 [doi]

Improvement in robustness of speech feature extraction method using sub-band based periodicity and aperiodicity decompositionKentaro Ishizuka, Noboru Miyazaki, Tomohiro Nakatani, Yasuhiro Minami. 937-940 [doi]

A new acoustic measure for aspiration noise detectionCarlos Toshinori Ishi. 941-944 [doi]

Synthesizing speech from speech recognition parametersKris Demuynck, Oscar Garcia, Dirk Van Compernolle. 945-948 [doi]

LP-TRAP: linear predictive temporal patternsMarios Athineos, Hynek Hermansky, Daniel P. W. Ellis. 949-952 [doi]

Parallel feature generation based on maximizing normalized acoustic likelihoodXiang Li, Richard M. Stern. 953-956 [doi]

An adaptive band-partitioning spectral entropy based speech detection in realistic noisy environmentsKun-Ching Wang. 957-960 [doi]

Improved voice activity detection combining noise reduction and subband divergence measuresJavier Ramírez, José C. Segura, M. Carmen Benítez, Ángel de la Torre, Antonio J. Rubio. 961-964 [doi]

Voice activity detection using global soft decision with mixture of Gaussian modelKiyoung Park, Changkyu Choi, Jeongsu Kim. 965-968 [doi]

Environmental robust features for speech detectionThomas Kemp, Climent Nadeu, Yin Hay Lam, Josep Maria Sola i Caros. 969-972 [doi]

Crosscorrelation-based multispeaker speech activity detectionKornel Laskowski, Qin Jin, Tanja Schultz. 973-976 [doi]

A quantitative model for formant dynamics and contextually assimilated reduction in fluent speechLi Deng, Yu Dong, Alex Acero. 981-984 [doi]

DWT-based classification of acoustic-phonetic classes and phonetic unitsGernot Kubin, Tuan Van Pham. 985-988 [doi]

Learning nonnegative features of spectro-temporal sounds for classificationYong-Choon Cho, Seungjin Choi. 989-992 [doi]

N-gram language modeling of Japanese using bunsetsu boundariesSungyup Chung, Keikichi Hirose, Nobuaki Minematsu. 993-996 [doi]

Dynamic language modeling for broadcast newsLangzhou Chen, Lori Lamel, Jean-Luc Gauvain, Gilles Adda. 997-1000 [doi]

A unified framework for large vocabulary speech recognition of mutually unintelligible Chinese regionalects Ren-Yuan Lyu, Dau-Cheng Lyu, Min-Siong Liang, Min-Hong Wang, Yuang-Chin Chiang, Chun-Nan Hsu. 1001-1004 [doi]

The influence of target size and distance on the production of speech and gesture in multimodal referring expressionsIelka van der Sluis, Emiel Krahmer. 1005-1008 [doi]

Dynamic time windows for multimodal input fusionAnurag Kumar Gupta, Tasos Anastasakos. 1009-1012 [doi]

MICot: a tool for multimodal input data collectionRaymond H. Lee, Anurag Kumar Gupta. 1013-1016 [doi]

Simulating multimodal applicationsChakib Tadj, Hicham Djenidi, Madjid Haouani, Amar Ramdane-Cherif, Nicole Lévy. 1017-1020 [doi]

A multimodal communication aid for global aphasia patientsJakob Schou Pedersen, Paul Dalsgaard, Børge Lindberg. 1021-1024 [doi]

Mis-recognized utterance detection using hierarchical language modelHirofumi Yamamoto, Gen-ichiro Kikui, Yoshinori Sagisaka. 1025-1028 [doi]

Cross-lingual phoneme mapping for multilingual synthesis systemsMarko Moberg, Kimmo Pärssinen, Juha Iso-Sipilä. 1029-1032 [doi]

Robot motion control using listener's back-channels and head gesture informationKazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno, Tsuyoshi Tasaki, Takeshi Yamaguchi. 1033-1036 [doi]

Indonesian speech recognition for hearing and speaking impaired peopleSakriani Sakti, Arry Akhmad Arman, Satoshi Nakamura, Paulus Hutagaol. 1037-1040 [doi]

A two phase Arabic language model for speech recognition and other language applicationsMohsen Rashwan. 1041-1044 [doi]

Language model adaptation based on PLSA of topics and speakersYuya Akita, Tatsuya Kawahara. 1045-1048 [doi]

Unified language modeling using finite-state transducers with first applicationsHans J. G. A. Dolfing, Pierce Gerard Buckley, David Horowitz. 1049-1052 [doi]

Effects of language modeling on speech-driven question answeringKatsunobu Itou, Atsushi Fujii, Tomoyosi Akiba. 1053-1056 [doi]

Measuring convergence in language model estimation using relative entropyAbhinav Sethy, Shrikanth Narayanan, Bhuvana Ramabhadran. 1057-1060 [doi]

High-level feature weighted GMM network for audio stream classificationRongqing Huang, John H. L. Hansen. 1061-1064 [doi]

An improved preprocessor for the automatic transcription of broadcast news audio streamJindrich Zdánský, Petr David, Jan Nouza. 1065-1068 [doi]

Speaker-and-environment change detection in broadcast news using the common component GMM-based divergence measureYih-Ru Wang, Chi-Han Huang. 1069-1072 [doi]

Beginning of utterance detection algorithm for low complexity ASR enginesTommi Lahti. 1073-1076 [doi]

Convolutional networks for speech detectionSomsak Sukittanon, Arun C. Surendran, John C. Platt, Christopher J. C. Burges. 1077-1080 [doi]

Detection of vowel onset points in continuous speech using autoassociative neural network modelsSuryakanth V. Gangashetty, Chellu Chandra Sekhar, B. Yegnanarayana. 1081-1084 [doi]

Reconstruction filter design for bone-conducted speechToshiki Tamiya, Tetsuya Shimamura. 1085-1088 [doi]

Frequency warped ARMA analysis of the closed and the open phase of voiced speechPedro J. Quintana-Morales, Juan L. Navarro-Mesa. 1089-1092 [doi]

Zeros of z-transform (ZZT) decomposition of speech for source-tract separationBoris Doval, Baris Bozkurt, Christophe d'Alessandro, Thierry Dutoit. 1093-1096 [doi]

Use of neural network mapping and extended Kalman filter to recover vocal tract resonances from the MFCC parameters of speechLi Deng, Roberto Togneri. 1097-1100 [doi]

Graphical model approach to pitch trackingXiao Li, Jonathan Malkin, Jeff Bilmes. 1101-1104 [doi]

A new multicomponent AM-FM demodulation with predicting frequency boundaries and its application to formant estimationBo Xu, Jianhua Tao, Yongguo Kang. 1105-1108 [doi]

From real-time MRI to 3D tongue movementsOlov Engwall. 1109-1112 [doi]

Coarticulatory variability and directionality in [s, ..]: an EPG studyMitsuhiro Nakamura. 1113-1116 [doi]

Flow representation through the glottis having a polygonal boundary shapeYosuke Tanabe, Tokihiko Kaburagi. 1117-1120 [doi]

Analysis of the voice source in different phonation types: simultaneous high-speed imaging of the vocal fold vibration and glottal inverse filteringHannu Pulakka, Paavo Alku, Svante Granqvist, Stellan Hertegard, Hans Larsson, Anne-Maria Laukkanen, Per-Ake Lindestad, Erkki Vilkman. 1121-1124 [doi]

Influence of temporal discretization schemes on formant frequencies and bandwidths in time domain simulations of the vocal tract systemPeter Birkholz, Dietmar Jackel. 1125-1128 [doi]

Acoustic-to-articulatory inversion mapping with Gaussian mixture modelTomoki Toda, Alan W. Black, Keiichi Tokuda. 1129-1132 [doi]

Audio-visual spoken language processingJinyoung Kim, Jeesun Kim, Chris Davis. 1133-1136 [doi]

Issues in the development of auditory-visual speech perception: adults, infants, and childrenKaoru Sekiyama, Denis Burnham. 1137-1140 [doi]

Signaling and detecting uncertainty in audiovisual speech by children and adultsEmiel Krahmer, Marc Swerts. 1141-1144 [doi]

Effect of intensive audiovisual perceptual training on the perception and production of the /l/-/r/ contrast for Japanese learners of EnglishValérie Hazan, Anke Sennema, Andrew Faulkner. 1145-1148 [doi]

Visual recalibration of auditory speech versus selective speech adaptation: different build-up coursesJean Vroomen, Sabine van Linden, Béatrice de Gelder, Paul Bertelson. 1149-1152 [doi]

Off the top of the head: audio-visual speech perception from the nose upChris Davis, Jeesun Kim. 1153-1156 [doi]

Aspects of speaking-face data corpus design methodologyJ. Bruce Millar, Michael Wagner, Roland Goecke. 1157-1160 [doi]

Voice conversion for unknown speakersHui Ye, Steve Young. 1161-1164 [doi]

Domain adaptation methods in the IBM trainable text-to-speech systemVolker Fischer, Jaime Botella Ordinas, Siegfried Kunzmann. 1165-1168 [doi]

Applying pitch connection control in Mandarin speech synthesisYi Zhou, Yiqing Zu, Zhenli Yu, Dongjian Yue, Guilin Chen. 1169-1172 [doi]

A first step towards text-independent voice conversionHermann Ney, David Sündermann, Antonio Bonafonte, Harald Höge. 1173-1176 [doi]

Data pruning approach to unit selection for inventory generation of concatenative embeddable Chinese TTS systemsZhenli Yu, Kaizhi Wang, Yiqing Zu, Dongjian Yue, Guilin Chen. 1177-1180 [doi]

Subjective evaluation of join cost functions used in unit selection speech synthesisJithendra Vepa, Simon King. 1181-1184 [doi]

Constructing emotional speech synthesizers with limited speech databaseHeiga Zen, Tadashi Kitamura, Murtaza Bulut, Shrikanth Narayanan, Ryosuke Tsuzuki, Keiichi Tokuda. 1185-1188 [doi]

A two-phase pitch marking method for TD-PSOLA synthesisCheng-Yuan Lin, Jyh-Shing Roger Jang. 1189-1192 [doi]

Including dynamic and phonetic information in voice conversion systemsAntonio Bonafonte, Alexander Kain, Jan P. H. van Santen, Helenca Duxans. 1193-1196 [doi]

A novel voice conversion system based on codebook mapping with phoneme-tied weightingZixiang Wang, Ren-Hua Wang, Zhiwei Shuang, Zhen-Hua Ling. 1197-1200 [doi]

Compression of speech database by feature separation and pattern clustering using STRAIGHTZhen-Hua Ling, Yu Hu, Zhiwei Shuang, Ren-Hua Wang. 1201-1204 [doi]

Decision-tree backing-off in HMM-based speech synthesisShunsuke Kataoka, Nobuaki Mizutani, Keiichi Tokuda, Tadashi Kitamura. 1205-1208 [doi]

Using a depth-restricted search to reduce delays in unit selectionNobuyuki Nishizawa, Hisashi Kawai. 1209-1212 [doi]

MLLR adaptation for hidden semi-Markov model based speech synthesisJunichi Yamagishi, Takashi Masuko, Takao Kobayashi. 1213-1216 [doi]

Phoxsy: multi-phone segments for unit selection speech synthesisStefan Breuer, Julia Abresch. 1217-1220 [doi]

Perception-guided and phonetic clustering weight tuning based on diphone pairs for unit selection TTSFrancesc Alías, Xavier Llorà, Ignasi Iriondo Sanz, Joan Claudi Socoró, Xavier Sevillano, Lluís Formiga. 1221-1224 [doi]

A voice conversion method based on joint pitch and spectral envelope transformationTaoufik En-Najjary, Olivier Rosec, Thierry Chonavel. 1225-1228 [doi]

Fast GMM-based voice conversion for text-to-speech synthesis systemsTaoufik En-Najjary, Olivier Rosec, Thierry Chonavel. 1229-1232 [doi]

A genetic algorithm for unit selection based speech synthesisRohit Kumar. 1233-1236 [doi]

A memory efficient grapheme-to-phoneme conversion system for speech processingJun Huang, Lex Olorenshaw, Gustavo Hernández Ábrego, Lei Duan. 1237-1240 [doi]

Lexical representation of non-native phonemesMirjam Broersma, K. Marieke Kolkman. 1241-1244 [doi]

A comparative study on the production of inter-stress intervals of English speech by English native speakers and Korean speakersJong-Pyo Lee, Tae-Yeoub Jang. 1245-1248 [doi]

Articulatory correlates of voice qualities of good guys and bad guys in Japanese anime: an MRI studyEmi Zuiki Murano, Mihoko Teshigawara. 1249-1252 [doi]

Effects of phonetic contexts on the duration of phonetic segments in fluent read speechSorin Dusan. 1253-1256 [doi]

A study on nasal coda loss in continuous speechQiang Fang. 1257-1260 [doi]

An improved pair-wise variability index for comparing the timing characteristics of speechHua-Li Jian. 1261-1264 [doi]

An acoustic study of speech rhythm in Taiwan EnglishHua-Li Jian. 1265-1268 [doi]

Language specific phonetic rules: evidence from domain-initial strengtheningSung-A. Kim. 1269-1272 [doi]

Spectral characteristics of the release bursts in Korean alveolar stopsHansang Park. 1273-1276 [doi]

Frequency effects on vowel reduction in three typologically different languages (Dutch, Finnish, Russian)Rob van Son, Olga Bolotova, Louis C. W. Pols, Mietta Lennes. 1277-1280 [doi]

Assessment of non-native phones in anglicisms by German listenersJulia Abresch, Stefan Breuer. 1281-1284 [doi]

Acoustic and prosodic analysis of Japanese vowel-vowel hiatus with laryngeal effectShigeyoshi Kitazawa, Shinya Kiriyama. 1289-1292 [doi]

A cross-linguistic acoustic comparison of unreleased word-final stops: Korean and ThaiKimiko Tsukada. 1293-1296 [doi]

Acoustic correlates of phrase-internal lexical boundaries in DutchTaehong Cho, Elizabeth K. Johnson. 1297-1300 [doi]

Phonotactics vs. phonetic cues in native and non-native listening: Dutch and Korean listeners' perception of Dutch and EnglishTaehong Cho, James M. McQueen. 1301-1304 [doi]

Comparing intonation of two varieties of French using normalized F0 valuesSvetlana Kaminskaia, François Poiré. 1305-1308 [doi]

Phonetic realization of the suffix-suppressed accentual phrase in KoreanMira Oh, Kee-Ho Kim. 1309-1312 [doi]

Spectral moment vs. Bark cepstral analysis of children's word-initial voiceless stopsH. Timothy Bunnell, James B. Polikoff, Jane McNicholas. 1313-1316 [doi]

Pronunciation assessment based upon the compatibility between a learner's pronunciation structure and the target language's lexical structureNobuaki Minematsu. 1317-1320 [doi]

Spread of high tone in Akita JapaneseKenji Yoshida. 1321-1324 [doi]

Classifying emotion in Chinese speech by decomposing prosodic featuresDan-Ning Jiang, Lian-Hong Cai. 1325-1328 [doi]

Detecting user engagement in everyday conversationsChen Yu, Paul M. Aoki, Allison Woodruff. 1329-1332 [doi]

Identifying emotion in speech prosody using acoustical cues of harmonyTakashi X. Fujisawa, Norman D. Cook. 1333-1336 [doi]

Context based emotion detection from text inputJianhua Tao. 1337-1340 [doi]

Complex emotion recognition system for a specific user using SOM based on prosodic featuresAtsushi Iwai, Yoshikazu Yano, Shigeru Okuma. 1341-1344 [doi]

Emotion verification for emotion detection and unknown emotion rejectionHoon-Young Cho, Kaisheng Yao, Te-Won Lee. 1345-1348 [doi]

Improvement in corpus-based generation of F0 contours using generation process model for emotional speech synthesisKeikichi Hirose. 1349-1352 [doi]

Dependency structure analysis and sentence boundary detection in spontaneous JapaneseTatsuya Kawahara, Kiyotaka Uchimoto, Hitoshi Isahara, Kazuya Shitaoka. 1353-1356 [doi]

Statistical feature language modelSalma Jamoussi, David Langlois, Jean-Paul Haton, Kamel Smaïli. 1357-1360 [doi]

Vocabulary and language model adaptation using information retrievalBrigitte Bigi, Yan Huang, Renato de Mori. 1361-1364 [doi]

Word n-gram probability estimation from a Japanese raw corpusShinsuke Mori, Daisuke Takuma. 1365-1368 [doi]

Mining of association patterns for language modelingJen-Tzung Chien, Hung-Ying Chen. 1369-1372 [doi]

On latent semantic language modeling and smoothingJen-Tzung Chien, Meng-Sung Wu, Hua-Jui Peng. 1373-1376 [doi]

Automatic pruning of unit selection speech databases for synthesis without loss of naturalnessRohit Kumar, S. Prahallad Kishore. 1377-1380 [doi]

A database design for a TTS synthesis system using lexical diphonesTanya Lambert, Andrew P. Breen. 1381-1384 [doi]

A family-of-models approach to HMM-based segmentation for unit selection speech synthesisJohn Kominek, Alan W. Black. 1385-1388 [doi]

Mutual-information based segment pre-selection in concatenative text-to-speechWei Zhang, Ling Jin, Xijun Ma. 1389-1392 [doi]

Hidden semi-Markov model based speech synthesisHeiga Zen, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura. 1393-1396 [doi]

DFW-based spectral smoothing for concatenative speech synthesisHartmut R. Pfitzinger. 1397-1400 [doi]

Segmentation and relevance measure for speaker verificationJérôme Louradour, Régine André-Obrecht, Khalid Daoudi. 1401-1404 [doi]

A new nonlinear feature extraction algorithm for speaker verificationMohamed Chetouani, Bruno Gas, Jean-Luc Zarader, Marcos Faúndez-Zanuy. 1405-1408 [doi]

SVM modeling of SNERF-grams for speaker recognitionElizabeth Shriberg, Luciana Ferrer, Anand Venkataraman, Sachin S. Kajarekar. 1409-1412 [doi]

SVM kernel adaptation in speaker classification and verificationPurdy Ho, Pedro J. Moreno. 1413-1416 [doi]

Noise-robust speaker verification using F0 featuresKoji Iwano, Taichi Asami, Sadaoki Furui. 1417-1420 [doi]

Eigen-prosody analysis for robust speaker recognition under mismatch handset environmentZi-He Chen, Yuan-Fu Liao, Yau-Tarng Juang. 1421 [doi]

A trainable prosodic model: learning the contours implementing communicative functions within a superpositional model of intonationGérard Bailly, Bleicke Holm, Véronique Aubergé. 1425-1428 [doi]

Fujisaki model based F0 contours in Vietnamese TTSDung Tien Nguyen, Luong Chi Mai, Bang Kim Vu, Hansjörg Mixdorff, Huy Hoang Ngo. 1429-1432 [doi]

Estimating speaking rate in spontaneous speech from z-scores of pattern durationsKazuyuki Ashimura, Hideki Kashioka, Nick Campbell. 1433-1436 [doi]

A style control technique for HMM-based speech synthesisTakashi Masuko, Takao Kobayashi, Keisuke Miyanaga. 1437-1440 [doi]

Children's emotion recognition in an intelligent tutoring scenarioMark Hasegawa-Johnson, Stephen E. Levinson, Tong Zhang. 1441-1444 [doi]

Use of prosodic features for speech recognitionKeikichi Hirose, Nobuaki Minematsu. 1445-1448 [doi]

Transformation-based error correction for speech-to-text systemsJochen Peters, Christina Drexel. 1449-1452 [doi]

Phone classification in pseudo-euclidean vector spacesAlexander Gutkin, Simon King. 1453-1456 [doi]

Combining linguistic knowledge and acoustic information in automatic pronunciation lexicon generationGrace Chung, Chao Wang, Stephanie Seneff, Edward Filisko, Min Tang. 1457-1460 [doi]

Modeling pronunciation variation using artificial neural networks for English spontaneous speechKen Chen, Mark Hasegawa-Johnson. 1461-1464 [doi]

Foreign-accented speaker-independent speech recognitionStefanie Aalburg, Harald Höge. 1465-1468 [doi]

Non-audible murmur (NAM) speech recognition using a stethoscopic NAM microphonePanikos Heracleous, Yoshitaka Nakajima, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano. 1469-1472 [doi]

Recognition of read and spontaneous children's speech using two new corporaMartin Russell, Shona D'Arcy, Lit Ping Wong. 1473-1476 [doi]

Articulatory feature recognition using dynamic Bayesian networksJoe Frankel, Mirjam Wester, Simon King. 1477-1480 [doi]

Predicting word correct rate from acoustic and linguistic confusabilityGies Bouwman, Bert Cranen, Lou Boves. 1481-1484 [doi]

Disambiguation in determining phonemes of sound-imitation words for environmental sound recognitionKazushi Ishihara, Yuya Hattori, Tomohiro Nakatani, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. 1485-1488 [doi]

Word confusability prediction in automatic speech recognitionJan Anguita, Stephane Peillon, Javier Hernando, Alexandre Bramoulle. 1489-1492 [doi]

Adaptation for soft whisper recognition using a throat microphoneSzu-Chen Stan Jou, Tanja Schultz, Alex Waibel. 1493-1496 [doi]

A statistical lexicon for non-native speech recognitionRainer Gruhn, Konstantin Markov, Satoshi Nakamura. 1497-1500 [doi]

Modeling auxiliary features in tandem systemsMathew Magimai-Doss, Shajith Ikbal, Todd A. Stephenson, Hervé Bourlard. 1501-1504 [doi]

Survey of spontaneous speech phenomena in a multimodal dialogue system and some implications for ASRLouis ten Bosch, Lou Boves. 1505-1508 [doi]

Speech recognition for multiple non-native accent groups with speaker-group-dependent acoustic modelsTobias Cincarek, Rainer Gruhn, Satoshi Nakamura. 1509-1512 [doi]

Coping with disfluencies in spontaneous speech recognitionFrederik Stouten, Jean-Pierre Martens. 1513-1516 [doi]

Speaker model quantization for unsupervised speaker indexingSoonil Kwon, Shrikanth Narayanan. 1517-1520 [doi]

Investigating automatic recognition of non-native children's speechMatteo Gerosa, Diego Giuliani. 1521-1524 [doi]

Using machine learning to cope with imbalanced classes in natural speech: evidence from sentence boundary and disfluency detectionYang Liu, Elizabeth Shriberg, Andreas Stolcke, Mary P. Harper. 1525-1528 [doi]

Hybrid utterance verification based on n-best models and model derived from Kullback-Leibler divergenceMinho Jin, Gyucheol Jang, Sungrack Yun, Chang Dong Yoo. 1529-1532 [doi]

Speech spotter: on-demand speech recognition in human-human conversation on the telephone or in face-to-face situationsMasataka Goto, Koji Kitayama, Katsunobu Itou, Tetsunori Kobayashi. 1533-1536 [doi]

Pronunciation lexicon modeling and design for Korean large vocabulary continuous speech recognitionKyong-Nim Lee, Minhwa Chung. 1537-1540 [doi]

Performance of speech recognition and synthesis in packet-based networksSebastian Möller, Jan Felix Krebber, Alexander Raake. 1541-1544 [doi]

A comparison of packet loss compensation methods and interleaving for speech recognition in burst-like packet lossAlastair Bruce James, Ben P. Milner, Angel Manuel Gomez. 1545-1548 [doi]

An analysis of packet loss models for distributed speech recognitionBen P. Milner, Alastair Bruce James. 1549-1552 [doi]

Multilayer subword units for open-vocabulary spoken document retrievalShi-wook Lee, Kazuyo Tanaka, Yoshiaki Itoh. 1553-1556 [doi]

An efficient partial matching algorithm toward speech retrieval by speechYoshiaki Itoh, Kazuyo Tanaka, Shi-wook Lee. 1557-1560 [doi]

Language detection by neural discriminationCelestin Sedogbo, Sébastien Herry, Bruno Gas, Jean-Luc Zarader. 1561-1564 [doi]

Language identification techniques based on full recognition in an air traffic control taskRicardo de Córdoba, Javier Ferreiros, Valentín Sama, Javier Macías Guarasa, Luis Fernando D'Haro, Fernando Fernandez. 1565-1568 [doi]

Dialect analysis and modeling for automatic classificationJohn H. L. Hansen, Umit H. Yapanel, Rongqing Huang, Ayako Ikeno. 1569-1572 [doi]

Rhythm in read British English: interdialect variabilityEmmanuel Ferragne, François Pellegrino. 1573-1576 [doi]

A grammar-based Chinese to English speech translation system for portable devicesPascale Fung, Yi Liu, Yongsheng Yang, Yihai Shen, Dekai Wu. 1577-1580 [doi]

Cost-sensitive call classificationGökhan Tür. 1581-1584 [doi]

An evaluation of a spoken document retrieval baseline system in FinnishMikko Kurimo, Ville T. Turunen, Inger Ekman. 1585-1588 [doi]

Discriminative training of naive Bayes classifiers for natural language call routingHui Jiang, Pengfei Liu, Imed Zitouni. 1589-1592 [doi]

Phonetic confusion based document expansion for spoken document retrievalNicolas Moreau, Hyoung-Gook Kim, Thomas Sikora. 1593-1596 [doi]

Hybrid named entity recognition for question-answering systemEuisok Chung, Soojong Lim, Yi-Gyu Hwang, Myung-Gil Jang. 1597-1600 [doi]

An online audio indexing systemJitendra Ajmera, Iain McCowan, Hervé Bourlard. 1601-1604 [doi]

Histogram normalisation and the recognition of names and ontology words in the MUMIS projectEric Sanders, Febe de Wet. 1605-1608 [doi]

Improving the topic indexation and segmentation modules of a media watch systemRui Amaral, Isabel Trancoso. 1609-1612 [doi]

Speech timing and rhythmic structure in Arabic dialects: a comparison of two approachesMelissa Barkat-Defradas, Rym Hamdi, Emmanuel Ferragne, François Pellegrino. 1613-1616 [doi]

METRIC-SEQDAC: a hybrid approach for audio segmentationHsin-Min Wang, Shih-Sian Cheng. 1617-1620 [doi]

Statistical Chinese spoken document retrieval using latent topical informationJen-Wei Kuo, Yao-Min Huang, Berlin Chen, Hsin-Min Wang. 1621-1624 [doi]

Keyword recognition and extraction by multiple-LVCSRs with 60,000 words in speech-driven WEB retrieval taskMasahiko Matsushita, Hiromitsu Nishizaki, Seiichi Nakagawa, Takehito Utsuro. 1625-1628 [doi]

Improved spoken language translation using n-best speech recognition hypothesesRuiqiang Zhang, Gen-ichiro Kikui, Hirofumi Yamamoto, Frank K. Soong, Taro Watanabe, Eiichiro Sumita, Wai Kit Lo. 1629-1632 [doi]

Automatic language identification using discrete hidden Markov modelKakeung Wong, Man-Hung Siu. 1633-1636 [doi]

Two-way speech-to-speech translation on handheld devicesBowen Zhou, Daniel Déchelotte, Yuqing Gao. 1637-1640 [doi]

HLT modules scalability within the NESPOLE! projectHervé Blanchon. 1641-1644 [doi]

Why speech recognizers make errors? a robustness viewHong Kook Kim, Mazin G. Rahim. 1645-1648 [doi]

An energy normalization scheme for improved robustness in speech recognitionSeyed Mohammad Ahadi, Hamid Sheikhzadeh, Robert L. Brennan, George Freeman. 1649-1652 [doi]

Rapid on-line environment compensation for server-based speech recognition in noisy mobile environmentsJuan M. Huerta, Etienne Marcheret, Sreeram Balakrishnan. 1653-1656 [doi]

Modeling phones coarticulation effects in a neural network based speech recognition systemLeila Ansary, Seyyed Ali Seyyed Salehi. 1657-1660 [doi]

Error-weighted discriminative training for HMM parameter estimationDaniel Willett. 1661-1664 [doi]

Robust verification of recognized words in noiseWai Kit Lo, Frank K. Soong, Satoshi Nakamura. 1665-1668 [doi]

Pronunciation assessment based upon the phonological distortions observed in language learners' utterancesNobuaki Minematsu. 1669-1672 [doi]

Analysis of the phone level contributions to objective evaluation of English speech by non-nativesYasuo Suzuki, Yoshinori Sagisaka, Katsuhiko Shirai, Makiko Muto. 1673-1676 [doi]

An interactive English pronunciation dictionary for Korean learnersChao Wang, Mitchell Peabody, Stephanie Seneff, Jong-mi Kim. 1677-1680 [doi]

Development of the knowledge-based spoken English evaluation system and its applicationSeok-Chae Rhee, Jeon G. Park. 1681-1684 [doi]

Theory and data in spoken language assessmentJared Bernstein, Isabella Barbier, Elizabeth Rosenfeld, John H. A. L. de Jong. 1685-1688 [doi]

Practical use of English pronunciation system for Japanese students in the CALL classroomTatsuya Kawahara, Masatake Dantsuji, Yasushi Tsubota. 1689-1692 [doi]

Design strategies for a virtual language tutorJonas Beskow, Olov Engwall, Björn Granström, Preben Wik. 1693-1696 [doi]

Dictionary refinements based on phonetic consensus and non-uniform pronunciation reductionGustavo Hernández Ábrego, Lex Olorenshaw, Raquel Tato, Thomas Schaaf. 1697-1700 [doi]

Transcription of Arabic broadcast newsAbdelkhalek Messaoudi, Lori Lamel, Jean-Luc Gauvain. 1701-1704 [doi]

Spontaneous speech recognition using a massively parallel decoderTakahiro Shinozaki, Sadaoki Furui. 1705-1708 [doi]

Issues in meeting transcription - the ISL meeting transcription systemTanja Schultz, Qin Jin, Kornel Laskowski, Yue Pan, Florian Metze, Christian Fügen. 1709-1712 [doi]

Multi-pass ASR using vocabulary expansionKatsutoshi Ohtsuki, Nobuaki Hiroshima, Shoichi Matsunaga, Yoshihiko Hayashi. 1713-1716 [doi]

Pinched lattice minimum Bayes risk discriminative training for large vocabulary continuous speech recognitionVlasios Doumpiotis, William Byrne. 1717-1720 [doi]

Evaluating cognitive load in spoken language interfaces using a dual-task paradigmEllen Campana, Michael K. Tanenhaus, James F. Allen, Roger W. Remington. 1721-1724 [doi]

The voice-logbook: integrating human factors for a chronic care systemLesley-Ann Black, Norman D. Black, Roy Harper, Michelle Lemon, Michael F. McTear. 1725-1728 [doi]

Communicative competence and adaptation in a spoken dialogue systemKristiina Jokinen. 1729-1732 [doi]

Evaluation of the difference between the driving behavior of a speech based and a speech-visual based task of an in-car computerZhan Fu, Lay Ling Pow, Fang Chen. 1733-1736 [doi]

Evaluating system metaphors via the speech output of a smart home systemSebastian Möller, Jan Felix Krebber, Paula M. T. Smeele. 1737-1740 [doi]

Elements of interactivity in telephone conversationsFlorian Hammer, Peter Reichl, Alexander Raake. 1741-1744 [doi]

Triphone-based confidence system for speaker identificationAaron D. Lawson, Mark C. Huggins. 1745-1748 [doi]

Improved model training and automatic weight adjustment for multi-SNR multi-band speaker identification systemKenichi Yoshida, Kazuyuki Takagi, Kazuhiko Ozeki. 1749-1752 [doi]

A new approach to channel robust speaker verification via constrained stochastic feature transformationMan-Wai Mak, Kwok-Kwong Yiu, Ming-Cheung Cheung, Sun-Yuan Kung. 1753-1756 [doi]

Best speaker-based structure tree for speaker verificationChakib Tadj, Christian S. Gargour, Nabil Badri. 1757-1760 [doi]

Robust speaker identification based on perceptual log area ratio and Gaussian mixture modelsDavid Chow, Waleed H. Abdulla. 1761-1764 [doi]

Channel frequency response correction for speaker recognitionStanley J. Wenndt, Richard M. Floyd. 1765-1768 [doi]

Unseen handset mismatch compensation based on a priori knowledge interpolation for robust speaker recognitionYh-Her Yang, Yuan-Fu Liao. 1769-1772 [doi]

A comparison of soft and hard spectral subtraction for speaker verificationMichael T. Padilla, Thomas F. Quatieri. 1773-1776 [doi]

Comparison of several speaker verification procedures based on GMMVlasta Radová, Ales Padrta. 1777-1780 [doi]

Improving performance of text-independent speaker identification by utilizing contextual principal curves filteringYong Guan, Wenju Liu, Hongwei Qi, Jue Wang. 1781-1784 [doi]

Speaker identification using probabilistic PCA model selectionJen-Tzung Chien, Chuan-Wei Ting. 1785-1788 [doi]

Text independent speaker recognition using speaker dependent word spottingHagai Aronowitz, David Burshtein, Amihood Amir. 1789-1792 [doi]

A study on model-based equal error rate estimation for automatic speaker verificationHsiao-Chuan Wang, Jyh-Min Cheng. 1793-1796 [doi]

Probabilistic speaker identification with dual penalized logistic regression machineTomoko Matsui, Kunio Tanabe. 1797-1800 [doi]

Model quality evaluation during enrolment for speaker verificationJavier R. Saeta, Javier Hernando. 1801-1804 [doi]

Real-time speaker identificationPasi Fränti, Evgeny Karpov, Tomi Kinnunen. 1805-1808 [doi]

Multi-codebook vector quantization algorithm for speaker identificationMohammed Abu El-Yazeed, Nemat S. Abdel Kader, Mohammed El-Henawy. 1809-1812 [doi]

Multi-sample fusion with constrained feature transformation for robust speaker verificationMing-Cheung Cheung, Kwok-Kwong Yiu, Man-Wai Mak, Sun-Yuan Kung. 1813-1816 [doi]

Generating gestures from speechRubén San Segundo, Juan Manuel Montero, Javier Macías Guarasa, Ricardo de Córdoba, Javier Ferreiros, José Manuel Pardo. 1817-1820 [doi]

Subtopic segmentation in the lecture speechNoboru Kanedera, Asuka Sumida, Takao Ikehata, Tetsuo Funada. 1821-1824 [doi]

Some articulatory measurements of real sadnessDonna Erickson, Caroline Menezes, Akinori Fujino. 1825-1828 [doi]

Application of voice conversion to hearing-impaired Mandarin speech enhancementChen-Long Lee, Wen-Whei Chang, Yuan-Chuan Chiang. 1829-1832 [doi]

A Japanese dialogue-based CALL system with mispronunciation and grammar error detectionOh Pyo Kweon, Akinori Ito, Motoyuki Suzuki, Shozo Makino. 1833-1836 [doi]

Statistics-based direction finding for training vowelsCheolwoo Jo, Ilsuh Bak. 1837-1840 [doi]

Reference marking in children's computer-directed speech: an integrated analysis of discourse and gesturesSimona Montanari, Serdar Yildirim, Elaine Andersen, Shrikanth Narayanan. 1841-1844 [doi]

What makes a non-native accent?: a study of Korean EnglishJong-mi Kim, Suzanne Flynn. 1845-1848 [doi]

Study on emotional speech features in Korean with its application to voice color conversionSang-Jin Kim, Kwang-Ki Kim, Minsoo Hahn. 1849-1852 [doi]

Developmental changes in voiced-segment ratio for Japanese infants and parentsShigeaki Amano, Tomohiro Nakatani, Tadahisa Kondo. 1853-1856 [doi]

Implementation of an intonational quality assessment system for a handheld deviceKisun You, Hoyoun Kim, Wonyong Sung. 1857-1860 [doi]

Characterizing and classifying cued speech vowels from labial parametersDenis Beautemps, Thomas Burger, Laurent Girin. 1861-1864 [doi]

Cough detection in spoken dialogue system for home health careShinya Takahashi, Tsuyoshi Morimoto, Sakashi Maeda, Naoyuki Tsuruta. 1865-1868 [doi]

A prosodic phrasing model for a Korean text-to-speech synthesis systemKyuchul Yoon. 1873-1876 [doi]

A comparison of statistical methods and features for the prediction of prosodic structuresQin Shi, Volker Fischer. 1877-1880 [doi]

Letter-to-sound for small-footprint multilingual TTS engineGui-Lin Chen, Ke-Song Han. 1881-1884 [doi]

Grapheme-to-phoneme conversion for Chinese text-to-speechJun Xu, Guohong Fu, Haizhou Li. 1885-1888 [doi]

XML representation languages as a way of interconnecting TTS modulesMarc Schröder, Stefan Breuer. 1889-1892 [doi]

Approach to interchange-format based Chinese generationWenjie Cao, Chengqing Zong, Bo Xu. 1893-1896 [doi]

Prosodic analysis of a multi-style corpus in the perspective of emotional speech synthesisEnrico Zovato, Stefano Sandri, Silvia Quazza, Leonardo Badino. 1897-1900 [doi]

Synthesis of vowels and tones in Thai language by articulatory modelingThanate Khaorapapong, Montri Karnjanadecha, Keerati Inthavisas. 1909-1912 [doi]

Source-filter separation for articulation-to-speech synthesisYoshinori Shiga, Simon King. 1913-1916 [doi]

Long vowel detection for letter-to-sound conversion for Japanese sourced words transliterated into the alphabetHisako Asano, Hideharu Nakajima, Hideyuki Mizuno, Oku Masahiro. 1917-1920 [doi]

Inexactness and robustness in cepstral-to-formant transformation of spoken and sung vowelsFrantz Clermont, Thomas John Millhouse. 1921-1924 [doi]

Analysis of acoustic features affecting singing-ness and its application to singing-voice synthesis from speaking-voiceTakeshi Saitou, Naoya Tsuji, Masashi Unoki, Masato Akagi. 1925-1928 [doi]

Statistical corpus-based speech segmentationVincent Pollet, Geert Coorman. 1929-1932 [doi]

Recent improvements on ARTIC: czech text-to-speech systemJindrich Matousek, Jan Romportl, Daniel Tihelka, Zbynek Tychtl. 1933-1936 [doi]

Learning for transliteration of arabic-numeral expressions using decision tree for Korean TTSHyeonSook Nam, Youngim Jung, Donghun Lee, Hyuk-Chul Kwon, Ae-sun Yoon. 1937-1940 [doi]

How to integrate phonetic and linguistic knowledge in a text-to-phoneme conversion task: a syllabic TPC tool for FrenchNicole Beringer. 1941-1944 [doi]

Task-specific minimum Bayes-risk decoding using learned edit distanceIzhak Shafran, William Byrne. 1945-1948 [doi]

Apply n-best list re-ranking to acoustic model combinations of boosting trainingRong Zhang, Alexander I. Rudnicky. 1949-1952 [doi]

Using VTLN for broadcast news transcriptionDo Yeong Kim, S. Umesh, M. J. F. Gales, Thomas Hain, Philip C. Woodland. 1953-1956 [doi]

From switchboard to meetings: development of the 2004 ICSI-SRI-UW meeting recognition systemAndreas Stolcke, Chuck Wooters, Ivan Bulyko, Martin Graciarena, Scott Otterson, Barbara Peskin, Mari Ostendorf, David Gelbart, Nikki Mirghafori, Tuomo W. Pirinen. 1957-1960 [doi]

An efficient repair procedure for quick transcriptionsAnand Venkataraman, Andreas Stolcke, Wen Wang, Dimitra Vergyri, Jing Zheng, Venkata Ramana Rao Gadde. 1961-1964 [doi]

Tone information as a confidence measure for improving Cantonese LVCSRYao Qian, Tan Lee, Frank K. Soong. 1965-1968 [doi]

Unsupervised learning from users' error correction in speech dictationDong Yu, Mei-Yuh Hwang, Peter Mau, Alex Acero, Li Deng. 1969-1972 [doi]

Robustness aspects of active learning for acoustic modelingGerard G. L. Meyer, Teresa M. Kamm. 1973-1976 [doi]

Task adaptation of acoustic and language models based on large quantities of dataKarthik Visweswariah, Ramesh A. Gopinath, Vaibhava Goel. 1977-1980 [doi]

Unsupervised language model adaptation methods for spontaneous speechLuc Lussier, Edward W. D. Whittaker, Sadaoki Furui. 1981-1984 [doi]

On-line incremental adaptation based on reinforcement learning for robust speech recognitionMasafumi Nishida, Yoshitaka Mamiya, Yasuo Horiuchi, Akira Ichikawa. 1985-1988 [doi]

Unsupervised speaker adaptation using high confidence portion recognition results by multiple recognition systemsTomohiro Watanabe, Hiromitsu Nishizaki, Takehito Utsuro, Seiichi Nakagawa. 1989-1992 [doi]

Speech coding using trajectory compression and multiple sensorsSorin Dusan, James L. Flanagan, Amod Karve, Mridul Balaraman. 1993-1996 [doi]

How sparse can we make the auditory representation of speech?Christian Feldbauer, Gernot Kubin. 1997-2000 [doi]

Efficient sub-optimal temporal decomposition with dynamic weighting of speech signals for coding applicationsDavid Malah, Slava Shechtman. 2001-2004 [doi]

Perceptual wavelet packet audio coderTeddy Surya Gunawan, Eliathamby Ambikairajah, Julien Epps. 2005-2008 [doi]

Performance analysis of transcoding algorithms in packet-loss environmentsSung-Kyo Jung, Hong-Goo Kang, Dae Hee Youn, Chang-Heon Lee. 2009-2012 [doi]

Speech quality estimation using Gaussian mixture modelsTiago H. Falk, Wai-Yip Chan, Peter Kabal. 2013-2016 [doi]

Modeling audio-visual speech perception: back on fusion architectures and fusion controlJean-Luc Schwartz, Marie-Agnès Cathiard. 2017-2020 [doi]

Neurocognition of speech-specific audiovisual perceptionMikko Sams, Ville Ojanen, Jyrki Tuomainen, Vasily Klucharev. 2021-2024 [doi]

Target practice on talking facesAdriano Vilela Barbosa, Eric Vatikiotis-Bateson, Andreas Daffertshofer. 2025-2028 [doi]

Audiovisual perceptual evaluation of resynthesised speech movementsMatthias Odisio, Gérard Bailly. 2029-2032 [doi]

Video-realistic synthetic speech with a parametric visual speech synthesizerSascha Fagel. 2033-2036 [doi]

Mutual information based visual feature selection for lipreadingPatricia Scanlon, Gerasimos Potamianos, Vit Libal, Stephen M. Chu. 2037-2040 [doi]

Robust automatic speech recognition using an optimal spectral amplitude estimator algorithm in low-SNR car environmentsZili Li, Hesham Tolba, Douglas D. O'Shaughnessy. 2041-2044 [doi]

Robust speech recognition using data-driven temporal filters based on independent component analysisJunhui Zhao, Jingming Kuang, Xiang Xie. 2045-2048 [doi]

Robust distant speech recognition based on position dependent CMNNorihide Kitaoka, Longbiao Wang, Seiichi Nakagawa. 2049-2052 [doi]

Robust speech recognition based on HMM composition and modified wiener filterSumitaka Sakauchi, Yoshikazu Yamaguchi, Satoshi Takahashi, Satoshi Kobashikawa. 2053-2056 [doi]

Feature-dependent compensation in speech recognitionIvan Brito, Néstor Becerra Yoma, Carlos Molina. 2057-2060 [doi]

Using context to correct phone recognition errorsStephen Cox. 2061-2064 [doi]

Improved histogram-based feature compensation for robust speech recognition and unsupervised speaker adaptationYasunari Obuchi. 2065-2068 [doi]

Weighting observation vectors for robust speech recognition in noisy environmentsZhenyu Xiong, Thomas Fang Zheng, Wenhu Wu. 2069-2072 [doi]

Hands-free speech recognition using blind source separation post-processed by two-stage spectral subtractionMasanori Tsujikawa, Ken-ichi Iso. 2073-2076 [doi]

Robust speech recognition with spectral subtraction in low SNRRandy Gomez, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano. 2077-2080 [doi]

Active perception: using a priori knowledge from clean speech models to ignore non-target featuresBert Cranen, Johan de Veth. 2081-2084 [doi]

Spectral subtraction with full-wave rectification and likelihood controlled instantaneous noise estimation for robust speech recognitionHaitian Xu, Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg. 2085-2088 [doi]

Using linear interpolation to improve histogram equalization for speech recognitionFilip Korkmazsky, Dominique Fohr, Irina Illina. 2089-2092 [doi]

A factorial HMM approach to robust isolated digit recognition in background musicMark Hasegawa-Johnson, Ameya Deoras. 2093-2096 [doi]

Multi-eigenspace normalization for robust speech recognition in noisy environmentsYoonjae Lee, Hanseok Ko. 2097-2100 [doi]

Exploiting models' intrinsic robustness for noisy speech recognitionChristophe Cerisara, Dominique Fohr, Odile Mella, Irina Illina. 2101-2104 [doi]

Speech recognition experiments with the SPEECON database using several robust front-endsPere Pujol, Jaume Padrell, Climent Nadeu, Dusan Macho. 2105-2108 [doi]

Spectro-temporal activity pattern (STAP) features for noise robust ASRShajith Ikbal, Mathew Magimai-Doss, Hemant Misra, Hervé Bourlard. 2109-2112 [doi]

Improvement of confidence measure performance using background model set algorithmByoung-Don Kim, Jin Young Kim, Seung Ho Choi, Young-Bum Lee, Kyoung-Rok Lee. 2113-2116 [doi]

Using RASTA in task independent TANDEM feature extractionGuillermo Aradilla, John Dines, Sunil Sivadas. 2117-2120 [doi]

A distributed speech recognition system in multi-user environmentsKyu Jeong Han, Shrikanth Narayanan, Naveen Srinivasamurthy. 2121-2124 [doi]

Soft features for improved distributed speech recognition over wireless networksReinhold Haeb-Umbach, Valentin Ion. 2125-2128 [doi]

Belief-based nonlinear rescoring in Thai speech understandingChai Wutiwiwatchai, Sadaoki Furui. 2129-2133 [doi]

An understanding strategy based on plausibility score in recognition history using CSR confidence measureToshihiko Itoh, Atsuhiko Kai, Yukihiro Itoh, Tatsuhiro Konishi. 2133-2136 [doi]

Speech recognition error correction using maximum entropy language modelSangkeun Jung, Minwoo Jeong, Gary Geunbae Lee. 2137-2140 [doi]

Discriminative training of compound-word based multinomial classifiers for speech routingXiang Li, Juan M. Huerta. 2141-2144 [doi]

An information extraction approach for spoken language understandingJihyun Eun, Changki Lee, Gary Geunbae Lee. 2145-2148 [doi]

A maximum entropy shallow functional parser for spoken language understandingDavid Horowitz, Partha Lal, Pierce Gerard Buckley. 2149-2152 [doi]

Mixture language models for call routingQiang Huang, Stephen J. Cox. 2153-2156 [doi]

Speech act identification using an ontology-based partial pattern treeChung-Hsien Wu, Jui-Feng Yeh, Ming-Jun Chen. 2157-2160 [doi]

Creating speech recognition grammars from regular expressions for alphanumeric conceptsYe-Yi Wang, Yun-Cheng Ju. 2161-2164 [doi]

Poetry assistantIsabel Trancoso, Paulo Araújo, Céu Viana, Nuno J. Mamede. 2165-2168 [doi]

Automatic extraction of key sentences from oral presentations using statistical measure based on discourse markersTasuku Kitade, Tatsuya Kawahara, Hiroaki Nanjo. 2169-2172 [doi]

Robust dependency parsing of spontaneous Japanese speech and its evaluationTomohiro Ohno, Shigeki Matsubara, Nobuo Kawaguchi, Yasuyoshi Inagaki. 2173-2176 [doi]

Strategies for optimizing a stochastic spoken natural language parserWolfgang Minker, Dirk Bühler, Christiane Beuschel. 2177-2180 [doi]

Prolongation in spontaneous MandarinTzu-Lun Lee, Ya-Fang He, Yun-Ju Huang, Shu-Chuan Tseng, Robert Eklund. 2181-2184 [doi]

Speech intention understanding based on decision tree learningYuki Irie, Shigeki Matsubara, Nobuo Kawaguchi, Yukiko Yamaguchi, Yasuyoshi Inagaki. 2185-2188 [doi]

Using simple speech-based features to detect the state of a meeting and the roles of the meeting participantsSatanjeev Banerjee, Alexander I. Rudnicky. 2189-2192 [doi]

An acoustic study of emotions expressed in speechSerdar Yildirim, Murtaza Bulut, Chul-Min Lee, Abe Kazemzadeh, Zhigang Deng, Sungbok Lee, Shrikanth Narayanan, Carlos Busso. 2193-2196 [doi]

Topic classification and verification modeling for out-of-domain utterance detectionTatsuya Kawahara, Ian Richard Lane, Tomoko Matsui, Satoshi Nakamura. 2197-2200 [doi]

Partially lexicalized parsing model utilizing rich featuresSo-Young Park, Yong-Jae Kwak, Joon-Ho Lim, Hae-Chang Rim, Soo Hong Kim. 2201-2204 [doi]

Clustering similar nouns for selecting related news articlesYoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi. 2205-2208 [doi]

Chinese text word-segmentation considering semantic links among sentencesLeonardo Badino. 2209-2212 [doi]

Syllable-based probabilistic morphological analysis model of KoreanDo-Gil Lee, Hae-Chang Rim. 2213-2216 [doi]

Evaluation of the speech output of a smart-home system in a car environmentPaula M. T. Smeele, Sebastian Möller, Jan Felix Krebber. 2221-2225 [doi]

How does the integration of speech recognition controls and spatialized auditory displays affect user workload?Ellen Haas. 2225-2228 [doi]

Speech interaction system - how to increase its usability?Fang Chen. 2229-2232 [doi]

Human language acquisition methods in a machine learning taskNicole Beringer. 2233-2236 [doi]

Conditional maximum likelihood estimation for improving annotation performance of n-gram models incorporating stochastic finite state grammarsVaibhava Goel. 2237-2241 [doi]

Morphology-based language modeling for arabic speech recognitionDimitra Vergyri, Katrin Kirchhoff, Kevin Duh, Andreas Stolcke. 2245-2248 [doi]

Speech enhanced multi-Span language modelA. Nayeemulla Khan, B. Yegnanarayana. 2249-2252 [doi]

Neural network language models for conversational speech recognitionHolger Schwenk, Jean-Luc Gauvain. 2253-2256 [doi]

A PLSA-based language model for conversational telephone speechDavid Mrva, Philip C. Woodland. 2257-2260 [doi]

New challenges in usability evaluation - beyond task-oriented spoken dialogue systemsLaila Dybkjær, Niels Ole Bernsen, Wolfgang Minker. 2261-2264 [doi]

Using quick transcriptions to improve conversational speech modelsOwen Kimball, Chia-Lin Kao, Rukmini Iyer, Teodoro Arvizo, John Makhoul. 2265-2268 [doi]

A wizard of oz framework for collecting spoken human-computer dialogsRohit Mishra, Elizabeth Shriberg, Sandra Upson, Joyce Chen, Fuliang Weng, Stanley Peters, Lawrence Cavedon, John Niekrasz, Hua Cheng, Harry Bratt. 2269-2272 [doi]

Subjective evaluation of spoken dialogue systems using SERVQUAL methodMikko Hartikainen, Esa-Pekka Salonen, Markku Turunen. 2273-2276 [doi]

Fiction database for emotion detection in abnormal situationsIoana Vasilescu, Laurence Devillers, Chloé Clavel, Thibaut Ehrette. 2277-2280 [doi]

Fast semi-automatic semantic annotation for spoken dialog systemsRuhi Sarikaya, Yuqing Gao, Paola Virga. 2281-2284 [doi]

Modeling data entry rates for ASR and alternative input methodsRoger K. Moore. 2285-2288 [doi]

Speech recognition using synchronization between speech and finger tappingHiromitsu Ban, Chiyomi Miyajima, Katsunobu Itou, Fumitada Itakura, Kazuya Takeda. 2289-2292 [doi]

Integration patterns during multimodal interactionAnurag Kumar Gupta, Tasos Anastasakos. 2293-2296 [doi]

Efficient likelihood computation in multi-stream HMM based audio-visual speech recognitionEtienne Marcheret, Stephen M. Chu, Vaibhava Goel, Gerasimos Potamianos. 2297-2300 [doi]

Separation of multiple concurrent speeches using audio-visual speaker localization and minimum variance beam-formingChangkyu Choi, Donggeon Kong, Hyoung-Ki Lee, Sang Min Yoon. 2301-2304 [doi]

Multimodal expression for humanoid robots by integration of human speech mimicking and facial colorTokitomo Ariyoshi, Kazuhiro Nakadai, Hiroshi Tsujino. 2305-2308 [doi]

Towards large vocabulary ASR on embedded platformsMiroslav Novak. 2309-2312 [doi]

Analysis of in-car speech recognition experiments using a large-scale multi-mode dialogue corpusHiroshi Fujimura, Katsunobu Itou, Kazuya Takeda, Fumitada Itakura. 2313-2316 [doi]

On the integration of speech recognition into personal networksZheng-Hua Tan, Paul Dalsgaard, Børge Lindberg. 2317-2320 [doi]

Robust speech recognition in client-server scenariosRichard C. Rose, Hong Kook Kim. 2321-2324 [doi]

Memory and computation reduction for embedded ASR systemsSangbae Jeong, Icksang Han, Eugene Jon, Jeongsu Kim. 2325-2328 [doi]

Speaker diarization using bottom-up clustering based on a parameter-derived distance between adapted GMMsMichael Betser, Frédéric Bimbot, Mathieu Ben, Guillaume Gravier. 2329-2332 [doi]

Time-frequency analysis of vocal source signal for speaker recognitionNengheng Zheng, P. C. Ching, Tan Lee. 2333-2336 [doi]

A novel method for two-speaker segmentationRashmi Gangadharaiah, Balakrishnan Narayanaswamy, Narayanaswamy Balakrishnan. 2337-2340 [doi]

Throat microphone signal for speaker recognitionBayya Yegnanarayana, A. Shahina, M. R. Kesheorey. 2341-2344 [doi]

Posteriori probabilities and likelihoods combination for speech and speaker recognitionMohamed Faouzi BenZeghiba, Hervé Bourlard. 2345-2348 [doi]

The use of typical sequences for robust speaker identificationMohamed Mihoubi, Douglas D. O'Shaughnessy, Pierre Dumouchel. 2349-2352 [doi]

Mixture Gaussian model training against impostor model parameters: an application to speaker identificationT. V. Sreenivas, Sameer Badaskar. 2357-2360 [doi]

Jacobian adaptation with improved noise reference for speaker verificationJan Anguita, Javier Hernando, Alberto Abad. 2361-2364 [doi]

Objective wavelet packet features for speaker verificationMihalis Siafarikas, Todor Ganchev, Nikos Fakotakis. 2365-2368 [doi]

Policy analysis framework for conversational biometricsUpendra V. Chaudhari, Ganesh N. Ramaswamy. 2369-2372 [doi]

A new score normalization method for speaker verification with virtual impostor modelWoo-Yong Choi, Jung Gon Kim, Hyung Soon Kim, Sung Bum Pan. 2373-2376 [doi]

On the time variability of vocal tract for speaker recognitionSamuel Kim, Thomas Eriksson, Hong-Goo Kang. 2377-2380 [doi]

Distributed speaker recognitionVeena Desai, Hema A. Murthy. 2381-2384 [doi]

Cluster-dependent modeling and confidence measure processing for in-set/out-of-set speaker identificationPongtep Angkititrakul, Sepideh Baghaii, John H. L. Hansen. 2385-2388 [doi]

Distributed speaker recognition using earth mover's distanceYoshiyuki Umeda, Shingo Kuroiwa, Satoru Tsuge, Fuji Ren. 2389-2392 [doi]

A forensically-motivated tool for selecting cepstrally-consistent steady-states from non-contemporaneous vowel utterancesMichael Barlow, Mehrdad Khodai-Joopari, Frantz Clermont. 2393-2396 [doi]

Scoring and direct methods for the interpretation of evidence in forensic speaker recognitionAnil Alexander, Andrzej Drygajlo. 2397-2400 [doi]

Efficient online cohort selection method for speaker verificationTomi Kinnunen, Evgeny Karpov, Pasi Fränti. 2401-2404 [doi]

A concurrent curve strategy for formant trackingYves Laprie. 2405-2408 [doi]

A formant tracking LP model for speech processingQin Yan, Esfandiar Zavarehei, Saeed Vaseghi, Dimitrios Rentzos. 2409-2412 [doi]

Application of long-term filtering to formant estimationHong You. 2413-2416 [doi]

A method for glottal formant frequency estimationBaris Bozkurt, Thierry Dutoit, Boris Doval, Christophe d'Alessandro. 2417-2420 [doi]

Improved differential phase spectrum processing for formant trackingBaris Bozkurt, Thierry Dutoit, Boris Doval, Christophe d'Alessandro. 2421-2424 [doi]

MAP prediction of pitch from MFCC vectors for speech reconstructionXu Shao, Ben P. Milner. 2425-2428 [doi]

New harmonicity measures for pitch estimation and voice activity detectionAn-Tze Yu, Hsiao-Chuan Wang. 2429-2432 [doi]

Multi-pitch trajectory estimation of concurrent speech based on harmonic GMM and nonlinear kalman filteringTakuya Nishimoto, Shigeki Sagayama, Hirokazu Kameoka. 2433-2436 [doi]

Automatic pitch marking and reconstruction of glottal closure instants from noisy and deformed electro-glotto-graph signalsAttila Ferencz, Jeongsu Kim, Yong-Beom Lee, Jae-Won Lee. 2437-2440 [doi]

On the use of a weighted autocorrelation based fundamental frequency estimation for a multidimensional speech inputFederico Flego, Luca Armani, Maurizio Omologo. 2441-2444 [doi]

A minimum mean squared error estimator for single channel speaker separationAarthi M. Reddy, Bhiksha Raj. 2445-2448 [doi]

Audio source separation from the mixture using empirical mode decomposition with independent subspace analysisMd. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu. 2449-2452 [doi]

Audio watermarking in sub-band signals using multiple echo kernelsIn-Jung Oh, Hyun-Yeol Chung, Jae-Won Cho, Ho-Youl Jung, Rémy Prost. 2453-2456 [doi]

A piecewise interpolation method based on log-least square error criterion for HRTFJie Zhang, Zhenyang Wu. 2457-2460 [doi]

Modified realizable frequency warped ARMA modeling and its application in synthesis structures for voiced speechJuan L. Navarro-Mesa, Pedro J. Quintana-Morales. 2461-2464 [doi]

Time-scaling of speech using independent subspace analysisR. Muralishankar, A. G. Ramakrishnan, Lakshmish N. Kaushik. 2465-2468 [doi]

Long term modeling of phase trajectories within the speech sinusoidal model frameworkLaurent Girin, Mohammad Firouzmand, Sylvain Marchand. 2469-2472 [doi]

An acoustic shock limiting algorithm using time and frequency domain speech featuresTina Soltani, Dave Hermann, Etienne Cornu, Hamid Sheikhzadeh, Robert L. Brennan. 2473-2476 [doi]

Speech probability distribution based on generalized gamma distributionJong Won Shin, Joon-Hyuk Chang, Nam Soo Kim. 2477-2480 [doi]

Stop consonant classification by dynamic formant trajectoryYanli Zheng, Mark Hasegawa-Johnson, Sarah Borys. 2481-2484 [doi]

Estimating detailed spectral envelopes using articulatory clusteringYoshinori Shiga, Simon King. 2485-2488 [doi]

AVICAR: audio-visual speech corpus in a car environmentBowon Lee, Mark Hasegawa-Johnson, Camille Goudeseune, Suketu Kamdar, Sarah Borys, Ming Liu, Thomas S. Huang. 2489-2492 [doi]

Adaptive classifier cascade for multimodal speaker identificationEngin Erzin, Yucel Yemez, A. Murat Tekalp. 2493-2496 [doi]

Use of visual cues in the perception of a labial/labiodental contrast by Spanish-L1 and Japanese-L1 learners of EnglishMidori Iba, Anke Sennema, Valérie Hazan, Andrew Faulkner. 2497-2500 [doi]

Audio-visual speaker localization for car navigation systemsXianxian Zhang, Kazuya Takeda, John H. L. Hansen, Toshiki Maeno. 2501-2504 [doi]

Automatic lips reading for audio-visual speech processing and recognitionJosef Chaloupka. 2505-2508 [doi]

Liveness verification in audio-video authenticationMichael Wagner, Girija Chetty. 2509-2512 [doi]

Speech recognition using motion based lipreadingMaria José Sanchez Martinez, Juan Pablo de la Cruz Gutiérrez. 2513-2516 [doi]

Comparative study of linear and non-linear models for viseme inversion: modeling of a cortical associative functionFrédéric Berthommier. 2517-2520 [doi]

3d lip-tracking for audio-visual speech recognition in real applicationsPetr Císar, Zdenek Krnoul, Milos Zelezný. 2521-2524 [doi]

The audio-video australian English speech data corpus AVOZESJ. Bruce Millar, Roland Goecke. 2525-2528 [doi]

Correcting Korean vowel speech recognition errors with limited lip featuresKi-Hyung Hong, Yong-Ju Lee, Jae-Young Suh, Kyong-Nim Lee. 2529-2532 [doi]

Segmental differences in the visual contribution to speech intelligibilityKuniko Y. Nielsen. 2533-2536 [doi]

Canonicalization of feature parameters for automatic speech recognitionTakashi Fukuda, Tsuneo Nitta. 2537-2540 [doi]

On binary and ratio time-frequency masks for robust speech recognitionSoundararajan Srinivasan, Nicoleta Roman, DeLiang Wang. 2541-2544 [doi]

New features based on multiple word graphs for utterance verificationAlberto Sanchís, Alfons Juan, Enrique Vidal. 2545-2548 [doi]

Combination of speech features using smoothed heteroscedastic linear discriminant analysisLukas Burget. 2549-2552 [doi]

Entropy based combination of tandem representations for noise robust ASRShajith Ikbal, Hemant Misra, Sunil Sivadas, Hynek Hermansky, Hervé Bourlard. 2553-2556 [doi]

Fast speech adaptation in linear spectral domain for additive and convolutional noiseDongsuk Yook, Donghyun Kim. 2557-2560 [doi]

Reconciling pronunciation differences between the front-end and the back-end in the IBM speech synthesis systemWael Hamza, Ellen Eide, Raimo Bakis. 2561-2564 [doi]

High quality text-to-pinyin conversion using two-phase unknown word predictionJuhong Ha, Yu Zheng, Gary Geunbae Lee, Yoon-Suk Seong, Byeongchang Kim. 2565-2568 [doi]

Pronunciation lexicon adaptation for TTS voice buildingYeon-Jun Kim, Ann K. Syrdal, Alistair Conkie. 2569-2572 [doi]

Improving letter-to-pronunciation accuracy with automatic morphologically-based stress predictionGabriel Webster. 2573-2576 [doi]

The IBM expressive speech synthesis systemWael Hamza, Ellen Eide, Raimo Bakis, Michael Picheny, John F. Pitrelli. 2577-2580 [doi]

What concept-to-speech can gain for prosodyMarkus Schnell, Rüdiger Hoffmann. 2581-2584 [doi]

Statistical model migration in speaker recognitionJiri Navratil, Ganesh N. Ramaswamy, Ran D. Zilca. 2585-2588 [doi]

Latent semantic analysis for speaker recognitionA. Nayeemulla Khan, Bayya Yegnanarayana. 2589-2592 [doi]

Model-based sequential organization for cochannel speaker identificationYang Shao, DeLiang Wang. 2593-2596 [doi]

Articulatory feature-based conditional pronunciation modeling for speaker verificationKa-Yee Leung, Man-Wai Mak, Sun-Yuan Kung. 2597-2600 [doi]

A comparison of normalization and training approaches for ASR-dependent speaker identificationAlex Park, Timothy J. Hazen. 2601-2604 [doi]

New background modeling for speaker verificationDat Tran. 2605-2608 [doi]

The MIT finite-state transducer toolkit for speech and language processingI. Lee Hetherington. 2609-2612 [doi]

Question-answering in webtalk: an evaluation studyJunlan Feng, Srinivas Bangalore, Mazin G. Rahim. 2613-2616 [doi]

Automatic network optimization of voice applicationsJuan M. Huerta, Chaitanya Ekanadham. 2617-2620 [doi]

Voicebuilder: a framework for automatic speech application developmentMiguel Angel Rodriguez-Moreno, Heriberto Cuayáhuitl, Juventino Montiel-Hernández. 2621-2624 [doi]

On the development of telephone applications: some practical issues and evaluationAndrea Facco, Daniele Falavigna, Roberto Gretter, Marcello Viganò. 2625-2628 [doi]

The GEMINI platform: semi-automatic generation of dialogue applicationsStefan W. Hamerich, Volker Schless, Basilis Kladis, Volker Schubert, Otilia Kocsis, Stefan Igel, Ricardo de Córdoba, Luis Fernando D'Haro, José Manuel Pardo. 2629-2632 [doi]

A packet loss concealment method using recursive linear predictionKazuhiro Kondo, Kiyoshi Nakagawa. 2633-2636 [doi]

On a n-gram model approach for packet loss concealmentMinkyu Lee, Imed Zitouni, Qiru Zhou. 2637-2640 [doi]

Efficient vector quantisation of line spectral frequencies using the switched split vector quantiserStephen So, Kuldip K. Paliwal. 2641-2644 [doi]

Enhancement of reverberant speech using excitation source informationM. Chaitanya, S. R. Mahadeva Prasanna, B. Yegnanarayana. 2645-2648 [doi]

Improving automatic speech recognition performance and speech intelligibility with harmonicity based dereverberationKeisuke Kinoshita, Tomohiro Nakatani, Masato Miyoshi. 2649-2652 [doi]

Inner product based-multiband vector quantization for wideband speech coding at 16 kbpsSeung Yeol Lee, Nam Soo Kim, Joon-Hyuk Chang. 2653-2656 [doi]

Speech enhancement and recognition by integrating adaptive beamforming and wiener filteringAlberto Abad, Javier Hernando. 2657-2660 [doi]

Temporal normalization techniques for transform-type speech coding and application to split-band wideband codersKyung Tae Kim, Sung-Kyo Jung, MiSuk Lee, Hong-Goo Kang, Dae Hee Youn. 2661-2664 [doi]

Interface for barge-in free spoken dialogue system using adaptive sound field controlTatsunori Asai, Shigeki Miyabe, Hiroshi Saruwatari, Kiyohiro Shikano. 2665-2668 [doi]

Multi-mode harmonic transform excitation LPC coding for speech and musicJong-Hark Kim, Jae-Hyun Shin, Insung Lee. 2669-2672 [doi]

Source separation using particle filtersMital Gandhi, Mark Hasegawa-Johnson. 2673-2676 [doi]

Segmental speech coding model for storage applicationsAnssi Rämö, Jani Nurminen, Sakari Himanen, Ari Heikkinen. 2677-2680 [doi]

Improved speech enhancement by applying time-shift property of DFT on hankel matrices for signal subspace decompositionGwo-Hwa Ju, Lin-Shan Lee. 2681-2684 [doi]

Minimum phase compensation in speech coding using hammerstein modelJari Juhani Turunen, Juha T. Tanttu, Frank Cameron. 2685-2688 [doi]

Optimizing regression for in-car speech recognition using multiple distributed microphonesWeifeng Li, Fumitada Itakura, Kazuya Takeda. 2689-2692 [doi]

Speech enhancement based on magnitude estimation using the gamma priorWeifeng Li, Kazuya Takeda, Fumitada Itakura, Tran Huy Dat. 2693-2696 [doi]

Unscented kalman filtering of line spectral frequenciesAndrew Errity, John McKenna, Stephen Isard. 2697-2700 [doi]

Speech enhancement based on smoothing of spectral noise floorHyoung-Gook Kim, Thomas Sikora. 2701-2704 [doi]

Noise reduction using hybrid noise estimation technique and post-filteringJunfeng Li, Masato Akagi. 2705-2708 [doi]

An adaptive kalman filter for the enhancement of speech signalsMarcel Gabrea. 2709-2712 [doi]

Improved iterative wiener filtering for non-stationary noise speech enhancementT. V. Sreenivas, K. Sharath Rao, A. Sreenivasa Murthy. 2713-2716 [doi]

Highband spectrum envelope estimation of telephone speech using hard/soft-classificationYasheng Qian, Peter Kabal. 2717-2720 [doi]

A study on automatic detection of Japanese vowel devoicing for speech synthesisYi-Jian Wu, Hisashi Kawai, Jinfu Ni, Ren-Hua Wang. 2721-2724 [doi]

Orientel-turkish: telephone speech database description and notes on the experienceTolga Çiloglu, Dinc Acar, Ahmet Tokatli. 2725-2728 [doi]

Intertranscriber reliability of prosodic labeling on telephone conversation using toBITaejin Yoon, Sandra Chavarria, Jennifer Cole, Mark Hasegawa-Johnson. 2729-2732 [doi]

Efficient compression method for pronunciation dictionariesJilei Tian. 2733-2736 [doi]

Construct a multi-lingual speech corpus in taiwan with extracting phonetically balanced articlesMin-Siong Liang, Dau-Cheng Lyu, Yuang-Chin Chiang, Ren-Yuan Lyu. 2737-2740 [doi]

Automatic prosody labeling of read norwegianPer Olav Heggtveit, Jon Emil Natvig. 2741-2744 [doi]

Towards automatic word segmentation of dialect speechEric Sanders, Andrea Diersen, Willy Jongenburger, Helmer Strik. 2745-2748 [doi]

New nonsense syllables database - analyses and preliminary ASR experimentsPetr Fousek, Frantisek Grézl, Hynek Hermansky, Petr Svojanovsky. 2749-2752 [doi]

Speech input and output module assessment for remote access to a smart-home spoken dialog systemJan Felix Krebber, Sebastian Möller, Alexander Raake. 2753-2756 [doi]

An implement of speech DB gathering system using voiceXMLDong-hyun Kim, Yong-Wan Roh, Kwang-Seok Hong. 2757-2760 [doi]

Precise phone boundary detection using wavelet packet and recurrent neural networksFarshad Almasganj. 2761-2764 [doi]

From WER and RIL to MER and WIL: improved evaluation measures for connected speech recognitionAndrew Cameron Morris, Viktoria Maier, Phil Green. 2765-2768 [doi]

Design and construction of Korean-spoken English corpusSeok-Chae Rhee, Sook-Hyang Lee, Young-Ju Lee, Seok-Keun Kang. 2769-2772 [doi]

Exploring XML-based technologies and procedures for quality evaluation from a real-life case perspectiveFolkert de Vriend, Giulio Maltese. 2773-2776 [doi]

Spoken language interface in ECMA/ISO telecommunication standardsKuansan Wang. 2777-2780 [doi]

The efficient generation of pronunciation dictionaries: machine learning factors during bootstrappingMarelie H. Davel, Etienne Barnard. 2781-2784 [doi]

Towards a new level of annotation detail of multilingual speech corporaAnja Geumann. 2785-2788 [doi]

CIAIR in-car speech databaseNobuo Kawaguchi, Shigeki Matsubara, Yukiko Yamaguchi, Kazuya Takeda, Fumitada Itakura. 2789-2792 [doi]

Investigating speech style specific pronunciation variation in large spoken language corporaChristophe Van Bael, Henk van den Heuvel, Helmer Strik. 2793-2796 [doi]

The efficient generation of pronunciation dictionaries: human factors during bootstrappingMarelie H. Davel, Etienne Barnard. 2797-2800 [doi]

Hidden factor dynamic Bayesian networks for speech recognitionFilip Korkmazsky, Murat Deviren, Dominique Fohr, Irina Illina. 2801-2804 [doi]

Design of compact acoustic models through clustering of tied-covariance GaussiansMark Mao, Vincent Vanhoucke. 2805-2808 [doi]

Model composition by lagrange polynomial approximation for robust speech recognition in noisy environmentChandra Kant Raut, Takuya Nishimoto, Shigeki Sagayama. 2809-2812 [doi]

A study of minimum classification error training for segmental switching linear Gaussian hidden Markov modelsJian Wu, Donglai Zhu, Qiang Huo. 2813-2816 [doi]

Speech recognition system robust to noise and speaking stylesShigeki Matsuda, Takatoshi Jitsuhiro, Konstantin Markov, Satoshi Nakamura. 2817-2820 [doi]

The stochastic weighted Viterbi algorithm: a frame work to compensate additive noise and low-bit rate coding distortionNéstor Becerra Yoma, Ivan Brito, Carlos Molina. 2821-2824 [doi]

Shaping spoken input in user-initiative systemsStefanie Tomko, Roni Rosenfeld. 2825-2828 [doi]

Etiology of user experience with natural language speechChristopher J. Pavlovski, Jennifer C. Lai, Stella Mitchell. 2829-2832 [doi]

Side effect free dialogue management in a voice enabled procedure browserManny Rayner, Beth Ann Hockey. 2833-2836 [doi]

Example-based training of dialogue planning incorporating user and situation modelsIan Richard Lane, Tatsuya Kawahara, Shinichi Ueno. 2837-2840 [doi]

Prosody based attitude recognition with feature selection and its application to spoken dialog system as para-linguistic informationShinya Fujie, Tetsunori Kobayashi, Daizo Yagi, Hideaki Kikuchi. 2841-2844 [doi]

MS connect: a fully featured auto-attendant: system design, implementation and performanceDavid Ollason, Yun-Cheng Ju, Siddharth Bhatia, Daniel Herron, Jackie Liu. 2845-2848 [doi]

Adaptive beamforming combined with particle filtering for acoustic source localizationReinhold Haeb-Umbach, Sven Peschke, Ernst Warsitz. 2849-2852 [doi]

Time delay estimation using weighted CPSP functionHong-Seok Kwon, Siho Kim, Keun-Sung Bae. 2853-2856 [doi]

DOA estimation of speech signals using semi-blind source separation techniquesIlyas Potamitis, Panagiotis Zervas, Nikos Fakotakis. 2857-2860 [doi]

Blind separation of speech and sub-Gaussian signals in underdetermined caseSang-Gyun Kim, Chang D. Yoo. 2861-2864 [doi]

Adaptive cross-channel interference cancellation on blind signal separation outputs using source absence/presence detection and spectral subtractionGil-Jin Jang, Changkyu Choi, Yongbeom Lee, Yung-Hwan Oh. 2865-2868 [doi]

A comparison of simultaneous 3-channel blind source separation to selective separation on channel pairs using 2-channel BSSErik M. Visser, Kwokleung Chan, Stanley Kim, Te-Won Lee. 2869-2872 [doi]

Towards a grammar of spoken language - prosody of ill-formed utterances and listener's understanding in discourse -Miyoko Sugito. 2877-2880 [doi]

Automatic transformation of lecture transcription into document style using statistical frameworkTatsuya Kawahara, Kazuya Shitaoka, Hiroaki Nanjo. 2881-2884 [doi]

Automatic extraction of phonetically rich sentences from large text corpus of Indian languagesKarunesh Arora, Sunita Arora, Kapil Verma, Shyam Sunder Agrawal. 2885-2888 [doi]

European initiatives to promote cooperation between speech and text communitiesNicoletta Calzolari. 2889-2892 [doi]

Speaker normalization through constrained MLLR based transformsDiego Giuliani, Matteo Gerosa, Fabio Brugnara. 2893-2896 [doi]

Multi-layer structure MLLR adaptation algorithm with subspace regression classes and tyingXiangyu Mu, Shuwu Zhang, Bo Xu. 2897-2900 [doi]

Adaptation in the pronunciation space for non-native speech recognitionGeorg Stemmer, Stefan Steidl, Christian Hacker, Elmar Nöth. 2901-2904 [doi]

Robust ASR model adaptation by feature-based statistical data mappingXuechuan Wang, Douglas D. O'Shaughnessy. 2905-2908 [doi]

A novel target-driven generalized JMAP adaptation algorithmZhaobing Han, Shuwu Zhang, Bo Xu. 2909-2912 [doi]

Speedup of kernel eigenvoice speaker adaptation by embedded kernel PCABrian Mak, Simon Ho, James T. Kwok. 2913-2916 [doi]

Maximum a posteriori eigenvoice speaker adaptation for Korean connected digit recognitionHyung Bae Jeon, Dong Kook Kim. 2917-2920 [doi]

Vocal tract normalization based on spectral warpingWei Wang, Stephen Zahorian. 2921-2924 [doi]

Acoustic model adaptation for coded speech using synthetic speechKoji Tanaka, Fuji Ren, Shingo Kuroiwa, Satoru Tsuge. 2925-2928 [doi]

Speaker adaptation method for CALL system using bilingual speakers' utterancesMotoyuki Suzuki, Hirokazu Ogasawara, Akinori Ito, Yuichi Ohkawa, Shozo Makino. 2929-2932 [doi]

Acoustic model adaptation based on coarse/fine training of transfer vectors and its application to a speaker adaptation taskShinji Watanabe. 2933-2936 [doi]

Speaker clustering of speech utterances using a voice characteristic reference spaceWei-Ho Tsai, Shih-Sian Cheng, Hsin-Min Wang. 2937-2940 [doi]

Performance improvement of connected digit recognition using unsupervised fast speaker adaptationYoung-Kuk Kim, Hwa Jeon Song, Hyung Soon Kim. 2941-2944 [doi]

Simultaneous estimation of weights of eigenvoices and bias compensation vector for rapid speaker adaptationHyung Soon Kim, Hwa Jeon Song. 2945-2948 [doi]

Speaker dependent model order selection of spectral envelopesMatthias Wölfel. 2949-2952 [doi]

Methods for task adaptation of acoustic models with limited transcribed in-domain dataEnrico Bocchieri, Michael Riley, Murat Saraclar. 2953-2956 [doi]

Unsupervised topic adaptation for lecture speech retrievalAtsushi Fujii, Tetsuya Ishikawa, Katsunobu Itou, Tomoyosi Akiba. 2957-2960 [doi]

Mean and covariance adaptation based on minimum classification error linear regression for continuous density HMMsHaibin Liu, Zhenyang Wu. 2961-2964 [doi]

Design of ready-made acoustic model library by two-dimensional visualization of acoustic spaceGoshu Nagino, Makoto Shozakai. 2965-2968 [doi]

Evaluation of a threshold for detecting local slower phrases in Japanese spontaneous conversational speechKeiichi Takamaru. 2969-2972 [doi]

Intonation recognition for Indonesian speech based on Fujisaki modelNazrul Effendy, Ekkarit Maneenoi, Patavee Charnvivit, Somchai Jitapunkul. 2973-2976 [doi]

Efficient tone classification of speaker independent continuous Chinese speech using anchoring based discriminating featuresJin-Song Zhang, Satoshi Nakamura, Keikichi Hirose. 2977-2980 [doi]

Clause types and filled pauses in Japanese spontaneous monologuesMichiko Watanabe, Yasuharu Den, Keikichi Hirose, Nobuaki Minematsu. 2981-2984 [doi]

Effect of voice prosody on the decision making process in human-computer interactionYohei Yabuta, Yasuhiro Katagiri, Noriko Suzuki, Yugo Takeuchi. 2985-2988 [doi]

Alignment of human prosodic patterns for spoken dialogue systemsNoriko Suzuki, Yasuhiro Katagiri. 2989-2992 [doi]

Evaluation of a prosodic labeling system utilizing linguistic informationShinya Kiriyama, Shigeyoshi Kitazawa. 2993-2996 [doi]

Functions of intonation boundaries during spoken language comprehension in EnglishAllison Blodgett. 2997-3000 [doi]

Voice activation using prosodic featuresMarco Kühne, Matthias Wolff, Matthias Eichner, Rüdiger Hoffmann. 3001-3004 [doi]

The role of prosodic cues in word segmentation of KoreanSahyang Kim. 3005-3008 [doi]

Default phrasing and attachment preference in KoreanSun-Ah Jun. 3009-3012 [doi]

Modeling and recognition of phonetic and prosodic factors for improvements to acoustic speech recognition modelsSarah Borys, Aaron Cohen, Mark Hasegawa-Johnson, Jennifer Cole. 3013-3016 [doi]

The role of pitch range variation in the discourse structure and intonation structure of KoreanEunjong Kong. 3017-3020 [doi]

Dependency analysis of read Japanese sentences using pause and F0 information: a speaker independent caseKazuyuki Takagi, Kazuhiko Ozeki. 3021-3024 [doi]

Effects of prosodic boundaries on ambiguous syntactic clause boundaries in JapaneseShari R. Speer, Soyoung Kang. 3025-3028 [doi]

The superior effectiveness of the F0 range for identifying the context from sounds without phonemesYasuko Nagasaki, Takanori Komatsu. 3029-3032 [doi]

A study of tone classification for continuous Thai speech recognitionTan Li, Montri Karnjanadecha, Thanate Khaorapapong. 3033-3036 [doi]

Estimating syntactic structure from prosodic features in Japanese speechTomoko Ohsuga, Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa. 3041-3044 [doi]

Perceptual discrimination of prosodic types and their preliminary acoustic analysisMasahiko Komatsu, Tsutomu Sugawara, Takayuki Arai. 3045-3048 [doi]

DORIS, a multiagent/IP platform for multimodal dialogue applicationsJohann L'Hour, Olivier Boëffard, Jacques Siroux, Laurent Miclet, Francis Charpentier, Thierry Moudenc. 3049-3052 [doi]

EVITA-RAD: an extensible enterprise voice porTAl - rapid application development toolYu Chen. 3053-3056 [doi]

Strategies to reduce design time in multimodal/multilingual dialog applicationsLuis Fernando D'Haro, Ricardo de Córdoba, Rubén San Segundo, Juan Manuel Montero, Javier Macías Guarasa, José Manuel Pardo. 3057-3060 [doi]

Three-way system-user-expert interactions help you expand the capabilities of an existing spoken dialogue systemGregory Aist. 3061-3064 [doi]

Florence: a dialogue manager framework for spoken dialogue systemsGiuseppe Di Fabbrizio, Charles Lewis. 3065-3068 [doi]

Recent progress of open-source LVCSR engine Julius and Japanese model repositoryTatsuya Kawahara, Akinobu Lee, Kazuya Takeda, Katsunobu Itou, Kiyohiro Shikano. 3069-3072 [doi]

Example-based spoken dialogue system with online example augmentationHiroya Murao, Nobuo Kawaguchi, Shigeki Matsubara, Yukiko Yamaguchi, Kazuya Takeda, Yasuyoshi Inagaki. 3073-3076 [doi]

Robust and adaptive architecture for multilingual spoken dialogue systemsMarkku Turunen, Esa-Pekka Salonen, Mikko Hartikainen, Jaakko Hakulinen. 3081-3084 [doi]

Towards ubiquitous task managementPorfírio P. Filipe, Nuno J. Mamede. 3085-3088 [doi]
