INTERSPEECH 2005 - Eurospeech, 9th European Conference on Speech Communication and Technology, Lisbon, Portugal, September 4-8, 2005

INTERSPEECH 2005 - Eurospeech, 9th European Conference on Speech Communication and Technology, Lisbon, Portugal, September 4-8, 2005. ISCA, 2005.

Conference: interspeech2005

The multiple-channel cochlear implant: interfacing electronic technology to human consciousnessGraeme M. Clark. 1-4 [doi]

Dynamic language model adaptation using variational Bayes inferenceYik-Cheung Tam, Tanja Schultz. 5-8 [doi]

The hidden vector state language modelVidura Seneviratne, Steve Young. 9-12 [doi]

Class-based variable memory length Markov modelShinsuke Mori, Gakuto Kurata. 13-16 [doi]

Context-sensitive statistical language modelingAlexander Gruenstein, Chao Wang, Stephanie Seneff. 17-20 [doi]

Language model data filtering via user simulation and dialogue resynthesisChao Wang, Stephanie Seneff, Grace Chung. 21-24 [doi]

Bayesian learning for latent semantic analysisJen-Tzung Chien, Meng-Sung Wu, Chia-Sheng Wu. 25-28 [doi]

The effect of stress and boundaries on segmental duration in a corpus of authentic speech (British English)Daniel Hirst, Caroline Bouzon. 29-32 [doi]

Investigation of the relationship between turn-taking and prosodic features in spontaneous dialogueTomoko Ohsuga, Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa. 33-36 [doi]

Filled pauses as cues to the complexity of following phrasesMichiko Watanabe, Keikichi Hirose, Yasuharu Den, Nobuaki Minematsu. 37-40 [doi]

Perceptual magnet effect in German boundary tonesKatrin Schneider, Bernd Möbius. 41-44 [doi]

Constraints on the acquisition of simplex and complex words in GermanAngela Grimm, Jochen Trommer. 45-48 [doi]

Whistled speech: a natural phonetic description of languages adapted to human perception and to the acoustical environmentJulien Meyer. 49-52 [doi]

Fast vocabulary-independent audio search using path-based graph indexingOlivier Siohan, Michiel Bacchiani. 53-56 [doi]

The effects of speech recognition and punctuation on information extraction performanceJohn Makhoul, Alex Baron, Ivan Bulyko, Long Nguyen, Lance A. Ramshaw, David Stallard, Richard M. Schwartz, Bing Xiang. 57-60 [doi]

Indexing uncertainty for spoken document searchCiprian Chelba, Alex Acero. 61-64 [doi]

Exploiting passage retrieval for n-best rescoring of spoken questionsTomoyosi Akiba, Hiroyuki Abe. 65-68 [doi]

Multi-stage compaction approach to broadcast news summarisationBalaKrishna Kolluru, Heidi Christensen, Yoshihiko Gotoh. 69-72 [doi]

Audio-video summarization of TV news using speech recognition and shot change detectionChien-Lin Huang, Chia-Hsin Hsieh, Chung-Hsien Wu. 73-76 [doi]

The Blizzard Challenge - 2005: evaluating corpus-based speech synthesis on common datasetsAlan W. Black, Keiichi Tokuda. 77-80 [doi]

A probabilistic approach to unit selection for corpus-based speech synthesisShinsuke Sakai, Han Shu. 81-84 [doi]

The Blizzard Challenge 2005 CMU entry - a method for improving speech synthesis systemsJohn Kominek, Christina L. Bennett, Brian Langner, Arthur R. Toth. 85-88 [doi]

Automatic personal synthetic voice constructionH. Timothy Bunnell, Christopher A. Pennington, Debra Yarrington, John Gray. 89-92 [doi]

An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005Heiga Zen, Tomoki Toda. 93-96 [doi]

On building a concatenative speech synthesis system from the Blizzard Challenge speech databasesWael Hamza, Raimo Bakis, Zhiwei Shuang, Heiga Zen. 97-100 [doi]

Multisyn voices from ARCTIC data for the Blizzard ChallengeRobert A. J. Clark, Korin Richmond, Simon King. 101-104 [doi]

Large scale evaluation of corpus-based synthesizers: results and lessons from the Blizzard Challenge 2005Christina L. Bennett. 105-108 [doi]

Speech retrieval of Mandarin broadcast news via mobile devicesBerlin Chen, Yi-Ting Chen, Chih-Hao Chang, Hung-Bin Chen. 109-112 [doi]

State estimation of meetings by information fusion using Bayesian networkMichiaki Katoh, Kiyoshi Yamamoto, Jun Ogata, Takashi Yoshimura, Futoshi Asano, Hideki Asoh, Nobuhiko Kitawaki. 113-116 [doi]

Results from a survey of attendees at ASRU 1997 and 2003Roger K. Moore. 117-120 [doi]

Speech processing in the networked home environment - a view on the Amigo projectReinhold Haeb-Umbach, Basilis Kladis, Joerg Schmalenstroeer. 121-124 [doi]

Fixed distortion segmentation in efficient sound segment searchingMasahide Sugiyama. 125-128 [doi]

Identifying singers of popular songsTin Lay Nwe, Haizhou Li. 129-132 [doi]

Speech repair: quick error correction just by using selection operation for speech input interfacesJun Ogata, Masataka Goto. 133-136 [doi]

Steerable highly directional audio beam loudspeakerDirk Olszewski, Fransiskus Prasetyo, Klaus Linhard. 137-140 [doi]

Automatic music genre classification using second-order statistical measures for the prescriptive approachHassan Ezzaidi, Jean Rouat. 141-144 [doi]

Effect of head orientation on the speaker localization performance in smart-room environmentAlberto Abad, Dusan Macho, Carlos Segura, Javier Hernando, Climent Nadeu. 145-148 [doi]

Application of automatic speaker recognition techniques to pathological voice assessment (dysphonia)Corinne Fredouille, Gilles Pouchoulin, Jean-François Bonastre, M. Azzarello, Antoine Giovanni, Alain Ghio. 149-152 [doi]

Adaptive speech analytics: system, infrastructure, and behaviorUpendra V. Chaudhari, Ganesh N. Ramaswamy, Eddie Epstein, Sasha Caskey, Mohamed Kamal Omar. 153-156 [doi]

Correlating student acoustic-prosodic profiles with student learning in spoken tutoring dialoguesKatherine Forbes-Riley, Diane J. Litman. 157-160 [doi]

Speech recognition performance and learning in spoken dialogue tutoringDiane J. Litman, Katherine Forbes-Riley. 161-164 [doi]

Structural representation of the non-native pronunciationsSatoshi Asakawa, Nobuaki Minematsu, Toshiko Isei-Jaakkola, Keikichi Hirose. 165-168 [doi]

Ya-ya language box - a portable device for English pronunciation training with speech recognition technologiesFu-Chiang Chou. 169-172 [doi]

Pronunciation error detection method based on error rule clustering using a decision treeAkinori Ito, Yen-Ling Lim, Motoyuki Suzuki, Shozo Makino. 173-176 [doi]

Modeling and automating detection of errors in Arabic language learner speechAbhinav Sethy, Shrikanth Narayanan, Nicolaus Mote, W. Lewis Johnson. 177-180 [doi]

Effects of F0 feedback on the learning of Chinese tones by native speakers of EnglishFelicia Zhang, Michael Wagner. 181-184 [doi]

Voice-controlled internet browsing for motor-handicapped users. design and implementation issuesTom Brøndsted, Erik Aaskoven. 185-188 [doi]

Creating an ongoing research capability in speech technology for two minority languages: experiences from the WISPR projectBriony Williams, Delyth Prys, Ailbhe Ní Chasaide. 189-192 [doi]

Speech operated smart-home control system for users with special needsAnestis Vovos, Basilis Kladis, Nikolaos D. Fakotakis. 193-196 [doi]

Spoken dialog system and its evaluation of geographic information system for elderly persons mobility supportTakatoshi Jitsuhiro, Shigeki Matsuda, Yutaka Ashikari, Satoshi Nakamura, Ikuko Eguchi Yairi, Seiji Igi. 197-200 [doi]

A frame based spoken dialog system for home careDaniele Falavigna, Toni Giorgino, Roberto Gretter. 201-204 [doi]

Frame based model order selection of spectral envelopesMatthias Wölfel. 205-208 [doi]

On variable-scale piecewise stationary spectral analysis of speech signals for ASRVivek Tyagi, Christian Wellekens, Hervé Bourlard. 209-212 [doi]

Efficient pitch-based estimation of VTLN warp factorsArlo Faria, David Gelbart. 213-216 [doi]

Accent detection and speech recognition for Shanghai-accented MandarinYanli Zheng, Richard Sproat, Liang Gu, Izhak Shafran, Haolang Zhou, Yi Su, Daniel Jurafsky, Rebecca Starr, Su-Youn Yoon. 217-220 [doi]

Variability of automatic speech recognition systems using different featuresLoic Barrault, Renato de Mori, Roberto Gemello, Franco Mana, Driss Matrouf. 221-224 [doi]

Crosslingual and bilingual speech recognition with Slovak and Czech SpeechDat-E databasesSlavomír Lihan, Jozef Juhar, Anton Cizmar. 225-228 [doi]

Automatic data selection for MLP-based feature extraction for ASRCarmen Peláez-Moreno, Qifeng Zhu, Barry Y. Chen, Nelson Morgan. 229-232 [doi]

Rapid porting of ASR-systems to mobile devicesThilo Köhler, Christian Fügen, Sebastian Stüker, Alex Waibel. 233-236 [doi]

A stream-based audio segmentation, classification and clustering pre-processing system for broadcast news using ANN modelsHugo Meinedo, João Paulo Neto. 237-240 [doi]

Speech activity detection fusing acoustic phonetic and energy featuresEtienne Marcheret, Karthik Visweswariah, Gerasimos Potamianos. 241-244 [doi]

Robust voice activity detection based on the entropy of noise-suppressed spectrumZoltán Tüske, Péter Mihajlik, Zoltán Tobler, Tibor Fegyó. 245-248 [doi]

Multiple moving speaker tracking by microphone array on mobile robotMasamitsu Murase, Shun'ichi Yamamoto, Jean-Marc Valin, Kazuhiro Nakadai, Kentaro Yamada, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. 249-252 [doi]

A speaker biased SI recognizer for embedded mobile applicationsYaxin Zhang, Bian Wu, Xiaolin Ren, Xin He. 253-256 [doi]

Fast unsupervised speaker adaptation through a discriminative eigen-MLLR algorithmBart Bakker, Carsten Meyer, Xavier L. Aubert. 257-260 [doi]

Incremental largest margin linear regression and MAP adaptation for speech separation in telemedicine applicationsRusheng Hu, Jian Xue, Yunxin Zhao. 261-264 [doi]

Applying vocal tract length normalization to meeting recordingsGiulia Garau, Steve Renals, Thomas Hain. 265-268 [doi]

Implementing frequency-warping and VTLN through linear transformation of conventional MFCC S. Umesh, András Zolnay, Hermann Ney. 269-272 [doi]

MLLR-like speaker adaptation based on linearization of VTLN with MFCC featuresXiaodong Cui, Abeer Alwan. 273-276 [doi]

Model adaptation by state splitting of HMM for long reverberationChandra Kant Raut, Takuya Nishimoto, Shigeki Sagayama. 277-280 [doi]

Online speaker adaptation and tracking for real-time speech recognitionDaben Liu, Daniel Kiecza, Amit Srivastava, Francis Kubala. 281-284 [doi]

Automatic speech recognition based on adaptation and clustering using temporal-difference learningMasafumi Nishida, Yasuo Horiuchi, Akira Ichikawa. 285-288 [doi]

Improving the speech recognition performance of beginners in spoken conversational interaction for language learningHui Ye, Steve Young. 289-292 [doi]

Rapid unsupervised speaker adaptation based on multi-template HMM sufficient statistics in noisy environmentsRandy Gomez, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano. 293-296 [doi]

Rapid speaker adaptation for continuous speech recognition using merging eigenvoicesDong-jin Choi, Yung-Hwan Oh. 297-300 [doi]

Real-time pitch tracking based on combined SMDSFJian Liu, Thomas Fang Zheng, Jing Deng, Wenhu Wu. 301-304 [doi]

Fundamental frequency estimation by least-squares harmonic model fittingAndrás Bánhalmi, Kornél Kovács, András Kocsor, László Tóth. 305-308 [doi]

Harmonic filtering for joint estimation of pitch and voiced source with single-microphone inputSiu Wa Lee, Frank K. Soong, Pak-Chung Ching. 309-312 [doi]

High-resolution noise-robust spectral-based pitch estimationMarián Képesi, Luis Weruaga. 313-316 [doi]

F0 estimation for adult and children's speechJohn-Paul Hosom. 317-320 [doi]

Fundamental frequency and voicing prediction from MFCCs for speech reconstruction from unconstrained speechBen Milner, Xu Shao, Jonathan Darch. 321-324 [doi]

F0 stylisation with a free-knot B-spline model and simulated-annealing optimizationNelly Barbot, Olivier Boëffard, Damien Lolive. 325-328 [doi]

Voiced excitation as entrained primary response of a reconstructed glottal master oscillatorFriedhelm R. Drepper. 329-332 [doi]

Estimation of LF glottal source parameters based on an ARX modelDamien Vincent, Olivier Rosec, Thierry Chonavel. 333-336 [doi]

Some experiments on iterative reconstruction of speech from STFT phase and magnitude spectraLeigh D. Alsteris, Kuldip K. Paliwal. 337-340 [doi]

Statistical properties of the warped discrete cosine transform cepstrum compared with MFCCR. Muralishankar, Abhijeet Sangwan, Douglas D. O'Shaughnessy. 341-344 [doi]

New signal features for robust identification of isolated vowelsAníbal J. S. Ferreira. 345-348 [doi]

Amplitude modulation of frication noise by voicing saturatesJonathan Pincas, Philip J. B. Jackson. 349-352 [doi]

Extraction of relevant speech features using the information bottleneck methodRon M. Hecht, Naftali Tishby. 353-356 [doi]

Comparing several models for perceptual long-term modeling of amplitude and phase trajectories of sinusoidal speechMohammad Firouzmand, Laurent Girin, Sylvain Marchand. 357-360 [doi]

Multi-resolution RASTA filtering for TANDEM-based ASRHynek Hermansky, Petr Fousek. 361-364 [doi]

A category-dependent feature selection method for speech signalsWoojay Jeon, Biing-Hwang Juang. 365-368 [doi]

Voicing features for robust speech detectionTrausti Kristjansson, Sabine Deligne, Peder A. Olsen. 369-372 [doi]

Joint Bayesian predictive classification and parallel model combination for robust speech recognitionSvein Gunnar Pettersen, Magne Hallstein Johnsen, Tor André Myrvoll. 373-376 [doi]

Gaussian elimination algorithm for HMM complexity reduction in continuous speech recognition systemsGlauco F. G. Yared, Fábio Violaro, Lívio C. Sousa. 377-380 [doi]

Robust speech recognition in cars using phoneme dependent multi-environment linear normalizationLuis Buera, Eduardo Lleida, Antonio Miguel, Alfonso Ortega. 381-384 [doi]

Energy-based frame selection for reliable feature normalization and transformation in robust speech recognitionYi Chen, Lin-Shan Lee. 385-388 [doi]

Remodeling of the sensor for non-audible murmur (NAM)Yoshitaka Nakajima, Hideki Kashioka, Kiyohiro Shikano, Nick Campbell. 389-392 [doi]

Focused word segmentation for ASRAmarnag Subramanya, Jeff Bilmes, Chia-Ping Chen. 393-396 [doi]

Lexical tone perception in musicians and non-musiciansJennifer A. Alexander, Patrick C. M. Wong, Ann R. Bradlow. 397-400 [doi]

Contextual effect on perception of lexical tones in CantoneseJoan K. Y. Ma, Valter Ciocca, Tara L. Whitehill. 401-404 [doi]

Visual cues in Mandarin tone perceptionHansjörg Mixdorff, Yu Hu, Denis Burnham. 405-408 [doi]

Cross-language perception of word stressHansjörg Mixdorff, Yu Hu. 409-412 [doi]

The lexical statistics of word recognition problems caused by L2 phonetic confusionAnne Cutler. 413-416 [doi]

A multi-layer fuzzy logical model for emotional speech perceptionChun-Fang Huang, Masato Akagi. 417-420 [doi]

Utterance verification incorporating in-domain confidence and discourse coherence measuresIan R. Lane, Tatsuya Kawahara. 421-424 [doi]

Using symbolic prominence to help design feature subsets for topic classification and clustering of natural human-human conversationsConstantinos Boulis, Mari Ostendorf. 425-428 [doi]

Tightly integrated spoken language understanding using word-to-concept translationKatsuhito Sudoh, Hajime Tsukada. 429-432 [doi]

Exploiting unlabeled data using multiple classifiers for improved natural language call-routingRuhi Sarikaya, Hong-Kwang Jeff Kuo, Vaibhava Goel, Yuqing Gao. 433-436 [doi]

Active learning with minimum expected error for spoken language understandingHong-Kwang Jeff Kuo, Vaibhava Goel. 437-440 [doi]

Lexical out-of-vocabulary models for one-stage speech interpretationMatthias Thomae, Tibor Fábián, Robert Lieb, Günther Ruske. 441-444 [doi]

Speech technology for e-inclusion of people with physical disabilities and disordered speechMark S. Hawley, Phil Green, Pam Enderby, Stuart Cunningham, Roger K. Moore. 445-448 [doi]

Speech technology for language training and e-inclusionBjörn Granström. 449-452 [doi]

Supporting the creation of TTS for local language voice information systemsRoger Tucker, Ksenia Shalonova. 453-456 [doi]

Access for all - a talking internet serviceOve Andersen, Christian Hjulmand. 457-460 [doi]

A speech centric mobile multimodal service useful for dyslectics and aphasicsKnut Kvale, Narada D. Warakagoda. 461-464 [doi]

No laughing matterNick Campbell, Hideki Kashioka, Ryo Ohara. 465-468 [doi]

A study on the automatic detection and characterization of emotion in a voice service contextChristophe Blouin, Valérie Maffiolo. 469-472 [doi]

Classical and novel discriminant features for affect recognition from speechRaul Fernandez, Rosalind W. Picard. 473-476 [doi]

Low-dimensional feature space derivation for emotion recognitionJaroslaw Cichosz, Krzysztof Slot. 477-480 [doi]

Proposal of acoustic measures for automatic detection of vocal fryCarlos Toshinori Ishi, Hiroshi Ishiguro, Norihiro Hagita. 481-484 [doi]

Automatic detection of laughterKhiet P. Truong, David A. van Leeuwen. 485-488 [doi]

Tales of tuning - prototyping for automatic classification of emotional user statesAnton Batliner, Stefan Steidl, Christian Hacker, Elmar Nöth, Heinrich Niemann. 489-492 [doi]

Automatic emotion recognition using prosodic parametersIker Luengo, Eva Navas, Inmaculada Hernáez, Jon Sánchez. 493-496 [doi]

An articulatory study of emotional speech productionSungbok Lee, Serdar Yildirim, Abe Kazemzadeh, Shrikanth Narayanan. 497-500 [doi]

Informed blending of databases for emotional speech synthesisGregor Hofer, Korin Richmond, Robert A. J. Clark. 501-504 [doi]

Emotional FESTIVAL-MBROLA TTS synthesisFabio Tesser, Piero Cosi, Carlo Drioli, Graziano Tisato. 505-508 [doi]

Emofilt: the simulation of emotional speech by prosody-transformationFelix Burkhardt. 509-512 [doi]

Acoustic/prosodic and lexical correlates of charismatic speechAndrew Rosenberg, Julia Hirschberg. 513-516 [doi]

Communicative speech synthesis using constituent word attributesYoko Greenberg, Minoru Tsuzaki, Hiroaki Kato, Yoshinori Sagisaka. 517-520 [doi]

Emotions in dubbed speech: an intercultural approach with respect to F0Angelika Braun, Matthias Katerbow. 521-524 [doi]

The prosodic dimensions of emotion in speech: the relative weights of parametersNicolas Audibert, Véronique Aubergé, Albert Rilliard. 525-528 [doi]

Stimulus duration and type in perception of female and male speaker ageSusanne Schötz. 529-532 [doi]

Perceptions of emotions in expressive storytellingCecilia Ovesdotter Alm, Richard Sproat. 533-536 [doi]

Nearly defect-free F0 trajectory extraction for expressive speech modifications based on STRAIGHTHideki Kawahara, Alain de Cheveigné, Hideki Banno, Toru Takahashi, Toshio Irino. 537-540 [doi]

Gradually changing expression of singing voice based on morphingTomoko Yonezawa, Noriko Suzuki, Kenji Mase, Kiyoshi Kogure. 541-544 [doi]

A multi-pass, dynamic-vocabulary approach to real-time, large-vocabulary speech recognitionI. Lee Hetherington. 545-548 [doi]

Anatomy of an extremely fast LVCSR decoderGeorge Saon, Daniel Povey, Geoffrey Zweig. 549-552 [doi]

Evaluation of a long-contextual-span hidden trajectory model and phonetic recognizer using A* lattice searchDong Yu, Li Deng, Alex Acero. 553-556 [doi]

Generalized fast on-the-fly composition algorithm for WFST-based speech recognitionTakaaki Hori, Atsushi Nakamura. 557-560 [doi]

Minimum Bayes-risk decoding considering word significance for information retrieval systemHiroaki Nanjo, Teruhisa Misu, Tatsuya Kawahara. 561-564 [doi]

On improvements to CI-based GMM selectionArthur Chan, Mosur Ravishankar, Alexander I. Rudnicky. 565-568 [doi]

Scalable language model look-ahead for LVCSRDominique Massonié, Pascal Nocera, Georges Linares. 569-572 [doi]

Memory efficient approximative lattice generation for grammar based decodingMiroslav Novak. 573-576 [doi]

Improved semi-dynamic network decoding using WFSTsDong-Hoon Ahn, Su-Byeong Oh, Minhwa Chung. 577-580 [doi]

New pruning criteria for efficient decodingJanne Pylkkönen. 581-584 [doi]

A confidence-guided dynamic pruning approach - utilization of confidence measurement in speech recognitionTibor Fábián, Robert Lieb, Günther Ruske, Matthias Thomae. 585-588 [doi]

Discrimination of speech, musical instruments and singing voices using the temporal patterns of sinusoidal segments in audio signalsToru Taniguchi, Akishige Adachi, Shigeki Okawa, Masaaki Honda, Katsuhiko Shirai. 589-592 [doi]

Extractive summarization of meeting recordingsGabriel Murray, Steve Renals, Jean Carletta. 593-596 [doi]

IR-based classification of customer-agent phone callsArjan van Hessen, Jaap Hinke. 597-600 [doi]

Mining broadcast news data: robust information extraction from word latticesBenoît Favre, Frédéric Béchet, Pascal Nocera. 601-604 [doi]

To recover from speech recognition errors in spoken document retrievalMikko Kurimo, Ville T. Turunen. 605-608 [doi]

Unsupervised clustering of spontaneous speech documentsEdgar González, Jordi Turmo. 609-612 [doi]

Spectral cross-correlation features for audio indexing of broadcast news and meetingsMasahide Yamaguchi, Masaru Yamashita, Shoichi Matsunaga. 613-616 [doi]

Spontaneous speech consolidation for spoken language applicationsChiori Hori, Alex Waibel. 617-620 [doi]

Comparing lexical, acoustic/prosodic, structural and discourse features for speech summarizationSameer Maskey, Julia Hirschberg. 621-624 [doi]

Hierarchical topic organization and visual presentation of spoken documents using probabilistic latent semantic analysis (PLSA) for efficient retrieval/browsing applicationsTe-Hsuan Li, Ming-Han Lee, Berlin Chen, Lin-Shan Lee. 625-628 [doi]

The COST278 broadcast news segmentation and speaker clustering evaluation - overview, methodology, systems, resultsJanez Zibert, France Mihelic, Jean-Pierre Martens, Hugo Meinedo, João Paulo Neto, Laura Docío Fernández, Carmen García-Mateo, Petr David, Jindrich Zdánský, Matús Pleva, Anton Cizmar, Andrej Zgank, Zdravko Kacic, Csaba Teleki, Klára Vicsi. 629-632 [doi]

Comparison of keyword spotting approaches for informal continuous speechIgor Szöke, Petr Schwarz, Pavel Matejka, Lukas Burget, Martin Karafiát, Michal Fapso, Jan Cernocký. 633-636 [doi]

Dialogue strategy to clarify user's queries for document retrieval system with speech interfaceTeruhisa Misu, Tatsuya Kawahara. 637-640 [doi]

Comparison of different phone-based spoken document retrieval methods with text and spoken queriesNicolas Moreau, Shan Jin, Thomas Sikora. 641-644 [doi]

PCA of perturbation parameters in voice pathology detectionPedro Gómez Vilda, Francisco Díaz, Agustín Álvarez Marquina, Rafael Martínez, Victoria Rodellar, Roberto Fernández-Baíllo, Alberto Nieto, Francisco J. Fernandez. 645-648 [doi]

Dynamic programming based segmentation approach to LSF matrix reconstructionAnindya Sarkar, T. V. Sreenivas. 649-652 [doi]

Explicit segmentation of speech based on frequency-domain AR modelingT. Nagarajan, Douglas D. O'Shaughnessy. 653-656 [doi]

Non-parametric speaker turn segmentation of meeting dataPetr Motlícek, Lukás Burget, Jan Cernocký. 657-660 [doi]

Unsupervised segmentation of continuous speech using vector autoregressive time-frequency modeling errorsPetri Korhonen, Unto K. Laine. 661-664 [doi]

The analysis on band-limited hypernasal speech using group delay based formant extraction techniqueP. Vijayalakshmi, M. RamasubbaReddy. 665-668 [doi]

Detection of acoustic change-points in audio records via global BIC maximization and dynamic programmingJindrich Zdánský, Jan Nouza. 669-672 [doi]

Multi-band approach of audio source discrimination with empirical mode decompositionMd. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu. 673-676 [doi]

Application of auditory image model for speech event detectionMinoru Tsuzaki, Satomi Tanaka, Hiroaki Kato, Yoshinori Sagisaka. 677-680 [doi]

Unsupervised identification of speech segments using kernel methods for clusteringJosé Anibal Arias. 681-684 [doi]

Speech event detection using multiband modulation energyGeorgios Evangelopoulos, Petros Maragos. 685-688 [doi]

Measuring unsupervised acoustic clustering through phoneme pair merge-and-split testsJohn Kominek, Alan W. Black. 689-692 [doi]

Variational Bayesian speaker change detectionFabio Valente, Christian Wellekens. 693-696 [doi]

Distinctive feature based SVM discriminant features for improvements to phone recognition on telephone band speechSarah Borys, Mark Hasegawa-Johnson. 697-700 [doi]

Detection of hypernasality using statistical pattern classifiersP. Vijayalakshmi, M. RamasubbaReddy. 701-704 [doi]

Self-organizing chirp-sensitive artificial auditory cortical modelLuis Weruaga, Marián Képesi. 705-708 [doi]

On the use of a decimative spectral estimation method based on eigenanalysis and SVD for formant and bandwidth tracking of speech signalsSotiris Karabetsos, Pirros Tsiakoulis, Stavroula-Evita Fotinea, Ioannis Dologlou. 709-712 [doi]

Frequency-domain auditory suppression modelling (FASM) - a WDFT-based anthropomorphic noise-robust feature extraction algorithm for speech recognitionAlexei V. Ivanov, Marek Parfieniuk, Alexander A. Petrovsky. 713-716 [doi]

Discriminative maximum entropy language model for speech recognitionChuang-Hua Chueh, To-Chang Chien, Jen-Tzung Chien. 721-724 [doi]

Open vocabulary speech recognition with flat hybrid modelsMaximilian Bisani, Hermann Ney. 725-728 [doi]

An error-corrective language-model adaptation for automatic speech recognitionMinwoo Jeong, Jihyun Eun, Sangkeun Jung, Gary Geunbae Lee. 729-732 [doi]

Discriminative training of finite state decoding graphsShiuan-Sung Lin, François Yvon. 733-736 [doi]

Building continuous space language models for transcribing European languagesHolger Schwenk, Jean-Luc Gauvain. 737-740 [doi]

Using random forest language models in the IBM RT-04 CTS systemPeng Xu, Lidia Mangu. 741-744 [doi]

Perceptual development of the duration cue in Dutch /a-a:/Willemijn Heeren. 745-748 [doi]

Pronunciation variations of Spanish-accented English spoken by young childrenHong You, Abeer Alwan, Abe Kazemzadeh, Shrikanth Narayanan. 749-752 [doi]

L2 development of quantity perception: Dutch listeners learning Finnish /t-t:/Willemijn Heeren. 753-756 [doi]

Phonetic inventories in Italian children aged 18-27 months: a longitudinal studyClaudio Zmarich, Serena Bonifacio. 757-760 [doi]

Pitch patterns of intonational phrases and intonational phrase groups in native and non-native speechHiroko Hirano, Goh Kawai. 761-764 [doi]

Measuring liveliness in presentation speechRebecca Hincks. 765-768 [doi]

Non-verbal speech processing for a communicative agentNick Campbell. 769-772 [doi]

Physiologically motivated audio-visual localisation and trackingStuart N. Wrigley, Guy J. Brown. 773-776 [doi]

Discriminatively trained features using fMPE for multi-stream audio-visual speech recognitionJing Huang, Daniel Povey. 777-780 [doi]

INTERFACE: a new tool for building emotive/expressive talking headsGraziano Tisato, Piero Cosi, Carlo Drioli, Fabio Tesser. 781-784 [doi]

Variance reduction by using separate genuine-impostor statistics in multimodal biometricsPascual Ejarque, Javier Hernando. 785-788 [doi]

The dialog application metalanguage GDialogXMLVolker Schubert, Stefan W. Hamerich. 789-792 [doi]

Data-driven synthesis of expressive visual speech using an MPEG-4 talking headJonas Beskow, Mikael Nordenberg. 793-796 [doi]

Voice quality interpolation for emotional text-to-speech synthesisOytun Türk, Marc Schröder, Baris Bozkurt, Levent M. Arslan. 797-800 [doi]

Investigating the role of phoneme-level modifications in emotional speech resynthesisMurtaza Bulut, Carlos Busso, Serdar Yildirim, Abe Kazemzadeh, Chul-Min Lee, Sungbok Lee, Shrikanth Narayanan. 801-804 [doi]

Speaker independent emotion recognition by early fusion of acoustic and linguistic features within ensemblesBjörn Schuller, Ronald Müller, Manfred K. Lang, Gerhard Rigoll. 805-808 [doi]

Integrating information from speech and physiological signals to achieve emotional sensitivityJonghwa Kim, Elisabeth André, Matthias Rehm, Thurid Vogt, Johannes Wagner. 809-812 [doi]

Multimodal databases of everyday emotion: facing up to complexityEllen Douglas-Cowie, Laurence Devillers, Jean-Claude Martin, Roddy Cowie, Suzie Savvidou, Sarkis Abrilian, Cate Cox. 813-816 [doi]

Learning of stochastic dialog models through a dialog simulation techniqueFrancisco Torres, Emilio Sanchis, Encarna Segarra. 817-820 [doi]

Evaluating the DI@l-log system on a cohort of elderly, diabetic patients: results from a preliminary studyLesley-Ann Black, Michael F. McTear, Norman D. Black, Roy Harper, Michelle Lemon. 821-824 [doi]

Combination of classifiers for automatic recognition of dialog actsPavel Král, Christophe Cerisara, Jana Klecková. 825-828 [doi]

Rapidly developing spoken Chinese dialogue systems with the d-ear SDS SDKXiaojun Wu, Thomas Fang Zheng, Michael Brasser, Zhanjiang Song. 829-832 [doi]

Robust algorithms and interaction strategies for voice spellingDaniela Oria, Akos Vetek. 833-836 [doi]

Modality integration and dialog management for a robotic assistantIoannis Toptsis, Axel Haasch, Sonja Hüwel, Jannik Fritsch, Gernot A. Fink. 837-840 [doi]

An integration framework for a mobile multimodal dialogue system accessing the semantic webNorbert Reithinger, Daniel Sonntag. 841-844 [doi]

Operating a public spoken guidance system in real environmentRyuichi Nisimura, Akinobu Lee, Masashi Yamada, Kiyohiro Shikano. 845-848 [doi]

Distributed dialogue management for smart terminal devicesEsa-Pekka Salonen, Markku Turunen, Jaakko Hakulinen, Leena Helin, Perttu Prusi, Anssi Kainulainen. 849-852 [doi]

Visualization of spoken dialogue systems for demonstration, debugging and tutoringJaakko Hakulinen, Markku Turunen, Esa-Pekka Salonen. 853-856 [doi]

Development and evaluation of a spoken dialog system to access a newspaper web siteCésar González Ferreras, Valentín Cardeñoso-Payo. 857-860 [doi]

Comparing ASR modeling methods for spoken dialogue simulation and optimal strategy learningOlivier Pietquin, Richard Beaufort. 861-864 [doi]

An approach to multi-strategy dialogue managementShiu-Wah Chu, Ian M. O'Neill, Philip Hanna, Michael F. McTear. 865-868 [doi]

Towards user modelling in conversational dialogue systems: a qualitative study of the dynamics of dialogue parametersAnna Hjalmarsson. 869-872 [doi]

Reducing the description amount in authoring MMI applicationsKouichi Katsurada, Kazumine Aoki, Hirobumi Yamada, Tsuneo Nitta. 873-876 [doi]

Contextual constraints based on dialogue models in database search task for spoken dialogue systemsKazunori Komatani, Naoyuki Kanda, Tetsuya Ogata, Hiroshi G. Okuno. 877-880 [doi]

Using word-level pitch features to better predict student emotions during spoken tutoring dialoguesMihai Rotaru, Diane J. Litman. 881-884 [doi]

Let's go public! taking a spoken dialog system to the real worldAntoine Raux, Brian Langner, Dan Bohus, Alan W. Black, Maxine Eskenazi. 885-888 [doi]

Back-channel feedback generation using linguistic and nonlinguistic information and its application to spoken dialogue systemShinya Fujie, Kenta Fukushima, Tetsunori Kobayashi. 889-892 [doi]

Learning user simulations for information state update dialogue systemsKallirroi Georgila, James Henderson, Oliver Lemon. 893-896 [doi]

Design of a voice-enabled interface for real-time access to stock exchange from a PDA through GPRSDarío Martín-Iglesias, Yago Pereiro-Estevan, Ana I. García-Moral, Ascensión Gallardo-Antolín, Fernando Díaz-de-María. 897-900 [doi]

Integrating denotational meaning into a DBN language modelWilliam Schuler, Tim Miller. 901-904 [doi]

Improving out-of-coverage language modelling in a multimodal dialogue system using small training setsLouis ten Bosch. 905-908 [doi]

Ritel: an open-domain, human-computer dialog systemOlivier Galibert, Gabriel Illouz, Sophie Rosset. 909-912 [doi]

A comparison of particle filtering variants for speech feature enhancementReinhold Haeb-Umbach, Joerg Schmalenstroeer. 913-916 [doi]

Enhancement of mel log-power spectrum of speech using particle filteringIlyas Potamitis, Nikolaos D. Fakotakis. 917-920 [doi]

Improving robustness of speech recognition performance to aggregate of noises by two-dimensional visualizationMakoto Shozakai, Goshu Nagino. 921-924 [doi]

Feature compensation based on switching linear dynamic model and soft decisionWoohyung Lim, Bong Kyoung Kim, Nam Soo Kim. 925-928 [doi]

Using output probability distribution for improving speech recognition in adverse environmentShilei Huang, Xiang Xie, Jingming Kuang. 929-932 [doi]

A generalized framework for compensation of mel-filterbank outputs in feature extraction for robust ASREric H. C. Choi. 933-936 [doi]

Robust automatic speech recognition using a perceptually-based optimal spectral amplitude estimator speech enhancement algorithm in various low-SNR environmentsHesham Tolba, Zili Li, Douglas D. O'Shaughnessy. 937-940 [doi]

Improved noise-robustness in distributed speech recognition via perceptually-weighted vector quantisation of filterbank energiesStephen So, Kuldip K. Paliwal. 941-944 [doi]

Sub-band weighted projection measure for robust sub-band speech recognitionBabak Nasersharif, Ahmad Akbari. 945-948 [doi]

Noise compensation using interacting multiple kalman filtersJianping Deng, Martin Bouchard, Tet Hin Yeap. 949-952 [doi]

Kalman and unscented kalman filter feature enhancement for noise robust ASRVeronique Stouten, Hugo Van Hamme, Patrick Wambacq. 953-956 [doi]

Histogram-based quantization (HQ) for robust and scalable distributed speech recognitionChia-Yu Wan, Lin-Shan Lee. 957-960 [doi]

Rapid response and robust speech recognition by preliminary model adaptation for additive and convolutional noiseSatoshi Kobashikawa, Satoshi Takahashi, Yoshikazu Yamaguchi, Atsunori Ogawa. 965-968 [doi]

Nonlinear and linear transformations of speech features to compensate for channel and noise effectsSaurabh Prasad, Stephen A. Zahorian. 969-972 [doi]

Construction method of acoustic models dealing with various background noises based on combination of HMMsMotoyuki Suzuki, Yusuke Kato, Akinori Ito, Shozo Makino. 973-976 [doi]

Robust speech recognition based on noise and SNR classification - a multiple-model frameworkHaitian Xu, Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg. 977-980 [doi]

Eigen-environment based noise compensation method for robust speech recognitionHwa Jeon Song, Hyung Soon Kim. 981-984 [doi]

Robust feature compensation in nonstationary and multiple noise environmentsMartin Graciarena, Horacio Franco, Gregory K. Myers, Victor Abrash. 985-988 [doi]

Maximum mutual information SPLICE transform for seen and unseen conditionsJasha Droppo, Alex Acero. 989-992 [doi]

Speech recognition with support vector machines in a hybrid systemSven E. Krüger, Martin Schafföner, Marcel Katz, Edin Andelic, Andreas Wendemuth. 993-996 [doi]

Experiments on speaker profile portabilityVincent Barreaud, Douglas D. O'Shaughnessy, Jean-Guy Dahan. 997-1000 [doi]

A confidence measure invariant to language and grammarDaniele Colibro, Luciano Fissore, Claudio Vair, Emanuele Dalmasso, Pietro Laface. 1001-1004 [doi]

Robust detection of sonorant landmarksKen Schutte, James R. Glass. 1005-1008 [doi]

The labial-coronal effect and CVCV stability during reiterant speech production: an acoustic analysisAmélie Rochet-Capellan, Jean-Luc Schwartz. 1009-1012 [doi]

The labial-coronal effect and CVCV stability during reiterant speech production: an articulatory analysisAmélie Rochet-Capellan, Jean-Luc Schwartz. 1013-1016 [doi]

Articulatory constraints and coronal stops: an EPG studyMitsuhiro Nakamura. 1017-1020 [doi]

Strategies of labial coarticulationVincent Robert, Brigitte Wrobel-Dautcourt, Yves Laprie, Anne Bonneau. 1021-1024 [doi]

Investigation and modeling of coarticulation during speechJianwu Dang, Jianguo Wei, Takeharu Suzuki, Pascal Perrier. 1025-1028 [doi]

Tongue kinematics in diphthong production in Ningbo ChineseFang Hu. 1029-1032 [doi]

Comparing tongue positions of vowels in oral and nasal contextsTakayuki Arai. 1033-1036 [doi]

Can we retrieve vocal tract dynamics that produced speech? toward a speaker articulatory strategy modelSlim Ouni. 1037-1040 [doi]

Modeling the production of VCV sequences via the inversion of a biomechanical model of the tonguePascal Perrier, Liang Ma, Yohan Payan. 1041-1044 [doi]

Estimation of the acoustic properties of the nasal tract during the production of nasalized vowelsXiaochuan Niu, Alexander Kain, Jan P. H. van Santen. 1045-1048 [doi]

A web-based articulatory speech synthesis system for distance educationKohichi Ogata. 1049-1052 [doi]

Group delay function as a means to assess quality of glottal inverse filteringPaavo Alku, Matti Airas, Tomas Bäckström, Hannu Pulakka. 1053-1056 [doi]

Subglottal pressure and NAQ variation in voice production of classically trained baritone singersEva Björkner, Johan Sundberg, Paavo Alku. 1057-1060 [doi]

Covariation of subglottal pressure, F0 and intensityGunnar Fant, Anita Kruckenberg. 1061-1064 [doi]

Automatic voice-source parameterization of natural speechJavier Pérez, Antonio Bonafonte. 1065-1068 [doi]

Physiological study of whispered speech in Moroccan ArabicChakir Zeroual, John H. Esling, Lise Crevier-Buchman. 1069-1072 [doi]

Voice quality in down syndrome children treated with rapid maxillary expansionC. P. Moura, D. Andrade, L. M. Cunha, M. J. Cunha, H. Vilarinho, H. Barros, Diamantino Freitas, M. Pais-Clemente. 1073-1076 [doi]

Synthesis of disordered speechJulien Hanquinet, Francis Grenez, Jean Schoentgen. 1077-1080 [doi]

Quasi-automatic extraction of tongue movement from a large existing speech cineradiographic databaseJulie Fontecave, Frédéric Berthommier. 1081-1084 [doi]

The working memory token test (WMTT): preliminary findings in young adults with and without dyslexiaShimon Sapir, Ravit Cohen Mimran. 1085-1088 [doi]

Reducing the corpus-based TTS signal degradation due to speaker's word pronunciationsSérgio Paulo, Luís C. Oliveira. 1089-1092 [doi]

A phonetic study of the "er-hua" rimes in Beijing MandarinWai-Sum Lee. 1093-1096 [doi]

Learning statistically characterized resonance targets in a hidden trajectory model of speech coarticulation and reductionLi Deng, Dong Yu, Alex Acero. 1097-1100 [doi]

Articulatory motivated acoustic features for speech recognitionDaniil Kocharov, András Zolnay, Ralf Schlüter, Hermann Ney. 1101-1104 [doi]

Effects of Bayesian predictive classification using variational Bayesian posteriors for sparse training data in speech recognitionShinji Watanabe, Atsushi Nakamura. 1105-1108 [doi]

A study on separation between acoustic models and its applicationsYu Tsao, Jinyu Li, Chin-Hui Lee. 1109-1112 [doi]

Extended baum-welch reestimation of Gaussian mixture models based on reverse Jensen inequalityMohamed Afify. 1113-1116 [doi]

Hidden conditional random fields for phone classificationAsela Gunawardana, Milind Mahajan, Alex Acero, John C. Platt. 1117-1120 [doi]

Asymptotically exact AM-FM decomposition based on iterated hilbert transformFrancesco Gianfelici, Giorgio Biagetti, Paolo Crippa, Claudio Turchetti. 1121-1124 [doi]

Advances in statistical estimation and tracking of AM-FM speech componentsAthanassios Katsamanis, Petros Maragos. 1125-1128 [doi]

Formant frequency prediction from MFCC vectors in noisy environmentsJonathan Darch, Ben P. Milner, Saeed Vaseghi. 1129-1132 [doi]

Detection of vowel onset point events using excitation informationS. R. Mahadeva Prasanna, B. Yegnanarayana. 1133-1136 [doi]

Pitch-synchronous time-scaling for prosodic and voice quality transformationsJoão P. Cabral, Luís C. Oliveira. 1137-1140 [doi]

Discrimination between singing and speaking voicesYasunori Ohishi, Masataka Goto, Katunobu Itou, Kazuya Takeda. 1141-1144 [doi]

Two experiments comparing reading with listening for human processing of conversational telephone speechDouglas Jones, Wade Shen, Elizabeth Shriberg, Andreas Stolcke, Teresa M. Kamm, Douglas A. Reynolds. 1145-1148 [doi]

The ESTER phase II evaluation campaign for the rich transcription of French broadcast newsSylvain Galliano, Edouard Geoffrois, Djamel Mostefa, Khalid Choukri, Jean-François Bonastre, Guillaume Gravier. 1149-1152 [doi]

A method of multi-layered speech segmentation tailored for speech synthesisTakashi Saito. 1153-1156 [doi]

Generation of word alternative pronunciations using weighted finite state transducersSérgio Paulo, Luís C. Oliveira. 1157-1160 [doi]

Multiword expressions in spontaneous speech: do we really speak like that?Helmer Strik, Diana Binnenpoorte, Catia Cucchiarini. 1161-1164 [doi]

Czech spontaneous speech corpus with structural metadataJáchym Kolár, Jan Svec, Stephanie Strassel, Christopher Walker, Dagmar Kozlíková, Josef Psutka. 1165-1168 [doi]

A longitudinal analysis of the spectral peaks of vowels for a Japanese infantKentaro Ishizuka, Ryoko Mugitani, Hiroko Kato Solvang, Shigeaki Amano. 1169-1172 [doi]

Cross-linguistic comparison of two-year-old children's acoustic vowel spaces: contrasting Hungarian with dutchKrisztina Zajdó, Jeannette M. van der Stelt, Ton G. Wempe, Louis C. W. Pols. 1173-1176 [doi]

Acoustic correlates of contrastive stress in German childrenBritta Lintfert, Katrin Schneider. 1177-1180 [doi]

Ecological language acquisition via incremental model-based clusteringGiampiero Salvi. 1181-1184 [doi]

Perceptual and linguistic category formation in infantsTamami Sudo, Ken Mogi. 1185-1188 [doi]

Myoelectric signals for multimodal speech recognitionRaghunandan S. Kumaran, Karthik Narayanan, John N. Gowdy. 1189-1192 [doi]

Is color information really useful for lip-reading ? (or what is lost when color is not used)Philippe Daubias. 1193-1196 [doi]

A system for audio-visual speech recognitionIslam Shdaifat, Rolf-Rainer Grigat. 1197-1200 [doi]

Multimodal interface for organization name input based on combination of isolated word recognition and continuous base-word recognitionNorihide Kitaoka, Hironori Oshikawa, Seiichi Nakagawa. 1201-1204 [doi]

Recognition of (3) party conversation using prosody and gazeYosuke Matsusaka. 1205-1208 [doi]

Combining voiceprint and face biometrics for speaker identification using SDWSDongdong Li, Yingchun Yang, Zhaohui Wu. 1209-1212 [doi]

Using the focus of visual attention to improve spontaneous speech recognitionNeil Cooke, Martin Russell. 1213-1216 [doi]

Real-time outer lip contour tracking for HCI applicationsSabri Gurbuz. 1217-1220 [doi]

Improving lip-reading with feature space transforms for multi-stream audio-visual speech recognitionJing Huang, Karthik Visweswariah. 1221-1224 [doi]

Are there facial correlates of Thai syllabic tones?Hansjörg Mixdorff, Denis Burnham, Guillaume Vignali, Patavee Charnvivit. 1225-1228 [doi]

A new posterior based audio-visual integration method for robust speech recognitionRowan Seymour, Ji Ming, Darryl Stewart. 1229-1232 [doi]

On integrating insights from human speech perception into automatic speech recognitionSorin Dusan, Lawrence R. Rabiner. 1233-1236 [doi]

Parallels between HSR and ASR: how ASR can contribute to HSROdette Scharenborg. 1237-1240 [doi]

ASR decoding in a computational model of human word recognitionLouis ten Bosch, Odette Scharenborg. 1241-1244 [doi]

An investigation into a simulation of episodic memory for automatic speech recognitionViktoria Maier, Roger K. Moore. 1245-1248 [doi]

Phonetic ignorance is bliss: investigating the effects of phonetic information reduction on ASR performanceEric Fosler-Lussier, C. Anton Rytting, Soundararajan Srinivasan. 1249-1252 [doi]

Automatic speech recognition with neural spike trainsMarcus Holmberg, David Gelbart, Ulrich Ramacher, Werner Hemmert. 1253-1256 [doi]

A speech similarity distance weighting for robust recognitionMichael J. Carey, Tuan P. Quang. 1257-1260 [doi]

Japanese vowel recognition based on structural representation of speechTakao Murakami, Kazutaka Maruyama, Nobuaki Minematsu, Keikichi Hirose. 1261-1264 [doi]

Modeling the perception of multitalker speechSoundararajan Srinivasan, DeLiang Wang. 1265-1268 [doi]

Binaural feature selection for missing data speech recognitionSue Harding, Jon P. Barker, Guy J. Brown. 1269-1272 [doi]

Oldenburg logatome speech corpus (OLLO) for speech recognition experiments with humans and machinesThorsten Wesker, Bernd T. Meyer, Kirsten Wagener, Jörn Anemüller, Alfred Mertins, Birger Kollmeier. 1273-1276 [doi]

Minimum word error based discriminative training of language modelsJen-Wei Kuo, Berlin Chen. 1277-1280 [doi]

On the use of morphological constraints in n-gram statistical language modelA. Ghaoui, François Yvon, Chafic Mokbel, Gérard Chollet. 1281-1284 [doi]

A posteriori multiple word-domain language modelElvira I. Sicilia-Garcia, Ji Ming, F. Jack Smith. 1285-1288 [doi]

Effective topic-tree based language model adaptationJavier Dieguez-Tirado, Carmen García-Mateo, Antonio Cardenal López. 1289-1292 [doi]

Building topic specific language models from webdata using competitive modelsAbhinav Sethy, Panayiotis G. Georgiou, Shrikanth Narayanan. 1293-1296 [doi]

Trigger-based language model adaptation for automatic meeting transcriptionCarlos Troncoso, Tatsuya Kawahara. 1297-1300 [doi]

Statistical language models for large vocabulary spontaneous speech recognition in dutchJacques Duchateau, Dong Hoon Van Uytsel, Hugo Van Hamme, Patrick Wambacq. 1301-1304 [doi]

Diachronic vocabulary adaptation for broadcast news transcriptionAlexandre Allauzen, Jean-Luc Gauvain. 1305-1308 [doi]

Growing an n-gram language modelVesa Siivola, Bryan L. Pellom. 1309-1312 [doi]

Embedding grammars into statistical language modelsHarald Hüning, Manuel Kirschner, Fritz Class, André Berton, Udo Haiber. 1313-1316 [doi]

Methods for combining language models in speech recognitionSimo Broman, Mikko Kurimo. 1317-1320 [doi]

Review of statistical modeling of highly inflected lithuanian using very large vocabularyAirenas Vaiciunas, Gailius Raskinis. 1321-1324 [doi]

Generalized hebbian algorithm for incremental latent semantic analysisGenevieve Gorrell, Brandyn Webb. 1325-1328 [doi]

Language model adaptation for resource deficient languages using translated dataArnar Thor Jensson, Edward W. D. Whittaker, Koji Iwano, Sadaoki Furui. 1329-1332 [doi]

POS-based language models for large vocabulary speech recognition on embedded systemsPetra Witschel, Sergey Astrov, Gabriele Bakenecker, Josef G. Bauer, Harald Höge. 1333-1336 [doi]

Automatic generation of domain-dependent pronunciation lexicon with data-driven rules and rule adaptationJe Hun Jeon, Minhwa Chung. 1337-1340 [doi]

Pronunciation variation modelling using accent featuresMichael Tjalve, Mark Huckvale. 1341-1344 [doi]

Automatic detection of frequent pronunciation errors made by L2-learnersKhiet P. Truong, Ambra Neri, Febe de Wet, Catia Cucchiarini, Helmer Strik. 1345-1348 [doi]

Automatic transcription of Czech, Russian, and Slovak spontaneous speech in the MALACH projectJosef Psutka, Pavel Ircing, Josef V. Psutka, Jan Hajic, William J. Byrne, Jirí Mírovský. 1349-1352 [doi]

A study of implicit and explicit modeling of coarticulation and pronunciation variationStéphane Dupont, Christophe Ris, Laurent Couvreur, Jean-Marc Boite. 1353-1356 [doi]

Detection of coughs from user utterances using imitated phoneme modelShinya Takahashi, Tsuyoshi Morimoto, Sakashi Maeda, Naoyuki Tsuruta. 1357-1360 [doi]

Stochastic pronunciation modeling by ergodic-HMM of acoustic sub-word unitsV. Ramasubramanian, P. Srinivas, T. V. Sreenivas. 1361-1364 [doi]

An automated linguistic knowledge-based cross-language transfer method for building acoustic models for a language without native training dataChen Liu, Lynette Melnar. 1365-1368 [doi]

Fully automated non-native speech recognition using confusion-based acoustic model integrationGhazi Bouselmi, Dominique Fohr, Irina Illina, Jean-Paul Haton. 1369-1372 [doi]

The focus prosody: more than a simple binary functionVéronique Aubergé, Albert Rilliard. 1373-1376 [doi]

Peak timing in two dialects of connaught irishMartha Dalton, Ailbhe Ní Chasaide. 1377-1380 [doi]

Compound rises and uptalk in spoken EnglishJanet Fletcher. 1381-1384 [doi]

Duration and the temporal structure of Mandarin discourseLi-chiung Yang. 1385-1388 [doi]

Prosodic realization of split noun phrases in Mandarin Chinese compared in topic and focus contextsBei Wang. 1389-1392 [doi]

Downstep effect on disyllabic words of citation forms in standard ChineseZiyu Xiong. 1393-1396 [doi]

Estimation of intonation variation with constrained tone transformationsJinfu Ni, Hisashi Kawai, Keikichi Hirose. 1397-1400 [doi]

Voice quality of falling tones in taiwan minHo-hsien Pan. 1401-1404 [doi]

Duration, intensity and pause predictions in relation to prosody organizationChiu-yu Tseng, Bau-Ling Fu. 1405-1408 [doi]

Pitch accent prediction: effects of genre and speakerJiahong Yuan, Jason M. Brenier, Daniel Jurafsky. 1409-1412 [doi]

Analysis and modeling of fundamental frequency contours of hindi utterancesHiroya Fujisaki, Sumio Ohno. 1413-1416 [doi]

Fundamental frequency and tone in isizulu: initial experimentsNatasha Govender, Etienne Barnard, Marelie H. Davel. 1417-1420 [doi]

Intonational sequences in tuscan ItalianJudith Bishop, Marc Peake, Dmitry Sityaev. 1421-1424 [doi]

Effects of raddoppiamento sintattico on tonal alignment in ItalianCaterina Petrone. 1425-1428 [doi]

Acoustic analysis of Czech stress: intonation, duration and intensity revisitedTomás Dubeda, Jan Votrubec. 1429-1432 [doi]

Variability of F0 peak alignment in moroccan Arabic accentual focusMohamed Yeou. 1433-1436 [doi]

Phonological analysis of schwa and liaison within the PFC project (phonologie du français contemporain): how determinant are the prosodic factors?Anne Lacheret, Ch. Lyche, Michel Morel. 1437-1440 [doi]

Abstractness in speech-metronome synchronisation: P-centres as cyclic attractorsPlínio A. Barbosa, Pablo Arantes, Alexsandro R. Meireles, Jussara M. Vieira. 1441-1444 [doi]

Improvement of rejection performance of keyword spotting using anti-keywords derived from large vocabulary considering acoustical similarity to keywordsMakoto Yamada, Tsuneo Kato, Masaki Naito, Hisashi Kawai. 1445-1448 [doi]

Bayes risk minimization using metric loss functionsRalf Schlüter, T. Scharrenbach, Volker Steinbiss, Hermann Ney. 1449-1452 [doi]

Word error rate minimization using an integrated confidence measureAkio Kobayashi, Kazuo Onoe, Shoei Sato, Toru Imai. 1453-1456 [doi]

Fast confidence measure algorithm for continuous speech recognitionBin Dong, QingWei Zhao, YongHong Yan. 1457-1460 [doi]

Developing and enhancing posterior based speech recognition systemsHamed Ketabdar, Jithendra Vepa, Samy Bengio, Hervé Bourlard. 1461-1464 [doi]

Background model based posterior probability for measuring confidencePeng Liu, Ye Tian, Jian-Lai Zhou, Frank K. Soong. 1465-1468 [doi]

Foreign accents in synthetic speech: development and evaluationLaura Mayfield Tomokiyo, Alan W. Black, Kevin A. Lenzo. 1469-1472 [doi]

Toward multiple-language TTS: experiments in English and MandarinRaul Fernandez, Wei Zhang, Ellen Eide, Raimo Bakis, Wael Hamza, Yi Liu, Michael Picheny, John F. Pitrelli, Yong Qing, Zhiwei Shuang, Li Qin Shen. 1473-1476 [doi]

Cross-language synthesis with a polyglot synthesizerJavier Latorre, Koji Iwano, Sadaoki Furui. 1477-1480 [doi]

Development of a Kiswahili text to speech systemMucemi Gakuru, Frederick K. Iraki, Roger Tucker, Ksenia Shalonova, Kamanda Ngugi. 1481-1484 [doi]

Multilingual models in the IBM bilingual text-to-speech systemsJaime Botella Ordinas, Volker Fischer, Claire Waast-Richard. 1485-1488 [doi]

Reconstruction of Polish diacritics in a text-to-speech systemArtur Janicki, Piotr Herman. 1489-1492 [doi]

Design of bandwidth scalable LSF quantization using interframe and intraframe predictionHiroyuki Ehara, Toshiyuki Morii, Masahiro Oshikiri, Koji Yoshida, Kouichi Honma. 1493-1496 [doi]

Artificial bandwidth extension of speech supported by watermark-transmitted side informationBernd Geiser, Peter Jax, Peter Vary. 1497-1500 [doi]

Speech bandwidth extension by improved codebook mapping towards increased phonetic classificationRongqiang Hu, Venkatesh Krishnan, David V. Anderson. 1501-1504 [doi]

Bandwidth expansion of narrowband speech using non-negative matrix factorizationDhananjay Bansal, Bhiksha Raj, Paris Smaragdis. 1505-1508 [doi]

Robust bandwidth extension of noise-corrupted narrowband speechMichael L. Seltzer, Alex Acero, Jasha Droppo. 1509-1512 [doi]

Pitch-synchronous time-scaling for high-frequency excitation regenerationJoão P. Cabral, Luís C. Oliveira. 1513-1516 [doi]

A database of German emotional speechFelix Burkhardt, Astrid Paeschke, M. Rolfes, Walter F. Sendlmeier, Benjamin Weiss. 1517-1520 [doi]

Evaluating the pronunciation of proper names by four French grapheme-to-phoneme convertersPhilippe Boula de Mareüil, Christophe d'Alessandro, Gérard Bailly, Frédéric Béchet, Marie-Neige Garcia, Michel Morel, Romain Prudon, Jean Véronis. 1521-1524 [doi]

A human-human train timetable dialogue corpusFilip Jurcícek, Jirí Zahradil, Libor Jelínek. 1525-1528 [doi]

A Portuguese spoken and multi-modal dialog corporaGloria Branco, Luís Almeida, Rui Gomes, Nuno Beires. 1529-1532 [doi]

Development of a Cantonese-English code-mixing speech corpusJoyce Y. C. Chan, P. C. Ching, Tan Lee. 1533-1536 [doi]

BNSI Slovenian broadcast news database - speech and text corpusAndrej Zgank, Darinka Verdonik, Aleksandra Zögling Markus, Zdravko Kacic. 1537-1540 [doi]

Confronting HMM-based phone labelling with human evaluation of speech productionJan Volín, Radek Skarnitzl, Petr Pollák. 1541-1544 [doi]

Structural metadata annotation: moving beyond EnglishStephanie Strassel, Jáchym Kolár, Zhiyi Song, Leila Barclay, Meghan Lammie Glenn. 1545-1548 [doi]

Neologos: an optimized database for the development of new speech processing algorithmsDelphine Charlet, Sacha Krstulovic, Frédéric Bimbot, Olivier Boëffard, Dominique Fohr, Odile Mella, Filip Korkmazsky, Djamel Mostefa, Khalid Choukri, Arnaud Vallée. 1549-1552 [doi]

A hybrid approach to automatic segmentation and labeling for Mandarin Chinese speech corpusCheng-Yuan Lin, Kuan-Ting Chen, Jyh-Shing Roger Jang. 1553-1556 [doi]

The multiple pronunciations in Taiwanese and the automatic transcription of Buddhist sutra with augmented read speechYuang-Chin Chiang, Min-Siong Liang, Hong-Yi Lin, Ren-Yuan Lyu. 1557-1560 [doi]

Bootstrapping pronunciation dictionaries: practical issuesMarelie H. Davel, Etienne Barnard. 1561-1564 [doi]

Root causes of lost time and user stress in a simple dialog systemNigel G. Ward, Anais G. Rivera, Karen Ward, David G. Novick. 1565-1568 [doi]

Evaluating communication effectiveness in team collaborationJulie A. Parisi, Douglas Brungart. 1569-1572 [doi]

Bilingual aligned corpora for speech to speech translation for Spanish, English and CatalanDavid Conejero, Alan Lounds, Carmen García-Mateo, Leandro Rodríguez Liñares, Raquel Mochales, Asunción Moreno. 1573-1576 [doi]

Design and collection of Czech Lombard speech databaseHynek Boril, Petr Pollák. 1577-1580 [doi]

TBALL data collection: the making of a young children's speech corpusAbe Kazemzadeh, Hong You, Markus Iseli, Barbara Jones, Xiaodong Cui, Margaret Heritage, Patti Price, Elaine Andersen, Shrikanth Narayanan, Abeer Alwan. 1581-1584 [doi]

Construction and utilization of bilingual speech corpus for simultaneous machine interpretation researchHitomi Tohyama, Shigeki Matsubara, Nobuo Kawaguchi, Yasuyoshi Inagaki. 1585-1588 [doi]

Meeting acts: a labeling system for group interaction in meetingsRebecca A. Bates, Patrick Menning, Elizabeth Willingham, Chad Kuyper. 1589-1592 [doi]

A new evaluation criteria for keyword spotting techniques and a new algorithmMarius-Calin Silaghi, Rachna Vargiya. 1593-1596 [doi]

Phattsessionz: recording 1000 adolescent speakers in schools in GermanyChristoph Draxler, Alexander Steffen. 1597-1600 [doi]

An Amharic speech corpus for large vocabulary continuous speech recognitionSolomon Teferra Abate, Wolfgang Menzel, Bairu Tafila. 1601-1604 [doi]

The FASil speech and multimodal corporaHans Dolfing, David Reitter, Luís Almeida, Nuno Beires, Michael Cody, Rui Gomes, Kerry Robinson, Roman Zielinski. 1605-1608 [doi]

Revealing phonological similarities between German and dutchKarin Müller. 1609-1612 [doi]

Development of a conversational telephone speech recognizer for Levantine ArabicDimitra Vergyri, Katrin Kirchhoff, Venkata Ramana Rao Gadde, Andreas Stolcke, Jing Zheng. 1613-1616 [doi]

Exploiting large quantities of spontaneous speech for unsupervised training of acoustic modelsBhuvana Ramabhadran. 1617-1620 [doi]

Improved spontaneous Mandarin speech recognition by disfluency interruption point (IP) detection using prosodic featuresChe-Kuang Lin, Lin-Shan Lee. 1621-1624 [doi]

Improvements to the BBN RT04 Mandarin conversational telephone speech recognition systemJeff Z. Ma, Spyros Matsoukas. 1625-1628 [doi]

Incorporating a Bayesian wide phonetic context model for acoustic rescoringSakriani Sakti, Satoshi Nakamura, Konstantin Markov. 1629-1632 [doi]

Modeling vowels for Arabic BN transcriptionAbdelkhalek Messaoudi, Lori Lamel, Jean-Luc Gauvain. 1633-1636 [doi]

Recent progress in Arabic broadcast news transcription at BBNMohamed Afify, Long Nguyen, Bing Xiang, Sherif Abdou, John Makhoul. 1637-1640 [doi]

The 2004 BBN 1xRT recognition systems for English broadcast news and conversational telephone speechSpyros Matsoukas, Rohit Prasad, Srinivas Laxminarayan, Bing Xiang, Long Nguyen, Richard M. Schwartz. 1641-1644 [doi]

The 2004 BBN/LIMSI 20xRT English conversational telephone speech recognition systemRohit Prasad, Spyros Matsoukas, Chia-Lin Kao, Jeff Z. Ma, D.-X. Xu, Thomas Colthurst, Owen Kimball, Richard M. Schwartz, Jean-Luc Gauvain, Lori Lamel, Holger Schwenk, Gilles Adda, Fabrice Lefèvre. 1645-1648 [doi]

The BBN Mandarin broadcast news transcription systemBing Xiang, Long Nguyen, Xuefeng Guo, Dongxin Xu. 1649-1652 [doi]

The LIUM speech transcription system: a CMU Sphinx III-based system for French broadcast newsPaul Deléglise, Yannick Estève, Sylvain Meignier, Téva Merlin. 1653-1656 [doi]

Transcribing lectures and seminarsLori Lamel, Gilles Adda, Eric Bilinski, Jean-Luc Gauvain. 1657-1660 [doi]

Transcription of conference room meetings: an investigationThomas Hain, John Dines, Giulia Garau, Martin Karafiát, Darren Moore, Vincent Wan, Roeland Ordelman, Steve Renals. 1661-1664 [doi]

Where are we in transcribing French broadcast news?Jean-Luc Gauvain, Gilles Adda, Martine Adda-Decker, Alexandre Allauzen, Véronique Gendner, Lori Lamel, Holger Schwenk. 1665-1668 [doi]

Two-pass strategy for handling OOVs in a large vocabulary recognition taskOdette Scharenborg, Stephanie Seneff. 1669-1672 [doi]

The BBN RT04 English broadcast news transcription systemLong Nguyen, Bing Xiang, Mohamed Afify, Sherif Abdou, Spyros Matsoukas, Richard M. Schwartz, John Makhoul. 1673-1676 [doi]

Investigations on ensemble based semi-supervised acoustic model trainingRong Zhang, Ziad Al Bawab, Arthur Chan, Ananlada Chotimongkol, David Huggins-Daines, Alexander I. Rudnicky. 1677-1680 [doi]

Fully automated system for Czech spoken broadcast transcription with very large (300k+) lexiconJan Nouza, Jindrich Zdánský, Petr David, Petr Cerva, Jan Kolorenc, Dana Nejedlová. 1681-1684 [doi]

Experiments with probabilistic principal component analysis in LVCSRMike Schuster, Takaaki Hori, Atsushi Nakamura. 1685-1688 [doi]

Vietnamese large vocabulary continuous speech recognitionThang Tat Vu, Dung Tien Nguyen, Luong Chi Mai, John-Paul Hosom. 1689-1692 [doi]

Data sampling for improved speech recognizer trainingTakahiro Shinozaki, Mari Ostendorf, Les E. Atlas. 1693-1696 [doi]

Influence of F0 on Vietnamese syllable perceptionDo Dat Tran, Eric Castelli, Jean-François Serignat, Van Loan Trinh, Xuan Hung Le. 1697-1700 [doi]

Lexical tone and pitch perception in tone and non-tone language speakersBarbara Schwanhäußer, Denis Burnham. 1701-1704 [doi]

Intonational contrasts in EP: a categorical perception approachIsabel Falé, Isabel Hub Faria. 1705-1708 [doi]

Does narrow focus activate alternative referents?Bettina Braun, Andrea Weber, Matthew W. Crocker. 1709-1712 [doi]

Audiovisual interaction on the perception of frequency glide of linear sweep tonesKiyoaki Aikawa, Hayato Hashimoto. 1713-1716 [doi]

Audiovisual integration in dichotic listeningKei Omata, Ken Mogi. 1717-1720 [doi]

Perception experiment combining a parametric loudspeaker and a synthetic talking headGunilla Svanfeldt, Dirk Olszewski. 1721-1724 [doi]

Multidimensional scaling of listener responses to synthetic speechCatherine Mayo, Robert A. J. Clark, Simon King. 1725-1728 [doi]

A timbre space for speechHiroko Terasawa, Malcolm Slaney, Jonathan Berger. 1729-1732 [doi]

Voice quality assessment by means of comparative judgments of speech tokensAbdellah Kacha, Francis Grenez, Jean Schoentgen. 1733-1736 [doi]

Speech intelligibility derived from time-frequency and source smearingToshio Irino, Satoru Satou, Shunsuke Nomura, Hideki Banno, Hideki Kawahara. 1737-1740 [doi]

Steady-state pre-processing for improving speech intelligibility in reverberant environments: evaluation in a hall with an electrical reverberatorNahoko Hayashi, Takayuki Arai, Nao Hodoshima, Yusuke Miyauchi, Kiyohiro Kurisu. 1741-1744 [doi]

Neural bases of listening to speech in noisePatrick C. M. Wong, Kiara M. Lee, Todd B. Parrish. 1745-1748 [doi]

The intelligibility of tracheoesophageal speech: first resultsP. Jongmans, Frans J. M. Hilgers, Louis C. W. Pols, C. J. van As-Brooks. 1749-1752 [doi]

A computational model of the speech reception threshold for laterally separated speech and noiseGuy J. Brown, Kalle J. Palomäki. 1753-1756 [doi]

Lexical inhibition effects in time-compressed speechEsther Janse. 1757-1760 [doi]

Perception of time-compressed rapid acoustic cues in French CV syllablesCaroline Jacquier, Fanny Meunier. 1761-1764 [doi]

Reversed speech comprehension depends on the auditory efferent system functionalityClaire-Léonie Grataloup, Michel Hoen, François Pellegrino, E. Veuillet, Lionel Collet, Fanny Meunier. 1765-1768 [doi]

Perceptual space of English fricatives for Japanese learnersWon Tokuma, Shinichi Tokuma. 1769-1772 [doi]

Perceptual salience of language-specific acoustic differences in autonomous fillers across eight languagesIoana Vasilescu, Maria Candea, Martine Adda-Decker. 1773-1776 [doi]

Effects of cortical and subcortical brain damage on the processing of emotional prosodyMarc D. Pell. 1777-1780 [doi]

Spontaneous speech: how people really talk and why engineers should careElizabeth Shriberg. 1781-1784 [doi]

Feature adaptation using projection of Gaussian posteriorsKarthik Visweswariah, Peder A. Olsen. 1785-1788 [doi]

Maximum margin learning and adaptation of MLP classifiersXiao Li, Jeff Bilmes, Jonathan Malkin. 1789-1792 [doi]

Leveraging speaker-dependent variation of adaptationArindam Mandal, Mari Ostendorf, Andreas Stolcke. 1793-1796 [doi]

A comparative study of two kernel eigenspace-based speaker adaptation methods on large vocabulary continuous speech recognitionRoger Wend-Huu Hsiao, Brian Kan-Wing Mak. 1797-1800 [doi]

Environmental compensation using ASR model adaptation by a Bayesian parametric representation methodXuechuan Wang, Douglas D. O'Shaughnessy. 1801-1804 [doi]

Discriminative speaker adaptation with eigenvoicesJun Luo, Zhijian Ou, Zuoying Wang. 1805-1808 [doi]

Context in multi-lingual tone and pitch accent recognitionGina-Anne Levow. 1809-1812 [doi]

Automatic prominence identification and prosodic typologyFabio Tamburini. 1813-1816 [doi]

Influence of syntax on prosodic boundary predictionTommy Ingulfsen, Tina Burrows, Sabine Buchholz. 1817-1820 [doi]

Using prosodic information for disambiguation purposesRoberto Gretter, Dino Seppi. 1821-1824 [doi]

Analysis of the effects of word emphasis and echo question on F0 contours of Cantonese utterancesWentao Gu, Keikichi Hirose, Hiroya Fujisaki. 1825-1828 [doi]

Combining models of prosodic phrasing and pausingTina Burrows, Peter Jackson, Katherine Knill, Dmitry Sityaev. 1829-1832 [doi]

Distinguishing deceptive from non-deceptive speechJulia Hirschberg, Stefan Benus, Jason M. Brenier, Frank Enos, Sarah Friedman, Sarah Gilman, Cynthia Girand, Martin Graciarena, Andreas Kathol, Laura Michaelis, Bryan L. Pellom, Elizabeth Shriberg, Andreas Stolcke. 1833-1836 [doi]

Detecting certainness in spoken tutorial dialoguesJackson Liscombe, Julia Hirschberg, Jennifer J. Venditti. 1837-1840 [doi]

Detection of real-life emotions in call centersLaurence Vidrascu, Laurence Devillers. 1841-1844 [doi]

Using context to improve emotion detection in spoken dialog systemsJackson Liscombe, Giuseppe Riccardi, Dilek Z. Hakkani-Tür. 1845-1848 [doi]

Voice quality and f0 cues for affect expression: implications for synthesisIrena Yanushevskaya, Christer Gobl, Ailbhe Ní Chasaide. 1849-1852 [doi]

Voice and emotional expression transformation based on statistics of vowel parameters in an emotional speech databaseToru Takahashi, Takeshi Fujii, Masashi Nishi, Hideki Banno, Toshio Irino, Hideki Kawahara. 1853-1856 [doi]

Automated wizard-of-oz for spoken dialogue systemsGiuseppe Di Fabbrizio, Gökhan Tür, Dilek Z. Hakkani-Tür. 1857-1860 [doi]

A rapid prototyping tool for constructing web-based MMI applicationsKouichi Katsurada, Kunitoshi Sato, Hiroaki Adachi, Hirobumi Yamada, Tsuneo Nitta. 1861-1864 [doi]

Developing extensible and reusable spoken dialogue components: an examination of the Queen's communicatorPhilip Hanna, Ian M. O'Neill, Xingkun Liu, Michael F. McTear. 1865-1868 [doi]

SGStudio: rapid semantic grammar development for spoken language understandingYe-Yi Wang, Alex Acero. 1869-1872 [doi]

Rapid transition to new spoken dialogue domains: language model training using knowledge from previous domain applications and web text resourcesMurat Akbacak, Yuqing Gao, Liang Gu, Hong-Kwang Jeff Kuo. 1873-1876 [doi]

A methodology for comparing grammar-based and robust approaches to speech understandingManny Rayner, Pierrette Bouillon, Nikos Chatzichrisafis, Beth Ann Hockey, Marianne Santaholma, Marianne Starlander, Hitoshi Isahara, Kyoko Kanzaki, Yukie Nakao. 1877-1880 [doi]

Learning to personalize spoken generation for dialogue systemsFrançois Mairesse, Marilyn A. Walker. 1881-1884 [doi]

Optimization of text-to-speech phonetic transcriptions using a-posteriori signal comparisonS. Revelin, Didier Cadic, Claire Waast-Richard. 1885-1888 [doi]

Voice transformation using principle component analysis based LSF quantization and dynamic programming approachÖzgül Salor, Mübeccel Demirekler. 1889-1892 [doi]

Adapt Mandarin TTS system to Chinese dialect TTS systemsHai Ping Li, Wei Zhang. 1893-1896 [doi]

Grapheme-to-phoneme conversion based on TBL algorithm in Mandarin TTS systemMin Zheng, Qin Shi, Wei Zhang, Lianhong Cai. 1897-1900 [doi]

An automaton-based machine learning technique for automatic phonetic transcriptionPaolo Massimino, Alberto Pacchiotti. 1901-1904 [doi]

Comparative objective and subjective evaluation of three data-driven techniques for proper name pronunciationTasanawan Soonklang, Robert I. Damper, Yannick Marchand. 1905-1908 [doi]

Articulatory synthesis using corpus-based estimation of line spectrum pairsOlov Engwall. 1909-1912 [doi]

Effects of pitch accent type on interpreting information status in synthetic speechAoju Chen, Els den Os. 1913-1916 [doi]

Towards generic spatial object model and route guidance grammar for speech-based systemsPerttu Prusi, Anssi Kainulainen, Jaakko Hakulinen, Markku Turunen, Esa-Pekka Salonen, Leena Helin. 1917-1920 [doi]

Duration-embedded bi-HMM for expressive voice conversionChi-Chun Hsia, Chung-Hsien Wu, Te-Hsien Liu. 1921-1924 [doi]

Analysis of major factors of naturalness degradation in concatenative synthesisToshio Hirai, Hisashi Kawai, Minoru Tsuzaki, Nobuyuki Nishizawa. 1925-1928 [doi]

Duration modeling and memory optimization in a Mandarin TTS systemJilei Tian, Jani Nurminen, Imre Kiss. 1929-1932 [doi]

A bi-lingual Mandarin-to-Taiwanese text-to-speech systemMin-Siong Liang, Ke-Chun Chuang, Rhuei-Cheng Yang, Yuang-Chin Chiang, Ren-Yuan Lyu. 1933-1936 [doi]

Using morphology and phoneme history to improve grapheme-to-phoneme conversionUwe D. Reichel, Florian Schiel. 1937-1940 [doi]

Predicting consonant duration with Bayesian belief networksOlga Goubanova, Simon King. 1941-1944 [doi]

Phonetic transcription verification with generalized posterior probabilityLijuan Wang, Yong Zhao, Min Chu, Frank K. Soong, Zhigang Cao. 1949-1952 [doi]

Training a maximum entropy model for surface realizationHua Cheng, Fuliang Weng, Niti Hantaweepant, Lawrence Cavedon, Stanley Peters. 1953-1956 [doi]

NAM-to-speech conversion with Gaussian mixture modelsTomoki Toda, Kiyohiro Shikano. 1957-1960 [doi]

Which Italian do current systems speak? a first step towards pronunciation modelling of Italian varietiesMichelina Savino, Mario Refice, Massimo Mitaritonna. 1961-1964 [doi]

Modelling pitch accent types for Polish speech synthesisDominika Oliver, Robert A. J. Clark. 1965-1968 [doi]

Learning methods and features for corpus-based phrase break prediction on ThaiChatchawarn Hansakunbuntheung, Ausdang Thangthai, Chai Wutiwiwatchai, Rungkarn Siricharoenchai. 1969-1972 [doi]

Hidden Markov models for grapheme to phoneme conversionPaul Taylor. 1973-1976 [doi]

Robust distant speaker recognition based on position dependent cepstral mean normalizationLongbiao Wang, Norihide Kitaoka, Seiichi Nakagawa. 1977-1980 [doi]

Speaker adaptation in the NIST speaker recognition evaluation 2004David A. van Leeuwen. 1981-1984 [doi]

A distance measure between GMMs based on the unscented transform and its application to speaker recognitionJacob Goldberger, Hagai Aronowitz. 1985-1988 [doi]

Estimation of speaker's height and vocal tract length from speech signalSorin Dusan. 1989-1992 [doi]

On the relationship between phonetic modeling precision and phonetic speaker recognition accuracyDoroteo Torre Toledano, Carlos Fombella, Joaquin Gonzalez-Rodriguez, Luis A. Hernández Gómez. 1993-1996 [doi]

Open-set speaker identification using adapted Gaussian mixture modelsJ. Fortuna, P. Sivakumaran, Aladdin M. Ariyaeeinia, Amit S. Malegaonkar. 1997-2000 [doi]

Speaker verification in noisy conditions using correlated subband featuresJames McAuley, Ji Ming, Pat Corr. 2001-2004 [doi]

Probabilistic anchor models approach for speaker verificationMikaël Collet, Yassine Mami, Delphine Charlet, Frédéric Bimbot. 2005-2008 [doi]

A Bayesian network approach combining pitch and spectral envelope features to reduce channel mismatch in speaker verification and forensic speaker recognitionMijail Arcienega, Anil Alexander, Philipp Zimmermann, Andrzej Drygajlo. 2009-2012 [doi]

Channel robust speaker verification via Bayesian blind stochastic feature transformationKwok-Kwong Yiu, Man-Wai Mak, Sun-Yuan Kung. 2013-2016 [doi]

dPLRM-based speaker identification with log power spectrumTomoko Matsui, Kunio Tanabe. 2017-2020 [doi]

Speaker verification using Gaussian mixture models within changing real car environmentsXianxian Zhang, John H. L. Hansen, Pongtep Angkititrakul, Kazuya Takeda. 2021-2024 [doi]

The correspondences between the perception of the speaker individualities contained in speech sounds and their acoustic propertiesKanae Amino, Tsutomu Sugawara, Takayuki Arai. 2025-2028 [doi]

A noise-robust pitch synchronous feature extraction algorithm for speaker recognition systemsSamuel Kim, Sung-Wan Yoon, Thomas Eriksson, Hong-Goo Kang, Dae Hee Youn. 2029-2032 [doi]

Modeling high-level information by using Gaussian mixture correlation for GMM-UBM based speaker recognitionJing Deng, Thomas Fang Zheng, Zhanjiang Song, Jian Liu. 2033-2036 [doi]

In-set/out-of-set speaker identification based on discriminative speech frame selectionXianxian Zhang, John H. L. Hansen. 2037-2040 [doi]

Mixture of support vector machines for text-independent speaker recognitionZhenchun Lei, Yingchun Yang, Zhaohui Wu. 2041-2044 [doi]

Optimal model order selection based on regression tree in speaker identificationShilei Zhang, Junmei Bai, Shuwu Zhang, Bo Xu. 2045-2048 [doi]

Speaker verification improvement using blind inversion of distortionsMarcos Faúndez-Zanuy, Jordi Solé-Casals. 2049-2052 [doi]

Supergaussian GARCH models for speech signalsIsrael Cohen. 2053-2056 [doi]

A spectral conversion approach to feature denoising and speech enhancementAthanasios Mouchtaris, Jan Van der Spiegel, Paul Mueller, Panagiotis Tsakalides. 2057-2060 [doi]

Acoustic feedback cancellation in speech reinforcement systems for vehiclesAlfonso Ortega, Eduardo Lleida, Enrique Masgrau, Luis Buera, Antonio Miguel. 2061-2064 [doi]

Implicit control of noise canceller for speech enhancementJulien Bourgeois, Jürgen Freudenberger, Guillaume Lathoud. 2065-2068 [doi]

Speech enhancement using Markov model of speech segmentsT. M. Sunil Kumar, T. V. Sreenivas. 2069-2072 [doi]

A wavelet based noise reduction algorithm for speech signal corrupted by coloured noiseVladimir Braquet, Takao Kobayashi. 2073-2076 [doi]

Speech enhancement in temporal DFT trajectories using Kalman filtersEsfandiar Zavarehei, Saeed Vaseghi. 2077-2080 [doi]

Formant-tracking linear prediction models for speech processing in noisy environmentsQin Yan, Saeed Vaseghi, Esfandiar Zavarehei, Ben P. Milner. 2081-2084 [doi]

Statistical noise compensation for cochlear implant processingHui Jiang, Qian-Jie Fu. 2085-2088 [doi]

WPD-based noise suppression using nonlinearly weighted threshold quantile estimation and optimal wavelet shrinkingTuan Van Pham, Gernot Kubin. 2089-2092 [doi]

Subjective and objective quality assessment of regression-enhanced speech in real car environmentsWeifeng Li, Katunobu Itou, Kazuya Takeda, Fumitada Itakura. 2093-2096 [doi]

A model for selective segregation of a target instrument sound from the mixed sound of various instrumentsMasashi Unoki, Masaaki Kubo, Atsushi Haniu, Masato Akagi. 2097-2100 [doi]

Improved decision directed approach for speech enhancement using an adaptive time segmentationRichard C. Hendriks, Richard Heusdens, Jesper Jensen. 2101-2104 [doi]

Generalized filter-bank equalizer for noise reduction with reduced signal delayHeinrich W. Löllmann, Peter Vary. 2105-2108 [doi]

A pitch-based model for separation of reverberant speechNicoleta Roman, DeLiang Wang. 2109-2112 [doi]

On noise gain estimation for HMM-based speech enhancementDavid Y. Zhao, W. Bastiaan Kleijn. 2113-2116 [doi]

Speech enhancement using auditory phase opponency modelOm Deshmukh, Carol Y. Espy-Wilson. 2117-2120 [doi]

High-density discrete HMM with the use of scalar quantization indexingBrian Mak, Jeff Siu-Kei Au-Yeung, Yiu-Pong Lai, Man-Hung Siu. 2121-2124 [doi]

Improved discriminative training using phone latticesJing Zheng, Andreas Stolcke. 2125-2128 [doi]

Improved MLP structures for data-driven feature extraction for ASRQifeng Zhu, Barry Y. Chen, Frantisek Grézl, Nelson Morgan. 2129-2132 [doi]

Investigations on error minimizing training criteria for discriminative training in automatic speech recognitionWolfgang Macherey, Lars Haferkamp, Ralf Schlüter, Hermann Ney. 2133-2136 [doi]

Temporally varying model parameters for large vocabulary continuous speech recognitionK. C. Sim, M. J. F. Gales. 2137-2140 [doi]

Using MLP features in SRI's conversational speech recognition systemQifeng Zhu, Andreas Stolcke, Barry Y. Chen, Nelson Morgan. 2141-2144 [doi]

A toolkit for voice inverse filtering and parametrisationMatti Airas, Hannu Pulakka, Tomas Bäckström, Paavo Alku. 2145-2148 [doi]

Stylization of glottal-flow spectra produced by a mechanical vocal-fold modelDenisse Sciamarella, Christophe d'Alessandro. 2149-2152 [doi]

Numerical glottal sound source model as coupled problem between vocal cord vibration and glottal flowHideyuki Nomura, Tetsuo Funada. 2153-2156 [doi]

A tagged-cine MRI investigation of German vowelsMarianne Pouplier, Maureen Stone. 2157-2160 [doi]

A three-dimensional linear articulatory model of velum based on MRI dataAntoine Serrurier, Pierre Badin. 2161-2164 [doi]

On the relationship between intra-oral pressure and speech sonorityAnne Cros, Didier Demolin, Ana Georgina Flesia, Antonio Galves. 2165-2168 [doi]

Maximum conditional mutual information modeling for speaker verificationMohamed Kamal Omar, Jiri Navratil, Ganesh N. Ramaswamy. 2169-2172 [doi]

Class-dependent score combination for speaker recognitionLuciana Ferrer, M. Kemal Sönmez, Sachin S. Kajarekar. 2173-2176 [doi]

Modeling intra-speaker variability for speaker recognitionHagai Aronowitz, Dror Irony, David Burshtein. 2177-2180 [doi]

Liveness detection using cross-modal correlations in face-voice person authenticationGirija Chetty, Michael Wagner. 2181-2184 [doi]

Stream-weight optimization by LDA and adaboost for multi-stream speaker verificationTaichi Asami, Koji Iwano, Sadaoki Furui. 2185-2188 [doi]

Considering speech quality in speaker verification fusionYosef A. Solewicz, Moshe Koppel. 2189-2192 [doi]

Speaker adaptive acoustic modeling with mixture of adult and children's speechMatteo Gerosa, Diego Giuliani, Fabio Brugnara. 2193-2196 [doi]

A comparison of human and computer recognition accuracy for children's speechShona D'Arcy, Martin J. Russell. 2197-2200 [doi]

Italian children's speech recognition for advanced interactive literacy tutorsPiero Cosi, Bryan L. Pellom. 2201-2204 [doi]

Do speech recognizers prefer female speakers?Martine Adda-Decker, Lori Lamel. 2205-2208 [doi]

Detecting politeness and frustration state of a child in a conversational computer gameSerdar Yildirim, Chul-Min Lee, Sungbok Lee, Alexandros Potamianos, Shrikanth Narayanan. 2209-2212 [doi]

Gender in everyday speech and language: a corpus-based studyDiana Binnenpoorte, Christophe Van Bael, Els den Os, Lou Boves. 2213-2216 [doi]

Developmental change of phoneme duration in a Japanese infant and motherShigeaki Amano. 2217-2220 [doi]

Mora timing organization in producing contrastive geminate/single consonants and long/short vowels by native and non-native speakers of Japanese: effects of speaking rateHaiping Jia, Hiroki Mori, Hideki Kasuya. 2221-2224 [doi]

Mutual intelligibility of American, Chinese and Dutch-accented speakers of EnglishHongyan Wang, Vincent J. van Heuven. 2225-2228 [doi]

Deriving a bi-lingual dictionary from raw transcription dataPeter Juel Henrichsen. 2229-2232 [doi]

A statistical method of evaluating pronunciation proficiency for Japanese wordsKei Ohta, Seiichi Nakagawa. 2233-2236 [doi]

Phonotactic language identification using high quality phoneme recognitionPavel Matejka, Petr Schwarz, Jan Cernocký, Pavel Chytil. 2237-2240 [doi]

Advances in word based dialect/accent classificationRongqing Huang, John H. L. Hansen. 2241-2244 [doi]

Syllable structure in spoken Arabic: a comparative investigationRym Hamdi, Salem Ghazali, Melissa Barkat-Defradas. 2245-2248 [doi]

A transformation-based learning approach to language identification for mixed-lingual text-to-speech synthesisJ. C. Marcadet, Volker Fischer, Claire Waast-Richard. 2249-2252 [doi]

Constructing family trees of multilingual speech using Gaussian mixture modelsShuichi Itahashi, Shiwei Zhu, Mikio Yamamoto. 2253-2256 [doi]

Modeling long and short-term prosody for language identificationJean-Luc Rouas. 2257-2260 [doi]

Document driven machine translation enhanced ASRMatthias Paulik, Christian Fügen, Sebastian Stüker, Tanja Schultz, Thomas Schaaf, Alex Waibel. 2261-2264 [doi]

Automatic text dictation in computer-assisted translationShahram Khadivi, András Zolnay, Hermann Ney. 2265-2268 [doi]

On the use of speech recognition in computer assisted translationLuis Rodríguez, Jorge Civera, Enrique Vidal, Francisco Casacuberta, César Martínez. 2269-2272 [doi]

Speech translation for low-resource languages: the case of PashtoAndreas Kathol, Kristin Precoda, Dimitra Vergyri, Wen Wang, Susanne Riehemann. 2273-2276 [doi]

Finite-state transducer inference for a speech-input Portuguese-to-English machine translation systemDavid Picó, Jorge González, Francisco Casacuberta, Diamantino Caseiro, Isabel Trancoso. 2277-2280 [doi]

Quantitative evaluation of effects of speech recognition errors on speech translation qualityKenko Ohta, Keiji Yasuda, Gen-ichiro Kikui, Masuzo Yanagida. 2281-2284 [doi]

A stereo input-output superdirective beamformer for dual channel noise reductionThomas Lotter, Bastian Sauert, Peter Vary. 2285-2288 [doi]

Kalman filters for time delay of arrival-based source localizationUlrich Klee, Tobias Gehrig, John W. McDonough. 2289-2292 [doi]

Simultaneous adaptation of echo cancellation and spectral subtraction for in-car speech recognitionOsamu Ichikawa, Masafumi Nishimura. 2293-2296 [doi]

Variable step size adaptive decorrelation filtering for competing speech separationRong Hu, Yunxin Zhao. 2297-2300 [doi]

Speech extraction in a car interior using frequency-domain ICA with rapid filter adaptationsDaisuke Saitoh, Atsunobu Kaminuma, Hiroshi Saruwatari, Tsuyoki Nishikawa, Akinobu Lee. 2301-2304 [doi]

Speech enhancement using non-acoustic sensorsRongqiang Hu, Sunil D. Kamath, David V. Anderson. 2305-2308 [doi]

Improved blind dereverberation performance by using spatial informationMarc Delcroix, Takafumi Hikichi, Masato Miyoshi. 2309-2312 [doi]

A hybrid microphone array post-filter in a diffuse noise fieldJunfeng Li, Masato Akagi. 2313-2316 [doi]

A framework for estimation of clean speech by fusion of outputs from multiple speech enhancement systemsVenkatesh Krishnan, Phil Spencer Whitehead, David V. Anderson, Mark A. Clements. 2317-2320 [doi]

A study of weighted CSP analysis with average speech spectrum for noise robust talker localizationYuki Denda, Takanobu Nishiura, Yoichi Yamashita. 2321-2324 [doi]

Sound segregation based on binaural zero-crossingsYoung Ik Kim, Sung Jun An, Rhee Man Kil, Hyung-Min Park. 2325-2328 [doi]

A two-microphone diversity system and its application for hands-free car kitsJürgen Freudenberger, Klaus Linhard. 2329-2332 [doi]

Directionally constrained minimization of power algorithm for speech signalsTakahiro Murakami, Kiyoshi Kurihara, Yoshihisa Ishida. 2333-2336 [doi]

Oriented global coherence field for the estimation of the head orientation in smart rooms equipped with distributed microphone arraysAlessio Brutti, Maurizio Omologo, Piergiorgio Svaizer. 2337-2340 [doi]

Robust speaker localization through adaptive weighted pair TDOA (AWEPAT) estimationNilesh Madhu, Rainer Martin. 2341-2344 [doi]

A spectrogram model for enhanced source localization and noise-robust ASRGuillaume Lathoud, Mathew Magimai-Doss, Bertrand Mesot. 2345-2348 [doi]

Denoising through source separation and minimum trackingSriram Srinivasan, Mattias Nilsson, W. Bastiaan Kleijn. 2349-2352 [doi]

Collaborative voice activity detection for hearing aidsLouisa Busca Grisoni, John H. L. Hansen. 2353-2356 [doi]

Using inter-frequency decorrelation to reduce the permutation inconsistency problem in blind source separationEnrique Robledo-Arnuncio, Biing-Hwang Juang. 2357-2360 [doi]

A graphical model for multi-sensory speech processing in air-and-bone conductive microphonesAmarnag Subramanya, Zhengyou Zhang, Zicheng Liu, Jasha Droppo, Alex Acero. 2361-2364 [doi]

The stress foot as a unit of planned timing: evidence from shortening in the prosodic phraseHeejin Kim, Jennifer Cole. 2365-2368 [doi]

Segmental anchorage and the French late risePauline Welby, Hélène Loevenbruck. 2369-2372 [doi]

Prosodic cues for syntactically-motivated juncturesIvan Chow. 2373-2376 [doi]

A glimpse of the time-course of intonation processing in European PortugueseIsabel Falé, Isabel Hub Faria. 2377-2380 [doi]

Great expectations - introspective vs. perceptual prominence ratings and their acoustic correlatesPetra Wagner. 2381-2384 [doi]

Choosing a scale for measuring perceived prominenceChristian Jensen, John Tøndering. 2385-2388 [doi]

The effects of prosodic features on the interpretation of clarification ellipsesJens Edlund, David House, Gabriel Skantze. 2389-2392 [doi]

Exploration of different types of intonational deviations in foreign-accented and synthesized speechMatthias Jilka. 2393-2396 [doi]

Fine-tuning speech registers: a comparison of the prosodic features of child-directed and foreigner-directed speechSonja Biersack, Vera Kempe, Lorna Knapton. 2401-2404 [doi]

An analysis of the intonational structure of stuttered speechTimothy Arbisi-Kelm. 2405-2408 [doi]

Voice quality dimensions of pitch accentsBritta Lintfert, Wolfgang Wokurek. 2409-2412 [doi]

Audiovisual production and perception of contrastive focus in French: a multispeaker studyMarion Dohen, Hélène Loevenbruck. 2413-2416 [doi]

Predicting end of utterance in multimodal and unimodal conditionsPashiera Barkhuysen, Emiel Krahmer, Marc Swerts. 2417-2420 [doi]

Production of prominence in Japanese sign languageSaori Tanaka, Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa. 2421-2424 [doi]

MLLR transforms as features in speaker recognitionAndreas Stolcke, Luciana Ferrer, Sachin S. Kajarekar, Elizabeth Shriberg, Anand Venkataraman. 2425-2428 [doi]

Gaussian mixture modelling of broad phonetic and syllabic events for text-independent speaker verificationBrendan Baker, Robbie Vogt, Sridha Sridharan. 2429-2432 [doi]

Efficient speaker identification and retrievalHagai Aronowitz, David Burshtein. 2433-2436 [doi]

The Cambridge University March 2005 speaker diarisation systemR. Sinha, S. E. Tranter, M. J. F. Gales, Philip C. Woodland. 2437-2440 [doi]

Combining speaker identification and BIC for speaker diarizationXuan Zhu, Claude Barras, Sylvain Meignier, Jean-Luc Gauvain. 2441-2444 [doi]

Broadcast news speaker tracking for ESTER 2005 campaignDan Istrate, Nicolas Scheffer, Corinne Fredouille, Jean-François Bonastre. 2445-2448 [doi]

On the nature of acoustic information in identification of coarticulated vowelsSorin Dusan. 2449-2452 [doi]

Impact of duration on F1/F2 formant values of oral vowels: an automatic analysis of large broadcast news corpora in French and GermanCédric Gendrot, Martine Adda-Decker. 2453-2456 [doi]

Modeling of between-speaker and within-speaker variation in spontaneous speech tempoHugo Quené. 2457-2460 [doi]

Vowel devoicing vs. mora-timed rhythm in spontaneous Japanese - inspection of phonetic labels of OGI_TSMasahiko Komatsu, Makiko Aoyagi. 2461-2464 [doi]

Does vowel space size depend on language vowel inventories? evidence from two Arabic dialects and FrenchJalal-Eddin Al-Tamimi, Emmanuel Ferragne. 2465-2468 [doi]

Understanding phonology by phonetic implementationChilin Shih. 2469-2472 [doi]

User evaluation of conversational agent H. C. AndersenNiels Ole Bernsen, Laila Dybkjær. 2473-2476 [doi]

Integrated development and on-the-fly simulation of multimodal dialogsSilke Goronzy, Nicole Beringer. 2477-2480 [doi]

Interactions between speech recognition problems and user emotionsMihai Rotaru, Diane J. Litman, Katherine Forbes-Riley. 2481-2484 [doi]

Webtalk: mining websites for interactively answering questionsJunlan Feng, Srihari Reddy, Murat Saraclar. 2485-2488 [doi]

Towards generic quality prediction models for spoken dialogue systems - a case studySebastian Möller. 2489-2492 [doi]

Robust access to large structured data using voice form-fillingS. Parthasarathy, Cyril Allauzen, R. Munkong. 2493-2496 [doi]

Spoken dialog system for real-time data captureEsther Levin, Alex Levin. 2497-2500 [doi]

A user study on the influence of mobile device class, synthesis method, data rate and lexicon on speech synthesis qualityMichael Pucher, Peter Fröhlich. 2501-2504 [doi]

User's experience of a commercial speech dialogue systemFang Chen, Yael Katzenellenbogen. 2505-2508 [doi]

Voice user interface design for automated directory assistanceEsther Levin, Amir M. Mané. 2509-2512 [doi]

Optimizing user experience through design of the spoken language understanding (SLU) moduleMaria Gabriela Alvarez-Ryan, Narendra K. Gupta, Barbara Hollister, Tirso Alonso. 2513-2516 [doi]

Interactive visualization of human-machine dialogsJeremy H. Wright, David A. Kapilow, Alicia Abella. 2517-2520 [doi]

Synthesising hyperarticulation in unit selection TTSMatthew P. Aylett. 2521-2524 [doi]

Symbolic prosody driven unit selection for highly natural synthetic speechDaniel Tihelka. 2525-2528 [doi]

Hybrid syllable/triphone speech synthesisJindrich Matousek, Zdenek Hanzlícek, Daniel Tihelka. 2529-2532 [doi]

A neural network approach for the design of the target cost function in unit-selection speech synthesisFrancisco Campillo Díaz, José Luis Alba, Eduardo Rodríguez Banga. 2533-2536 [doi]

FSM and k-nearest-neighbor for corpus based video-realistic audio-visual synthesisChristian Weiss. 2537-2540 [doi]

An embedded and concatenative approach to TTS of multiple languagesGui-Lin Chen, Ke-Song Han, Zhen-Li Yu, Dong-Jian Yue, Yi-Qing Zu. 2541-2544 [doi]

Morphing spectral envelopes using audio flowTony Ezzat, Ethan Meyers, James R. Glass, Tomaso Poggio. 2545-2548 [doi]

Linguistic features weighting for a text-to-speech system without prosody modelVincent Colotte, Richard Beaufort. 2549-2552 [doi]

Unit selection synthesis database development using utterance verificationIngunn Amdal, Torbjørn Svendsen. 2553-2556 [doi]

Refining phoneme segmentations using speaker-adaptive context dependent boundary modelsYong Zhao, Lijuan Wang, Min Chu, Frank K. Soong, Zhigang Cao. 2557-2560 [doi]

Customizing base unit set with speech database in TTS systemsYining Chen, Yong Zhao, Min Chu. 2561-2564 [doi]

Unit selection for speech synthesis based on a new acoustic target costSoufiane Rouibia, Olivier Rosec. 2565-2568 [doi]

Small footprint concatenative text-to-speech synthesis system using complex spectral envelope modelingDan Chazan, Ron Hoory, Zvi Kons, Ariel Sagi, Slava Shechtman, Alexander Sorin. 2569-2572 [doi]

High quality Spanish restricted-domain TTS oriented to a weather forecast applicationFrancesc Alías, Ignasi Iriondo Sanz, Lluís Formiga, Xavier Gonzalvo, Carlos Monzo, Xavier Sevillano. 2573-2576 [doi]

Comparing spectral distance measures for join cost optimization in concatenative speech synthesisIngmund Bjørkan, Torbjørn Svendsen, Snorre Farner. 2577-2580 [doi]

HMM-based European Portuguese TTS systemMaria João Barros, Ranniery Maia, Keiichi Tokuda, Fernando Gil Resende, Diamantino Freitas. 2581-2584 [doi]

Combining the flexibility of speech synthesis with the naturalness of pre-recorded audio: a comparison of two approaches to phrase-splicing TTSWael Hamza, John F. Pitrelli. 2585-2588 [doi]

Codec integrated voice conversion for embedded speech synthesisGuntram Strecha, Oliver Jokisch, Matthias Eichner, Rüdiger Hoffmann. 2589-2592 [doi]

Evaluation of VTLN-based voice conversion for embedded speech synthesisDavid Sündermann, Guntram Strecha, Antonio Bonafonte, Harald Höge, Hermann Ney. 2593-2596 [doi]

Model adaptation and adaptive training using ESAT algorithm for HMM-based speech synthesisJuri Isogai, Junichi Yamagishi, Takao Kobayashi. 2597-2600 [doi]

Embedded Cantonese TTS for multi-device access to web contentTien Ying Fung, Yuk-Chi Li, Eddie Sio, Icarus Lee, Helen M. Meng, P. C. Ching. 2601-2604 [doi]

Model based analysis of a diphone database for improved unit concatenationKarl Schnell, Arild Lacroix. 2605-2608 [doi]

Context-dependent word duration modelling for robust speech recognitionNing Ma, Phil Green. 2609-2612 [doi]

An energy search approach to variable frame rate front-end processing for robust ASRJulien Epps, Eric H. C. Choi. 2613-2616 [doi]

Non-linear estimation of voice activity to improve automatic recognition of noisy speechRoberto Gemello, Franco Mana, Renato de Mori. 2617-2620 [doi]

Voice activity detection based on optimally weighted combination of multiple featuresYusuke Kida, Tatsuya Kawahara. 2621-2624 [doi]

Soft decision strategy and adaptive compensation for robust speech recognition against impulsive noisePei Ding. 2625-2628 [doi]

Statistical class-based MFCC enhancement of filtered and band-limited speech for robust ASRNicolás Morales, Doroteo Torre Toledano, John H. L. Hansen, José Colás, Javier Garrido. 2629-2632 [doi]

Spectral entropy feature in full-combination multi-stream for robust ASRHemant Misra, Hervé Bourlard. 2633-2636 [doi]

Environment-independent mask estimation for missing-feature reconstructionWooil Kim, Richard M. Stern, Hanseok Ko. 2637-2640 [doi]

Soft harmonic masks for recognising speech in the presence of a competing speakerAndré Coy, Jon Barker. 2641-2644 [doi]

Comb filter decomposition for robust ASRLech Szymanski, Martin Bouchard. 2645-2648 [doi]

Investigating the role of the Lombard reflex in non-audible murmur (NAM) recognitionPanikos Heracleous, Tomomi Kaino, Hiroshi Saruwatari, Kiyohiro Shikano. 2649-2652 [doi]

Improved TEO feature-based automatic stress detection using physiological and acoustic speech sensorsEvan Ruzanski, John H. L. Hansen, Don Finan, James Meyerhoff, William Norris, Terry Wollert. 2653-2656 [doi]

Spectral subtraction using elliptic integral for multiplication factorTakeshi S. Kobayakawa. 2657-2660 [doi]

Robust distant speech recognition based on position dependent CMN using a novel multiple microphone processing techniqueLongbiao Wang, Norihide Kitaoka, Seiichi Nakagawa. 2661-2664 [doi]

Data collection and evaluation of speech recognition for motorbike ridersH. Tanaka, Hiroshi Fujimura, Chiyomi Miyajima, Takanori Nishino, Katunobu Itou, Kazuya Takeda. 2665-2668 [doi]

Application of a first-order differential microphone for efficient voice activity detection in a car platformAgustín Álvarez Marquina, Pedro Gómez Vilda, Victor Nieto Lluis, Rafael Martínez, Victoria Rodellar. 2669-2672 [doi]

Robust speech recognition for mobile devices in car noisePanji Setiawan, Suhadi Suhadi, Tim Fingscheidt, Sorel Stan. 2673-2676 [doi]

Evaluation and optimization of noise robust front-end technologies for the automatic recognition of Hungarian telephone speechPéter Mihajlik, Zoltán Tobler, Zoltán Tüske, Géza Gordos. 2677-2680 [doi]

A performance investigation of noisy voice recognition over IP telephony networksGang Chen, Douglas D. O'Shaughnessy, Hesham Tolba. 2681-2684 [doi]

Internal noise suppression for speech recognition by small robotsAkinori Ito, Takashi Kanayama, Motoyuki Suzuki, Shozo Makino. 2685-2688 [doi]

Temporal ICA for classification of acoustic events in a kitchen environmentFlorian Kraft, Robert Malkin, Thomas Schaaf, Alex Waibel. 2689-2692 [doi]

hello - is anybody at home? - about the minimum word accuracy of a smart home spoken dialogue systemJan Felix Krebber. 2693-2696 [doi]

The simulation of realistic acoustic input scenarios for speech recognition systemsHans-Günter Hirsch, Harald Finster. 2697-2700 [doi]

An agent-based framework for speech investigationMichael Walsh, Gregory M. P. O'Hare, Julie Carson-Berndsen. 2701-2704 [doi]

Switched split vector quantisation of line spectral frequencies for wideband speech codingStephen So, Kuldip K. Paliwal. 2705-2708 [doi]

A novel voicing cut-off determination for low bit-rate harmonic speech codingChangchun Bao, Jason Lukasiak, Christian Ritz. 2709-2712 [doi]

A partial decorrelation scheme for improved predictive open loop quantization with noise shapingHauke Krüger, Peter Vary. 2713-2716 [doi]

Using dynamic codebook re-ordering to exploit inter-frame correlation in MELP codersVenkatesh Krishnan, Thomas P. Barnwell III, David V. Anderson. 2717-2720 [doi]

Enhanced speech coding based on phonetic class segmentationAdriane Swalm Durey, Venkatesh Krishnan, Thomas P. Barnwell III. 2721-2724 [doi]

A pitch-synchronous pitch-cycle modification method for designing a hybrid i-MELP/waveform-matching speech coderAli Erdem Ertan, Thomas P. Barnwell III. 2725-2728 [doi]

A new structural preprocessor for low-bit rate speech codingJoon-Hyuk Chang, Jong Won Shin, Seung Yeol Lee, Nam Soo Kim. 2729-2732 [doi]

An improved GMM-based voice quality predictorTiago H. Falk, Wai-Yip Chan, Peter Kabal. 2733-2736 [doi]

High-quality memoryless subband coding of impulse responses at 22 bits per frameJan S. Erkelens. 2737-2740 [doi]

A study of variable pulse allocation for MPE and CELP coders based on PESQ analysisShi-Han Chen, Kuo-Guan Wu, Chih-Chung Kuo. 2741-2744 [doi]

Joint source-channel coding of LSP parameters for bursty channelsJosé L. Pérez-Córdoba, Antonio M. Peinado, Angel M. Gomez, Antonio J. Rubio. 2745-2748 [doi]

Adaptation and normalization experiments in speech recognition for 4 to 8 year old childrenDaniel Elenius, Mats Blomberg. 2749-2752 [doi]

PROSPECT features and their application to missing data techniques for vocal tract length normalizationWim Jansen, Hugo Van Hamme. 2753-2756 [doi]

Data driven subword unit modeling for speech recognition and its application to interactive reading tutorsAndreas Hagen, Bryan L. Pellom. 2757-2760 [doi]

The PF_STAR children's speech corpusAnton Batliner, Mats Blomberg, Shona D'Arcy, Daniel Elenius, Diego Giuliani, Matteo Gerosa, Christian Hacker, Martin J. Russell, Stefan Steidl, Michael Wong. 2761-2764 [doi]

The Swedish NICE corpus - spoken dialogues between children and embodied characters in a computer game scenarioLinda Bell, Johan Boye, Joakim Gustafson, Mattias Heldner, Anders Lindström, Mats Wirén. 2765-2768 [doi]

A preprocessing technique for improving speech intelligibility in reverberant environments: the effect of steady-state suppression on elderly peopleYusuke Miyauchi, Nao Hodoshima, Keiichi Yasu, Nahoko Hayashi, Takayuki Arai, Mitsuko Shindo. 2769-2772 [doi]

Synchronizing dialogue contributions of human users and virtual characters in a virtual reality environmentNorbert Pfleger, Markus Löckelt. 2773-2776 [doi]

Does active learning help automatic dialog act tagging in meeting data?Anand Venkataraman, Yang Liu, Elizabeth Shriberg, Andreas Stolcke. 2777-2780 [doi]

A principled approach for rejection threshold optimization in spoken dialog systemsDan Bohus, Alexander I. Rudnicky. 2781-2784 [doi]

Application of confidence measures for dialogue systems through the use of parallel speech recognizersDavid Pérez-Piñar López, Carmen García-Mateo. 2785-2788 [doi]

Multi-level information and automatic dialog acts detection in human-human spoken dialogsSophie Rosset, Delphine Tribout. 2789-2792 [doi]

From question answering to spoken dialogue: towards an information search assistant for interactive multimodal information extractionRieks op den Akker, Harry Bunt, Simon Keizer, Boris W. van Schooten. 2793-2796 [doi]

Pitch-effects in diphone recording: are logatomes inappropriate?Ulrich Reubold, Alexander Steffen. 2797-2800 [doi]

Speech parameter generation algorithm considering global variance for HMM-based speech synthesisTomoki Toda, Keiichi Tokuda. 2801-2804 [doi]

Performance evaluation of style adaptation for hidden semi-Markov model based speech synthesisMakoto Tachibana, Junichi Yamagishi, Takashi Masuko, Takao Kobayashi. 2805-2808 [doi]

A comparison of methods for speaker-dependent pronunciation tuning for text-to-speech synthesisGabriel Webster, Tina Burrows, Katherine Knill. 2809-2812 [doi]

Perceptually-based data-driven join costs: comparing join typesAnn K. Syrdal, Alistair Conkie. 2813-2816 [doi]

Discontinuity detection in concatenated speech synthesis based on nonlinear speech analysisYannis Pantazis, Yannis Stylianou, Esther Klabbers. 2817-2820 [doi]

Improving the discrimination between native accents when recorded over different channelsTingyao Wu, Dirk Van Compernolle, Jacques Duchateau, Qian Yang, Jean-Pierre Martens. 2821-2824 [doi]

Aligning and recognizing spoken books in different varieties of PortugueseIsabel Trancoso, António Joaquim Serralheiro, Céu Viana, Diamantino Caseiro. 2825-2828 [doi]

An acoustic segment modeling approach to automatic language identificationBin Ma, Haizhou Li, Chin-Hui Lee. 2829-2832 [doi]

Different size multilingual phone inventories and context-dependent acoustic models for language identificationDong Zhu, Martine Adda-Decker, Fabien Antoine. 2833-2836 [doi]

A text categorization approach to automatic language identificationSheng Gao, Bin Ma, Haizhou Li, Chin-Hui Lee. 2837-2840 [doi]

Advances in regional accent clustering in SwedishGiampiero Salvi. 2841-2844 [doi]

An architecture for seamless access to distributed multimodal servicesDavid Pearce, Jonathan Engelsma, James C. Ferrans, John Johnson. 2845-2848 [doi]

Robust speech recognition in ubiquitous networking and context-aware computingZheng-Hua Tan, Paul Dalsgaard, Børge Lindberg, Haitian Xu. 2849-2852 [doi]

Unified probabilistic approach to error concealment for distributed speech recognitionValentin Ion, Reinhold Haeb-Umbach. 2853-2856 [doi]

Combining packet loss compensation methods for robust distributed speech recognitionAlastair Bruce James, Ben Milner. 2857-2860 [doi]

Distributed ASR using speech coder data for efficient feature vector representationTrond Skogstad, Torbjørn Svendsen. 2861-2864 [doi]

Cluster-based modeling for ubiquitous speech recognitionSadaoki Furui, Tomohisa Ichiba, Takahiro Shinozaki, Edward W. D. Whittaker, Koji Iwano. 2865-2868 [doi]

The feature [sonorant] in lexical accessDanny R. Moates, Zinny S. Bond, Russell Fox, Verna Stockmal. 2869-2872 [doi]

Polder Dutch: aspects of the /ei/-lowering in standard DutchIrene Jacobi, Louis C. W. Pols, Jan Stroop. 2877-2880 [doi]

Production and perception of Vietnamese vowelsEric Castelli, René Carré. 2881-2884 [doi]

Using open quotient for the characterisation of Vietnamese glottalised tonesVu Ngoc Tuan, Christophe d'Alessandro, Alexis Michaud. 2885-2888 [doi]

On the acoustic characterization of ejective stops in Waima'aJohn Hajek, Mary Stevens. 2889-2892 [doi]

Spirantization of /p t k/ in Sienese Italian and so-called semi-fricativesMary Stevens, John Hajek. 2893-2896 [doi]

Italian geminates under speech rate and focalization changes: kinematic, acoustic, and perception dataBarbara Gili Fivela, Claudio Zmarich. 2897-2900 [doi]

A cross-linguistic study of vowel quantity in different word structures: Japanese, Finnish and CzechToshiko Isei-Jaakkola, Satoshi Asakawa. 2905-2908 [doi]

Acoustic properties of foreign accent: VOT variations in Moroccan-accented ItalianLaura Mori, Melissa Barkat-Defradas. 2909-2912 [doi]

The interrelation between the perception and production of English vowels by native speakers of Brazilian PortugueseAndréia S. Rauber, Paola Escudero, Ricardo Augusto Hoffmann Bion, Barbara O. Baptista. 2913-2916 [doi]

Czech voiced labiodental continuant discrimination from basic acoustic dataRadek Skarnitzl, Jan Volín. 2921-2924 [doi]

An elitist approach for extracting automatically well-realized speech sounds with high confidenceJean-Baptiste Maj, Anne Bonneau, Dominique Fohr, Yves Laprie. 2925-2928 [doi]

Applying multiple regression models for predicting word duration in a corpus of spontaneous speechNa'im R. Tyson. 2929-2932 [doi]

On European Portuguese automatic syllabificationCatarina Oliveira, Lurdes Castro Moutinho, António J. S. Teixeira. 2933-2936 [doi]

Rule-based grapheme-to-phoneme method for the GreekAimilios Chalamandaris, Spyros Raptis, Pirros Tsiakoulis. 2937-2940 [doi]

Assimilation and deletion phenomena involving word-final /n/ and word-initial /p, t, k/ in modern Greek: a codification of the observed variation intended for use in TTS synthesisConstandinos Kalimeris, George Mikros, Stelios Bakamidis. 2941-2944 [doi]

A German viseme-set for automatic transcription of input text used for audio-visual speech synthesisChristian Weiss, Bianca Aschenberner. 2945-2948 [doi]

Visual perception of anticipatory rounding gestures in FrenchJohanna-Pascale Roy. 2949-2952 [doi]

Hierarchical clustering of mixture tying using a partially observable Markov decision processMichael Jonas, James G. Schmolze. 2953-2956 [doi]

Flavors of Gaussian warpingPierre Ouellet, Gilles Boulianne, Patrick Kenny. 2957-2960 [doi]

Phoneme alignment based on discriminative learningJoseph Keshet, Shai Shalev-Shwartz, Yoram Singer, Dan Chazan. 2961-2964 [doi]

Comparison of low footprint acoustic modeling techniques for embedded ASR systemsJussi Leppänen, Imre Kiss. 2965-2968 [doi]

Factors in classification of stop consonant place of articulationAtiwong Suchato, Proadpran Punyabukkana. 2969-2972 [doi]

Cross-speaker articulatory position data for phonetic feature predictionArthur R. Toth, Alan W. Black. 2973-2976 [doi]

Improvements to fMPE for discriminative training of featuresDaniel Povey. 2977-2980 [doi]

Incorporating tone-related MLP posteriors in the feature representation for Mandarin ASRXin Lei, Mei-Yuh Hwang, Mari Ostendorf. 2981-2984 [doi]

Speech trajectory clustering for improved speech recognitionYan Han, Johan de Veth, Lou Boves. 2985-2988 [doi]

Selection of features and combination of classifiers using a fuzzy approach for acoustic event classificationAndrey Temko, Dusan Macho, Climent Nadeu. 2989-2992 [doi]

Multi-task learning strategies for a recurrent neural net in a hybrid tied-posteriors acoustic modelJan Stadermann, Wolfram Koska, Gerhard Rigoll. 2993-2996 [doi]

Revising Perceptual Linear Prediction (PLP)Florian Hönig, Georg Stemmer, Christian Hacker, Fabio Brugnara. 2997-3000 [doi]

Confidence measures in speech recognition based on probability distribution of likelihoodsJoel Pinto, R. N. V. Sitaram. 3001-3004 [doi]

Continuous local codebook features for multi- and cross-lingual acoustic phonetic modellingFrank Diehl, Asunción Moreno, Enric Monte. 3005-3008 [doi]

Augmented state space acoustic decoding for modeling local variability in speechAntonio Miguel, Eduardo Lleida, Richard C. Rose, Luis Buera, Alfonso Ortega. 3009-3012 [doi]

Auditory Teager energy cepstrum coefficients for robust speech recognitionDimitrios Dimitriadis, Petros Maragos, Alexandros Potamianos. 3013-3016 [doi]

A hybrid Maxent/HMM based ASR systemYasser Hifny, Steve Renals, Neil D. Lawrence. 3017-3020 [doi]

Regularizing linear discriminant analysis for speech recognitionHakan Erdogan. 3021-3024 [doi]

Comprehensive modulation representation for automatic speech recognitionYadong Wang, Steven Greenberg, Jayaganesh Swaminathan, Ramdas Kumaresan, David Poeppel. 3025-3028 [doi]

Segment-based phonetic class detection using minimum verification error (MVE) trainingQiang Fu, Biing-Hwang Juang. 3029-3032 [doi]

Acoustic and phonetic confusions in accented speech recognitionYi Liu, Pascale Fung. 3033-3036 [doi]

Auditory image model features for automatic speech recognitionMario E. Munich, Qiguang Lin. 3037-3040 [doi]

Applications of NAM microphones in speech recognition for privacy in human-machine communicationPanikos Heracleous, Tomomi Kaino, Hiroshi Saruwatari, Kiyohiro Shikano. 3041-3044 [doi]

A hybrid ANN/DBN approach to articulatory feature recognitionJoe Frankel, Simon King. 3045-3048 [doi]

Experiments on speaker tracking and segmentation in radio broadcast newsDaniel Moraru, Mathieu Ben, Guillaume Gravier. 3049-3052 [doi]

Unsupervised segmentation and verification of multi-speaker conversational speechEmanuele Dalmasso, Pietro Laface, Daniele Colibro, Claudio Vair. 3053-3056 [doi]

Focal speakers: a speaker selection method able to deal with heterogeneous similarity criteriaSacha Krstulovic, Frédéric Bimbot, Delphine Charlet, Olivier Boëffard. 3057-3060 [doi]

A model space framework for efficient speaker detectionMathieu Ben, Guillaume Gravier, Frédéric Bimbot. 3061-3064 [doi]

Speaker detection using acoustic event sequencesNicolas Scheffer, Jean-François Bonastre. 3065-3068 [doi]

Speaker clustering of unknown utterances based on maximum purity estimationWei-Ho Tsai, Hsin-Min Wang. 3069-3072 [doi]

Modified DISTBIC algorithm for speaker change detectionPetra Zochová, Vlasta Radová. 3073-3076 [doi]

Decision trees with improved efficiency for fast speaker verificationGilles Gonon, Rémi Gribonval, Frédéric Bimbot. 3077-3080 [doi]

A speaker independent liveness test for audio-visual biometricsNicolas Eveno, Laurent Besacier. 3081-3084 [doi]

Distributed speaker recognition using speaker-dependent VQ codebook and earth mover's distanceShingo Kuroiwa, Yoshiyuki Umeda, Satoru Tsuge, Fuji Ren. 3085-3088 [doi]

Speaker verification via articulatory feature-based conditional pronunciation modeling with vowel and consonant mixture modelsKa-Yee Leung, Man-Wai Mak, Man-Hung Siu, Sun-Yuan Kung. 3089-3092 [doi]

Prosodic features based on wavelet analysis for speaker verificationJixu Chen, Beiqian Dai, Jun Sun. 3093-3096 [doi]

Relevant information extraction for discriminative training applied to speaker identificationMohamed Mihoubi, Douglas D. O'Shaughnessy, Pierre Dumouchel. 3097-3100 [doi]

Conceiving a new sequence kernel and applying it to SVM speaker verificationJérôme Louradour, Khalid Daoudi. 3101-3104 [doi]

The predictive differential amplitude spectrum for robust speaker recognition in stationary noisesJing Deng, Thomas Fang Zheng, Jian Liu, Wenhu Wu. 3105-3108 [doi]

Data-driven clustering for blind feature mapping in speaker verificationMichael Mason, Robbie Vogt, Brendan Baker, Sridha Sridharan. 3109-3112 [doi]

Improved covariance modeling for GMM in speaker identificationXi Zhou, Zhiqiang Yao, Beiqian Dai. 3113-3116 [doi]

Modelling session variability in text-independent speaker verificationRobbie Vogt, Brendan Baker, Sridha Sridharan. 3117-3120 [doi]

Overlapping wavelet packet features for speaker verificationMihalis Siafarikas, Todor Ganchev, Nikolaos D. Fakotakis, George K. Kokkinakis. 3121-3124 [doi]

Using Hadamard ECOC in multi-class problems based on SVMAn-rong Yin, Xiang Xie, Jingming Kuang. 3125-3128 [doi]

Joint uncertainty decoding for noise robust speech recognitionH. Liao, M. J. F. Gales. 3129-3132 [doi]

Confidence scoring and rejection using multi-pass speech recognitionVincent Vanhoucke. 3133-3136 [doi]

Memory-enhanced MMSE-based channel error mitigation for distributed speech recognitionCheng-Lung Lee, Wen-Whei Chang. 3137-3140 [doi]

Designing multiple distinctive phonetic feature extractors for canonicalization by using clustering techniqueTakashi Fukuda, Muhammad Ghulam, Tsuneo Nitta. 3141-3144 [doi]

Efficient blind dereverberation framework for automatic speech recognitionKeisuke Kinoshita, Tomohiro Nakatani, Masato Miyoshi. 3145-3148 [doi]

Combining multi-source far distance speech recognition strategies: beamforming, blind channel and confusion network combinationMatthias Wölfel, John W. McDonough. 3149-3152 [doi]

Objective quality assessment of wideband speech by an extension of ITU-T Recommendation P.862Akira Takahashi, Atsuko Kurashima, Chiharu Morioka, Hideaki Yoshino. 3153-3156 [doi]

Quality control for UMTS-AMR speech channelsMarc Werner, Peter Vary. 3157-3160 [doi]

Perceptual postfilter estimation for low bit rate speech coders using Gaussian mixture modelsWei Chen, Peter Kabal, Turaj Zakizadeh Shabestary. 3161-3164 [doi]

SNR-dependent background noise compensation of PESQ values for cellular phone speechKengo Fujita, Tsuneo Kato, Hideaki Yamada, Hisashi Kawai. 3165 [doi]

A MFCC-based CELP speech coder for server-based speech recognition in network environmentsGil Ho Lee, Jae Sam Yoon, Hong Kook Kim. 3169-3172 [doi]

Distortion measures for vector quantization of noisy spectrumVolodya Grancharov, Jonas Samuelsson, W. Bastiaan Kleijn. 3173-3176 [doi]

On the integration of speech recognition and statistical machine translationEvgeny Matusov, Stephan Kanthak, Hermann Ney. 3177-3180 [doi]

Integrated n-best re-ranking for spoken language translationV. H. Quan, Marcello Federico, Mauro Cettolo. 3181-3184 [doi]

An n-gram-based statistical machine translation decoderJosep Maria Crego, José B. Mariño, Adrià de Gispert. 3185-3188 [doi]

Use of maximum entropy in natural word generation for statistical concept-based speech-to-speech translationLiang Gu, Yuqing Gao. 3189-3192 [doi]

Improving statistical machine translation by classifying and generalizing inflected verb formsAdrià de Gispert, José B. Mariño, Josep Maria Crego. 3193-3196 [doi]

Improved speech recognition word lattice translation by confidence measureAbdulvohid Bozarov, Yoshinori Sagisaka, Ruiqiang Zhang, Gen-ichiro Kikui. 3197-3200 [doi]

Vocal tract area function inversion by linear regression of cepstrumParham Mokhtari, Tatsuya Kitamura, Hironori Takemoto, Kiyoshi Honda. 3201-3204 [doi]

Introducing visual cues in acoustic-to-articulatory inversionOlov Engwall. 3205-3208 [doi]

Speech inversion and re-synthesisVictor N. Sorokin, Alexander S. Leonov, I. S. Makarov, A. I. Tsyplikhin. 3209-3212 [doi]

Teaching a vocal tract simulation to imitate stop consonantsMark Huckvale, Ian S. Howard. 3213-3216 [doi]

Using phonetic constraints in acoustic-to-articulatory inversionBlaise Potard, Yves Laprie. 3217-3220 [doi]

A support vector approach to the acoustic-to-articulatory mappingAsterios Toutios, Konstantinos G. Margaritis. 3221-3224 [doi]

Analysis by synthesis of speech prosody: the Prozed environmentDaniel Hirst, Cyril Auran. 3225-3228 [doi]

A discriminative approach to phrase break modellingStephen Cox. 3229-3232 [doi]

Stochastic and syntactic techniques for predicting phrase breaksIan Read, Stephen Cox. 3233-3236 [doi]

Tree-based prediction of prosodic phrase breaks on top of shallow textual featuresGerasimos Xydas, Panagiotis Zervas, Georgios Kouroupetroglou, Nikolaos D. Fakotakis, George K. Kokkinakis. 3237-3240 [doi]

Chinese prosodic phrasing with a constraint-based approachHonghui Dong, Jianhua Tao, Bo Xu. 3241-3244 [doi]

A probabilistic approach to prosodic word prediction for Mandarin Chinese TTSMinghui Dong, Kim-Teng Lua, Haizhou Li. 3245-3248 [doi]

Evaluation of a system for F0 contour prediction for European PortugueseJoão Paulo Teixeira, Diamantino Freitas, Hiroya Fujisaki. 3249-3252 [doi]

Analysis on command sequences of a F0 generation model for Mandarin speech and its application to their automatic extractionKe Li, Yoshinori Sagisaka. 3253-3256 [doi]

Corpus-based extraction of F0 contour generation process model parametersKeikichi Hirose, Yusuke Furuyama, Nobuaki Minematsu. 3257-3260 [doi]

Optimized selection of intonation dictionaries in corpus based intonation modellingDavid Escudero Mancebo, Valentín Cardeñoso-Payo. 3261-3264 [doi]

Generation of fundamental frequency contours for Mandarin speech synthesis based on tone nucleus modelQinghua Sun, Keikichi Hirose, Wentao Gu, Nobuaki Minematsu. 3265-3268 [doi]

On the inter-syllable coarticulation effect of pitch modeling for Mandarin speechChen-Yu Chiang, Yih-Ru Wang, Sin-Horng Chen. 3269-3272 [doi]

Training the tilt intonation model using the JEMA methodologyMatej Rojc, Pablo Daniel Agüero, Antonio Bonafonte, Zdravko Kacic. 3273-3276 [doi]

Piecewise linear stylization of pitch via wavelet analysisDagen Wang, Shrikanth Narayanan. 3277-3280 [doi]

Phonetic labeling and segmentation of mixed-lingual prosody databasesHarald Romsdorfer, Beat Pfister. 3281-3284 [doi]

Exploratory analysis of linguistic data based on genetic algorithm for robust modeling of the segmental duration of speechEdmilson Morais, Fábio Violaro. 3285-3288 [doi]

Annotation-mining for rhythm model comparison in Brazilian PortugueseDafydd Gibbon, Flaviane Romani Fernandes. 3289-3292 [doi]

A stochastic approach to phoneme and accent estimationTohru Nagano, Shinsuke Mori, Masafumi Nishimura. 3293-3296 [doi]

The detection of emphatic words using acoustic and lexical featuresJason M. Brenier, Daniel M. Cer, Daniel Jurafsky. 3297-3300 [doi]

Tone recognition in Mandarin using focusDinoj Surendran, Gina-Anne Levow, Yi Xu. 3301-3304 [doi]

An automatic intonation recognizer for the Polish language based on machine learning and expert knowledgeMikolaj Wypych. 3305-3308 [doi]

Generalized envelope matching technique for time-scale modification of speech (GEM-TSM)Atsuhiro Sakurai. 3309-3312 [doi]

Comparing HMM, maximum entropy, and conditional random fields for disfluency detectionYang Liu, Elizabeth Shriberg, Andreas Stolcke, Mary P. Harper. 3313-3316 [doi]

Recognizing speech from simultaneous speakersBhiksha Raj, Rita Singh, Paris Smaragdis. 3317-3320 [doi]

Polynomial dynamic time warping kernel support vector machines for dysarthric speech recognition with sparse training dataVincent Wan, James Carmichael. 3321-3324 [doi]

Flavoured acoustic model and combined spelling to sound for asymmetrical bilingual environmentR. Lejeune, J. Baude, C. Tchong, Hubert Crepy, Claire Waast-Richard. 3325-3328 [doi]

Genetic triangulation of graphical models for speech and language processingChris D. Bartels, Kevin Duh, Jeff Bilmes, Katrin Kirchhoff, Simon King. 3329-3332 [doi]

Improving speech recognition using a data-driven approachGuillermo Aradilla, Jithendra Vepa, Hervé Bourlard. 3333-3336 [doi]

Outlier detection for acoustic model training using robust statisticsShigeki Matsuda, Wolfgang Herbordt, Satoshi Nakamura. 3337-3340 [doi]

Optimization methods for discriminative trainingJonathan Le Roux, Erik McDermott. 3341-3344 [doi]

Segmentation of recordings based on partial transcriptionsPatrick Cardinal, Gilles Boulianne, Michel Comeau. 3345-3348 [doi]

A speaker independent continuous speech recognizer for AmharicHussien Seid, Björn Gambäck. 3349-3352 [doi]

Optimizing the structure of partly-hidden Markov models using weighted likelihood-ratio maximization criterionTetsuji Ogawa, Tetsunori Kobayashi. 3353-3356 [doi]

Multilingual speech recognition: a unified approachC. Santhosh Kumar, V. P. Mohandas, Haizhou Li. 3357-3360 [doi]

Detection of recognition errors based on classifiers trained on artificially created dataTomás Bartos, Ludek Müller. 3361-3364 [doi]

On designing and evaluating speech event detectorsJinyu Li, Chin-Hui Lee. 3365-3368 [doi]

Local word confidence measure using word graph and n-best listJoseph Razik, Odile Mella, Dominique Fohr, Jean-Paul Haton. 3369-3372 [doi]

Mandarin/English mixed-lingual name recognition for mobile phoneXiaolin Ren, Xin He, Yaxin Zhang. 3373-3376 [doi]

New word-level and sentence-level confidence scoring using graph theory calculus and its evaluation on speech understandingJavier Ferreiros, Rubén San Segundo, Fernando F. Fernández-Martínez, Luis Fernando D'Haro, Valentín Sama, Roberto Barra-Chicote, Pedro Mellén. 3377-3380 [doi]

Analysis of spectral space reduction in spontaneous speech and its effects on speech recognition performancesMasanobu Nakamura, Koji Iwano, Sadaoki Furui. 3381-3384 [doi]

SVitchboard 1: small vocabulary tasks from SwitchboardSimon King, Chris D. Bartels, Jeff Bilmes. 3385-3388 [doi]

Timing of experimentally elicited minimal responses as quantitative evidence for the use of intonation in projecting TRPsWieneke Wesseling, R. J. J. H. van Son. 3389-3392 [doi]

Linguistic and acoustic features depending on different situations - the experiments considering speech recognition rateShinya Yamada, Toshihiko Itoh, Kenji Araki. 3393-3396 [doi]

Towards VoiceXML compilation for portable embedded applications in ubiquitous environmentsDirk Bühler, Stefan W. Hamerich. 3397-3400 [doi]

Prosody in public speech: analyses of a news announcement and a political interviewEva Strangert. 3401-3404 [doi]

Characterising dialogue call-flows for pervasive environmentsAmit Anil Nanavati, Nitendra Rajput. 3405-3408 [doi]

An architecture for pluggable disambiguation mechanism for RDC based voice applicationsTanveer A. Faruquie, Pankaj Kankar, Nitendra Rajput, Abhishek Verma. 3409-3412 [doi]

Adapting dialog call-flows for pervasive devicesNitendra Rajput, Amit Anil Nanavati, Abhishek Kumar, Neeraj Chaudhary. 3413-3416 [doi]

Clarification questions to improve dialogue flow and speech recognition in spoken dialogue systemsUlf Krum, Hartwig Holzapfel, Alex Waibel. 3417-3420 [doi]

Speech interface for controlling an hi-fi audio system based on a Bayesian belief networks approach for dialog modelingFernando F. Fernández-Martínez, Javier Ferreiros, Valentín Sama, Juan Manuel Montero, Rubén San Segundo, Javier Macías Guarasa, Rafael García. 3421-3424 [doi]

Hierarchical language models for one-stage speech interpretationMatthias Thomae, Tibor Fábián, Robert Lieb, Günther Ruske. 3425-3428 [doi]

Spoken language understanding using layered n-gram modelingNick J.-C. Wang. 3429-3432 [doi]

Named entity recognition from spontaneous open-domain speechMihai Surdeanu, Jordi Turmo, Eli Comelles. 3433-3436 [doi]

Discriminative training and support vector machine for natural language call routingImed Zitouni, Hui Jiang, Qiru Zhou. 3437-3440 [doi]

A multiple classifier-based concept-spotting approach for robust spoken language understandingJihyun Eun, Minwoo Jeong, Gary Geunbae Lee. 3441-3444 [doi]

A flexible and integrated interface between speech recognition, speech interpretation and dialog managementRobert Lieb, Matthias Thomae, Günther Ruske, Daniel Bobbert, Frank Althoff. 3445-3448 [doi]

Incremental dependency parsing of Japanese spoken monologue based on clause boundariesTomohiro Ohno, Shigeki Matsubara, Hideki Kashioka, Naoto Kato, Yasuyoshi Inagaki. 3449-3452 [doi]

Situation based speech recognition for structuring baseball live gamesAtsushi Sako, Tetsuya Takiguchi, Yasuo Ariki. 3453-3456 [doi]

Semantic annotation of the French media dialog corpusHélène Bonneau-Maynard, Sophie Rosset, Christelle Ayache, A. Kuhn, Djamel Mostefa. 3457-3460 [doi]

Robust and efficient semantic parsing of free word order languages in spoken dialogue systemsRalf Engel. 3461-3464 [doi]

Conceptual language model design for spoken language understandingCatherine Kobus, Géraldine Damnati, Lionel Delphin-Poulat, Renato de Mori. 3465-3468 [doi]

From robust spoken language understanding to knowledge acquisition and managementLuís Seabra Lopes, António J. S. Teixeira, Marcelo Quinderé, Mário Rodrigues. 3469-3472 [doi]

Improving end-to-end performance of call classification through data confusion reduction and model tolerance enhancementCheng Wu, Xiang Li, Hong-Kwang Jeff Kuo, E. E. Jan, Vaibhava Goel, David Lubensky. 3473-3476 [doi]
