- Selected topics from 40 years of research on speech and speaker recognition. Sadaoki Furui. 1-8
- Connecting human and machine learning via probabilistic models of cognition. Thomas L. Griffiths. 9-12
- New horizons in the study of child language acquisition. Deb Roy. 13-20
- Transcribing human-directed speech for spoken language processing. Mari Ostendorf. 21-27
- Feature extraction for robust speech recognition using a power-law nonlinearity and power-bias subtraction. Chanwoo Kim, Richard M. Stern. 28-31
- Towards fusion of feature extraction and acoustic model training: a top down process for robust speech recognition. Yu-Hsiang Bosco Chiu, Bhiksha Raj, Richard M. Stern. 32-35
- Temporal modulation processing of speech signals for noise robust ASR. Hong You, Abeer Alwan. 36-39
- Progressive memory-based parametric non-linear feature equalization. Luz García, Roberto Gemello, Franco Mana, José C. Segura. 40-43
- Dynamic features in the linear domain for robust automatic speech recognition in a reverberant environment. Osamu Ichikawa, Takashi Fukuda, Ryuki Tachibana, Masafumi Nishimura. 44-47
- Local projections and support vector based feature selection in speech recognition. Antonio Miguel, Alfonso Ortega, Luis Buera, Eduardo Lleida. 48-51
- Feedforward control of a 3D physiological articulatory model for vowel production. Qiang Fang, Akikazu Nishikido, Jianwu Dang, Aijun Li. 52-55
- Articulatory modeling based on semi-polar coordinates and guided PCA technique. Jun Cai, Yves Laprie, Julie Busset, Fabrice Hirsch. 56-59
- Sequencing of articulatory gestures using cost optimization. Juraj Simko, Fred Cummins. 60-63
- From experiments to articulatory motion - a three dimensional talking head model. Xiao Bo Lu, William Thorpe, Kylie Foster, Peter Hunter. 64-67
- Towards robust glottal source modeling. Javier Pérez, Antonio Bonafonte. 68-71
- Sliding vocal-tract model and its application for vowel production. Takayuki Arai. 72-75
- Minimum hypothesis phone error as a decoding method for speech recognition. Haihua Xu, Daniel Povey, Jie Zhu, Guanyong Wu. 76-79
- Posterior-based out of vocabulary word detection in telephone speech. Stefan Kombrink, Lukás Burget, Pavel Matejka, Martin Karafiát, Hynek Hermansky. 80-83
- Automatic transcription system for meetings of the Japanese national congress. Yuya Akita, Masato Mimura, Tatsuya Kawahara. 84-87
- Cross-language bootstrapping for unsupervised acoustic model training: rapid development of a Polish speech recognition system. Jonas Lööf, Christian Gollan, Hermann Ney. 88-91
- Porting an European Portuguese broadcast news recognition system to Brazilian Portuguese. Alberto Abad, Isabel Trancoso, Nelson Neto, Céu Viana. 92-95
- Modeling northern and southern varieties of Dutch for STT. Julien Despres, Petr Fousek, Jean-Luc Gauvain, Sandrine Gay, Yvan Josse, Lori Lamel, Abdelkhalek Messaoudi. 96-99
- Nearly perfect detection of continuous F0 contour and frame classification for TTS synthesis. Thomas Ewender, Sarah Hoffmann, Beat Pfister. 100-103
- AM-FM estimation for speech based on a time-varying sinusoidal model. Yannis Pantazis, Olivier Rosec, Yannis Stylianou. 104-107
- Voice source waveform analysis and synthesis using principal component analysis and Gaussian mixture modelling. Jon Gudnason, Mark R. P. Thomas, Patrick A. Naylor, Dan P. W. Ellis. 108-111
- Model-based estimation of instantaneous pitch in noisy speech. Jung Ook Hong, Patrick J. Wolfe. 112-115
- Complex cepstrum-based decomposition of speech for glottal source estimation. Thomas Drugman, Baris Bozkurt, Thierry Dutoit. 116-119
- Relative importance of formant and whole-spectral cues for vowel perception. Masashi Ito, Keiji Ohara, Akinori Ito, Masafumi Yano. 124-127
- Influences of vowel duration on speaker-size estimation and discrimination. Chihiro Takeshima, Minoru Tsuzaki, Toshio Irino. 128-131
- High front vowels in Czech: a contrast in quantity or quality? Václav Jonás Podlipský, Radek Skarnitzl, Jan Volín. 132-135
- Effect of contralateral noise on energetic and informational masking on speech-in-speech intelligibility. Marjorie Dole, Michel Hoen, Fanny Meunier. 136-139
- Using location cues to track speaker changes from mobile, binaural microphones. Heidi Christensen, Jon Barker. 140-143
- A perceptual investigation of speech transcription errors involving frequent near-homophones in French and American English. Ioana Vasilescu, Martine Adda-Decker, Lori Lamel, Pierre A. Hallé. 144-147
- The role of glottal pulse rate and vocal tract length in the perception of speaker identity. Etienne Gaudrain, Su Li, Vin Shen Ban, Roy D. Patterson. 148-151
- Development of voicing categorization in deaf children with cochlear implant. Victoria Medina, Willy Serniclaes. 152-155
- Processing liaison-initial words in native and non-native French: evidence from eye movements. Annie Tremblay. 156-159
- Estimating the potential of signal and interlocutor-track information for language modeling. Nigel G. Ward, Benjamin H. Walker. 160-163
- Factor analysis and SVM for language recognition. Florian Verdet, Driss Matrouf, Jean-François Bonastre, Jean Hennebert. 164-167
- Exploring universal attribute characterization of spoken languages for spoken language recognition. Sabato Marco Siniscalchi, Jeremy Reed, Torbjørn Svendsen, Chin-Hui Lee. 168-171
- On the use of phonological features for automatic accent analysis. Abhijeet Sangwan, John H. L. Hansen. 172-175
- Language recognition using language factors. Fabio Castaldo, Sandro Cumani, Pietro Laface, Daniele Colibro. 176-179
- Automatic accent detection: effect of base units and boundary information. Je Hun Jeon, Yang Liu. 180-183
- Age verification using a hybrid speech processing approach. Ron M. Hecht, Omer Hezroni, Amit Manna, Ruth Aloni-Lavi, Gil Dobry, Amir Alfandary, Yaniv Zigel. 184-187
- Information bottleneck based age verification. Ron M. Hecht, Omer Hezroni, Amit Manna, Gil Dobry, Yaniv Zigel, Naftali Tishby. 188-191
- Discriminative n-gram selection for dialect recognition. Fred S. Richardson, William M. Campbell, Pedro A. Torres-Carrasquillo. 192-195
- Data-driven phonetic comparison and conversion between South African, British and American English pronunciations. Linsen Loots, Thomas Niesler. 196-199
- Target-aware language models for spoken language recognition. Rong Tong, Bin Ma, Haizhou Li, Engsiong Chng, Kong-Aik Lee. 200-203
- Language identification for speech-to-speech translation. Daniel Chung Yong Lim, Ian R. Lane. 204-207
- Using prosody and phonotactics in Arabic dialect identification. Fadi Biadsy, Julia Hirschberg. 208-211
- Refactoring acoustic models using variational expectation-maximization. Pierre L. Dognin, John R. Hershey, Vaibhava Goel, Peder A. Olsen. 212-215
- Investigations on convex optimization using log-linear HMMs for digit string recognition. Georg Heigold, David Rybach, Ralf Schlüter, Hermann Ney. 216-219
- Investigations on discriminative training in large scale acoustic model estimation. Janne Pylkkönen. 220-223
- Margin-space integration of MPE loss via differencing of MMI functionals for generalized error-weighted discriminative training. Erik McDermott, Shinji Watanabe, Atsushi Nakamura. 224-227
- Compacting discriminative feature space transforms for embedded devices. Etienne Marcheret, Jia-Yu Chen, Petr Fousek, Peder A. Olsen, Vaibhava Goel. 228-231
- A back-off discriminative acoustic model for automatic speech recognition. Hung-An Chang, James R. Glass. 232-235
- Efficient generation and use of MLP features for Arabic speech recognition. Junho Park, Frank Diehl, Mark J. F. Gales, Marcus Tomalin, Philip C. Woodland. 236-239
- A study of bootstrapping with multiple acoustic features for improved automatic speech recognition. Xiaodong Cui, Jian Xue, Bing Xiang, Bowen Zhou. 240-243
- Analysis of low-resource acoustic model self-training. Scott Novotney, Richard M. Schwartz. 244-247
- Log-linear model combination with word-dependent scaling factors. Björn Hoffmeister, Ruoying Liang, Ralf Schlüter, Hermann Ney. 248-251
- Enabling a user to specify an item at any time during system enumeration - item identification for barge-in-able conversational dialogue systems. Kyoko Matsuyama, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. 252-255
- System request detection in human conversation based on multi-resolution Gabor wavelet features. Tomoyuki Yamagata, Tetsuya Takiguchi, Yasuo Ariki. 256-259
- Using graphical models for mixed-initiative dialog management systems with realtime policies. Stefan Schwärzler, Stefan Maier, Joachim Schenk, Frank Wallhoff, Gerhard Rigoll. 260-263
- Conversation robot participating in and activating a group communication. Shinya Fujie, Yoichi Matsuyama, Hikaru Taniyama, Tetsunori Kobayashi. 264-267
- Recent advances in WFST-based dialog system. Chiori Hori, Kiyonori Ohtake, Teruhisa Misu, Hideki Kashioka, Satoshi Nakamura. 268-271
- A statistical dialog manager for the LUNA project. David Griol, Giuseppe Riccardi, Emilio Sanchis. 272-275
- A policy-switching learning approach for adaptive spoken dialogue agents. Heriberto Cuayáhuitl, Juventino Montiel-Hernández. 276-279
- Strategies for accelerating the design of dialogue applications using heuristic information from the backend database. Luis Fernando D'Haro, Ricardo de Córdoba, Rubén San Segundo, Javier Macías Guarasa, José Manuel Pardo. 280-283
- Feature-based summary space for stochastic dialogue modeling with hierarchical semantic frames. Florian Pinault, Fabrice Lefèvre, Renato de Mori. 284-287
- Language modeling and dialog management for address recognition. Rajesh Balchandran, Leonid Rachevsky, Larry Sansone. 288-291
- A framework for rapid development of conversational natural language call routing systems for call centers. Ea-Ee Jan, Hong-Kwang Kuo, Osamuyimen Stewart, David Lubensky. 292-295
- The MonAMI reminder: a spoken dialogue system for face-to-face interaction. Jonas Beskow, Jens Edlund, Björn Granström, Joakim Gustafson, Gabriel Skantze, Helena Tobiasson. 296-299
- Influence of training on direct and indirect measures for the evaluation of multimodal systems. Julia Seebode, Stefan Schaffer, Ina Wechsung, Florian Metze. 300-303
- Talking heads for interacting with spoken dialog smart-home systems. Christine Kühnel, Benjamin Weiss, Sebastian Möller. 304-307
- Speech generation from hand gestures based on space mapping. Aki Kunikoshi, Yu Qiao, Nobuaki Minematsu, Keikichi Hirose. 308-311
- The INTERSPEECH 2009 emotion challenge. Björn Schuller, Stefan Steidl, Anton Batliner. 312-315
- GTM-URL contribution to the INTERSPEECH 2009 emotion challenge. Santiago Planet, Ignasi Iriondo Sanz, Joan Claudi Socoró, Carlos Monzo, Jordi Adell. 316-319
- Emotion recognition using a hierarchical binary decision tree approach. Chi-Chun Lee, Emily Mower, Carlos Busso, Sungbok Lee, Shrikanth S. Narayanan. 320-323
- Improving automatic emotion recognition from speech signals. Elif Bozkurt, Engin Erzin, Çigdem Eroglu Erdem, A. Tanju Erdem. 324-327
- Exploring the benefits of discretization of acoustic features for speech emotion recognition. Thurid Vogt, Elisabeth André. 328-331
- Combining spectral and prosodic information for emotion recognition in the INTERSPEECH 2009 emotion challenge. Iker Luengo, Eva Navas, Inmaculada Hernáez. 332-335
- Acoustic emotion recognition using dynamic Bayesian networks and multi-space distributions. Roberto Barra-Chicote, Fernando F. Fernández-Martínez, Syaheerah L. Lutfi, Juan Manuel Lucas-Cuesta, Javier Macías Guarasa, Juan Manuel Montero, Rubén San Segundo, José Manuel Pardo. 336-339
- Emotion classification in children's speech using fusion of acoustic and linguistic features. Tim Polzehl, Shiva Sundaram, Hamed Ketabdar, Michael Wagner, Florian Metze. 340-343
- Cepstral and long-term features for emotion recognition. Pierre Dumouchel, Najim Dehak, Yazid Attabi, Réda Dehak, Narjès Boufaden. 344-347
- Brno University of Technology system for Interspeech 2009 emotion challenge. Marcel Kockmann, Lukás Burget, Jan Cernocký. 348-351
- Back-off language model compression. Boulos Harb, Ciprian Chelba, Jeffrey Dean, Sanjay Ghemawat. 352-355
- Improving broadcast news transcription with a precision grammar and discriminative reranking. Tobias Kaufmann, Thomas Ewender, Beat Pfister. 356-359
- Use of contexts in language model interpolation and adaptation. Xunying Liu, Mark J. F. Gales, Philip C. Woodland. 360-363
- Exploiting Chinese character models to improve speech recognition performance. Jim L. Hieronymus, Xunying Liu, Mark J. F. Gales, Philip C. Woodland. 364-367
- Constraint selection for topic-based MDI adaptation of language models. Gwénolé Lecorvé, Guillaume Gravier, Pascale Sébillot. 368-371
- Nonstationary latent Dirichlet allocation for speech recognition. Chuang-Hua Chueh, Jen-Tzung Chien. 372-375
- Categorical perception of speech without stimulus repetition. Jack C. Rogers, Matthew H. Davis. 376-379
- Non-automaticity of use of orthographic knowledge in phoneme evaluation. Anne Cutler, Chris Davis, Jeesun Kim. 380-383
- Learning and generalization of novel contrastive cues. Meghan Sumner. 384-387
- Vowel category perception affected by microdurational variations. Einar Meister, Stefan Werner. 388-391
- Perceptual grouping of alternating word pairs: effect of pitch difference and presentation rate. Nandini Iyer, Douglas Brungart, Brian D. Simpson. 392-395
- Comparing methods to find a best exemplar in a multidimensional space. Titia Benders, Paul Boersma. 396-399
- Autoregressive HMMs for speech synthesis. Matt Shannon, William Byrne. 400-403
- Asynchronous F0 and spectrum modeling for HMM-based speech synthesis. Cheng-Cheng Wang, Zhen-Hua Ling, Li-Rong Dai. 404-407
- A minimum v/u error approach to F0 generation in HMM-based TTS. Yao Qian, Frank K. Soong, Miaomiao Wang, Zhizheng Wu. 408-411
- Voiced/unvoiced decision algorithm for HMM-based speech synthesis. Shiyin Kang, Zhiwei Shuang, Quansheng Duan, Yong Qin, Lianhong Cai. 412-415
- Local minimum generation error criterion for hybrid HMM speech synthesis. Xavi Gonzalvo, Alexander Gutkin, Joan Claudi Socoró, Ignasi Iriondo Sanz, Paul Taylor. 416-419
- Thousands of voices for HMM-based speech synthesis. Junichi Yamagishi, Bela Usabaev, Simon King, Oliver Watts, John Dines, Jilei Tian, Rile Hu, Yong Guan, Keiichiro Oura, Keiichi Tokuda, Reima Karhila, Mikko Kurimo. 420-423
- Efficient combination of confidence measures for machine translation. Sylvain Raybaud, David Langlois, Kamel Smaïli. 424-427
- Incremental dialog clustering for speech-to-speech translation. David Stallard, Stavros Tsakalidis, Shirin Saleem. 428-431
- Iterative sentence-pair extraction from quasi-parallel corpora for machine translation. Ruhi Sarikaya, Sameer Maskey, R. Zhang, Ea-Ee Jan, D. Wang, Bhuvana Ramabhadran, Salim Roukos. 432-435
- RTTS: towards enterprise-level real-time speech transcription and translation services. Juan M. Huerta, Cheng Wu, Andrej Sakrajda, Sasha Caskey, Ea-Ee Jan, Alexander Faisman, Shai Ben-David, Wen Liu, Antonio Lee, Osamuyimen Stewart, Michael Frissora, David Lubensky. 436-439
- Using syntax in large-scale audio document translation. Jing Zheng, Necip Fazil Ayan, Wen Wang, David Burkett. 440-443
- Context-driven automatic bilingual movie subtitle alignment. Andreas Tsiartas, Prasanta Kumar Ghosh, Panayiotis G. Georgiou, Shrikanth S. Narayanan. 444-447
- Probabilistic effects on French [t] duration. Francisco Torreira, Mirjam Ernestus. 448-451
- On the production of sandhi phenomena in French: psycholinguistic and acoustic data. Odile Bagou, Violaine Michel, Marina Laganaro. 452-455
- Extreme reductions: contraction of disyllables into monosyllables in Taiwan Mandarin. Chierh Cheng, Yi Xu. 456-459
- Annotation and features of non-native Mandarin tone quality. Mitchell Peabody, Stephanie Seneff. 460-463
- On-line formant shifting as a function of F0. Katerina Chládková, Paul Boersma, Václav Jonás Podlipský. 464-467
- Production boundary between fricative and affricate in Japanese and Korean speakers. Kimiko Yamakawa, Shigeaki Amano, Shuichi Itahashi. 468-471
- Aerodynamics of fricative production in European Portuguese. Cátia M. R. Pinho, Luis M. T. Jesus, Anna Barney. 472-475
- Contextual effects on protrusion and lip opening for /i, y/. Anne Bonneau, Julie Buquet, Brigitte Wrobel-Dautcourt. 476-479
- Speech rate effects on European Portuguese nasal vowels. Catarina Oliveira, Paula Martins, António J. S. Teixeira. 480-483
- Relation of formants and subglottal resonances in Hungarian vowels. Tamás Gábor Csapó, Zsuzsanna Bárkányi, Tekla Etelka Gráczi, Tamás Bohm, Steven M. Lulich. 484-487
- Polyglot speech prosody control. Harald Romsdorfer. 488-491
- Weighted neural network ensemble models for speech prosody control. Harald Romsdorfer. 492-495
- Cross-language F0 modeling for under-resourced tonal languages: a case study on Thai-Mandarin. Vataya Boonpiam, Anocha Rugchatjaroen, Chai Wutiwiwatchai. 496-499
- Prosodic issues in synthesising Thadou, a Tibeto-Burman tone language. Dafydd Gibbon, Pramod Pandey, D. Mary Kim Haokip, Jolanta Bachan. 500-503
- Advanced unsupervised joint prosody labeling and modeling for Mandarin speech and its application to prosody generation for TTS. Chen-Yu Chiang, Sin-Horng Chen, Yih-Ru Wang. 504-507
- Optimization of t-tilt F0 modeling. Ausdang Thangthai, Anocha Rugchatjaroen, Nattanun Thatphithakkul, Ananlada Chotimongkol, Chai Wutiwiwatchai. 508-511
- A multi-level context-dependent prosodic model applied to durational modeling. Nicolas Obin, Xavier Rodet, Anne Lacheret-Dujour. 512-515
- Sentiment classification in English from sentence-level annotations of emotions regarding models of affect. Alexandre Trilla, Francesc Alías. 516-519
- Identification of contrast and its emphatic realization in HMM based speech synthesis. Leonardo Badino, J. Sebastian Andersson, Junichi Yamagishi, Robert A. J. Clark. 520-523
- How to improve TTS systems for emotional expressivity. Antonio Rui Ferreira Rebordão, Shaikh Mostafa Al Masum, Keikichi Hirose, Nobuaki Minematsu. 524-527
- State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis. Yi-Jian Wu, Yoshihiko Nankaku, Keiichi Tokuda. 528-531
- Real voice and TTS accent effects on intelligibility and comprehension for Indian speakers of English as a second language. Frederick Weber, Kalika Bali. 532-535
- Improving consistence of phonetic transcription for text-to-speech. Pablo Daniel Agüero, Antonio Bonafonte, Juan Carlos Tulli. 536-539
- On the development of matched and mismatched Italian children's speech recognition systems. Piero Cosi. 540-543
- Combination of acoustic and lexical speaker adaptation for disordered speech recognition. Oscar Saz, Eduardo Lleida, Antonio Miguel. 544-547
- Bilinear transformation space-based maximum likelihood linear regression frameworks. Hwa Jeon Song, Yongwon Jeong, Hyung Soon Kim. 548-551
- Speaking style adaptation for spontaneous speech recognition using multiple-regression HMM. Yusuke Ijima, Takeshi Matsubara, Takashi Nose, Takao Kobayashi. 552-555
- Acoustic class specific VTLN-warping using regression class trees. S. P. Rath, Srinivasan Umesh. 556-559
- Speaker normalization for template based speech recognition. Sébastien Demange, Dirk Van Compernolle. 560-563
- Improving the robustness with multiple sets of HMMs. Hans-Günter Hirsch, Andreas Kitzig. 564-567
- On the use of pitch normalization for improving children's speech recognition. Rohit Sinha, Shweta Ghai. 568-571
- Using VTLN matrices for rapid and computationally-efficient speaker adaptation with robustness to first-pass transcription errors. S. P. Rath, Srinivasan Umesh, Achintya Kumar Sarkar. 572-575
- Speaker adaptation based on two-step active learning. Koichi Shinoda, Hiroko Murakami, Sadaoki Furui. 576-579
- Tree-based estimation of speaker characteristics for speech recognition. Mats Blomberg, Daniel Elenius. 580-583
- A study on the influence of covariance adaptation on Jacobian compensation in vocal tract length normalization. D. Rama Sanand, S. P. Rath, Srinivasan Umesh. 584-587
- Designing spoken tutorial dialogue with children to elicit predictable but educationally valuable responses. Gregory Aist, Jack Mostow. 588-591
- Optimizing non-native speech recognition for CALL applications. Joost van Doremalen, Helmer Strik, Catia Cucchiarini. 592-595
- Evaluation of English intonation based on combination of multiple evaluation scores. Akinori Ito, Tomoaki Konno, Masashi Ito, Shozo Makino. 596-599
- A language-independent feature set for the automatic evaluation of prosody. Andreas Maier, Florian Hönig, Viktor Zeißler, Anton Batliner, E. Körner, N. Yamanaka, P. Ackermann, Elmar Nöth. 600-603
- Adapting the acoustic model of a speech recognizer for varied proficiency non-native spontaneous speech using read speech with language-specific pronunciation difficulty. Klaus Zechner, Derrick Higgins, René Lawless, Yoko Futagi, Sarah Ohls, George Ivanov. 604-607
- Analysis and utilization of MLLR speaker adaptation technique for learners' pronunciation evaluation. Dean Luo, Yu Qiao, Nobuaki Minematsu, Yutaka Yamauchi, Keikichi Hirose. 608-611
- Control of human generating force by use of acoustic information - study on onomatopoeic utterances for controlling small lifting-force. Miki Iimura, Taichi Sato, Kihachiro Tanaka. 612-615
- Mi-DJ: a multi-source intelligent DJ service. Ching-Hsien Lee, Hsu-Chih Wu. 616-619
- Human voice or prompt generation? can they co-exist in an application? Géza Németh, Csaba Zainkó, Mátyás Bartalis, Gábor Olaszy, Géza Kiss. 620-623
- Automatic vs. human question answering over multimedia meeting recordings. Quoc Anh Le, Andrei Popescu-Belis. 624-627
- Characterizing silent and pseudo-silent speech using radar-like sensors. John F. Holzrichter. 628-631
- Technologies for processing body-conducted speech detected with non-audible murmur microphone. Tomoki Toda, Keigo Nakamura, Takayuki Nagai, Tomomi Kaino, Yoshitaka Nakajima, Kiyohiro Shikano. 632-635
- Artificial speech synthesizer control by brain-computer interface. Jonathan S. Brumberg, Philip R. Kennedy, Frank H. Guenther. 636-639
- Visuo-phonetic decoding using multi-stream and context-dependent models for an ultrasound-based silent speech interface. Thomas Hueber, Elie-Laurent Benaroya, Gérard Chollet, Bruce Denby, Gérard Dreyfus, Maureen Stone. 640-643
- Disordered speech recognition using acoustic and sEMG signals. Yunbin Deng, Rupal Patel, James T. Heaton, Glen Colby, L. Donald Gilmore, Joao Cabrera, Serge H. Roy, Carlo J. De Luca, Geoffrey S. Meltzner. 644-647
- Impact of different speaking modes on EMG-based speech recognition. Michael Wand, Szu-Chen Stan Jou, Arthur R. Toth, Tanja Schultz. 648-651
- Synthesizing speech from electromyography using voice transformation techniques. Arthur R. Toth, Michael Wand, Tanja Schultz. 652-655
- Multimodal HMM-based NAM-to-speech conversion. Viet-Anh Tran, Gérard Bailly, Hélène Loevenbruck, Tomoki Toda. 656-659
- On the semi-supervised learning of multi-layered perceptrons. Jonathan Malkin, Amarnag Subramanya, Jeff Bilmes. 660-663
- Generalized discriminative feature transformation for speech recognition. Roger Hsiao, Tanja Schultz. 664-667
- A fast online algorithm for large margin training of continuous density hidden Markov models. Chih-Chieh Cheng, Fei Sha, Lawrence K. Saul. 668-671
- Maximum mutual information estimation via second order cone programming for large vocabulary continuous speech recognition. Dalei Wu, Baojie Li, Hui Jiang. 672-675
- Hidden conditional random field with distribution constraints for phone classification. Dong Yu, Li Deng, Alex Acero. 676-679
- Deterministic annealing based training algorithm for Bayesian speech recognition. Sayaka Shiota, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda. 680-683
- Connecting rhythm and prominence in automatic ESL pronunciation scoring. Emily Nava, Joseph Tepperman, Louis Goldstein, Maria Luisa Zubizarreta, Shrikanth S. Narayanan. 684-687
- Evaluating parameters for mapping adult vowels to imitative babbling. Ilana Heintz, Mary E. Beckman, Eric Fosler-Lussier, Lucie Ménard. 688-691
- Intonation of Japanese sentences spoken by English speakers. Chiharu Tsurutani. 692-695
- KLAIR: a virtual infant for spoken language acquisition research. Mark Huckvale, Ian S. Howard, Sascha Fagel. 696-699
- An articulatory analysis of phonological transfer using real-time MRI. Joseph Tepperman, Erik Bresch, Yoon-Chul Kim, Sungbok Lee, Louis Goldstein, Shrikanth S. Narayanan. 700-703
- Do multiple caregivers speed up language acquisition? Louis ten Bosch, Okko Johannes Räsänen, Joris Driesen, Guillaume Aimetti, Toomas Altosaar, Lou Boves, A. Corns. 704-707
- Grapheme to phoneme conversion using an SMT system. Antoine Laurent, Paul Deléglise, Sylvain Meignier. 708-711
- Lexical and phonetic modeling for Arabic automatic speech recognition. Long Nguyen, Tim Ng, Kham Nguyen, Rabih Zbib, John Makhoul. 712-715
- Assessing context and learning for isiZulu tone recognition. Gina-Anne Levow. 716-719
- A sequential minimization algorithm for finite-state pronunciation lexicon models. Simon Dobrisek, Bostjan Vesnicer, France Mihelic. 720-723
- A general-purpose 32 ms prosodic vector for hidden Markov modeling. Kornel Laskowski, Mattias Heldner, Jens Edlund. 724-727
- Vocabulary expansion through automatic abbreviation generation for Chinese voice search. Dong Yang, Yi-Cheng Pan, Sadaoki Furui. 728-731
- Perceptual cost function for cross-fading based concatenation. Qi Miao, Alexander Kain, Jan P. H. van Santen. 732-735
- Exploring automatic similarity measures for unit selection tuning. Daniel Tihelka, Jan Romportl. 736-739
- Towards intonation control in unit selection speech synthesis. Cédric Boidin, Olivier Boëffard, Thierry Moudenc, Géraldine Damnati. 740-743
- A novel approach to cost weighting in unit selection TTS. Jerome R. Bellegarda. 744-747
- Maximum likelihood unit selection for corpus-based speech synthesis. Abubeker Gamboa Rosales, Hamurabi Gamboa Rosales, Ruediger Hoffmann. 748-751
- A close look into the probabilistic concatenation model for corpus-based speech synthesis. Shinsuke Sakai, Ranniery Maia, Hisashi Kawai, Satoshi Nakamura. 752-755
- Simple physical models of the vocal tract for education in speech science. Takayuki Arai. 756-759
- Auto-meshing algorithm for acoustic analysis of vocal tract. Kyohei Hayashi, Nobuhiro Miki. 760-763
- Voice production model employing an interactive boundary-layer analysis of glottal flow. Tokihiko Kaburagi, Katsunori Daimo, Shogo Nakamura. 764-767
- Characteristics of two-dimensional finite difference techniques for vocal tract analysis and voice synthesis. Matt Speed, Damian T. Murphy, David M. Howard. 768-771
- Adaptation of a predictive model of tongue shapes. Chao Qin, Miguel Á. Carreira-Perpiñán. 772-775
- Using sensor orientation information for computational head stabilisation in 3D electromagnetic articulography (EMA). Christian Kroos. 776-779
- Collision threshold pressure before and after vocal loading. Laura Enflo, Johan Sundberg, Friedemann Pabst. 780-783
- Gender differences in the realization of vowel-initial glottalization. Elke Philburn. 784-787
- Stability and composition of functional synergies for speech movements in children and adults. Hayo Terband, Frits van Brenk, Pascal van Lieshout, Lian Nijland, Ben Maassen. 788-791
- An analysis of speech rate strategies in aging. Frits van Brenk, Hayo Terband, Pascal van Lieshout, Anja Lowit, Ben Maassen. 792-795
- Variability and stability in collaborative dialogues: turn-taking and filled pauses. Stefan Benus. 796-799
- Speaking in the presence of a competing talker. Youyi Lu, Martin Cooke. 800-803
- Effect of r-resonance information on intelligibility. Antje Heinrich, Sarah Hawkins. 804-807
- Perception of temporal cues at discourse boundaries. Hsin-Yi Lin, Janice Fon. 808-811
- Human audio-visual consonant recognition analyzed with three bimodal integration models. Zhanyu Ma, Arne Leijon. 812-815
- Effects of tempo in radio commercials on young and elderly listeners. Hanny den Ouden, Hugo Quené. 816-819
- Self-voice recognition in 4 to 5-year-old children. Sofia Strömbergsson. 820-823
- Are real tongue movements easier to speech read than synthesized? Olov Engwall, Preben Wik. 824-827
- Eliciting a hierarchical structure of human consonant perception task errors using formal concept analysis. Carmen Peláez-Moreno, Ana I. García-Moral, Francisco J. Valverde-Albacete. 828-831
- Acoustic and perceptual effects of vocal training in amateur male singing. Takeshi Saitou, Masataka Goto. 832-835
- Wavelet-based speaker change detection in single channel speech data. Michael Wiesenegger, Franz Pernkopf. 836-839
- An adaptive threshold computation for unsupervised speaker segmentation. Laura Docío Fernández, Paula Lopez-Otero, Carmen García-Mateo. 840-843
- A data-driven approach for estimating the time-frequency binary mask. Gibak Kim, Philipos C. Loizou. 844-847
- A semi-supervised version of heteroscedastic linear discriminant analysis. Haolang Zhou, Damianos Karakos, Andreas G. Andreou. 848-851
- Self-learning vector quantization for pattern discovery from speech. Okko Johannes Räsänen, Unto K. Laine, Toomas Altosaar. 852-855
- Monaural segregation of voiced speech using discriminative random fields. Rohit Prabhavalkar, Zhaozhang Jin, Eric Fosler-Lussier. 856-859
- Advancements in whisper-island detection within normally phonated audio streams. Chi Zhang, John H. L. Hansen. 860-863
- Joint segmentation and classification of dialog acts using conditional random fields. Matthias Zimmermann. 864-867
- Exploring complex vowels as phrase break correlates in a corpus of English speech with proPOSEL, a prosody and POS English lexicon. Claire Brierley, Eric Atwell. 868-871
- Automatic topic detection of recorded voice messages. Caroline Clemens, Stefan Feldes, Karlheinz Schuhmacher, Joachim Stegmann. 872-875
- Identification and automatic detection of parasitic speech sounds. Jindrich Matousek, Radek Skarnitzl, Pavel Machac, Jan Trmal. 876-879
- Phonetic alignment for speech synthesis in under-resourced languages. Daniel R. van Niekerk, Etienne Barnard. 880-883
- Improving initial boundary estimation for HMM-based automatic phonetic segmentation. Udochukwu Kalu Ogbureke, Julie Carson-Berndsen. 884-887
- Importance of nasality measures for speaker recognition data selection and performance prediction. Howard Lei, Eduardo López Gonzalo. 888-891
- Exploration of vocal excitation modulation features for speaker recognition. Ning Wang, P. C. Ching, Tan Lee. 892-895
- Speaker identification for whispered speech using modified temporal patterns and MFCCs. Xing Fan, John H. L. Hansen. 896-899
- Speaker diarization for meeting room audio. Hanwu Sun, Tin Lay Nwe, Bin Ma, Haizhou Li. 900-903
- Improving speaker segmentation via speaker identification and text segmentation. Runxin Li, Tanja Schultz, Qin Jin. 904-907
- Overall performance metrics for multi-condition speaker recognition evaluations. David A. van Leeuwen. 908-911
- Speaker identification using warped MVDR cepstral features. Matthias Wölfel, Qian Yang, Qin Jin, Tanja Schultz. 912-915
- Entropy based overlapped speech detection as a pre-processing stage for speaker diarization. Oshry Ben-Harush, Itshak Lapidot, Hugo Guterman. 916-919
- Speech style and speaker recognition: a case study. Marco Grimaldi, Fred Cummins. 920-923
- The majority wins: a method for combining speaker diarization systems. Marijn Huijbregts, David A. van Leeuwen, Franciska M. G. de Jong. 924-927
- Two-wire nuisance attribute projection. Yosef A. Solewicz, Hagai Aronowitz. 928-931
- Acoustic and high-speed digital imaging based analysis of pathological voice contributes to better understanding and differential diagnosis of neurological dysphonias and of mimicking phonatory disordersKrzysztof Izdebski, Yuling Yan, Melda Kunduk. 932-934 [doi]
- Normalized modulation spectral features for cross-database voice pathology detectionMaria E. Markaki, Yannis Stylianou. 935-938 [doi]
- Speech sample salience analysis for speech cycle detectionChristophe Mertens, Francis Grenez, Jean Schoentgen. 939-942 [doi]
- The use of telephone speech recordings for assessment and monitoring of cognitive function in elderly peopleViliam Rapcan, Shona D'Arcy, Nils Penard, Ian H. Robertson, Richard B. Reilly. 943-946 [doi]
- Optimized feature set to assess acoustic perturbations in dysarthric speechSunil Nagaraja, Eduardo Castillo Guerra. 947-950 [doi]
- A microphone-independent visualization technique for speech disordersAndreas Maier, Stefan Wenhardt, Tino Haderlein, Maria Schuster, Elmar Nöth. 951-954 [doi]
- Evaluation of the effect of the GSM full rate codec on the automatic detection of laryngeal pathologies based on cepstral analysisRubén Fraile, Carmelo Sánchez, Juan Ignacio Godino-Llorente, Nicolás Sáenz-Lechón, Víctor Osma-Ruiz, Juana M. Gutiérrez. 955-958 [doi]
- Cepstral analysis of vocal dysperiodicities in disordered connected speechAli Alpan, Jean Schoentgen, Youri Maryn, Francis Grenez, P. Murphy. 959-962 [doi]
- Standard information from patients: the usefulness of self-evaluation (measured with the French version of the VHI)Lise Crevier-Buchman, Stephanie Borel, Stéphane Hans, Madeleine Menard, Jacqueline Vaissière. 963-966 [doi]
- Intelligibility assessment in children with cleft lip and palate in Italian and GermanMarcello Scipioni, Matteo Gerosa, Diego Giuliani, Elmar Nöth, Andreas Maier. 967-970 [doi]
- Universidade de Aveiro's voice evaluation protocolLuis M. T. Jesus, Anna Barney, Ricardo Santos, Janine Caetano, Juliana Jorge, Pedro Sá Couto. 971-974 [doi]
- Fast speech recognition for voice destination entry in a car navigation systemHoon Chung, JeonGue Park, HyeonBae Jeon, Yunkeun Lee. 975-978 [doi]
- Improving perceived accuracy for in-car media searchYun-Cheng Ju, Michael L. Seltzer, Ivan Tashev. 979-982 [doi]
- Laying the foundation for in-car alcohol detection by speechFlorian Schiel, Christian Heinrich. 983-986 [doi]
- A voice search approach to replying to SMS messages in automobilesYun-Cheng Ju, Tim Paek. 987-990 [doi]
- Language modeling for what-with-where on GOOG-411Charl Johannes van Heerden, Johan Schalkwyk, Brian Strope. 991-994 [doi]
- Very large vocabulary voice dictation for mobile devicesJan Nouza, Petr Cerva, Jindrich Zdánský. 995-998 [doi]
- Did you say a BLUE banana? the prosody of contrast and abnormality in bulgarian and dutchDiana V. Dimitrova, Gisela Redeker, John C. J. Hoeks. 999-1002 [doi]
- A quantitative study of F0 peak alignment and sentence modalityHansjörg Mixdorff, Hartmut R. Pfitzinger. 1003-1006 [doi]
- Closely related languages, different ways of realizing focusSzu-wei Chen, Bei Wang, Yi Xu. 1007-1010 [doi]
- Cross-variety rhythm typology in portuguesePlínio Almeida Barbosa, Céu Viana, Isabel Trancoso. 1011-1014 [doi]
- Pitch adaptation in different age groups: boundary tones versus global pitchMarie Nilsenová, Marc Swerts, Véronique Houtepen, Heleen Dittrich. 1015-1018 [doi]
- Backchannel-inviting cues in task-oriented dialogueAgustín Gravano, Julia Hirschberg. 1019-1022 [doi]
- What's in an ontology for spoken language understandingSilvia Quarteroni, Giuseppe Riccardi, Marco Dinarelli. 1023-1026 [doi]
- A fundamental study of shouted speech for acoustic-based security systemHiroaki Nanjo, Hiroki Mikami, Hiroshi Kawano, Takanobu Nishiura. 1027-1030 [doi]
- Evaluating the potential utility of ASR n-best lists for incremental spoken dialogue systemsTimo Baumann, Okko Buß, Michaela Atterer, David Schlangen. 1031-1034 [doi]
- Improving the recognition of names by document-level clusteringBin Zhang, Wei Wu, Jeremy G. Kahn, Mari Ostendorf. 1035-1038 [doi]
- Robust dependency parsing for spoken language understanding of spontaneous speechFrédéric Béchet, Alexis Nasr. 1039-1042 [doi]
- Semantic role labeling with discriminative feature selection for spoken language understandingChao-Hong Liu, Chung-Hsien Wu. 1043-1046 [doi]
- A study of new approaches to speaker diarizationDouglas A. Reynolds, Patrick Kenny, Fabio Castaldo. 1047-1050 [doi]
- Redefining the Bayesian information criterion for speaker diarisationThemos Stafylakis, Vassilios Katsouros, George Carayannis. 1051-1054 [doi]
- Speaker diarization using divide-and-conquerShih-Sian Cheng, Chun-Han Tseng, Chia-Ping Chen, Hsin-Min Wang. 1055-1058 [doi]
- KL realignment for speaker diarization with multiple feature streamsDeepu Vijayasenan, Fabio Valente, Hervé Bourlard. 1059-1062 [doi]
- Speech overlap detection in a two-pass speaker diarization systemMarijn Huijbregts, David A. van Leeuwen, Franciska M. G. de Jong. 1063-1066 [doi]
- Improved speaker diarization of meeting speech with recurrent selection of representative speech segments and participant interaction pattern modelingKyu Jeong Han, Shrikanth S. Narayanan. 1067-1070 [doi]
- Spectral and temporal modulation features for phonetic recognitionStephen A. Zahorian, Hongbing Hu, Zhengqing Chen, Jiang Wu. 1071-1074 [doi]
- Use of harmonic phase information for polarity detection in speech signalsIbon Saratxaga, Daniel Erro, Inmaculada Hernáez, Iñaki Sainz, Eva Navas. 1075-1078 [doi]
- Finite mixture spectrogram modeling for multipitch tracking using a factorial hidden Markov modelMichael Wohlmayr, Franz Pernkopf. 1079-1082 [doi]
- Group-delay-deviation based spectral analysis of speechAnthony P. Stark, Kuldip K. Paliwal. 1083-1086 [doi]
- Speaker dependent mapping for low bit rate coding of throat microphone speechJoseph M. Anand, B. Yegnanarayana, Sanjeev Gupta, M. R. Kesheorey. 1087-1090 [doi]
- Analysis of Lombard speech using excitation source informationG. Bapineedu, B. Avinash, Suryakanth V. Gangashetty, B. Yegnanarayana. 1091-1094 [doi]
- A comparison of linear and nonlinear dimensionality reduction methods applied to synthetic speechAndrew Errity, John McKenna. 1095-1098 [doi]
- ZZT-domain immiscibility of the opening and closing phases of the LF GFM under frame length variationsChristian Fischer Pedersen, Ove Andersen, Paul Dalsgaard. 1099-1102 [doi]
- Dimension reducing of LSF parameters based on radial basis function neural networkHongjun Sun, Jianhua Tao, Huibin Jia. 1103-1106 [doi]
- Characterizing speaker variability using spectral envelopes of vowel soundsA. N. Harish, D. Rama Sanand, Srinivasan Umesh. 1107-1110 [doi]
- Analysis of band structures for speaker-specific information in FM feature extractionTharmarajah Thiruvaran, Eliathamby Ambikairajah, Julien Epps. 1111-1114 [doi]
- Artificial nasalization of speech sounds based on pole-zero models of spectral relations between mouth and nose signalsKarl Schnell, Arild Lacroix. 1115-1118 [doi]
- Error metrics for impaired auditory nerve responses of different phoneme groupsAndrew Hines, Naomi Harte. 1119-1122 [doi]
- Application of differential microphone array for IS-127 EVRC rate determination algorithmHenry Widjaja, Suryoadhi Wibowo. 1123-1126 [doi]
- Estimating the position and orientation of an acoustic source with a microphone array networkAlberto Yoshihiro Nakano, Seiichi Nakagawa, Kazumasa Yamamoto. 1127-1130 [doi]
- Singing voice detection in polyphonic music using predominant pitchVishweshwara Rao, S. Ramakrishnan, Preeti Rao. 1131-1134 [doi]
- Word stress assessment for computer aided language learningJuan Pablo Arias, Néstor Becerra Yoma, Hiram Vivanco. 1135-1138 [doi]
- A non-intrusive signal-based model for speech quality evaluation using automatic classification of background noisesAdrien Leman, Julien Faure, Etienne Parizet. 1139-1142 [doi]
- Acoustic event detection for spotting hot spots in podcastsKouhei Sumi, Tatsuya Kawahara, Jun Ogata, Masataka Goto. 1143-1146 [doi]
- Improving detection of acoustic events using audiovisual data and feature level fusionTaras Butko, Cristian Canton-Ferrer, Carlos Segura, Xavier Giró, Climent Nadeu, Javier Hernando, Josep R. Casas. 1147-1150 [doi]
- Detecting audio events for semantic video searchMiguel Bugalho, José Portelo, Isabel Trancoso, Thomas Pellegrini, Alberto Abad. 1151-1154 [doi]
- Factor analysis for audio-based video genre classificationMickael Rouvier, Driss Matrouf, Georges Linarès. 1155-1158 [doi]
- Robust audio-based classification of video genreMickael Rouvier, Georges Linarès, Driss Matrouf. 1159-1162 [doi]
- Fusing audio and video information for online speaker diarizationJoerg Schmalenstroeer, Martin Kelling, Volker Leutnant, Reinhold Haeb-Umbach. 1163-1166 [doi]
- Multimodal speaker verification using ancillary known speaker characteristics such as gender or ageGirija Chetty, Michael Wagner. 1167-1170 [doi]
- Discovering keywords from cross-modal input: ecological vs. engineering methods for enhancing acoustic repetitionsGuillaume Aimetti, Roger K. Moore, Louis ten Bosch, Okko Johannes Räsänen, Unto Kalervo Laine. 1171-1174 [doi]
- Incremental composition of static decoding graphsMiroslav Novak. 1175-1178 [doi]
- Evaluation of phone lattice based speech decodingJacques Duchateau, Kris Demuynck, Hugo Van Hamme. 1179-1182 [doi]
- A fully data parallel WFST-based large vocabulary continuous speech recognition on a graphics processing unitJike Chong, Ekaterina Gonina, Youngmin Yi, Kurt Keutzer. 1183-1186 [doi]
- Combined low level and high level features for out-of-vocabulary word detectionBenjamin Lecouteux, Georges Linarès, Benoît Favre. 1187-1190 [doi]
- Bayes risk approximations using time overlap with an application to system combinationBjörn Hoffmeister, Ralf Schlüter, Hermann Ney. 1191-1194 [doi]
- Unsupervised estimation of the language model scaling factorChristopher M. White, Ariya Rastrow, Sanjeev Khudanpur, Frederick Jelinek. 1195-1198 [doi]
- Simultaneous estimation of confidence and error cause in speech recognition using discriminative modelAtsunori Ogawa, Atsushi Nakamura. 1199-1202 [doi]
- A generalized composition algorithm for weighted finite-state transducersCyril Allauzen, Michael Riley, Johan Schalkwyk. 1203-1206 [doi]
- Word confidence using duration modelsStefano Scanzio, Pietro Laface, Daniele Colibro, Roberto Gemello. 1207-1210 [doi]
- A comparison of audio-free speech recognition error prediction methodsPreethi Jyothi, Eric Fosler-Lussier. 1211-1214 [doi]
- Automatic out-of-language detection based on confidence measures derived from LVCSR word and phone latticesPetr Motlícek. 1215-1218 [doi]
- Automatic estimation of decoding parameters using large-margin iterative linear programmingBrian Mak, Tom Ko. 1219-1222 [doi]
- Optimization of dereverberation parameters based on likelihood of speech recognizerRandy Gomez, Tatsuya Kawahara. 1223-1226 [doi]
- Application of noise robust MDT speech recognition on the SPEECON and speechdat-car databasesJort F. Gemmeke, Yujun Wang, Maarten Van Segbroeck, Bert Cranen, Hugo Van Hamme. 1227-1230 [doi]
- Model based feature enhancement for automatic speech recognition in reverberant environmentsAlexander Krueger, Reinhold Haeb-Umbach. 1231-1234 [doi]
- A study of mutual front-end processing method based on statistical model for noise robust speech recognitionMasakiyo Fujimoto, Kentaro Ishizuka, Tomohiro Nakatani. 1235-1238 [doi]
- Integrating codebook and utterance information in cepstral statistics normalization techniques for robust speech recognitionGuan-min He, Jeih-Weih Hung. 1239-1242 [doi]
- Reduced complexity equalization of lombard effect for speech recognition in noisy adverse environmentsHynek Boril, John H. L. Hansen. 1243-1246 [doi]
- Unsupervised training scheme with non-stereo data for empirical feature vector compensationLuis Buera, Antonio Miguel, Alfonso Ortega, Eduardo Lleida, Richard M. Stern. 1247-1250 [doi]
- Incremental adaptation with VTS and joint adaptively trained systemsFederico Flego, Mark J. F. Gales. 1251-1254 [doi]
- Target speech GMM-based spectral compensation for noise robust speech recognitionTakahiro Shinozaki, Sadaoki Furui. 1255-1258 [doi]
- Noise-robust feature extraction based on forward maskingSheng-Chiuan Chiou, Chia-Ping Chen. 1259-1262 [doi]
- Investigation into variants of joint factor analysis for speaker recognitionLukás Burget, Pavel Matejka, Valiantsina Hubeika, Jan Cernocký. 1263-1266 [doi]
- Improved GMM-based speaker verification using SVM-driven impostor dataset selectionMitchell McLaren, Robbie Vogt, Brendan Baker, Sridha Sridharan. 1267-1270 [doi]
- Adaptive individual background model for speaker verificationYossi Bar-Yosef, Yuval Bistritz. 1271-1274 [doi]
- Optimization of discriminative kernels in SVM speaker verificationShi-Xiong Zhang, Man-Wai Mak. 1275-1278 [doi]
- UBM-based sequence kernel for speaker recognitionZhenchun Lei. 1279-1282 [doi]
- GMM kernel by Taylor series for speaker verificationMinqiang Xu, Xi Zhou, Beiqian Dai, Thomas S. Huang. 1283-1286 [doi]
- Automatic syllabification for danish text-to-speech systemsJeppe Beck, Daniela Braga, João Nogueira, Miguel Sales Dias, Luís Pinto Coelho. 1287-1290 [doi]
- Hybrid approach to grapheme to phoneme conversion for KoreanJinsik Lee, Byeongchang Kim, Gary Geunbae Lee. 1291-1294 [doi]
- Robust LTS rules with the Combilex speech technology lexiconKorin Richmond, Robert A. J. Clark, Susan Fitt. 1295-1298 [doi]
- Letter-to-phoneme conversion by inference of rewriting rulesVincent Claveau. 1299-1302 [doi]
- Online discriminative training for grapheme-to-phoneme conversionSittichai Jiampojamarn, Grzegorz Kondrak. 1303-1306 [doi]
- Using same-language machine translation to create alternative target sequences for text-to-speech synthesisPeter Cahill, Jinhua Du, Andy Way, Julie Carson-Berndsen. 1307-1310 [doi]
- Watermark recovery from speech using inverse filtering and sign correlationRobert Morris, Ralph Johnson, Vladimir Goncharoff, Joseph DiVita. 1311-1314 [doi]
- Weighted linear prediction for speech analysis in noisy conditionsJouni Pohjalainen, Heikki Kallasjoki, Kalle J. Palomäki, Mikko Kurimo, Paavo Alku. 1315-1318 [doi]
- Log-spectral magnitude MMSE estimators under super-Gaussian densitiesRichard C. Hendriks, Richard Heusdens, Jesper Jensen. 1319-1322 [doi]
- Speech enhancement in a 2-dimensional area based on power spectrum estimation of multiple areas with investigation of existence of active sourcesYusuke Hioka, Ken'ichi Furuya, Youichi Haneda, Akitoshi Kataoka. 1323-1326 [doi]
- Modulation domain spectral subtraction for speech enhancementKuldip K. Paliwal, Belinda Schwerin, Kamil K. Wójcicki. 1327-1330 [doi]
- Variational loopy belief propagation for multi-talker speech recognitionSteven J. Rennie, John R. Hershey, Peder A. Olsen. 1331-1334 [doi]
- Enhancement of binaural speech using codebook constrained iterative binaural wiener filterNadir Cazi, T. V. Sreenivas. 1335-1338 [doi]
- A semi-blind source separation method with a less amount of computation suitable for tiny DSP modulesKazunobu Kondo, Makoto Yamada, Hideki Kenmochi. 1339-1342 [doi]
- Model-based speech separation: identifying transcription using orthogonalitySiu Wa Lee, Frank K. Soong, Tan Lee. 1343-1346 [doi]
- Enhanced minimum statistics technique incorporating soft decision for noise suppressionYun-Sik Park, Ji-Hyun Song, Jae Hun Choi, Joon-Hyuk Chang. 1347-1350 [doi]
- Effect of noise reduction on reaction time to speech in noiseMark Huckvale, Jayne Leak. 1351-1354 [doi]
- Joint noise reduction and dereverberation of speech using hybrid TF-GSC and adaptive MMSE estimatorBehdad Dashtbozorg, Hamid Reza Abutalebi. 1355-1358 [doi]
- A study on multiple sound source localization with a distributed microphone systemKook Cho, Takanobu Nishiura, Yoichi Yamashita. 1359-1362 [doi]
- Robust minimal variance distortionless speech power spectra enhancement using order statistic filter for microphone arrayTao Yu, John H. L. Hansen. 1363-1366 [doi]
- Speech enhancement minimizing generalized euclidean distortion using supergaussian priorsAmit Das, John H. L. Hansen. 1367-1370 [doi]
- STFT-based speech enhancement by reconstructing the harmonicsIman Haji Abolhassani, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy. 1371-1374 [doi]
- Joint speech enhancement and speaker identification using monte carlo methodsCiira Wa Maina, John MacLaren Walsh. 1375-1378 [doi]
- Combined discriminative training for multi-stream HMM-based audio-visual speech recognitionJing Huang, Karthik Visweswariah. 1379-1382 [doi]
- Cued speech recognition for augmentative communication in normal-hearing and hearing-impaired subjectsPanikos Heracleous, Denis Beautemps, Noureddine Aboutabit. 1383-1386 [doi]
- On acquiring speech production knowledge from articulatory measurements for phoneme recognitionDaniel Neiberg, G. Ananthakrishnan, Mats Blomberg. 1387-1390 [doi]
- Measuring the gap between HMM-based ASR and TTSJohn Dines, Junichi Yamagishi, Simon King. 1391-1394 [doi]
- Speech recognition with speech synthesis models by marginalising over decision tree leavesJohn Dines, Lakshmi Saheer, Hui Liang. 1395-1398 [doi]
- Detailed description of triphone model using SSS-free algorithmMotoyuki Suzuki, Daisuke Honma, Akinori Ito, Shozo Makino. 1399-1402 [doi]
- Decision tree acoustic models for ASRJitendra Ajmera, Masami Akamine. 1403-1406 [doi]
- Compression techniques applied to multiple speech recognition systemsCatherine Breslin, Matthew N. Stuttle, Kate Knill. 1407-1410 [doi]
- Graphical models for discrete hidden Markov models in speech recognitionAntonio Miguel, Alfonso Ortega, Luis Buera, Eduardo Lleida. 1411-1414 [doi]
- Factor analyzed HMM topology for speech recognitionChuan-Wei Ting, Jen-Tzung Chien. 1415-1418 [doi]
- Tied-state multi-path HMnet model using three-domain successive state splittingSoo-Young Suk, Hiroaki Kojima. 1419-1422 [doi]
- Acoustic modeling using exponential familiesVaibhava Goel, Peder A. Olsen. 1423-1426 [doi]
- Personalizing synthetic voices for people with progressive speech disorders: judging voice similaritySarah M. Creer, Stuart P. Cunningham, Phil D. Green, K. Fatema. 1427-1430 [doi]
- Electrolaryngeal speech enhancement based on statistical voice conversionKeigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano. 1431-1434 [doi]
- Age recognition for spoken dialogue systems: do we need it?Maria Wolters, Ravichander Vipperla, Steve Renals. 1435-1438 [doi]
- Speech-based and multimodal media center for different user groupsMarkku Turunen, Jaakko Hakulinen, Aleksi Melto, Juho Hella, Juha-Pekka Rajaniemi, Erno Mäkinen, Jussi Rantala, Tomi Heimonen, Tuuli Laivo, Hannu Soronen, Mervi Hansen, Pellervo Valkama, Toni Miettinen, Roope Raisamo. 1439-1442 [doi]
- Virtual speech reading support for hard of hearing in a domestic multi-media settingSamer Al Moubayed, Jonas Beskow, Anne-Marie Öster, Giampiero Salvi, Björn Granström, Nic van Son, Ellen Ormel. 1443-1446 [doi]
- Real-time correction of closed-captionsPatrick Cardinal, Gilles Boulianne. 1447-1450 [doi]
- Universal access: speech recognition for talkers with spastic dysarthriaHarsh Vardhan Sharma, Mark Hasegawa-Johnson. 1451-1454 [doi]
- Exploring speech therapy games with children on the autism spectrumMohammed E. Hoque, Joseph K. Lane, Rana El Kaliouby, Matthew S. Goodwin, Rosalind W. Picard. 1455-1458 [doi]
- Analyzing GMMs to characterize resonance anomalies in speakers suffering from apnoeaJosé Luis Blanco Murillo, Rubén Fernández Pozo, David Díaz Pardo de Vera, Álvaro Sigüenza, Luis A. Hernández Gómez, José Alcázar Ramírez. 1459-1462 [doi]
- On the mutual information between source and filter contributions for voice pathology detectionThomas Drugman, Thomas Dubuisson, Thierry Dutoit. 1463-1466 [doi]
- A system for detecting miscues in dyslexic read speechMorten Højfeldt Rasmussen, Zheng-Hua Tan, Børge Lindberg, Søren Holdt Jensen. 1467-1470 [doi]
- Techniques for rapid and robust topic identification of conversational telephone speechJonathan Wintrode, Scott Kulp. 1471-1474 [doi]
- Localization of speech recognition in spoken dialog systems: how machine translation can make our lives easierDavid Suendermann, Jackson Liscombe, Krishna Dayanidhi, Roberto Pieraccini. 1475-1478 [doi]
- Algorithms for speech indexing in microsoft reciteKunal Mukerjee, Shankar L. Regunathan, Jeffrey Cole. 1479-1482 [doi]
- Parallelized viterbi processor for 5,000-word large-vocabulary real-time continuous speech recognition FPGA systemTsuyoshi Fujinaga, Kazuo Miura, Hiroki Noguchi, Hiroshi Kawaguchi, Masahiko Yoshimoto. 1483-1486 [doi]
- SplaSH (spoken language search hawk): integrating time-aligned with text-aligned annotationsSara Romano, Elvio Cecere, Francesco Cutugno. 1487-1490 [doi]
- Podcastle: collaborative training of acoustic models on the basis of wisdom of crowds for podcast transcriptionJun Ogata, Masataka Goto. 1491-1494 [doi]
- A WFST-based log-linear framework for speaking-style transformationGraham Neubig, Shinsuke Mori, Tatsuya Kawahara. 1495-1498 [doi]
- Clusterrank: a graph based method for meeting summarizationNikhil Garg, Benoît Favre, Korbinian Riedhammer, Dilek Hakkani-Tür. 1499-1502 [doi]
- Leveraging sentence weights in a concept-based optimization framework for extractive meeting summarizationShasha Xie, Benoît Favre, Dilek Hakkani-Tür, Yang Liu. 1503-1506 [doi]
- Hybrids of supervised and unsupervised models for extractive speech summarizationShih-Hsiang Lin, Yueng-Tien Lo, Yao-Ming Yeh, Berlin Chen. 1507-1510 [doi]
- Automatic detection of audio advertisementsI. Dan Melamed, Yeon-Jun Kim. 1511-1514 [doi]
- Named entity network based on wikipediaSameer Maskey, Wisam Dakka. 1515-1518 [doi]
- The rhythm of text and the rhythm of utterances: from metrics to modelsDaniel Hirst. 1519-1522 [doi]
- No time to lose? time shrinking effects enhance the impression of rhythmic isochrony and fast speech ratePetra Wagner, Andreas Windmann. 1523-1526 [doi]
- Measuring speech rhythm variation in a model-based frameworkPlínio A. Barbosa. 1527-1530 [doi]
- Rhythm measures with language-independent segmentationAnastassia Loukina, Greg Kochanski, Chilin Shih, Elinor Keane, Ian Watson. 1531-1534 [doi]
- Investigating changes in the rhythm of maori over timeMargaret Maclagan, Catherine I. Watson, Jeanette King, Ray Harlow, Laura Thompson, Peter Keegan. 1535-1538 [doi]
- Effects of mora-timing in English rhythm control by Japanese learnersShizuka Nakamura, Hiroaki Kato, Yoshinori Sagisaka. 1539-1542 [doi]
- The dynamic dimension of the global speech-rhythm attributesJan Volín, Petr Pollák. 1543-1546 [doi]
- Vowel duration in pre-geminate contexts in PolishZofia Malisz. 1547-1550 [doi]
- Does session variability compensation in speaker recognition model intrinsic variation under mismatched conditions?Elizabeth Shriberg, Sachin S. Kajarekar, Nicolas Scheffer. 1551-1554 [doi]
- Variability compensated support vector machines applied to speaker verificationZahi N. Karam, William M. Campbell. 1555-1558 [doi]
- Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verificationNajim Dehak, Réda Dehak, Patrick Kenny, Niko Brümmer, Pierre Ouellet, Pierre Dumouchel. 1559-1562 [doi]
- Within-session variability modelling for factor analysis speaker verificationRobbie Vogt, Jason W. Pelecanos, Nicolas Scheffer, Sachin S. Kajarekar, Sridha Sridharan. 1563-1566 [doi]
- Speaker recognition by Gaussian information bottleneckRon M. Hecht, Elad Noor, Naftali Tishby. 1567-1570 [doi]
- Variational dynamic kernels for speaker verificationChris Longworth, Rogier C. van Dalen, Mark J. F. Gales. 1571-1574 [doi]
- Emotion dimensions and formant positionMartijn Goudbeek, Jean Philippe Goldman, Klaus R. Scherer. 1575-1578 [doi]
- Identifying uncertain words within an utterance via prosodic featuresHeather Pon-Barry, Stuart M. Shieber. 1579-1582 [doi]
- Evaluating evaluators: a case study in understanding the benefits and pitfalls of multi-evaluator modelingEmily Mower, Maja J. Mataric, Shrikanth S. Narayanan. 1583-1586 [doi]
- Responding to user emotional state by adding emotional coloring to utterancesJaime C. Acosta, Nigel G. Ward. 1587-1590 [doi]
- Analysis of laugh signals for detecting in continuous speechK. Sudheer Kumar, Sri Harish Reddy Mallidi, K. Sri Rama Murty, B. Yegnanarayana. 1591-1594 [doi]
- Data-driven clustering in emotional space for affect recognition using discriminatively trained LSTM networksMartin Wöllmer, Florian Eyben, Björn Schuller, Ellen Douglas-Cowie, Roddy Cowie. 1595-1598 [doi]
- On the estimation and the use of confusion-matrices for improving ASR accuracySantiago Omar Caballero Morales, Stephen J. Cox. 1599-1602 [doi]
- A study on soft margin estimation of linear regression parameters for speaker adaptationShigeki Matsuda, Yu Tsao, Jinyu Li, Satoshi Nakamura, Chin-Hui Lee. 1603-1606 [doi]
- Exploring the role of spectral smoothing in context of children's speech recognitionShweta Ghai, Rohit Sinha. 1607-1610 [doi]
- Unsupervised lattice-based acoustic model adaptation for speaker-dependent conversational telephone speech transcriptionKishan Thambiratnam, Frank Seide. 1611-1614 [doi]
- Rapid unsupervised adaptation using frame independent output probabilities of gender and context independent phoneme modelsSatoshi Kobashikawa, Atsunori Ogawa, Yoshikazu Yamaguchi, Satoshi Takahashi. 1615-1618 [doi]
- Bark-shift based nonlinear speaker normalization using the second subglottal resonanceShizhen Wang, Yi-Hui Lee, Abeer Alwan. 1619-1622 [doi]
- Many-to-many eigenvoice conversion with reference voiceYamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano. 1623-1626 [doi]
- Alleviating the one-to-many mapping problem in voice conversion with context-dependent modelingElizabeth Godoy, Olivier Rosec, Thierry Chonavel. 1627-1630 [doi]
- Efficient modeling of temporal structure of speech for applications in voice transformationBinh Phu Nguyen, Masato Akagi. 1631-1634 [doi]
- Cross-language voice conversion based on eigenvoicesMalorie Charlier, Yamato Ohtani, Tomoki Toda, Alexis Moinet, Thierry Dutoit. 1635-1638 [doi]
- Voice conversion using k-histograms and frame selectionAlejandro José Uriz, Pablo Daniel Agüero, Antonio Bonafonte, Juan Carlos Tulli. 1639-1642 [doi]
- Online model adaptation for voice conversion using model-based speech synthesis techniquesDalei Wu, Baojie Li, Hui Jiang, Qian-Jie Fu. 1643-1646 [doi]
- Fast transcription of unstructured audio recordingsBrandon Roy, Deb Roy. 1647-1650 [doi]
- Finding allophones: an evaluation on consonants in the TIMIT corpusTimothy Kempton, Roger K. Moore. 1651-1654 [doi]
- Automatic formant extraction for sociolinguistic analysis of large corporaKeelan Evanini, Stephen Isard, Mark Liberman. 1655-1658 [doi]
- Investigating phonetic information reduction and lexical confusabilityWilliam Hartmann, Eric Fosler-Lussier. 1659-1662 [doi]
- Improving phone recognition performance via phonetically-motivated unitsHyejin Hong, Minhwa Chung. 1663-1666 [doi]
- An evaluation of formant tracking methods on an Arabic databaseImen Jemaa, Oussama Rekhis, Kaïs Ouni, Yves Laprie. 1667-1670 [doi]
- Comparison of manual and automated estimates of subglottal resonancesWolfgang Wokurek, Andreas Madsack. 1671-1674 [doi]
- Using durational cues in a computational model of spoken-word recognitionOdette Scharenborg. 1675-1678 [doi]
- Second language discrimination vowel contrasts by adults speakers with a five vowel systemBianca Sisinni, Mirko Grimaldi. 1679-1682 [doi]
- Three-way laryngeal categorization of Japanese, French, English and Chinese plosives by Korean speakersTomohiko Ooigawa, Shigeko Shinohara. 1683-1686 [doi]
- The effect of F0 peak-delay on the L1 / L2 perception of English lexical stressShinichi Tokuma, Yi Xu. 1687-1690 [doi]
- Acoustic cues of palatalisation in plosive + lateral onset clustersDaniela Müller, Sidney Martin Mota. 1695-1698 [doi]
- Perception of English compound vs. phrasal stress: natural vs. synthetic speechIrene Vogel, Arild Hestvik, H. Timothy Bunnell, Laura Spinu. 1699-1702 [doi]
- New method for delexicalization and its application to prosodic tagging for text-to-speech synthesisMartti Vainio, Antti Suni, Tuomo Raitio, Jani Nurminen, Juhani Järvikivi, Paavo Alku. 1703-1706 [doi]
- Speech rate and pauses in non-native FinnishMinnaleena Toivola, Mietta Lennes, Eija Aho. 1707-1710 [doi]
- Modelling similarity perception of intonationUwe D. Reichel, Felicitas Kleber, Raphael Winkelmann. 1711-1714 [doi]
- Studying L2 suprasegmental features in asian Englishes: a position paperHelen Meng, Chiu-yu Tseng, Mariko Kondo, Alissa Harrison, Tanya Viscelgia. 1715-1718 [doi]
- Classification of disfluent phenomena as fluent communicative devices in specific prosodic contextsHelena Moniz, Isabel Trancoso, Ana Isabel Mata. 1719-1722 [doi]
- Cross-cultural perception of discourse phenomenaRolf Carlson, Julia Hirschberg. 1723-1726 [doi]
- Modelling vocabulary growth from birth to young adulthoodRoger K. Moore, Louis ten Bosch. 1727-1730 [doi]
- Adaptive non-negative matrix factorization in a computational model of language acquisitionJoris Driesen, Louis ten Bosch, Hugo Van Hamme. 1731-1734 [doi]
- Classifying clear and conversational speech based on acoustic featuresAkiko Amano-Kusumoto, John-Paul Hosom, Izhak Shafran. 1735-1738 [doi]
- The acoustic characteristics of Russian vowels in children of 6 and 7 years of ageElena E. Lyakso, Olga V. Frolova, Aleks S. Grigoriev. 1739-1742 [doi]
- Japanese children's acquisition of prosodic Politeness expressionsTakaaki Shochi, Donna Erickson, Kaoru Sekiyama, Albert Rilliard, Véronique Aubergé. 1743-1746 [doi]
- Perceptual training of singleton and geminate stops in Japanese language by Korean learnersMee Sonu, Keiichi Tajima, Hiroaki Kato, Yoshinori Sagisaka. 1747-1750 [doi]
- A Bayesian approach to Hidden Semi-Markov Model based speech synthesisKei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda. 1751-1754 [doi]
- Rich context modeling for high quality HMM-based TTSZhi-Jie Yan, Yao Qian, Frank K. Soong. 1755-1758 [doi]
- Tying covariance matrices to reduce the footprint of HMM-based speech synthesis systemsKeiichiro Oura, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda. 1759-1762 [doi]
- The HMM synthesis algorithm of an embedded unified speech recognizer and synthesizerGuntram Strecha, Matthias Wolff, Frank Duckhorn, Sören Wittenberg, Constanze Tschöpe. 1763-1766 [doi]
- Syllable HMM based Mandarin TTS and comparison with concatenative TTSZhiwei Shuang, Shiyin Kang, Qin Shi, Yong Qin, Lianhong Cai. 1767-1770 [doi]
- Pulse density representation of spectrum for statistical speech processingYoshinori Shiga. 1771-1774 [doi]
- Parameterization of vocal fry in HMM-based speech synthesisHanna Silén, Elina Helander, Jani Nurminen, Moncef Gabbouj. 1775-1778 [doi]
- A deterministic plus stochastic model of the residual signal for improved parametric speech synthesisThomas Drugman, Geoffrey Wilfart, Thierry Dutoit. 1779-1782 [doi]
- A decision tree-based clustering approach to state definition in an excitation modeling framework for HMM-based speech synthesisRanniery Maia, Tomoki Toda, Keiichi Tokuda, Shinsuke Sakai, Satoshi Nakamura. 1783-1786 [doi]
- An improved minimum generation error based model adaptation for HMM-based speech synthesisYi-Jian Wu, Long Qin, Keiichi Tokuda. 1787-1790 [doi]
- Two-pass decision tree construction for unsupervised adaptation of HMM-based synthesis modelsMatthew Gibson. 1791-1794 [doi]
- Speaker adaptation using a parallel phone set pronunciation dictionary for Thai-English bilingual TTSAnocha Rugchatjaroen, Nattanun Thatphithakkul, Ananlada Chotimongkol, Ausdang Thangthai, Chai Wutiwiwatchai. 1795-1798 [doi]
- HMM-based automatic eye-blink synthesis from speechMichal Dziemianko, Gregor Hofer, Hiroshi Shimodaira. 1799-1802 [doi]
- Resources for speech research: present and future infrastructure needsLou Boves, Rolf Carlson, Erhard W. Hinrichs, David House, Steven Krauwer, Lothar Lemnitzer, Martti Vainio, Peter Wittenburg. 1803-1806 [doi]
- Speech recordings via the internet: an overview of the VOYS project in scotlandCatherine Dickie, Felix Schaeffler, Christoph Draxler, Klaus Jänsch. 1807-1810 [doi]
- The multi-session audio research project (MARP) corpus: goals, design and initial findingsAaron D. Lawson, A. R. Stauffer, Edward J. Cupples, Stanley J. Wenndt, W. P. Bray, John J. Grieco. 1811-1814 [doi]
- Structure and annotation of Polish LVCSR speech databaseKatarzyna Klessa, Grazyna Demenko. 1815-1818 [doi]
- Balanced corpus of informal spoken Czech: compilation, design and findingsMartina Waclawicová, Michal Kren, Lucie Válková. 1819-1822 [doi]
- JTrans: an open-source software for semi-automatic text-to-speech alignmentChristophe Cerisara, Odile Mella, Dominique Fohr. 1823-1826 [doi]
- Predicting the quality of multimodal systems based on judgments of single modalitiesIna Wechsung, Klaus-Peter Engelbrecht, Anja B. Naumann, Stefan Schaffer, Julia Seebode, Florian Metze, Sebastian Möller. 1827-1830 [doi]
- Auto-checking speech transcriptions by multiple template constrained posteriorLijuan Wang, Shenghao Qin, Frank K. Soong. 1831-1834 [doi]
- Subjective experiments on influence of response timing in spoken dialoguesToshihiko Itoh, Norihide Kitaoka, Ryota Nishimura. 1835-1838 [doi]
- Usability study of VUI consistent with GUI focusing on age-groupsJun Okamoto, Tomoyuki Kato, Makoto Shozakai. 1839-1842 [doi]
- Annotating communicative function and semantic content in dialogue act for construction of consulting dialogue systemsTeruhisa Misu, Kiyonori Ohtake, Chiori Hori, Hideki Kashioka, Satoshi Nakamura. 1843-1846 [doi]
- Improved speech summarization with multiple-hypothesis representations and kullback-leibler divergence measuresShih-Hsiang Lin, Berlin Chen. 1847-1850 [doi]
- An improved speech segmentation quality measure: the r-valueOkko Johannes Räsänen, Unto Kalervo Laine, Toomas Altosaar. 1851-1854 [doi]
- No sooner said than done? testing incrementality of semantic interpretations of spontaneous speechMichaela Atterer, Timo Baumann, David Schlangen. 1855-1858 [doi]
- Role of natural language understanding in voice local searchJunlan Feng, Srinivas Bangalore, Mazin Gilbert. 1859-1862 [doi]
- Recognition and correction of voice web search queriesKeith Vertanen, Per Ola Kristensson. 1863-1866 [doi]
- Semantic context effects in the recognition of acoustically unreduced and reduced wordsChao Wang, Johan Schalkwyk, Roberto Sicconi, Geoffrey Zweig, Marco van de Ven, Benjamin V. Tucker, Mirjam Ernestus. 1867-1870 [doi]
- Context effects and the processing of ambiguous words: further evidence from semantic incongruenceMichael C. W. Yip. 1871-1874 [doi]
- The roles of reconstruction and lexical storage in the comprehension of regular pronunciation variantsMirjam Ernestus. 1875-1878 [doi]
- Lexical embedding in spoken dutchOdette Scharenborg, Stefanie Okolowski. 1879-1882 [doi]
- Real-time lexical competitions during speech-in-speech comprehensionVéronique Boulenger, Michel Hoen, François Pellegrino, Fanny Meunier. 1883-1886 [doi]
- Discovering consistent word confusions in noiseMartin Cooke. 1887-1890 [doi]
- A large greek-English dictionary with incorporated speech and language processing toolsDimitrios P. Lyras, George K. Kokkinakis, Alexandros Lazaridis, Kyriakos N. Sgarbas, Nikos Fakotakis. 1891-1894 [doi]
- Predicting children's reading ability using evaluator-informed featuresMatthew Black, Joseph Tepperman, Sungbok Lee, Shrikanth S. Narayanan. 1895-1898 [doi]
- Automatic intonation classification for speech training systemsGyörgy Szaszák, David Sztahó, Klára Vicsi. 1899-1902 [doi]
- Automated pronunciation scoring using confidence scoring and landmark-based SVMSu-Youn Yoon, Mark Hasegawa-Johnson, Richard Sproat. 1903-1906 [doi]
- ASR based pronunciation evaluation with automatically generated competing vocabularyCarlos Molina, Néstor Becerra Yoma, Jorge Wuth, Hiram Vivanco. 1907-1910 [doi]
- High performance automatic mispronunciation detection method based on neural network and TRAP featuresHongyan Li, Shijin Wang, Jiaen Liang, Shen Huang, Bo Xu. 1911-1914 [doi]
- The semi-supervised switchboard transcription projectAmarnag Subramanya, Jeff Bilmes. 1915-1918 [doi]
- Maximum mutual information multi-phone units in direct modelingGeoffrey Zweig, Patrick Nguyen. 1919-1922 [doi]
- Profiling large-vocabulary continuous speech recognition on embedded devices: a hardware resource sensitivity analysisKai Yu, Rob A. Rutenbar. 1923-1926 [doi]
- Continuous speech recognition using attention shift decoding with soft decisionOzlem Kalinli, Shrikanth S. Narayanan. 1927-1930 [doi]
- Towards using hybrid word and fragment units for vocabulary independent LVCSR systemsAriya Rastrow, Abhinav Sethy, Bhuvana Ramabhadran, Frederick Jelinek. 1931-1934 [doi]
- Unsupervised training of an HMM-based speech recognizer for topic classificationHerbert Gish, Man-Hung Siu, Arthur Chan, William Belfield. 1935-1938 [doi]
- Constrained probabilistic subspace maps applied to speech enhancementKaustubh Kalgaonkar, Mark A. Clements. 1939-1942 [doi]
- Reconstructing clean speech from noisy MFCC vectorsBen Milner, Jonathan Darch, Ibrahim Almajai. 1943-1946 [doi]
- An evaluation of objective quality measures for speech intelligibility predictionCees H. Taal, Richard C. Hendriks, Richard Heusdens, Jesper Jensen, Ulrik Kjems. 1947-1950 [doi]
- Performance comparison of HMM and VQ based single channel speech separationMohammad H. Radfar, Wai-Yip Chan, Richard M. Dansereau, W. Wong. 1951-1954 [doi]
- Stereo-input speech recognition using sparseness-based time-frequency masking in a reverberant environmentYosuke Izumi, Kenta Nishiki, Shinji Watanabe, Takuya Nishimoto, Nobutaka Ono, Shigeki Sagayama. 1955-1958 [doi]
- Enhancing audio speech using visual speech featuresIbrahim Almajai, Ben Milner. 1959-1962 [doi]
- Perceiving surprise on cue words: prosody and semantics interact on right and reallyCatherine Lai. 1963-1966 [doi]
- Emotion recognition using linear transformations in combination with videoRok Gajsek, Vitomir Struc, Simon Dobrisek, France Mihelic. 1967-1970 [doi]
- Speaker dependent emotion recognition using prosodic supervectorsIgnacio Lopez-Moreno, Carlos Ortego-Resa, Joaquin Gonzalez-Rodriguez, Daniel Ramos. 1971-1974 [doi]
- Physiologically-inspired feature extraction for emotion recognitionYu Zhou, Yanqing Sun, Junfeng Li, Jianping Zhang, YongHong Yan. 1975-1978 [doi]
- Perceived loudness and voice quality in affect cueingIrena Yanushevskaya, Christer Gobl, Ailbhe Ní Chasaide. 1979-1982 [doi]
- Modeling mutual influence of interlocutor emotion states in dyadic spoken interactionsChi-Chun Lee, Carlos Busso, Sungbok Lee, Shrikanth S. Narayanan. 1983-1986 [doi]
- A detailed study of word-position effects on emotion expression in speechJangwon Kim, Sungbok Lee, Shrikanth S. Narayanan. 1987-1990 [doi]
- CMAC for speech emotion profilingNorhaslinda Kamaruddin, Abdul Wahab. 1991-1994 [doi]
- On the relevance of high-level features for speaker independent emotion recognition of spontaneous speechMarko Lugger, Bin Yang. 1995-1998 [doi]
- Recognising interest in conversational speech - comparing bag of frames and supra-segmental featuresBjörn Schuller, Gerhard Rigoll. 1999-2002 [doi]
- Classifying turn-level uncertainty using word-level prosodyDiane J. Litman, Mihai Rotaru, Greg Nicholas. 2003-2006 [doi]
- Detecting subjectivity in multiparty speechGabriel Murray, Giuseppe Carenini. 2007-2010 [doi]
- Pitch contour parameterisation based on linear stylisation for emotion recognitionVidhyasaharan Sethu, Eliathamby Ambikairajah, Julien Epps. 2011-2014 [doi]
- Feature-based and channel-based analyses of intrinsic variability in speaker verificationMartin Graciarena, Tobias Bocklet, Elizabeth Shriberg, Andreas Stolcke, Sachin S. Kajarekar. 2015-2018 [doi]
- Robust angry speech detection employing a TEO-based discriminative classifier combinationWooil Kim, John H. L. Hansen. 2019-2022 [doi]
- Improving emotion recognition using class-level spectral featuresDmitri Bitouk, Ani Nenkova, Ragini Verma. 2023-2026 [doi]
- Arousal and valence prediction in spontaneous emotional speech: felt versus perceived emotionKhiet P. Truong, David A. van Leeuwen, Mark A. Neerincx, Franciska M. G. de Jong. 2027-2030 [doi]
- Dimension reduction approaches for SVM based speaker age estimationGil Dobry, Ron M. Hecht, Mireille Avigal, Yaniv Zigel. 2031-2034 [doi]
- ANN based decision fusion for speech emotion recognitionLu Xu, Mingxing Xu, Dali Yang. 2035-2038 [doi]
- Processing affected speech within human machine interactionBogdan Vlasenko, Andreas Wendemuth. 2039-2042 [doi]
- Emotion recognition from speech using extended feature selection and a simple classifierAli Hassan, Robert I. Damper. 2043-2046 [doi]
- Optimal event search using a structural cost function - improvement of structure to speech conversionDaisuke Saito, Yu Qiao, Nobuaki Minematsu, Keikichi Hirose. 2047-2050 [doi]
- Deriving vocal tract shapes from electromagnetic articulograph data via geometric adaptation and matchingZiad Al Bawab, Lorenzo Turicchia, Richard M. Stern, Bhiksha Raj. 2051-2054 [doi]
- Towards unsupervised articulatory resynthesis of German utterances using EMA dataIngmar Steiner, Korin Richmond. 2055-2058 [doi]
- The klattgrid speech synthesizerDavid Weenink. 2059-2062 [doi]
- Development of a kenyan English text to speech system: a method of developing a TTS for a previously undefined English dialectMucemi Gakuru. 2063-2066 [doi]
- Feedback loop for prosody prediction in concatenative speech synthesisJavier Latorre, Sergio Gracia, Masami Akamine. 2067-2070 [doi]
- Assessing a speaker for fast speech in unit selection speech synthesisDonata Moers, Petra Wagner. 2071-2074 [doi]
- Unit selection based speech synthesis for poor channel conditionLing Cen, Minghui Dong, Paul Chan, Haizhou Li. 2075-2078 [doi]
- Vocalic sandwich, a unit designed for unit selection TTSDidier Cadic, Cédric Boidin, Christophe d'Alessandro. 2079-2082 [doi]
- Speech synthesis based on the plural unit selection and fusion method using FWF modelRyo Morinaka, Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima. 2083-2086 [doi]
- Speech synthesis without a phone inventoryMatthew P. Aylett, Simon King, Junichi Yamagishi. 2087-2090 [doi]
- Context-dependent additive log f_0 model for HMM-based speech synthesisHeiga Zen, Norbert Braunschweiler. 2091-2094 [doi]
- Real-time live broadcast news subtitling system for SpanishAlfonso Ortega, Jose Enrique Garcia, Antonio Miguel, Eduardo Lleida. 2095-2098 [doi]
- Development of the 2008 SRI Mandarin speech-to-text system for broadcast news and conversationXin Lei, Wei Wu, Wen Wang, Arindam Mandal, Andreas Stolcke. 2099-2102 [doi]
- Multifactor adaptation for Mandarin broadcast news and conversation speech recognitionWen Wang, Arindam Mandal, Xin Lei, Andreas Stolcke, Jing Zheng. 2103-2106 [doi]
- Development of the GALE 2008 Mandarin LVCSR systemChristian Plahl, Björn Hoffmeister, Georg Heigold, Jonas Lööf, Ralf Schlüter, Hermann Ney. 2107-2110 [doi]
- The RWTH aachen university open source speech recognition systemDavid Rybach, Christian Gollan, Georg Heigold, Björn Hoffmeister, Jonas Lööf, Ralf Schlüter, Hermann Ney. 2111-2114 [doi]
- Online detecting end times of spoken utterances for synchronization of live speech and its transcriptsJie Gao, QingWei Zhao, YongHong Yan. 2115-2118 [doi]
- Real-time ASR from meetingsPhilip N. Garner, John Dines, Thomas Hain, Asmaa El Hannani, Martin Karafiát, Danil Korchagin, Mike Lincoln, Vincent Wan, Le Zhang. 2119-2122 [doi]
- Improvements to the LIUM French ASR system based on CMU sphinx: what helps to significantly reduce the word error rate?Paul Deléglise, Yannick Estève, Sylvain Meignier, Téva Merlin. 2123-2126 [doi]
- Merging search spaces for subword spoken term detectionTimo Mertens, Daniel Schneider, Joachim Köhler. 2127-2130 [doi]
- A posterior probability-based system hybridisation and combination for spoken term detectionJavier Tejedor, Dong Wang, Simon King, Joe Frankel, José Colás. 2131-2134 [doi]
- Stochastic pronunciation modelling for spoken term detectionDong Wang, Simon King, Joe Frankel. 2135-2138 [doi]
- Term-dependent confidence for out-of-vocabulary term detectionDong Wang, Simon King, Joe Frankel, Peter Bell. 2139-2142 [doi]
- A comparison of query-by-example methods for spoken term detectionWade Shen, Christopher M. White, Timothy J. Hazen. 2143-2146 [doi]
- Fast keyword detection using suffix arrayKouichi Katsurada, Shigeki Teshima, Tsuneo Nitta. 2147-2150 [doi]
- Understanding speaker-listener interactionsDirk Heylen. 2151-2154 [doi]
- Detecting changes in speech expressiveness in participants of a radio programPlínio A. Barbosa. 2155-2158 [doi]
- An audio-visual approach to measuring discourse synchrony in multimodal conversation dataNick Campbell. 2159-2162 [doi]
- Towards flexible representations for analysis of accommodation of temporal features in spontaneous dialogue speechSpyros Kousidis, David Dorran, Ciaran McDonnell, Eugene Coyle. 2163-2166 [doi]
- Are we 'in sync': turn-taking in collaborative dialoguesStefan Benus. 2167-2170 [doi]
- An audio-visual attention system for online association learningMartin Heckmann, Holger Brandl, Xavier Domont, Bram Bolder, Frank Joublin, Christian Goerick. 2171-2174 [doi]
- A human benchmark for language recognitionRosemary Orr, David A. van Leeuwen. 2175-2178 [doi]
- Large margin estimation of Gaussian mixture model parameters with extended baum-welch for spoken language recognitionDonglai Zhu, Bin Ma, Haizhou Li. 2179-2182 [doi]
- Linguistically-motivated automatic classification of regional French varietiesCécile Woehrling, Philippe Boula de Mareüil, Martine Adda-Decker. 2183-2186 [doi]
- Discriminative acoustic language recognition via channel-compensated GMM statisticsNiko Brümmer, Albert Strasheim, Valiantsina Hubeika, Pavel Matejka, Lukás Burget, Ondrej Glembek. 2187-2190 [doi]
- Language score calibration using adapted Gaussian back-endMohamed Faouzi BenZeghiba, Jean-Luc Gauvain, Lori Lamel. 2191-2194 [doi]
- A framework for discriminative SVM/GMM systems for language recognitionWilliam M. Campbell, Zahi N. Karam. 2195-2198 [doi]
- Functional data analysis as a tool for analyzing speech dynamics - a case study on the French word c'étaitMichele Gubian, Francisco Torreira, Helmer Strik, Lou Boves. 2199-2202 [doi]
- Large-scale analysis of formant frequency estimation variability in conversational telephone speechNancy F. Chen, Wade Shen, Joseph P. Campbell, Reva Schwartz. 2203-2206 [doi]
- Developing an automatic functional annotation system for british English intonationSaandia Ali, Daniel Hirst. 2207-2210 [doi]
- Intrinsic vowel duration and the post-vocalic voicing effect: some evidence from dialects of north american EnglishJoshua Tauberer, Keelan Evanini. 2211-2214 [doi]
- Investigating /l/ variation in English through forced alignmentJiahong Yuan, Mark Liberman. 2215-2218 [doi]
- Structural analysis of dialects, sub-dialects and sub-sub-dialects of ChineseXuebin Ma, Akira Nemoto, Nobuaki Minematsu, Yu Qiao, Keikichi Hirose. 2219-2222 [doi]
- Voice activity detection using singular value decomposition-based filterHwa Jeon Song, Sung Min Ban, Hyung Soon Kim. 2223-2226 [doi]
- Voice activity detection using partially observable Markov decision processChiyoun Park, Namhoon Kim, Jeongmi Cho. 2227-2230 [doi]
- High-accuracy, low-complexity voice activity detection based on a posteriori SNR weighted energyZheng-Hua Tan, Børge Lindberg. 2231-2234 [doi]
- Fusing fast algorithms to achieve efficient speech detection in FM broadcastsStéphane Pigeon, Patrick Verlinde. 2235-2238 [doi]
- Robust speech recognition using VAD-measure-embedded decoderTasuku Oonishi, Paul R. Dixon, Koji Iwano, Sadaoki Furui. 2239-2242 [doi]
- Investigating privacy-sensitive features for speech detection in multiparty conversationsSree Hari Krishnan Parthasarathi, Mathew Magimai-Doss, Hervé Bourlard, Daniel Gatica-Perez. 2243-2246 [doi]
- Evaluation of external and internal articulator dynamics for pronunciation learningLan Wang, Hui Chen, JianJun Ouyang. 2247-2250 [doi]
- Robust audio-visual speech synchrony detection by generalized bimodal linear predictionKshitiz Kumar, Jiri Navratil, Etienne Marcheret, Vit Libal, Gerasimos Potamianos. 2251-2254 [doi]
- Acoustic-to-articulatory inversion using speech recognition and trajectory formation based on phoneme hidden Markov modelsAtef Ben Youssef, Pierre Badin, Gérard Bailly, Panikos Heracleous. 2255-2258 [doi]
- Speaker discriminability for visual speech modesJeesun Kim, Chris Davis, Christian Kroos, Harold Hill. 2259-2262 [doi]
- Audio-visual prosody of social attitudes in vietnamese: building and evaluating a tones balanced corpusDang-Khoa Mac, Véronique Aubergé, Albert Rilliard, Eric Castelli. 2263-2266 [doi]
- Direct, modular and hybrid audio to visual speech conversion methods - a comparative studyGyörgy Takács. 2267-2270 [doi]
- How similar are clusters resulting from schwa deletion in French to identical underlying clusters?Audrey Bürki, Cécile Fougeron, Christophe Veaux, Ulrich H. Frauenfelder. 2271-2274 [doi]
- Word-final [t]-deletion: an analysis on the segmental and sub-segmental levelBarbara Schuppler, Wim van Dommelen, Jacques C. Koreman, Mirjam Ernestus. 2275-2278 [doi]
- Rarefaction gestures and coarticulation in mangetti dune !xung clicksAmanda Miller, Abigail Scott, Bonny E. Sands, Sheena Shah. 2279-2282 [doi]
- The acoustics of mangetti dune !xung clicksAmanda Miller, Sheena Shah. 2283-2286 [doi]
- Acoustic characteristics of ejectives in amharicHussien Seid Worku, S. Rajendran, B. Yegnanarayana. 2287-2290 [doi]
- Sentence-final particles in hong kong Cantonese: are they tonal or intonational?Wing Li Wu. 2291-2294 [doi]
- Same tone, different category: linguistic-tonetic variation in the areal tone acoustics of chuqu wuWilliam Steed, Phil Rose. 2295-2298 [doi]
- Why would aspiration lower the pitch of the following vowel? observations from leng-shui-jiang ChineseCaicai Zhang. 2299-2302 [doi]
- Dialectal characteristics of osaka and tokyo Japanese: analyses of phonologically identical wordsKanae Amino, Takayuki Arai. 2303-2306 [doi]
- Categories and gradience in intonation: evidence from linguistics and neurobiologyBrechtje Post, Francis Nolan, Emmanuel A. Stamatakis, Toby Hudson. 2307-2310 [doi]
- Exploring vocalization of /l/ in English: an EPG and EMA studyMitsuhiro Nakamura. 2311-2314 [doi]
- The monophthongs and diphthongs of north-eastern welsh: an acoustic studyRobert Mayr, Hannah Davies. 2315-2318 [doi]
- Voicing profile of Polish sonorants: [r] in obstruent clustersJagoda Sieczkowska, Bernd Möbius, Antje Schweitzer, Michael Walsh, Grzegorz Dogil. 2319-2322 [doi]
- Mel, linear, and antimel frequency cepstral coefficients in broad phonetic regions for telephone speaker recognitionHoward Lei, Eduardo López Gonzalo. 2323-2326 [doi]
- Fast GMM computation for speaker verification using scalar quantization and discrete densitiesGuoli Ye, Brian Mak, Man-Wai Mak. 2327-2330 [doi]
- Text-independent speaker identification using vocal tract length normalization for building universal background modelA. K. Sarkar, Srinivasan Umesh, S. P. Rath. 2331-2334 [doi]
- BUT system for NIST 2008 speaker recognition evaluationLukás Burget, Michal Fapso, Valiantsina Hubeika, Ondrej Glembek, Martin Karafiát, Marcel Kockmann, Pavel Matejka, Petr Schwarz, Jan Cernocký. 2335-2338 [doi]
- Selection of the best set of shifted delta cepstral features in speaker verification using mutual informationJosé R. Calvo, Rafael Fernández, Gabriel Hernández. 2339-2342 [doi]
- Forensic speaker recognition using traditional features comparing automatic and human-in-the-loop formant trackingAlberto de Castro, Daniel Ramos, Joaquin Gonzalez-Rodriguez. 2343-2346 [doi]
- Open-set speaker identification under mismatch conditionsSurosh G. Pillay, Aladdin M. Ariyaeeinia, P. Sivakumaran, M. Pawlewski. 2347-2350 [doi]
- Minivectors: an improved GMM-SVM approach for speaker verificationXavier Anguera. 2351-2354 [doi]
- Robustness of phase based features for speaker recognitionR. Padmanabhan, Sree Hari Krishnan Parthasarathi, Hema A. Murthy. 2355-2358 [doi]
- The MIT lincoln laboratory 2008 speaker recognition systemDouglas E. Sturim, William M. Campbell, Zahi N. Karam, Douglas A. Reynolds, Fred S. Richardson. 2359-2362 [doi]
- Speaker recognition on lossy compressed speech using the speex codecA. R. Stauffer, Aaron D. Lawson. 2363-2366 [doi]
- Text-independent speaker verification using rank threshold in large number of speaker modelsHaruka Okamoto, Satoru Tsuge, Amira Abdelwahab, Masafumi Nishida, Yasuo Horiuchi, Shingo Kuroiwa. 2367-2370 [doi]
- The role of age in factor analysis for speaker identificationYun Lei, John H. L. Hansen. 2371-2374 [doi]
- Do humans and speaker verification system use the same information to differentiate voices?Juliette Kahn, Solange Rossato. 2375-2378 [doi]
- Noisy speech recognition by using output combination of discrete-mixture HMMs and continuous-mixture HMMsTetsuo Kosaka, You Saito, Masaharu Kato. 2379-2382 [doi]
- Adaptive training with noisy constrained maximum likelihood linear regression for noise robust speech recognitionD. K. Kim, M. J. F. Gales. 2383-2386 [doi]
- Performance comparisons of the integrated parallel model combination approaches with front-end noise reductionGuanghu Shen, Soo-Young Suk, Hyun-Yeol Chung. 2387-2390 [doi]
- Tuning support vector machines for robust phoneme classification with acoustic waveformsJibran Yousafzai, Zoran Cvetkovic, Peter Sollich. 2391-2394 [doi]
- An analytic derivation of a phase-sensitive observation model for noise robust speech recognitionVolker Leutnant, Reinhold Haeb-Umbach. 2395-2398 [doi]
- Variational model composition for robust speech recognition with time-varying background noiseWooil Kim, John H. L. Hansen. 2399-2402 [doi]
- Comparison of estimation techniques in joint uncertainty decoding for noise robust speech recognitionHaitian Xu, K. K. Chin. 2403-2406 [doi]
- Replacing uncertainty decoding with subband re-estimation for large vocabulary speech recognition in noiseJianhua Lu, Ji Ming, Roger Woods. 2407-2410 [doi]
- Perception and production of boundary tones in whispered dutchWillemijn Heeren, Vincent J. van Heuven. 2411-2414 [doi]
- Pitch accents and information status in a German radio news corpusKatrin Schweitzer, Arndt Riester, Michael Walsh, Grzegorz Dogil. 2415-2418 [doi]
- Analysis of voice fundamental frequency contours of continuing and terminating prosodic phrases in four swiss German dialectsAdrian Leemann, Keikichi Hirose, Hiroya Fujisaki. 2419-2422 [doi]
- Intonational features for identifying regional accents of ItalianMichelina Savino. 2423-2426 [doi]
- Analysis and recognition of accentual patternsAgnieszka Wagner. 2427-2430 [doi]
- Using responsive prosodic variation to acknowledge the user's current stateNigel G. Ward, Rafael Escalante-Ruiz. 2431-2434 [doi]
- Intonation segments and segmental intonationOliver Niebuhr. 2435-2438 [doi]
- The phrase-final accent in kammu: effects of tone, focus and engagementDavid House, Anastasia Karlsson, Jan-Olof Svantesson, Damrong Tayanin. 2439-2442 [doi]
- Tonal alignment in three varieties of hiberno-EnglishRaya Kalaldeh, Amelie Dorn, Ailbhe Ní Chasaide. 2443-2446 [doi]
- Determining intonational boundaries from the acoustic signalLourdes Aguilar, Antonio Bonafonte, Francisco Campillo, David Escudero Mancebo. 2447-2450 [doi]
- Compression and truncation revisitedClaudia K. Ohl, Hartmut R. Pfitzinger. 2451-2454 [doi]
- Comparison of Fujisaki-model extractors and F0 stylizersHartmut R. Pfitzinger, Hansjörg Mixdorff, Jan Schwarz. 2455-2458 [doi]
- Is tonal alignment interpretation independent of methodology?Caterina Petrone, Mariapaola D'Imperio. 2459-2462 [doi]
- Modeling the intonation of topic structure: two approachesMargaret Zellers, Brechtje Post, Mariapaola D'Imperio. 2463-2466 [doi]
- A user modeling-based performance analysis of a wizarded uncertainty-adaptive dialogue system corpusKatherine Forbes-Riley, Diane J. Litman. 2467-2470 [doi]
- Using dialogue-based dynamic language models for improving speech recognitionJuan Manuel Lucas-Cuesta, Fernando F. Fernández-Martínez, Javier Ferreiros. 2471-2474 [doi]
- Reinforcement learning for dialog management using least-squares policy iteration and fast feature selectionLihong Li, Jason D. Williams, Suhrid Balakrishnan. 2475-2478 [doi]
- Hybridisation of expertise and reinforcement learning in dialogue systemsRomain Laroche, Ghislain Putois, Philippe Bretier, Bernadette Bouchon-Meunier. 2479-2482 [doi]
- Bayesian learning of confidence measure function for generation of utterances and motions in object manipulation dialogue taskKomei Sugiura, Naoto Iwahashi, Hideki Kashioka, Satoshi Nakamura. 2483-2486 [doi]
- Predicting how it sounds: re-ranking dialogue prompts based on TTS quality for adaptive spoken dialogue systemsCédric Boidin, Verena Rieser, Lonneke van der Plas, Oliver Lemon, Jonathan Chevelu. 2487-2490 [doi]
- Accounting for the uncertainty of speech estimates in the complex domain for minimum mean square error speech enhancementRamón Fernandez Astudillo, Dorothea Kolossa, Reinhold Orglmeister. 2491-2494 [doi]
- Signal separation for robust speech recognition based on phase difference information obtained in the frequency domainChanwoo Kim, Kshitiz Kumar, Bhiksha Raj, Richard M. Stern. 2495-2498 [doi]
- Transforming features to compensate speech recogniser models for noiseRogier C. van Dalen, Federico Flego, Mark J. F. Gales. 2499-2502 [doi]
- Subband temporal modulation spectrum normalization for automatic speech recognition in reverberant environmentsXugang Lu, Masashi Unoki, Satoshi Nakamura. 2503-2506 [doi]
- Robust in-car spelling recognition - a tandem BLSTM-HMM approachMartin Wöllmer, Florian Eyben, Björn Schuller, Yang Sun, Tobias Moosmayr, Nhu Nguyen-Thien. 2507-2510 [doi]
- Applying non-negative matrix factorization on time-frequency reassignment spectra for missing data mask estimationMaarten Van Segbroeck, Hugo Van Hamme. 2511-2514 [doi]
- Experiments on automatic prosodic labelingAntje Schweitzer, Bernd Möbius. 2515-2518 [doi]
- German boundary tones show categorical perception and a perceptual magnet effect when presented in different contextsKatrin Schneider, Grzegorz Dogil, Bernd Möbius. 2519-2522 [doi]
- Eye tracking for the online evaluation of prosody in speech synthesis: not so fast!Michael White, Rajakrishnan Rajkumar, Kiwako Ito, Shari R. Speer. 2523-2526 [doi]
- Prosodic analysis of foreign-accented EnglishHansjörg Mixdorff, John Ingram. 2527-2530 [doi]
- Perception of the evolution of prosody in the French broadcast news stylePhilippe Boula de Mareüil, Albert Rilliard, Alexandre Allauzen. 2531-2534 [doi]
- Prosodic effects on vowel production: evidence from formant structureYoonsook Mo, Jennifer Cole, Mark Hasegawa-Johnson. 2535-2538 [doi]
- An adaptive BIC approach for robust audio stream segmentationJanez Zibert, Andrej Brodnik, France Mihelic. 2539-2542 [doi]
- Improving the robustness of phonetic segmentation to accent and style variation with a two-staged approachVaishali Patil, Shrikant Joshi, Preeti Rao. 2543-2546 [doi]
- Signature cluster model selection for incremental Gaussian mixture cluster modeling in agglomerative hierarchical speaker clusteringKyu Jeong Han, Shrikanth S. Narayanan. 2547-2550 [doi]
- Speaker segmentation and clustering for simultaneously presented speechLingyun Gu, Richard M. Stern. 2551-2554 [doi]
- Trimmed KL divergence between Gaussian mixtures for robust unsupervised acoustic anomaly detectionNash M. Borges, Gerard G. L. Meyer. 2555-2558 [doi]
- How to loose confidence: probabilistic linear machines for multiclass classificationHui Lin, Jeff Bilmes, Koby Crammer. 2559-2562 [doi]
- Quantifying wideband speech codec degradations via impairment factors: the new ITU-t p.834.1 methodology and its application to the g.711.1 codecSebastian Möller, Nicolas Côté, Atsuko Kurashima, Noritsugu Egi, Akira Takahashi. 2563-2566 [doi]
- SUXES - user experience evaluation method for spoken and multimodal interactionMarkku Turunen, Jaakko Hakulinen, Aleksi Melto, Tomi Heimonen, Tuuli Laivo, Juho Hella. 2567-2570 [doi]
- Results of the n-best 2008 dutch speech recognition evaluationDavid A. van Leeuwen, Judith M. Kessens, Eric Sanders, Henk van den Heuvel. 2571-2574 [doi]
- SHoUT, the university of twente submission to the n-best 2008 speech recognition evaluation for dutchMarijn Huijbregts, Roeland Ordelman, Laurens van der Werff, Franciska M. G. de Jong. 2575-2578 [doi]
- NIST 2008 speaker recognition evaluation: performance across telephone and room microphone channelsAlvin F. Martin, Craig S. Greenberg. 2579-2582 [doi]
- The ester 2 evaluation campaign for the rich transcription of French radio broadcastsSylvain Galliano, Guillaume Gravier, Laura Chaubard. 2583-2586 [doi]
- Differential vector quantization of feature vectors for distributed speech recognitionJose Enrique Garcia, Alfonso Ortega, Antonio Miguel, Eduardo Lleida. 2587-2590 [doi]
- Arithmetic coding of sub-band residuals in FDLP speech/audio codecPetr Motlícek, Sriram Ganapathy, Hynek Hermansky. 2591-2594 [doi]
- Pitch variation estimationTomas Bäckström, Stefan Bayer, Sascha Disch. 2595-2598 [doi]
- Soft decision-based acoustic echo suppression in a frequency domainYun-Sik Park, Ji-Hyun Song, Jae Hun Choi, Joon-Hyuk Chang. 2599-2602 [doi]
- Fine-granular scalable MELP coder based on embedded vector quantizationMouloud Djamah, Douglas D. O'Shaughnessy. 2603-2606 [doi]
- Joint quantization strategies for low bit-rate sinusoidal codingEmre Unver, Stephane Villette, Ahmet M. Kondoz. 2607-2610 [doi]
- Steganographic band width extension for the AMR codec of low-bit-rate modesAkira Nishimura. 2611-2614 [doi]
- Ultra low bit-rate speech coding based on unit-selection with joint spectral-residual quantization: no transmission of any residual informationV. Ramasubramanian, D. Harish. 2615-2618 [doi]
- On the cost of backward compatibility for communication codecsKonstantin Schmidt, Markus Schnell, Nikolaus Rettelbach, Manfred Lutzky, Jochen Issing. 2619-2622 [doi]
- A media-specific FEC based on Huffman coding for distributed speech recognitionYoung Han Lee, Hong Kook Kim. 2623-2626 [doi]
- HMM adaptation and voice conversion for the synthesis of child speech: a comparisonOliver Watts, Junichi Yamagishi, Simon King, Kay Berkling. 2627-2630 [doi]
- HMM-based speaker characteristics emphasis using average voice modelTakashi Nose, Junichi Adada, Takao Kobayashi. 2631-2634 [doi]
- An evaluation methodology for prosody transformation systems based on chirp signalsDamien Lolive, Nelly Barbot, Olivier Boëffard. 2635-2638 [doi]
- Voice morphing based on interpolation of vocal tract area functions using AR-HMM analysis of speechYoshiki Nambu, Masahiko Mikawa, Kazuyo Tanaka. 2639-2642 [doi]
- A novel model-based pitch conversion method for Mandarin speechHsin-Te Hwang, Chen-Yu Chiang, Po-Yi Sung, Sin-Horng Chen. 2643-2646 [doi]
- Observation of empirical cumulative distribution of vowel spectral distances and its application to vowel based voice conversionHideki Kawahara, Masanori Morise, Toru Takahashi, Hideki Banno, Ryuichi Nisimura, Toshio Irino. 2647-2650 [doi]
- Japanese pitch conversion for voice morphing based on differential modelingRyuki Tachibana, Zhiwei Shuang, Masafumi Nishimura. 2651-2654 [doi]
- A novel technique for voice conversion based on style and content decomposition with bilinear modelsVictor Popa, Jani Nurminen, Moncef Gabbouj. 2655-2658 [doi]
- Rule-based voice quality variation with formant synthesisFelix Burkhardt. 2659-2662 [doi]
- Multiple text segmentation for statistical language modelingSopheap Seng, Laurent Besacier, Brigitte Bigi, Eric Castelli. 2663-2666 [doi]
- Measuring tagging performance of a joint language modelDenis Filimonov, Mary P. Harper. 2667-2670 [doi]
- Improved language modelling using bag of word pairsLangzhou Chen, K. K. Chin, Kate Knill. 2671-2674 [doi]
- Morphological analysis and decomposition for Arabic speech-to-text systemsFrank Diehl, Mark J. F. Gales, Marcus Tomalin, Philip C. Woodland. 2675-2678 [doi]
- Investigating the use of morphological decomposition and diacritization for improving Arabic LVCSRAmr El-Desoky, Christian Gollan, David Rybach, Ralf Schlüter, Hermann Ney. 2679-2682 [doi]
- Topic dependent language model based on topic voting on noun historyWelly Naptali, Masatoshi Tsuchiya, Seiichi Nakagawa. 2683-2686 [doi]
- Investigation of morph-based speech recognition improvements across speech genresPéter Mihajlik, Balázs Tarján, Zoltán Tüske, Tibor Fegyó. 2687-2690 [doi]
- Effective use of pause information in language modelling for speech recognitionKengo Ohta, Masatoshi Tsuchiya, Seiichi Nakagawa. 2691-2694 [doi]
- A parallel training algorithm for hierarchical Pitman-Yor process language modelsSongfang Huang, Steve Renals. 2695-2698 [doi]
- Probabilistic and possibilistic language models based on the World Wide WebStanislas Oger, Vladimir Popescu, Georges Linarès. 2699-2702 [doi]
- Classification-based strategies for combining multiple 5-w question answering systemsSibel Yaman, Dilek Hakkani-Tür, Gökhan Tür, Ralph Grishman, Mary P. Harper, Kathleen McKeown, Adam Meyers, Kartavya Sharma. 2703-2706 [doi]
- Combining semantic and syntactic information sources for 5-w question answeringSibel Yaman, Dilek Hakkani-Tür, Gökhan Tür. 2707-2710 [doi]
- Phrase and word level strategies for detecting appositions in speechBenoît Favre, Dilek Hakkani-Tür. 2711-2714 [doi]
- Error correction of proportions in spoken opinion surveysNathalie Camelin, Renato de Mori, Frédéric Béchet, Géraldine Damnati. 2715-2718 [doi]
- Transformation-based learning for semantic parsingFilip Jurcícek, Milica Gasic, Simon Keizer, François Mairesse, Blaise Thomson, Kai Yu, Steve Young. 2719-2722 [doi]
- Large-scale Polish SLUPatrick Lehnen, Stefan Hahn, Hermann Ney, Agnieszka Mykowiecka. 2723-2726 [doi]
- Optimizing CRFs for SLU tasks in various languages using modified training criteriaStefan Hahn, Patrick Lehnen, Georg Heigold, Hermann Ney. 2727-2730 [doi]
- Learning lexicons from spoken utterances based on statistical model selectionRyo Taguchi, Naoto Iwahashi, Takashi Nose, Kotaro Funakoshi, Mikio Nakano. 2731-2734 [doi]
- Improving speech understanding accuracy with limited training data using multiple language models and multiple understanding modelsMasaki Katsumaru, Mikio Nakano, Kazunori Komatani, Kotaro Funakoshi, Tetsuya Ogata, Hiroshi G. Okuno. 2735-2738 [doi]
- Low-cost call type classification for contact center calls using partial transcriptsYoungja Park, Wilfried Teiken, Stephen C. Gates. 2739-2742 [doi]
- A new quality measure for topic segmentation of text and speechMehryar Mohri, Pedro Moreno, Eugene Weinstein. 2743-2746 [doi]
- Concept segmentation and labeling for conversational speechMarco Dinarelli, Alessandro Moschitti, Giuseppe Riccardi. 2747-2750 [doi]
- A noise-type and level-dependent MPO-based speech enhancement architecture with variable frame analysis for noise-robust speech recognitionVikramjit Mitra, Bengt J. Borgstrom, Carol Y. Espy-Wilson, Abeer Alwan. 2751-2754 [doi]
- Complementarity of MFCC, PLP and Gabor features in the presence of speech-intrinsic variabilitiesBernd T. Meyer, Birger Kollmeier. 2755-2758 [doi]
- Noise robustness of tract variables and their application to speech recognitionVikramjit Mitra, Hosung Nam, Carol Y. Espy-Wilson, Elliot Saltzman, Louis Goldstein. 2759-2762 [doi]
- Articulatory phonological code for word classificationXiaodan Zhuang, Hosung Nam, Mark Hasegawa-Johnson, Louis Goldstein, Elliot Saltzman. 2763-2766 [doi]
- Robust keyword spotting with rapidly adapting point process modelsAren Jansen, Partha Niyogi. 2767-2770 [doi]
- Automatically rating pronunciation through articulatory phonologyJoseph Tepperman, Louis Goldstein, Sungbok Lee, Shrikanth S. Narayanan. 2771-2774 [doi]
- Learning the structure of human-computer and human-human dialogsDavid Griol, Giuseppe Riccardi, Emilio Sanchis. 2775-2778 [doi]
- Pause and gap length in face-to-face interactionJens Edlund, Mattias Heldner, Julia Hirschberg. 2779-2782 [doi]
- Modeling other talkers for improved dialog act recognition in meetingsKornel Laskowski, Elizabeth Shriberg. 2783-2786 [doi]
- A closer look at quality judgments of spoken dialog systemsKlaus-Peter Engelbrecht, Felix Hartard, Florian Gödde, Sebastian Möller. 2787-2790 [doi]
- New methods for the analysis of repeated utterancesGeoffrey Zweig. 2791-2794 [doi]
- The effects of different voices for speech-based in-vehicle interfaces: impact of young and old voices on driving performance and attitudeIng-Marie Jonsson, Nils Dahlbäck. 2795-2798 [doi]
- In search of non-uniqueness in the acoustic-to-articulatory mappingG. Ananthakrishnan, Daniel Neiberg, Olov Engwall. 2799-2802 [doi]
- Estimation of articulatory gesture patterns from speech acousticsPrasanta Kumar Ghosh, Shrikanth S. Narayanan, Pierre L. Divenyi, Louis Goldstein, Elliot Saltzman. 2803-2806 [doi]
- Formant trajectories for acoustic-to-articulatory inversionI. Yücel Özbek, Mark Hasegawa-Johnson, Mübeccel Demirekler. 2807-2810 [doi]
- A robust variational method for the acoustic-to-articulatory problemBlaise Potard, Yves Laprie. 2811-2814 [doi]
- Comparison of vowel structures of Japanese and English in articulatory and auditory spacesJianwu Dang, Mark Tiede, Jiahong Yuan. 2815-2818 [doi]
- Static and dynamic modulation spectrum for speech recognitionSriram Ganapathy, Samuel Thomas, Hynek Hermansky. 2823-2826 [doi]
- 2-d processing of speech for multi-pitch analysisTianyu T. Wang, Thomas F. Quatieri. 2827-2830 [doi]
- A correlation-maximization denoising filter used as an enhancement frontend for noise robust bird call classificationWei Chu, Abeer Alwan. 2831-2834 [doi]
- Preliminary inversion mapping results with a new EMA corpusKorin Richmond. 2835-2838 [doi]
- Time-varying autoregressive tests for multiscale speech analysisDaniel Rudoy, Thomas F. Quatieri, Patrick J. Wolfe. 2839-2842 [doi]
- Audio keyword extraction by unsupervised word discoveryArmando Muscariello, Guillaume Gravier, Frédéric Bimbot. 2843-2846 [doi]
- ASR corpus design for resource-scarce languagesEtienne Barnard, Marelie H. Davel, Charl Johannes van Heerden. 2847-2850 [doi]
- Pronunciation dictionary development in resource-scarce environmentsMarelie H. Davel, Olga Martirosian. 2851-2854 [doi]
- XTrans: a speech annotation and transcription toolMeghan Lammie Glenn, Stephanie Strassel, Haejoong Lee. 2855-2858 [doi]
- How to select a good training-data subset for transcription: submodular active selection for sequencesHui Lin, Jeff Bilmes. 2859-2862 [doi]
- Improving acceptability assessment for the labelling of affective speech corporaZoraida Callejas, Ramón López-Cózar. 2863-2866 [doi]
- The broadcast narrow band speech corpus: a new resource type for large scale language recognitionChristopher Cieri, Linda Brandschain, Abby Neely, David Graff, Kevin Walker, Chris Caruso, Alvin F. Martin, Craig S. Greenberg. 2867-2870 [doi]
- Model-based automatic evaluation of L2 learner's English timingChatchawarn Hansakunbuntheung, Hiroaki Kato, Yoshinori Sagisaka. 2871-2874 [doi]
- A Bayesian approach to non-intrusive quality assessment of speechPetko N. Petkov, Iman S. Mossavat, W. Bastiaan Kleijn. 2875-2878 [doi]
- Precision of phoneme boundaries derived using hidden Markov modelsLadan Baghai-Ravary, Greg Kochanski, John Coleman. 2879-2882 [doi]
- A novel method for epoch extraction from speech signalsLakshmish Kaushik, Douglas D. O'Shaughnessy. 2883-2886 [doi]
- LS regularization of group delay features for speaker recognitionJia Min Karen Kua, Julien Epps, Eliathamby Ambikairajah, Eric H. C. Choi. 2887-2890 [doi]
- Glottal closure and opening instant detection from speech signalsThomas Drugman, Thierry Dutoit. 2891-2894 [doi]
- A novel codebook search technique for estimating the open quotientYen-Liang Shue, Jody Kreiman, Abeer Alwan. 2895-2898 [doi]
- Long term examination of intra-session and inter-session speaker variabilityAaron D. Lawson, A. R. Stauffer, Brett Y. Smolenski, B. B. Pokines, M. Leonard, Edward J. Cupples. 2899-2902 [doi]
- Distorted visual information influences audiovisual perception of voicingRagnhild Eg, Dawn M. Behne. 2903-2906 [doi]
- Perceived naturalness of a synthesizer of disordered voicesSamia Fraj, Francis Grenez, Jean Schoentgen. 2907-2910 [doi]
- Audio-visual speech asynchrony modeling in a talking headAlexey Karpov, Liliya Tsirulnik, Zdenek Krnoul, Andrey Ronzhin, Boris Lobanov, Milos Zelezný. 2911-2914 [doi]
- The effects of fundamental frequency and formant space on speaker discrimination through bone-conducted ultrasonic hearingTakayuki Kagomiya, Seiji Nakagawa. 2915-2918 [doi]
- Automatic detection and prediction of topic changes through automatic detection of register variations and pause durationCéline De Looze, Stéphane Rauzy. 2919-2922 [doi]
- Analyzing features for automatic age estimation on cross-sectional dataWerner Spiegl, Georg Stemmer, Eva Lasarcyk, Varada Kolhatkar, Andrew Cassidy, Blaise Potard, Stephen Shum, Young Chol Song, Puyang Xu, Peter Beyerlein, James D. Harnsberger, Elmar Nöth. 2923-2926 [doi]
- Intercultural differences in evaluation of pathological voice quality: perceptual and acoustical comparisons between RASATI and GRBASI scalesEmi Juliana Yamauchi, Satoshi Imaizumi, Hagino Maruyama, Tomoyuki Haji. 2927-2930 [doi]
- F0 cues for the discourse functions of hã in HindiKalika Bali. 2931-2934 [doi]
- Audio spatialisation strategies for multitasking during teleconferencesStuart N. Wrigley, Simon Tucker, Guy J. Brown, Steve Whittaker. 2935-2938 [doi]
- Speech rate effects on linguistic changeAlexsandro R. Meireles, Plínio A. Barbosa. 2939-2942 [doi]
- Mandarin spontaneous narrative planning - prosodic evidence from National Taiwan University lecture corpusChiu-yu Tseng, Zhao-yu Su, Lin-Shan Lee. 2943-2946 [doi]
- Investigation into bottle-neck features for meeting speech recognitionFrantisek Grézl, Martin Karafiát, Lukás Burget. 2947-2950 [doi]
- Multi-stream to many-stream: using spectro-temporal features for ASRSherry Y. Zhao, Suman V. Ravuri, Nelson Morgan. 2951-2954 [doi]
- Tandem representations of spectral envelope and modulation frequency features for ASRSamuel Thomas, Sriram Ganapathy, Hynek Hermansky. 2955-2958 [doi]
- Entropy-based feature analysis for speech recognitionPanji Setiawan, Harald Höge, Tim Fingscheidt. 2959-2962 [doi]
- Hierarchical processing of the modulation spectrum for GALE Mandarin LVCSR systemFabio Valente, Mathew Magimai-Doss, Christian Plahl, Suman V. Ravuri. 2963-2966 [doi]
- Hill-climbing feature selection for multi-stream ASRDavid Gelbart, Nelson Morgan, Alexey Tsymbal. 2967-2970 [doi]
- Robust F0 estimation based on log-time scale autocorrelation and its application to Mandarin tone recognitionYusuke Kida, Masaru Sakai, Takashi Masuko, Akinori Kawamura. 2971-2974 [doi]
- Invariant-integration method for robust feature extraction in speaker-independent speech recognitionFlorian Müller, Alfred Mertins. 2975-2978 [doi]
- Discriminative feature transformation using output coding for speech recognitionOmid Dehzangi, Bin Ma, Engsiong Chng, Haizhou Li. 2979-2982 [doi]
- Discriminant spectrotemporal features for phoneme recognitionNima Mesgarani, Garimella S. V. S. Sivaram, Sridhar Krishna Nemala, Mounya Elhilali, Hynek Hermansky. 2983-2986 [doi]
- Auditory model based optimization of MFCCs improves automatic speech recognition performanceSaikat Chatterjee, Christos Koniaris, W. Bastiaan Kleijn. 2987-2990 [doi]
- Pronunciation-based ASR for namesHenk van den Heuvel, Bert Réveil, Jean-Pierre Martens. 2991-2994 [doi]
- How speaker tongue and name source language affect the automatic recognition of spoken namesBert Réveil, Jean-Pierre Martens, Bart D'hoore. 2995-2998 [doi]
- Online generation of acoustic models for multilingual speech recognitionMartin Raab, Guillermo Aradilla, Rainer Gruhn, Elmar Nöth. 2999-3002 [doi]
- Basic speech recognition for spoken dialoguesCharl Johannes van Heerden, Etienne Barnard, Marelie H. Davel. 3003-3006 [doi]
- Tonal articulatory feature for Mandarin and its application to conversational LVCSRQingqing Zhang, Jielin Pan, YongHong Yan. 3007-3010 [doi]
- Effects of language mixing for automatic recognition of Cantonese-English code-mixing utterancesHouwei Cao, P. C. Ching, Tan Lee. 3011-3014 [doi]
- A one-step tone recognition approach using MSD-HMM for continuous speechChangliang Liu, Fengpei Ge, Fuping Pan, Bin Dong, YongHong Yan. 3015-3018 [doi]
- Stream-based context-sensitive phone mapping for cross-lingual speech recognitionKhe Chai Sim, Haizhou Li. 3019-3022 [doi]
- Human translations guided language discovery for ASR systemsSebastian Stüker, Laurent Besacier, Alex Waibel. 3023-3026 [doi]
- The case for case-based automatic speech recognitionViktoria Maier, Roger K. Moore. 3027-3030 [doi]
- A self-labeling speech corpus: collecting spoken words with an online educational gameIan McGraw, Alexander Gruenstein, Andrew M. Sutherland. 3031-3034 [doi]
- A noise robust method for pattern discovery in quantized time series: the concept matrix approachOkko Johannes Räsänen, Unto Kalervo Laine, Toomas Altosaar. 3035-3038 [doi]
- Using parallel architectures in speech recognitionPatrick Cardinal, Pierre Dumouchel, Gilles Boulianne. 3039-3042 [doi]
- Example-based speech recognition using formulaic phrasesChristopher J. Watkins, Stephen J. Cox. 3043-3046 [doi]
- Parallel fast likelihood computation for LVCSR using mixture decompositionNaveen Parihar, Ralf Schlüter, David Rybach, Eric A. Hansen. 3047-3050 [doi]
- An indexing weight for voice-to-text searchChen Liu. 3051-3054 [doi]
- On invariant structural representation for speech recognition: theoretical validation and experimental improvementYu Qiao, Nobuaki Minematsu, Keikichi Hirose. 3055-3058 [doi]
- Articulatory feature asynchrony analysis and compensation in detection-based ASRI-Fan Chen, Hsin-Min Wang. 3059-3062 [doi]
- CRANDEM: conditional random fields for word recognitionJeremy Morris, Eric Fosler-Lussier. 3063-3066 [doi]
- HEAR: an hybrid episodic-abstract speech recognizerSébastien Demange, Dirk Van Compernolle. 3067-3070 [doi]