INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, September 26-30, 2010

Takao Kobayashi, Keikichi Hirose, Satoshi Nakamura, editors, INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, September 26-30, 2010. ISCA, 2010.

Conference: interspeech2010

Still talking to machines (cognitively speaking)Steve Young. 1-10 [doi]

Sound-based assistive technology supporting seeing, hearing and speaking for the disabled and the elderlyTohru Ifukube. 11-19 [doi]

Beyond sentence prosodyChiu-yu Tseng. 20-29 [doi]

A procedure for estimating gestural scores from natural speechHosung Nam, Vikramjit Mitra, Mark Tiede, Elliot Saltzman, Louis Goldstein, Carol Y. Espy-Wilson, Mark Hasegawa-Johnson. 30-33 [doi]

On the interdependencies between voice quality, glottal gaps, and voice-source related acoustic measuresYen-Liang Shue, Gang Chen, Abeer Alwan. 34-37 [doi]

Simplification and extension of non-periodic excitation source representations for high-quality speech manipulation systemsHideki Kawahara, Masanori Morise, Toru Takahashi, Hideki Banno, Ryuichi Nisimura, Toshio Irino. 38-41 [doi]

Phase equalization-based autoregressive model of speech signalsSadao Hiroya, Takemi Mochida. 42-45 [doi]

Articulatory-functional modeling of speech prosody: a reviewYi Xu, Santitham Prom-on. 46-49 [doi]

Two new estimation methods for a superpositional intonation modelHumberto M. Torres, Hansjörg Mixdorff, Jorge A. Gurlekian, Hartmut R. Pfitzinger. 50-53 [doi]

A discriminative splitting criterion for phonetic decision treesSimon Wiesler, Georg Heigold, Markus Nußbaum-Thom, Ralf Schlüter, Hermann Ney. 54-57 [doi]

Canonical state models for automatic speech recognitionMark J. F. Gales, Kai Yu. 58-61 [doi]

Restructuring exponential family mixture modelsPierre L. Dognin, John R. Hershey, Vaibhava Goel, Peder A. Olsen. 62-65 [doi]

Unsupervised discovery and training of maximally dissimilar cluster modelsFrançoise Beaufays, Vincent Vanhoucke, Brian Strope. 66-69 [doi]

Probabilistic state clustering using conditional random field for context-dependent acoustic modellingKhe Chai Sim. 70-73 [doi]

Integrate template matching and statistical modeling for speech recognitionXie Sun, Yunxin Zhao. 74-77 [doi]

Cross-lingual spoken language understanding from unaligned data using discriminative classification models and machine translationFabrice Lefèvre, François Mairesse, Steve Young. 78-81 [doi]

Techniques for topic detection based processing in spoken dialog systemsRajesh Balchandran, Leonid Rachevsky, Bhuvana Ramabhadran, Miroslav Novak. 82-85 [doi]

Optimizing spoken dialogue management with fitted value iterationSenthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin. 86-89 [doi]

Natural belief-critic: a reinforcement algorithm for parameter estimation in statistical spoken dialogue systemsFilip Jurcícek, Blaise Thomson, Simon Keizer, François Mairesse, Milica Gasic, Kai Yu, Steve Young. 90-93 [doi]

Is it possible to predict task completion in automated troubleshooters?Alexander Schmitt, Michael Scholz, Wolfgang Minker, Jackson Liscombe, David Suendermann. 94-97 [doi]

Minimally invasive surgery for spoken dialog systemsDavid Suendermann, Jackson Liscombe, Roberto Pieraccini. 98-101 [doi]

Detecting categorical perception in continuous discrimination dataPaul Boersma, Katerina Chládková. 102-105 [doi]

The interrelation between the stimulus range and the number of response categories in vowel categorizationTitia Benders, Paola Escudero. 106-109 [doi]

The relation between pitch perception preference and emotion identificationMarie Nilsenová, Martijn Goudbeek, Luuk Kempen. 110-113 [doi]

Competition in the perception of spoken Japanese wordsTakashi Otake, James M. McQueen, Anne Cutler. 114-117 [doi]

Influence of musical training on perception of L2 speechMakiko Sadakata, Lotte van der Zanden, Kaoru Sekiyama. 118-121 [doi]

Full body aero-tactile integration in speech perceptionDonald Derrick, Bryan Gick. 122-125 [doi]

Nucleus position within the intonation phrase: a typological study of English, Czech and HungarianTomás Dubeda, Katalin Mády. 126-129 [doi]

Focus-sensitive operator or focus inducer: always and onlyYong-cheol Lee, Satoshi Nambu. 130-133 [doi]

F0 declination in English and Mandarin broadcast news speechJiahong Yuan, Mark Liberman. 134-137 [doi]

Frequency of occurrence effects on pitch accent realisationKatrin Schweitzer, Michael Walsh, Bernd Möbius, Hinrich Schütze. 138-141 [doi]

On the automatic ToBI accent type identification from dataCésar González Ferreras, Carlos Vivaracho-Pascual, David Escudero Mancebo, Valentín Cardeñoso-Payo. 142-145 [doi]

AutoBI - a tool for automatic ToBI annotationAndrew Rosenberg. 146-149 [doi]

A classifier-based target cost for unit selection speech synthesis trained on perceptual dataVolker Strom, Simon King. 150-153 [doi]

Applying scalable phonetic context similarity in unit selection of concatenative text-to-speechWei Zhang, Xiaodong Cui. 154-157 [doi]

Speech database reduction method for corpus-based TTS systemMitsuaki Isogai, Hideyuki Mizuno. 158-161 [doi]

Automatic error detection for unit selection speech synthesis using log likelihood ratio based SVM classifierHeng Lu, Zhen-Hua Ling, Si Wei, Li-Rong Dai, Ren-Hua Wang. 162-165 [doi]

Using robust Viterbi algorithm and HMM-modeling in unit selection TTS to replace units of poor qualityHanna Silén, Elina Helander, Jani Nurminen, Konsta Koppinen, Moncef Gabbouj. 166-169 [doi]

Automatic detection of abnormal stress patterns in unit selection synthesisYeon-Jun Kim, Marc C. Beutnagel. 170-173 [doi]

Enhancements of Viterbi search for fast unit selection synthesisDaniel Tihelka, Jirí Kala, Jindrich Matousek. 174-177 [doi]

Accurate pitch marking for prosodic modification of speech segmentsThomas Ewender, Beat Pfister. 178-181 [doi]

A novel hybrid approach for Mandarin speech synthesisShifeng Pan, Meng Zhang, Jianhua Tao. 182-185 [doi]

Modeling liaison in French by using decision treesJosafá de Jesus Aguiar Pontes, Sadaoki Furui. 186-189 [doi]

Improvement on plural unit selection and fusionJian Luan, Jian Li. 190-193 [doi]

Improving speech synthesis of machine translation outputAlok Parlikar, Alan W. Black, Stephan Vogel. 194-197 [doi]

Paraphrase generation to improve text-to-speech synthesisGhislain Putois, Jonathan Chevelu, Cédric Boidin. 198-201 [doi]

Phone mismatch penalty matrices for two-stage keyword spotting via multi-pass phone recognizerChang Woo Han, Shin Jae Kang, Chul-Min Lee, Nam Soo Kim. 202-205 [doi]

English spoken term detection in multilingual recordingsPetr Motlícek, Fabio Valente, Philip N. Garner. 206-209 [doi]

A hybrid approach to robust word lattice generation via acoustic-based word detectionIcksang Han, Chiyoun Park, Jeongmi Cho, Jeongsu Kim. 210-213 [doi]

Direct observation of pruning errors (DOPE): a search analysis toolVolker Steinbiss, Martin Sundermeyer, Hermann Ney. 214-217 [doi]

Direct construction of compact context-dependency transducers from dataDavid Rybach, Michael Riley. 218-221 [doi]

Incremental composition of static decoding graphs with label pushingMiroslav Novak. 222-225 [doi]

A novel path extension framework using steady segment detection for Mandarin speech recognitionZhanlei Yang, Wenju Liu. 226-229 [doi]

On the relation of Bayes risk, word error, and word posteriors in ASRRalf Schlüter, Markus Nußbaum-Thom, Hermann Ney. 230-233 [doi]

Time conditioned search in automatic speech recognition reconsideredDavid Nolden, Hermann Ney, Ralf Schlüter. 234-237 [doi]

Efficient data selection for speech recognition based on prior confidence estimation using speech and context independent modelsSatoshi Kobashikawa, Taichi Asami, Yoshikazu Yamaguchi, Hirokazu Masataki, Satoshi Takahashi. 238-241 [doi]

A novel confidence measure based on marginalization of jointly estimated error cause probabilitiesAtsunori Ogawa, Atsushi Nakamura. 242-245 [doi]

Evaluation of a silent speech interface based on magnetic sensingRobin Hofe, Stephen R. Ell, Michael J. Fagan, James M. Gilbert, Phil D. Green, Roger K. Moore, Sergey I. Rybchenko. 246-249 [doi]

Advanced speech communication system for deaf peopleRubén San Segundo, Verónica López, Raquel Martín, Syaheerah L. Lutfi, Javier Ferreiros, Ricardo de Córdoba, José Manuel Pardo. 250-253 [doi]

Unsupervised acoustic model adaptation for multi-origin non native ASRSethserey Sam, Eric Castelli, Laurent Besacier. 254-257 [doi]

Speech-based automated cognitive status assessmentDilek Hakkani-Tür, Dimitra Vergyri, Gökhan Tür. 258-261 [doi]

Speech recognition with a seamlessly updated language model for real-time closed-captioningToru Imai, Shinichi Homma, Akio Kobayashi, Takahiro Oku, Shoei Sato. 262-265 [doi]

The comparison between the deletion-based methods and the mixing-based methods for audio CAPTCHA systemsTakuya Nishimoto, Takayuki Watanabe. 266-269 [doi]

Comparing mono- & multilingual acoustic seed models for a low e-resourced language: a case-study of LuxembourgishMartine Adda-Decker, Lori Lamel, Natalie D. Snoeren. 270-273 [doi]

Manipulating tracheoesophageal speechR. J. J. H. van Son, Irene Jacobi, Frans Hilgers. 274-277 [doi]

Towards mixed language speech recognition systemsDavid Imseng, Hervé Bourlard, Mathew Magimai-Doss. 278-281 [doi]

Voice search for developmentEtienne Barnard, Johan Schalkwyk, Charl Johannes van Heerden, Pedro J. Moreno. 282-285 [doi]

Cross-cultural investigation of prosody in verbal feedback in interactional rapportGina-Anne Levow, Susan Duncan, Edward T. King. 286-289 [doi]

Multimodal speaker diarization using oriented optical flow histogramsMary Tai Knox, Gerald Friedland. 290-293 [doi]

Towards an ASR-free objective analysis of pathological speechCatherine Middag, Yvan Saeys, Jean-Pierre Martens. 294-297 [doi]

Session variability contrasts in the MARP corpusKeith W. Godin, John H. L. Hansen. 298-301 [doi]

Estimation of two-to-one forced selection intelligibility scores by speech recognizers using noise-adapted modelsKazuhiro Kondo, Yusuke Takano. 302-305 [doi]

Analysis of gender normalization using MLP and VTLN featuresThomas Schaaf, Florian Metze. 306-309 [doi]

Discovering an optimal set of minimally contrasting acoustic speech units: a point of focus for whole-word pattern matchingGuillaume Aimetti, Roger K. Moore, Louis ten Bosch. 310-313 [doi]

Improvements to the equal-parameter BIC for speaker diarizationThemos Stafylakis, Xavier Anguera. 314-317 [doi]

A multistream multiresolution framework for phoneme recognitionNima Mesgarani, Samuel Thomas, Hynek Hermansky. 318-321 [doi]

Cluster analysis of differential spectral envelopes on emotional speechGiampiero Salvi, Fabio Tesser, Enrico Zovato, Piero Cosi. 322-325 [doi]

Modeling pronunciation variation with context-dependent articulatory feature decision treesSam Bowman, Karen Livescu. 326-329 [doi]

Ungrounded independent non-negative factor analysisBhiksha Raj, Kevin W. Wilson, Alexander Krueger, Reinhold Haeb-Umbach. 330-333 [doi]

Signal interaction and the devil functionJohn R. Hershey, Peder A. Olsen, Steven J. Rennie. 334-337 [doi]

Semi-automated update of automatic transcription system for the Japanese national congressYuya Akita, Masato Mimura, Graham Neubig, Tatsuya Kawahara. 338-341 [doi]

Language model cross adaptation for LVCSR system combinationXunying Liu, Mark J. F. Gales, Philip C. Woodland. 342-345 [doi]

Large vocabulary continuous speech recognition using WFST-based linear classifier for structured dataShinji Watanabe, Takaaki Hori, Atsushi Nakamura. 346-349 [doi]

Accelerating hierarchical acoustic likelihood computation on graphics processorsPavel Kveton, Miroslav Novak. 350-353 [doi]

Search by voice in Mandarin ChineseJiulong Shan, Genqing Wu, Zhihong Hu, Xiliu Tang, Martin Jansche, Pedro J. Moreno. 354-357 [doi]

The AMIDA 2009 meeting transcription systemThomas Hain, Lukas Burget, John Dines, Philip N. Garner, Asmaa El Hannani, Marijn Huijbregts, Martin Karafiát, Mike Lincoln, Vincent Wan. 358-361 [doi]

Simple and efficient speaker comparison using approximate KL divergenceWilliam M. Campbell, Zahi N. Karam. 362-365 [doi]

The IIR NIST SRE 2008 and 2010 summed channel speaker recognition systemsHanwu Sun, Bin Ma, Chien-Lin Huang, Trung Hieu Nguyen, Haizhou Li. 366-369 [doi]

Speaker characterization using long-term and temporal informationChien-Lin Huang, Hanwu Sun, Bin Ma, Haizhou Li. 370-373 [doi]

Score-level compensation of extreme speech duration variability in speaker verificationSergio Perez-Gomez, Daniel Ramos, Javier Gonzalez-Dominguez, Joaquin Gonzalez-Rodriguez. 374-377 [doi]

Speaker recognition experiments using connectionist transformation network featuresAlberto Abad, Isabel Trancoso. 378-381 [doi]

Speaker recognition using supervised probabilistic principal component analysisYun Lei, John H. L. Hansen. 382-385 [doi]

A factorial sparse coder model for single channel source separationRobert Peharz, Michael Stark, Franz Pernkopf, Yannis Stylianou. 386-389 [doi]

Oriented PCA method for blind speech separation of convolutive mixturesYasmina Benabderrahmane, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy. 390-393 [doi]

Online Gaussian process for nonstationary speech separationHsin-Lung Hsieh, Jen-Tzung Chien. 394-397 [doi]

Convexity and fast speech extraction by split Bregman methodMeng Yu, Wenye Ma, Jack Xin, Stanley Osher. 398-401 [doi]

Reducing musical noise in blind source separation by time-domain sparse filters and split Bregman methodWenye Ma, Meng Yu, Jack Xin, Stanley Osher. 402-405 [doi]

Combining monaural and binaural evidence for reverberant speech segregationJohn Woodruff, Rohit Prabhavalkar, Eric Fosler-Lussier, DeLiang Wang. 406-409 [doi]

Speaker and language adaptive training for HMM-based polyglot speech synthesisHeiga Zen. 410-413 [doi]

Context adaptive training with factorized decision trees for HMM-based speech synthesisKai Yu, Heiga Zen, François Mairesse, Steve Young. 414-417 [doi]

Roles of the average voice in speaker-adaptive HMM-based speech synthesisJunichi Yamagishi, Oliver Watts, Simon King, Bela Usabaev. 418-421 [doi]

An HMM trajectory tiling (HTT) approach to high quality TTSYao Qian, Zhi-Jie Yan, Yi-Jian Wu, Frank K. Soong, Xin Zhuang, Shengyi Kong. 422-425 [doi]

A perceptual study of acceleration parameters in HMM-based TTSYining Chen, Zhi-Jie Yan, Frank K. Soong. 426-429 [doi]

Evaluation of prosodic contextual factors for HMM-based speech synthesisShuji Yokomizo, Takashi Nose, Takao Kobayashi. 430-433 [doi]

Learning words and speech units through natural interactionsJonas Hörnstein, José Santos-Victor. 434-437 [doi]

Bimodal coherence based scale ambiguity cancellation for target speech extraction and enhancementQingju Liu, Wenwu Wang, Philip J. B. Jackson. 438-441 [doi]

Speech estimation in non-stationary noise environments using timing structures between mouth movements and sound signalsHiroaki Kawashima, Yu Horii, Takashi Matsuyama. 442-445 [doi]

Synthesizing photo-real talking head via trajectory-guided sample selectionLijuan Wang, Xiaojun Qian, Wei Han, Frank K. Soong. 446-449 [doi]

Silent vs vocalized articulation for a portable ultrasound-based silent speech interfaceVictoria M. Florescu, Lise Crevier-Buchman, Bruce Denby, Thomas Hueber, Antonia Colazo-Simon, Claire Pillot-Loiseau, Pierre Roussel-Ragot, Cédric Gendrot, Sophie Quattrocchi. 450-453 [doi]

Comparison of HMM and TMDN methods for lip synchronisationGregor Hofer, Korin Richmond. 454-457 [doi]

Rhythm and formant features for automatic alcohol detectionFlorian Schiel, Christian Heinrich, Veronika Neumeyer. 458-461 [doi]

An exploration of voice source correlates of focusIrena Yanushevskaya, Christer Gobl, John Kane, Ailbhe Ní Chasaide. 462-465 [doi]

Modeling perceived vocal age in American EnglishJames D. Harnsberger, Rahul Shrivastav, W. S. Brown Jr. 466-469 [doi]

Multivariate analysis of vocal fatigue in continuous readingMarie-José Caraty, Claude Montacié. 470-473 [doi]

Frequency-domain delexicalization using surrogate vowelsAlexander Kain, Jan P. H. van Santen. 474-477 [doi]

Emotion recognition using imperfect speech recognitionFlorian Metze, Anton Batliner, Florian Eyben, Tim Polzehl, Björn Schuller, Stefan Steidl. 478-481 [doi]

A novel feature extraction strategy for multi-stream robust emotion identificationGang Liu, Yun Lei, John H. L. Hansen. 482-485 [doi]

Setup for acoustic-visual speech synthesis by concatenating bimodal unitsAsterios Toutios, Utpala Musti, Slim Ouni, Vincent Colotte, Brigitte Wrobel-Dautcourt, Marie-Odile Berger. 486-489 [doi]

Towards affective state modeling in narrative and conversational settingsBart Jochems, Martha Larson, Roeland Ordelman, Ronald Poppe, Khiet P. Truong. 490-493 [doi]

Detection of anger emotion in dialog speech using prosody feature and temporal relation of utterancesNarichika Nomoto, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi. 494-497 [doi]

Gesture and speech coordination: the influence of the relationship between manual gesture and speechBenjamin Roustan, Marion Dohen. 498-501 [doi]

Analysis and detection of cognitive load and frustration in drivers' speechHynek Boril, Seyed Omid Sadjadi, Tristan Kleinschmidt, John H. L. Hansen. 502-505 [doi]

Acoustic-based recognition of head gestures accompanying speechAkira Sasou, Yasuharu Hashimoto, Katsuhiko Sakaue. 506-509 [doi]

Multimodal dialog in the car: combining speech and turn-and-push dial to control comfort functionsSandro Castronovo, Angela Mahr, Margarita Pentcheva, Christian A. Müller. 510-513 [doi]

Hands free audio analysis from home entertainmentDanil Korchagin, Philip N. Garner, Petr Motlícek. 514-517 [doi]

Affective story teller: a TTS system for emotional expressivityShaikh Mostafa Al Masum, Antonio Rui Ferreira Rebordão, Keikichi Hirose. 518-521 [doi]

Enhancing children's speech recognition under mismatched condition by explicit acoustic normalizationShweta Ghai, Rohit Sinha. 522-525 [doi]

Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systemsBo Li, Khe Chai Sim. 526-529 [doi]

Augmentation of adaptation dataRavichander Vipperla, Steve Renals, Joe Frankel. 530-533 [doi]

Discriminative adaptation based on fast combination of DMAP and dfMLLRLukás Machlica, Zbynek Zajíc, Ludek Müller. 534-537 [doi]

Revisiting VTLN using linear transformation on conventional MFCCDoddipatla Rama Sanand, Ralf Schlüter, Hermann Ney. 538-541 [doi]

Speaker adaptation based on nonlinear spectral transform for speech recognitionToyohiro Hayashi, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda. 542-545 [doi]

Speaker adaptation based on system combination using speaker-class modelsTetsuo Kosaka, Takashi Ito, Masaharu Katoh, Masaki Kohda. 546-549 [doi]

Speaker adaptation in transformation space using two-dimensional PCAYongwon Jeong, Young Rok Song, Hyung Soon Kim. 550-553 [doi]

On speaker adaptive training of artificial neural networksJan Trmal, Jan Zelinka, Ludek Müller. 554-557 [doi]

Model synthesis for band-limited speech recognitionYongjun He, Jiqing Han. 558-561 [doi]

Performance estimation of reverberant speech recognition based on reverberant criteria RSR-dn with acoustic parametersTakahiro Fukumori, Masanori Morise, Takanobu Nishiura. 562-565 [doi]

A novel approach for matched reverberant training of HMMs using data pairsArmin Sehr, Christian Hofmann, Roland Maas, Walter Kellermann. 566-569 [doi]

An auditory based modulation spectral feature for reverberant speech recognitionHari Krishna Maganti, Marco Matassoni. 570-573 [doi]

On the potential of channel selection for recognition of reverberated speech with multiple microphonesMartin Wolf, Climent Nadeu. 574-577 [doi]

An improved wavelet-based dereverberation for robust automatic speech recognitionRandy Gomez, Tatsuya Kawahara. 578-581 [doi]

Methods for robust speech recognition in reverberant environments: a comparisonRico Petrick, Thomas Fehér, Masashi Unoki, Rüdiger Hoffmann. 582-585 [doi]

Integration of multilayer regression analysis with structure-based pronunciation assessmentMasayuki Suzuki, Yu Qiao, Nobuaki Minematsu, Keikichi Hirose. 586-589 [doi]

Using non-native error patterns to improve pronunciation verificationJoost van Doremalen, Catia Cucchiarini, Helmer Strik. 590-593 [doi]

Regularized-MLLR speaker adaptation for computer-assisted language learning systemDean Luo, Yu Qiao, Nobuaki Minematsu, Yutaka Yamauchi, Keikichi Hirose. 594-597 [doi]

Automatic evaluation of English pronunciation by Japanese speakers using various acoustic features and pattern recognition techniquesKuniaki Hirabayashi, Seiichi Nakagawa. 598-601 [doi]

Decision tree based tone modeling with corrective feedbacks for automatic Mandarin tone assessmentHsien-Cheng Liao, Jiang-Chun Chen, Sen-Chia Chang, Ying-Hua Guan, Chin-Hui Lee. 602-605 [doi]

CASTLE: a computer-assisted stress teaching and learning environment for learners of English as a second languageJingli Lu, Ruili Wang, Liyanage C. De Silva, Yang Gao, Jia Liu. 606-609 [doi]

Automatic reference independent evaluation of prosody quality using multiple knowledge fusionsShen Huang, Hongyan Li, Shijin Wang, Jiaen Liang, Bo Xu. 610-613 [doi]

Landmark-based automated pronunciation error detectionSu-Youn Yoon, Mark Hasegawa-Johnson, Richard Sproat. 614-617 [doi]

HMM based TTS for mixed language textZhiwei Shuang, Shiyin Kang, Yong Qin, Li-Rong Dai, Lianhong Cai. 618-621 [doi]

An analysis of language mismatch in HMM state mapping-based cross-lingual speaker adaptationHui Liang, John Dines. 622-625 [doi]

Classroom note-taking system for hearing impaired students using automatic speech recognition adapted to lecturesTatsuya Kawahara, Norihiro Katsumaru, Yuya Akita, Shinsuke Mori. 626-629 [doi]

Exploring web-browser based runtimes engines for creating ubiquitous speech interfacesPaul R. Dixon, Sadaoki Furui. 630-632 [doi]

Efficient three-stage pitch estimation for packet loss concealmentXuejing Sun, Sameer Gadre. 633-636 [doi]

On evaluation of the f0 estimation based on time-varying complex speech analysisKeiichi Funaki. 637-640 [doi]

Pitch estimation in noisy speech based on temporal accumulation of spectrum peaksFeng Huang, Tan Lee. 641-644 [doi]

Multi-pitch estimation by a joint 2-d representation of pitch and pitch dynamicsTianyu T. Wang, Thomas F. Quatieri. 645-648 [doi]

On the effect of fundamental frequency on amplitude and frequency modulation patterns in speech resonancesPirros Tsiakoulis, Alexandros Potamianos. 649-652 [doi]

Pitch determination using autocorrelation function in spectral domainM. Shahidur Rahman, Tetsuya Shimamura. 653-656 [doi]

Chirp complex cepstrum-based decomposition for asynchronous glottal analysisThomas Drugman, Thierry Dutoit. 657-660 [doi]

Exploiting glottal formant parameters for glottal inverse filtering and parameterizationAlan Ó Cinnéide, David Dorran, Mikel Gainza, Eugene Coyle. 661-664 [doi]

Glottal parameters estimation on speech using the zeros of the z-transformNicolas Sturmel, Christophe d'Alessandro, Boris Doval. 665-668 [doi]

Significance of pitch synchronous analysis for speaker recognition using AANN modelsSri Harish Reddy Mallidi, Kishore Prahallad, Suryakanth V. Gangashetty, B. Yegnanarayana. 669-672 [doi]

On using voice source measures in automatic gender classification of children's speechGang Chen, Xue Feng, Yen-Liang Shue, Abeer Alwan. 673-676 [doi]

Constructing Japanese test collections for spoken term detectionYoshiaki Itoh, Hiromitsu Nishizaki, Xinhui Hu, Hiroaki Nanjo, Tomoyosi Akiba, Tatsuya Kawahara, Seiichi Nakagawa, Tomoko Matsui, Yoichi Yamashita, Kiyoaki Aikawa. 677-680 [doi]

Japanese spoken term detection using syllable transition network derived from multiple speech recognizers' outputsSatoshi Natori, Hiromitsu Nishizaki, Yoshihiro Sekiguchi. 681-684 [doi]

Combining Chinese spoken term detection systems via side-information conditioned linear logistic regressionSha Meng, Weiqiang Zhang, Jia Liu. 685-688 [doi]

Metric subspace indexing for fast spoken term detectionTaisuke Kaneko, Tomoyosi Akiba. 689-692 [doi]

Unsupervised spoken-term detection with spoken queries using segment-based dynamic time warpingChun-an Chan, Lin-Shan Lee. 693-696 [doi]

Contextual verification for open vocabulary spoken term detectionDaniel Schneider, Timo Mertens, Martha Larson, Joachim Köhler. 697-700 [doi]

Augmented set of features for confidence estimation in spoken term detectionJavier Tejedor, Doroteo Torre Toledano, Miguel Bautista, Simon King, Dong Wang, José Colás. 701-704 [doi]

Cluster-based language model for spoken document retrieval using NMF-based document clusteringXinhui Hu, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura. 705-708 [doi]

Asymptotically exact noise-corrupted speech likelihoodsRogier C. van Dalen, Mark J. F. Gales. 709-712 [doi]

A MMSE estimator in mel-cepstral domain for robust large vocabulary automatic speech recognition using uncertainty propagationRamón Fernandez Astudillo, Reinhold Orglmeister. 713-716 [doi]

Non-negative matrix factorization based compensation of music for automatic speech recognitionBhiksha Raj, Tuomas Virtanen, Sourish Chaudhuri, Rita Singh. 717-720 [doi]

Feature versus model based noise robustnessKris Demuynck, Xueru Zhang, Dirk Van Compernolle, Hugo Van Hamme. 721-724 [doi]

SNR-based mask compensation for computational auditory scene analysis applied to speech recognition in a car environmentJi Hun Park, Seon-Man Kim, Jae Sam Yoon, Hong Kook Kim, Sung Joo Lee, Yunkeun Lee. 725-728 [doi]

Automatic selection of thresholds for signal separation algorithms based on interaural delayChanwoo Kim, Richard M. Stern, Kiwan Eom, Jaewon Lee. 729-732 [doi]

Channel detectors for system fusion in the context of NIST LRE 2009Florian Verdet, Driss Matrouf, Jean-François Bonastre, Jean Hennebert. 733-736 [doi]

Selecting phonotactic features for language recognitionRong Tong, Bin Ma, Haizhou Li, Engsiong Chng. 737-740 [doi]

Improved language recognition using mixture components statisticsAbualsoud Hanani, Michael J. Carey, Martin J. Russell. 741-744 [doi]

Using cross-decoder co-occurrences of phone n-grams in SVM-based phonotactic language recognitionMikel Peñagarikano, Amparo Varona, Luis Javier Rodríguez-Fuentes, Germán Bordel. 745-748 [doi]

Exploiting variety-dependent phones in Portuguese variety identification applied to broadcast news transcriptionOscar Koller, Alberto Abad, Isabel Trancoso, Céu Viana. 749-752 [doi]

Dialect recognition using a phone-GMM-supervector-based SVM kernelFadi Biadsy, Julia Hirschberg, Michael Collins. 753-756 [doi]

Discriminative acoustic model for improving mispronunciation detection and diagnosis in computer-aided pronunciation training (CAPT)Xiaojun Qian, Frank K. Soong, Helen M. Meng. 757-760 [doi]

Automatic pronunciation scoring using learning to rank and DP-based score segmentationLiang-Yu Chen, Jyh-Shing Roger Jang. 761-764 [doi]

Automatic derivation of phonological rules for mispronunciation detection in a computer-assisted pronunciation training systemWai Kit Lo, Shuang Zhang, Helen M. Meng. 765-768 [doi]

Adapting a duration synthesis model to rate children's oral reading prosodyMinh Duong, Jack Mostow. 769-772 [doi]

Predicting word accuracy for the automatic speech recognition of non-native speechSu-Youn Yoon, Lei Chen, Klaus Zechner. 773-776 [doi]

A new approach for automatic tone error detection in strong accented Mandarin based on dominant setTaotao Zhu, Dengfeng Ke, Zhenbiao Chen, Bo Xu. 777-780 [doi]

Analysis of excitation source information in emotional speechS. R. M. Prasanna, D. Govind. 781-784 [doi]

Acoustic feature analysis in speech emotion primitives estimationDongrui Wu, Thomas D. Parsons, Shrikanth S. Narayanan. 785-788 [doi]

Spectro-temporal modulations for robust speech emotion recognitionLan-Ying Yeh, Tai-Shih Chi. 789-792 [doi]

Quantification of prosodic entrainment in affective spontaneous spoken interactions of married couplesChi-Chun Lee, Matthew Black, Athanasios Katsamanis, Adam C. Lammert, Brian R. Baucom, Andrew Christensen, Panayiotis G. Georgiou, Shrikanth S. Narayanan. 793-796 [doi]

A cluster-profile representation of emotion using agglomerative hierarchical clusteringEmily Mower, Kyu Jeong Han, Sungbok Lee, Shrikanth S. Narayanan. 797-800 [doi]

Incremental acoustic valence recognition: an inter-corpus perspective on features, matching, and performance in a gating paradigmBjörn Schuller, Laurence Devillers. 801-804 [doi]

Sinusoidal model parameterization for HMM-based TTS systemSlava Shechtman, Alexander Sorin. 805-808 [doi]

Improved training of excitation for HMM-based parametric speech synthesisYoshinori Shiga, Tomoki Toda, Shinsuke Sakai, Hisashi Kawai. 809-812 [doi]

Excitation modeling based on waveform interpolation for HMM-based speech synthesisJune Sig Sung, Doo Hwa Hong, Kyung Hwan Oh, Nam Soo Kim. 813-816 [doi]

Formant-based frequency warping for improving speaker adaptation in HMM TTSXin Zhuang, Yao Qian, Frank K. Soong, Yi-Jian Wu, Bo Zhang. 817-820 [doi]

Improved modelling of speech dynamics using non-linear formant trajectories for HMM-based speech synthesisHongwei Hu, Martin J. Russell. 821-824 [doi]

Global variance modeling on the log power spectrum of LSPs for HMM-based speech synthesisZhen-Hua Ling, Yu Hu, Li-Rong Dai. 825-828 [doi]

Autoregressive clustering for HMM speech synthesisMatt Shannon, William Byrne. 829-832 [doi]

An implementation of decision tree-based context clustering on graphics processing unitsNicholas Pilkington, Heiga Zen. 833-836 [doi]

Quantized HMMs for low footprint text-to-speech synthesisAlexander Gutkin, Xavi Gonzalvo, Stefan Breuer, Paul Taylor. 837-840 [doi]

The role of higher-level linguistic features in HMM-based speech synthesisOliver Watts, Junichi Yamagishi, Simon King. 841-844 [doi]

HMM-based singing voice synthesis system using pitch-shifted pseudo training dataAyami Mase, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda. 845-848 [doi]

An unsupervised approach to creating web audio contents-based HMM voicesJinfu Ni, Hisashi Kawai. 849-852 [doi]

Conversational spontaneous speech synthesis using average voice modelTomoki Koriyama, Takashi Nose, Takao Kobayashi. 853-856 [doi]

Mandarin digit recognition assisted by selective tone distinctionXiaodong Wang, Kunihiko Owa, Makoto Shozakai. 857-860 [doi]

Brazilian Portuguese acoustic model training based on data borrowing from other languageKazuhiko Abe, Sakriani Sakti, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura. 861-864 [doi]

Rapid bootstrapping of five Eastern European languages using the rapid language adaptation toolkitNgoc Thang Vu, Tim Schlippe, Franziska Kraus, Tanja Schultz. 865-868 [doi]

Cross-lingual speaker adaptation via Gaussian component mappingHouwei Cao, Tan Lee, P. C. Ching. 869-872 [doi]

Cross-lingual acoustic modeling for dialectal Arabic speech recognitionMohamed Elmahdy, Rainer Gruhn, Wolfgang Minker, Slim Abdennadher. 873-876 [doi]

Cross-lingual and multi-stream posterior features for low resource LVCSR systemsSamuel Thomas, Sriram Ganapathy, Hynek Hermansky. 877-880 [doi]

Latent perceptual mapping: a new acoustic modeling framework for speech recognitionShiva Sundaram, Jerome R. Bellegarda. 881-884 [doi]

Unsupervised model adaptation on targeted speech segments for LVCSR system combinationRichard Dufour, Fethi Bougares, Yannick Estève, Paul Deléglise. 885-888 [doi]

Incremental word learning using large-margin discriminative training and variance floor estimationIrene Ayllón Clemente, Martin Heckmann, Alexander Denecke, Britta Wrede, Christian Goerick. 889-892 [doi]

State-based labelling for a sparse representation of speech and its application to robust speech recognitionTuomas Virtanen, Jort F. Gemmeke, Antti Hurmalainen. 893-896 [doi]

Similarity scoring for recognizing repeated out-of-vocabulary wordsMirko Hannemann, Stefan Kombrink, Martin Karafiát, Lukas Burget. 897-900 [doi]

Data pruning for template-based automatic speech recognitionDino Seppi, Dirk Van Compernolle. 901-904 [doi]

Speaking style dependency of formant targetsAkiko Amano-Kusumoto, John-Paul Hosom, Alexander Kain. 905-908 [doi]

Similarity of effects of emotions on the speech organ configuration with and without speakingTatsuya Kitamura. 909-912 [doi]

A study of intra-speaker and inter-speaker affective variability using electroglottograph and inverse filtered glottal waveformsDaniel Bone, Samuel Kim, Sungbok Lee, Shrikanth S. Narayanan. 913-916 [doi]

Modal analysis of vocal fold vibrations using laryngotopographyKen-Ichi Sakakibara, Hiroshi Imagawa, Miwako Kimura, Hisayuki Yokonishi, Niro Tayama. 917-920 [doi]

Laryngeal voice quality in the expression of focusMartti Vainio, Matti Airas, Juhani Järvikivi, Paavo Alku. 921-924 [doi]

Laryngeal characteristics during the production of geminate consonantsMasako Fujimoto, Kikuo Maekawa, Seiya Funatsu. 925-928 [doi]

Numerical study of turbulent flow-induced sound production in presence of a tooth-shaped obstacle: towards sibilant [s] physical modelingJulien Cisonni, Kazunori Nozaki, Annemie Van Hirtum, Shigeo Wada. 929-932 [doi]

Morphological and predictability effects on schwa reduction: the case of Dutch word-initial syllablesIris Hanique, Barbara Schuppler, Mirjam Ernestus. 933-936 [doi]

Acoustic-to-articulatory inversion based on local regressionSamer Al Moubayed, G. Ananthakrishnan. 937-940 [doi]

Korean lenis, fortis, and aspirated stops: effect of place of articulation on acoustic realizationMirjam Broersma. 941-944 [doi]

Speech synthesis by modeling harmonics structure with multiple functionToru Nakashika, Ryuki Tachibana, Masafumi Nishimura, Tetsuya Takiguchi, Yasuo Ariki. 945-948 [doi]

Physics of body-conducted silent speech - production, propagation and representation of non-audible murmurMakoto Otani, Tatsuya Hirahara. 949-952 [doi]

Multichannel noise reduction using low order RTF estimateSubhojit Chakladar, Nam Soo Kim, Yu Gwang Jin, Tae Gyoon Kang. 953-956 [doi]

Reinforced blocking matrix with cross channel projection for speech enhancementInho Lee, Jongsung Yoon, Yoonjae Lee, Hanseok Ko. 957-960 [doi]

Masking property based microphone array post-filter designNing Cheng, Wenju Liu, Lan Wang. 961-964 [doi]

Reduction of broadband noise in speech signals by multilinear subspace analysisYusuke Sato, Tetsuya Hoya, Hovagim Bakardjian, Andrzej Cichocki. 965-968 [doi]

Novel probabilistic control of noise reduction for improved microphone array beamformingJungpyo Hong, Seung Ho Han, Sangbae Jeong, Minsoo Hahn. 969-972 [doi]

Speech enhancement using improved generalized sidelobe canceller in frequency domain with multi-channel postfilteringKai Li, Qiang Fu, YongHong Yan. 973-976 [doi]

Close speaker cancellation for suppression of non-stationary background noise for hands-free speech interfaceJani Even, Carlos Toshinori Ishi, Hiroshi Saruwatari, Norihiro Hagita. 977-980 [doi]

Multi-channel iterative dereverberation based on codebook constrained iterative multi-channel Wiener filterAjay Srinivasamurthy, Thippur V. Sreenivas. 981-984 [doi]

Speaker-dependent mapping of source and system features for enhancement of throat microphone speechAnand Joseph Xavier Medabalimi, Sri Harish Reddy Mallidi, B. Yegnanarayana. 985-988 [doi]

An analytic modeling approach to enhancing throat microphone speech commands for keyword spottingJun Cai, Stefano Marini, Pierre Malarme, Francis Grenez, Jean Schoentgen. 989-992 [doi]

Single-channel speech enhancement using Kalman filtering in the modulation domainStephen So, Kamil K. Wójcicki, Kuldip K. Paliwal. 993-996 [doi]

Integrated feedback and noise reduction algorithm in digital hearing aids via oscillation detectionMiao Yao, Weiqian Liang. 997-1000 [doi]

A blind signal-to-noise ratio estimator for high noise speech recordingsCharles Mercier, Roch Lefebvre. 1001-1004 [doi]

Estimation of glottal area function using stereo-endoscopic high-speed digital imagingHiroshi Imagawa, Ken-Ichi Sakakibara, Isao T. Tokuda, Mamiko Otsuka, Niro Tayama. 1005-1008 [doi]

Toward aero-acoustical analysis of the sibilant /s/: an oral cavity modelingKazunori Nozaki, Youhei Ohnishi, Takashi Suda, Shigeo Wada, Shinji Shimojo. 1009-1012 [doi]

Effects of wall impedance on transmission and attenuation of higher-order modes in vocal-tract modelKunitoshi Motoki. 1013-1016 [doi]

Articulatory synthesis and perception of plosive-vowel syllables with virtual consonant targetsPeter Birkholz, Bernd J. Kröger, Christiane Neuschaefer-Rube. 1017-1020 [doi]

Speech robot mimicking human articulatory motionKotaro Fukui, Toshihiro Kusano, Yoshikazu Mukaeda, Yuto Suzuki, Atsuo Takanishi, Masaaki Honda. 1021-1024 [doi]

Mechanical vocal-tract models for speech dynamicsTakayuki Arai. 1025-1028 [doi]

Prosodic timing analysis for articulatory re-synthesis using a bank of resonators with an adaptive oscillatorMichael C. Brady. 1029-1032 [doi]

Decoding with shrinkage-based language modelsAhmad Emami, Stanley F. Chen, Abraham Ittycheriah, Hagen Soltau, Bing Zhao. 1033-1036 [doi]

Enhanced word classing for model MStanley F. Chen, Stephen M. Chu. 1037-1040 [doi]

Improved neural network based language modelling and adaptationJunho Park, Xunying Liu, Mark J. F. Gales, Philip C. Woodland. 1041-1044 [doi]

Recurrent neural network based language modelTomas Mikolov, Martin Karafiát, Lukas Burget, Jan Cernocký, Sanjeev Khudanpur. 1045-1048 [doi]

Discriminative language modeling using simulated ASR errorsPreethi Jyothi, Eric Fosler-Lussier. 1049-1052 [doi]

Learning a language model from continuous speechGraham Neubig, Masato Mimura, Shinsuke Mori, Tatsuya Kawahara. 1053-1056 [doi]

Looking for relevant features for speaker role recognitionBenjamin Bigot, Julien Pinquier, Isabelle Ferrane, Régine André-Obrecht. 1057-1060 [doi]

Prosodic speaker verification using subspace multinomial models with intersession compensationMarcel Kockmann, Lukas Burget, Ondrej Glembek, Luciana Ferrer, Jan Cernocký. 1061-1064 [doi]

The estimation and kernel metric of spectral correlation for text-independent speaker verificationEryu Wang, Kong-Aik Lee, Bin Ma, Haizhou Li, Wu Guo, Li-Rong Dai. 1065-1068 [doi]

Improving monaural speaker identification by double-talk detectionRahim Saeidi, Pejman Mowlaee, Tomi Kinnunen, Zheng-Hua Tan, Mads Græsbøll Christensen, Søren Holdt Jensen, Pasi Fränti. 1069-1072 [doi]

Exploring subsegmental and suprasegmental features for a text-dependent speaker verification in distant speech signalsB. Avinash, S. Guruprasad, B. Yegnanarayana. 1073-1076 [doi]

A fast implementation of factor analysis for speaker verificationQingsong Liu, Wei Huang, Dongxing Xu, Hongbin Cai, Beiqian Dai. 1077-1080 [doi]

Fast converging iterative Kalman filtering for speech enhancement using long and overlapped tapered windows with large side lobe attenuationStephen So, Kuldip K. Paliwal. 1081-1084 [doi]

Robust noise estimation using minimum correction with harmonicity controlXuejing Sun, Kuan-Chieh Yen, Rogerio G. Alves. 1085-1088 [doi]

New insights into subspace noise trackingMahdi Triki. 1089-1092 [doi]

Bias considerations for minimum subspace noise trackingMahdi Triki, Kees Janse. 1093-1096 [doi]

A corpus-based approach to speech enhancement from nonstationary noiseJi Ming, Ramji Srinivasan, Danny Crookes. 1097-1100 [doi]

Bandwidth expansion of speech based on wavelet transform modulus maxima vector mappingZhe Chen, You-Chi Cheng, Fuliang Yin, Chin-Hui Lee. 1101-1104 [doi]

Hidden Markov models with context-sensitive observations for grapheme-to-phoneme conversionUdochukwu Kalu Ogbureke, Peter Cahill, Julie Carson-Berndsen. 1105-1108 [doi]

Evaluating a dialog language generation system: comparing the MOUNTAIN system to other NLG approachesBrian Langner, Stephan Vogel, Alan W. Black. 1109-1112 [doi]

Active appearance models for photorealistic visual speech synthesisWesley Mattheyses, Lukas Latacz, Werner Verhelst. 1113-1116 [doi]

Latent affective mapping: a novel framework for the data-driven analysis of emotion in textJerome R. Bellegarda. 1117-1120 [doi]

Native and non-native speaker judgements on the quality of synthesized speechAnna C. Janska, Robert A. J. Clark. 1121-1124 [doi]

Machine learning for text selection with expressive unit-selection voicesDominic Espinosa, Michael White, Eric Fosler-Lussier, Chris Brew. 1125-1128 [doi]

Acoustic correlates of meaning structure in conversational speechAlexei V. Ivanov, Giuseppe Riccardi, S. Ghosh, Sara Tonelli, Evgeny A. Stepanov. 1129-1132 [doi]

HMM-based prosodic structure model using rich linguistic contextNicolas Obin, Xavier Rodet, Anne Lacheret. 1133-1136 [doi]

Audiovisual congruence and pragmatic focus markingCharlotte Wollermann, Bernhard Schröder, Ulrich Schade. 1137-1140 [doi]

Redescribing intonational categories with functional data analysisMargaret Zellers, Michele Gubian, Brechtje Post. 1141-1144 [doi]

Exploring goodness of prosody by diverse matching templatesShen Huang, Hongyan Li, Shijin Wang, Jiaen Liang, Bo Xu. 1145-1148 [doi]

A language-identification inspired method for spontaneous speech detectionMickael Rouvier, Richard Dufour, Georges Linarès, Yannick Estève. 1149-1152 [doi]

Speech dominoes and phonetic convergenceGérard Bailly, Amélie Lelong. 1153-1156 [doi]

A quick sequential forward floating feature selection algorithm for emotion detection from speechMátyás Brendel, Riccardo Zaccarelli, Laurence Devillers. 1157-1160 [doi]

Automated vocal emotion recognition using phoneme class specific featuresGéza Kiss, Jan P. H. van Santen. 1161-1164 [doi]

Feature selection for pose invariant lip biometricsAdrian Pass, Jianguo Zhang, Darryl Stewart. 1165-1168 [doi]

Signal-based accent and phrase marking using the Fujisaki modelHussein Hussein, Rüdiger Hoffmann. 1169-1172 [doi]

A study of interplay between articulatory movement and prosodic characteristics in emotional speech productionJangwon Kim, Sungbok Lee, Shrikanth S. Narayanan. 1173-1176 [doi]

Improved phoneme recognition by integrating evidence from spectro-temporal and cepstral featuresShang-wen Li, Liang-Che Sun, Lin-Shan Lee. 1177-1180 [doi]

Using spectro-temporal features to improve AFE feature extraction for ASRSuman V. Ravuri, Nelson Morgan. 1181-1184 [doi]

Using harmonic phase information to improve ASR rateIbon Saratxaga, Inma Hernáez, Igor Odriozola, Eva Navas, Iker Luengo, Daniel Erro. 1185-1188 [doi]

Speech recognition using long-term phase informationKazumasa Yamamoto, Eiichi Sueyoshi, Seiichi Nakagawa. 1189-1192 [doi]

Low-dimensional space transforms of posteriors in speech recognitionJan Zelinka, Jan Trmal, Ludek Müller. 1193-1196 [doi]

Hierarchical bottle neck features for LVCSRChristian Plahl, Ralf Schlüter, Hermann Ney. 1197-1200 [doi]

Hierarchical neural net architectures for feature extraction in ASRFrantisek Grézl, Martin Karafiát. 1201-1204 [doi]

Mutual information analysis for feature and sensor subset selection in surface electromyography based speech recognitionVivek Kumar Rangarajan Sridhar, Rohit Prasad, Prem Natarajan. 1205-1208 [doi]

Learning from human errors: prediction of phoneme confusions based on modified ASR trainingBernd T. Meyer, Birger Kollmeier. 1209-1212 [doi]

Speech intelligibility of diagonally localized speech with competing noise using bone-conduction headphonesKazuhiro Kondo, Takayuki Kanda, Yosuke Kobayashi, Hiroyuki Yagyu. 1213-1216 [doi]

Masking of vowel-analog transitions by vowel-analog distractersPierre L. Divenyi. 1217-1220 [doi]

2010, a speech oddity: phonetic transcription of reversed speechFrançois Pellegrino, Emmanuel Ferragne, Fanny Meunier. 1221-1224 [doi]

Perception on pitch reset at discourse boundariesHsin-Yi Lin, Janice Fon. 1225-1228 [doi]

Effect of spatial separation on speech-in-noise comprehension in dyslexic adultsMarjorie Dole, Michel Hoen, Fanny Meunier. 1229-1232 [doi]

Speech categorization context effects in seven- to nine-month-old infantsEllen Marklund, Francisco Lacerda, Anna Ericsson. 1233-1236 [doi]

Changes in temporal processing of speech across the adult lifespanDiane Kewley-Port, Larry E. Humes, Daniel Fogerty. 1237-1240 [doi]

Fluency and structural complexity as predictors of L2 oral proficiencyJared Bernstein, Jian Cheng, Masanori Suzuki. 1241-1244 [doi]

Semantic facilitation in bilingual everyday speech comprehensionMarco van de Ven, Benjamin V. Tucker, Mirjam Ernestus. 1245-1248 [doi]

L2 experience and non-native vowel categorization of L1-Mandarin speakersBo-ren Hsieh, Ho-hsien Pan. 1249-1252 [doi]

Cross-lingual talker discriminationMirjam Wester. 1253-1256 [doi]

Dajare is not the lowest form of witTakashi Otake. 1257-1260 [doi]

Comparison of methods for topic classification in a speech-oriented guidance systemRafael Torres, Shota Takeuchi, Hiromichi Kawanami, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano. 1261-1264 [doi]

Using dependency parsing and machine learning for factoid question answering on spoken documentsPere Comas, Jordi Turmo, Lluís Màrquez. 1265-1268 [doi]

A spoken term detection framework for recovering out-of-vocabulary words using the webCarolina Parada, Abhinav Sethy, Mark Dredze, Frederick Jelinek. 1269-1272 [doi]

Improved spoken term detection by discriminative training of acoustic models based on user relevance feedbackHung-yi Lee, Chia-Ping Chen, Ching-feng Yeh, Lin-Shan Lee. 1273-1276 [doi]

A lightweight keyword and tag-cloud retrieval algorithm for automatic speech recognition transcriptsSebastian Tschöpel, Daniel Schneider. 1277-1280 [doi]

Lecture subtopic retrieval by retrieval keyword expansion using subordinate conceptNoboru Kanedera, Tetsuo Funada, Seiichi Nakagawa. 1281-1284 [doi]

Spoken document retrieval for oral presentations integrating global document similarities into local document similaritiesHiroaki Nanjo, Yusuke Iyonaga, Takehiko Yoshimi. 1285-1288 [doi]

Combining word-based features, statistical language models, and parsing for named entity recognitionJoseph Polifroni, Stephanie Seneff. 1289-1292 [doi]

Efficient combined approach for named entity recognition in spoken languageAzeddine Zidouni, Sophie Rosset, Hervé Glotin. 1293-1296 [doi]

Prominence based scoring of speech segments for automatic speech-to-speech summarizationSree Harsha Yella, Vasudeva Varma, Kishore Prahallad. 1297-1300 [doi]

Maximum lexical cohesion for fine-grained news story segmentationZihan Liu, Lei Xie, Wei Feng. 1301-1304 [doi]

Phoneme lattice based texttiling towards multilingual story segmentationXiaoxuan Wang, Lei Xie, Bin Ma, Engsiong Chng, Haizhou Li. 1305-1308 [doi]

The characterization of the relative information content by spectral features for the objective intelligibility assessment of nonlinearly processed speechAnton Schlesinger, Marinus M. Boone. 1309-1312 [doi]

Analytical assessment and distance modeling of speech transmission qualityMarcel Wältermann, Alexander Raake, Sebastian Möller. 1313-1316 [doi]

An intrusive super-wideband speech quality model: DIALNicolas Côté, Vincent Koehl, Valérie Gautier-Turbin, Alexander Raake, Sebastian Möller. 1317-1320 [doi]

It takes two to tango - assessing the impact of delay on conversational interactivity on perceived speech qualitySebastian Egger, Raimund Schatz, Stefan Scherer. 1321-1324 [doi]

Comparison of approaches for instrumentally predicting the quality of text-to-speech systemsSebastian Möller, Florian Hinterleitner, Tiago H. Falk, Tim Polzehl. 1325-1328 [doi]

A hybrid architecture for mobile voice user interfacesImre Kiss, Joseph Polifroni, Chao Wang, Ghinwa F. Choueiter, Mike Phillips. 1329-1332 [doi]

Assessment of spoken and multimodal applications: lessons learned from laboratory and field studiesMarkku Turunen, Jaakko Hakulinen, Tomi Heimonen. 1333-1336 [doi]

Improving cross database prediction of dialogue quality using mixture of expertsKlaus-Peter Engelbrecht, Hamed Ketabdar, Sebastian Möller. 1337-1340 [doi]

Boosting systems for LVCSRGeorge Saon, Hagen Soltau. 1341-1344 [doi]

Incorporating sparse representation phone identification features in automatic speech recognition using exponential familiesVaibhava Goel, Tara N. Sainath, Bhuvana Ramabhadran, Peder A. Olsen, David Nahamoo, Dimitri Kanevsky. 1345-1348 [doi]

Integrating MLP features and discriminative training in data sampling based ensemble acoustic modelingXin Chen, Yunxin Zhao. 1349-1352 [doi]

Semi-supervised training of Gaussian mixture models by conditional entropy minimizationJui-Ting Huang, Mark Hasegawa-Johnson. 1353-1356 [doi]

A study of irrelevant variability normalization based training and unsupervised online adaptation for LVCSRGuangchuan Shi, Yu Shi, Qiang Huo. 1357-1360 [doi]

Improvements to generalized discriminative feature transformation for speech recognitionRoger Hsiao, Florian Metze, Tanja Schultz. 1361-1364 [doi]

Improving ASR-based topic segmentation of TV programs with confidence measures and semantic relationsCamille Guinaudeau, Guillaume Gravier, Pascale Sébillot. 1365-1368 [doi]

The relevance of timing, pauses and overlaps in dialogues: detecting topic changes in scenario based meetingsSaturnino Luz, Jing Su. 1369-1372 [doi]

Semi-supervised part-of-speech tagging in speech applicationsRichard Dufour, Benoît Favre. 1373-1376 [doi]

Memory-based active learning for French broadcast newsFrédéric Tantini, Christophe Cerisara, Claire Gardent. 1377-1380 [doi]

Can conversational word usage be used to predict speaker demographics?Dan Gillick. 1381-1384 [doi]

Prosodic word-based error correction in speech recognition using prosodic word expansion and contextual informationChao-Hong Liu, Chung-Hsien Wu. 1385-1388 [doi]

Fully automatic segmentation for prosodic speech corporaSarah Hoffmann, Beat Pfister. 1389-1392 [doi]

A novel text-independent phonetic segmentation algorithm based on the microcanonical multiscale formalismVahid Khanagha, Khalid Daoudi, Oriol Pont, Hussein M. Yahia. 1393-1396 [doi]

Phone boundary detection using sample-based acoustic parametersYou-yu Lin, Yih-Ru Wang, Yuan-Fu Liao. 1397-1400 [doi]

HMM-based automatic visual speech segmentation using facial dataUtpala Musti, Asterios Toutios, Slim Ouni, Vincent Colotte, Brigitte Wrobel-Dautcourt, Marie-Odile Berger. 1401-1404 [doi]

Bayes factor based speaker segmentation for speaker diarizationD. Wang, Robert Vogt, Sridha Sridharan. 1405-1408 [doi]

Using high-level information to detect key audio events in a tennis gameQiang Huang, Stephen J. Cox. 1409-1412 [doi]

What do you mean, you're uncertain?: the interpretation of cue words and rising intonation in dialogueCatherine Lai. 1413-1416 [doi]

Coping imbalanced prosodic unit boundary detection with linguistically-motivated prosodic featuresYi-Fen Liu, Shu-Chuan Tseng, Jyh-Shing Roger Jang, C.-H. Alvin Chen. 1417-1420 [doi]

Improving prosodic phrase prediction by unsupervised adaptation and syntactic features extractionZhigang Chen, Guoping Hu, Wei Jiang. 1421-1424 [doi]

Perception-based automatic approximation of F0 contours in Cantonese speechYujia Li, Tan Lee. 1425-1428 [doi]

Discriminative training and unsupervised adaptation for labeling prosodic events with limited training dataRaul Fernandez, Bhuvana Ramabhadran. 1429-1432 [doi]

Prosody for the eyes: quantifying visual prosody using guided principal component analysisErin Cvejic, Jeesun Kim, Chris Davis, Guillaume Gibert. 1433-1436 [doi]

An investigation into direct scoring methods without SVM training in speaker verificationCe Zhang, Rong Zheng, Bo Xu. 1437-1440 [doi]

Large margin Gaussian mixture models for speaker identificationReda Jourani, Khalid Daoudi, Régine André-Obrecht, Driss Aboutajdine. 1441-1444 [doi]

On the use of Gaussian component information in the generative likelihood ratio estimation for speaker verificationRong Zheng, Bo Xu. 1445-1448 [doi]

Acoustic vector resampling for GMMSVM-based speaker verificationMan-Wai Mak, Wei Rao. 1449-1452 [doi]

A fast speaker indexing using vector quantization and second order statistics with adaptive threshold computationKonstantin Biatov. 1453-1456 [doi]

Using phoneme recognition and text-dependent speaker verification to improve speaker segmentation for Chinese speechGang Wang, Xiaojun Wu, Thomas Fang Zheng. 1457-1460 [doi]

On enhancing feature sequence filtering with filter-bank energy transformation in speaker verification with telephone speechClaudio Garretón, Néstor Becerra Yoma. 1461-1464 [doi]

MAP estimation of subspace transform for speaker recognitionDonglai Zhu, Bin Ma, Kong-Aik Lee, Cheung Chi Leung, Haizhou Li. 1465-1468 [doi]

A longest matching segment approach for text-independent speaker recognitionAyeh Jafari, Ramji Srinivasan, Danny Crookes, Ji Ming. 1469-1472 [doi]

Approaching human listener accuracy with modern speaker verificationVille Hautamäki, Tomi Kinnunen, Mohaddeseh Nosratighods, Kong-Aik Lee, Bin Ma, Haizhou Li. 1473-1476 [doi]

Extended weighted linear prediction (XLP) analysis of speech and its application to speaker verification in adverse conditionsJouni Pohjalainen, Rahim Saeidi, Tomi Kinnunen, Paavo Alku. 1477-1480 [doi]

The use of subvector quantization and discrete densities for fast GMM computation for speaker verificationGuoli Ye, Brian Mak. 1481-1484 [doi]

Parallel lexical-tree based LVCSR on multi-core processorsNaveen Parihar, Ralf Schlüter, David Rybach, Eric A. Hansen. 1485-1488 [doi]

Exploring recognition network representations for efficient speech inference on highly parallel platformsJike Chong, Ekaterina Gonina, Kisun You, Kurt Keutzer. 1489-1492 [doi]

WFST compression for automatic speech recognitionDiamantino Caseiro. 1493-1496 [doi]

Speech recognizer optimization under speed constraintsIvan Bulyko. 1497-1500 [doi]

The 2010 CMU GALE speech-to-text systemFlorian Metze, Roger Hsiao, Qin Jin, Udhyakumar Nallasamy, Tanja Schultz. 1501-1504 [doi]

Speaker diarization in meeting audio for single distant microphoneTin Lay Nwe, Hanwu Sun, Bin Ma, Haizhou Li. 1505-1508 [doi]

Extending the punctuation module for European PortugueseFernando Batista, Helena Moniz, Isabel Trancoso, Hugo Meinedo, Ana Isabel Mata, Nuno J. Mamede. 1509-1512 [doi]

Utilizing a noisy-channel approach for Korean LVCSRSakriani Sakti, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura. 1513-1516 [doi]

The RWTH 2009 Quaero ASR evaluation system for English and GermanMarkus Nußbaum-Thom, Simon Wiesler, Martin Sundermeyer, Christian Plahl, Stefan Hahn, Ralf Schlüter, Hermann Ney. 1517-1520 [doi]

When is indexical information about speech activated? Evidence from a cross-modal priming experimentBenjamin Munson, Renata Solum. 1521-1524 [doi]

The influence of actual and perceived sexual orientation on diadochokinetic rate in women and menBenjamin Munson. 1525-1528 [doi]

Laryngealization and features for Chinese tonal recognitionKristine M. Yu. 1529-1532 [doi]

Production and perception of Vietnamese short vowels in V1V2 contextViet Son Nguyen, Eric Castelli, René Carré. 1533-1536 [doi]

Measuring basic tempo across languages and some implications for speech rhythmGertraud Fenk-Oczlon, August Fenk. 1537-1540 [doi]

Durational structure of Japanese single/geminate stops in three- and four-mora words spoken at varied ratesYukari Hirata, Shigeaki Amano. 1541-1544 [doi]

Distribution and trichotomic realization of voiced velars in Japanese - an experimental studyShin-ichiro Sano, Tomohiko Ooigawa. 1545-1548 [doi]

Specification in context - devoicing processes in Polish, French, American English and German sonorantsJagoda Sieczkowska, Bernd Möbius, Grzegorz Dogil. 1549-1552 [doi]

Phonetic imitation of Japanese vowel devoicingKuniko Y. Nielsen. 1553-1556 [doi]

Post-aspiration in standard Italian: some first cross-regional acoustic evidenceMary Stevens, John Hajek. 1557-1560 [doi]

Articulatory grounding of Southern Salentino harmony processesMirko Grimaldi, Andrea Calabrese, Francesco Sigona, Luigina Garrapa, Bianca Sisinni. 1561-1564 [doi]

Effects of accent typicality and phonotactic frequency on nonword immediate serial recall performance in JapaneseYuuki Tanida, Taiji Ueno, Satoru Saito, Matthew A. Lambon-Ralph. 1565-1567 [doi]

How abstract is phonetics?Osamu Fujimura. 1568-1571 [doi]

Data-driven analysis of realtime vocal tract MRI using correlated image regionsAdam C. Lammert, Michael I. Proctor, Shrikanth S. Narayanan. 1572-1575 [doi]

Rapid semi-automatic segmentation of real-time magnetic resonance images for parametric vocal tract analysisMichael I. Proctor, Daniel Bone, Athanasios Katsamanis, Shrikanth S. Narayanan. 1576-1579 [doi]

Improved real-time MRI of oral-velar coordination using a golden-ratio spiral view orderYoon-Chul Kim, Shrikanth S. Narayanan, Krishna S. Nayak. 1580-1583 [doi]

Statistical multi-stream modeling of real-time MRI articulatory speech dataErik Bresch, Athanasios Katsamanis, Louis Goldstein, Shrikanth S. Narayanan. 1584-1587 [doi]

Predicting unseen articulations from multi-speaker articulatory modelsG. Ananthakrishnan, Pierre Badin, Julián Andrés Valdés Vargas, Olov Engwall. 1588-1591 [doi]

Estimating missing data sequences in x-ray microbeam recordingsChao Qin, Miguel Á. Carreira-Perpiñán. 1592-1595 [doi]

Adaptation of a tongue shape model by local feature transformationsChao Qin, Miguel Á. Carreira-Perpiñán, Mohsen Farhadloo. 1596-1599 [doi]

Vocal tract contour analysis of emotional speech by the functional data curve representationSungbok Lee, Shrikanth S. Narayanan. 1600-1603 [doi]

Locally-weighted regression for estimating the forward kinematics of a geometric vocal tract modelAdam C. Lammert, Louis Goldstein, Khalil Iskarous. 1604-1607 [doi]

Identifying articulatory goals from kinematic data using principal differential analysisMichael Reimer, Frank Rudzicz. 1608-1611 [doi]

Estimation of speech lip features from discrete cosinus transformZuheng Ming, Denis Beautemps, Gang Feng, Sébastien Schmerber. 1612-1615 [doi]

Autoregressive modelling for linear prediction of ultrasonic speechFarzaneh Ahmadi, Ian Vince McLoughlin, Hamid R. Sharifzadeh. 1616-1619 [doi]

Enhanced speech yielding higher intelligibility for all listeners and environmentsTakayuki Arai, Nao Hodoshima. 1620-1623 [doi]

Quality conversion of non-acoustic signals for facilitating human-to-human speech communication under harsh acoustic conditionsSeyed Omid Sadjadi, Sanjay A. Patil, John H. L. Hansen. 1624-1627 [doi]

The use of air-pressure sensor in electrolaryngeal speech enhancement based on statistical voice conversionKeigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano. 1628-1631 [doi]

A new binary mask based on noise constraints for improved speech intelligibilityGibak Kim, Philipos C. Loizou. 1632-1635 [doi]

Energy reallocation strategies for speech enhancement in known noise conditionsYan Tang, Martin Cooke. 1636-1639 [doi]

Effects of enhancement of spectral changes on speech quality and subjective speech intelligibilityJing Chen, Thomas Baer, Brian C. J. Moore. 1640-1643 [doi]

Prior information for rapid speaker adaptationCatherine Breslin, K. K. Chin, Mark J. F. Gales, Kate Knill, Haitian Xu. 1644-1647 [doi]

Discriminative adaptation for log-linear acoustic modelsJonas Lööf, Ralf Schlüter, Hermann Ney. 1648-1651 [doi]

Automatic speech recognition of multiple accented English dataDimitra Vergyri, Lori Lamel, Jean-Luc Gauvain. 1652-1655 [doi]

Shrinkage model adaptation in automatic speech recognitionJinyu Li, Yu Tsao, Chin-Hui Lee. 1656-1659 [doi]

Unscented transform with online distortion estimation for HMM adaptationJinyu Li, Dong Yu, Yifan Gong, L. Deng. 1660-1663 [doi]

HMM adaptation using linear spline interpolation with integrated spline parameter training for robust speech recognitionMichael L. Seltzer, Alex Acero. 1664-1667 [doi]

CRF-based stochastic pronunciation modeling for out-of-vocabulary spoken term detectionDong Wang, Simon King, Nicholas W. D. Evans, Raphaël Troncy. 1668-1671 [doi]

Improved spoken term detection by feature space pseudo-relevance feedbackChia-Ping Chen, Hung-yi Lee, Ching-feng Yeh, Lin-Shan Lee. 1672-1675 [doi]

Towards spoken term discovery at scale with zero resourcesAren Jansen, Kenneth Church, Hynek Hermansky. 1676-1679 [doi]

Vocabulary independent spoken query: a case for subword unitsEvandro B. Gouvêa, Tony Ezzat. 1680-1683 [doi]

Extractive speech summarization - from the view of decision theoryShih-Hsiang Lin, Yao-Ming Yeh, Berlin Chen. 1684-1687 [doi]

The impact of ASR on abstractive vs. extractive meeting summariesGabriel Murray, Giuseppe Carenini, Raymond T. Ng. 1688-1691 [doi]

Binary coding of speech spectrograms using a deep auto-encoderLi Deng, Michael L. Seltzer, Dong Yu, Alex Acero, Abdel-rahman Mohamed, Geoffrey E. Hinton. 1692-1695 [doi]

A super-resolution spectrogram using coupled PLCAJuhan Nam, Gautham J. Mysore, Joachim Ganseman, Kyogu Lee, Jonathan S. Abel. 1696-1699 [doi]

Fast least-squares solution for sinusoidal, harmonic and quasi-harmonic modelsGeorgios Tzedakis, Yannis Pantazis, Olivier Rosec, Yannis Stylianou. 1700-1703 [doi]

Sparse component analysis for speech recognition in multi-speaker environmentAfsaneh Asaei, Hervé Bourlard, Philip N. Garner. 1704-1707 [doi]

Intra-frame variability as a predictor of frame classifiabilityTrond Skogstad, Torbjørn Svendsen. 1708-1711 [doi]

Autocorrelation and double autocorrelation based spectral representations for a noisy word recognition systemTetsuya Shimamura, Ngoc Dinh Nguyen. 1712-1715 [doi]

Maximum a posteriori voice conversion using sequential Monte Carlo methodsElina Helander, Hanna Silén, Joaquín Míguez, Moncef Gabbouj. 1716-1719 [doi]

Dynamic model selection for spectral voice conversionPierre Lanchantin, Xavier Rodet. 1720-1723 [doi]

Speaker-independent HMM-based voice conversion using quantized fundamental frequencyTakashi Nose, Takao Kobayashi. 1724-1727 [doi]

Probabilistic integration of joint density model and speaker model for voice conversionDaisuke Saito, Shinji Watanabe, Atsushi Nakamura, Nobuaki Minematsu. 1728-1731 [doi]

Text-independent F0 transformation with non-parallel data for voice conversionZhi-Zheng Wu, Tomi Kinnunen, Engsiong Chng, Haizhou Li. 1732-1735 [doi]

A minimum converted trajectory error (MCTE) approach to high quality speech-to-lips conversionXiaodan Zhuang, Lijuan Wang, Frank K. Soong, Mark Hasegawa-Johnson. 1736-1739 [doi]

Influence of lexical tones on intonation in KammuAnastasia Karlsson, David House, Jan-Olof Svantesson, Damrong Tayanin. 1740-1743 [doi]

Phonetic realization of second occurrence focus in JapaneseSatoshi Nambu, Yong-cheol Lee. 1744-1747 [doi]

Prosodic grouping and relative clause disambiguation in MandarinJianjing Kuang. 1748-1751 [doi]

Text-based unstressed syllable prediction in MandarinYa Li, Jianhua Tao, Meng Zhang, Shifeng Pan, Xiaoying Xu. 1752-1755 [doi]

Flat pitch accents in CzechTomás Dubeda. 1756-1759 [doi]

Positional variability of pitch accents in CzechTomás Dubeda. 1760-1763 [doi]

Modeling of sentence-medial pauses in Bangla readout speech: occurrence and durationShyamal Das Mandal, Arup Saha, Tulika Basu, Keikichi Hirose, Hiroya Fujisaki. 1764-1767 [doi]

Declarative sentence intonation patterns in 8 Swiss German dialectsAdrian Leemann, Lucy Zuberbühler. 1768-1771 [doi]

Syllable-level prominence detection with acoustic evidenceJe Hun Jeon, Yang Liu. 1772-1775 [doi]

Prosody cues for classification of the discourse particle hã in HindiSankalan Prasad, Kalika Bali. 1776-1779 [doi]

Interaction of syntax-marked focus and wh-question induced focus in Standard ChineseYuan Jia, Aijun Li. 1780-1783 [doi]

Prominence detection in Swedish using syllable correlatesSamer Al Moubayed, Jonas Beskow. 1784-1787 [doi]

Automatic analysis of the intonation of a tone language. Applying the Momel algorithm to spontaneous Standard Chinese (Beijing)Na Zhi, Daniel Hirst, Pier Marco Bertinetto. 1788-1791 [doi]

Towards long-range prosodic attribute modeling for language recognitionRaymond W. M. Ng, Cheung Chi Leung, Ville Hautamäki, Tan Lee, Bin Ma, Haizhou Li. 1792-1795 [doi]

A modified parameterization of the Fujisaki modelRobert Schubert, Oliver Jokisch, Diane Hirschfeld. 1796-1799 [doi]

Within and across sentence boundary language modelSaeedeh Momtazi, Friedrich Faubel, Dietrich Klakow. 1800-1803 [doi]

Impact of word classing on shrinkage-based language modelsRuhi Sarikaya, Stanley F. Chen, Abhinav Sethy, Bhuvana Ramabhadran. 1804-1807 [doi]

Combination of probabilistic and possibilistic language modelsStanislas Oger, Vladimir Popescu, Georges Linarès. 1808-1811 [doi]

On-demand language model interpolation for mobile speech inputBrandon Ballinger, Cyril Allauzen, Alexander Gruenstein, Johan Schalkwyk. 1812-1815 [doi]

Text normalization based on statistical machine translation and internet user supportTim Schlippe, Chenfei Zhu, Jan Gebhardt, Tanja Schultz. 1816-1819 [doi]

Efficient estimation of maximum entropy language models with n-gram features: an SRILM extensionTanel Alumäe, Mikko Kurimo. 1820-1823 [doi]

Similar n-gram language modelChristian Gillot, Christophe Cerisara, David Langlois, Jean-Paul Haton. 1824-1827 [doi]

Topic and style-adapted language modeling for Thai broadcast news ASRMarkpong Jongtaveesataporn, Sadaoki Furui. 1828-1831 [doi]

Augmented context features for Arabic speech recognitionAhmad Emami, Hong-Kwang Jeff Kuo, Imed Zitouni, Lidia Mangu. 1832-1835 [doi]

A statistical segment-based approach for spoken language understandingLucía Ortega, Isabel Galiano, Lluís F. Hurtado, Emilio Sanchis, Encarna Segarra. 1836-1839 [doi]

Cantonese tone word learning by tone and non-tone language speakersAngela Cooper, Yue Wang. 1840-1843 [doi]

Validation of a training method for L2 continuous-speech segmentationAnne Cutler, Janise Shanley. 1844-1847 [doi]

Linguistic rhythm in foreign accentJiahong Yuan. 1848-1849 [doi]

The effect of a word embedded in a sentence and speaking rate variation on the perceptual training of geminate and singleton consonant distinctionMee Sonu, Keiichi Tajima, Hiroaki Kato, Yoshinori Sagisaka. 1850-1853 [doi]

Foreign accent matters most when timing is wrongChiharu Tsurutani. 1854-1857 [doi]

Effects of Korean learners' consonant cluster reduction strategies on English speech recognition performanceHyejin Hong, Jina Kim, Minhwa Chung. 1858-1861 [doi]

The effects of EMA-based augmented visual feedback on the English speakers' acquisition of the Japanese flap: a perceptual studyJune S. Levitt, William F. Katz. 1862-1865 [doi]

Perception of voiceless fricatives by Japanese listeners of advanced and intermediate level English proficiencyHinako Masuda, Takayuki Arai. 1866-1869 [doi]

Perception of Estonian vowel categories by native and non-native speakersLya Meister, Einar Meister. 1870-1873 [doi]

Spoken English assessment system for non-native speakers using acoustic and prosodic featuresQin Shi, Kun Li, Shilei Zhang, Stephen M. Chu, Ji Xiao, Zhijian Ou. 1874-1877 [doi]

Russian infants and children's sounds and speech corpuses for language acquisition studiesElena E. Lyakso, Olga V. Frolova, Anna V. Kurazhova, Julia S. Gaikova. 1878-1881 [doi]

Language-specific influence on phoneme development: French and Drehu dataJulia Monnin, Hélène Loevenbruck. 1882-1885 [doi]

Did you say susi or shushi? measuring the emergence of robust fricative contrasts in English- and Japanese-acquiring childrenJeffrey J. Holliday, Mary E. Beckman, Chanelle Mays. 1886-1889 [doi]

An empirical comparison of the T³, Juicer, HDecode and Sphinx3 decodersJosef R. Novak, Paul R. Dixon, Sadaoki Furui. 1890-1893 [doi]

Tracter: a lightweight dataflow frameworkPhilip N. Garner, John Dines. 1894-1897 [doi]

Verifying pronunciation dictionaries using conflict analysisMarelie H. Davel, Febe de Wet. 1898-1901 [doi]

Automatic estimation of transcription accuracy and difficultyBrandon Roy, Soroush Vosoughi, Deb Roy. 1902-1905 [doi]

Creating a linguistic plausibility dataset with non-expert annotatorsBenjamin Lambert, Rita Singh, Bhiksha Raj. 1906-1909 [doi]

Construction and evaluations of an annotated Chinese conversational corpus in travel domain for the language model of speech recognitionXinhui Hu, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura. 1910-1913 [doi]

Building transcribed speech corpora quickly and cheaply for many languagesThad Hughes, Kaisuke Nakajima, Linne Ha, Atul Vasu, Pedro J. Moreno, Mike LeBeau. 1914-1917 [doi]

The CHiME corpus: a resource and a challenge for computational hearing in multisource environmentsHeidi Christensen, Jon Barker, Ning Ma, Phil D. Green. 1918-1921 [doi]

Developing a Chinese L2 speech database of Japanese learners with narrow-phonetic labels for computer assisted pronunciation trainingWen Cao, Dongning Wang, Jinsong Zhang, Ziyu Xiong. 1922-1925 [doi]

How children acquire situation understanding skills?: a developmental analysis utilizing multimodal speech behavior corpusShogo Ishikawa, Shinya Kiriyama, Yoichi Takebayashi, Shigeyoshi Kitazawa. 1926-1929 [doi]

The influence of expertise and efficiency on modality selection strategies and perceived mental effortIna Wechsung, Stefan Schaffer, Robert Schleicher, Anja Naumann, Sebastian Möller. 1930-1933 [doi]

Parameters describing multimodal interaction - definitions and three usage scenariosChristine Kühnel, Benjamin Weiss, Sebastian Möller. 1934-1937 [doi]

Repair strategies on trial: which error recovery do users like best?Alexander Zgorzelski, Alexander Schmitt, Tobias Heinroth, Wolfgang Minker. 1938-1941 [doi]

CRF-based combination of contextual features to improve a posteriori word-level confidence measuresJulien Fayolle, Fabienne Moreau, Christian Raymond, Guillaume Gravier, Patrick Gros. 1942-1945 [doi]

Recognition of spontaneous conversational speech using long short-term memory phoneme predictionsMartin Wöllmer, Florian Eyben, Björn Schuller, Gerhard Rigoll. 1946-1949 [doi]

Improving ASR error detection with non-decoder based featuresThomas Pellegrini, Isabel Trancoso. 1950-1953 [doi]

Phoneme classification and lattice rescoring based on a k-NN approachLadan Golipour, Douglas D. O'Shaughnessy. 1954-1957 [doi]

Online adaptive learning for speech recognition decodingJeff Bilmes, Hui Lin. 1958-1961 [doi]

Improvements of search error risk minimization in Viterbi beam search for speech recognitionTakaaki Hori, Shinji Watanabe, Atsushi Nakamura. 1962-1965 [doi]

Say what? why users choose to speak their web queriesMaryam Kamvar, Doug Beeferman. 1966-1969 [doi]

The effect of audience familiarity on the perception of modified accentJonathan Teutenberg, Catherine I. Watson. 1970-1973 [doi]

On generating Combilex pronunciations via morphological analysisKorin Richmond, Robert A. J. Clark, Susan Fitt. 1974-1977 [doi]

Say it as you mean it - analyzing free user comments in the VOICE awards corpusFlorian Gödde, Sebastian Möller. 1978-1981 [doi]

A new multichannel multi modal dyadic interaction databaseViktor Rozgic, Bo Xiao, Athanasios Katsamanis, Brian R. Baucom, Panayiotis G. Georgiou, Shrikanth S. Narayanan. 1982-1985 [doi]

SEAME: a Mandarin-English code-switching speech corpus in South-East AsiaDau-Cheng Lyu, Tien Ping Tan, Engsiong Chng, Haizhou Li. 1986-1989 [doi]

Relying on critical articulators to estimate vocal tract spectra in an articulatory-acoustic databaseDaniel Felps, Christian Geng, Michael Berger, Korin Richmond, Ricardo Gutierrez-Osuna. 1990-1993 [doi]

Investigating articulatory setting - pauses, ready position, and rest - using real-time MRIVikram Ramanarayanan, Dani Byrd, Louis Goldstein, Shrikanth S. Narayanan. 1994-1997 [doi]

Articulatory inversion of American English /ɹ/ by conditional density modesChao Qin, Miguel Á. Carreira-Perpiñán. 1998-2001 [doi]

Can tongue be recovered from face? the answer of data-driven statistical modelsAtef Ben Youssef, Pierre Badin, Gérard Bailly. 2002-2005 [doi]

Phrase-medial vowel devoicing in spontaneous FrenchFrancisco Torreira, Mirjam Ernestus. 2006-2009 [doi]

Exploring the mechanism of tonal contraction in Taiwan MandarinChierh Cheng, Yi Xu, Michele Gubian. 2010-2013 [doi]

Voice attributes affecting likability perceptionBenjamin Weiss, Felix Burkhardt. 2014-2017 [doi]

Turn-alignment using eye-gaze and speech in conversational interactionKristiina Jokinen, Kazuaki Harada, Masafumi Nishida, Seiichi Yamamoto. 2018-2021 [doi]

An investigation of formant frequencies for cognitive load classificationTet Fei Yap, Julien Epps, Eliathamby Ambikairajah, Eric H. C. Choi. 2022-2025 [doi]

Language specific effects of emotion on phoneme durationMartijn Goudbeek, Mirjam Broersma. 2026-2029 [doi]

Automatic classification of married couples' behavior using audio featuresMatthew Black, Athanasios Katsamanis, Chi-Chun Lee, Adam C. Lammert, Brian R. Baucom, Andrew Christensen, Panayiotis G. Georgiou, Shrikanth S. Narayanan. 2030-2033 [doi]

Influence of gestural salience on the interpretation of spoken requestsGideon Kowadlo, Patrick Ye, Ingrid Zukerman. 2034-2037 [doi]

Robust word recognition using articulatory trajectories and gesturesVikramjit Mitra, Hosung Nam, Carol Y. Espy-Wilson, Elliot Saltzman, Louis Goldstein. 2038-2041 [doi]

Performance estimation of noisy speech recognition considering recognition task complexityTakeshi Yamada, Tomohiro Nakajima, Nobuhiko Kitawaki, Shoji Makino. 2042-2045 [doi]

Estimating noise from noisy speech features with a Monte Carlo variant of the expectation maximization algorithmFriedrich Faubel, Dietrich Klakow. 2046-2049 [doi]

Template-based spectral estimation using microphone array for speech recognitionSatoshi Tamura, Eriko Hishikawa, Wataru Taguchi, Satoru Hayamizu. 2050-2053 [doi]

A particle filter feature compensation approach to robust speech recognitionAleem Mushtaq, Yu Tsao, Chin-Hui Lee. 2054-2057 [doi]

Nonlinear enhancement of onset for robust speech recognitionChanwoo Kim, Richard M. Stern. 2058-2061 [doi]

Mask estimation in non-stationary noise environments for missing feature based robust speech recognitionShirin Badiezadegan, Richard C. Rose. 2062-2065 [doi]

Robust automatic speech recognition with decoder oriented ideal binary mask estimationLae-Hoon Kim, Kyung Tae Kim, Mark Hasegawa-Johnson. 2066-2069 [doi]

A robust speech recognition system against the ego noise of a robotGökhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Hiroshi Tsujino, Jun-ichi Imura. 2070-2073 [doi]

Empirical mode decomposition for noise-robust automatic speech recognitionKuo-Hao Wu, Chia-Ping Chen. 2074-2077 [doi]

An effective feature compensation scheme tightly matched with speech recognizer employing SVM-based GMM generationWooil Kim, Jun-Won Suh, John H. L. Hansen. 2078-2081 [doi]

Artificial and online acquired noise dictionaries for noise robust ASRJort F. Gemmeke, Tuomas Virtanen. 2082-2085 [doi]

Voice activity detection based on conditional random fields using multiple featuresAkira Saito, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda. 2086-2089 [doi]

A comparative study of noise estimation algorithms for VTS-based robust speech recognitionYong Zhao, Biing-Hwang Juang. 2090-2093 [doi]

On using missing-feature theory with cepstral features - approximations to the multivariate integralFrank Seide, Pei Zhao. 2094-2097 [doi]

Using a DBN to integrate sparse classification and GMM-based ASRYang Sun, Jort F. Gemmeke, Bert Cranen, Louis ten Bosch, Lou Boves. 2098-2101 [doi]

Transcript-dependent speaker recognition using Mixer 1 and 2Fred S. Richardson, Joseph P. Campbell. 2102-2105 [doi]

On the potential of glottal signatures for speaker recognitionThomas Drugman, Thierry Dutoit. 2106-2109 [doi]

Acoustic feature diversity and speaker verificationR. Padmanabhan, Hema A. Murthy. 2110-2113 [doi]

A discriminative performance metric for GMM-UBM speaker identificationOmid Dehzangi, Bin Ma, Engsiong Chng, Haizhou Li. 2114-2117 [doi]

A novel speaker binary key derived from anchor modelsXavier Anguera, Jean-François Bonastre. 2118-2121 [doi]

Variant time-frequency cepstral features for speaker recognitionWeiqiang Zhang, Yan Deng, Liang He, Jia Liu. 2122-2125 [doi]

Exploitation of phase information for speaker recognitionNing Wang, P. C. Ching, Tan Lee. 2126-2129 [doi]

Effects of the phonological relevance in speaker verificationYanhua Long, Li-Rong Dai, Bin Ma, Wu Guo. 2130-2133 [doi]

Topological representation of speech for speaker recognitionGabriel H. Sierra, Jean-François Bonastre, Driss Matrouf, José R. Calvo. 2134-2137 [doi]

Assessment of single-channel speech enhancement techniques for speaker identification under mismatched conditionsSeyed Omid Sadjadi, John H. L. Hansen. 2138-2141 [doi]

Speaker recognition using the resynthesized speech via spectrum modelingXiang Zhang, Chuan Cao, Lin Yang, Hongbin Suo, Jianping Zhang, YongHong Yan. 2142-2145 [doi]

Shape-invariant speech transformation with the phase vocoderAxel Röbel. 2146-2149 [doi]

A phonetic alternative to cross-language voice conversion in a text-dependent context: evaluation of speaker identityKayoko Yanagisawa, Mark Huckvale. 2150-2153 [doi]

Evaluation of speaker mimic technology for personalizing SGD voicesEsther Klabbers, Alexander Kain, Jan P. H. van Santen. 2154-2157 [doi]

Adaptive voice-quality control based on one-to-many eigenvoice conversionKumi Ohta, Tomoki Toda, Yamato Ohtani, Hiroshi Saruwatari, Kiyohiro Shikano. 2158-2161 [doi]

Applying voice conversion to concatenative singing-voice synthesisFernando Villavicencio, Jordi Bonada. 2162-2165 [doi]

Improved generation of fundamental frequency in HMM-based speech synthesis using generation process modelMiaomiao Wang, Miaomiao Wen, Keikichi Hirose, Nobuaki Minematsu. 2166-2169 [doi]

A hierarchical F0 modeling method for HMM-based speech synthesisMing Lei, Yi-Jian Wu, Frank K. Soong, Zhen-Hua Ling, Li-Rong Dai. 2170-2173 [doi]

Training a parametric-based logF0 model with the minimum generation error criterionJavier Latorre, Mark J. F. Gales, Heiga Zen. 2174-2177 [doi]

Improving Mandarin segmental duration prediction with automatically extracted syntax featuresMiaomiao Wen, Miaomiao Wang, Keikichi Hirose, Nobuaki Minematsu. 2178-2181 [doi]

An intonation model for TTS in SepediDaniel R. van Niekerk, Etienne Barnard. 2182-2185 [doi]

Synthesis of fast speech with interpolation of adapted HSMMs and its evaluation by blind and sighted listenersMichael Pucher, Dietmar Schabus, Junichi Yamagishi. 2186-2189 [doi]

A comparison of pronunciation modeling approaches for HMM-TTSGabriel Webster, Sacha Krstulovic, Kate Knill. 2190-2193 [doi]

HMM-based text-to-articulatory-movement prediction and analysis of critical articulatorsZhen-Hua Ling, Korin Richmond, Junichi Yamagishi. 2194-2197 [doi]

Audio-based sports highlight detection by Fourier local auto-correlationsJiaxing Ye, Takumi Kobayashi, Tetsuya Higuchi. 2198-2201 [doi]

Automatic excitement-level detection for sports highlights generationHynek Boril, Abhijeet Sangwan, Taufiq Hasan, John H. L. Hansen. 2202-2205 [doi]

Detecting novel objects in acoustic scenes through classifier incongruenceJörg-Hendrik Bach, Jörn Anemüller. 2206-2209 [doi]

A multidomain approach for automatic home environmental sound classificationStavros Ntalampiras, Ilyas Potamitis, Nikos Fakotakis. 2210-2213 [doi]

Content-based advertisement detectionPatrick Cardinal, Vishwa Gupta, Gilles Boulianne. 2214-2217 [doi]

Identification of abnormal audio events based on probabilistic novelty detectionStavros Ntalampiras, Ilyas Potamitis, Nikos Fakotakis. 2218-2221 [doi]

Lightly supervised recognition for automatic alignment of large coherent speech recordingsNorbert Braunschweiler, Mark J. F. Gales, Sabine Buchholz. 2222-2225 [doi]

Incremental diarization of telephone conversationsOshry Ben-Harush, Itshak Lapidot, Hugo Guterman. 2226-2229 [doi]

Audio analytics by template modeling and 1-pass DP based decodingSrikanth Cherla, V. Ramasubramanian. 2230-2233 [doi]

Perceptual wavelet decomposition for speech segmentationMariusz Ziólko, Jakub Galka, Bartosz Ziólko, Tomasz Drwiega. 2234-2237 [doi]

A comparative study of constrained and unconstrained approaches for segmentation of speech signalVenkatesh Keri, Kishore Prahallad. 2238-2241 [doi]

Automatic discriminative measurement of voice onset timeMorgan Sonderegger, Joseph Keshet. 2242-2245 [doi]

Selective gammatone filterbank feature for robust sound event recognitionYi Ren Leng, Tran Huy Dat, Norihide Kitaoka, Haizhou Li. 2246-2249 [doi]

Towards a robust face recognition system using compressive sensingAllen Y. Yang, Zihan Zhou, Yi Ma, Shankar Sastry. 2250-2253 [doi]

Sparse representation features for speech recognitionTara N. Sainath, Bhuvana Ramabhadran, David Nahamoo, Dimitri Kanevsky, Abhinav Sethy. 2254-2257 [doi]

Data selection for language modeling using sparse representationsAbhinav Sethy, Tara N. Sainath, Bhuvana Ramabhadran, Dimitri Kanevsky. 2258-2261 [doi]

Observation uncertainty measures for sparse imputationJort F. Gemmeke, Ulpu Remes, Kalle J. Palomäki. 2262-2265 [doi]

Sparse representations for text categorizationTara N. Sainath, Sameer Maskey, Dimitri Kanevsky, Bhuvana Ramabhadran, David Nahamoo, Julia Hirschberg. 2266-2269 [doi]

Sparse auto-associative neural networks: theory and application to speech recognitionGarimella S. V. S. Sivaram, Sriram Ganapathy, Hynek Hermansky. 2270-2273 [doi]

FSM-based pronunciation modeling using articulatory phonological codeChi Hu, Xiaodan Zhuang, Mark Hasegawa-Johnson. 2274-2277 [doi]

Detailed pronunciation variant modeling for speech transcriptionDenis Jouvet, Dominique Fohr, Irina Illina. 2278-2281 [doi]

A minimum classification error approach to pronunciation variation modeling of non-native proper namesLine Adde, Bert Réveil, Jean-Pierre Martens, Torbjørn Svendsen. 2282-2285 [doi]

Acoustics-based phonetic transcription method for proper nounsAntoine Laurent, Sylvain Meignier, Téva Merlin, Paul Deléglise. 2286-2289 [doi]

Wiktionary as a source for automatic pronunciation extractionTim Schlippe, Sebastian Ochs, Tanja Schultz. 2290-2293 [doi]

Learning new word pronunciations from spoken examplesIbrahim Badr, Ian McGraw, James R. Glass. 2294-2297 [doi]

Phonetic subspace mixture model for speaker diarizationI-Fan Chen, Shih-Sian Cheng, Hsin-Min Wang. 2298-2301 [doi]

Overlap detection for speaker diarization by fusing spectral and spatial featuresMartin Zelenák, Carlos Segura, Javier Hernando. 2302-2305 [doi]

Floor holder detection and end of speaker turn prediction in meetingsAlfred Dielmann, Giulia Garau, Hervé Bourlard. 2306-2309 [doi]

Confidence measures for speaker segmentation and their relation to speaker verificationCarlos Vaquero, Alfonso Ortega, Jesús A. Villalba, Antonio Miguel, Eduardo Lleida. 2310-2313 [doi]

Decoupling session variability modelling and speaker characterisationAnthony Larcher, Christophe Lévy, Driss Matrouf, Jean-François Bonastre. 2314-2317 [doi]

Incorporating MAP estimation and covariance transform for SVM based speaker recognitionCheung Chi Leung, Donglai Zhu, Kong-Aik Lee, Bin Ma, Haizhou Li. 2318-2321 [doi]

Single-speaker/multi-speaker co-channel speech classificationStéphane Rossignol, Olivier Pietquin. 2322-2325 [doi]

Discriminative training for hierarchical clustering in speaker diarizationOriol Vinyals, Gerald Friedland, Nelson Morgan. 2326-2329 [doi]

GMM-UBM based open-set online speaker diarizationJürgen T. Geiger, Frank Wallhoff, Gerhard Rigoll. 2330-2333 [doi]

A segment-based non-parametric approach for monophone recognitionLadan Golipour, Douglas D. O'Shaughnessy. 2334-2337 [doi]

A fast one-pass-training feature selection technique for GMM-based acoustic event detection with audio-visual dataTaras Butko, Climent Nadeu. 2338-2341 [doi]

Effects of modelling within- and between-frame temporal variations in power spectra on non-verbal sound recognitionNobuhide Yamakawa, Tetsuro Kitahara, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. 2342-2345 [doi]

On the importance of glottal flow spectral energy for the recognition of emotions in speechLing He, Margaret Lech, Nicholas Allen. 2346-2349 [doi]

Real-life emotion-related states detection in call centers: a cross-corpora studyLaurence Devillers, Christophe Vaudable, Clément Chastagnol. 2350-2353 [doi]

Multi-class and hierarchical SVMs for emotion recognitionAli Hassan, Robert I. Damper. 2354-2357 [doi]

Determining optimal features for emotion recognition from speech by applying an evolutionary algorithmDavid Hübner, Bogdan Vlasenko, Tobias Grosser, Andreas Wendemuth. 2358-2361 [doi]

Context-sensitive multimodal emotion recognition from speech and facial expression using bidirectional LSTM modelingMartin Wöllmer, Angeliki Metallinou, Florian Eyben, Björn Schuller, Shrikanth S. Narayanan. 2362-2365 [doi]

Data-dependent evaluator modeling and its application to emotional valence classification from speechKartik Audhkhasi, Shrikanth S. Narayanan. 2366-2369 [doi]

Modelling speech line spectral frequencies with Dirichlet mixture modelsZhanyu Ma, Arne Leijon. 2370-2373 [doi]

PDF-optimized LSF vector quantization based on beta mixture modelsZhanyu Ma, Arne Leijon. 2374-2377 [doi]

Non-linear predictive vector quantization of feature vectors for distributed speech recognitionJose Enrique Garcia, Alfonso Ortega, Antonio Miguel, Eduardo Lleida. 2378-2381 [doi]

Superwideband extension of G.718 and G.729.1 speech codecsLasse Laaksonen, Mikko Tammi, Vladimir Malenovsky, Tommy Vaillancourt, Mi Suk Lee, Tomofumi Yamanashi, Masahiro Oshikiri, Claude Lamblin, Balázs Kövesi, Lei Miao, Deming Zhang, Jon Gibbs, Holly Francois. 2382-2385 [doi]

A multipulse FEC scheme based on amplitude estimation for CELP codecs over packet networksJosé L. Carmona, Angel M. Gomez, Antonio M. Peinado, José L. Pérez-Córdoba, José A. González. 2386-2389 [doi]

Voice quality evaluation of recent open source codecsAnssi Rämö, Henri Toukomaa. 2390-2393 [doi]

Efficient HMM-based estimation of missing features, with applications to packet loss concealmentBengt J. Borgström, Per Henrik Borgström, Abeer Alwan. 2394-2397 [doi]

Speech inventory based discriminative training for joint speech enhancement and low-rate speech codingXiaoqiang Xiao, Robert M. Nickel. 2398-2401 [doi]

Quality-based playout buffering with FEC for conversational VoIPQipeng Gong, Peter Kabal. 2402-2405 [doi]

Sub-band basis spectrum model for pitch-synchronous log-spectrum and phase based on approximation of sparse codingMasatsune Tamura, Takehiko Kagoshima, Masami Akamine. 2406-2409 [doi]

A multimodal density function estimation approach to formant trackingSundar Harshavardhan, Chandra Sekhar Seelamantula, Thippur V. Sreenivas. 2410-2413 [doi]

Estimation studies of vocal tract shape trajectory using a variable length and lossy Kelly-Lochbaum modelHeikki Rasilo, Unto K. Laine, Okko Johannes Räsänen. 2414-2417 [doi]

Improving back-off models with bag of words and hollow-gramsBenjamin Lecouteux, Raphaël Rubino, Georges Linarès. 2418-2421 [doi]

Study on interaction between entropy pruning and Kneser-Ney smoothingCiprian Chelba, Thorsten Brants, Will Neveitt, Peng Xu. 2422-2425 [doi]

Dynamic language model adaptation using keyword category classificationHitoshi Yamamoto, Ken Hanazawa, Kiyokazu Miki, Koichi Shinoda. 2426-2429 [doi]

Integration of cache-based model and topic dependent class model with soft clustering and soft votingWelly Naptali, Masatoshi Tsuchiya, Seiichi Nakagawa. 2430-2433 [doi]

Conditional models for detecting lambda-functions in a spoken language understanding systemFrédéric Duvert, Renato de Mori. 2434-2437 [doi]

Novel weighting scheme for unsupervised language model adaptation using latent Dirichlet allocationMd. Akmal Haidar, Douglas D. O'Shaughnessy. 2438-2441 [doi]

Automatic speech recognition system channel modelingQun Feng Tan, Kartik Audhkhasi, Panayiotis G. Georgiou, Emil Ettelaie, Shrikanth S. Narayanan. 2442-2445 [doi]

Round-robin discrimination model for reranking ASR hypothesesTakanobu Oba, Takaaki Hori, Atsushi Nakamura. 2446-2449 [doi]

On-the-fly lattice rescoring for real-time automatic speech recognitionHasim Sak, Murat Saraclar, Tunga Güngör. 2450-2453 [doi]

A feature extraction method for automatic speech recognition based on the cochlear nucleusSerajul Haque, Roberto Togneri. 2454-2457 [doi]

A phoneme recognition framework based on auditory spectro-temporal receptive fieldsSamuel Thomas, Kailash Patil, Sriram Ganapathy, Nima Mesgarani, Hynek Hermansky. 2458-2461 [doi]

Perceptual compensation for effects of reverberation in speech identification: a computer model based on auditory efferent processingAmy V. Beeston, Guy J. Brown. 2462-2465 [doi]

Predicting human perception and ASR classification of word-final [t] by its acoustic sub-segmental propertiesBarbara Schuppler, Mirjam Ernestus, Wim A. van Dommelen, Jacques C. Koreman. 2466-2469 [doi]

A speech-in-noise test based on spoken digits: comparison of normal and impaired listeners using a computer modelMatthew Robertson, Guy J. Brown, Wendy Lecluyse, Manasa Panda, Christine M. Tan. 2470-2473 [doi]

Evaluation of bone-conducted ultrasonic hearing-aid regarding transmission of paralinguistic information: a comparison with cochlear implant simulatorTakayuki Kagomiya, Seiji Nakagawa. 2474-2477 [doi]

Challenging the speech intelligibility index: macroscopic vs. microscopic prediction of sentence recognition in normal and hearing-impaired listenersTim Jürgens, Stefan Fredelake, Ralf M. Meyer, Birger Kollmeier, Thomas Brand. 2478-2481 [doi]

Does sentence complexity interfere with intelligibility in noise? Evaluation of the Oldenburg linguistically and audiologically controlled sentence test (OLACS)Verena N. Uslar, Thomas Brand, Mirko Hanke, Rebecca Carroll, Esther Ruigendijk, Cornelia Hamann, Birger Kollmeier. 2482-2485 [doi]

Intelligibility predictions for speech against fluctuating maskerJuan-Pablo Ramirez, Hamed Ketabdar, Alexander Raake. 2486-2489 [doi]

An effect of formant amplitude in vowel perceptionMasashi Ito, Keiji Ohara, Akinori Ito, Masafumi Yano. 2490-2493 [doi]

Functional imaging of brain regions sensitive to communication sounds in primatesChristopher I. Petkov, Benjamin Wilson. 2494-2497 [doi]

Strategies for statistical spoken language understanding with small amount of data - an empirical studyYe-Yi Wang. 2498-2501 [doi]

Investigating multiple approaches for SLU portability to a new languageBassam Jabaian, Laurent Besacier, Fabrice Lefèvre. 2502-2505 [doi]

Learning naturally spoken commands for a robotAnja Austermann, Seiji Yamada, Kotaro Funakoshi, Mikio Nakano. 2506-2509 [doi]

A semi-supervised cluster-and-label approach for utterance classificationAmparo Albalate, Aparna Suchindranath, David Suendermann, Wolfgang Minker. 2510-2513 [doi]

Classifying dialog acts in human-human and human-machine spoken conversationsSilvia Quarteroni, Giuseppe Riccardi. 2514-2517 [doi]

Exploring speaker characteristics for meeting summarizationFei Liu, Yang Liu. 2518-2521 [doi]

Semi-supervised extractive speech summarization via co-training algorithmShasha Xie, Hui Lin, Yang Liu. 2522-2525 [doi]

Extractive summarization using a latent variable modelAsli Çelikyilmaz, Dilek Hakkani-Tür. 2526-2529 [doi]

Hierarchical classification for speech-to-speech translationEmil Ettelaie, Panayiotis G. Georgiou, Shrikanth S. Narayanan. 2530-2533 [doi]

Rapid development of speech translation using consecutive interpretationMatthias Paulik, Alex Waibel. 2534-2537 [doi]

Combining many alignments for speech to speech translationSameer Maskey, Steven J. Rennie, Bowen Zhou. 2538-2541 [doi]

Detecting politeness and efficiency in a cooperative social interactionPaul M. Brunet, Marcela Charfuelan, Roderick Cowie, Marc Schröder, Hastings Donnan, Ellen Douglas-Cowie. 2542-2545 [doi]

Comparing measures of synchrony and alignment in dialogue speech timing with respect to turn-taking activityNick Campbell, Stefan Scherer. 2546-2549 [doi]

Resources for turn competition in overlap in multi-party conversations: speech rate, pausing and durationEmina Kurtic, Guy J. Brown, Bill Wells. 2550-2553 [doi]

Disambiguating the functions of conversational sounds with prosody: the case of yeahKhiet P. Truong, Dirk Heylen. 2554-2557 [doi]

Prosody and voice quality of vocal social signals: the case of dominance in scenario meetingsMarcela Charfuelan, Marc Schröder, Ingmar Steiner. 2558-2561 [doi]

The prosody of Swedish conversational gruntsD. Neiberg, J. Gustafson. 2562-2565 [doi]

Reliable tracking based on speech sample salience of vocal cycle length perturbationsChristophe Mertens, Francis Grenez, Lise Crevier-Buchman, Jean Schoentgen. 2566-2569 [doi]

Longitudinal changes of selected voice source parametersHideki Kasuya, Hajime Yoshida, Satoshi Ebihara, Hiroki Mori. 2570-2573 [doi]

Automatic perceptual categorization of disordered connected speechAli Alpan, Jean Schoentgen, Youri Maryn, Francis Grenez. 2574-2577 [doi]

Kinematic analysis of tongue movement control in spastic dysarthriaHeejin Kim, Panying Rong, Torrey M. Loucks, Mark Hasegawa-Johnson. 2578-2581 [doi]

Pre- and short-term posttreatment vocal functioning in patients with advanced head and neck cancer treated with concomitant chemoradiotherapyIrene Jacobi, Lisette van der Molen, Maya van Rossum, Frans Hilgers. 2582-2585 [doi]

Acoustic analysis of intonation in Parkinson's diseaseJoan K. Y. Ma, Rüdiger Hoffmann. 2586-2589 [doi]

SAFE: a statistical algorithm for F0 estimation for both clean and noisy speechWei Chu, Abeer Alwan. 2590-2593 [doi]

Robust and efficient pitch estimation using an iterative ARMA techniqueJung Ook Hong, Patrick J. Wolfe. 2594-2597 [doi]

Statistical modeling of F0 dynamics in singing voices based on Gaussian processes with multiple oscillation basesYasunori Ohishi, Hirokazu Kameoka, Daichi Mochihashi, Hidehisa Nagano, Kunio Kashino. 2598-2601 [doi]

Applying geometric source separation for improved pitch extraction in human-robot interactionMartin Heckmann, Claudius Gläser, Frank Joublin, Kazuhiro Nakadai. 2602-2605 [doi]

A spectral LF model based approach to voice source parameterisationJohn Kane, Mark Kane, Christer Gobl. 2606-2609 [doi]

Glottal-based analysis of the Lombard effectThomas Drugman, Thierry Dutoit. 2610-2613 [doi]

Hidden logistic linear regression for support vector machine based phone verificationBo Li, Khe Chai Sim. 2614-2617 [doi]

Jointly optimized discriminative features for speech recognitionTim Ng, Bing Zhang, Long Nguyen. 2618-2621 [doi]

Invariant integration features combined with speaker-adaptation methodsFlorian Müller, Alfred Mertins. 2622-2625 [doi]

Multi resolution discriminative models for subvocalic speech recognitionMark Raugas, Vivek Kumar Rangarajan Sridhar, Rohit Prasad, Prem Natarajan. 2626-2629 [doi]

A comparative large scale study of MLP features for Mandarin ASRFabio Valente, Mathew Magimai-Doss, Christian Plahl, Suman V. Ravuri, Wen Wang. 2630-2633 [doi]

Recognizing cochlear implant-like spectrally reduced speech with HMM-based ASR: experiments with MFCCs and PLP coefficientsCong-Thanh Do, Dominique Pastor, Gaël Le Lan, André Goalic. 2634-2637 [doi]

A hybrid approach to online speaker diarizationCarlos Vaquero, Oriol Vinyals, Gerald Friedland. 2638-2641 [doi]

System output combination for improved speaker diarizationSimon Bozonnet, Nicholas W. D. Evans, Xavier Anguera, Oriol Vinyals, Gerald Friedland, Corinne Fredouille. 2642-2645 [doi]

An integrated top-down/bottom-up approach to speaker diarizationSimon Bozonnet, Nicholas W. D. Evans, Corinne Fredouille, Dong Wang, Raphaël Troncy. 2646-2649 [doi]

Advances in fast multistream diarization based on the information bottleneck frameworkDeepu Vijayasenan, Fabio Valente, Hervé Bourlard. 2650-2653 [doi]

Audio-visual synchronisation for speaker diarisationGiulia Garau, Alfred Dielmann, Hervé Bourlard. 2654-2657 [doi]

An improved cluster model selection method for agglomerative hierarchical speaker clustering using incremental Gaussian mixture modelsKyu Jeong Han, Shrikanth S. Narayanan. 2658-2661 [doi]

Dialog prediction for a general model of turn-takingNigel G. Ward, Olac Fuentes, Alejandro Vega. 2662-2665 [doi]

Speaker tracking in an unsupervised speech controlled systemTobias Herbig, Franz Gerl, Wolfgang Minker. 2666-2669 [doi]

MultiBIC: an improved speaker segmentation technique for TV showsPaula Lopez-Otero, Laura Docío Fernández, Carmen García-Mateo. 2670-2673 [doi]

Automatic speech recognition for assistive writing in speech supplemented word predictionJohn-Paul Hosom, Tom Jakobs, Allen Baker, Susan Fager. 2674-2677 [doi]

Viseme-dependent weight optimization for CHMM-based audio-visual speech recognitionAlexey Karpov, Andrey Ronzhin, Konstantin Markov, Milos Zelezný. 2678-2681 [doi]

Audio-visual anticipatory coarticulation modeling by human and machineLouis H. Terry, Karen Livescu, Janet B. Pierrehumbert, Aggelos K. Katsaggelos. 2682-2685 [doi]

Impact of lack of acoustic feedback in EMG-based silent speech recognitionMatthias Janke, Michael Wand, Tanja Schultz. 2686-2689 [doi]

Using prosody to improve Mandarin automatic speech recognitionChong-Jia Ni, Wenju Liu, Bo Xu. 2690-2693 [doi]

A robust audio-visual speech recognition using audio-visual voice activity detectionSatoshi Tamura, Masato Ishikawa, Takashi Hashiba, Shin ichi Takeuchi, Satoru Hayamizu. 2694-2697 [doi]

Efficient manycore CHMM speech recognition for audiovisual and multistream dataDorothea Kolossa, Jike Chong, Steffen Zeiler, Kurt Keutzer. 2698-2701 [doi]

Two-layered audio-visual integration in voice activity detection and automatic speech recognition for robotsTakami Yoshida, Kazuhiro Nakadai. 2702-2705 [doi]

Non-audible murmur recognition based on fusion of audio and visual streamsPanikos Heracleous, Norihiro Hagita. 2706-2709 [doi]

Improved n-gram phonotactic models for language recognitionMohamed Faouzi BenZeghiba, Jean-Luc Gauvain, Lori Lamel. 2710-2713 [doi]

A study of term weighting in phonotactic approach to spoken language recognitionSirinoot Boonsuk, Donglai Zhu, Bin Ma, Atiwong Suchato, Proadpran Punyabukkana, Nattanun Thatphithakkul, Chai Wutiwiwatchai. 2714-2717 [doi]

Exploiting context-dependency and acoustic resolution of universal speech attribute models in spoken language recognitionSabato Marco Siniscalchi, Jeremy Reed, Torbjørn Svendsen, Chin-Hui Lee. 2718-2721 [doi]

Hierarchical multilayer perceptron based language identificationDavid Imseng, Mathew Magimai-Doss, Hervé Bourlard. 2722-2725 [doi]

The NIST 2010 speaker recognition evaluationAlvin F. Martin, Craig S. Greenberg. 2726-2729 [doi]

Bayesian speaker recognition using Gaussian mixture model and Laplace approximationShih-Sian Cheng, I-Fan Chen, Hsin-Min Wang. 2730-2733 [doi]

What else is new than the Hamming window? Robust MFCCs for speaker recognition via multitaperingTomi Kinnunen, Rahim Saeidi, Johan Sandberg, Maria Hansson-Sandsten. 2734-2737 [doi]

Fast computation of speaker characterization vector using MLLR and sufficient statistics in anchor model frameworkAchintya Kumar Sarkar, Srinivasan Umesh. 2738-2741 [doi]

Graph-embedding for speaker recognitionZahi N. Karam, William M. Campbell. 2742-2745 [doi]

A hybrid modeling strategy for GMM-SVM speaker recognition with adaptive relevance factorChang Huai You, Haizhou Li, Kong-Aik Lee. 2746-2749 [doi]

Robust mixture modeling using t-distribution: application to speaker IDSundar Harshavardhan, Thippur V. Sreenivas. 2750-2753 [doi]

A variable frame length and rate algorithm based on the spectral kurtosis measure for speaker verificationChi-Sang Jung, Kyu Jeong Han, Hyunson Seo, Shrikanth S. Narayanan, Hong-Goo Kang. 2754-2757 [doi]

Near field sound source localization based on cross-power spectrum phase analysis with multiple microphonesKohei Hayashida, Masanori Morise, Takanobu Nishiura. 2758-2761 [doi]

A maximum a posteriori sound source localization in reverberant and noisy conditionsJinho Choi, Chang D. Yoo. 2762-2765 [doi]

Multichannel source separation based on source location cue with log-spectral shaping by hidden Markov source modelTomohiro Nakatani, Shoko Araki, Takuya Yoshioka, Masakiyo Fujimoto. 2766-2769 [doi]

A DOA estimation algorithm based on equalization-cancellation theoryDuc Thanh Chau, Junfeng Li, Masato Akagi. 2770-2773 [doi]

Concurrent speaker localization using multi-band position-pitch (m-popi) algorithm with spectro-temporal pre-processingTania Habib, Harald Romsdorfer. 2774-2777 [doi]

On using Gaussian mixture model for double-talk detection in acoustic echo suppressionJi-Hyun Song, Kyu-Ho Lee, Yun-Sik Park, Sang-Ick Kang, Joon-Hyuk Chang. 2778-2781 [doi]

Catalog-based single-channel speech-music separationCemil Demir, A. Taylan Cemgil, Murat Saraclar. 2782-2785 [doi]

Unvoiced speech segregation based on CASA and spectral subtractionKe Hu, DeLiang Wang. 2786-2789 [doi]

Unsupervised sequential organization for cochannel speech separationKe Hu, DeLiang Wang. 2790-2793 [doi]

The INTERSPEECH 2010 paralinguistic challengeBjörn Schuller, Stefan Steidl, Anton Batliner, Felix Burkhardt, Laurence Devillers, Christian A. Müller, Shrikanth S. Narayanan. 2794-2797 [doi]

Age and gender classification from speech using decision level fusion and ensemble based techniquesFlorian Lingenfelser, Johannes Wagner, Thurid Vogt, Jonghwa Kim, Elisabeth André. 2798-2801 [doi]

Level of interest sensing in spoken dialog using multi-level fusion of acoustic and lexical evidenceJe Hun Jeon, Rui Xia, Yang Liu. 2802-2805 [doi]

Fuzzy support vector machines for age and gender classificationPhuoc Nguyen, Trung Le, Dat Tran, Xu Huang, Dharmendra Sharma. 2806-2809 [doi]

Gender and affect recognition based on GMM and GMM-UBM modeling with relevance MAP estimationRok Gajsek, Janez Zibert, Tadej Justin, Vitomir Struc, Bostjan Vesnicer, France Mihelic. 2810-2813 [doi]

Age recognition based on speech signals using weights supervectorRoyi Porat, Dan Lange, Yaniv Zigel. 2814-2817 [doi]

Age and gender classification using fusion of acoustic and prosodic featuresHugo Meinedo, Isabel Trancoso. 2818-2821 [doi]

Brno University of Technology system for INTERSPEECH 2010 paralinguistic challengeMarcel Kockmann, Lukas Burget, Jan Cernocký. 2822-2825 [doi]

Combining five acoustic level modeling methods for automatic speaker age and gender recognitionMing Li, Chi-Sang Jung, Kyu Jeong Han. 2826-2829 [doi]

Age and gender recognition based on multiple systems - early vs. late fusionTobias Bocklet, Georg Stemmer, Viktor Zeißler, Elmar Nöth. 2830-2833 [doi]

Automatic speaker age and gender recognition in the car for tailoring dialog and mobile servicesMichael Feld, Felix Burkhardt, Christian A. Müller. 2834-2837 [doi]

Improved topic classification and keyword discovery using an HMM-based speech recognizer trained without supervisionMan-Hung Siu, Herbert Gish, Arthur Chan, William Belfield. 2838-2841 [doi]

An analysis of sparseness and regularization in exemplar-based methods for speech classificationDimitri Kanevsky, Tara N. Sainath, Bhuvana Ramabhadran, David Nahamoo. 2842-2845 [doi]

Investigation of full-sequence training of deep belief networks for speech recognitionAbdel-rahman Mohamed, Dong Yu, L. Deng. 2846-2849 [doi]

Mandarin tone recognition using affine-invariant prosodic features and tone posteriorgramYow-Bang Wang, Lin-Shan Lee. 2850-2853 [doi]

Continuous speech recognition with a TF-IDF acoustic modelGeoffrey Zweig, Patrick Nguyen, Jasha Droppo, Alex Acero. 2854-2857 [doi]

SCARF: a segmental conditional random field toolkit for speech recognitionGeoffrey Zweig, Patrick Nguyen. 2858-2861 [doi]

Online SLU model adaptation with a partial oraclePierre Gotab, Géraldine Damnati, Frédéric Béchet, Lionel Delphin-Poulat. 2862-2865 [doi]

Role of language models in spoken fluency evaluationOm Deshmukh, Harish Doddala, Ashish Verma, Karthik Visweswariah. 2866-2869 [doi]

Social role discovery from spoken language using dynamic Bayesian networksSibel Yaman, Dilek Hakkani-Tür, Gökhan Tür. 2870-2873 [doi]

Domain adaptation and compensation for emotion detectionMichelle Hewlett Sanchez, Gökhan Tür, Luciana Ferrer, Dilek Hakkani-Tür. 2874-2877 [doi]

Phrase alignment confidence for statistical machine translationSankaranarayanan Ananthakrishnan, Rohit Prasad, Prem Natarajan. 2878-2881 [doi]

Named-entity projection and data-driven morphological decomposition for field maintainable speech-to-speech translation systemsIan R. Lane, Alex Waibel. 2882-2885 [doi]

Acoustic correlates of voice quality improvement by voice trainingKiyoaki Aikawa, Junko Uenuma, Tomoko Akitake. 2886-2889 [doi]

Phonetic segmentation of singing voice using MIDI and parallel speechMinghui Dong, Paul Chan, Ling Cen, Haizhou Li, Jason Teo, Ping Jen Kua. 2890-2893 [doi]

A singing style modeling system for singing voice synthesizersKeijiro Saino, Makoto Tachibana, Hideki Kenmochi. 2894-2897 [doi]

A fast query by humming system based on notesJingzhou Yang, Jia Liu, Weiqiang Zhang. 2898-2901 [doi]

Melody pitch estimation based on range estimation and candidate extraction using harmonic structure modelSeokhwan Jo, Sihyun Joo, Chang D. Yoo. 2902-2905 [doi]

Modified spatial audio object coding scheme with harmonic extraction and elimination structure for interactive audio serviceJiHoon Park, Kwang-Ki Kim, Jeongil Seo, Minsoo Hahn. 2906-2909 [doi]

Modelling the effect of speaker familiarity and noise on infant word recognitionChristina Bergmann, Michele Gubian, Lou Boves. 2910-2913 [doi]

Unsupervised learning of vowels from continuous speech based on self-organized phoneme acquisition modelKouki Miyazawa, Hideaki Kikuchi, Reiko Mazuka. 2914-2917 [doi]

Learning speaker normalization using semisupervised manifold alignmentAndrew R. Plummer, Mary E. Beckman, Mikhail Belkin, Eric Fosler-Lussier, Benjamin Munson. 2918-2921 [doi]

Fully unsupervised word learning from continuous speech using transitional probabilities of atomic acoustic eventsOkko Johannes Räsänen. 2922-2925 [doi]

Language acquisition and cross-modal associations: computational simulation of the result of infant studiesLouis ten Bosch, Lou Boves. 2926-2929 [doi]

Active word learning under uncertain input conditionsMaarten Versteegh, Louis ten Bosch, Lou Boves. 2930-2933 [doi]

Parallel training of neural networks for speech recognitionKarel Veselý, Lukas Burget, Frantisek Grézl. 2934-2937 [doi]

The use of sense in unsupervised training of acoustic models for ASR systemsRita Singh, Benjamin Lambert, Bhiksha Raj. 2938-2941 [doi]

Boosted mixture learning of Gaussian mixture HMMs for speech recognitionJun Du, Yu Hu, Hui Jiang. 2942-2945 [doi]

On the exploitation of hidden Markov models and linear dynamic models in a hybrid decoder architecture for continuous speech recognitionVolker Leutnant, Reinhold Haeb-Umbach. 2946-2949 [doi]

Context dependent modelling approaches for hybrid speech recognizersAlberto Abad, Thomas Pellegrini, Isabel Trancoso, João Paulo Neto. 2950-2953 [doi]

A regularized discriminative training method of acoustic models derived by minimum relative entropy discriminationYotaro Kubo, Shinji Watanabe, Atsushi Nakamura, Tetsunori Kobayashi. 2954-2957 [doi]

Decision tree state clustering with word and syllable featuresHank Liao, Christopher Alberti, Michiel Bacchiani, Olivier Siohan. 2958-2961 [doi]

A duration modeling technique with incremental speech rate normalizationHiroshi Fujimura, Takashi Masuko, Mitsuyoshi Tachimori. 2962-2965 [doi]

Long short-term memory networks for noise robust speech recognitionMartin Wöllmer, Yang Sun, Florian Eyben, Björn Schuller. 2966-2969 [doi]

One-model speech recognition and synthesis based on articulatory movement HMMsTsuneo Nitta, Takayuki Onoda, Masashi Kimura, Yurie Iribe, Kouichi Katsurada. 2970-2973 [doi]

Acoustic modeling with bootstrap and restructuring for low-resourced languagesXiaodong Cui, Jian Xue, Pierre L. Dognin, Upendra V. Chaudhari, Bowen Zhou. 2974-2977 [doi]

Lecture speech recognition by combining word graphs of various acoustic modelsTetsuo Kosaka, Keisuke Goto, Takashi Ito, Masaharu Katoh. 2978-2981 [doi]

Semi-parametric trajectory modelling using temporally varying feature mapping for speech recognitionKhe Chai Sim, Shilin Liu. 2982-2985 [doi]

Deep-structured hidden conditional random fields for phonetic recognitionDong Yu, Li Deng. 2986-2989 [doi]

Semi-supervised learning for improved expression of uncertainty in discriminative classifiersJonathan Malkin, Jeff A. Bilmes. 2990-2993 [doi]

Modeling posterior probabilities using the linear exponential familyPeder A. Olsen, Vaibhava Goel, Charles A. Micchelli, John R. Hershey. 2994-2997 [doi]

New technique to enhance the performance of spoken dialogue systems based on dialogue states-dependent language models and grammatical rulesRamón López-Cózar, David Griol. 2998-3001 [doi]

A stochastic finite-state transducer approach to spoken dialog managementLluís F. Hurtado, Joaquin Planells, Encarna Segarra, Emilio Sanchis, David Griol. 3002-3005 [doi]

Enhanced monitoring tools and online dialogue optimisation merged into a new spoken dialogue system design experienceRomain Laroche, Philippe Bretier, Ghislain Putois. 3006-3009 [doi]

Optimising a handcrafted dialogue system designRomain Laroche, Ghislain Putois, Philippe Bretier. 3010-3013 [doi]

Utterance selection for speech acts in a cognitive tourguide scenarioFelix Putze, Tanja Schultz. 3014-3017 [doi]

Lexical entrainment of real users in the Let's Go spoken dialog systemGabriel Parent, Maxine Eskenazi. 3018-3021 [doi]

Combining user intention and error modeling for statistical dialog simulatorsSilvia Quarteroni, Meritxell González, Giuseppe Riccardi, Sebastian Varges. 3022-3025 [doi]

Parallel processing of interruptions and feedback in companions affective dialogue systemJaakko Hakulinen, Markku Turunen, Raul Santos de la Camara, Nigel Crook. 3026-3029 [doi]

Dynamic language modeling using Bayesian networks for spoken dialog systemsAntoine Raux, Neville Mehta, Deepak Ramachandran, Rakesh Gupta. 3030-3033 [doi]

Automatic detection of task-incompleted dialog for spoken dialog system based on dialog act n-gramSunao Hara, Norihide Kitaoka, Kazuya Takeda. 3034-3037 [doi]

Dialogue act detection in error-prone spoken dialogue systems using partial sentence tree and latent dialogue act matrixWei-Bin Liang, Chung-Hsien Wu, Yu-Cheng Hsiao. 3038-3041 [doi]

Detection of hot spots in poster conversations based on reactive tokens of audienceTatsuya Kawahara, Kouhei Sumi, Zhi-Qiang Chang, Katsuya Takanashi. 3042-3045 [doi]

Psychological evaluation of a group communication activation robot in a party gameYoichi Matsuyama, Shinya Fujie, Hikaru Taniyama, Tetsunori Kobayashi. 3046-3049 [doi]

Analyzing user utterances in barge-in-able spoken dialogue system for improving identification accuracyKyoko Matsuyama, Kazunori Komatani, Ryu Takeda, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno. 3050-3053 [doi]

Pitch similarity in the vicinity of backchannelsMattias Heldner, Jens Edlund, Julia Hirschberg. 3054-3057 [doi]

A rule-based backchannel prediction model using pitch and pause informationKhiet P. Truong, Ronald Poppe, Dirk Heylen. 3058-3061 [doi]

Combining text categorization and dialog modeling for speaker role identification on call center conversationsRémi Lavalley, Chloé Clavel, Patrice Bellot, Marc El-Bèze. 3062-3065 [doi]

Topic-dependent n-gram models based on optimization of context lengths in LDAAkira Nakamura, Satoru Hayamizu. 3066-3069 [doi]

Expectations for discourse genre identification: a prosodic studyNicolas Obin, Volker Dellwo, Anne Lacheret, Xavier Rodet. 3070-3073 [doi]

Dialogue act tagging and segmentation with a single perceptronRamón Granell, Stephen G. Pulman, Carlos D. Martínez-Hinarejos, José-Miguel Benedí. 3074-3077 [doi]

Improving the readability of class lecture ASR results using a confusion networkYasuhisa Fujii, Kazumasa Yamamoto, Seiichi Nakagawa. 3078-3081 [doi]

Toward detecting voice activity employing soft decision in second-order conditional MAPSang-Kyun Kim, Jae Hun Choi, Sang-Ick Kang, Ji-Hyun Song, Joon-Hyuk Chang. 3082-3085 [doi]

Voice activity detection in a regularized reproducing kernel Hilbert spaceXugang Lu, Masashi Unoki, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura. 3086-3089 [doi]

A new VAD framework using statistical model and human knowledge based empirical ruleJi Wu, Xiao-lei Zhang, Wei Li. 3090-3093 [doi]

Adaptive high accuracy approaches to speech activity detection in noisy and hostile audio environmentsMark C. Huggins, Brett Y. Smolenski, Aaron D. Lawson. 3094-3097 [doi]

Robust voice activity detection in stereo recording with crosstalkPrasanta Kumar Ghosh, Andreas Tsiartas, Panayiotis G. Georgiou, Shrikanth S. Narayanan. 3098-3101 [doi]

Voice activity detection using frame-wise model re-estimation method based on Gaussian pruning with weight normalizationMasakiyo Fujimoto, Shinji Watanabe, Tomohiro Nakatani. 3102-3105 [doi]

Spectral entropy-based voice activity detector for videoconferencing systemsBowon Lee, Debargha Mukherjee. 3106-3109 [doi]

The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithmsDavid Dean, Sridha Sridharan, Robert Vogt, Michael Mason. 3110-3113 [doi]

A Bayesian approach to voice activity detection using multiple statistical models and discriminative trainingTao Yu, John H. L. Hansen. 3114-3117 [doi]

Noise robust voice activity detection using features extracted from the time-domain autocorrelation functionHouman Ghaemmaghami, Brendan Baker, Robert Vogt, Sridha Sridharan. 3118-3121 [doi]

VAD-measure-embedded decoder with online model adaptationTasuku Oonishi, Koji Iwano, Sadaoki Furui. 3122-3125 [doi]

Robust statistical voice activity detection using a likelihood ratio sign testShiwen Deng, Jiqing Han. 3126-3129 [doi]

Automatic turn segmentation in spoken conversationsAlexei V. Ivanov, Giuseppe Riccardi. 3130-3133 [doi]

Turn taking-based conversation detection by using DOA estimationYohei Kawaguchi, Masahito Togami, Yasunari Obuchi. 3134-3137 [doi]
