INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, August 25-29, 2013

Frédéric Bimbot, Christophe Cerisara, Cécile Fougeron, Guillaume Gravier, Lori Lamel, François Pellegrino, Pascal Perrier, editors, INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, August 25-29, 2013. ISCA, 2013. [doi]

Conference: interspeech2013

Information retrieval-based dynamic time warping. Xavier Anguera. 1-5 [doi]

On the computation of document frequency statistics from spoken corpora using factor automata. Dogan Can, Shrikanth Narayanan. 6-10 [doi]

Acceleration of spoken term detection using a suffix array by assigning optimal threshold values to sub-keywords. Kouichi Katsurada, Seiichi Miura, Kheang Seng, Yurie Iribe, Tsuneo Nitta. 11-14 [doi]

Strategies for high accuracy keyword detection in noisy channels. Arindam Mandal, Julien van Hout, Yik-Cheung Tam, Vikramjit Mitra, Yun Lei, Jing Zheng, Dimitra Vergyri, Luciana Ferrer, Martin Graciarena, Andreas Kathol, Horacio Franco. 15-19 [doi]

On the calibration and fusion of heterogeneous spoken term detection systems. Alberto Abad, Luis Javier Rodríguez-Fuentes, Mikel Peñagarikano, Amparo Varona, Germán Bordel. 20-24 [doi]

Intensive acoustic models constructed by integrating low-occurrence models for spoken term detection. Shiro Narumi, Kazuma Konno, Takuya Nakano, Yoshiaki Itoh, Kazunori Kojima, Masaaki Ishigame, Kazuyo Tanaka, Shi-wook Lee. 25-28 [doi]

Using phonetic feature extraction to determine optimal speech regions for maximising the effectiveness of glottal source analysis. John Kane, Irena Yanushevskaya, John Dalton, Christer Gobl, Ailbhe Ní Chasaide. 29-33 [doi]

Beyond bandlimited sampling of speech spectral envelope imposed by the harmonic structure of voiced sounds. Hideki Kawahara, Masanori Morise, Tomoki Toda, Ryuichi Nisimura, Toshio Irino. 34-38 [doi]

A source-filter based adaptive harmonic model and its application to speech prosody modification. JeeSok Lee, Frank K. Soong, Hong-Goo Kang. 39-43 [doi]

Detection of glottal opening instants using Hilbert envelope. K. Ramesh, S. R. Mahadeva Prasanna, D. Govind. 44-48 [doi]

Robust formant detection using group delay function and stabilized weighted linear prediction. Dhananjaya N. Gowda, Jouni Pohjalainen, Mikko Kurimo, Paavo Alku. 49-53 [doi]

A source-filter separation algorithm for voiced sounds based on an exact anticausal/causal pole decomposition for the class of periodic signals. Thomas Hézard, Thomas Hélie, Boris Doval. 54-58 [doi]

Parallel absolute-relative feature based phonotactic language recognition. Weiwei Liu, Weiqiang Zhang, Zhiyi Li, Jia Liu. 59-63 [doi]

Dimensionality reduction of phone log-likelihood ratio features for spoken language recognition. Mireia Díez, Amparo Varona, Mikel Peñagarikano, Luis Javier Rodríguez-Fuentes, Germán Bordel. 64-68 [doi]

Improvements in language identification on the RATS noisy speech corpus. Jeff Z. Ma, Bing Zhang, Spyros Matsoukas, Sri Harish Reddy Mallidi, Feipeng Li, Hynek Hermansky. 69-73 [doi]

Regularized subspace n-gram model for phonotactic ivector extraction. Mehdi Soufifar, Lukás Burget, Oldrich Plchot, Sandro Cumani, Jan Cernocký. 74-78 [doi]

Foreign accent detection from spoken Finnish using i-vectors. Hamid Behravan, Ville Hautamäki, Tomi Kinnunen. 79-83 [doi]

Adaptive Gaussian backend for robust language identification. Mitchell McLaren, Aaron Lawson, Yun Lei, Nicolas Scheffer. 84-88 [doi]

Lattice-based training of bottleneck feature extraction neural networks. Matthias Paulik. 89-93 [doi]

Modular combination of deep neural networks for acoustic modeling. Jonas Gehring, Wonkyum Lee, Kevin Kilgour, Ian R. Lane, Yajie Miao, Alex Waibel. 94-98 [doi]

Informative spectro-temporal bottleneck features for noise-robust speech recognition. Shuo-Yiin Chang, Nelson Morgan. 99-103 [doi]

A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR. Zhi-Jie Yan, Qiang Huo, Jian Xu. 104-108 [doi]

Improved feature processing for deep neural networks. Shakti P. Rath, Daniel Povey, Karel Veselý, Jan Cernocký. 109-113 [doi]

Deep vs. wide: depth on a budget for robust speech recognition. Oriol Vinyals, Nelson Morgan. 114-118 [doi]

An early case of "VOT". Angelika Braun. 119-122 [doi]

Pitch pattern variations in three regional varieties of American English. Robert Allen Fox, Ewa Jacewicz, Jessica Hart. 123-127 [doi]

Fine-grain voice strength estimation from vowel spectral cues. Jean-Sylvain Liénard, Claude Barras. 128-132 [doi]

Linking loudness increases in normal and Lombard speech to decreasing vowel formant separation. Elizabeth Godoy, Catherine Mayo, Yannis Stylianou. 133-137 [doi]

Three-dimensional rectangular vocal-tract model with asymmetric wall impedances. Kunitoshi Motoki. 138-142 [doi]

Quasi closed phase analysis for glottal inverse filtering. Manu Airaksinen, Brad H. Story, Paavo Alku. 143-147 [doi]

The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism. Björn Schuller, Stefan Steidl, Anton Batliner, Alessandro Vinciarelli, Klaus R. Scherer, Fabien Ringeval, Mohamed Chetouani, Felix Weninger, Florian Eyben, Erik Marchi, Marcello Mortillaro, Hugues Salamin, Anna Polychroniou, Fabio Valente, Samuel Kim. 148-152 [doi]

Non-linguistic vocalisation recognition based on hybrid GMM-SVM approach. Artur Janicki. 153-157 [doi]

Characteristic contours of syllabic-level units in laughter. Jieun Oh, Eunjoon Cho, Malcolm Slaney. 158-162 [doi]

Detection of nonverbal vocalizations using Gaussian mixture models: looking for fillers and laughter in conversational speech. Teun F. Krikke, Khiet P. Truong. 163-167 [doi]

Using phonetic patterns for detecting social cues in natural conversations. Johannes Wagner, Florian Lingenfelser, Elisabeth André. 168-172 [doi]

Paralinguistic event detection from speech using probabilistic time-series smoothing and masking. Rahul Gupta, Kartik Audhkhasi, Sungbok Lee, Shrikanth Narayanan. 173-177 [doi]

Detecting laughter and filled pauses using syllable-based features. Gouzhen An, David-Guy Brizan, Andrew Rosenberg. 178-181 [doi]

Classifying language-related developmental disorders from speech cues: the promise and the potential confounds. Daniel Bone, Theodora Chaspari, Kartik Audhkhasi, James Gibson, Andreas Tsiartas, Maarten Van Segbroeck, Ming Li, Sungbok Lee, Shrikanth Narayanan. 182-186 [doi]

Classification of developmental disorders from speech signals using submodular feature selection. Katrin Kirchhoff, Yuzong Liu, Jeff Bilmes. 187-190 [doi]

Robust and accurate features for detecting and diagnosing autism spectrum disorders. Meysam Asgari, Alireza Bayestehtashk, Izhak Shafran. 191-194 [doi]

Suprasegmental information modelling for autism disorder spectrum and specific language impairment classification. David Martínez González, Dayana Ribas, Eduardo Lleida, Alfonso Ortega, Antonio Miguel. 195-199 [doi]

Let me finish: automatic conflict detection using speaker overlap. Félix Grèzes, Justin Richards, Andrew Rosenberg. 200-204 [doi]

GMM based speaker variability compensated system for INTERSPEECH 2013 ComParE emotion challenge. Vidhyasaharan Sethu, Julien Epps, Eliathamby Ambikairajah, Haizhou Li. 205-209 [doi]

Random subset feature selection in automatic recognition of developmental disorders, affective states, and level of conflict from speech. Okko Räsänen, Jouni Pohjalainen. 210-214 [doi]

Ensemble of machine learning and acoustic segment model techniques for speech emotion and autism spectrum disorders recognition. Hung-yi Lee, Ting-Yao Hu, How Jing, Yun-Fan Chang, Yu Tsao, Yu-Cheng Kao, Tsang-Long Pao. 215-219 [doi]

Detecting autism, emotions and social signals using AdaBoost. Gábor Gosztolya, Róbert Busa-Fekete, László Tóth. 220-224 [doi]

Resistance is futile - the intonation between continuation rise and calling contour in German. Oliver Niebuhr. 225-229 [doi]

The influence of F0 contour continuity on prominence perception. Hansjörg Mixdorff, Oliver Niebuhr. 230-234 [doi]

Native English listeners' perceptions of prosody in L1 and L2 reading. Caroline L. Smith, Paul Edmunds. 235-238 [doi]

Naturalness judgement of L2 Mandarin Chinese - does timing matter? Chiharu Tsurutani, Dean Luo. 239-242 [doi]

Language background affects the strength of the pitch bias in a duration discrimination task. Daniel Aalto, Juraj Simko, Martti Vainio. 243-247 [doi]

Pitch and lengthening as cues to turn transition in Swedish. Margaret Zellers. 248-252 [doi]

Perception of glottalization in varying pitch contexts across languages. Maria Paola Bissiri, Margaret Zellers. 253-257 [doi]

Exemplar-based pitch accent categorisation using the generalized context model. Michael Walsh, Katrin Schweitzer, Nadja Schauffler. 258-262 [doi]

Double contrast is signalled by prenuclear and nuclear accent types alone, not by f0-plateaux. Bettina Braun, Yuki Asano. 263-266 [doi]

Word stress perception in European Portuguese. Susana Correia, Sónia Frota, Joseph Butler, Marina Vigário. 267-271 [doi]

Using generalized additive models and random forests to model prosodic prominence in German. Denis Arnold, Petra Wagner, R. Harald Baayen. 272-276 [doi]

Perceiving speech rate differences between natural and time-scale modified utterances. Hartmut R. Pfitzinger, Hansjörg Mixdorff. 277-281 [doi]

On the robustness of some acoustic parameters for signalling word stress across styles in Brazilian Portuguese. Plínio A. Barbosa, Anders Eriksson, Joel Åkesson. 282-286 [doi]

Reexamine the sandhi rules and the merging tones in Hakka language. Shao-Ren Lyu, Ho-hsien Pan. 287-290 [doi]

A preliminary spectral analysis of palatal and velar stop bursts in Pitjantjatjara. Marija Tabain, Richard Beare, Andrew Butcher. 291-295 [doi]

Presentational focus realisation in Nalbaria variety of Assamese. Shakuntala Mahanta, A. I. Twaha. 296-299 [doi]

On the relation between intonational phrasing and pitch accent distribution. Evidence from European Portuguese varieties. Marisa Cruz, Sónia Frota. 300-304 [doi]

How are word-final schwas different in the north and south of France? Rena Nemoto, Martine Adda-Decker. 305-309 [doi]

Modeling postcolonial language varieties: challenges and lessons learned from Mozambican Portuguese. Simone Ashby, Sílvia Barbosa, Catarina Silva, Paulino Fumo, José Pedro Ferreira. 310-314 [doi]

Prosody of contrastive focus in Estonian. Heete Sahkai, Mari-Liis Kalvik, Meelis Mihkla. 315-319 [doi]

Exploring the connection of acoustic and distinctive features. Thomas Kisler, Uwe D. Reichel. 320-324 [doi]

A physiological analysis of the tense/lax vowel contrast in two varieties of German. Conceição Cunha, Jonathan Harrington, Phil Hoole. 325-329 [doi]

Production of Estonian quantity contrasts by native speakers of Finnish. Einar Meister, Lya Meister. 330-334 [doi]

Aerodynamic and durational cues of phonological voicing in whisper. Yohann Meynadier, Yulia Gaydina. 335-339 [doi]

Information theoretic syllable structure and its relation to the c-center effect. Uwe D. Reichel. 340-344 [doi]

The Bulgarian stressed and unstressed vowel system. A corpus study. Bistra Andreeva, William J. Barry, Jacques C. Koreman. 345-348 [doi]

Training an articulatory synthesizer with continuous acoustic data. Santitham Prom-on, Peter Birkholz, Yi Xu. 349-353 [doi]

Estimating speaker-specific intonation patterns using the linear alignment model. Géza Kiss, Jan P. H. van Santen. 354-358 [doi]

Factored maximum likelihood kernelized regression for HMM-based singing voice synthesis. June Sig Sung, Doo Hwa Hong, Hyun Woo Koo, Nam Soo Kim. 359-363 [doi]

Improvements to HMM-based speech synthesis based on parameter generation with rich context models. Shinnosuke Takamichi, Tomoki Toda, Yoshinori Shiga, Sakriani Sakti, Graham Neubig, Satoshi Nakamura. 364-368 [doi]

Voice conversion in high-order eigen space using deep belief nets. Toru Nakashika, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki. 369-372 [doi]

Voice conversion for non-parallel datasets using dynamic kernel partial least squares regression. Hanna Silén, Jani Nurminen, Elina Helander, Moncef Gabbouj. 373-377 [doi]

A style control technique for singing voice synthesis based on multiple-regression HSMM. Takashi Nose, Misa Kanemoto, Tomoki Koriyama, Takao Kobayashi. 378-382 [doi]

Predicting the quality of text-to-speech systems from a large-scale feature set. Florian Hinterleitner, Christoph Norrenbrock, Sebastian Möller, Ulrich Heute. 383-387 [doi]

Speaker-specific retraining for enhanced compression of unit selection text-to-speech databases. Jani Nurminen, Hanna Silén, Moncef Gabbouj. 388-391 [doi]

Avatar therapy: an audio-visual dialogue system for treating auditory hallucinations. Mark Huckvale, Julian Leff, Geoff Williams. 392-396 [doi]

Optimizations and fitting procedures for the Liljencrants-Fant model for statistical parametric speech synthesis. Prasanna Kumar Muthukumar, Alan W. Black, H. Timothy Bunnell. 397-401 [doi]

Analysis and modeling of "focus" in context. Dirk Hovy, Gopala Krishna Anumanchipalli, Alok Parlikar, Caroline Vaughn, Adam C. Lammert, Eduard H. Hovy, Alan W. Black. 402-406 [doi]

Production and perception of pseudo-V1CV2 outside the vowel triangle: speech illusion effects. Thi Anh Xuan Tran, Viet Son Nguyen, Eric Castelli, René Carré. 407-411 [doi]

Recent evolution of non-standard consonantal variants in French broadcast news. Maria Candea, Martine Adda-Decker, Lori Lamel. 412-416 [doi]

Architekt or archtekt? Perception of devoiced vowels produced by Japanese speakers of German. Frank Zimmerer, Rei Yasuda, Henning Reetz. 417-420 [doi]

Comparing vowel category response surfaces over age-varying maximal vowel spaces within and across language communities. Andrew R. Plummer, Lucie Ménard, Benjamin Munson, Mary E. Beckman. 421-425 [doi]

Perceived vocal attractiveness across dialects is similar but not uniform. Molly Babel, Grant McGuire. 426-430 [doi]

Mutual intelligibility of American, Chinese and Dutch-accented speakers of English tested by SUS and SPIN sentences. Hongyan Wang, Vincent J. van Heuven. 431-435 [doi]

Speech enhancement based on deep denoising autoencoder. Xugang Lu, Yu Tsao, Shigeki Matsuda, Chiori Hori. 436-440 [doi]

Musical noise analysis for Bayesian minimum mean-square error speech amplitude estimators based on higher-order statistics. Hiroshi Saruwatari, Suzumi Kanehara, Ryoichi Miyazaki, Kiyohiro Shikano, Kazunobu Kondo. 441-445 [doi]

Non-negative matrix factorization with linear constraints for single-channel speech enhancement. Nikolay Lyubimov, Mikhail Kotov. 446-450 [doi]

A single channel speech enhancement approach by combining statistical criterion and multi-frame sparse dictionary learning. Hung-Wei Tseng, Srikanth Vishnubhotla, Mingyi Hong, Xiangfeng Wang, Jinjun Xiao, Zhi-Quan Luo, Tao Zhang. 451-455 [doi]

Speech enhancement using convolutive nonnegative matrix factorization with cosparsity regularization. Majid Mirbagheri, Yanbo Xu, Sahar Akram, Shihab A. Shamma. 456-459 [doi]

Joint stochastic-deterministic Wiener filtering with recursive Bayesian estimation of deterministic speech. Matthew McCallum, Bernard J. Guillemin. 460-464 [doi]

Automatic self-supervised learning of associations between speech and text. Juha Knuuttila, Okko Räsänen, Unto K. Laine. 465-469 [doi]

Particle swarm optimisation of spoken dialogue system strategies. Lucie Daubigney, Matthieu Geist, Olivier Pietquin. 470-474 [doi]

Model-based Bayesian reinforcement learning for dialogue management. Pierre Lison. 475-479 [doi]

Evaluating spoken dialogue models under the interactive pattern recognition framework. Fabrizio Ghigi, M. Inés Torres, Raquel Justo, José-Miguel Benedí. 480-484 [doi]

Multi-layer mutually reinforced random walk with hidden parameters for improved multi-party meeting summarization. Yun-Nung Chen, Florian Metze. 485-489 [doi]

A recursive dialogue game framework with optimal policy offering personalized computer-assisted language learning. Pei-hao Su, Yow-Bang Wang, Tsung-Hsien Wen, Tien-han Yu, Lin-Shan Lee. 490-494 [doi]

Improving LVCSR with hidden conditional random fields for grapheme-to-phoneme conversion. Stefan Hahn, Patrick Lehnen, Simon Wiesler, Ralf Schlüter, Hermann Ney. 495-499 [doi]

Context-dependent phone mapping for LVCSR of under-resourced languages. Van Hai Do, Xiong Xiao, Engsiong Chng, Haizhou Li. 500-504 [doi]

Improving grapheme-based ASR by probabilistic lexical modeling approach. Ramya Rasipuram, Mathew Magimai-Doss. 505-509 [doi]

Crosslingual tandem-SGMM: exploiting out-of-language data for acoustic model and feature level adaptation. Petr Motlícek, David Imseng, Philip N. Garner. 510-514 [doi]

Multilingual multilayer perceptron for rapid language adaptation between and across language families. Ngoc Thang Vu, Tanja Schultz. 515-519 [doi]

Modeling prosodic sequences with k-means and Dirichlet process GMMs. Andrew Rosenberg. 520-524 [doi]

Convergence of articulation rate in spontaneous speech. Antje Schweitzer, Natalie Lewandowski. 525-529 [doi]

Phonetic convergence in shadowed speech: a comparison of perceptual and acoustic measures. Jennifer S. Pardo. 530-534 [doi]

Pitch and duration as a basis for entrainment of overlapped speech onsets. Marcin Wlodarczak, Juraj Simko, Petra Wagner. 535-538 [doi]

Investigating fine temporal dynamics of prosodic and lexical accommodation. Francesca Bonin, Céline De Looze, Sucheta Ghosh, Emer Gilmartin, Carl Vogel, Anna Polychroniou, Hugues Salamin, Alessandro Vinciarelli, Nick Campbell. 539-543 [doi]

Spontaneous and explicit speech imitation. Jeesun Kim, Ruben Demirdjian, Chris Davis. 544-547 [doi]

Imitation interacts with one's second-language phonology but it does not operate cross-linguistically. Václav Jonás Podlipský, Sárka Simácková, Katerina Chládková. 548-552 [doi]

Prosodic markings of semantic predictability in Taiwan Mandarin. Po-jen Hsieh. 553-557 [doi]

How did it work? Historic phonetic devices explained by coeval photographs. Rüdiger Hoffmann, Dieter Mehnert, Rolf Dietzel. 558-562 [doi]

Eliciting speech with sentence lists - a critical evaluation with special emphasis on segmental anchoring. Lea S. Kohtz, Oliver Niebuhr. 563-567 [doi]

An MRI-based acoustic study of Mandarin vowels. Yuguang Wang, Jianwu Dang, Xi Chen, Jianguo Wei, Hongcui Wang, Kiyoshi Honda. 568-571 [doi]

Melody metrics for prosodic typology: comparing English, French and Chinese. Daniel Hirst. 572-576 [doi]

Velic coordination in French nasals: a real-time magnetic resonance imaging study. Michael I. Proctor, Louis Goldstein, Adam C. Lammert, Dani Byrd, Asterios Toutios, Shrikanth Narayanan. 577-581 [doi]

Learning to imitate adult speech with the KLAIR virtual infant. Mark Huckvale, Amrita Sharma. 582-586 [doi]

Physics-based synthesis of disordered voices. Jorge C. Lucero, Jean Schoentgen, Mara Behlau. 587-591 [doi]

Place assimilation and articulatory strategies: the case of sibilant sequences in French as L1 and L2. Sonia D'Apolito, Barbara Gili Fivela. 592-596 [doi]

Effects of lexical class and lemma frequency on German homographs. Barbara Samlowski, Petra Wagner, Bernd Möbius. 597-601 [doi]

Measuring laryngealization in running speech: interaction with contrastive tones in Yalálag Zapotec. Leonardo Lancia, Heriberto Avelino, Daniel Voigt. 602-606 [doi]

A neural oscillator model of speech timing and rhythm. Erin Rusaw. 607-611 [doi]

Observations of perseverative coarticulation in lateral approximants using MRI. Nicole Wong, Maojing Fu, Zhi-Pei Liang, Ryan Shosted, Bradley P. Sutton. 612-616 [doi]

Comparing computation in Gaussian mixture and neural network based large-vocabulary speech recognition. Vishwa Gupta, Gilles Boulianne. 617-621 [doi]

Simultaneous perturbation stochastic approximation for automatic speech recognition. Daniel Stein, Jochen Schwenninger, Michael Stadtschnitzer. 622-626 [doi]

Hardware/software codesign for mobile speech recognition. David Sheffield, Michael J. Anderson, Yunsup Lee, Kurt Keutzer. 627-631 [doi]

Exploiting the succeeding words in recurrent neural network language models. Yangyang Shi, Martha Larson, Pascal Wiggers, Catholijn M. Jonker. 632-636 [doi]

Speech acoustic unit segmentation using hierarchical Dirichlet processes. Amir Hossein Harati Nejad Torbati, Joseph Picone, Marc Sobel. 637-641 [doi]

Transducer-based speech recognition with dynamic language models. Munir Georges, Stephan Kanthak, Dietrich Klakow. 642-646 [doi]

A method for structure estimation of weighted finite-state transducers and its application to grapheme-to-phoneme conversion. Yotaro Kubo, Takaaki Hori, Atsushi Nakamura. 647-651 [doi]

Combining forward-based and backward-based decoders for improved speech recognition performance. Denis Jouvet, Dominique Fohr. 652-656 [doi]

ivector-based acoustic data selection. Olivier Siohan, Michiel Bacchiani. 657-661 [doi]

Accurate and compact large vocabulary speech recognition on mobile devices. Xin Lei, Andrew Senior, Alexander Gruenstein, Jeffrey Sorensen. 662-665 [doi]

Pre-initialized composition for large-vocabulary speech recognition. Cyril Allauzen, Michael Riley. 666-670 [doi]

Speaker dependent activation keyword detector based on GMM-UBM. Evelyn Kurniawati, Sapna George. 671-674 [doi]

Written-domain language modeling for automatic speech recognition. Hasim Sak, Yun-Hsuan Sung, Françoise Beaufays, Cyril Allauzen. 675-679 [doi]

Detecting words in speech using linear separability in a bag-of-events vector space. Maarten Versteegh, Louis ten Bosch. 680-684 [doi]

On the improvement of multimodal voice activity detection. Matt Burlick, Dimitrios Dimitriadis, Eric Zavesky. 685-689 [doi]

Using linguistic information to detect overlapping speech. Jürgen T. Geiger, Florian Eyben, Nicholas W. D. Evans, Björn Schuller, Gerhard Rigoll. 690-694 [doi]

Incremental acoustic subspace learning for voice activity detection using harmonicity-based features. Jiaxing Ye, Takumi Kobayashi, Masahiro Murakawa, Tetsuya Higuchi. 695-699 [doi]

Endpoint detection using weighted finite state transducer. Hoon Chung, Sung Joo Lee, Yunkeun Lee. 700-703 [doi]

A robust frontend for VAD: exploiting contextual, discriminative and spectral cues of human voice. Maarten Van Segbroeck, Andreas Tsiartas, Shrikanth Narayanan. 704-708 [doi]

All for one: feature combination for highly channel-degraded speech activity detection. Martin Graciarena, Abeer Alwan, Dan Ellis, Horacio Franco, Luciana Ferrer, John H. L. Hansen, Adam Janin, Byung Suk Lee, Yun Lei, Vikramjit Mitra, Nelson Morgan, Seyed Omid Sadjadi, T. J. Tsai, Nicolas Scheffer, Lee Ngee Tan, Benjamin Williams. 709-713 [doi]

Superposed speech localisation using frequency tracking. Maxime Le Coz, Julien Pinquier, Régine André-Obrecht. 714-717 [doi]

Multi-band long-term signal variability features for robust voice activity detection. Andreas Tsiartas, Theodora Chaspari, Nassos Katsamanis, Prasanta Kumar Ghosh, Ming Li, Maarten Van Segbroeck, Alexandros Potamianos, Shrikanth Narayanan. 718-722 [doi]

A low-complexity voice activity detector for smart hearing protection of hyperacusic persons. Narimene Lezzoum, Ghyslain Gagnon, Jérémie Voix. 723-727 [doi]

Speech activity detection on YouTube using deep neural networks. Neville Ryant, Mark Liberman, Jiahong Yuan. 728-731 [doi]

Speaker and noise independent voice activity detection. François Germain, Dennis L. Sun, Gautham J. Mysore. 732-736 [doi]

Confidence-based scoring: a useful diagnostic tool for detection tasks. T. J. Tsai, Adam Janin. 737-741 [doi]

Concurrent processing of voice activity detection and noise reduction using empirical mode decomposition and modulation spectrum analysis. Yasuaki Kanai, Shota Morita, Masashi Unoki. 742-746 [doi]

The Furhat social companion talking head. Samer Al Moubayed, Jonas Beskow, Gabriel Skantze. 747-749 [doi]

Audition: the most important sense for humanoid robots? Rodolphe Gelin, Gabriele Barbieri. 750-751 [doi]

Ultraspeech-player: intuitive visualization of ultrasound articulatory data for speech therapy and pronunciation training. Thomas Hueber. 752-753 [doi]

Laughter modulation: from speech to speech-laugh. Jieun Oh, Ge Wang. 754-755 [doi]

Refr: an open-source reranker framework. Daniel M. Bikel, Keith B. Hall. 756-758 [doi]

Embedding speech recognition to control lights. Alessandro Sosi, Fabio Brugnara, Luca Cristoforetti, Marco Matassoni, Mirco Ravanelli, Maurizio Omologo. 759-760 [doi]

The MUTE silent speech recognition system. Geoffrey S. Meltzner, James T. Heaton, Yunbin Deng. 761-763 [doi]

The Edinburgh speech production facility DoubleTalk corpus. James M. Scobbie, Alice Turk, Christian Geng, Simon King, Robin J. Lickley, Korin Richmond. 764-766 [doi]

Lexee: a cloud-based platform for building and deploying voice-enabled mobile applications. Dmitry Sityaev, Jonathan Hotz, Vadim Snitkovsky. 767-769 [doi]

Visualizing articulatory data with VisArtico. Slim Ouni. 770-772 [doi]

A tool to elicit and collect multicultural and multimodal laughter. Mariette Soury, Clément Gossart, Martine Adda-Decker, Laurence Devillers. 773-774 [doi]

Design of a mobile app for Interspeech conferences: towards an open tool for the spoken language community. Robert Schleicher, Tilo Westermann, Jinjin Li, Moritz Lawitschka, Benjamin Mateev, Ralf Reichmuth, Sebastian Möller. 775-777 [doi]

The acoustics of word stress in Swedish: a function of stress level, speaking style and word accent. Anders Eriksson, Plínio A. Barbosa, Joel Åkesson. 778-782 [doi]

Intonational contrasts encode speaker's certainty in neutral vs. incredulity declarative questions in French. Amandine Michelas, Cristel Portes, Maud Champagne-Lavau. 783-787 [doi]

Prosodic changes pre-announcing a syntactic completion point in Japanese utterance. Yuichi Ishimoto, Mika Enomoto, Hitoshi Iida. 788-792 [doi]

Prosodic encoding of declarative, interrogative and imperative sentences in Jaminjung, a language of Australia. Candide Simard. 793-797 [doi]

Crosslinguistic priming in interactive reference: evidence for conceptual alignment in speech production. Anne Vullinghs, Martijn Goudbeek, Emiel Krahmer. 798-802 [doi]

A cross-linguistic study on turn-taking and temporal alignment in verbal interaction. Spyros Kousidis, David Schlangen, Stavros Skopeteas. 803-807 [doi]

Discriminative nonnegative dictionary learning using cross-coherence penalties for single channel source separation. Emad M. Grais, Hakan Erdogan. 808-812 [doi]

Monaural speech segregation based on pitch track correction using an ensemble Kalman filter. Han-Gyu Kim, Gil-Jin Jang, Jeong-Sik Park, Yung-Hwan Oh. 813-816 [doi]

Voice activity classification for automatic bi-speaker adaptive beamforming in speech separation. Ngoc Thuy Tran, William G. Cowley, André Pollok. 817-821 [doi]

Blind source separation using spatially distributed microphones based on microphone-location dependent source activities. Keisuke Kinoshita, Mehrez Souden, Tomohiro Nakatani. 822-826 [doi]

Non-negative tensor factorisation of modulation spectrograms for monaural sound source separation. Tom Barker, Tuomas Virtanen. 827-831 [doi]

Iterative sinusoidal-based partial phase reconstruction in single-channel source separation. Mario Kaoru Watanabe, Pejman Mowlaee. 832-836 [doi]

Classification of speech under stress by modeling the aerodynamics of the laryngeal ventricle. Xiao Yao, Takatoshi Jitsuhiro, Chiyomi Miyajima, Norihide Kitaoka, Kazuya Takeda. 837-841 [doi]

"Sure, I did the right thing": a system for sarcasm detection in speech. Rachel Rakov, Andrew Rosenberg. 842-846 [doi]

Investigating voice quality as a speaker-independent indicator of depression and PTSD. Stefan Scherer, Giota Stratou, Jonathan Gratch, Louis-Philippe Morency. 847-851 [doi]

A corpus-based study of elderly and young speakers of European Portuguese: acoustic correlates and their impact on speech recognition performance. Thomas Pellegrini, Annika Hämäläinen, Philippe Boula de Mareüil, Michael Tjalve, Isabel Trancoso, Sara Candeias, Miguel Sales Dias, Daniela Braga. 852-856 [doi]

Modeling spectral variability for the classification of depressed speech. Nicholas Cummins, Julien Epps, Vidhyasaharan Sethu, Michael Breakspear, Roland Goecke. 857-861 [doi]

Sentiment analysis of online spoken reviews. Verónica Pérez-Rosas, Rada Mihalcea. 862-866 [doi]

Using twin-HMM-based audio-visual speech enhancement as a front-end for robust audio-visual speech recognition. Ahmed Hussen Abdelaziz, Steffen Zeiler, Dorothea Kolossa. 867-871 [doi]

Spectro-temporal directional derivative features for automatic speech recognition. James Gibson, Maarten Van Segbroeck, Antonio Ortega, Panayiotis G. Georgiou, Shrikanth Narayanan. 872-875 [doi]

Attribute-based histogram equalization (HEQ) and its adaptation for robust speech recognition. Xiong Xiao, Engsiong Chng, Haizhou Li. 876-880 [doi]

Modified cepstral mean normalization - transforming to utterance specific non-zero mean. Vikas Joshi, N. Vishnu Prasad, Srinivasan Umesh. 881-885 [doi]

Damped oscillator cepstral coefficients for robust speech recognition. Vikramjit Mitra, Horacio Franco, Martin Graciarena. 886-890 [doi]

Regularized MVDR spectrum estimation-based robust feature extractors for speech recognition. Md. Jahangir Alam, Patrick Kenny, Douglas D. O'Shaughnessy. 891-895 [doi]

Optimization of sigmoidal rate-level function based on acoustic features. Víctor Poblete, Néstor Becerra Yoma, Richard M. Stern. 896-900 [doi]

Composing auditory ERPs: cross-linguistic comparison of auditory change complex for Japanese fricative consonants. Makiko Sadakata, Loukianos Spyrou, Mizuki Shingai, Kaoru Sekiyama. 901-905 [doi]

How voicing, place and manner of articulation differently modulate event-related potentials associated with response inhibition. Nathalie Bedoin, Jennifer Krzonowski, Emmanuel Ferragne. 906-910 [doi]

Categorization of speech in early auditory evoked responses. Ludovic Bellier, Michel Mazzuca, Hung Thai-Van, Anne Caclin, Rafael Laboissière. 911-915 [doi]

Perception and production of Italian vowels: an ERP study. Anna Dora Manca, Mirko Grimaldi. 916-920 [doi]

Implicit learning leads to familiarity effects for intonation but not for voice. Ann-Kathrin Grohe, Bettina Braun. 921-924 [doi]

Spoofing and countermeasures for automatic speaker verification. Nicholas Evans, Tomi Kinnunen, Junichi Yamagishi. 925-929 [doi]

I-vectors meet imitators: on vulnerability of speaker verification systems against voice mimicry. Rosa González Hautamäki, Tomi Kinnunen, Ville Hautamäki, Timo Leino, Anne-Maria Laukkanen. 930-934 [doi]

Security evaluation of i-vector based speaker verification systems against hill-climbing attacks. Marta Gomez-Barrero, Javier Gonzalez-Dominguez, Javier Galbally, Joaquin Gonzalez-Rodriguez. 935-939 [doi]

A new speaker verification spoofing countermeasure based on local binary patterns. Federico Alegre, Ravichander Vipperla, Asmaa Amehraye, Nicholas W. D. Evans. 940-944 [doi]

Voice transformation-based spoofing of text-dependent speaker verification systems. Zvi Kons, Hagai Aronowitz. 945-949 [doi]

Vulnerability evaluation of speaker verification under voice conversion spoofing: the effect of text constraints. Zhizheng Wu, Anthony Larcher, Kong-Aik Lee, Engsiong Chng, Tomi Kinnunen, Haizhou Li. 950-954 [doi]

Timing differences in articulation between voiced and voiceless stop consonants: an analysis of cine-MRI dataMasako Fujimoto, Tatsuya Kitamura, Hiroaki Hatano, Ichiro Fujimoto. 955-958 [doi]

Vocal tract cross-distance estimation from real-time MRI using region-of-interest analysisAdam C. Lammert, Vikram Ramanarayanan, Michael I. Proctor, Shrikanth Narayanan. 959-962 [doi]

Syllable nuclei detection using perceptually significant featuresApoorv Reddy Arrabothu, Nivedita Chennupati, B. Yegnanarayana. 963-967 [doi]

Truncation of pharyngeal gesture in English diphthong [aɪ]Fang-Ying Hsieh, Louis Goldstein, Dani Byrd, Shrikanth Narayanan. 968-972 [doi]

The effect of word frequency and lexical class on articulatory-acoustic couplingZhaojun Yang, Vikram Ramanarayanan, Dani Byrd, Shrikanth Narayanan. 973-977 [doi]

Discrimination between fricative and affricate in Japanese using time and spectral domain variablesKimiko Yamakawa, Shigeaki Amano. 978-981 [doi]

L2 syntax acquisition: the effect of oral and written computer assisted practicePolina Drozdova, Catia Cucchiarini, Helmer Strik. 982-986 [doi]

The physiological use of the charismatic voice in political speechRosario Signorello, Didier Demolin. 987-991 [doi]

Crosslinguistic corpus of hesitation phenomena: a corpus for investigating first and second language speech performanceRalph L. Rose. 992-996 [doi]

Real-time control of a 2D animation model of the vocal tract using optopalatographySimon Preuß, Christiane Neuschaefer-Rube, Peter Birkholz. 997-1001 [doi]

The influence of accentuation and polysyllabicity on compensatory shortening in GermanJessica Siddins, Jonathan Harrington, Felicitas Kleber, Ulrich Reubold. 1002-1006 [doi]

An investigation of vowel epenthesis in Chinese learners' production of German consonantsHongwei Ding, Rüdiger Hoffmann. 1007-1011 [doi]

On the evaluation of inversion mapping performance in the acoustic domainKorin Richmond, Zhen-Hua Ling, Junichi Yamagishi, Benigno Uria. 1012-1016 [doi]

Probabilistic speech F0 contour model incorporating statistical vocabulary model of phrase-accent command sequenceTatsuma Ishihara, Hirokazu Kameoka, Kota Yoshizato, Daisuke Saito, Shigeki Sagayama. 1017-1021 [doi]

Reconstruction of continuous voiced speech from whispersIan Vince McLoughlin, Jingjie Li, Yan Song. 1022-1026 [doi]

Generating fundamental frequency contours for speech synthesis in YorùbáDaniel R. van Niekerk, Etienne Barnard. 1027-1031 [doi]

Real-time voice conversion using artificial neural networks with rectified linear unitsElias Azarov, Maxim Vashkevich, Denis Likhachov, Alexander A. Petrovsky. 1032-1036 [doi]

Generation of fundamental frequency contours for Thai speech synthesis using tone nucleus modelOraphan Krityakien, Keikichi Hirose, Nobuaki Minematsu. 1037-1041 [doi]

Unsupervised speaker and expression factorization for multi-speaker expressive synthesis of ebooksLangzhou Chen, Norbert Braunschweiler. 1042-1046 [doi]

Which resemblance is useful to predict phrase boundary rise labels for Japanese expressive text-to-speech synthesis, numerically-expressed stylistic or distribution-based semantic?Hideharu Nakajima, Hideyuki Mizuno, Osamu Yoshioka, Satoshi Takahashi. 1047-1051 [doi]

A targets-based superpositional model of fundamental frequency contours applied to HMM-based speech synthesisJinfu Ni, Yoshinori Shiga, Chiori Hori, Yutaka Kidawara. 1052-1056 [doi]

An investigation of acoustic features for singing voice conversion based on perceptual ageKazuhiro Kobayashi, Hironori Doi, Tomoki Toda, Tomoyasu Nakano, Masataka Goto, Graham Neubig, Sakriani Sakti, Satoshi Nakamura. 1057-1061 [doi]

Effect of MPEG audio compression on HMM-based speech synthesisBajibabu Bollepalli, Tuomo Raitio, Paavo Alku. 1062-1066 [doi]

Evaluation of a singing voice conversion method based on many-to-many eigenvoice conversionHironori Doi, Tomoki Toda, Tomoyasu Nakano, Masataka Goto, Satoshi Nakamura. 1067-1071 [doi]

Statistical nonparametric speech synthesis using sparse Gaussian processesTomoki Koriyama, Takashi Nose, Takao Kobayashi. 1072-1076 [doi]

Hybrid nearest-neighbor/cluster adaptive training for rapid speaker adaptation in statistical speech synthesis systemsAmir Mohammadi, Cenk Demiroglu. 1077-1081 [doi]

Uniform concatenative excitation model for synthesising speech without voiced/unvoiced classificationJoão P. Cabral. 1082-1086 [doi]

Efficient speech transcription through respeakingMatthias Sperber, Graham Neubig, Christian Fügen, Satoshi Nakamura, Alex Waibel. 1087-1091 [doi]

Annotation and classification of political advertisementsSamuel Kim, Panayiotis G. Georgiou, Shrikanth Narayanan. 1092-1096 [doi]

Using role play for collecting question-answer pairs for dialogue agentsRyuichiro Higashinaka, Kohji Dohsaka, Hideki Isozaki. 1097-1100 [doi]

Individual differences of emotional expression in speaker's behavioral and autonomic responsesYoshiko Arimoto, Kazuo Okanoya. 1101-1105 [doi]

Development and validation of the conversational agents scale (CAS)Ina Wechsung, Benjamin Weiss, Christine Kühnel, Patrick Ehrenbrink, Sebastian Möller. 1106-1110 [doi]

Motivational feedback in crowdsourcing: a case study in speech transcriptionGiuseppe Riccardi, Arindam Ghosh, S. A. Chowdhury, Ali Orkan Bayer. 1111-1115 [doi]

The Sheffield wargames corpusCharles Fox, Yulan Liu, Erich Zwyssig, Thomas Hain. 1116-1120 [doi]

Formalizing expert knowledge for developing accurate speech recognizersAnuj Kumar, Florian Metze, Wenyi Wang, Matthew Kam. 1121-1125 [doi]

Analysis of gaze and speech patterns in three-party quiz game interactionSamer Al Moubayed, Jens Edlund, Joakim Gustafson. 1126-1130 [doi]

Methodologies for the evaluation of speaker diarization and automatic speech recognition in the presence of overlapping speechOlivier Galibert. 1131-1134 [doi]

'Houston, we have a solution': using NASA Apollo program to advance speech and language processing technologyAbhijeet Sangwan, Lakshmish Kaushik, Chengzhu Yu, John H. L. Hansen, Douglas W. Oard. 1135-1139 [doi]

Performance of the MVOCA silent speech interface across multiple speakersRobin Hofe, Jie Bai, Lam A. Cheah, Stephen R. Ell, James M. Gilbert, Roger K. Moore, Phil D. Green. 1140-1143 [doi]

Automatic glottal tracking from high-speed digital images using a continuous normalized cross correlationGustavo Andrade-Miranda, Juan Ignacio Godino-Llorente. 1144-1148 [doi]

Automatic evaluation of Parkinson's speech - acoustic, prosodic and voice related cuesTobias Bocklet, Stefan Steidl, Elmar Nöth, Sabine Skodda. 1149-1153 [doi]

Comparison of approaches for an efficient phonetic decodingLuiza Orosanu, Denis Jouvet. 1154-1158 [doi]

Learning speaker-specific pronunciations of disordered speechHeidi Christensen, Phil D. Green, Thomas Hain. 1159-1163 [doi]

Adapting a speech into sign language translation system to a new domainVerónica López-Ludeña, Rubén San Segundo, Carlos Gonzalez-Morcillo, J. C. López, E. Ferreiro. 1164-1168 [doi]

Assessing the intelligibility impact of vowel space expansion via clear speech-inspired frequency warpingElizabeth Godoy, Maria Koutsogiannaki, Yannis Stylianou. 1169-1173 [doi]

Prediction of intelligibility of noisy and time-frequency weighted speech based on mutual information between amplitude envelopesJesper Jensen, Cees H. Taal. 1174-1178 [doi]

Frequency-adaptive post-filtering for intelligibility enhancement of narrowband telephone speechEmma Jokinen, Marko Takanen, Paavo Alku. 1179-1183 [doi]

Comparative investigation of objective speech intelligibility prediction measures for noise-reduced signals in Mandarin and JapaneseJunfeng Li, Fei Chen, Masato Akagi, YongHong Yan. 1184-1187 [doi]

Monitoring the effects of temporal clipping on VoIP speech qualityAndrew Hines, Jan Skoglund, Anil C. Kokaram, Naomi Harte. 1188-1192 [doi]

The spectral dynamics of vowels in Mandarin ChineseJiahong Yuan. 1193-1197 [doi]

CSLM - a modular open-source continuous space language modeling toolkitHolger Schwenk. 1198-1202 [doi]

Speed up of recurrent neural network language models with sentence independent subsampling stochastic gradient descentYangyang Shi, Mei-Yuh Hwang, Kaisheng Yao, Martha Larson. 1203-1207 [doi]

Improving unsupervised language model adaptation with discriminative data filteringShuangyu Chang, Michael Levit, Partha Parthasarathy, Benoît Dumoulin. 1208-1212 [doi]

Lightly supervised training for risk-based discriminative language modelsAkio Kobayashi, Takahiro Oku, Yuya Fujita, Shoei Sato. 1213-1217 [doi]

Investigation of MT-based ASR confusion models for semi-supervised discriminative language modelingErinç Dikici, Emily Tucker Prud'hommeaux, Brian Roark, Murat Saraçlar. 1218-1222 [doi]

Unsupervised discriminative language modeling using error rate estimatorTakanobu Oba, Atsunori Ogawa, Takaaki Hori, Hirokazu Masataki, Atsushi Nakamura. 1223-1227 [doi]

A region-specific feature-space transformation for speaker adaptation and singularity analysis of Jacobian matrixShakti P. Rath, Lukás Burget, Martin Karafiát, Ondrej Glembek, Jan Cernocký. 1228-1232 [doi]

An explicit independence constraint for factorised adaptation in speech recognitionY. Q. Wang, M. J. F. Gales. 1233-1237 [doi]

Asynchronous factorisation of speaker and background with feature transforms in speech recognitionOscar Saz, Thomas Hain. 1238-1242 [doi]

Cluster adaptive training with factorized decision trees for speech recognitionKai Yu, Hainan Xu. 1243-1247 [doi]

Rapid and effective speaker adaptation of convolutional neural network based models for speech recognitionOssama Abdel Hamid, Hui Jiang. 1248-1252 [doi]

Text-to-speech inspired duration modeling for improved whole-word acoustic modelsKeith Kintzley, Aren Jansen, Hynek Hermansky. 1253-1257 [doi]

Duration of early vocalisationsAdele Gregory, Marija Tabain, Michael Robb. 1258-1262 [doi]

Acoustic development of vowel production in American English childrenJing Yang, Robert Allen Fox. 1263-1267 [doi]

The role of intrinsic motivations in learning sensorimotor vocal mappings: a developmental robotics studyClement Moulin-Frier, Pierre-Yves Oudeyer. 1268-1272 [doi]

Children's timing and repair strategies for communication in adverse listening conditionsValérie Hazan, Michele Pettinato. 1273-1277 [doi]

Speech planning as an index of speech motor control maturityGuillaume Barbier, Pascal Perrier, Lucie Ménard, Yohan Payan, Mark K. Tiede, Joseph S. Perkell. 1278-1282 [doi]

The relationship between gender-differentiated productions of /s/ and gender role behaviour in young childrenMelissa Kinsman, Fangfang Li. 1283-1286 [doi]

Data-driven design of a sentence list for an articulatory speech corpusJeffrey Berry, Luciano Fadiga. 1287-1291 [doi]

Faster 3D vocal tract real-time MRI using constrained reconstructionYinghua Zhu, Asterios Toutios, Shrikanth Narayanan, Krishna S. Nayak. 1292-1296 [doi]

Relevance-weighted-reconstruction of articulatory features in deep-neural-network-based acoustic-to-articulatory mappingClaudia Canevari, Leonardo Badino, Luciano Fadiga, Giorgio Metta. 1297-1301 [doi]

Word frequency, vowel length and vowel quality in speech production: an EMA study of the importance of experienceFabian Tomaschek, Martijn Wieling, Denis Arnold, R. Harald Baayen. 1302-1306 [doi]

Towards a systematic and quantitative analysis of vocal tract dataSamuel S. Silva, António J. S. Teixeira, Catarina Oliveira, Paula Martins. 1307-1311 [doi]

A two-step technique for MRI audio enhancement using dictionary learning and wavelet packet analysisColin Vaz, Vikram Ramanarayanan, Shrikanth Narayanan. 1312-1315 [doi]

Electromagnetic articulography with AG500 and AG501Massimo Stella, Antonio Stella, Francesco Sigona, Paolo Bernardini, Mirko Grimaldi, Barbara Gili Fivela. 1316-1320 [doi]

Development and implementation of fiducial markers for vocal tract MRI imaging and speech articulatory modellingPierre Badin, Julián Andrés Valdés Vargas, Arielle Koncki, Laurent Lamalle, Christophe Savariaux. 1321-1325 [doi]

Functional data analysis of tongue articulation in palatal vowels: Gothenburg and Malmöhus Swedish /iː, yː, ̟ʉː/Susanne Schötz, Johan Frid, Lars Gustafsson, Anders Löfqvist. 1326-1330 [doi]

SMASH: a tool for articulatory data processing and analysisJordan R. Green, Jun Wang, David L. Wilson. 1331-1335 [doi]

Emotion recognition of conversational affective speech using temporal course modelingJen-Chun Lin, Chung-Hsien Wu, Wen-Li Wei. 1336-1340 [doi]

The role of empathy in the recognition of vocal emotionsRene Altrov, Hille Pajupuu, Jaan Pajupuu. 1341-1344 [doi]

Electrophysiological evidence for benefits of imitation during the processing of spoken words embedded in sentential contextsAngèle Brunellière, Sophie Dufour. 1345-1349 [doi]

Compensatory speech response to time-scale altered auditory feedbackRintaro Ogane, Masaaki Honda. 1350-1354 [doi]

Bhattacharyya distance based emotional dissimilarity measure in multi-dimensional space for emotion classificationTin Lay Nwe, Trung Hieu Nguyen, Dilip Kumar Limbu. 1355-1359 [doi]

On the enhancement of dereverberation algorithms based on a perceptual evaluation criterionThiago de M. Prego, Amaro A. de Lima, Sergio L. Netto. 1360-1364 [doi]

Revisiting pitch slope and height effects on perceived durationCarlos Gussenhoven, Wencui Zhou. 1365-1369 [doi]

Adaptation to natural fast speech and time-compressed speech in childrenHélène Guiraud, Emmanuel Ferragne, Nathalie Bedoin, Véronique Boulenger. 1370-1374 [doi]

Modeling durational incompressibilityAndreas Windmann, Juraj Simko, Britta Wrede, Petra Wagner. 1375-1379 [doi]

Perceived prosodic correlates of smiled speech in spontaneous dataCaroline Émond, Lucie Ménard, Marty Laforest. 1380-1383 [doi]

Predicting speech quality based on interactivity and delayAlexander Raake, Katrin Schoenenberg, Janto Skowronek, Sebastian Egger. 1384-1388 [doi]

Perceptual, acoustic and electroglottographic correlates of 3 aggressive attitudes in French: a pilot studyCharlotte Kouklia, Nicolas Audibert. 1389-1393 [doi]

Theme identification in telephone service conversations using quaternions of speech featuresMohamed Morchid, Georges Linarès, Marc El-Bèze, Renato de Mori. 1394-1398 [doi]

Detection of laughter in children's speech using spectral and prosodic acoustic featuresHrishikesh Rao, Jonathan C. Kim, Agata Rozga, Mark A. Clements. 1399-1403 [doi]

Classification of cooperative and competitive overlaps in speech using cues from the context, overlapper, and overlappeeKhiet P. Truong. 1404-1408 [doi]

Annotation and detection of conflict escalation in political debatesSamuel Kim, Fabio Valente, Alessandro Vinciarelli. 1409-1413 [doi]

Machine learning of probabilistic phonological pronunciation rules from the Italian CLIPS corpusFlorian Schiel, Mary Stevens, Uwe D. Reichel, Francesco Cutugno. 1414-1418 [doi]

Human perception of alcoholic intoxication in speechBarbara Baumeister, Florian Schiel. 1419-1423 [doi]

Phonetic manifestation and influence of zero anaphora in Chinese reading textsLuying Hou, Yuan Jia, Aijun Li. 1424-1428 [doi]

Diacritics restoration for Arabic dialect textsS. Harrat, Mourad Abbas, Karima Meftouh, Kamel Smaïli. 1429-1433 [doi]

Effects of talk-spurt silence boundary thresholds on distribution of gaps and overlapsMarcin Wlodarczak, Petra Wagner. 1434-1437 [doi]

Final lengthening in Russian: a corpus-based studyTatiana Kachkovskaia, Nina B. Volskaya, Pavel A. Skrelin. 1438-1442 [doi]

From segmentation bootstrapping to transcription-to-word conversionUwe D. Reichel. 1443-1447 [doi]

Manual and automatic tone annotation: the case of an endangered language from North Vietnam "Mo Piu"Geneviève Caelen-Haumont, Katarina Bartkova. 1448-1452 [doi]

Non-canonical syntactic structures in discourse: tonality, tonicity and tones in English (semi-)spontaneous speechLaetitia Leonarduzzi, Sophie Herment. 1453-1457 [doi]

Prediction of strategy and outcome as negotiation unfolds by using basic verbal and behavioral featuresElnaz Nouri, Sunghyun Park, Stefan Scherer, Jonathan Gratch, Peter Carnevale, Louis-Philippe Morency, David R. Traum. 1458-1461 [doi]

Unsupervised naming of speakers in broadcast TV: using written names, pronounced names or both?Johann Poignant, Laurent Besacier, Viet Bac Le, Sophie Rosset, Georges Quénot. 1462-1466 [doi]

Integer linear programming for speaker diarization and cross-modal identification in TV broadcastHervé Bredin, Johann Poignant. 1467-1471 [doi]

Native accent classification via i-vectors and speaker compensation fusionAndrea DeMarco, Stephen J. Cox. 1472-1476 [doi]

An open-source state-of-the-art toolbox for broadcast news diarizationMickael Rouvier, Grégor Dupuy, Paul Gay, Elie el Khoury, Téva Merlin, Sylvain Meignier. 1477-1481 [doi]

Audio event classification using deep neural networksZvi Kons, Orith Toledo-Ronen. 1482-1486 [doi]

Code-switching event detection based on delta-BIC using phonetic eigenvoice modelsWei-Bin Liang, Chung-Hsien Wu, Chun-Shan Hsu. 1487-1491 [doi]

Automatic estimation of dialect mixing ratio for dialect speech recognitionNaoki Hirayama, Koichiro Yoshino, Katsutoshi Itoyama, Shinsuke Mori, Hiroshi G. Okuno. 1492-1496 [doi]

The Albayzin 2012 language recognition evaluationLuis Javier Rodríguez-Fuentes, Niko Brümmer, Mikel Peñagarikano, Amparo Varona, Germán Bordel, Mireia Díez. 1497-1501 [doi]

TRAP language identification system for RATS phase II evaluationKyu Jeong Han, Sriram Ganapathy, Ming Li, Mohamed Kamal Omar, Shrikanth Narayanan. 1502-1506 [doi]

Improving language identification robustness to highly channel-degraded speech through multiple system fusionAaron Lawson, Mitchell McLaren, Yun Lei, Vikramjit Mitra, Nicolas Scheffer, Luciana Ferrer, Martin Graciarena. 1507-1510 [doi]

Annotation errors detection in TTS corporaJindrich Matousek, Daniel Tihelka. 1511-1515 [doi]

Technique for automatic sentence level alignment of long speech and transcriptsImran Ahmed, Sunil Kumar Kopparapu. 1516-1519 [doi]

Text-to-speech alignment of long recordings using universal phone modelsSarah Hoffmann, Beat Pfister. 1520-1524 [doi]

Lightly supervised discriminative training of grapheme models for improved sentence-level alignment of speech and text dataAdriana Stan, Peter Bell, Junichi Yamagishi, Simon King. 1525-1529 [doi]

Automatic social role recognition in professional meetings using conditional random fieldsAshtosh Sapru, Hervé Bourlard. 1530-1534 [doi]

Same same but different - an acoustical comparison of the automatic segmentation of high quality and mobile telephone speechChristoph Draxler, Hanna S. Feiser. 1535-1539 [doi]

Multi-centroidal duration generation algorithm for HMM-based TTSYongguo Kang, Jian Li, Yan Deng, Miaomiao Wang. 1540-1543 [doi]

Analysis and synthesis of shouted speechTuomo Raitio, Antti Suni, Jouni Pohjalainen, Manu Airaksinen, Martti Vainio, Paavo Alku. 1544-1548 [doi]

Robust estimation of multiple-regression HMM parameters for dimension-based expressive dialogue speech synthesisTomohiro Nagata, Hiroki Mori, Takashi Nose. 1549-1553 [doi]

A new prosody annotation protocol for live sports commentariesSandrine Brognaux, Benjamin Picart, Thomas Drugman. 1554-1558 [doi]

Unsupervised prominence prediction for speech synthesisMahnoosh Mehrabani, Taniya Mishra, Alistair Conkie. 1559-1563 [doi]

Expressive speech synthesis in MARY TTS using audiobook data and EmotionMLMarcela Charfuelan, Ingmar Steiner. 1564-1568 [doi]

Using dialog-activity similarity for spoken information retrievalNigel G. Ward, Steven D. Werner. 1569-1573 [doi]

A hybrid HMM/DNN approach to keyword spotting of short wordsI.-Fan Chen, Chin-Hui Lee. 1574-1578 [doi]

Leveraging locality for topic identification of conversational speechJonathan Wintrode. 1579-1583 [doi]

Person name spotting by combining acoustic matching and LDA topic modelsGrégory Senay, Benjamin Bigot, Richard Dufour, Georges Linarès, Corinne Fredouille. 1584-1588 [doi]

Using phonological phrase segmentation to improve automatic keyword spotting for the highly agglutinating Hungarian languageGyörgy Szaszák, András Beke. 1589-1593 [doi]

Leveraging knowledge graphs for web-scale unsupervised semantic parsingLarry P. Heck, Dilek Hakkani-Tür, Gökhan Tür. 1594-1598 [doi]

Fast and memory effective i-vector extraction using a factorized sub-spaceSandro Cumani, Pietro Laface. 1599-1603 [doi]

Effective estimation of a multi-session speaker model using information on signal parametersKonstantin Simonchik, Andrey Shulipa, Timur Pekhovsky. 1604-1608 [doi]

Automatic regularization of cross-entropy cost for speaker recognition fusionVille Hautamäki, Kong-Aik Lee, David A. van Leeuwen, Rahim Saeidi, Anthony Larcher, Tomi Kinnunen, Taufiq Hasan, Seyed Omid Sadjadi, Gang Liu, Hynek Boril, John H. L. Hansen, Benoit G. B. Fauve. 1609-1613 [doi]

Speaker verification based on fusion of acoustic and articulatory informationMing Li, Jangwon Kim, Prasanta Kumar Ghosh, Vikram Ramanarayanan, Shrikanth Narayanan. 1614-1618 [doi]

The distribution of calibrated likelihood-ratios in speaker recognitionDavid A. van Leeuwen, Niko Brümmer. 1619-1623 [doi]

Eigenageing compensation for speaker verificationFinnian Kelly, Niko Brümmer, Naomi Harte. 1624-1628 [doi]

Effects of mouth-only and whole-face displays on audio-visual speech perception in noise: is the vision of a talker's full face truly the most efficient solution?Grozdana Erjavec, Denis Legros. 1629-1633 [doi]

Acoustic and visual phonetic features in the McGurk effect - an audiovisual speech illusionKaisa Tiippana, Mikko Tiainen, Lari Vainio, Martti Vainio. 1634-1638 [doi]

The effect of visual speech timing and form cues on the processing of speech and nonspeechChris Davis, Jeesun Kim. 1639-1642 [doi]

Effect of context, rebinding and noise, on audiovisual speech fusionGanesh Attigodu Chandrashekara, Frédéric Berthommier, Olha Nahorna, Jean-Luc Schwartz. 1643-1647 [doi]

Social face to face communication - American English attitudinal prosodyAlbert Rilliard, Donna Erickson, Takaaki Shochi, João Antônio de Moraes. 1648-1652 [doi]

Adaptation of respiratory patterns in collaborative readingGérard Bailly, Amélie Rochet-Capellan, Coriandre Vilain. 1653-1657 [doi]

A comparative study of glottal open quotient estimation techniquesJohn Kane, Stefan Scherer, Louis-Philippe Morency, Christer Gobl. 1658-1662 [doi]

Estimation of multiple-branch vocal tract models: the influence of prior assumptionsChristian H. Kasess, Wolfgang Kreuzer. 1663-1667 [doi]

Detecting overlapping speech with long short-term memory recurrent neural networksJürgen T. Geiger, Florian Eyben, Björn Schuller, Gerhard Rigoll. 1668-1672 [doi]

Evaluation of fundamental validity in applying AR-HMM with automatic topology generation to pathology voice analysisAkira Sasou. 1673-1676 [doi]

Significance of instants of significant excitation for source modelingNagaraj Adiga, S. R. M. Prasanna. 1677-1681 [doi]

Significance of variable height-bandwidth group delay filters in the spectral reconstruction of speechDevanshu Arya, Anant Raj, Rajesh M. Hegde. 1682-1686 [doi]

Nonlinear prediction of speech signal using Volterra-Wiener seriesHemant A. Patil, Tanvina B. Patel. 1687-1691 [doi]

Evaluation of speech-based protocol for detection of early-stage dementiaAharon Satt, Alexander Sorin, Orith Toledo-Ronen, Oren Barkan, Ioannis Kompatsiaris, Athina Kokonozi, Magda Tsolaki. 1692-1696 [doi]

Instantaneous harmonic representation of speech using multicomponent sinusoidal excitationElias Azarov, Maxim Vashkevich, Alexander A. Petrovsky. 1697-1701 [doi]

A quantitative comparison of glottal closure instant estimation algorithms on a large variety of singing soundsOnur Babacan, Thomas Drugman, Nicolas D'Alessandro, Nathalie Henrich, Thierry Dutoit. 1702-1706 [doi]

Automatic gender recognition in normal and pathological speechJorge Andrés Gómez García, Juan Ignacio Godino-Llorente, Germán Castellanos-Domínguez. 1707-1711 [doi]

Unsupervised vocal-tract length estimation through model-based acoustic-to-articulatory inversionShanqing Cai, H. Timothy Bunnell, Rupal Patel. 1712-1716 [doi]

Model order estimation using Bayesian NMF for discovering phone patterns in spoken utterancesSayeh Mirzaei, Hugo Van Hamme, Yaser Norouzi. 1717-1721 [doi]

Convolutional deep rectifier neural nets for phone recognitionLászló Tóth. 1722-1726 [doi]

Pitch synchronous spectral analysis for a pitch dependent recognition of voiced phonemes - PISARHans-Günter Hirsch. 1727-1731 [doi]

New parameters for automatic speech recognition based on the mammalian cochlea model using resonance analysisJosé Luis Oropeza Rodríguez. 1732-1736 [doi]

Using an autoencoder with deformable templates to discover features for automated speech recognitionNavdeep Jaitly, Geoffrey E. Hinton. 1737-1740 [doi]

Speaking rate normalization with lattice-based context-dependent phoneme duration modeling for personalized speech recognizers on mobile devicesChing-feng Yeh, Hung-yi Lee, Lin-Shan Lee. 1741-1745 [doi]

Subspace models for bottleneck featuresJun Qi, Dong Wang, Javier Tejedor. 1746-1750 [doi]

Bottleneck features based on gammatone frequency cepstral coefficientsJun Qi, Dong Wang, Ji Xu, Javier Tejedor. 1751-1755 [doi]

Cross-entropy vs. squared error training: a theoretical and experimental comparisonPavel Golik, Patrick Doetsch, Hermann Ney. 1756-1760 [doi]

Acoustic features for detection of phonemic aspiration in voiced plosivesVaishali Patil, Preeti Rao. 1761-1765 [doi]

Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networksDimitri Palaz, Ronan Collobert, Mathew Magimai-Doss. 1766-1770 [doi]

Hierarchical models based on a continuous acoustic space to identify phonological featuresJavier Mikel Olaso, M. Inés Torres. 1771-1775 [doi]

Locality sensitive hashing for fast computation of correlational manifold learning based feature space transformationsVikrant Singh Tomar, Richard C. Rose. 1776-1780 [doi]

Evaluating speech features with the minimal-pair ABX task: analysis of the classical MFC/PLP pipelineThomas Schatz, Vijayaditya Peddinti, Francis Bach, Aren Jansen, Hynek Hermansky, Emmanuel Dupoux. 1781-1785 [doi]

Knowledge integration for improving performance in LVCSRChen-Yu Chiang, Sabato Marco Siniscalchi, Sin-Horng Chen, Chin-Hui Lee. 1786-1790 [doi]

Inter-speaker variability in audio-visual classification of word prominenceMartin Heckmann. 1791-1795 [doi]

Parameter clustering for temporally varying weight regression for automatic speech recognitionShilin Liu, Khe Chai Sim. 1796-1800 [doi]

Phone duration modeling using clustering of rich contextsTanel Alumäe, Rena Nemoto. 1801-1805 [doi]

Human mouth state detection using low frequency ultrasoundFarzaneh Ahmadi, Mousa Ahmadi, Ian Vince McLoughlin. 1806-1810 [doi]

Lexical stress detection for L2 English speech using deep belief networksKun Li, Xiaojun Qian, Shiyin Kang, Helen Meng. 1811-1815 [doi]

MLP-HMM two-stage unsupervised training for low-resource languages on conversational telephone speech recognitionYanmin Qian, Jia Liu. 1816-1820 [doi]

Failure transitions for joint n-gram models and G2P conversionJosef R. Novak, Nobuaki Minematsu, Keikichi Hirose. 1821-1825 [doi]

Generative modeling of speech F0 contoursHirokazu Kameoka, Kota Yoshizato, Tatsuma Ishihara, Yasunori Ohishi, Kunio Kashino, Shigeki Sagayama. 1826-1830 [doi]

G2P variant prediction techniques for ASR and STDMarelie H. Davel, Charl Johannes van Heerden, Etienne Barnard. 1831-1835 [doi]

Rhythm analysis of second-language speech through low-frequency auditory featuresJin Jin, Joseph Tepperman. 1836-1839 [doi]

Graph-based semi-supervised learning for phone and segment classificationYuzong Liu, Katrin Kirchhoff. 1840-1843 [doi]

Selective use of gaze information to improve ASR performance in noisy environments by cache-based class language model adaptationAo Shen, Neil Cooke, Martin Russell. 1844-1848 [doi]

Deep segmental neural networks for speech recognitionOssama Abdel Hamid, Li Deng, Dong Yu, Hui Jiang. 1849-1853 [doi]

Quantifying cross-linguistic variation in grapheme-to-phoneme mappingMartine Coene, Annemiek Hammer, Wojtek Kowalczyk, Louis ten Bosch, Bart Vaerenberg, Paul Govaerts. 1854-1857 [doi]

The speech recognition virtual kitchenFlorian Metze, Eric Fosler-Lussier, Rebecca Bates. 1858-1860 [doi]

Multilingual web conferencing using speech-to-speech translationJohn Chen, Shufei Wen, Vivek Kumar Rangarajan Sridhar, Srinivas Bangalore. 1861-1863 [doi]

ROCme! software for the recording and management of speech corporaEmmanuel Ferragne, Sébastien Flavier, Christian Fressard. 1864-1865 [doi]

Voice search in mobile applications with the rootvole frameworkFelix Burkhardt. 1866-1868 [doi]

On-line audio dilation for human interactionJohn S. Novak III, Jason Archer, Valeriy Shafiro, Robert V. Kenyon, Jason Leigh. 1869-1871 [doi]

Phase-aware single-channel speech enhancementPejman Mowlaee, Mario Kaoru Watanabe, Rahim Saeidi. 1872-1874 [doi]

A free online accent and intonation dictionary for teachers and learners of JapaneseHiroko Hirano, Ibuki Nakamura, Nobuaki Minematsu, Masayuki Suzuki, Chieko Nakagawa, Noriko Nakamura, Yukinori Tagawa, Keikichi Hirose, Hiroya Hashimoto. 1875-1876 [doi]

Reactive accent interpolation through an interactive map applicationMaria Astrinaki, Junichi Yamagishi, Simon King, Nicolas D'Alessandro, Thierry Dutoit. 1877-1878 [doi]

A non-experts user interface for obtaining automatic diagnostic spelling evaluations for learners of the German writing systemKay Berkling. 1879-1881 [doi]

Estimation of interest and comprehension level of audience through multi-modal behaviors in poster conversationsTatsuya Kawahara, Soichiro Hayashi, Katsuya Takanashi. 1882-1885 [doi]

A new DNN-based high quality pronunciation evaluation for computer-aided language learning (CALL)Wenping Hu, Yao Qian, Frank K. Soong. 1886-1890 [doi]

A multi-domain dialog system to integrate heterogeneous spoken dialog systemsJoaquin Planells, Lluís F. Hurtado, Encarna Segarra, Emilio Sanchis. 1891-1895 [doi]

Development and evaluation of spoken dialog systems with one or two agentsYuki Todo, Ryota Nishimura, Kazumasa Yamamoto, Seiichi Nakagawa. 1896-1900 [doi]

User feedback in human-robot interaction: prosody, gaze and timingGabriel Skantze, Catharine Oertel, Anna Hjalmarsson. 1901-1905 [doi]

KPCatcher - a keyphrase extraction system for enterprise videosYongxin Taylor Xi, Matthias Paulik, Venkata Ramana Rao Gadde, Ananth Sankar. 1906-1910 [doi]

Pitch-gesture modeling using subband autocorrelation change detectionMalcolm Slaney, Elizabeth Shriberg, Jui-Ting Huang. 1911-1915 [doi]

Analysis of emotional speech at subsegmental levelP. Gangamohan, Sudarsana Reddy Kadiri, B. Yegnanarayana. 1916-1920 [doi]

Periodicity extraction for voiced sounds with multiple periodicityMasanori Morise, Hideki Kawahara, Kenji Ozawa. 1921-1925 [doi]

Modelling and estimation of the fundamental frequency of speech using a hidden Markov modelJohn H. Taylor, Ben Milner. 1926-1930 [doi]

Extended weighted linear prediction using the autocorrelation snapshot - a robust speech analysis method and its application to recognition of vocal emotionsJouni Pohjalainen, Paavo Alku. 1931-1935 [doi]

Improving the accuracy and the robustness of harmonic model for pitch estimationMeysam Asgari, Izhak Shafran. 1936-1940 [doi]

Discriminative pronunciation modeling based on minimum phone error trainingMeixu Song, Qingqing Zhang, Jielin Pan, YongHong Yan. 1941-1945 [doi]

Grapheme-to-phoneme conversion based on adaptive regularization of weight vectorsKeigo Kubo, Sakriani Sakti, Graham Neubig, Tomoki Toda, Satoshi Nakamura. 1946-1950 [doi]

An efficient method to estimate pronunciation from multiple utterancesTofigh Naghibi, Sarah Hoffmann, Beat Pfister. 1951-1955 [doi]

Category-based phoneme-to-grapheme transliterationWillem D. Basson, Marelie H. Davel. 1956-1960 [doi]

Discriminative training of WFST factors with application to pronunciation modelingPreethi Jyothi, Eric Fosler-Lussier, Karen Livescu. 1961-1965 [doi]

Discriminative training of a phoneme confusion model for a dynamic lexicon in ASRPenny Karanasou, François Yvon, Thomas Lavergne, Lori Lamel. 1966-1970 [doi]

The 2012 NIST speaker recognition evaluationCraig S. Greenberg, Vincent M. Stanford, Alvin F. Martin, Meghana Yadagiri, George R. Doddington, John J. Godfrey, Jaime Hernandez-Cordero. 1971-1975 [doi]

Likelihood-ratio calibration using prior-weighted proper scoring rulesNiko Brümmer, George R. Doddington. 1976-1980 [doi]

A noise-robust system for NIST 2012 speaker recognition evaluationLuciana Ferrer, Mitchell McLaren, Nicolas Scheffer, Yun Lei, Martin Graciarena, Vikramjit Mitra. 1981-1985 [doi]

I4U submission to NIST SRE 2012: a large-scale collaborative effort for noise-robust speaker verificationRahim Saeidi, Kong-Aik Lee, Tomi Kinnunen, Tawfik Hasan, Benoit G. B. Fauve, Pierre-Michel Bousquet, Elie el Khoury, Pablo Luis Sordo Martinez, Jia Min Karen Kua, Changhuai You, Hanwu Sun, Anthony Larcher, Padmanabhan Rajan, Ville Hautamäki, Cemal Hanilçi, Billy Braithwaite, Rosa González Hautamäki, Seyed Omid Sadjadi, Gang Liu, Hynek Boril, Navid Shokouhi, Driss Matrouf, Laurent El Shafey, Pejman Mowlaee, Julien Epps, Tharmarajah Thiruvaran, David A. van Leeuwen, Bin Ma, Haizhou Li, John H. L. Hansen, Jean-François Bonastre, Sébastien Marcel, John S. D. Mason, Eliathamby Ambikairajah. 1986-1990 [doi]

Improved unsupervised NAP training dataset design for speaker recognitionHanwu Sun, Bin Ma. 1991-1995 [doi]

Nuance - Politecnico di Torino's 2012 NIST speaker recognition evaluation systemDaniele Colibro, Claudio Vair, Kevin Farrell, Nir Krause, Gennady Karvitsky, Sandro Cumani, Pietro Laface. 1996-2000 [doi]

A perceptually and physiologically motivated voice source modelGang Chen, Marc Garellek, Jody Kreiman, Bruce R. Gerratt, Abeer Alwan. 2001-2005 [doi]

Stable articulatory tasks and their variable formation: Tamil retroflex consonantsCaitlin Smith, Michael I. Proctor, Khalil Iskarous, Louis Goldstein, Shrikanth Narayanan. 2006-2009 [doi]

Articulatory settings facilitate mechanically advantageous motor control of vocal tract articulatorsVikram Ramanarayanan, Adam C. Lammert, Louis Goldstein, Shrikanth Narayanan. 2010-2013 [doi]

The interplay of linguistic structure and breathing in German spontaneous speechAmélie Rochet-Capellan, Susanne Fuchs. 2014-2018 [doi]

Physical models of the vocal tract with a flapping tongue for flap and liquid soundsTakayuki Arai. 2019-2023 [doi]

Articulatory copy synthesis from cine x-ray filmsYves Laprie, Matthieu Loosvelt, Shinji Maeda, Rudolph Sock, Fabrice Hirsch. 2024-2028 [doi]

Large-scale personal assistant technology deployment: the Siri experienceJerome R. Bellegarda. 2029-2033 [doi]

Evaluating an adaptive dialog system for the publicBenjamin Weiss, Simon Willkomm, Sebastian Möller. 2034-2038 [doi]

Self-taught assistive vocal interfaces: an overview of the ALADIN projectJort F. Gemmeke, Bart Ons, Netsanet Tessema, Hugo Van Hamme, Janneke van de Loo, Guy De Pauw, Walter Daelemans, Jonathan Huyghe, Jan Derboven, Lode Vuegen, Bert Van Den Broeck, Peter Karsmakers, Bart Vanrumste. 2039-2043 [doi]

Affect recognition in real-life acoustic conditions - a new perspective on feature selectionFlorian Eyben, Felix Weninger, Björn Schuller. 2044-2048 [doi]

A distributed system for recognizing home automation commands and distress calls in the Italian languageEmanuele Principi, Stefano Squartini, Francesco Piazza, Danilo Fuselli, Maurizio Bonifazi. 2049-2053 [doi]

Probabilistic trainable segmenter for call center audio using multiple featuresNina Zinovieva, Xiaodan Zhuang, Pat Peterson, Joe Alwan, Rohit Prasad. 2054-2058 [doi]

Voice search in mobile applications and the use of linked open dataFelix Burkhardt, Hans Ulrich Nägeli. 2059-2061 [doi]

Evaluation of a real-time voice order recognition system from multiple audio channels in a homeMichel Vacher, Benjamin Lecouteux, Dan Istrate, Thierry Joubert, François Portet, Mohamed A. Sehili, Pedro Chahuara. 2062-2064 [doi]

In-home detection of distress calls: the case of aged usersFrédéric Aman, Michel Vacher, Solange Rossato, François Portet. 2065-2067 [doi]

Data driven methods for utterance semantic taggingDing Liu, Anthea Cheung, Anna Margolis, Patrick Redmond, Jun-Won Suh, Chao Wang. 2068-2070 [doi]

The AT&T speech API: a study on practical challenges for customized speech to text serviceE. Gouvêa, Antonio Moreno-Daniel, A. Reddy, Rathinavelu Chengalvarayan, David L. Thomson, Andrej Ljolje. 2071-2073 [doi]

In-vehicle destination entry by voice: practical aspectsBart D'hoore, Alfred Wiesen. 2074-2076 [doi]

Intelligibility at a multilingual cocktail party: effect of concurrent language knowledgeAurore Gautreau, Michel Hoen, Fanny Meunier. 2077-2080 [doi]

Regional accents affect speech intelligibility in a multitalker environmentEwa Jacewicz, Robert Allen Fox. 2081-2085 [doi]

Perception of English minimal pairs in noise by Japanese listeners: does clear speech for L2 listeners help?Shinichi Tokuma, Won Tokuma. 2086-2090 [doi]

Salento Italian listeners' perception of American English vowelsBianca Sisinni, Paola Escudero, Mirko Grimaldi. 2091-2094 [doi]

TP 3.1 software: a tool for designing audio, visual, and audiovisual perceptual training tasks and perception testsAndréia Schurt Rauber, Anabela Rato, Denise Cristina Kluge, Giane Rodrigues dos Santos. 2095-2098 [doi]

Effect of linguistic masker on the intelligibility of Mandarin sentencesFei Chen, Junfeng Li, Lena L. N. Wong, YongHong Yan. 2099-2102 [doi]

The learning and generalization of contrasts consistent or inconsistent with native biasesKyuwon Moon, Meghan Sumner. 2103-2107 [doi]

L2 English learners' recognition of words spoken in familiar versus unfamiliar English accentsJia-ying, Jason A. Shaw, Catherine T. Best. 2108-2112 [doi]

The effects of perceptual and/or productive training on the perception and production of English vowels /ɪ/ and /iː/ by Cantonese ESL learnersJanice Wing Sze Wong. 2113-2117 [doi]

On the role of L1 speech production in L2 perception: evidence from Spanish learners of FrenchNatalia Kartushina, Ulrich H. Frauenfelder. 2118-2122 [doi]

Looking for lexical feedback effects in /tl/→/kl/ repairsPierre A. Hallé, Natalia Kartushina, Juan Segui, Ulrich H. Frauenfelder. 2123-2127 [doi]

Recognizing words across regional accents: the role of perceptual assimilation in lexical competitionCatherine T. Best, Jason A. Shaw, Elizabeth Clancy. 2128-2132 [doi]

Dysarthria intelligibility assessment in a factor analysis total variability spaceDavid Martínez, Phil D. Green, Heidi Christensen. 2133-2137 [doi]

Perceptual interference between regional accent and voice/speech disordersAlain Ghio, Médéric Gasquet-Cyrus, Juliette Roquel, Antoine Giovanni. 2138-2142 [doi]

Linguistic disfluency in narrative speech: evidence from story-telling in 6-year oldsIngrida Balciuniene. 2143-2146 [doi]

Assessing the utility of judgments of children's speech production made by untrained listeners in uncontrolled listening environmentsBenjamin Munson. 2147-2151 [doi]

Consonant distortions in dysarthria due to Parkinson's disease, amyotrophic lateral sclerosis and cerebellar ataxiaTanja Kocjancic Antolík, Cécile Fougeron. 2152-2156 [doi]

Study of coarticulation and F2 transitions in French and Italian adult stutterersMarine Verdurand, Solange Rossato, Lionel Granjon, Daria Balbo, Claudio Zmarich. 2157-2161 [doi]

Automatic tracheoesophageal voice typing using acoustic parametersRenee Peje Clapham, Corina J. van As-Brooks, Michiel W. M. van den Brekel, Frans J. M. Hilgers, R. J. J. H. van Son. 2162-2166 [doi]

Burst-based features for the classification of pathological voicesJulie Mauclair, Lionel Koenig, Marina Robert, Peggy Gatignol. 2167-2171 [doi]

Classification of depression state based on articulatory precisionBrian S. Helfer, Thomas F. Quatieri, James R. Williamson, Daryush D. Mehta, Rachelle Horwitz, Bea Yu. 2172-2176 [doi]

Using text and acoustic features to diagnose progressive aphasia and its subtypesKathleen C. Fraser, Frank Rudzicz, Elizabeth Rochon. 2177-2181 [doi]

Multi-domain neural network language modelTanel Alumäe. 2182-2186 [doi]

Improving lightly supervised training for broadcast transcriptionYanhua Long, Mark J. F. Gales, Pierre Lanchantin, Xunying Liu, Matthew Stephen Seigel, Philip C. Woodland. 2187-2191 [doi]

Weakly supervised parsing with rulesChristophe Cerisara, Alejandra Lorenzo, Pavel Král. 2192-2196 [doi]

Relative error bounds for statistical classifiers based on the f-divergenceMarkus Nußbaum-Thom, Eugen Beck, Tamer Alkhouli, Ralf Schlüter, Hermann Ney. 2197-2201 [doi]

Experiments towards a better LVCSR system for TamilMelvin Jose Johnson Premkumar, Ngoc Thang Vu, Tanja Schultz. 2202-2206 [doi]

A hybrid language model for open-vocabulary Thai LVCSRKwanchiva Thangthai, Ananlada Chotimongkol, Chai Wutiwiwatchai. 2207-2211 [doi]

Hierarchical Pitman-Yor and Dirichlet process for language modelJen-Tzung Chien, Ying-Lan Chang. 2212-2216 [doi]

Unsupervised confidence calibration using examples of recognized words and their contextsTaichi Asami, Satoshi Kobashikawa, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi. 2217-2221 [doi]

Multilingual hierarchical MRASTA features for ASRZoltán Tüske, Ralf Schlüter, Hermann Ney. 2222-2226 [doi]

Heuristic selection of training sentences from historical TV guide for semi-supervised LM adaptationHarry M. Chang. 2227-2231 [doi]

Combination of random indexing based language model and n-gram language model for speech recognitionDominique Fohr, Odile Mella. 2232-2236 [doi]

Improving low-resource CD-DNN-HMM using dropout and multilingual DNN trainingYajie Miao, Florian Metze. 2237-2241 [doi]

Finding recurrent out-of-vocabulary wordsLong Qin, Alexander I. Rudnicky. 2242-2246 [doi]

Using conversational word bursts in spoken term detectionJustin Chiu, Alexander I. Rudnicky. 2247-2251 [doi]

Brain activations in speech recovery process after intra-oral surgery: an fMRI studyAudrey Acher, Marc Sato, Laurent Lamalle, Coriandre Vilain, Arnaud Attye, Alexandre Krainik, Georges Bettega, Christian Adrien Righini, Brice Carlot, Muriel Brix, Pascal Perrier. 2252-2256 [doi]

Acoustic and perceptual analysis of vocal tremorChristophe Mertens, Jean Schoentgen, Francis Grenez, Sabine Skodda. 2257-2261 [doi]

Lexical tone perception in Thai normal-hearing adults and those using hearing aids: a case studyCharturong Tantibundhit, Chutamanee Onsuwan, N. Klangpornkun, P. Phienphanich, Tanawan Saimai, Nantaporn Saimai, P. Pitathawatchai, Chai Wutiwiwatchai. 2262-2266 [doi]

Evaluation of a bone-conducted ultrasonic hearing aid in vocal emotion transmissionTakayuki Kagomiya, Seiji Nakagawa. 2267-2271 [doi]

Processing of /i/ and /u/ in Italian cochlear-implant children: a behavioral and neurophysiologic studyLuigia Garrapa, Davide Bottari, Mirko Grimaldi, Francesco Pavani, Andrea Calabrese, Michele De Benedetto, Silvano Vitale. 2272-2276 [doi]

Predicting the bilateral advantage in cochlear implantees using a non-intrusive speech intelligibility measureStefano Cosentino, Tiago H. Falk, David McAlpine. 2277-2281 [doi]

A blind segmentation approach to acoustic event detection based on i-vectorZhen Huang, You-Chi Cheng, Kehuang Li, Ville Hautamäki, Chin-Hui Lee. 2282-2286 [doi]

A dynamic programming framework for neural network-based automatic speech segmentationVan Zyl van Vuuren, Louis ten Bosch, Thomas Niesler. 2287-2291 [doi]

Acoustic segmentation of speech using zero time liftering (ZTL)RaviShankar Prasad, B. Yegnanarayana. 2292-2296 [doi]

Unsupervised mining of acoustic subword units with segment-level Gaussian posteriorgramsHaipeng Wang, Tan Lee, Cheung Chi Leung, Bin Ma, Haizhou Li. 2297-2301 [doi]

Combination of auditory attention features with phone posteriors for better automatic phoneme segmentationOzlem Kalinli. 2302-2305 [doi]

Automatic phonetic segmentation using boundary modelsJiahong Yuan, Neville Ryant, Mark Liberman, Andreas Stolcke, Vikramjit Mitra, Wen Wang. 2306-2310 [doi]

HMM-based TTS for Hanoi Vietnamese: issues in design and evaluationThi Thu Trang Nguyen, Christophe d'Alessandro, Albert Rilliard, Do Dat Tran. 2311-2315 [doi]

HMM-based synthesis of creaky voiceTuomo Raitio, John Kane, Thomas Drugman, Christer Gobl. 2316-2320 [doi]

Integrating conditional random fields and joint multi-gram model with syllabic features for grapheme-to-phone conversionXiaoxuan Wang, Khe Chai Sim. 2321-2325 [doi]

Structure learning in hidden conditional random fields for grapheme-to-phoneme conversionPatrick Lehnen, Alexandre Allauzen, Thomas Lavergne, François Yvon, Stefan Hahn, Hermann Ney. 2326-2330 [doi]

TUNDRA: a multilingual corpus of found data for TTS research created with light supervisionAdriana Stan, Oliver Watts, Yoshitaka Mamiya, Mircea Giurgiu, Robert A. J. Clark, Junichi Yamagishi, Simon King. 2331-2335 [doi]

Minimum mean squared error based warped complex cepstrum analysis for statistical parametric speech synthesisRanniery Maia, M. J. F. Gales, Yannis Stylianou, Masami Akamine. 2336-2340 [doi]

Augmented conditional random fields modeling based on discriminatively trained featuresYasser Hifny. 2341-2344 [doi]

Sequence-discriminative training of deep neural networksKarel Veselý, Arnab Ghoshal, Lukás Burget, Daniel Povey. 2345-2349 [doi]

Discriminatively trained sparse inverse covariance matrices for low resource acoustic modelingWeibin Zhang, Pascale Fung. 2350-2354 [doi]

Discriminative training of acoustic models for system combinationYuuki Tachioka, Shinji Watanabe. 2355-2359 [doi]

Semi-supervised GMM and DNN acoustic model training with multi-system combination and confidence re-calibrationYan Huang, Dong Yu, Yifan Gong, Chaojun Liu. 2360-2364 [doi]

Restructuring of deep neural network acoustic models with singular value decompositionJian Xue, Jinyu Li, Yifan Gong. 2365-2369 [doi]

Large-scale characterization of Mandarin pronunciation errors made by native speakers of European languagesNancy F. Chen, Vivaek Shivakumar, Mahesh Harikumar, Bin Ma, Haizhou Li. 2370-2374 [doi]

Production training in second language acquisition: a comparison between objective measures and subjective judgmentsVéronique Delvaux, Kathy Huet, Myriam Piccaluga, Bernard Harmegnies. 2375-2379 [doi]

The production and perception of voice onset time in English-speaking children enrolled in a French immersion programNicole Netelenbos, Fangfang Li. 2380-2384 [doi]

Pronunciation errors by Spanish learners of Dutch: a data-driven study for ASR-based pronunciation trainingPepi Burgos, Catia Cucchiarini, Roeland Van Hout, Helmer Strik. 2385-2389 [doi]

Realisation of tonal alignment in the English of Japanese-English late bilingualsCalbert Graham, Brechtje Post. 2390-2394 [doi]

The influence of language and speech task upon creaky voice use among six young American women learning FrenchAgathe Benoist-Lucy, Claire Pillot-Loiseau. 2395-2399 [doi]

Acoustic-prosodic, turn-taking, and language cues in child-psychologist interactions for varying social demandDaniel Bone, Chi-Chun Lee, Theodora Chaspari, Matthew P. Black, Marian E. Williams, Sungbok Lee, Pat Levitt, Shrikanth Narayanan. 2400-2404 [doi]

A preliminary study of child vocalization on a parallel corpus of US and Shanghainese toddlersHynek Boril, Qian Zhang, Pongtep Angkititrakul, John H. L. Hansen, Dongxin Xu, Jill Gilkerson, Jeffrey A. Richards. 2405-2409 [doi]

A survey about databases of children's speechFelix Claus, Hamurabi Gamboa Rosales, Rico Petrick, Horst-Udo Hain, Rüdiger Hoffmann. 2410-2414 [doi]

Affective evaluation of multimodal dialogue games for preschoolers using physiological signalsVassiliki Kouloumenta, Manolis Perakakis, Alexandros Potamianos. 2415-2419 [doi]

Amplitude modulation features for emotion recognition from speechMd. Jahangir Alam, Yazid Attabi, Pierre Dumouchel, Patrick Kenny, Douglas D. O'Shaughnessy. 2420-2424 [doi]

Analyzing eye-voice coordination in rapid automatized namingDaniel Bone, Chi-Chun Lee, Vikram Ramanarayanan, Shrikanth Narayanan, Renske S. Hoedemaker, Peter C. Gordon. 2425-2429 [doi]

Analyzing the structure of parent-moderated narratives from children with ASD using an entity-based approachTheodora Chaspari, Emily Mower Provost, Shrikanth Narayanan. 2430-2434 [doi]

Automated speech scoring for non-native middle school students with multiple task typesKeelan Evanini, Xinhao Wang. 2435-2439 [doi]

Identification of gender from children's speech by computers and humansSaeid Safavi, Peter Jancovic, Martin J. Russell, Michael J. Carey. 2440-2444 [doi]

On why Japanese /r/ sounds are difficult for children to acquireTakayuki Arai. 2445-2449 [doi]

Anchor and UBM-based multi-class MLLR m-vector system for speaker verificationAchintya Kumar Sarkar, Claude Barras. 2450-2454 [doi]

Ensemble approach in speaker verificationLeibny Paola García-Perera, Bhiksha Raj, Juan Arturo Nolazco-Flores. 2455-2459 [doi]

Sequential model adaptation for speaker verificationJun Wang, Dong Wang, Xiaojun Wu, Thomas Fang Zheng, Javier Tejedor. 2460-2464 [doi]

Improving short utterance based i-vector speaker recognition using source and utterance-duration normalization techniquesAhilan Kanagasundaram, David Dean, Javier Gonzalez-Dominguez, Sridha Sridharan, Daniel Ramos, Joaquin Gonzalez-Rodriguez. 2465-2469 [doi]

On leveraging conversational data for building a text dependent speaker verification systemHagai Aronowitz, Oren Barkan. 2470-2473 [doi]

THU-EE system fusion for the NIST 2012 speaker recognition evaluationWei-Qiang Zhang, Zhiyi Li, Weiwei Liu, Jia Liu. 2474-2478 [doi]

Subspace-constrained supervector PLDA for speaker verificationDaniel Garcia-Romero, Alan McCree. 2479-2483 [doi]

Augmenting short-term cepstral features with long-term discriminative features for speaker verification of telephone dataCong-Thanh Do, Claude Barras, Viet Bac Le, Achintya Kumar Sarkar. 2484-2488 [doi]

Using group delay functions from all-pole models for speaker recognitionPadmanabhan Rajan, Tomi Kinnunen, Cemal Hanilçi, Jouni Pohjalainen, Paavo Alku. 2489-2493 [doi]

Secure binary embeddings of front-end factor analysis for privacy preserving speaker verificationJosé Portelo, Alberto Abad, Bhiksha Raj, Isabel Trancoso. 2494-2498 [doi]

On von-Mises Fisher mixture model in text-independent speaker identificationJalil Taghia, Zhanyu Ma, Arne Leijon. 2499-2503 [doi]

Using phone log-likelihood ratios as features for speaker recognitionMireia Díez, Amparo Varona, Mikel Peñagarikano, Luis Javier Rodríguez-Fuentes, Germán Bordel. 2504-2508 [doi]

Handling recordings acquired simultaneously over multiple channels with PLDAJesús A. Villalba, Mireia Díez, Amparo Varona, Eduardo Lleida. 2509-2513 [doi]

Bayesian distance metric learning on i-vector for speaker verificationXiao Fang, Najim Dehak, James R. Glass. 2514-2518 [doi]

Merging human and automatic system decisions to improve speaker recognition performanceRosa González Hautamäki, Ville Hautamäki, Padmanabhan Rajan, Tomi Kinnunen. 2519-2523 [doi]

Recurrent neural networks for language understandingKaisheng Yao, Geoffrey Zweig, Mei-Yuh Hwang, Yangyang Shi, Dong Yu. 2524-2528 [doi]

A study on LVCSR and keyword search for TagalogKorbinian Riedhammer, Van Hai Do, James Hieronymus. 2529-2533 [doi]

Characterising depressed speech for classificationSharifa Alghowinem, Roland Goecke, Michael Wagner, Julien Epps, Gordon Parker, Michael Breakspear. 2534-2538 [doi]

Combining acoustic name spotting and continuous context models to improve spoken person name recognition in speechBenjamin Bigot, Grégory Senay, Georges Linarès, Corinne Fredouille, Richard Dufour. 2539-2543 [doi]

A resource-dependent approach to word modeling for keyword spottingI.-Fan Chen, Chin-Hui Lee. 2544-2548 [doi]

Markers of confidence and correctness in spoken medical narrativesKathryn Womack, Cecilia Ovesdotter Alm, Cara Calvelli, Jeff B. Pelz, Pengcheng Shi, Anne R. Haake. 2549-2553 [doi]

Development of a web framework for teaching and learning Japanese prosody: OJAD (online Japanese accent dictionary)Ibuki Nakamura, Nobuaki Minematsu, Masayuki Suzuki, Hiroko Hirano, Chieko Nakagawa, Noriko Nakamura, Yukinori Tagawa, Keikichi Hirose, Hiroya Hashimoto. 2554-2558 [doi]

Addressee detection for dialog systems using temporal and spectral dimensions of speaking styleElizabeth Shriberg, Andreas Stolcke, Suman Ravuri. 2559-2563 [doi]

Analysis of factors involved in the choice of rising or non-rising intonation in question utterances appearing in conversational speechHiroaki Hatano, Miyako Kiso, Carlos Toshinori Ishi. 2564-2568 [doi]

IsNL? a discriminative approach to detect natural language like queries for conversational understandingAsli Çelikyilmaz, Gökhan Tür, Dilek Hakkani-Tür. 2569-2573 [doi]

Automatic accent quantification of Indian speakers of EnglishJian Cheng, Nikhil Bojja, Xin Chen. 2574-2578 [doi]

Semantic parsing using word confusion networks with conditional random fieldsGökhan Tür, Anoop Deoras, Dilek Hakkani-Tür. 2579-2583 [doi]

Timing responses to questions in dialogueSofia Strömbergsson, Anna Hjalmarsson, Jens Edlund, David House. 2584-2588 [doi]

BUT BABEL system for spontaneous CantoneseMartin Karafiát, Frantisek Grézl, Mirko Hannemann, Karel Veselý, Jan Cernocký. 2589-2593 [doi]

Semi-supervised manifold learning approaches for spoken term verificationAtta Norouzian, Richard C. Rose, Aren Jansen. 2594-2598 [doi]

Language modeling for mixed language speech recognition using weighted phrase extractionYing Li, Pascale Fung. 2599-2603 [doi]

Quality assessment of asymmetric multiparty telephone conferences: a systematic method from technical degradations to perceived impairmentsJanto Skowronek, Julian Herlinghaus, Alexander Raake. 2604-2608 [doi]

User activity estimation method based on probabilistic generative model of acoustic event sequence with user activity and its subordinate categoriesKeisuke Imoto, Suehiro Shimauchi, Hisashi Uematsu, Hitoshi Ohmuro. 2609-2613 [doi]

Generalizing continuous-space translation of paralinguistic informationTakatomo Kano, Shinnosuke Takamichi, Sakriani Sakti, Graham Neubig, Tomoki Toda, Satoshi Nakamura. 2614-2618 [doi]

An empirical comparison of joint optimization techniques for speech translationMasaya Ohgushi, Graham Neubig, Sakriani Sakti, Tomoki Toda, Satoshi Nakamura. 2619-2623 [doi]

A sequential repetition model for improved disfluency detectionMari Ostendorf, Sangyun Hahn. 2624-2628 [doi]

Disfluency detection based on prosodic features for university lecturesHenrique Medeiros, Helena Moniz, Fernando Batista, Isabel Trancoso, Luis Nunes. 2629-2633 [doi]

What's the difference? comparing humans and machines on the Aurora 2 speech recognition taskBernd T. Meyer. 2634-2638 [doi]

Calibration of distance measures for unsupervised query-by-exampleMichele Gubian, Lou Boves, Maarten Versteegh. 2639-2643 [doi]

Indexing multimedia documents with acoustic concept recognition latticesDiego Castán, Murat Akbacak. 2644-2648 [doi]

MINT.tools: tools and adaptors supporting acquisition, annotation and analysis of multimodal corporaSpyros Kousidis, Thies Pfeiffer, David Schlangen. 2649-2653 [doi]

4allRobert A. J. Clark. 2654-2656 [doi]

On-line learning of lexical items and grammatical constructions via speech, gaze and action-based human-robot interactionGrégoire Pointeau, Maxime Petit, Xavier Hinaut, Guillaume Gibert, Peter Ford Dominey. 2657-2659 [doi]

Development of a pronunciation training system based on auditory-visual elementsHaruko Miyakoda. 2660-2661 [doi]

Real-time and non-real-time voice conversion systems with web interfacesElias Azarov, Maxim Vashkevich, Denis Likhachov, Alexander A. Petrovsky. 2662-2663 [doi]

Application of the NAO humanoid robot in the treatment of bone marrow-transplanted children (demo)E. Csala, Géza Németh, Csaba Zainkó. 2664-2666 [doi]

Photo-realistic expressive text to talking head synthesisVincent Wan, Robert Anderson, Art Blokland, Norbert Braunschweiler, Langzhou Chen, BalaKrishna Kolluru, Javier Latorre, Ranniery Maia, Björn Stenger, Kayoko Yanagisawa, Yannis Stylianou, Masami Akamine, M. J. F. Gales, Roberto Cipolla. 2667-2669 [doi]

Demonstration of LAPSyd: Lyon-Albuquerque phonological systems databaseIan Maddieson, Sébastien Flavier, Egidio Marsico, François Pellegrino. 2670-2671 [doi]

Speechmark acoustic landmark tool: application to voice pathologySuzanne Boyce, Marisha Speights, Keiko Ishikawa, Joel MacAuslan. 2672-2674 [doi]

MODIS: an audio motif discovery softwareLaurence Catanese, Nathan Souviraà-Labastie, Bingqing Qu, Sebastien Campion, Guillaume Gravier, Emmanuel Vincent, Frédéric Bimbot. 2675-2677 [doi]

Fitting long-range information using interpolated distanced n-grams and cache models into a latent Dirichlet language model for speech recognitionMd. Akmal Haidar, Douglas D. O'Shaughnessy. 2678-2682 [doi]

Incorporating proximity information for relevance language modeling in speech recognitionYi-Wen Chen, Bo-Han Hao, Kuan-Yu Chen, Berlin Chen. 2683-2687 [doi]

Instance-based on-line language model adaptationAli Orkan Bayer, Giuseppe Riccardi. 2688-2692 [doi]

Unsupervised topic adaptation for morph-based speech recognitionAndré Mansikkaniemi, Mikko Kurimo. 2693-2697 [doi]

Unsupervised language model adaptation for automatic speech recognition of broadcast news using Web 2.0Tim Schlippe, Lukasz Gren, Ngoc Thang Vu, Tanja Schultz. 2698-2702 [doi]

Recurrent neural network based language model personalization by social network crowdsourcingTsung-Hsien Wen, Aaron Heidel, Hung-yi Lee, Yu Tsao, Lin-Shan Lee. 2703-2707 [doi]

Language-independent call routing using the large margin estimation principleMoataz El Ayadi, Mohamed Afify. 2708-2712 [doi]

Deep belief network based semantic taggers for spoken language understandingAnoop Deoras, Ruhi Sarikaya. 2713-2717 [doi]

Error-corrective discriminative joint decoding of automatic spoken language transcription and understandingBassam Jabaian, Fabrice Lefèvre. 2718-2722 [doi]

Detecting summarization hot spots in meetings using group level involvement and turn-taking featuresCatherine Lai, Jean Carletta, Steve Renals. 2723-2727 [doi]

Supervised spoken document summarization based on structured support vector machine with utterance clusters as hidden variablesSz-Rung Shiang, Hung-yi Lee, Lin-Shan Lee. 2728-2732 [doi]

Web data harvesting for speech understanding grammar inductionIoannis Klasinas, Alexandros Potamianos, Elias Iosif, Spiros Georgiladakis, Gianluca Mameli. 2733-2737 [doi]

Articulatory synthesis of French connected speech from EMA dataAsterios Toutios, Shrikanth Narayanan. 2738-2742 [doi]

A new language independent, photo-realistic talking head driven by voice onlyXinjian Zhang, Lijuan Wang, Gang Li, Frank Seide, Frank K. Soong. 2743-2747 [doi]

Binocular photometric stereo acquisition and reconstruction for 3D talking head applicationsChaoyang Wang, Lijuan Wang, Yasuyuki Matsushita, Bojun Huang, Magnetro Chen, Frank K. Soong. 2748-2752 [doi]

Speaker adaptation of an acoustic-articulatory inversion model using cascaded Gaussian mixture regressionsThomas Hueber, Gérard Bailly, Pierre Badin, Frédéric Elisei. 2753-2757 [doi]

Articulatory features for speech-driven head motion synthesisAtef Ben Youssef, Hiroshi Shimodaira, David Adam Braude. 2758-2762 [doi]

Template-warping based speech driven head motion synthesisDavid Adam Braude, Hiroshi Shimodaira, Atef Ben Youssef. 2763-2767 [doi]

ALIZE 3.0 - open source toolkit for state-of-the-art speaker recognitionAnthony Larcher, Jean-François Bonastre, Benoit G. B. Fauve, Kong-Aik Lee, Christophe Lévy, Haizhou Li, John S. D. Mason, Jean-Yves Parfait. 2768-2772 [doi]

New cosine similarity scorings to implement gender-independent speaker verificationMohammed Senoussaoui, Patrick Kenny, Pierre Dumouchel, Najim Dehak. 2773-2777 [doi]

Improving speaker identification in TV-shows using person name detection in overlaid text and speechDelphine Charlet, Corinne Fredouille, Géraldine Damnati, Grégory Senay. 2778-2782 [doi]

Exploring methods of improving speaker accuracy for speaker diarizationMary Tai Knox, Nikki Mirghafori, Gerald Friedland. 2783-2787 [doi]

Combining deep speaker specific representations with GMM-SVM for speaker verificationRyan Price, Sangeeta Biswas, Koichi Shinoda. 2788-2792 [doi]

Using spectral moments as a speaker specific feature in nasals and fricativesCarola Schindler, Christoph Draxler. 2793-2796 [doi]

A computational model of perceptuo-motor processing in speech perception: learning to imitate and categorize synthetic CV syllablesRaphaël Laurent, Jean-Luc Schwartz, Pierre Bessière, Julien Diard. 2797-2801 [doi]

Talker-specific perceptual processing: influences on internal category structureRachel M. Theodore. 2802-2806 [doi]

Elicitation and analysis of a corpus of robust noise-induced word misperceptions in SpanishMaria Luisa Garcia Lecumberri, Máté Attila Tóth, Yan Tang, Martin Cooke. 2807-2811 [doi]

Vocabulary structure and spoken-word recognition: evidence from French reveals the source of embedding asymmetryAnne Cutler, Laurence Bruggeman. 2812-2816 [doi]

How do multiple sublexical cues converge in lexical segmentation? an artificial language learning studyOdile Bagou, Ulrich H. Frauenfelder. 2817-2821 [doi]

Towards an end-to-end computational model of speech comprehension: simulating a lexical decision taskLouis ten Bosch, Lou Boves, Mirjam Ernestus. 2822-2826 [doi]

Demographic recommendation by means of group profile elicitation using speaker age and gender recognitionSven Ewan Shepstone, Zheng-Hua Tan, Søren Holdt Jensen. 2827-2831 [doi]

Affective classification of generic audio clips using regression modelsNikos Malandrakis, Shiva Sundaram, Alexandros Potamianos. 2832-2836 [doi]

A preliminary study of cross-lingual emotion recognition from speech: automatic classification versus human perceptionJe Hun Jeon, Duc Le, Rui Xia, Yang Liu. 2837-2840 [doi]

Active learning for dimensional speech emotion recognitionWenjing Han, Haifeng Li, Huabin Ruan, Lin Ma, Jiayin Sun, Björn Schuller. 2841-2845 [doi]

Auditory detectability of vocal ageing and its effect on forensic automatic speaker recognitionFinnian Kelly, Naomi Harte. 2846-2850 [doi]

Comparative study of speaker personality traits recognition in conversational and broadcast news speechFiroj Alam, Giuseppe Riccardi. 2851-2855 [doi]

Active learning by label uncertainty for acoustic emotion recognitionZixing Zhang, Jun Deng, Erik Marchi, Björn Schuller. 2856-2860 [doi]

Modeling therapist empathy and vocal entrainment in drug addiction counselingBo Xiao, Panayiotis G. Georgiou, Zac E. Imel, David C. Atkins, Shrikanth Narayanan. 2861-2865 [doi]

Estimating callers' levels of knowledge in call center dialoguesChiaki Miyazaki, Ryuichiro Higashinaka, Toshiro Makino, Yoshihiro Matsuo. 2866-2870 [doi]

Energy and F0 contour modeling with functional data analysis for emotional speech detectionJuan Pablo Arias, Carlos Busso, Néstor Becerra Yoma. 2871-2875 [doi]

Incremental emotion recognitionTaniya Mishra, Dimitrios Dimitriadis. 2876-2880 [doi]

Comparison of spectrum estimators in speaker verification: mismatch conditions induced by vocal effortCemal Hanilçi, Tomi Kinnunen, Padmanabhan Rajan, Jouni Pohjalainen, Paavo Alku, Figen Ertas. 2881-2885 [doi]

Using denoising autoencoder for emotion recognitionRui Xia, Yang Liu. 2886-2889 [doi]

A phase-modified approach for TDE-based acoustic localizationGeorgios Athanasopoulos, Werner Verhelst. 2890-2894 [doi]

Interference robust DOA estimation of human speech by exploiting historical information and temporal correlationWei Xue, Shan Liang, Wenju Liu. 2895-2899 [doi]

Identifying new bird species from differences in birdsongNaomi Harte, Sadhbh Murphy, David J. Kelly, Nicola M. Marples. 2900-2904 [doi]

Controlling "shout" expression in a Japanese POP singing performance: analysis and suppression studyYuri Nishigaki, Ken-Ichi Sakakibara, Masanori Morise, Ryuichi Nisimura, Toshio Irino, Hideki Kawahara. 2905-2909 [doi]

Dimensionality analysis of singing speech based on locality preserving projectionsMahnoosh Mehrabani, John H. L. Hansen. 2910-2914 [doi]

Audio classification using dominant spatial patterns in time-frequency spaceMd. Khademul Islam Molla, Keikichi Hirose. 2915-2919 [doi]

Spectro-temporal modulation based singing detection combined with pitch-based grouping for singing voice separationTse-En Lin, Chung-Chien Hsu, Yi-Cheng Chen, Jian-Hueng Chen, Tai-Shih Chi. 2920-2923 [doi]

NMF-based temporal feature integration for acoustic event classificationJimmy Ludeña-Choez, Ascensión Gallardo-Antolín. 2924-2928 [doi]

Robust audio-codebooks for large-scale event detection in consumer videosShourabh Rawat, Peter F. Schulam, Susanne Burger, Duo Ding, Yipei Wang, Florian Metze. 2929-2933 [doi]

Person identification using biometric markers from footsteps soundM. Umair Bin Altaf, Taras Butko, Biing-Hwang Juang. 2934-2938 [doi]

Learning binaural spectrogram features for azimuthal speaker localizationWiktor Mlynarski. 2939-2942 [doi]

An unsupervised Bayesian classifier for multiple speaker detection and localizationYoussef Oualil, Friedrich Faubel, Dietrich Klakow. 2943-2947 [doi]

Joint recognition and direction-of-arrival estimation of simultaneous meeting-room acoustic eventsRupayan Chakraborty, Climent Nadeu. 2948-2952 [doi]

Audio self organized units for high-level event detectionXiaodan Zhuang, Shuang Wu, Pradeep Natarajan, Rohit Prasad, Prem Natarajan. 2953-2957 [doi]

Distribution-based feature normalization for robust speech recognition leveraging context and dynamics cuesYu-Chen Kao, Berlin Chen. 2958-2962 [doi]

An investigation of temporally varying weight regression for noise robust speech recognitionShilin Liu, Khe Chai Sim. 2963-2967 [doi]

Feature space generalized variable parameter HMMs for noise robust recognitionYang Li, Xunying Liu, Lan Wang. 2968-2972 [doi]

Bidirectional truncated recurrent neural networks for efficient speech denoisingPhilemon Brakel, Dirk Stroobandt, Benjamin Schrauwen. 2973-2977 [doi]

Multi-stream recognition of noisy speech with performance monitoringEhsan Variani, Feipeng Li, Hynek Hermansky. 2978-2981 [doi]

Model-based noise suppression using unsupervised estimation of hidden Markov model for non-stationary noiseMasakiyo Fujimoto, Tomohiro Nakatani. 2982-2986 [doi]

Joint noise cancellation and dereverberation using multi-channel linearly constrained minimum variance filterKaran Nathwani, Rajesh M. Hegde. 2987-2991 [doi]

Is speech enhancement pre-processing still relevant when using deep neural networks for acoustic modeling?Marc Delcroix, Yotaro Kubo, Tomohiro Nakatani, Atsushi Nakamura. 2992-2996 [doi]

Histogram equalization of real and imaginary modulation spectra for noise-robust speech recognitionHsin-Ju Hsieh, Berlin Chen, Jeih-Weih Hung. 2997-3001 [doi]

An investigation of spectral restoration algorithms for deep neural networks based noise robust speech recognitionBo Li, Yu Tsao, Khe Chai Sim. 3002-3006 [doi]

Bounded conditional mean imputation with an approximate posteriorUlpu Remes. 3007-3011 [doi]

Mixtures of Bayesian joint factor analyzers for noise robust automatic speech recognitionXiaodong Cui, Vaibhava Goel, Brian Kingsbury. 3012-3016 [doi]

Robust speech enhancement techniques for ASR in non-stationary noise and dynamic environmentsGang Liu, Dimitrios Dimitriadis, Enrico Bocchieri. 3017-3021 [doi]

LAPSyD: Lyon-Albuquerque phonological systems databaseIan Maddieson, Sébastien Flavier, Egidio Marsico, Christophe Coupé, François Pellegrino. 3022-3026 [doi]

The duration compensation issue revisitedPlínio A. Barbosa. 3027-3031 [doi]

Cross-language comparison of functional load for vowels, consonants, and tonesYoon Mi Oh, François Pellegrino, Christophe Coupé, Egidio Marsico. 3032-3036 [doi]

Notes on so-called inter-speaker difference in spontaneous speech: the case of Japanese voiced obstruentKikuo Maekawa. 3037-3041 [doi]

The role of the pharynx and tongue in enhancement of vowel nasalization: a real-time MRI investigation of French nasal vowelsChristopher Carignan, Ryan Shosted, Maojing Fu, Zhi-Pei Liang, Bradley P. Sutton. 3042-3046 [doi]

Assimilation of word-final nasals to following word-initial place of articulation in UK EnglishMargaret E. L. Renwick, Ladan Baghai-Ravary, Rosalind Temple, John S. Coleman. 3047-3051 [doi]

Joint spectral distribution modeling using restricted Boltzmann machines for voice conversionLing-Hui Chen, Zhen-Hua Ling, Yan Song, Li-Rong Dai. 3052-3056 [doi]

Exemplar-based unit selection for voice conversion utilizing temporal informationZhizheng Wu, Tuomas Virtanen, Tomi Kinnunen, Engsiong Chng, Haizhou Li. 3057-3061 [doi]

Alleviating the over-smoothing problem in GMM-based voice conversion with discriminative trainingHsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Yih-Ru Wang, Sin-Horng Chen. 3062-3066 [doi]

A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversionKou Tanaka, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura. 3067-3071 [doi]

A digital signal processor implementation of silent/electrolaryngeal speech enhancement based on real-time statistical voice conversionTakuto Moriguchi, Tomoki Toda, Motoaki Sano, Hiroshi Sato, Graham Neubig, Sakriani Sakti, Satoshi Nakamura. 3072-3076 [doi]

Foreign accent conversion through voice morphingSandesh Aryal, Daniel Felps, Ricardo Gutierrez-Osuna. 3077-3081 [doi]

Empirical link between hypothesis diversity and fusion performance in an ensemble of automatic speech recognition systemsKartik Audhkhasi, Andreas M. Zavou, Panayiotis G. Georgiou, Shrikanth Narayanan. 3082-3086 [doi]

A lecture transcription system combining neural network acoustic and language modelsPeter Bell, Hitoshi Yamamoto, Pawel Swietojanski, Youzheng Wu, Fergus McInnes, Chiori Hori, Steve Renals. 3087-3091 [doi]

Neural network acoustic models for the DARPA RATS programHagen Soltau, Hong-Kwang Kuo, Lidia Mangu, George Saon, Tomás Beran. 3092-3096 [doi]

Improved models for automatic punctuation prediction for spoken and written textNicola Ueffing, Maximilian Bisani, Paul Vozila. 3097-3101 [doi]

Some issues affecting the transcription of Hungarian broadcast audioAnindya Roy, Lori Lamel, Thiago Fraga-Silva, Jean-Luc Gauvain, Ilya Oparin. 3102-3106 [doi]

Development of the RWTH transcription system for SlovenianPavel Golik, Zoltán Tüske, Ralf Schlüter, Hermann Ney. 3107-3111 [doi]

Noise robust speaker verification with delta cepstrum normalizationNaoyuki Kanda, Ryu Takeda, Yasunari Obuchi. 3112-3116 [doi]

R-norm: improving inter-speaker variability modelling at the score level via regression score normalisationDavid Vandyke, Michael Wagner, Roland Goecke. 3117-3121 [doi]

Frequency warping and robust speaker verification: a comparison of alternative mel-scale representationsTomi Kinnunen, Md. Jahangir Alam, Pavel Matejka, Patrick Kenny, Jan Cernocký, Douglas D. O'Shaughnessy. 3122-3126 [doi]

Acoustic factor analysis based universal background model for robust speaker verification in noiseTaufiq Hasan, John H. L. Hansen. 3127-3131 [doi]

A new Bayesian network to assess the reliability of speaker verification decisionsJesús Villalba López, Eduardo Lleida, Alfonso Ortega, Antonio Miguel. 3132-3136 [doi]

The IBM RATS phase II speaker recognition system: overview and analysisWeizhong Zhu, Sibel Yaman, Jason W. Pelecanos. 3137-3141 [doi]

Vowel identity conditions the time course of tone recognitionJason A. Shaw, Michael D. Tyler, Benjawan Kasisopa, Yuan Ma, Michael I. Proctor, Chong Han, Donald Derrick, Denis K. Burnham. 3142-3146 [doi]

Changes in the role of intensity as a cue for fricative categorisationOdette Scharenborg, Esther Janse. 3147-3151 [doi]

Weighting of acoustic cues shifts to frication duration in identification of fricatives/affricates when auditory properties are degraded due to agingKeiichi Yasu, Takayuki Arai, Kei Kobayashi, Mitsuko Shindo. 3152-3156 [doi]

Duration as a secondary cue for perception of voicing and tone in Shanghai ChineseJiayin Gao, Pierre A. Hallé. 3157-3161 [doi]

Development of central auditory processes and their links with language skills in typically developing childrenMarie Dekerle, Fanny Meunier, Marie-Ange N'Guyen, Estelle Gillet-Perret, Delphine Lassus-Sangosse, Sophie Donnadieu. 3162-3166 [doi]

Show me what you listen to! auditory classification images can reveal the processing of fine acoustic cues during speech categorizationLéo Varnet, Kenneth Knoblauch, Fanny Meunier, Michel Hoen. 3167-3171 [doi]

The organ stop "vox humana" as a model for a vowel synthesiserFabian Brackhane, Jürgen Trouvain. 3172-3176 [doi]

Information theoretic acoustic feature selection for acoustic-to-articulatory inversionPrasanta Kumar Ghosh, Shrikanth Narayanan. 3177-3181 [doi]

Formant contours in Czech vowels: speaker-discriminating potentialDita Fejlová, David Lukes, Radek Skarnitzl. 3182-3186 [doi]

An anisotropic diffusion filter based on multidirectional separabilityShen Liu, Jianguo Wei, Xin Wang, Wenhuan Lu, Qiang Fang, Jianwu Dang. 3187-3190 [doi]

The phonological voicing contrast in Czech: an EPG study of phonated and whispered fricativesRadek Skarnitzl, Pavel Sturm, Pavel Machac. 3191-3195 [doi]

Vowel and prosodic factor dependent variations of vocal-tract lengthShinji Maeda, Yves Laprie. 3196-3200 [doi]

Word identification using phonetic features: towards a method to support multivariate fMRI speech decodingTijl Grootswagers, Karen Dijkstra, Louis ten Bosch, Alex Brandmeyer, Makiko Sadakata. 3201-3205 [doi]

Analysis of breathy, modal and pressed phonation based on low frequency spectral densityDhananjaya N. Gowda, Mikko Kurimo. 3206-3210 [doi]

Is the vowel length contrast in Japanese exaggerated in infant-directed speech?Keiichi Tajima, Kuniyoshi Tanaka, Andrew Martin, Reiko Mazuka. 3211-3215 [doi]

Investigating the relationship between glottal area waveform shape and harmonic magnitudes through computational modeling and laryngeal high-speed videoendoscopyGang Chen, Robin A. Samlan, Jody Kreiman, Abeer Alwan. 3216-3220 [doi]

Formant frequency tracking using Gaussian mixtures with maximum a posteriori adaptationJonathan C. Kim, Hrishikesh Rao, Mark A. Clements. 3221-3225 [doi]

Devoicing of vowels in German, a comparison of Japanese and German speakersRei Yasuda, Frank Zimmerer. 3226-3229 [doi]

Identifying consonantal tasks via measures of tongue shaping: a real-time MRI investigation of the production of vocalized syllabic /l/ in American EnglishCaitlin Smith, Adam C. Lammert. 3230-3233 [doi]

A speech enhancement method by coupling speech detection and spectral amplitude estimationFeng Deng, Changchun Bao, Feng Bao. 3234-3238 [doi]

Late reverberation suppression using MMSE modulation spectral estimationChenxi Zheng, Wai-Yip Chan. 3239-3243 [doi]

A new statistical excitation mapping for enhancement of throat microphone recordingsM. A. Tugtekin Turan, Engin Erzin. 3244-3248 [doi]

Classification based binaural dereverberationNicoleta Roman, Michael I. Mandel. 3249-3253 [doi]

Target-to-non-target directional ratio estimation based on dual-microphone phase differences for target-directional speech enhancementSeon-Man Kim, Hong Kook Kim. 3254-3258 [doi]

Speech spectrum restoration based on conditional restricted Boltzmann machineXugang Lu, Shigeki Matsuda, Chiori Hori. 3259-3263 [doi]

Speaker separation using visual speech features and single-channel audioFaheem Khan, Ben Milner. 3264-3268 [doi]

Spectral modulation sensitivity based perceptual acoustic echo cancellationWei-Lun Chuang, Kah-Meng Cheong, Chung-Chien Hsu, Tai-Shih Chi. 3269-3273 [doi]

Speech enhancement using compressed sensingVinayak Abrol, Pulkit Sharma, Anil Kumar Sao. 3274-3278 [doi]

Spectro-temporal post-enhancement using MMSE estimation in NMF based single-channel source separationEmad M. Grais, Hakan Erdogan. 3279-3283 [doi]

A pitch-based spectral enhancement technique for robust speech processingKantapon Kaewtip, Lee Ngee Tan, Abeer Alwan. 3284-3288 [doi]

Stochastic-deterministic signal modelling for the tracking of pitch in noise and speech mixtures using factorial HMMsMatthew McCallum, Bernard J. Guillemin. 3289-3293 [doi]

Restoration of clipped signals with application to speech recognitionShay Maymon, Etienne Marcheret, Vaibhava Goel. 3294-3297 [doi]

On the robustness of distributed EM based BSS in asynchronous distributed microphone array scenariosYasufumi Uezu, Keisuke Kinoshita, Mehrez Souden, Tomohiro Nakatani. 3298-3302 [doi]

Infinite support vector machines in speech recognitionJingzhou Yang, Rogier C. van Dalen, M. J. F. Gales. 3303-3307 [doi]

An on-line incremental speaker adaptation technique for audio stream transcriptionDiego Giuliani, Fabio Brugnara. 3308-3312 [doi]

Accent- and speaker-specific polyphone decision trees for non-native speech recognitionDominic Telaar, Mark C. Fuhs. 3313-3316 [doi]

Investigations on Hessian-free optimization for cross-entropy training of deep neural networksSimon Wiesler, Jinyu Li, Jian Xue. 3317-3321 [doi]

Cross-lingual acoustic model adaptation based on transfer vector field smoothing with MAPMasahiro Saiko, Shigeki Matsuda, Ken Hanazawa, Ryosuke Isotani, Chiori Hori. 3322-3326 [doi]

N-best rescoring by phoneme classifiers using subclass AdaBoost algorithmHiroshi Fujimura, Yusuke Shinohara, Takashi Masuko. 3327-3331 [doi]

Stream selection and integration in multistream ASR using GMM-based performance monitoringTetsuji Ogawa, Feipeng Li, Hynek Hermansky. 3332-3336 [doi]

VTLN based on the linear interpolation of contiguous mel filter-bank energiesNéstor Becerra Yoma, Claudio Garretón, Fernando Huenupán, Ignacio Catalan, Jorge Wuth. 3337-3341 [doi]

Context-dependent modeling and speaker normalization applied to reservoir-based phone recognitionFabian Triefenbach, Azarakhsh Jalalvand, Kris Demuynck, Jean-Pierre Martens. 3342-3346 [doi]

Interpolation of acoustic models for speech recognitionThiago Fraga-Silva, Jean-Luc Gauvain, Lori Lamel. 3347-3351 [doi]

Training log-linear acoustic models in higher-order polynomial feature space for speech recognitionM. Tahir, Heyun Huang, Ralf Schlüter, Hermann Ney, Louis ten Bosch, Bert Cranen, Lou Boves. 3352-3355 [doi]

Comparison of spectral analysis methods for automatic speech recognitionVenkata Neelima Parinam, Chandra Sekhar Vootkuri, Stephen A. Zahorian. 3356-3360 [doi]

Synthetic speaker models using VTLN to improve the performance of children in mismatched speaker conditions for ASRD. Rama Sanand, Torbjørn Svendsen. 3361-3365 [doi]

Exploring convolutional neural network structures and optimization techniques for speech recognitionOssama Abdel Hamid, Li Deng, Dong Yu. 3366-3370 [doi]

Rediscovering 25 years of discoveries in spoken language processing: a preliminary ISCA archive analysisJoseph Mariani, Patrick Paroubek, Gil Francopoulo, Marine Delaborde. 3371-3403 [doi]

Feature-rich sub-lexical language models using a maximum entropy approach for German LVCSRM. Ali Basha Shaik, Amr El-Desoky Mousa, Ralf Schlüter, Hermann Ney. 3404-3408 [doi]

Morpheme level hierarchical Pitman-Yor class-based language models for LVCSR of morphologically rich languagesAmr El-Desoky Mousa, M. Ali Basha Shaik, Ralf Schlüter, Hermann Ney. 3409-3413 [doi]

Discriminatively trained dependency language modeling for conversational speech recognitionBenjamin Lambert, Bhiksha Raj, Rita Singh. 3414-3418 [doi]

Prefix tree based n-best list re-scoring for recurrent neural network language model used in speech recognition systemYujing Si, Qingqing Zhang, Ta Li, Jielin Pan, YongHong Yan. 3419-3423 [doi]

Cross-domain paraphrasing for improving language modelling using out-of-domain dataXunying Liu, Mark J. F. Gales, Philip C. Woodland. 3424-3428 [doi]

Viterbi decoding for latent words language models using Gibbs samplingRyo Masumura, Hirokazu Masataki, Takanobu Oba, Osamu Yoshioka, Satoshi Takahashi. 3429-3433 [doi]

Computationally efficient objective function for algebraic codebook optimization in ACELPTom Bäckström. 3434-3438 [doi]

Speech quality prediction for artificial bandwidth extension algorithmsSebastian Möller, Emilia Kelaidi, Friedemann Köster, Nicolas Côté, Patrick Bauer, Tim Fingscheidt, Thomas Schlien, Hannu Pulakka, Paavo Alku. 3439-3443 [doi]

Speech enhancement with weighted denoising auto-encoderBing-yin Xia, Chang-chun Bao. 3444-3448 [doi]

Syllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architectureMilos Cernak, Xingyu Na, Philip N. Garner. 3449-3452 [doi]

Artificial bandwidth extension based on regularized piecewise linear mapping with discriminative region weighting and long-span featuresNguyen Duc Duy, Masayuki Suzuki, Nobuaki Minematsu, Keikichi Hirose. 3453-3457 [doi]

Enhanced muting method in packet loss concealment of ITU-T G.722 employing optimized sigmoid functionBong-Ki Lee, Chungsoo Lim, Jihwan Park, Joon-Hyuk Chang. 3458-3462 [doi]

Automatic human utility evaluation of ASR systems: does WER really predict performance?Benoît Favre, Kyla Cheung, Siavash Kazemian, Adam Lee, Yang Liu, Cosmin Munteanu, Ani Nenkova, Dennis Ochei, Gerald Penn, Stephen Tratz, Clare R. Voss, Frauke Zeller. 3463-3467 [doi]

Corpus analysis of simultaneous interpretation data for improving real time speech translationVivek Kumar Rangarajan Sridhar, John Chen, Srinivas Bangalore. 3468-3472 [doi]

A real-world system for simultaneous translation of German lecturesEunah Cho, Christian Fügen, Teresa Herrmann, Kevin Kilgour, Mohammed Mediani, Christian Mohr, Jan Niehues, Kay Rottmann, Christian Saam, Sebastian Stüker, Alex Waibel. 3473-3477 [doi]

Freestyle: a challenge-response system for hip hop lyrics via unsupervised induction of stochastic transduction grammarsDekai Wu, Karteek Addanki, Markus Saers. 3478-3482 [doi]

Toward transfer of acoustic cues of emphasis across languagesAndreas Tsiartas, Panayiotis G. Georgiou, Shrikanth Narayanan. 3483-3486 [doi]

Simple, lexicalized choice of translation timing for simultaneous speech translationTomoki Fujita, Graham Neubig, Sakriani Sakti, Tomoki Toda, Satoshi Nakamura. 3487-3491 [doi]

Noise adaptive training for subspace Gaussian mixture modelsLiang Lu, Arnab Ghoshal, Steve Renals. 3492-3496 [doi]

The IBM speech activity detection system for the DARPA RATS programGeorge Saon, Samuel Thomas, Hagen Soltau, Sriram Ganapathy, Brian Kingsbury. 3497-3501 [doi]

Conditional emission densities for combining speech enhancement and recognition systemsArmin Sehr, Takuya Yoshioka, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Roland Maas, Walter Kellermann. 3502-3506 [doi]

Channel selection using n-best hypothesis for multi-microphone ASRMartin Wolf, Climent Nadeu. 3507-3511 [doi]

Reverberant speech recognition based on denoising autoencoderTakaaki Ishii, Hiroki Komiyama, Takahiro Shinozaki, Yasuo Horiuchi, Shingo Kuroiwa. 3512-3516 [doi]

Adaptive stereo-based stochastic mappingShay Maymon, Pierre L. Dognin, Xiaodong Cui, Vaibhava Goel. 3517-3521 [doi]

The interplay of intonation and complex lexical tones: how speaker attitudes affect the realization of glottalization on Vietnamese sentence-final particlesThi Lan Nguyen, Alexis Michaud, Do Dat Tran, Dang-Khoa Mac. 3522-3526 [doi]

The voice prominence hypothesis: the interplay of F0 and voice source features in accentuationAilbhe Ní Chasaide, Irena Yanushevskaya, John Kane, Christer Gobl. 3527-3531 [doi]

Mora-based pre-low raising in Japanese pitch accentAlbert Lee, Yi Xu, Santitham Prom-on. 3532-3536 [doi]

Prosodic cues of sarcastic speech in French: slower, higher, widerHélène Loevenbruck, Mohamed Ameur Ben Jannet, Mariapaola D'Imperio, Mathilde Spini, Maud Champagne-Lavau. 3537-3541 [doi]

Correlates of contrastive focus in congenitally blind adults and sighted adultsLucie Ménard, Annie Leclerc, Mark K. Tiede, Amélie Prémont, Christine Turgeon, Paméla Trudeau-Fisette, Dominique Côté. 3542-3546 [doi]

Is protrusion of French rounded vowels affected by prosodic positions?Laurianne Georgeton, Nicolas Audibert. 3547-3551 [doi]

Intelligibility-enhancing speech modifications: the Hurricane ChallengeMartin Cooke, Catherine Mayo, Cassia Valentini-Botinhao. 3552-3556 [doi]

Statistical synthesizer with embedded prosodic and spectral modifications to generate highly intelligible speech in noiseDaniel Erro, Tudor-Catalin Zorila, Yannis Stylianou, Eva Navas, Inma Hernáez. 3557-3561 [doi]

Lombard modified text-to-speech synthesis for improved intelligibility: submission for the Hurricane Challenge 2013Antti Suni, Reima Karhila, Tuomo Raitio, Mikko Kurimo, Martti Vainio, Paavo Alku. 3562-3566 [doi]

Combining perceptually-motivated spectral shaping with loudness and duration modification for intelligibility enhancement of HMM-based synthetic speech in noiseCassia Valentini-Botinhao, Junichi Yamagishi, Simon King, Yannis Stylianou. 3567-3571 [doi]

Increasing speech intelligibility via spectral shaping with frequency warping and dynamic range compression plus transient enhancementElizabeth Godoy, Yannis Stylianou. 3572-3576 [doi]

Improving speech intelligibility in noise by SII-dependent preprocessing using frequency-dependent amplification and dynamic range compressionHenning F. Schepker, Jan Rennies, Simon Doclo. 3577-3581 [doi]

SII-based speech preprocessing for intelligibility improvement in noiseCees H. Taal, Jesper Jensen. 3582-3586 [doi]

Rephrasing-based speech intelligibility enhancementMengqiu Zhang, Petko N. Petkov, W. Bastiaan Kleijn. 3587-3591 [doi]

Information-preserving temporal reallocation of speech in the presence of fluctuating maskersVincent Aubanel, Martin Cooke. 3592-3596 [doi]

Preservation of speech spectral dynamics enhances intelligibilityPetko N. Petkov, W. Bastiaan Kleijn. 3597-3601 [doi]

An overview of the VUB entry for the 2013 Hurricane ChallengeHenk Brouckxon, Werner Verhelst. 3602-3604 [doi]

Improvement of speech intelligibility by reallocation of spectral energyReiko Takou, Nobumasa Seiyama, Atsushi Imai. 3605-3607 [doi]

Language-universal speech audiometry with automated scoringBart Vaerenberg, Louis ten Bosch, Wojtek Kowalczyk, Martine Coene, Herwig De Smet, Paul J. Govaerts. 3608-3612 [doi]

Balancing word lists in speech audiometry through large spoken language corporaAnnemiek Hammer, Bart Vaerenberg, Wojtek Kowalczyk, Louis ten Bosch, Martine Coene, Paul J. Govaerts. 3613-3616 [doi]

Developing an information system for deafVerónica López-Ludeña, Rubén San Segundo, Javier Ferreiros, José M. Pardo, E. Ferreiro. 3617-3621 [doi]

Dysarthric speech recognition using dysarthria-severity-dependent and speaker-adaptive modelsMyung Jong Kim, Joohong Yoo, Hoirin Kim. 3622-3626 [doi]

Voice pathology detection and classification using MPEG-7 audio low-level featuresGhulam Muhammad, Moutasem Melhem. 3627-3631 [doi]

Empirical mode decomposition-based spectral acoustic cues for disordered voices analysisAbdellah Kacha, Francis Grenez, Jean Schoentgen. 3632-3636 [doi]

Exemplar-based individuality-preserving voice conversion for articulation disorders in noisy environmentsRyo Aihara, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki. 3637-3641 [doi]

Combining in-domain and out-of-domain speech data for automatic recognition of disordered speechHeidi Christensen, M. B. Aniol, Peter Bell, Phil D. Green, Thomas Hain, Simon King, Pawel Swietojanski. 3642-3645 [doi]

Effects of envelope filter cutoff frequency on the intelligibility of Mandarin noise-vocoded speech in babble noise: implications for cochlear implantsGuangting Mai, James W. Minett, William S.-Y. Wang. 3646-3650 [doi]

Multi-session PLDA scoring of i-vector for partially open-set speaker detectionKong-Aik Lee, Anthony Larcher, Chang Huai You, Bin Ma, Haizhou Li. 3651-3655 [doi]

Impact of noise reduction and spectrum estimation on noise robust speaker identificationKeith W. Godin, Seyed Omid Sadjadi, John H. L. Hansen. 3656-3660 [doi]

Improvement of distant-talking speaker identification using bottleneck features of DNNTakanori Yamada, Longbiao Wang, Atsuhiko Kai. 3661-3664 [doi]

Geometric contamination for GMM/UBM speaker verification in reverberant environmentsAlessio Brutti, Maurizio Omologo. 3665-3669 [doi]

Towards a more efficient SVM supervector speaker verification system using Gaussian reduction and a tree-structured hashRichard D. McClanahan, Phillip L. De Leon. 3670-3673 [doi]

Improving the PLDA based speaker verification in limited microphone data conditionsAhilan Kanagasundaram, David Dean, Javier Gonzalez-Dominguez, Sridha Sridharan, Daniel Ramos, Joaquin Gonzalez-Rodriguez. 3674-3678 [doi]

The I3a speaker recognition system for NIST SRE12: post-evaluation analysisJesús Villalba López, Eduardo Lleida, Alfonso Ortega, Antonio Miguel. 3679-3683 [doi]

Text-dependent speaker recognition using PLDA with uncertainty propagationThemos Stafylakis, Patrick Kenny, Pierre Ouellet, J. Perez, M. Kockmann, Pierre Dumouchel. 3684-3688 [doi]

Robust speaker recognition using spectro-temporal autoregressive modelsSri Harish Reddy Mallidi, Sriram Ganapathy, Hynek Hermansky. 3689-3693 [doi]

Effect of multicondition training on i-vector PLDA configurations for speaker recognitionPadmanabhan Rajan, Tomi Kinnunen, Ville Hautamäki. 3694-3697 [doi]

Improving robustness to compressed speech in speaker recognitionMitchell McLaren, Victor Abrash, Martin Graciarena, Yun Lei, Jan Pesán. 3698-3702 [doi]

Modulation features for noise robust speaker identificationVikramjit Mitra, Mitchell McLaren, Horacio Franco, Martin Graciarena, Nicolas Scheffer. 3703-3707 [doi]

Minimax i-vector extractor for short duration speaker verificationVille Hautamäki, You-Chi Cheng, Padmanabhan Rajan, Chin-Hui Lee. 3708-3712 [doi]

Standoff speaker recognition: effects of recording distance mismatch on speaker recognition system performanceMike Fowler, Mark McCurry, Jonathan Bramsen, Kehinde Dunsin, Jeremiah Remus. 3713-3716 [doi]

Correlates to intelligibility in deviant child speech - comparing clinical evaluations to audience response system-based evaluations by untrained listenersSofia Strömbergsson, Christina Tånnander. 3717-3721 [doi]

Using linguistic analysis to characterize conceptual units of thought in spoken medical narrativesKathryn Womack, Cecilia Ovesdotter Alm, Cara Calvelli, Jeff B. Pelz, Pengcheng Shi, Anne R. Haake. 3722-3726 [doi]

Interacting with robots via speech and gestures, an integrated architectureFrancesco Cutugno, Alberto Finzi, Michelangelo Fiore, Enrico Leone, Silvia Rossi. 3727-3731 [doi]

Incorporating named entity recognition into the speech transcription processMohamed Hatmi, Christine Jacquin, Emmanuel Morin, Sylvain Meignier. 3732-3736 [doi]

DTW-distance-ordered spoken term detectionTeppei Ohno, Tomoyosi Akiba. 3737-3741 [doi]

Refining sentence similarity with discourse information in dialog systemSangkeun Jung, Seung-Hoon Na. 3742-3746 [doi]

Two-step correction of speech recognition errors based on n-gram and long contextual informationRyohei Nakatani, Tetsuya Takiguchi, Yasuo Ariki. 3747-3750 [doi]

Inferring actor communities from videosSumit Negi, Ramnath Balasubramanyan, Santanu Chaudhury. 3751-3755 [doi]

Multiple topic identification in telephone conversationsXavier Bost, Marc El-Bèze, Renato de Mori. 3756-3760 [doi]

Variable-span out-of-vocabulary named entity detectionWei Chen, Sankaranarayanan Ananthakrishnan, Rohit Prasad, Prem Natarajan. 3761-3765 [doi]

On the feasibility of using pupil diameter to estimate cognitive load changes for in-vehicle spoken dialoguesAndrew L. Kun, Oskar Palinko, Zeljko Medenica, Peter A. Heeman. 3766-3770 [doi]

Investigation of recurrent-neural-network architectures and learning methods for spoken language understandingGrégoire Mesnil, Xiaodong He, Li Deng, Yoshua Bengio. 3771-3775 [doi]

Paraphrase features to improve natural language understandingXiaohu Liu, Ruhi Sarikaya, Chris Brockett, Chris Quirk, William B. Dolan. 3776-3779 [doi]

A weakly-supervised approach for discovering new user intents from search query logsDilek Hakkani-Tür, Asli Çelikyilmaz, Larry P. Heck, Gökhan Tür. 3780-3784 [doi]

Exploiting shared information for multi-intent natural language sentence classificationPuyang Xu, Ruhi Sarikaya. 3785-3789 [doi]

An inter- and cross-disciplinary perspective of spoken language processingHiroya Fujisaki. 4005 [doi]

Progress and prospects for speech technology: what ordinary people thinkRoger K. Moore. 4006 [doi]
