- An Information-Extraction Approach to Speech Analysis and Processing. Chin-Hui Lee. 1-5
- Large Vocabulary Speech Recognition Using Deep Tensor Neural Networks. Dong Yu, Li Deng, Frank Seide. 6-9
- Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization. Brian Kingsbury, Tara N. Sainath, Hagen Soltau. 10-13
- Discriminative feature-space transforms using deep neural networks. George Saon, Brian Kingsbury. 14-17
- Context-Dependent MLPs for LVCSR: TANDEM, Hybrid or Both? Zoltán Tüske, Ralf Schlüter, Hermann Ney, Martin Sundermeyer. 18-21
- Recurrent Neural Networks for Noise Reduction in Robust ASR. Andrew L. Maas, Quoc V. Le, Tyler M. O'Neil, Oriol Vinyals, Patrick Nguyen, Andrew Y. Ng. 22-25
- Pipelined Back-Propagation for Context-Dependent Deep Neural Networks. Xie Chen, Adam Eversole, Gang Li, Dong Yu, Frank Seide. 26-29
- Arabic Dialect Identification - 'Is the Secret in the Silence?' and Other Observations. Hynek Boril, Abhijeet Sangwan, John H. L. Hansen. 30-33
- The 2011 NIST Language Recognition Evaluation. Craig S. Greenberg, Alvin F. Martin, Mark A. Przybocki. 34-37
- The BLZ Submission to the NIST 2011 LRE: Data Collection, System Development and Performance. Luis Javier Rodríguez-Fuentes, Mikel Peñagarikano, Amparo Varona, Mireia Díez, Germán Bordel, Alberto Abad, David Martínez González, Jesús A. Villalba, Alfonso Ortega, Eduardo Lleida. 38-41
- Phonotactic Language Recognition using i-vectors and Phoneme Posteriogram Counts. Luis Fernando D'Haro, Ondrej Glembek, Oldrich Plchot, Pavel Matejka, Mehdi Soufifar, Ricardo de Córdoba, Jan Cernocký. 42-45
- Supervector LDA: A New Approach to Reduced-Complexity I-vector Language Recognition. Alan McCree, Bengt J. Borgstrom. 46-49
- Patrol Team Language Identification System for DARPA RATS P1 Evaluation. Pavel Matejka, Oldrich Plchot, Mehdi Soufifar, Ondrej Glembek, Luis Fernando D'Haro, Karel Veselý, Frantisek Grézl, Jeff Z. Ma, Spyros Matsoukas, Najim Dehak. 50-53
- Articulatory Strategies in Obstruent Production in Mandarin Esophageal Speech. Fang Hu, Yungang Wu, Wen Xu, Demin Han. 54-57
- Consonantal space area in Children with a Cleft Palate An acoustic Study. Marion Bechet, Fabrice Hirsch, Camille Fauth, Rudolph Sock. 58-61
- Automated Dysarthria Severity Classification for Improved Objective Intelligibility Assessment of Spastic Dysarthric Speech. Milton Orlando Sarria Paja, Tiago H. Falk. 62-65
- Assessment of Disordered Voices Using Empirical Mode Decomposition in the Log-Spectral Domain. Abdellah Kacha, Francis Grenez, Jean Schoentgen. 66-69
- Learning an Artificial F0-Contour for ALT Speech. Anna Katharina Fuchs, Martin Hagmüller. 70-73
- Ultrax: An Animated Midsagittal Vocal Tract Display for Speech Therapy. Korin Richmond, Steve Renals. 74-77
- A Study of Mutual Information for GMM-Based Spectral Conversion. Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Yih-Ru Wang, Sin-Horng Chen. 78-81
- Bayesian Mixture of Probabilistic Linear Regressions for Voice Conversion. Na Li, Yu Qiao. 82-85
- Iterative MMSE Estimation of Vocal Tract Length Normalization Factors for Voice Transformation. Daniel Erro, Eva Navas, Inma Hernáez. 86-89
- A HMM approach to residual estimation for high resolution voice conversion. Winston S. Percybrooks, Elliot Moore. 90-93
- Implementation of Computationally Efficient Real-Time Voice Conversion. Tomoki Toda, Takashi Muramatsu, Hideki Banno. 94-97
- Effects of Speaker Adaptive Training on Tensor-based Arbitrary Speaker Conversion. Daisuke Saito, Nobuaki Minematsu, Keikichi Hirose. 98-101
- Discrimination of Linguistic and Non-Linguistic Vocalizations in Spontaneous Speech: Intra- and Inter-Corpus Perspectives. Felix Weninger, Björn Schuller. 102-105
- Accentual Transfer from Swiss-German to French. A Study of "Français Fédéral". Mathieu Avanzi, Pauline Dubosson, Sandra Schwab, Nicolas Obin. 106-109
- Phonology & the Interpretation of Fine Phonetic Detail in Berlin German. Stefanie Jannedy, Melanie Weirich. 110-113
- Evaluation of a formant-based speech-driven lip motion generation. Carlos Toshinori Ishi, Chaoran Liu, Hiroshi Ishiguro, Norihiro Hagita. 114-117
- Using spectral measures to differentiate Mandarin and Korean sibilant fricatives. Jeffrey Kallay, Jeffrey J. Holliday. 118-121
- EFL Conversational Triads: Foreigner-directed Speech and Hyperarticulation. Hua-Li Jian, Richard Konopka. 122-125
- Syllable perception depends on tone perception. Iris Chuoying Ouyang, Khalil Iskarous. 126-129
- Assessing agreement level between forced alignment models with data from endangered language documentation corpora. Christian DiCanio, Hosung Nam, Douglas H. Whalen, H. Timothy Bunnell, Jonathan D. Amith, Rey Castillo García. 130-133
- How consonants, dialect and speech rate affect vowel devoicing? Masako Fujimoto, Seiya Funatsu, Ichiro Fujimoto. 134-137
- Distance-Dependent Noise Reduction for Two-Channel Microphones. Thomas Fehér, Dietmar Richter, Oliver Jokisch, Rüdiger Hoffmann. 138-141
- Direction of Arrival Estimation Based on Subband Weighting for Noisy Conditions. Wei Xue, Wenju Liu. 142-145
- Binaural Noise Reduction Using Frequency-Warped FIR Filters. Jorge I. Marin-Hurtado, David V. Anderson. 146-149
- Exploring Off Time Nature for Speech Enhancement. Meng Yu, Jack Xin. 150-153
- Model-based Single-Channel Dereverberation in Noisy Acoustical Environments. Xulei Bao, Jie Zhu. 154-157
- An Auditory Inspired Multimodal Framework for Speech Enhancement. Majid Mirbagheri, Sahar Akram, Shihab A. Shamma. 158-161
- Binary Mask Estimation for Improved Speech Intelligibility in Reverberant Environments. Oldooz Hazrati, Jaewook Lee, Philipos C. Loizou. 162-165
- Enhancing Subjective Speech Intelligibility Using a Statistical Model of Speech. Petko N. Petkov, W. Bastiaan Kleijn, Gustav Eje Henter. 166-169
- Morpheme Level Feature-based Language Models for German LVCSR. Amr El-Desoky Mousa, M. Ali Basha Shaik, Ralf Schlüter, Hermann Ney. 170-173
- Tied-State Mixture Language Model for WFST-based Speech Recognition. Hitoshi Yamamoto, Paul R. Dixon, Shigeki Matsuda, Chiori Hori, Hideki Kashioka. 174-177
- Maximum Entropy Language Model Adaptation for Mobile Speech Input. Tanel Alumäe, Kaarel Kaljurand. 178-181
- Supervised and unsupervised Web-based language model domain adaptation. Gwénolé Lecorvé, John Dines, Thomas Hain, Petr Motlícek. 182-185
- A Hierarchical Bayesian Approach for Semi-supervised Discriminative Language Modeling. Yik-Cheung Tam, Paul Vozila. 186-189
- Leveraging Social Annotation for Topic Language Model Adaptation. Youzheng Wu, Kazuhiko Abe, Paul R. Dixon, Chiori Hori, Hideki Kashioka. 190-193
- LSTM Neural Networks for Language Modeling. Martin Sundermeyer, Ralf Schlüter, Hermann Ney. 194-197
- Phrasal Cohort Based Unsupervised Discriminative Language Modeling. Puyang Xu, Brian Roark, Sanjeev Khudanpur. 198-201
- Deriving conversation-based features from unlabeled speech for discriminative language modeling. Damianos Karakos, Brian Roark, Izhak Shafran, Kenji Sagae, Maider Lehr, Emily Tucker Prud'hommeaux, Puyang Xu, Nathan Glenn, Sanjeev Khudanpur, Murat Saraclar, Daniel M. Bikel, Mark Dredze, Chris Callison-Burch, Yuan Cao, Keith Hall, Eva Hasler, Philipp Koehn, Adam Lopez, Matt Post, Darcey Riley. 202-205
- Performance Comparison of Training Algorithms for Semi-Supervised Discriminative Language Modeling. Erinç Dikici, Arda Çelebi, Murat Saraclar. 206-209
- On-the-fly Topic Adaptation for YouTube Video Transcription. Kapil Thadani, Fadi Biadsy, Daniel M. Bikel. 210-213
- Portability of Semantic Annotations for Fast Development of Dialogue Corpora. Bassam Jabaian, Fabrice Lefèvre, Laurent Besacier. 214-217
- Optimization of Dialog Strategies using Automatic Dialog Simulation and Statistical Dialog Management Techniques. Zoraida Callejas, Ramón López-Cózar. 218-221
- Preference-learning based Inverse Reinforcement Learning for Dialog Control. Hiroaki Sugiyama, Toyomi Meguro, Yasuhiro Minami. 222-225
- A Data-driven Approach to Understanding Spoken Route Directions in Human-Robot Dialogue. Raveesh Meena, Gabriel Skantze, Joakim Gustafson. 226-229
- Detecting System-directed Utterances using Dialogue-level Features. Kazunori Komatani, Akira Hirano, Mikio Nakano. 230-233
- An Online Generated Transducer to Increase Dialog Manager Coverage. Joaquin Planells, Lluís F. Hurtado, Emilio Sanchis, Encarna Segarra. 234-237
- A Sequential Bayesian Dialog Agent for Computational Ethnography. Abe Kazemzadeh, James Gibson, Juanchen Li, Sungbok Lee, Panayiotis G. Georgiou, Shrikanth Narayanan. 238-241
- ClippyScript: A Programming Language for Multi-Domain Dialogue Systems. Frank Seide, Sean McDirmid. 242-245
- Correlation Between Model-based Approximations of Grounding-related Cognition and User Judgments. Klaus-Peter Engelbrecht, Sebastian Möller. 246-249
- Assessment of user simulators for spoken dialogue systems by means of subspace multidimensional clustering. Zoraida Callejas, David Griol, Klaus-Peter Engelbrecht. 250-253
- The INTERSPEECH 2012 Speaker Trait Challenge. Björn Schuller, Stefan Steidl, Anton Batliner, Elmar Nöth, Alessandro Vinciarelli, Felix Burkhardt, Rob van Son, Felix Weninger, Florian Eyben, Tobias Bocklet, Gelareh Mohammadi, Benjamin Weiss. 254-257
- On Speaker-Independent Personality Perception and Prediction from Speech. Tim Polzehl, Katrin Schoenenberg, Sebastian Möller, Florian Metze, Gelareh Mohammadi, Alessandro Vinciarelli. 258-261
- Speaker Personality Classification Using Systems Based on Acoustic-Lexical Cues and an Optimal Tree-Structured Bayesian Network. Kartik Audhkhasi, Angeliki Metallinou, Ming Li, Shrikanth Narayanan. 262-265
- Personality traits detection using a parallelized modified SFFS algorithm. Clément Chastagnol, Laurence Devillers. 266-269
- Feature Selection for Speaker Traits. Jouni Pohjalainen, Serdar Kadioglu, Okko Räsänen. 270-273
- A Frame Pruning Approach for Paralinguistic Recognition Tasks. Johannes Wagner, Florian Lingenfelser, Elisabeth André. 274-277
- Modulation Spectrum Analysis for Speaker Personality Trait Recognition. Alexei Ivanov, Xin Chen. 278-281
- A Comparison of Classification Paradigms for Speaker Likeability Determination. Nicholas Cummins, Julien Epps, Jia Min Karen Kua. 282-285
- Predicting Likability of Speakers with Gaussian Processes. Dingchao Lu, Fei Sha. 286-289
- Likability Classification - A Not so Deep Neural Network Approach. Raymond Brueckner, Björn Schuller. 290-293
- Genetic Algorithm Based Feature Selection for Speaker Trait Classification. Dongrui Wu. 294-297
- Microphone Array Post-filter based on Spatially-Correlated Noise Measurements for Distant Speech Recognition. Ken'ichi Kumatani, Bhiksha Raj, Rita Singh, John W. McDonough. 298-301
- Combining Bottleneck-BLSTM and Semi-Supervised Sparse NMF for Recognition of Conversational Speech in Highly Instationary Noise. Felix Weninger, Martin Wöllmer, Björn Schuller. 302-305
- Noise Compensation for Subspace Gaussian Mixture Models. Liang Lu, K. K. Chin, Arnab Ghoshal, Steve Renals. 306-309
- Combination of Sparse Classification and Multilayer Perceptron for Noise-robust ASR. Yang Sun, Mathew M. Doss, Jort F. Gemmeke, Bert Cranen, Louis ten Bosch, Lou Boves. 310-313
- Sub-band based Log-energy and Its Dynamic Range Stretching for Robust In-car Speech Recognition. Weifeng Li, Hervé Bourlard. 314-317
- Subspace Gaussian Mixture Models Based on Noise Compensation for Speech Recognition. Driss Matrouf, Georges Linares, Mickael Rouvier, Mohamed Bouallegue. 318-321
- "Help Me, I Need More User Tests!" User Simulations as Supportive Tool in the Development Process of Spoken Dialogue Systems. Florian Kretzschmar, Sebastian Möller. 322-325
- Caller Response Timing Patterns in Spoken Dialog Systems. Silke M. Witt. 326-329
- A Discriminative Classification-Based Approach to Information State Updates for a Multi-Domain Dialog System. Dilek Z. Hakkani-Tür, Gökhan Tür, Larry P. Heck, Ashley Fidler, Asli Çelikyilmaz. 330-333
- Learning When to Listen: Detecting System-Addressed Speech in Human-Human-Computer Dialog. Elizabeth Shriberg, Andreas Stolcke, Dilek Z. Hakkani-Tür, Larry P. Heck. 334-337
- Exploiting the Semantic Web for Unsupervised Natural Language Semantic Parsing. Gökhan Tür, Minwoo Jeong, Ye-Yi Wang, Dilek Hakkani-Tür, Larry P. Heck. 338-341
- Prosodic Entrainment in an Information-Driven Dialog System. Andrew Fandrianto, Maxine Eskenazi. 342-345
- Novel Metrics of Speech Rhythm for the Assessment of Emotion. Fabien Ringeval, Mohamed Chetouani, Björn W. Schuller. 346-349
- Temporal and Situational Context Modeling for Improved Dominance Recognition in Meetings. Martin Wöllmer, Florian Eyben, Björn W. Schuller, Gerhard Rigoll. 350-353
- Audiovisual correlates of basic emotions in blind and sighted people. Marc Swerts, Kitty Leuverink, Madelène Munnik, Vera Nijveld. 354-357
- Combining Ranking and Classification to Improve Emotion Recognition in Spontaneous Speech. Houwei Cao, Ragini Verma, Ani Nenkova. 358-361
- Active Learning by Sparse Instance Tracking and Classifier Confidence in Acoustic Emotion Recognition. Zixing Zhang, Björn Schuller. 362-365
- Emotion Recognition using Acoustic and Lexical Features. Viktor Rozgic, Sankaranarayanan Ananthakrishnan, Shirin Saleem, Rohit Kumar, Aravind Namandi Vembu, Rohit Prasad. 366-369
- Synthetic Speech Discrimination using Pitch Pattern Statistics Derived from Image Analysis. Phillip L. De Leon, Bryan Stewart, Junichi Yamagishi. 370-373
- Pitch-Scaled Analysis based Residual Reconstruction for Speech Analysis and Synthesis. Zhengqi Wen, Hideki Kawahara, Jianhua Tao. 374-377
- Robust Pitch Estimation Using l1-regularized Maximum Likelihood Estimation. Feng Huang, Tan Lee. 378-381
- A full-band adaptive harmonic representation of speech. Gilles Degottex, Yannis Stylianou. 382-385
- Deviation measure of waveform symmetry and its application to high-speed and temporally-fine F0 extraction for vocal sound texture manipulation. Hideki Kawahara, Masanori Morise, Ryuichi Nisimura, Toshio Irino. 386-389
- Hidden Markov Convolutive Mixture Model for Pitch Contour Analysis of Speech. Kota Yoshizato, Hirokazu Kameoka, Daisuke Saito, Shigeki Sagayama. 390-393
- Extrinsic normalization for vocal tracts depends on the signal, not on attention. Matthias Sjerps, James M. McQueen, Holger Mitterer. 394-397
- Perceptual Learning of /f/-/s/ by Older Listeners. Odette Scharenborg, Esther Janse, Andrea Weber. 398-401
- Correlation between vocal tract length, body height, formant frequencies, and pitch frequency for the five Japanese vowels uttered by fifteen male speakers. Hiroaki Hatano, Tatsuya Kitamura, Hironori Takemoto, Parham Mokhtari, Kiyoshi Honda, Shinobu Masaki. 402-405
- Detection of Transition Segments in VCV Utterances for Estimation of the Place of Closure of Oral Stops for Speech Training. K. S. Nataraj, P. C. Pandey. 406-409
- Audiovisual discrimination of CV syllables: a simultaneous fMRI-EEG study. Cyril Dubois, Rudolph Sock. 410-413
- Contribution of Spectral Shapes to Tone Perception. Natthawut Kertkeidkachorn, Surapol Vorapatratorn, Sirinart Tangruamsub, Proadpran Punyabukkana, Atiwong Suchato. 414-417
- Methodological Issues in Assessing Perceptual Representation of Consonant Sounds in Thai. Charturong Tantibundhit, Chutamanee Onsuwan, P. Phienphanich, Chai Wutiwiwatchai. 418-421
- Pitch and phonological perception of tone in the Suruí language of Rondônia (Brazil): identification task of LHL and LHH tonal patterns. Julien Meyer. 422-425
- The Role of Creaky Voice in Mandarin Tone 2 and Tone 3 Perception. Rui Cao, Ratree Wayland, Edith Kaan. 426-429
- Can litheners retune native categories acroth a thoneme boundary? Michael Tyler, Mona Faris. 430-433
- Synthetic F0 Can Effectively Convey Speaker ID in Delexicalized Speech. Eric Morley, Esther Klabbers, Jan P. H. van Santen, Alexander Kain, Seyed Hamidreza Mohammadi. 434-437
- Evaluating Prosodic Processing for Incremental Speech Synthesis. Timo Baumann, David Schlangen. 438-441
- Expressing Speaker's Intentions through Sentence-Final Intonations for Japanese Conversational Speech Synthesis. Kazuhiko Iwata, Tetsunori Kobayashi. 442-445
- Modeling Pause-Duration for Style-Specific Speech Synthesis. Alok Parlikar, Alan W. Black. 446-449
- Enumerating Differences Between Various Communicative Functions for Purposes of Czech Expressive Speech Synthesis in Limited Domain. Martin Gruber. 450-453
- Quality Analysis of Macroprosodic F0 Dynamics in Text-to-Speech Signals. Christoph Norrenbrock, Florian Hinterleitner, Ulrich Heute, Sebastian Möller. 454-457
- Improved Automatic Extraction of Generation Process Model Commands and Its use for Generating Fundamental Frequency Contours for Training HMM-based Speech Synthesis. Hiroya Hashimoto, Keikichi Hirose, Nobuaki Minematsu. 458-461
- Discontinuous Observation HMM for Prosodic-Event-Based F0 Generation. Tomoki Koriyama, Takashi Nose, Takao Kobayashi. 462-465
- Hierarchical English Emphatic Speech Synthesis Based on HMM with Limited Training Data. Fanbo Meng, Zhiyong Wu, Helen M. Meng, Jia Jia, Lianhong Cai. 466-469
- Employing Sentence Structure: Syntax Trees as Prosody Generators. Sarah Hoffmann, Beat Pfister. 470-473
- A Stochastic Model of Singing Voice F0 Contours for Characterizing Expressive Dynamic Components. Yasunori Ohishi, Hirokazu Kameoka, Daichi Mochihashi, Kunio Kashino. 474-477
- Study on Integration of Speaker Diarization with Speaker Adaptive Speech Recognition for Broadcast Transcription. Jan Silovský, Petr Cerva, Jindrich Zdánský, Jan Nouza. 478-481
- On the Use of Spectral and Iterative Methods for Speaker Diarization. Stephen Shum, Najim Dehak, Jim Glass. 482-485
- Where did I go wrong?: Identifying troublesome segments for speaker diarization systems. Mary Tai Knox, Nikki Mirghafori, Gerald Friedland. 486-489
- Speaker diarization of overlapping speech based on silence distribution in meeting recordings. Sree Harsha Yella, Fabio Valente. 490-493
- Phone Adaptive Training for Speaker Diarization. Simon Bozonnet, Ravichander Vipperla, Nicholas W. D. Evans. 494-497
- Compensating for Ageing and Quality variation in Speaker Verification. Finnian Kelly, Andrzej Drygajlo, Naomi Harte. 498-501
- Calibration of probabilistic age recognition. David A. van Leeuwen, Mohamad Hasan Bahari. 502-505
- Age Estimation from Telephone Speech using i-vectors. Mohamad Hasan Bahari, Mitchell McLaren, Hugo Van Hamme, David A. van Leeuwen. 506-509
- Is 'not bad' good enough? Aspects of unknown voices' likability. Benjamin Weiss, Felix Burkhardt. 510-513
- Multi-System Fusion of Extended Context Prosodic and Cepstral Features for Paralinguistic Speaker Trait Classification. Michelle Hewlett Sanchez, Aaron Lawson, Dimitra Vergyri, Harry Bratt. 514-517
- The log-Gabor method: speech classification using spectrogram image analysis. Harm Buisman, Eric Postma. 518-521
- Anchor Models and WCCN Normalization For Speaker Trait Classification. Yazid Attabi, Pierre Dumouchel. 522-525
- Pitch and Intonation Contribution to Speakers' Traits Classification. Claude Montacié, Marie-José Caraty. 526-529
- Text-dependent pathological voice detection. Gopala Krishna Anumanchipalli, Hugo Meinedo, Miguel Bugalho, Isabel Trancoso, Luís C. Oliveira, Alan W. Black. 530-533
- Intelligibility classification of pathological speech using fusion of multiple high level descriptors. Jangwon Kim, Naveen Kumar, Andreas Tsiartas, Ming Li, Shrikanth Narayanan. 534-537
- Interspeech Pathology Challenge: Investigations into Speaker and Sentence Specific Effects. Anthony P. Stark, Alireza Bayestehtashk, Meysam Asgari, Izhak Shafran. 538-541
- Automatic intelligibility assessment of pathologic speech in head and neck cancer based on auditory-inspired spectro-temporal modulations. Xinhui Zhou, Daniel Garcia-Romero, Nima Mesgarani, Maureen Stone, Carol Y. Espy-Wilson, Shihab A. Shamma. 542-545
- Detecting Intelligibility by Linear Dimensionality Reduction and Normalized Voice Quality Hierarchical Features. Dong-Yan Huang, Yongwei Zhu, Dajun Wu, Rongshan Yu. 546-549
- A factorized representation of FMLLR transform based on QR-decomposition. Shakti P. Rath, Martin Karafiát, Ondrej Glembek, Jan Cernocký. 551-554
- A Correlational Discriminant Approach to Feature Extraction for Robust Speech Recognition. Vikrant Singh Tomar, Richard C. Rose. 555-558
- Discriminative Training Using Non-uniform Criteria for Keyword Spotting on Spontaneous Speech. Chao Weng, Biing-Hwang Juang, Daniel Povey. 559-562
- Discriminative Reranking for LVCSR Leveraging Invariant Structure. Masayuki Suzuki, Gakuto Kurata, Masafumi Nishimura, Nobuaki Minematsu. 563-566
- Discriminative Fuzzy Clustering Maximum a Posterior Linear Regression for Speaker Adaptation. Ting-Yao Hu, Yu Tsao, Lin-Shan Lee. 567-570
- Simultaneous Discriminative Training and Mixture Splitting of HMMs for Speech Recognition. Muhammad Ali Tahir, Markus Nußbaum-Thom, Ralf Schlüter, Hermann Ney. 571-574
- Low-SNR, Speaker-Dependent Speech Enhancement using GMMs and MFCCs. Laura E. Boucheron, Phillip L. De Leon. 575-578
- Can modified casual speech reach the intelligibility of clear speech? Maria Koutsogiannaki, Michele Pettinato, Cassie Mayo, Varvara Kandia, Yannis Stylianou. 579-582
- Speech Enhancement Using Sparse Convolutive Non-negative Matrix Factorization with Basis Adaptation. Michael Carlin, Nicolas Malyska, Thomas F. Quatieri. 583-586
- Inventory-Based Audio-Visual Speech Enhancement. Dorothea Kolossa, Robert M. Nickel, Steffen Zeiler, Rainer Martin. 587-590
- Utilization of the Lombard effect in post-filtering for intelligibility enhancement of telephone speech. Emma Jokinen, Paavo Alku, Martti Vainio. 591-594
- Speech Enhancement by Online Non-negative Spectrogram Decomposition in Non-stationary Noise Environments. Zhiyao Duan, Gautham J. Mysore, Paris Smaragdis. 595-598
- Phoneme resistance during speech-in-speech comprehension. Léo Varnet, Julien Meyer, Michel Hoen, Fanny Meunier. 599-602
- Smile with a smile. Hugo Quené, Will Schuerman. 603-606
- Interactions Between Turn-taking Gaps, Disfluencies and Social Obligation. Rebecca Lunsford, Peter A. Heeman, Jan P. H. van Santen. 607-610
- Effect of being seen on the production of visible speech cues. A pilot study on Lombard speech. Maeva Garnier, Lucie Ménard, Gabrielle Richard. 611-614
- Temporal entrainment in overlapped speech: Cross-linguistic study. Marcin Wlodarczak, Juraj Simko, Petra Wagner. 615-618
- Based on Isolated Saliency or Causal Integration? Toward a Better Understanding of Human Annotation Process using Multiple Instance Learning and Sequential Probability Ratio Test. Chi-Chun Lee, Athanasios Katsamanis, Panayiotis G. Georgiou, Shrikanth Narayanan. 619-622
- Text-To-Speech Intelligibility Across Speech Rates. Ann K. Syrdal, H. Timothy Bunnell, Susan R. Hertz, Taniya Mishra, Murray F. Spiegel, Corine Bickley, Deborah Rekart, Matthew J. Makashay. 623-626
- Objective Intelligibility Assessment of Text-to-Speech System using Template Constrained Generalized Posterior Probability. Linfang Wang, Lijuan Wang, Yan Teng, Zhe Geng, Frank K. Soong. 627-630
- Mel cepstral coefficient modification based on the Glimpse Proportion measure for improving the intelligibility of HMM-generated synthetic speech in noise. Cassia Valentini-Botinhao, Junichi Yamagishi, Simon King. 631-634
- Speech-in-noise intelligibility improvement based on spectral shaping and dynamic range compression. Tudor-Catalin Zorila, Varvara Kandia, Yannis Stylianou. 635-638
- Implementation of Simple Spectral Techniques to Enhance the Intelligibility of Speech using a Harmonic Model. Daniel Erro, Yannis Stylianou, Eva Navas, Inma Hernáez. 639-642
- Making Conversational Vowels More Clear. Seyed Hamidreza Mohammadi, Alexander Kain, Jan P. H. van Santen. 643-646
- Naturalness Judgement of Prosodic Variation of Japanese Utterances with Prosody Modified Stimuli. Chiharu Tsurutani, Shunichi Ishihara. 647-650
- Effects of Dialectal Origin on Articulation Rate in French. Mathieu Avanzi, Pauline Dubosson, Sandra Schwab. 651-654
- A New Approach of Speaking Rate Modeling for Mandarin Speech Prosody. Chiao-Hua Hsieh, Chen-Yu Chiang, Yih-Ru Wang, Hsiu-Min Yu, Sin-Horng Chen. 655-658
- Modelling pause duration as a function of contextual length. David Doukhan, Albert Rilliard, Sophie Rosset, Christophe d'Alessandro. 659-662
- Production and Perception of Focus in PFC and non-PFC Languages: Comparing Beijing Mandarin and Hainan Tsat. Bei Wang, Chenxia Li, Qian Wu, Xiaxia Zhang, Baofeng Wang, Yi Xu. 663-666
- Prosodic Realization of Focus in Statement and Question in Tibetan (Lhasa Dialect). Xiaxia Zhang, Bei Wang, Qian Wu, Yi Xu. 667-670
- Effect of noise type and level on focus related fundamental frequency changes. Martti Vainio, Daniel Aalto, Antti Suni, Anja Arnhold, Tuomo Raitio, Henri Seijo, Juhani Järvikivi, Paavo Alku. 671-674
- Role of Prosody in Automatic Modality Recognition of Bangla Speech. Anal Warsi, Tulika Basu, Debasis Mazumdar. 675-678
- Where to associate stressed additive particles? Evidence from speech prosody. Bettina Braun. 679-682
- From PVI to Perception: A Return to the Roots of Rhythm in Broadcast News. Matthew Benton. 683-686
- A methodology for the study of rhythm in drummed forms of languages: application to Bora Manguaré of Amazon. Julien Meyer, Laure Dentel, Frank Seifart. 687-690
- Automatic Detection of High Vocal Effort in Telephone Speech. Jouni Pohjalainen, Tuomo Raitio, Hannu Pulakka, Paavo Alku. 691-694
- Analysis of Mimicry Speech. Gomathi D, Sathya Adithya Thati, Karthik Venkat Sridaran, Yegnanarayana B. 695-698
- Estimation of the vocal tract shape of nasals using a Bayesian scheme. Christian H. Kasess, Wolfgang Kreuzer, Ewald Enzinger, Nadja Kerschhofer-Puhalo. 699-702
- Advances in combined electro-optical palatography. Peter Birkholz, Philippe Daechert, Christiane Neuschaefer-Rube. 703-706
- Noise Robust Pitch Tracking by Subband Autocorrelation Classification. Byung Suk Lee, Daniel P. W. Ellis. 707-710
- Inference of Critical Articulator Position for Fricative Consonants. Alexander Sepúlveda, Rodrigo Capobianco Guido, Germán Castellanos-Domínguez. 711-714
- Vocal Tremor Measurement Based on Autocorrelation of Contours. Markus Brückl. 715-718
- Model-based Duration-difference Approach on Accent Evaluation of L2 Learner. Chatchawarn Hansakunbuntheung, Ananlada Chotimongkol, Sumonmas Thatphithakkul, Patcharika Chootrakool. 719-722
- Continuous Articulatory-to-Acoustic Mapping using Phone-based Trajectory HMM for a Silent Speech Interface. Thomas Hueber, Gérard Bailly, Bruce Denby. 723-726
- Prediction of Turn-Taking by Combining Prosodic and Eye-Gaze Information in Poster Conversations. Tatsuya Kawahara, Takuma Iwatate, Katsuya Takanashi. 727-730
- Using Quality Ratings to Predict Modality Choice in Multimodal Systems. Ina Wechsung, Klaus-Peter Engelbrecht, Sebastian Möller. 731-734
- HMM Based Continuous EOG Recognition for Eye-input Speech Interface. Fuming Fang, Takahiro Shinozaki, Yasuo Horiuchi, Shingo Kuroiwa, Sadaoki Furui, Toshimitsu Musha. 735-738
- A Random, Semantically Appropriate Sentence Generator for Speaker Verification. Jason Lilley, Amanda Stent, Ilija Zeljkovic. 739-742
- Coherent Topic Transition in a Conversational Agent. Daniel Macías Galindo, Wilson Wong, Lawrence Cavedon, John Thangarajah. 743-746
- Using Reinforcement Learning for Dialogue Management Policies: Towards Understanding MDP Violations and Convergence. Peter A. Heeman, Jordan Fryer, Rebecca Lunsford, Andrew Rueckert, Ethan Selfridge. 747-750
- Enhancing Speech Understanding in Spoken Dialogue Systems by Means of a New Frame-Correction Technique. Ramón López-Cózar, Zoraida Callejas, David Griol. 751-754
- Prosodic Cues to Disengagement and Uncertainty in Physics Tutorial Dialogues. Diane J. Litman, Heather Friedberg, Katherine Forbes-Riley. 755-758
- Spoken Dialogs With a Virtual Science Tutor. Wayne Ward, Daniel Bolaños, Ronald A. Cole. 759-762
- Real-Time Lecture Transcription using ASR for Czech Hearing Impaired or Deaf Students. Petr Cerva, Jan Silovský, Jindrich Zdánský, Jan Nouza, Jirí Málek. 763-766
- Application of Structural Events Detected on ASR Outputs for Automated Speaking Assessment. Lei Chen, Su-Youn Yoon. 767-770
- Addressing Confusions in Spoken Language in ESL Pronunciation Tutors. Oscar Saz, Maxine Eskenazi. 771-774
- The Use of DBN-HMMs for Mispronunciation Detection and Diagnosis in L2 English to Support Computer-Aided Pronunciation Training. Xiaojun Qian, Helen M. Meng, Frank K. Soong. 775-778
- Practice and feedback in L2 speaking: an evaluation of the DISCO CALL system. Catia Cucchiarini, Joost van Doremalen, Helmer Strik. 779-782
- Cross-speaker Acoustic-to-Articulatory Inversion using Phone-based Trajectory HMM for Pronunciation Training. Thomas Hueber, Atef Ben Youssef, Gérard Bailly, Pierre Badin, Frédéric Elisei. 783-786
- MAP Estimation of Whole-Word Acoustic Models with Dictionary Priors. Keith Kintzley, Aren Jansen, Hynek Hermansky. 787-790
- Data-driven Posterior Features for Low Resource Speech Recognition Applications. Samuel Thomas, Sriram Ganapathy, Aren Jansen, Hynek Hermansky. 791-794
- Sparse Bayesian Factor Analysis for Stereo-based Stochastic Mapping. Xiaodong Cui, Mohamed Afify, George Saon, Vaibhava Goel. 795-798
- Word Discovery with Beta Process Factor Analysis. Niklas Vanhainen, Giampiero Salvi. 799-802
- Speaker Adaptation Using Variational Bayesian Linear Regression in Normalized Feature Space. Seong-Jun Hahm, Atsunori Ogawa, Masakiyo Fujimoto, Takaaki Hori, Atsushi Nakamura. 803-806
- Bayesian Feature Enhancement for ASR of Noisy Reverberant Real-World Data. Alexander Krueger, Oliver Walter, Volker Leutnant, Reinhold Haeb-Umbach. 807-810
- Robust Tracking for Automatic Reading Tutors. Emre Yilmaz, Dirk Van Compernolle, Hugo Van Hamme. 811-814
- Maximum F1-Score Discriminative Training for Automatic Mispronunciation Detection in Computer-Assisted Language Learning. Huang Hao, Jianming Wang, Halidan Abudureyimu. 815-818
- Error Pattern Detection Integrating Generative and Discriminative Learning for Computer-Aided Pronunciation Training. Yow-Bang Wang, Lin-Shan Lee. 819-822
- The Automatic Assessment of Non-native Prosody: Combining Classical Prosodic Analysis with Acoustic Modelling. Florian Hönig, Tobias Bocklet, Korbinian Riedhammer, Anton Batliner, Elmar Nöth. 823-826
- Improving L1-Specific Phonological Error Diagnosis in Computer Assisted Pronunciation Training. Theban Stanley, Kadri Hacioglu. 827-830
- A Self-Learning Assistive Vocal Interface Based on Vocabulary Learning and Grammar Induction. Jort F. Gemmeke, Janneke van de Loo, Guy De Pauw, Joris Driesen, Hugo Van Hamme, Walter Daelemans. 831-834
- Contrasting Cues to Verbal and Non-Verbal Backchannels in Multi-lingual Dyadic Rapport. Gina-Anne Levow, Susan Duncan. 835-838
- Prosodic measurements and question types in the Spontal corpus of Swedish dialogues. Sofia Strömbergsson, Jens Edlund, David House. 839-842
- Measuring prosodic alignment in cooperative task-based conversations. Khiet P. Truong, Dirk Heylen. 843-846
- On the Dynamics of Overlap in Multi-Party Conversation. Kornel Laskowski, Mattias Heldner, Jens Edlund. 847-850
- On the acoustics of overlapping laughter in conversational speech. Khiet P. Truong, Jürgen Trouvain. 851-854
- A Corpus-Based Study of Interruptions in Spoken Dialogue. Agustín Gravano, Julia Hirschberg. 855-858
- On the Modeling of Voiceless Stop Sounds of Speech using Adaptive Quasi-Harmonic Models. George P. Kafentzis, Olivier Rosec, Yannis Stylianou. 859-862
- An alignment matching method to explore pseudosyllable properties across different corpora. Raymond W. M. Ng, Thomas Hain, Keikichi Hirose. 863-866
- Deep Architectures for Articulatory Inversion. Benigno Uria, Iain Murray, Steve Renals, Korin Richmond. 867-870
- Automatic Measurement of Positive and Negative Voice Onset Time. Katharine Henry, Morgan Sonderegger, Joseph Keshet. 871-874
- Efficient multipulse approximation of speech excitation using the most singular manifold. Vahid Khanagha, Khalid Daoudi. 875-878
- Intrinsic Spectral Analysis for Zero and High Resource Speech Recognition. Aren Jansen, Samuel Thomas, Hynek Hermansky. 879-882
- Computational Modelling of the Recognition of Foreign-Accented Speech. Odette Scharenborg, Marijt J. Witteman, Andrea Weber. 883-886
- The production and perception of Estonian quantity degrees by native and non-native speakers. Lya Meister, Einar Meister. 887-890
- Perception of the moraic obstruent /Q/: a cross-linguistic study. Makiko Sadakata, Mizuki Shingai, Alex Brandmeyer, Kaoru Sekiyama. 891-894
- Comparative Analysis of Intensity between Native Speakers and Japanese Speakers of English. Tomoko Nariai, Kazuyo Tanaka, Tatsuya Kawahara. 895-898
- Auditory and Dynamic Modeling Paradigms to Detect L2 Mispronunciations. Christos Koniaris, Olov Engwall, Giampiero Salvi. 899-902
- Cross Linguistic Comparison of Mandarin and English EMA Articulatory Data. Sheng Li, Lan Wang. 903-906
- Physiological and acoustic study of word initial post-lexical gemination in Moroccan Arabic. Chakir Zeroual, Diamantis Gafos, Phil Hoole, John H. Esling. 907-910
- Perceptual Assimilation of Arabic Voiceless Fricatives by English MonolingualsMichael Tyler, Sarah Fenwick. 911-914 [doi]
- Non-auditory cognitive capabilities in computational modeling of early language acquisitionOkko Räsänen. 915-918 [doi]
- Modeling spoken language acquisition with a generic cognitive architecture for associative learningOkko Räsänen, Heikki Rasilo, Unto K. Laine. 919-922 [doi]
- Pitch Estimation Based on Long Frame Harmonic Model and Short Frame Average Correlation CoefficientDongmei Wang, Philipos C. Loizou. 923-926 [doi]
- Diagnostic Prediction of Transmitted Speech Quality: A New Framework for Signal-based and Parametric ModelsSebastian Möller, Marcel Wältermann, Nicolas Côté. 927-930 [doi]
- Enumerative Algebraic Coding for ACELPTom Bäckström. 931-934 [doi]
- Speech Enhancement With Bivariate Gamma ModelAtanu Saha, Tetsuya Shimamura. 935-938 [doi]
- Improvements of the Beta-Order Minimum Mean-Square Error (MMSE) Spectral Amplitude Estimator using Chi PriorsMarek B. Trawicki, Michael T. Johnson. 939-942 [doi]
- Enhancing Speech by Reconstruction from Robust Acoustic FeaturesPhilip Harding, Ben Milner. 943-946 [doi]
- Joint Pitch-Analysis Formant-Synthesis framework for CS recovery of speechSrikanth Raj Chetupally, Thippur V. Sreenivas. 947-950 [doi]
- A new noise-tracking algorithm for generalizing binary time-frequency (T-F) masking to ratio maskingShan Liang, Wei Jiang, Wenju Liu. 951-954 [doi]
- Optimised spectral weightings for noise-dependent speech intelligibility enhancementYan Tang, Martin Cooke. 955-958 [doi]
- Exploring Rich Expressive Information from Audiobook Data Using Cluster Adaptive TrainingLangzhou Chen, Mark J. F. Gales, Vincent Wan, Javier Latorre, Masami Akamine. 959-962 [doi]
- Turning a Monolingual Speaker into Multilingual for a Mixed-language TTSJi He, Yao Qian, Frank K. Soong, Sheng Zhao. 963-966 [doi]
- Using HMM-based Speech Synthesis to Reconstruct the Voice of Individuals with Degenerative Speech DisordersChristophe Veaux, Junichi Yamagishi, Simon King. 967-970 [doi]
- Speech factorization for HMM-TTS based on cluster adaptive trainingJavier Latorre, Vincent Wan, Mark J. F. Gales, Langzhou Chen, K. K. Chin, Kate Knill, Masami Akamine. 971-974 [doi]
- Factored MLLR Adaptation Algorithm for HMM-based Expressive TTSJune Sig Sung, Doo Hwa Hong, Hyun Woo Koo, Nam Soo Kim. 975-978 [doi]
- Speaker-adaptive visual speech synthesis in the HMM-frameworkDietmar Schabus, Michael Pucher, Gregor Hofer. 979-982 [doi]
- Cross-lingual Speaker Adaptation for HMM-based Speech Synthesis based on Perceptual Characteristics and Speaker InterpolationViviane de Franca Oliveira, Sayaka Shiota, Yoshihiko Nankaku, Keiichi Tokuda. 983-986 [doi]
- C2H: A Computational Model of H&H-based Phonetic Contrast in Synthetic SpeechMauro Nicolao, Javier Latorre, Roger K. Moore. 987-990 [doi]
- Vowel Creation by Articulatory Control in HMM-based Parametric Speech SynthesisZhen-Hua Ling, Korin Richmond, Junichi Yamagishi. 991-994 [doi]
- Analysis of speaker clustering strategies for HMM-based speech synthesisRasmus Dall, Christophe Veaux, Junichi Yamagishi, Simon King. 995-998 [doi]
- Word Relevance Modeling for Speech RecognitionKuan-Yu Chen, Hao-Chin Chang, Berlin Chen, Hsin-Min Wang. 999-1002 [doi]
- Using context-free grammars for embedded speech recognition with Weighted Finite-State TransducersFrank Duckhorn, Rüdiger Hoffmann. 1003-1006 [doi]
- Automatic transcription error recovery for Person Name RecognitionRichard Dufour, Géraldine Damnati, Delphine Charlet, Frédéric Béchet. 1007-1010 [doi]
- Efficient Beam Width Control to Suppress Excessive Speech Recognition Computation Time Based on Prior Score Range NormalizationSatoshi Kobashikawa, Takaaki Hori, Yoshikazu Yamaguchi, Taichi Asami, Hirokazu Masataki, Satoshi Takahashi. 1011-1014 [doi]
- Search Space Pruning Based on Anticipated Path Recombination in LVCSRDavid Nolden, Ralf Schlüter, Hermann Ney. 1015-1018 [doi]
- Estimating Word-Stability During Incremental Speech RecognitionIan McGraw, Alexander Gruenstein. 1019-1022 [doi]
- Using broad phonetic classes to guide search in automatic speech recognitionStefan Ziegler, Bogdan Ludusan, Guillaume Gravier. 1023-1026 [doi]
- Parallel combination of multilingual speech streams for improved ASRJoão Miranda, João Paulo Neto, Alan W. Black. 1027-1030 [doi]
- Low latency combination of parallelized single-pass LVCSR systemsFethi Bougares, Mickael Rouvier, Yannick Estève, Georges Linarès. 1031-1034 [doi]
- Efficient On-The-Fly Hypothesis Rescoring in a Hybrid GPU/CPU-based Large Vocabulary Continuous Speech Recognition EngineJungsuk Kim, Jike Chong, Ian R. Lane. 1035-1038 [doi]
- Fully Automated Neuropsychological Assessment for Detecting Mild Cognitive ImpairmentMaider Lehr, Emily Tucker Prud'hommeaux, Izhak Shafran, Brian Roark. 1039-1042 [doi]
- Spontaneous-Speech Acoustic-Prosodic Features of Children with Autism and the Interacting PsychologistDaniel Bone, Matthew P. Black, Chi-Chun Lee, Marian E. Williams, Pat Levitt, Sungbok Lee, Shrikanth Narayanan. 1043-1046 [doi]
- Contrastive intonation in autism: The effect of speaker- and listener-perspectiveConstantijn Kaland, Emiel Krahmer, Marc Swerts. 1047-1050 [doi]
- Characterizing Covert Articulation in Apraxic Speech Using real-time MRIChristina Hagedorn, Michael I. Proctor, Louis Goldstein, Maria Luisa Gorno-Tempini, Shrikanth Narayanan. 1051-1054 [doi]
- Automatic word naming recognition for treatment and assessment of aphasiaAlberto Abad, Anna Pompili, Ângela Costa, Isabel Trancoso. 1055-1058 [doi]
- Vocal-Source Biomarkers for Depression: A Link to Psychomotor ActivityThomas F. Quatieri, Nicolas Malyska. 1059-1062 [doi]
- Discriminatively learning factorized finite state pronunciation models from dynamic Bayesian networksPreethi Jyothi, Eric Fosler-Lussier, Karen Livescu. 1063-1066 [doi]
- Joint Decoding for Speech Recognition and Semantic TaggingAnoop Deoras, Ruhi Sarikaya, Gökhan Tür, Dilek Z. Hakkani-Tür. 1067-1070 [doi]
- Investigation of Maximum Entropy Hybrid Language Models for Open Vocabulary German and Polish LVCSRM. Ali Basha Shaik, Amr El-Desoky Mousa, Ralf Schlüter, Hermann Ney. 1071-1074 [doi]
- A Specialized WFST Approach for Class Models and Dynamic VocabularyPaul R. Dixon, Chiori Hori, Hideki Kashioka. 1075-1078 [doi]
- Dynamic Grammars with Lookahead Composition for WFST-based Speech RecognitionJosef R. Novak, Nobuaki Minematsu, Keikichi Hirose. 1079-1082 [doi]
- Knowledge-Based Word Lattice Rescoring in a Dynamic ContextTodd Shore, Friedrich Faubel, Hartmut Helmke, Dietrich Klakow. 1083-1086 [doi]
- Mixture Component Clustering for Efficient Speaker VerificationRichard McClanahan, Phillip L. De Leon. 1087-1090 [doi]
- Front-end Channel Compensation using Mixture-dependent Feature Transformations for i-Vector Speaker RecognitionTaufiq Hasan, John H. L. Hansen. 1091-1094 [doi]
- Query-by-Example using Speaker Content GraphsWilliam M. Campbell, Elliot Singer. 1095-1098 [doi]
- Unsupervised NAP Training Data Design for Speaker RecognitionHanwu Sun, Bin Ma. 1099-1102 [doi]
- The Role of Score Calibration in Speaker RecognitionGeorge R. Doddington. 1103-1106 [doi]
- A Bayesian Approach to Speaker Recognition Based on GMMs Using Multiple Model StructuresTakafumi Hattori, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda. 1107-1110 [doi]
- Similarities in fundamental frequency in infant speech segmentation modelsEllen Marklund, Francisco Lacerda, Iris-Corinna Schwarz, Ulla Sundberg. 1111-1114 [doi]
- Phonological complexity and vocabulary size in 30-month-old Swedish childrenUlrika Marklund, Ulla Sundberg, Iris-Corinna Schwarz, Francisco Lacerda. 1115-1118 [doi]
- Auditory-visual speech to infants and adults: signals and correlationsJeesun Kim, Chris Davis, Christine Kitamura. 1119-1122 [doi]
- Objective Child Vocal Development Measurement with Naturalistic Daylong Audio RecordingDongxin Xu, Jill Gilkerson, Jeffery Richards. 1123-1126 [doi]
- Speech Production-Perception Relationships in Children with Speech DelayKyoko Nagao, Mark Paullin, Vilena Livinsky, James B. Polikoff, Linda D. Vallino, Thierry G. Morlet, N. Carolyn Schanen, H. Timothy Bunnell. 1127-1130 [doi]
- Synthetic correction of deviant speech - children's perception of phonologically modified recordings of their own speechSofia Strömbergsson. 1131-1134 [doi]
- Combining multiple high quality corpora for improving HMM-TTSVincent Wan, Javier Latorre, K. K. Chin, Langzhou Chen, Mark J. F. Gales, Heiga Zen, Kate Knill, Masami Akamine. 1135-1138 [doi]
- An Evaluation of Parameter Generation Methods with Rich Context Models in HMM-Based Speech SynthesisShinnosuke Takamichi, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai, Sakriani Sakti, Satoshi Nakamura. 1139-1142 [doi]
- Using Bayesian Networks to find relevant context features for HMM-based speech synthesisHeng Lu, Simon King. 1143-1146 [doi]
- Considering Global Variance of the Log Power Spectrum Derived from Mel-Cepstrum in HMM-based Parametric Speech SynthesisXiang Yin, Zhen-Hua Ling, Ming Lei, Li-Rong Dai. 1147-1150 [doi]
- A speech parameter generation algorithm using local variance for HMM-based speech synthesisVataya Chunwijitra, Takashi Nose, Takao Kobayashi. 1151-1154 [doi]
- Histogram-based spectral equalization for HMM-based speech synthesis using mel-LSPYamato Ohtani, Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima, Masami Akamine. 1155-1158 [doi]
- Improving Recognition of Speaker States and Traits by Cumulative Evidence: Intoxication, Sleepiness, Age and GenderFelix Weninger, Erik Marchi, Björn Schuller. 1159-1162 [doi]
- Speaker Clustering in Emotion RecognitionNi Ding, Julien Epps. 1163-1166 [doi]
- Automatic detection of conflict escalation in spoken conversationsSamuel Kim, Sree Harsha Yella, Fabio Valente. 1167-1170 [doi]
- The entropy of intoxicated speech - lexical creativity and heavy tonguesUwe D. Reichel. 1171-1174 [doi]
- A Robust Unsupervised Arousal Rating Framework using Prosody with Cross-Corpora EvaluationDaniel Bone, Chi-Chun Lee, Shrikanth S. Narayanan. 1175-1178 [doi]
- Unveiling the Acoustic Properties that Describe the Valence DimensionCarlos Busso, Tauhidur Rahman. 1179-1182 [doi]
- Annotation and Recognition of Personality Traits in Spoken Conversations from the AMI Meetings CorpusFabio Valente, Samuel Kim, Petr Motlícek. 1183-1186 [doi]
- The Effects of Lexical Tones and Nasal Coda /-n/ to Sadness in Taiwan HakkaShao-Ren Lyu. 1187-1190 [doi]
- Comparing different acoustic modeling techniques for multilingual boostingDavid Imseng, John Dines, Petr Motlícek, Philip N. Garner, Hervé Bourlard. 1191-1194 [doi]
- Model-based approaches to adaptive training in reverberant environmentsYongqiang Wang, Mark J. F. Gales. 1195-1198 [doi]
- Model-Based Approaches for Degraded Channel Modelling in Robust ASRMark J. F. Gales, Federico Flego. 1199-1202 [doi]
- Improved Model Selection for the ASR-Driven Binary MaskWilliam Hartmann, Eric Fosler-Lussier. 1203-1206 [doi]
- Accelerated Batch Learning of Convex Log-linear Models for LVCSRSimon Wiesler, Ralf Schlüter, Hermann Ney. 1207-1210 [doi]
- Improving Discriminative Training for Robust Acoustic Models in Large Vocabulary Continuous Speech RecognitionJanne Pylkkönen, Mikko Kurimo. 1211-1214 [doi]
- Semi-Supervised Methods for Improving Keyword Search of Unseen TermsScott Novotney, Ivan Bulyko, Richard M. Schwartz, Sanjeev Khudanpur, Owen Kimball. 1215-1218 [doi]
- Probabilistic Speaker-Class based Acoustic Modeling for Large Vocabulary Continuous Speech RecognitionXiangang Li, Dan Su, Zaihu Pang, Xihong Wu. 1219-1222 [doi]
- Classification of Stressed Speech Using Physical Parameters Derived from Two-Mass ModelXiao Yao, Takatoshi Jitsuhiro, Chiyomi Miyajima, Norihide Kitaoka, Kazuya Takeda. 1223-1226 [doi]
- IVN-Based Joint Training Of GMM And HMMs Using An Improved VTS-Based Feature Compensation For Noisy Speech RecognitionJun Du, Qiang Huo. 1227-1230 [doi]
- Amplitude Modulation Filters as Feature Sets for Robust ASR: Constant Absolute or Relative Bandwidth?Niko Moritz, Jörn Anemüller, Birger Kollmeier. 1231-1234 [doi]
- Effect of speech priors in single-channel speech-music separation for ASRCemil Demir, Ali Taylan Cemgil, Murat Saraçlar. 1235-1238 [doi]
- On the Role of Binary Mask Pattern in Automatic Speech RecognitionArun Narayanan, DeLiang Wang. 1239-1242 [doi]
- Dereverberation based on Wavelet Packet Filtering for Robust Automatic Speech RecognitionTatsuya Kawahara, Randy Gomez. 1243-1246 [doi]
- Spectral Intersections for Non-Stationary Signal SeparationTrausti T. Kristjansson, Thad Hughes. 1247-1250 [doi]
- Speech Recognition by Denoising and Dereverberation Based on Spectral Subtraction in a Real Noisy Reverberant EnvironmentKyohei Odani, Longbiao Wang, Atsuhiko Kai. 1251-1254 [doi]
- Q-Gaussian based spectral subtraction for robust speech recognitionHilman Ferdinandus Pardede, Koichi Shinoda, Koji Iwano. 1255-1258 [doi]
- Hooking up spectro-temporal filters with auditory-inspired representations for robust automatic speech recognitionBernd T. Meyer, Constantin Spille, Birger Kollmeier, Nelson Morgan. 1259-1262 [doi]
- Feature extraction based on hearing system signal processing for robust large vocabulary speech recognitionPeter Li, Xie Sun. 1263-1266 [doi]
- Automatic estimation of the first two subglottal resonances in children's speech with application to speaker normalization in limited-data conditionsHarish Arsikere, Gary K. F. Leung, Steven M. Lulich, Abeer Alwan. 1267-1270 [doi]
- Real-time Visualization of English Pronunciation on an IPA Chart Based on Articulatory Feature ExtractionYurie Iribe, Takurou Mori, Kouichi Katsurada, Goh Kawai, Tsuneo Nitta. 1271-1274 [doi]
- Acoustic Feature-based Non-scorable Response Detection for an Automated Speaking Proficiency AssessmentJe Hun Jeon, Su-Youn Yoon. 1275-1278 [doi]
- Pronunciation quality evaluation of sentences by combining word based scoresJorge Wuth, Néstor Becerra Yoma, Leopoldo Benavides, Hiram Vivanco. 1279-1282 [doi]
- Designing a spoken language interface for a tutorial dialogue systemPeter Bell, Myroslava Dzikovska, Amy Isard. 1283-1286 [doi]
- Automatic Pronunciation Error Detection Based on Extended Pronunciation Space Using the Unsupervised Clustering of Pronunciation ErrorsLong Zhang, Haifeng Li. 1287-1290 [doi]
- Less errors with TTS? A dictation experiment with foreign language learnersThomas Pellegrini, Ângela Costa, Isabel Trancoso. 1291-1294 [doi]
- Improvement in Automatic Pronunciation Scoring using Additional Basic Scores and Learning to RankLiang-Yu Chen, Jyh-Shing Roger Jang. 1295-1298 [doi]
- Automatic Tone Assessment of Non-Native Mandarin SpeakersJian Cheng. 1299-1302 [doi]
- Audio and Contact Microphones for Cough DetectionThomas Drugman, Jérôme Urbain, Nathalie Bauwens, Ricardo Chessini, Anne-Sophie Aubriot, Patrick Lebecque, Thierry Dutoit. 1303-1306 [doi]
- Analyzing and Interpreting Automatically Learned Rules Across DialectsNancy F. Chen, Wade Shen, Joseph P. Campbell. 1307-1310 [doi]
- The Effect of Use of Drugs on Speaker's Fundamental Frequency and FormantsAndrey N. Raev, Yuri Matveev, Tatiana Goloshchapova. 1311-1314 [doi]
- On the assessment of audiovisual cues to speaker confidence by preteens with typical development (TD) and a-typical development (AD)Marc Swerts, Cees de Bie. 1315-1318 [doi]
- Interplay between verbal response latency and physiology of children with autism during ECA interactionsTheodora Chaspari, Chi-Chun Lee, Shrikanth Narayanan. 1319-1322 [doi]
- Combination of Multiple Speech Dimensions for Automatic Assessment of Dysarthric Speech IntelligibilityMyung Jong Kim, Hoirin Kim. 1323-1326 [doi]
- Whole-Word Recognition from Articulatory Movements for Silent Speech InterfacesJun Wang, Ashok Samal, Jordan R. Green, Frank Rudzicz. 1327-1330 [doi]
- Verifying Session Level Pronunciation Accuracy in a Speech Therapy ApplicationShou-Chun Yin, Richard C. Rose, Yun Tang. 1331-1334 [doi]
- Duration of ambulatory monitoring needed to accurately estimate voice useDaryush D. Mehta, Rebecca Woodbury Listfield, Harold A. Cheyne II, James T. Heaton, Shengran W. Feng, Matías Zanartu, Robert E. Hillman. 1335-1338 [doi]
- Evaluating NLP Features for Automatic Prediction of Language Impairment Using Child Speech TranscriptsKhairun-nisa Hassanali, Yang Liu, Thamar Solorio. 1339-1342 [doi]
- Quantitative Analysis of Pitch in Speech of Children with Neurodevelopmental DisordersGéza Kiss, Jan P. H. van Santen, Emily Tucker Prud'hommeaux, Lois M. Black. 1343-1346 [doi]
- Robust phoneme recognition based on biomimetic speech contoursMichael A. Carlin, Kailash Patil, Sridhar Krishna Nemala, Mounya Elhilali. 1348-1351 [doi]
- A Feature Space Transformation Method for Personalization using Generalized I-Vector ClusteringKaisheng Yao, Yifan Gong, Chaojun Liu. 1352-1355 [doi]
- Longer Features: They do a speech detector goodT. J. Tsai, Nelson Morgan. 1356-1359 [doi]
- Robust Feature Extraction for Speech Recognition by Enhancing Auditory SpectrumMd. Jahangir Alam, Patrick Kenny, Douglas D. O'Shaughnessy. 1360-1363 [doi]
- Enhancing Vocal Tract Length Normalization with Elastic Registration for Automatic Speech RecognitionFlorian Müller, Alfred Mertins. 1364-1367 [doi]
- Beamforming using uniform circular arrays for distant speech recognition in reverberant environments and double talk scenariosHannes Pessentheiner, Stefan Petrik, Harald Romsdorfer. 1368-1371 [doi]
- Novel Approach to Live Captioning Through Re-speaking: Tailoring Speech Recognition to Re-speaker's NeedsAles Prazák, Zdenek Loose, Jan Trmal, Josef V. Psutka, Josef Psutka. 1372-1375 [doi]
- Development and Evaluation of Automatic Punctuation for French and English Speech-to-TextJáchym Kolár, Lori Lamel. 1376-1379 [doi]
- Spoken Document Clustering Using Word Confusion NetworksShajith Ikbal, Sachindra Joshi, Ashish Verma, Om D. Deshmukh. 1380-1383 [doi]
- Dynamic Conditional Random Fields for Joint Sentence Boundary and Punctuation PredictionXuancong Wang, Hwee Tou Ng, Khe Chai Sim. 1384-1387 [doi]
- Analysis of the Characteristics of Talk-show TV ProgramsFabio Brugnara, Daniele Falavigna, Diego Giuliani, Roberto Gretter. 1388-1391 [doi]
- Rethinking The Corpus: Moving towards Dynamic Linguistic ResourcesAndrew Rosenberg. 1392-1395 [doi]
- Effects of stress and speech rate on vowel quality in Catalan and SpanishMarianna Nadeu. 1396-1399 [doi]
- Predictability affects vowel dispersion and dynamics in the Buckeye CorpusMichael McAuliffe, Molly Babel. 1400-1403 [doi]
- Dialectal and generational variations in vowels in spontaneous speechRobert Allen Fox, Ewa Jacewicz. 1404-1407 [doi]
- Acoustic Cues of Vowel Quality to Coda Nasal Perception in Southern MinYing Chen, Vsevolod Kapatsinski, Susan Guion-Anderson. 1412-1415 [doi]
- Lenition of /d/ in spontaneous Spanish and CatalanMiguel Simonet, José Ignacio Hualde, Marianna Nadeu. 1416-1419 [doi]
- Wideband Parametric Speech Synthesis Using Warped Linear PredictionTuomo Raitio, Antti Suni, Martti Vainio, Paavo Alku. 1420-1423 [doi]
- Modeling the Creaky Excitation for Parametric Speech SynthesisThomas Drugman, John Kane, Christer Gobl. 1424-1427 [doi]
- Amplitude Spectrum based Excitation Model for HMM-based Speech SynthesisZhengqi Wen, Jianhua Tao. 1428-1431 [doi]
- Speech synthesis using a non-maximally decimated filter bank for embedded systemsNobuyuki Nishizawa, Tsuneo Kato. 1432-1435 [doi]
- Ways to Implement Global Variance in Statistical Speech SynthesisHanna Silén, Elina Helander, Jani Nurminen, Moncef Gabbouj. 1436-1439 [doi]
- HMM-based speech synthesis using sub-band basis spectrum modelYamato Ohtani, Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima, Masami Akamine. 1440-1443 [doi]
- Average Spectrotemporal Structure of Continuous Speech Matches with the Frequency Resolution of Human HearingOkko Räsänen. 1444-1447 [doi]
- Perceptual Importance of the Phase Related Information in SpeechIbon Saratxaga, Inma Hernáez, Michael Pucher, Eva Navas, Iñaki Sainz. 1448-1451 [doi]
- Improving the Entropy Estimate of Neuronal Firings of Modeled Cochlear Nucleus NeuronsAndrea Grigorescu, Marek Rudnicki, Michael Isik, Werner Hemmert, Stefano Rini. 1452-1455 [doi]
- Perception of Synthetic Speech in Adult Users of Cochlear ImplantsKyoko Nagao, Mark Paullin, James B. Polikoff, Jason Lilley, H. Timothy Bunnell. 1456-1459 [doi]
- Hearing Loss and the Use of Acoustic Cues in Phonetic Categorisation of FricativesOdette Scharenborg, Esther Janse. 1460-1463 [doi]
- Intelligibility of speech spoken in noise/reverberation for older adults in reverberant environmentsNao Hodoshima, Takayuki Arai, Kiyohiro Kurisu. 1464-1467 [doi]
- Improved Speech Intelligibility with a Chimaera Hearing Aid AlgorithmAndrew Hines, Naomi Harte. 1468-1471 [doi]
- Unsupervised Acoustic Analyses of Normal and Lombard Speech, with Spectral Envelope Transformation to Improve IntelligibilityElizabeth Godoy, Yannis Stylianou. 1472-1475 [doi]
- The effect of dichotic processing on the perception of binaural cuesAkiko Amano-Kusumoto, Justin M. Aronoff, Motokuni Itoh, Sigfrid D. Soli. 1476-1479 [doi]
- Speech and speaker separation in human auditory cortexNima Mesgarani, Edward Chang. 1480-1483 [doi]
- On the effect of the acoustic environment on the accuracy of perception of speaker orientation from auditory cues aloneJens Edlund, Mattias Heldner, Joakim Gustafson. 1484-1487 [doi]
- Sibilant Speech Detection in NoiseSira Gonzalez, Mike Brookes. 1488-1491 [doi]
- Voice Activity Detection Using Speech Recognizer FeedbackKit Thambiratnam, Weiwu Zhu, Frank Seide. 1492-1495 [doi]
- Descriptive Vocabulary Development for Degraded SpeechDushyant Sharma, Gaston Hilkhuysen, Patrick A. Naylor, Nikolay D. Gaubitch, Mark A. Huckvale, Mike Brookes. 1496-1499 [doi]
- Overlapped Speech Detection in Meeting Using Cross-Channel Spectral Subtraction and Spectrum SimilarityRyo Yokoyama, Yu Nasu, Koichi Shinoda, Koji Iwano. 1500-1503 [doi]
- Speech restoration based on deep learning autoencoder with layer-wised pretrainingXugang Lu, Shigeki Matsuda, Chiori Hori, Hideki Kashioka. 1504-1507 [doi]
- Detection and Positioning of Overlapped Sounds in a Room EnvironmentRupayan Chakraborty, Climent Nadeu, Taras Butko. 1508-1511 [doi]
- Foreground Speech Segmentation using Zero Frequency Filtered SignalDeepak K. T., Biswajit Dev Sarma, S. R. Mahadeva Prasanna. 1512-1515 [doi]
- The Effect of Spectral Estimator on Common Spectral Measures for Sibilant FricativesPatrick Reidy, Mary Beckman. 1516-1519 [doi]
- Gaussian Mixture Gain Priors for Regularized Nonnegative Matrix Factorization in Single-Channel Source SeparationEmad M. Grais, Hakan Erdogan. 1520-1523 [doi]
- Speaker Independent Single Channel Source Separation using Sinusoidal FeaturesShivesh Ranjan, Karen L. Payton, Pejman Mowlaee. 1524-1527 [doi]
- Boosting Classification Based Speech Separation Using Temporal DynamicsYuxuan Wang, DeLiang Wang. 1528-1531 [doi]
- Acoustic Features for Classification Based Speech SeparationYuxuan Wang, Kun Han, DeLiang Wang. 1532-1535 [doi]
- Hidden Markov Models as Priors for Regularized Nonnegative Matrix Factorization in Single-Channel Source SeparationEmad M. Grais, Hakan Erdogan. 1536-1539 [doi]
- Unconstrained Speech Separation by Composition of Longest SegmentsJi Ming, Ramji Srinivasan, Danny Crookes. 1540-1543 [doi]
- Modulation domain blind source separation for noisy speech mixtureYi Zhang, Yunxin Zhao. 1544-1547 [doi]
- Phase estimation for signal reconstruction in single-channel source separationPejman Mowlaee, Rahim Saeidi, Rainer Martin. 1548-1551 [doi]
- Bayesian Group Sparse Learning for Nonnegative Matrix FactorizationJen-Tzung Chien, Hsin-Lung Hsieh. 1552-1555 [doi]
- Residual Phase Cepstrum Coefficients with Application to Cross-lingual Speaker VerificationMichael Johnson, Jianglin Wang. 1556-1559 [doi]
- Speaker Verification Using Neighborhood Preserving EmbeddingChunyan Liang, Jinchao Yang, Lin Yang, YongHong Yan. 1560-1563 [doi]
- Discriminative Decision Function Based Scoring Method in Joint Factor Analysis for Speaker VerificationChunyan Liang, Xiang Zhang, Lin Yang, YongHong Yan. 1564-1567 [doi]
- Integrated Feature Normalization and Enhancement for robust Speaker Recognition using Acoustic Factor AnalysisTaufiq Hasan, John H. L. Hansen. 1568-1571 [doi]
- Factor Analysis and Nuisance Attribute Projection RevisitedLukás Machlica, Zbynek Zajíc. 1572-1575 [doi]
- Compensation of Intrinsic Variability with Factor Analysis Modeling for Robust Speaker VerificationSheng Chen, Mingxing Xu. 1576-1579 [doi]
- RSR2015: Database for Text-Dependent Speaker Verification using Multiple Pass-PhrasesAnthony Larcher, Kong-Aik Lee, Bin Ma, Haizhou Li. 1580-1583 [doi]
- Speaker idiosyncratic rhythmic features in the speech signalVolker Dellwo, Adrian Leemann, Marie-José Kolly. 1584-1587 [doi]
- Bilinear Factor Analysis for iVector Based Speaker VerificationYun Lei, Lukas Burget, Nicolas Scheffer. 1588-1591 [doi]
- Resonator-based creaky voice detectionThomas Drugman, John Kane, Christer Gobl. 1592-1595 [doi]
- Effect of Tongue Tip Trilling on the Glottal Excitation SourceVinay Kumar Mittal, N. Dhananjaya, Bayya Yegnanarayana. 1596-1599 [doi]
- Estimating the voice source in noiseGang Chen, Yen-Liang Shue, Jody Kreiman, Abeer Alwan. 1600-1603 [doi]
- Voice source analysis using biomechanical modeling and glottal inverse filteringAlan Pinheiro, Tuomo Raitio, Danyane Gomes, Paavo Alku. 1604-1607 [doi]
- Speech modeling and processing by low-dimensional dynamic glottal modelsCarlo Drioli, Andrea Calanca. 1608-1611 [doi]
- Improved formant frequency estimation from high-pitched vowels by downgrading the contribution of the glottal source with weighted linear predictionPaavo Alku, Jouni Pohjalainen, Martti Vainio, Anne-Maria Laukkanen, Brad H. Story. 1612-1615 [doi]
- Automatic Topology Generation of Glottal Source HMMAkira Sasou. 1616-1619 [doi]
- Towards Glottal Source Controllability in Expressive Speech SynthesisJaime Lorenzo-Trueba, Roberto Barra-Chicote, Tuomo Raitio, Nicolas Obin, Paavo Alku, Junichi Yamagishi, Juan Manuel Montero. 1620-1623 [doi]
- Combining temporal and cepstral features for the automatic perceptual categorization of disordered connected speechAli Alpan, Jean Schoentgen, Francis Grenez. 1624-1627 [doi]
- A Preliminary Study on Cross-Databases Emotion Recognition using the Glottal Features in SpeechRui Sun, Elliot Moore II. 1628-1631 [doi]
- Analysis on the Importance of Short-Term Speech Parameterizations for Emotional Statistical Parametric Speech SynthesisRanniery Maia. 1632-1635 [doi]
- Analysis of vocal tremor and jitter by empirical mode decomposition of glottal cycle length time seriesChristophe Mertens, Francis Grenez, Jean Schoentgen. 1636-1639 [doi]
- Utilizing Markov Chain Monte Carlo (MCMC) Method for Improved Glottal Inverse FilteringHarri Auvinen, Tuomo Raitio, Samuli Siltanen, Paavo Alku. 1640-1643 [doi]
- Glottal source shape parameter estimation using phase minimization variantsStefan Huber, Axel Röbel, Gilles Degottex. 1644-1647 [doi]
- Glottal Waveform Analysis of Physical Task Stress SpeechKeith W. Godin, Taufiq Hasan, John H. L. Hansen. 1648-1651 [doi]
- Speaker Discrimination Ability of Glottal Waveform FeaturesJuan F. Torres, Elliot Moore. 1652-1655 [doi]
- Paraphrastic Language ModelsXunying Liu, Mark J. F. Gales, Philip C. Woodland. 1656-1659 [doi]
- Efficient Structured Language Modeling for Speech RecognitionAriya Rastrow, Mark Dredze, Sanjeev Khudanpur. 1660-1663 [doi]
- Towards Recurrent Neural Networks Language Models with Linguistic and Contextual FeaturesYangyang Shi, Pascal Wiggers, Catholijn M. Jonker. 1664-1667 [doi]
- Conversion of Recurrent Neural Network Language Models to Weighted Finite State Transducers for Automatic Speech RecognitionGwénolé Lecorvé, Petr Motlícek. 1668-1671 [doi]
- Large Scale Hierarchical Neural Network Language ModelsHong-Kwang Kuo, Ebru Arisoy, Ahmad Emami, Paul Vozila. 1672-1675 [doi]
- A Sparse Plus Low Rank Maximum Entropy Language ModelBrian Hutchinson, Mari Ostendorf, Maryam Fazel. 1676-1679 [doi]
- PLDA Modeling in I-Vector and Supervector Space for Speaker VerificationYe Jiang, Kong-Aik Lee, Zhenmin Tang, Bin Ma, Anthony Larcher, Haizhou Li. 1680-1683 [doi]
- Supervized Mixture of PLDA Models for Cross-Channel Speaker VerificationKonstantin Simonchik, Timur Pekhovsky, Andrey Shulipa, Anton Afanasyev. 1684-1687 [doi]
- Spoofing countermeasures for the protection of automatic speaker recognition systems against attacks with artificial signalsFederico Alegre, Ravichander Vipperla, Nicholas W. D. Evans. 1688-1691 [doi]
- PLDA using Gaussian Restricted Boltzmann Machines with application to Speaker VerificationThemos Stafylakis, Patrick Kenny, Mohammed Senoussaoui, Pierre Dumouchel. 1692-1695 [doi]
- Mean Hilbert Envelope Coefficients (MHEC) for Robust Speaker RecognitionSeyed Omid Sadjadi, Taufiq Hasan, John H. L. Hansen. 1696-1699 [doi]
- Detecting Converted Speech and Natural Speech for anti-Spoofing Attack in Speaker RecognitionZhi-Zheng Wu, Chng Eng Siong, Haizhou Li. 1700-1703 [doi]
- Maximising objective speech intelligibility by local f0 modulationJulián Villegas, Martin Cooke. 1704-1707 [doi]
- Effect of prosodic changes on speech intelligibilityCatherine Mayo, Vincent Aubanel, Martin Cooke. 1708-1711 [doi]
- Effects of visual speech information on native listener judgments of L2 consonant intelligibilitySaya Kawase, Yue Wang. 1712-1715 [doi]
- Perceptual compensation for the effects of reverberation on consonant identification: A comparison of human and machine performanceGuy J. Brown, Amy V. Beeston, Kalle J. Palomäki. 1716-1719 [doi]
- The Intelligibility of Lombard Speech: Communicative setting mattersMichael Fitzpatrick, Jeesun Kim, Chris Davis. 1720-1723 [doi]
- Performance Comparison of Intrusive Objective Speech Intelligibility and Quality Metrics for Cochlear Implant UsersJoão Felipe Santos, Stefano Cosentino, Oldooz Hazrati, Philipos C. Loizou, Tiago H. Falk. 1724-1727 [doi]
- Exploiting Temporal Sequence Structure for Semantic Analysis of MultimediaSourish Chaudhuri, Rita Singh, Bhiksha Raj. 1728-1731 [doi]
- Time Delay Estimation for Speech Signal Based on FOC-SpectrumHong Liu, Xiaofei Li. 1732-1735 [doi]
- Low-rank Audio Signal Classification Under Soft Margin and Trace Norm ConstraintsZiqiang Shi, Tieran Zheng, Jiqing Han, Shiwen Deng. 1736-1739 [doi]
- GCC-PHAT based Head Orientation EstimationCarlos Segura, Javier Hernando. 1740-1743 [doi]
- Plagiarism Detection in Polyphonic Music using Monaural Signal SeparationSoham De, Indradyumna Roy, Tarunima Prabhakar, Kriti Suneja, Sourish Chaudhuri, Rita Singh, Bhiksha Raj. 1744-1747 [doi]
- TDOA Estimation for Multiple Speakers in Underdetermined CaseMariem Bouafif, Zied Lachiri. 1748-1751 [doi]
- Local-feature-map Integration Using Convolutional Neural Networks for Music Genre ClassificationToru Nakashika, Christophe Garcia, Tetsuya Takiguchi. 1752-1755 [doi]
- Training Deep Nets with Imbalanced and Unlabeled DataJeff Berry, Ian Fasel, Luciano Fadiga, Diana Archangeli. 1756-1759 [doi]
- Speech Data Clustering Based on Phoneme Error Trend for Unsupervised Acoustic Model AdaptationTaichi Asami, Satoshi Kobashikawa, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi. 1760-1763 [doi]
- Gaussian Map based Acoustic Model Adaptation Using Untranscribed Data for Speech Recognition in Severely Adverse EnvironmentsWooil Kim, John H. L. Hansen. 1764-1767 [doi]
- Investigating Performance of the Discriminative Methods for Long-Term Speaker AdaptationDanning Jiang, Dimitri Kanevsky, Vaibhava Goel, Yong Qin. 1768-1771 [doi]
- A Two-stage Speaker Adaptation Approach for Subspace Gaussian Mixture Model based Nonnative Speech RecognitionBo Li, Khe Chai Sim. 1772-1775 [doi]
- A comparative study of adaptive, automatic recognition of disordered speechHeidi Christensen, Stuart Cunningham, Charles Fox, Phil Green, Thomas Hain. 1776-1779 [doi]
- Phoneme Class Based Adaptation for Mismatch Acoustic Modeling of Distant Noisy SpeechSeckin Uluskan, John H. L. Hansen. 1780-1783 [doi]
- Rapid Nonlinear Speaker Adaptation for Large-Vocabulary Continuous Speech RecognitionZoi Roupakia, Anton Ragni, Mark J. F. Gales. 1784-1787 [doi]
- A Study on Using Word-Level HMMs to Improve ASR Performance over State-of-the-Art Phone-Level Acoustic Modeling for LVCSRI.-Fan Chen, Chin-Hui Lee. 1788-1791 [doi]
- Factored adaptation using a combination of feature-space and model-space transformsMichael L. Seltzer, Alex Acero. 1792-1795 [doi]
- Exploring Discriminative Speech Trajectory StructuresHeyun Huang, Louis ten Bosch, Bert Cranen, Lou Boves. 1796-1799 [doi]
- Estimating Classifier Performance in Unknown NoiseEhsan Variani, Hynek Hermansky. 1800-1803 [doi]
- Continuous Digit Recognition in Noise: Reservoirs can do an excellent job!Azarakhsh Jalalvand, Fabian Triefenbach, Jean-Pierre Martens. 1804-1807 [doi]
- Optimization-Based Control for the Extended Baum-Welch AlgorithmJanne Pylkkönen, Mikko Kurimo. 1808-1811 [doi]
- Normalization of spectro-temporal Gabor filter bank features for improved robust automatic speech recognition systemsMarc René Schädler, Birger Kollmeier. 1812-1815 [doi]
- Phone recognition in critical bands using sub-band temporal modulationsFeipeng Li, Sri Harish Reddy Mallidi, Hynek Hermansky. 1816-1819 [doi]
- Combining Acoustic Data Driven G2P and Letter-to-Sound Rules for Under Resource Lexicon GenerationRamya Rasipuram, Mathew Magimai-Doss. 1820-1823 [doi]
- CRF-based Diacritisation of Colloquial Arabic for Automatic Speech RecognitionSarah Al-Shareef, Thomas Hain. 1824-1827 [doi]
- Analysis of Temporal Resolution in Frequency Domain Linear PredictionSriram Ganapathy, Hynek Hermansky. 1828-1831 [doi]
- White Listing and Score Normalization for Keyword Spotting of Noisy SpeechBing Zhang, Richard M. Schwartz, Stavros Tsakalidis, Long Nguyen, Spyros Matsoukas. 1832-1835 [doi]
- Speaker Recognition for Children's SpeechSaeid Safavi, Maryam Najafian, Abualsoud Hanani, Martin J. Russell, Peter Jancovic, Michael J. Carey. 1836-1839 [doi]
- A simple and efficient method to align very long speech signals to acoustically imperfect transcriptionsGermán Bordel, Mikel Peñagarikano, Luis Javier Rodríguez-Fuentes, Amparo Varona. 1840-1843 [doi]
- Estimation of Talker's Head Orientation Based on Discrimination of the Shape of Cross-power Spectrum Phase CoefficientsRyoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki. 1844-1847 [doi]
- Sentence Detection Using Multiple AnnotationsAnn Lee, James R. Glass. 1848-1851 [doi]
- A speaker-role based approach for detecting politicians in TV broadcast newsDelphine Charlet, Géraldine Damnati. 1852-1855 [doi]
- Relative Importance of Temporal Envelope and Fine Structure Cues in Low- and High-Order Harmonic Regions for Mandarin Lexical-tone RecognitionGuangting Mai. 1856-1859 [doi]
- Real-time Implementation of Multi-band Frequency Compression for Listeners with Moderate Sensorineural ImpairmentNitya Tiwari, Prem C. Pandey, Pandurangarao N. Kulkarni. 1860-1863 [doi]
- Word Prominence Detection using Robust yet Simple Prosodic FeaturesTaniya Mishra, Vivek Kumar Rangarajan Sridhar, Alistair Conkie. 1864-1867 [doi]
- Online Story Segmentation of Multilingual Streaming Broadcast NewsAmit Srivastava, Saurabh Khanwalkar, Gretchen Markiewicz, Guruprasad Saikumar. 1868-1871 [doi]
- The Speech Recognition Virtual Kitchen: An Initial PrototypeFlorian Metze, Eric Fosler-Lussier. 1872-1873 [doi]
- PermA and Balloon: Tools for string alignment and text processingUwe D. Reichel. 1874-1877 [doi]
- VisArtico: a visualization tool for articulatory dataSlim Ouni, Loic Mangeonjean, Ingmar Steiner. 1878-1881 [doi]
- Towards Automated Annotation of Audio and Video Recordings by Application of Advanced Web-servicesPrzemyslaw Lenkiewicz, Dieter Van Uytvanck, Peter Wittenburg, Sebastian Drude. 1882-1885 [doi]
- A Rule Based Pronunciation Generator and Regional Accent Databank for PortugueseSimone Ashby, Sílvia Barbosa, Silvia Brandão, José Pedro Ferreira, Maarten Janssen, Catarina Silva, Mário Eduardo Viaro. 1886-1887 [doi]
- Speech Enhancement for Android (SEA): A Speech Processing Demonstration Tool for Android Based Smart Phones and TabletsRoger Chappel, Kuldip K. Paliwal. 1888-1891 [doi]
- ProTK: An Improved Prosody ToolkitJacob Okamoto, Serguei V. S. Pakhomov, Elizabeth Shriberg, Andreas Stolcke. 1892-1893 [doi]
- SpeechMark: Landmark Detection Tool for Speech AnalysisSuzanne Boyce, Harriet J. Fell, Joel MacAuslan. 1894-1897 [doi]
- Efficient Segmental Conditional Random Fields for One-Pass Phone RecognitionYanzhang He, Eric Fosler-Lussier. 1898-1901 [doi]
- Enhanced Polyphone Decision Tree Adaptation for Accented Speech RecognitionUdhyakumar Nallasamy, Florian Metze, Tanja Schultz. 1902-1905 [doi]
- Efficient VTS Adaptation Using Jacobian ApproximationJinyu Li, Michael L. Seltzer, Yifan Gong. 1906-1909 [doi]
- Robust triphone mapping for acoustic modelingMilos Cernak, David Imseng, Hervé Bourlard. 1910-1913 [doi]
- Sparse banded precision matrices for low resource speech recognitionWeibin Zhang, Pascale Fung. 1914-1917 [doi]
- Semi-Blind Model Adaptation using Piece-wise Energy Decay Curve for Large Reverberant EnvironmentsAbdul Waheed Mohammed, Marco Matassoni, Hari Krishna Maganti, Maurizio Omologo. 1918-1921 [doi]
- Example-based speech enhancement with joint utilization of spatial, spectral & temporal cues of speech and noiseKeisuke Kinoshita, Marc Delcroix, Mehrez Souden, Tomohiro Nakatani. 1926-1929 [doi]
- A Fast-Converging Adaptive Frequency-Domain MVDR Beamformer for Speech EnhancementShengkui Zhao, Douglas L. Jones. 1930-1933 [doi]
- A signal-separation-based array postfilter for distant speech recognitionRita Singh, Ken'ichi Kumatani, John W. McDonough, Chen Liu. 1934-1937 [doi]
- Constrained Multichannel Speech DereverberationMeng Yu, Frank K. Soong. 1938-1941 [doi]
- A Triple-Microphone Real-Time Speech Enhancement Algorithm Based on Approximate Array Analytical SolutionsMeng Yu, Ryan Ritch, Jack Xin. 1942-1945 [doi]
- Perception of Pitch Contours among Native Tone ListenersRatree Wayland, Donruethai Laphasradakul, Edith Kaan, Cao Rui. 1946-1948 [doi]
- Pitch range control of Japanese boundary pitch movementsYosuke Igarashi, Hanae Koiso. 1949-1952 [doi]
- Perceived prosodic boundaries in Taiwanese and their acoustic correlatesGrace Kuo. 1953-1956 [doi]
- Phonetic Foreignization of Mandarin for Dubbing in Imported Western MoviesLuying Hou, Yuan Jia, Aijun Li. 1957-1960 [doi]
- Prosodic context-based analysis of disfluenciesHelena Moniz, Fernando Batista, Isabel Trancoso, Ana Isabel Mata. 1961-1964 [doi]
- Describing the development of intonational categories using a target-oriented parametric approachBritta Lintfert, Bernd Möbius. 1965-1968 [doi]
- Developing a Speech Activity Detection System for the DARPA RATS ProgramTim Ng, Bing Zhang, Long Nguyen, Spyros Matsoukas, Xinhui Zhou, Nima Mesgarani, Karel Veselý, Pavel Matejka. 1969-1972 [doi]
- Speech Activity Detection for Noisy Data Using Adaptation TechniquesMohamed Omar. 1973-1976 [doi]
- Speech/Nonspeech Segmentation in Web VideosAnanya Misra. 1977-1980 [doi]
- On the use of Machine Learning Methods for Speech and Voicing ClassificationPhilip Harding, Ben Milner. 1981-1984 [doi]
- Acoustic and Data-driven Features for Robust Speech Activity DetectionSamuel Thomas, Sri Harish Reddy Mallidi, Thomas Janu, Hynek Hermansky, Nima Mesgarani, Xinhui Zhou, Shihab A. Shamma, Tim Ng, Bing Zhang, Long Nguyen, Spyros Matsoukas. 1985-1988 [doi]
- A Two-step NMF Based Algorithm for Single Channel Speech SeparationShuo Wang, Wenjun Wu. 1989-1992 [doi]
- Meaning inhibition and sentence processing in Chinese: Evidence from negative primingMichael C. W. Yip. 1993-1996 [doi]
- Similar Speaker Selection Technique Based on Distance Metric Learning with Perceptual Voice Quality SimilarityYusuke Ijima, Mitsuaki Isogai, Hideyuki Mizuno. 1997-2000 [doi]
- Gendered sound symbolism and masking effects in speech processingMolly Babel, Grant McGuire. 2001-2004 [doi]
- Modeling Cue Trading in Human Word RecognitionLouis ten Bosch, Odette Scharenborg. 2005-2008 [doi]
- Accounting for Speech Rate in Spoken Word RecognitionDavid Cheng-Huan Li, Elsi Kaiser. 2009-2012 [doi]
- The processes underlying two frequent casual speech phenomena in Dutch: A production experimentIris Hanique, Mirjam Ernestus. 2013-2016 [doi]
- Intrinsic velocity differences of lip and jaw movements: preliminary resultsPeter Birkholz, Phil Hoole. 2017-2020 [doi]
- Co-occurrence of reduced word forms in natural speechMalte Viebahn, Mirjam Ernestus, James M. McQueen. 2021-2024 [doi]
- Voice Production Mechanisms of Vibrato in NohIkuyo Yoshinaga, Jiangping Kong. 2025-2028 [doi]
- Automatic detection of hypernasal speech signals using nonlinear and entropy measurementsJuan Rafael Orozco-Arroyave, Julián D. Arias-Londoño, Jesus Francisco Vargas Bonilla, Elmar Nöth. 2029-2032 [doi]
- Effects of the availability of visual information and presence of competing conversations on speech productionVincent Aubanel, Martin Cooke, Emma Foster, Maria Luisa Garcia Lecumberri, Cassie Mayo. 2033-2036 [doi]
- Constrained Maximum Mutual Information Dimensionality Reduction for Language IdentificationShuai Huang, Glen A. Coppersmith, Damianos Karakos. 2037-2040 [doi]
- Phonotactic Language Recognition Using MLP FeaturesMohamed Faouzi BenZeghiba, Jean-Luc Gauvain, Lori Lamel. 2041-2044 [doi]
- The EHU Systems for the NIST 2011 Language Recognition EvaluationMikel Peñagarikano, Amparo Varona, Luis Javier Rodríguez-Fuentes, Mireia Díez, Germán Bordel. 2045-2048 [doi]
- Study of Different Backends in a State-Of-the-Art Language Recognition SystemMikel Peñagarikano, Amparo Varona, Mireia Díez, Luis Javier Rodríguez-Fuentes, Germán Bordel. 2049-2052 [doi]
- On the Use of Non-Linear Polynomial Kernel SVMs in Language RecognitionSibel Yaman, Jason W. Pelecanos, Mohamed Kamal Omar. 2053-2056 [doi]
- Exemplar-Based Sparse Representation for Language Recognition on I-VectorsBing Jiang, Yan Song, Wu Guo, Li-Rong Dai. 2057-2060 [doi]
- Subspace-Based Feature Representation and Learning for Language RecognitionYu-Chin Shih, Hung-Shin Lee, Hsin-Min Wang, Shyh-Kang Jeng. 2061-2064 [doi]
- Effect of Relevance Factor of Maximum a posteriori Adaptation for GMM-SVM in Speaker and Language RecognitionChanghuai You, Haizhou Li, Bin Ma, Kong-Aik Lee. 2065-2068 [doi]
- Using Time-Synchronous Phone Co-occurrences in a SVM-Phonotactic Dialect Recognition SystemAmparo Varona, Mikel Peñagarikano, Luis Javier Rodríguez-Fuentes, Germán Bordel, Mireia Díez. 2069-2072 [doi]
- Nativeness Classification with Suprasegmental Features on the Accent Group LevelMahnoosh Mehrabani, Joseph Tepperman, Emily Nava. 2073-2076 [doi]
- Open-Vocabulary Retrieval of Spoken Content with Shorter/Longer Queries Considering Word/Subword-based Acoustic Feature SimilarityHung-yi Lee, Po-wei Chou, Lin-Shan Lee. 2077-2080 [doi]
- Consumer-level multimedia event detection through unsupervised audio signal modelingByungki Byun, Ilseo Kim, Sabato Marco Siniscalchi, Chin-Hui Lee. 2081-2084 [doi]
- Event-based Video Retrieval Using AudioQin Jin, Peter Franz Schulam, Shourabh Rawat, Susanne Burger, Duo Ding, Florian Metze. 2085-2088 [doi]
- Compact Audio Representation for Event Detection in Consumer MediaXiaodan Zhuang, Stavros Tsakalidis, Shuang Wu, Pradeep Natarajan, Rohit Prasad, Prem Natarajan. 2089-2092 [doi]
- N-gram FST Indexing for Spoken Term DetectionChao Liu, Dong Wang, Javier Tejedor. 2093-2096 [doi]
- Spoken Inquiry Discrimination Using Bag-of-Words for Speech-Oriented Guidance SystemHaruka Majima, Rafael Torres, Yoko Fujita, Hiromichi Kawanami, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano. 2097-2100 [doi]
- Robust Event Detection From Spoken Content In Consumer Domain VideosStavros Tsakalidis, Xiaodan Zhuang, Roger Hsiao, Shuang Wu, Pradeep Natarajan, Rohit Prasad, Prem Natarajan. 2101-2104 [doi]
- Bag-of-Audio-Words Approach for Multimedia Event ClassificationStephanie Pancoast, Murat Akbacak. 2105-2108 [doi]
- Improvements in Japanese Voice SearchKen-ichi Iso, Edward Whittaker, Tadashi Emori, Junpei Miyake. 2109-2112 [doi]
- A tutorial dialogue system with unrestricted spoken inputPeter Bell, Myroslava Dzikovska, Amy Isard. 2113-2114 [doi]
- Integrating Adaptive Beam-forming and Auditory Features for Robust Large Vocabulary Speech RecognitionXie Sun, Peter Li, Manli Zhu, Qiru Zhou. 2115-2116 [doi]
- A Natural In-Car Speech Interface to Internet Services Using Hybrid ASRHansjörg Hofmann, Ute Ehrlich, Klaus Bader, Ilona Nothelfer, André Berton. 2117-2118 [doi]
- How Marni Helps English Language Learners Acquire Oral Reading FluencyRonald A. Cole, Daniel Bolaños, Wayne Ward, J. T. Carmer, Eric Borts, Edward Svirsky. 2119-2120 [doi]
- Demonstration of Advanced Multi-Modal, Network-Centric Communication Management SuiteVictor S. Finomore. 2121-2122 [doi]
- Dutch Automatic Speech Recognition on the Web: Towards a General Purpose SystemJoris Pelemans, Kris Demuynck, Patrick Wambacq. 2123-2126 [doi]
- An On-Line, Cloud-Based Spanish-Spanish Sign Language Translation SystemJavier Tejedor, Fernando J. López-Colino, Jordi Porta, José Colás. 2127-2128 [doi]
- Enhancing Exemplar-Based Posteriors for Speech Recognition TasksTara N. Sainath, David Nahamoo, Dimitri Kanevsky, Bhuvana Ramabhadran. 2130-2133 [doi]
- Advances in noise robust digit recognition using hybrid exemplar-based techniquesJort F. Gemmeke, Hugo Van Hamme. 2134-2137 [doi]
- Group Sparsity for Speaker Identity Discrimination in Factorisation-based Speech RecognitionAntti Hurmalainen, Rahim Saeidi, Tuomas Virtanen. 2138-2141 [doi]
- Using Sparse Classification Outputs as Feature Observations for Noise-robust ASRYang Sun, Bert Cranen, Jort F. Gemmeke, Louis ten Bosch, Lou Boves, Mathew M. Doss. 2142-2145 [doi]
- Synthetic References for Template-based ASR using posterior featuresSerena Soldo, Mathew Magimai-Doss, Hervé Bourlard. 2146-2149 [doi]
- Heterogeneous Convolutive Non-Negative Sparse CodingDong Wang, Javier Tejedor. 2150-2153 [doi]
- Convolutive Non-Negative Sparse Coding and New Features for Speech Overlap Handling in Speaker DiarizationJürgen T. Geiger, Ravichander Vipperla, Simon Bozonnet, Nicholas W. D. Evans, Björn Schuller, Gerhard Rigoll. 2154-2157 [doi]
- Selection of TDOA Parameters for MDM Speaker DiarizationBeatriz Martínez-González, José Manuel Pardo, Julián D. Echeverry-Correa, José A. Vallejo-Pinto, Roberto Barra-Chicote. 2158-2161 [doi]
- Confidence for Speaker Diarization using PCA Spectral RatioOrith Toledo-Ronen, Hagai Aronowitz. 2162-2165 [doi]
- Fully Bayesian speaker clustering based on hierarchically structured utterance-oriented Dirichlet process mixture modelNaohiro Tawara, Tetsuji Ogawa, Shinji Watanabe, Atsushi Nakamura, Tetsunori Kobayashi. 2166-2169 [doi]
- DiarTk: An Open Source Toolkit for Research in Multistream Speaker Diarization and its Application to Meetings RecordingsDeepu Vijayasenan, Fabio Valente. 2170-2173 [doi]
- I-vectors and ILP clustering adapted to cross-show speaker diarizationGrégor Dupuy, Mickael Rouvier, Sylvain Meignier, Yannick Estève. 2174-2177 [doi]
- Emphatic segments and emphasis spread in Lebanese Arabic: a Real-time Magnetic Resonance Imaging StudyAssaf Israel, Michael I. Proctor, Louis Goldstein, Khalil Iskarous, Shrikanth Narayanan. 2178-2181 [doi]
- Using magnetic resonance to image the pharynx during Arabic speech: Static and dynamic aspectsRyan Shosted, Bradley Sutton, Abbas Benmamoun. 2182-2185 [doi]
- Articulatory speaker normalisation based on MRI-data using three-way linear decomposition methodsJulián Andrés Valdés Vargas, Pierre Badin, Laurent Lamalle. 2186-2189 [doi]
- Vowels Produced by Sliding Three-tube Model with Different LengthsTakayuki Arai. 2190-2193 [doi]
- Estimating the Vocal-Tract Area Function From Formants Using a Sensitivity Function and Least SquareTokihiko Kaburagi, Tetsuro Takano, Yuki Sakamoto. 2194-2197 [doi]
- Modeling source-tract interaction in speech production: Voicing onset vs. vowel height after a voiceless obstruentJorge C. Lucero, Laura L. Koenig, Susanne Fuchs. 2198-2201 [doi]
- Modelling a Noisy-channel for Voice Conversion Using Articulatory FeaturesBajibabu Bollepalli, Alan W. Black, Kishore Prahallad. 2202-2205 [doi]
- Asymmetries in the perception of synthesized speechAnna C. Janska, Erich Schröger, Thomas Jacobsen, Robert A. J. Clark. 2206-2209 [doi]
- Predicting Character-Appropriate Voices for a TTS-based Storyteller SystemErica Greene, Taniya Mishra, Patrick Haffner, Alistair Conkie. 2210-2213 [doi]
- Psychoacoustic Segment Scoring for Multi-Form Speech SynthesisAlexander Sorin, Slava Shechtman, Vincent Pollet. 2214-2217 [doi]
- Pauses and respiratory markers of the structure of book readingGérard Bailly, Cécilia Gouvernayre. 2218-2221 [doi]
- Proper Name Splicing in Computer Games with TTSBlaise Potard, Matthew P. Aylett, Christopher J. Pidcock. 2222-2225 [doi]
- Confidence Measures in Speech Emotion Recognition Based on Semi-supervised LearningJun Deng, Björn Schuller. 2226-2229 [doi]
- Using i-Vector Space Model for Emotion RecognitionRui Xia, Yang Liu. 2230-2233 [doi]
- Cries and Whispers - Classification of Vocal Effort in Expressive SpeechNicolas Obin. 2234-2237 [doi]
- Emotional Speech: A Spectral AnalysisPouria Fewzee, Fakhri Karray. 2238-2241 [doi]
- Classifying Skewed Data: Importance Weighting to Optimize Average RecallAndrew Rosenberg. 2242-2245 [doi]
- Gaze Patterns in Turn-TakingCatharine Oertel, Marcin Wlodarczak, Jens Edlund, Petra Wagner, Joakim Gustafson. 2246-2249 [doi]
- The 'Audio-Visual Face Cover Corpus': Investigations into audio-visual speech and speaker recognition when the speaker's face is occluded by facewearNatalie Fecher. 2250-2253 [doi]
- A Case Study: Detecting Counselor Reflections in Psychotherapy for Addictions using Linguistic FeaturesDogan Can, Panayiotis G. Georgiou, David C. Atkins, Shrikanth S. Narayanan. 2254-2257 [doi]
- Speaker Clustering for a Mixture of Singing and ReadingMahnoosh Mehrabani, John H. L. Hansen. 2258-2261 [doi]
- Automatic Speech Segmentation Using Probabilistic Latent Component ModelingSayan Ghosh, T. V. Sreenivas. 2262-2265 [doi]
- Overlapping Sound Event Recognition using Local Spectrogram Features with the Generalised Hough TransformJonathan Dennis, Tran Huy Dat, Engsiong Chng. 2266-2269 [doi]
- Automatic Phoneme Segmentation Using Auditory Attention FeaturesOzlem Kalinli. 2270-2273 [doi]
- A Non-Uniform Filterbank for Speaker RecognitionJia Min Karen Kua, Tharmarajah Thiruvaran, Eliathamby Ambikairajah. 2274-2277 [doi]
- Towards an Unsupervised Speaking Style Voice Building Framework: Multi-Style Speaker DiarizationJaime Lorenzo-Trueba, Beatriz Martínez-González, Roberto Barra-Chicote, Verónica López-Ludeña, Javier Ferreiros, Junichi Yamagishi, Juan Manuel Montero. 2278-2281 [doi]
- KNNDIST: A Non-Parametric Distance Measure for Speaker SegmentationSeyed Hamidreza Mohammadi, Hossein Sameti, Mahsa Sadat Elyasi Langarani, Amirhossein Tavanaei. 2282-2285 [doi]
- Lexical Story Co-Segmentation of Chinese Broadcast NewsWei Feng, Xuecheng Nie, Liang Wan, Lei Xie, Jianmin Jiang. 2286-2289 [doi]
- Toward an Optimum Feature Set and HMM Model Parameters for Automatic Phonetic Alignment of Spontaneous SpeechMontri Karnjanadecha, Stephen A. Zahorian. 2290-2293 [doi]
- Spelling as a Complementary Strategy for Speech RecognitionKeith Vertanen, Per Ola Kristensson. 2294-2297 [doi]
- Automatic Error Recovery for Pronunciation DictionariesTim Schlippe, Sebastian Ochs, Ngoc Thang Vu, Tanja Schultz. 2298-2301 [doi]
- Confidence measure for speech indexing based on Latent Dirichlet AllocationGrégory Senay, Georges Linarès. 2302-2305 [doi]
- Mixed probabilistic and deterministic dependency parsingChristophe Cerisara, Alejandra Lorenzo. 2306-2309 [doi]
- Automatic Vocabulary Adaptation Based on Semantic Similarity and Speech Recognition Confidence MeasureShoko Yamahata, Yoshikazu Yamaguchi, Atsunori Ogawa, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi. 2310-2313 [doi]
- Towards Empirical Dialog-State Modeling and its Use in Language ModelingNigel G. Ward, Alejandro Vega. 2314-2317 [doi]
- Evaluation of Many-to-Many Alignment Algorithm by Automatic Pronunciation Annotation Using Web Text MiningKeigo Kubo, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano. 2318-2321 [doi]
- Applying multiview learning algorithms to human-human conversation classificationSokol Koço, Cécile Capponi, Frédéric Béchet. 2322-2325 [doi]
- Automatic Transcription of Lecture Speech using Language Model Based on Speaking-Style Transformation of Proceeding TextsYuya Akita, Makoto Watanabe, Tatsuya Kawahara. 2326-2329 [doi]
- Normalization of Text Messages Using Character- and Phone-based Machine Translation ApproachesChen Li, Yang Liu. 2330-2333 [doi]
- A Weighted Combination of Speech with Text-based Models for Arabic DiacritizationAisha S. Azim, Xiaoxuan Wang, Khe Chai Sim. 2334-2337 [doi]
- Using Sub-word-level Information for Confidence Estimation with Conditional Random Field ModelsMatthew Stephen Seigel, Philip C. Woodland. 2338-2341 [doi]
- Supervised Spoken Document Summarization jointly Considering Utterance Importance and Redundancy by Structured Support Vector MachineHung-yi Lee, Yu-Yu Chou, Yow-Bang Wang, Lin-Shan Lee. 2342-2345 [doi]
- Integrating Intra-Speaker Topic Modeling and Temporal-Based Inter-Speaker Topic Modeling in Random Walk for Improved Multi-Party Meeting SummarizationYun-Nung Chen, Florian Metze. 2346-2349 [doi]
- Language Modeling for Voice-Enabled Social TV Using TweetsJunlan Feng, Bernard Renger. 2350-2353 [doi]
- Detecting OOV Named-Entities in Conversational SpeechRohit Kumar, Rohit Prasad, Sankaranarayanan Ananthakrishnan, Aravind Namandi Vembu, David Stallard, Stavros Tsakalidis, Prem Natarajan. 2354-2357 [doi]
- Unsupervised Deep Belief Features for Speech TranslationSameer Maskey, Bowen Zhou. 2358-2361 [doi]
- EuskoParl: a speech and text Spanish-Basque parallel corpusAlicia Pérez, José M. Alcaide, M. Inés Torres. 2362-2365 [doi]
- Comparing transcription agreement on non-native English speech corpus between native and non-native annotatorsHyuksu Ryu, SunHee Kim, Minhwa Chung. 2366-2369 [doi]
- PodCastle: Collaborative Training of Language Models on the Basis of Wisdom of CrowdsJun Ogata, Masataka Goto. 2370-2373 [doi]
- Speech Pattern Discovery using Audio-Visual Fusion and Canonical Correlation AnalysisLei Xie, Yinqing Xu, Lilei Zheng, Qiang Huang, Bingfeng Li. 2374-2377 [doi]
- Power Mean Pyramid Scores for Summarization EvaluationSameer Maskey, Andrew Rosenberg. 2378-2381 [doi]
- Visualizing tool for evaluating inter-label similarity in prosodic labeling experimentsDavid Escudero Mancebo, Eva Estebas-Vilaplana. 2382-2385 [doi]
- Objective, Subjective and Linguistic Roads to Perceptual Prominence - How are they compared and why?Petra Wagner, Fabio Tamburini, Andreas Windmann. 2386-2389 [doi]
- Audio-visual Evaluation and Detection of Word Prominence in a Human-Machine Interaction ScenarioMartin Heckmann. 2390-2393 [doi]
- Obtaining prominence judgments from naïve listeners - Influence of rating scales, linguistic levels and normalisationDenis Arnold, Petra Wagner, Bernd Möbius. 2394-2397 [doi]
- Towards Hierarchical Prosodic Prominence Generation in TTS SynthesisLeonardo Badino, Robert A. J. Clark. 2398-2401 [doi]
- Investigating syllabic prominence with Conditional Random Fields and Latent-Dynamic Conditional Random FieldsFrancesco Cutugno, Enrico Leone, Bogdan Ludusan, Antonio Origlia. 2402-2405 [doi]
- Disentangling lexical, morphological, syntactic and semantic influences on German prominence - Evidence from a production studyBarbara Samlowski, Petra Wagner, Bernd Möbius. 2406-2409 [doi]
- Using Prominence and Phrasing Predictions to Improve Weighted Dictionary Pronunciation ModelsAndrew Rosenberg. 2410-2413 [doi]
- A Continuous Prominence Score Based On Acoustic FeaturesJean Philippe Goldman, Mathieu Avanzi, Anne-Catherine Simon, Antoine Auchlin. 2414-2417 [doi]
- More on the Normalization of Syllable Prominence RatingsChristopher Sappok, Denis Arnold. 2418-2421 [doi]
- F0 and the Perception of ProminenceTim Mahrt, Jennifer Cole, Margaret M. Fleck, Mark Hasegawa-Johnson. 2422-2425 [doi]
- Language differences in the perceptual weight of prominence-lending propertiesBistra Andreeva, William J. Barry, Magdalena Wolska. 2426-2429 [doi]
- A Novel Confidence Measure Based on Context Consistency for Spoken Term DetectionHaiyang Li, Jiqing Han, Tieran Zheng, Guibin Zheng. 2430-2433 [doi]
- Discriminatively trained phoneme confusion model for keyword spottingPanagiota Karanasou, Lukas Burget, Dimitra Vergyri, Murat Akbacak, Arindam Mandal. 2434-2437 [doi]
- Inverting the Point Process Model for Fast Phonetic Keyword SearchKeith Kintzley, Aren Jansen, Kenneth Church, Hynek Hermansky. 2438-2441 [doi]
- Exploiting Discriminative Point Process Models for Spoken Term DetectionAtta Norouzian, Aren Jansen, Richard C. Rose, Samuel Thomas. 2442-2445 [doi]
- Subword speech recognition for detection of unseen wordsIvan Bulyko, Jose Herrero, Chris Mihelich, Owen Kimball. 2446-2449 [doi]
- OOV Word Detection using Hybrid Models with Mixed Types of FragmentsLong Qin, Alexander I. Rudnicky. 2450-2453 [doi]
- A Conversational Movie Search System Based on Conditional Random FieldsJingjing Liu, Scott Cyphers, Panupong Pasupat, Ian McGraw, Jim Glass. 2454-2457 [doi]
- Interactive Spoken Content Retrieval with Different Types of Actions Optimized By a Markov Decision ProcessTsung-Hsien Wen, Hung-yi Lee, Lin-Shan Lee. 2458-2461 [doi]
- Voice Query RefinementCyril Allauzen, Edward Benson, Ciprian Chelba, Michael Riley, Johan Schalkwyk. 2462-2465 [doi]
- Indexing Raw Acoustic Features for Scalable Zero Resource SearchAren Jansen, Benjamin Van Durme. 2466-2469 [doi]
- Lexical-phonetic automata for spoken utterance indexing and retrievalJulien Fayolle, Murat Saraclar, Fabienne Moreau, Christian Raymond, Guillaume Gravier. 2470-2473 [doi]
- Automating Crowd-supervised Learning for Spoken Language SystemsIan McGraw, Scott Cyphers, Panupong Pasupat, Jingjing Liu, Jim Glass. 2474-2477 [doi]
- An Automatic Child-Directed Speech Detector for the Study of Child Language DevelopmentSoroush Vosoughi, Deb Roy. 2478-2481 [doi]
- Aligning manifolds to model the earliest phonological abstraction in infant-caretaker vocal imitationAndrew Plummer. 2482-2485 [doi]
- The F0 fall delay of lexical pitch accent in Japanese Infant-directed speechYoko Saikachi, Mafuyu Kitahara, Ken'ya Nishikawa, Ai Kanato, Reiko Mazuka. 2486-2489 [doi]
- Children's Productions of Multi-Syllabic Lexical Stress Patterns in Different Prosodic PositionsIrina Shport. 2490-2493 [doi]
- Prosodic Marking of Continuation versus Completion in Children's NarrativesMelissa Redford, Laura Dilley, Jessica Gamache, Elizabeth Wieland. 2494-2497 [doi]
- Judging temporal onset differences for concurrent vowels: Results for young, middle-aged, and older adultsDaniel Fogerty, Diane Kewley-Port, Larry E. Humes. 2498-2501 [doi]
- Combining frame and segment based models for environmental sound classificationPengfei Hu, Wenju Liu, Wei Jiang. 2502-2505 [doi]
- Using Blob Detection in Missing Feature Linear-Frequency Cepstral Coefficients for Robust Sound Event RecognitionYi Ren Leng, Tran Huy Dat. 2506-2509 [doi]
- Goal-Oriented Auditory Scene RecognitionKailash Patil, Mounya Elhilali. 2510-2513 [doi]
- Prof-Life-Log: Audio Environment Detection for Naturalistic Audio StreamsAli Ziaei, Abhijeet Sangwan, John H. L. Hansen. 2514-2517 [doi]
- Pooling Robust Shift-Invariant Sparse Representations of Acoustic SignalsPo-Sen Huang, Jianchao Yang, Mark Hasegawa-Johnson, Feng Liang, Thomas S. Huang. 2518-2521 [doi]