- Language Disorders: Viewpoints on a Complex Object. Gabriele Miceli.
- From Teleoperated Androids to Cellphones as Surrogates. Hiroshi Ishiguro.
- Speech Technology in (Re)Habilitation of Persons with Communication Disabilities. Björn Granström.
- Signals and Speech. Alex Pentland. 1-4
- Skew Gaussian Mixture Models for Speaker Recognition. Avi Matza. 5-8
- Towards Goat Detection in Text-Dependent Speaker Verification. Orith Toledo-Ronen, Hagai Aronowitz, Ron Hoory, Jason W. Pelecanos, David Nahamoo. 9-12
- Speaker Modeling Using Local Binary Decisions. Jean-François Bonastre, Xavier Anguera Miró, Gabriel Hernández Sierra, Pierre-Michel Bousquet. 13-16
- New Developments in Voice Biometrics for User Authentication. Hagai Aronowitz, Ron Hoory, Jason W. Pelecanos, David Nahamoo. 17-20
- Evaluation of i-vector Speaker Recognition Systems for Forensic Application. Miranti Indar Mandasari, Mitchell McLaren, David A. van Leeuwen. 21-24
- Mixture of PLDA Models in i-vector Space for Gender-Independent Speaker Recognition. Mohammed Senoussaoui, Patrick Kenny, Niko Brümmer, Edward de Villiers, Pierre Dumouchel. 25-28
- Segregation of Whispered Speech Interleaved with Noise or Speech Maskers. Nandini Iyer, Douglas Brungart, Brian D. Simpson. 29-32
- Monaural Azimuth Localization Using Spectral Dynamics of Speech. Roi Kliper, Hendrik Kayser, Daphna Weinshall, Israel Nelken, Jörn Anemüller. 33-36
- Prediction of Binaural Intelligibility Level Differences in Reverberation. Jan Rennies, Thomas Brand, Birger Kollmeier. 37-40
- Let's All Speak Together! Exploring the Impact of Various Languages on the Comprehension of Speech in Multi-Linguistic Babble. Aurore Gautreau, Michel Hoen, Fanny Meunier. 41-44
- Cross-Rate Variation in the Intelligibility of Dual-Rate Gated Speech in Older Listeners. Valeriy Shafiro, Stanley Sheft, Robert Risley. 45-48
- An Efferent-Inspired Auditory Model Front-End for Speech Recognition. Chia-Ying Lee, James R. Glass, Oded Ghitza. 49-52
- A Long-Term Harmonic Plus Noise Model for Speech Signals. Faten Ben Ali, Laurent Girin, Sonia Djaziri Larbi. 53-56
- A Frequency Domain Approach to ARX-LF Voiced Speech Parameterization and Synthesis. Alan Ó Cinnéide, David Dorran, Mikel Gainza, Eugene Coyle. 57-60
- Automatic Data-Driven Learning of Articulatory Primitives from Real-Time MRI Data Using Convolutive NMF with Sparseness Constraints. Vikram Ramanarayanan, Athanasios Katsamanis, Shrikanth S. Narayanan. 61-64
- Online Pattern Learning for Non-Negative Convolutive Sparse Coding. Dong Wang, Ravichander Vipperla, Nicholas W. D. Evans. 65-68
- Sinewave Representations of Nonmodality. Nicolas Malyska, Thomas F. Quatieri, Robert B. Dunn. 69-72
- Time-Varying Signal Adaptive Transform and IHT Recovery of Compressive Sensed Speech. Ch. Srikanth Raj, Thippur V. Sreenivas. 73-76
- Acoustic-Linguistic Recognition of Interest in Speech with Bottleneck-BLSTM Nets. Martin Wöllmer, Felix Weninger, Florian Eyben, Björn Schuller. 77-80
- Automatic Detection of Anger in Human-Human Call Center Dialogs. Mustafa Erden, Levent M. Arslan. 81-84
- Improved Classification of Speaking Styles for Mental Health Monitoring Using Phoneme Dynamics. Keng-hao Chang, Howard Lei, John Canny. 85-88
- "You made me do it": Classification of Blame in Married Couples' Interactions by Fusing Automatically Derived Speech and Language Information. Matthew Black, Panayiotis G. Georgiou, Athanasios Katsamanis, Brian R. Baucom, Shrikanth S. Narayanan. 89-92
- Context and Priming Effects in the Recognition of Emotion of Old and Young Listeners. Martijn Goudbeek, Marie Nilsenová. 93-96
- Acoustic and Prosodic Correlates of Social Behavior. Agustín Gravano, Rivka Levitan, Laura Willson, Stefan Benus, Julia Hirschberg, Ani Nenkova. 97-100
- Decision Tree-Based Clustering with Outlier Detection for HMM-Based Speech Synthesis. Kyung Hwan Oh, June Sig Sung, Doo Hwa Hong, Nam Soo Kim. 101-104
- Prediction of Voice Aperiodicity Based on Spectral Representations in HMM Speech Synthesis. Hanna Silén, Elina Helander, Moncef Gabbouj. 105-108
- A Perceptual Expressivity Modeling Technique for Speech Synthesis Based on Multiple-Regression HSMM. Takashi Nose, Takao Kobayashi. 109-112
- Multi-Speaker Modeling with Shared Prior Distributions and Model Structures for Bayesian Speech Synthesis. Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda. 113-116
- Feature-Space Transform Tying in Unified Acoustic-Articulatory Modelling for Articulatory Control of HMM-Based Speech Synthesis. Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi. 117-120
- The Effect of Using Normalized Models in Statistical Speech Synthesis. Matt Shannon, Heiga Zen, William J. Byrne. 121-124
- Restoring the Residual Speaker Information in Total Variability Modeling for Speaker Verification. Ce Zhang, Rong Zheng, Bo Xu. 125-128
- New Developments in Joint Factor Analysis for Speaker Verification. Hagai Aronowitz, Oren Barkan. 129-132
- Speaker Recognition Using Temporal Contours in Linguistic Units: The Case of Formant and Formant-Bandwidth Trajectories. Joaquin Gonzalez-Rodriguez. 133-136
- Discriminatively Trained i-vector Extractor for Speaker Verification. Ondrej Glembek, Lukas Burget, Niko Brümmer, Oldrich Plchot, Pavel Matejka. 137-140
- Constrained Cepstral Speaker Recognition Using Matched UBM and JFA Training. Michelle Hewlett Sanchez, Luciana Ferrer, Elizabeth Shriberg, Andreas Stolcke. 141-144
- A New Perspective on GMM Subspace Compensation Based on PPCA and Wiener Filtering. Alan McCree, Douglas E. Sturim, Douglas A. Reynolds. 145-148
- Perceptual Learning of Liquids. Odette Scharenborg, Holger Mitterer, James M. McQueen. 149-152
- The Efficiency of Cross-Dialectal Word Recognition. Annelie Tuinman, Holger Mitterer, Anne Cutler. 153-156
- Estimation of Perceptual Spaces for Speaker Identities Based on the Cross-Lingual Discrimination Task. Minoru Tsuzaki, Keiichi Tokuda, Hisashi Kawai, Jinfu Ni. 157-160
- The Relation Between Perception and Production in L2 Phonological Processing. Sharon Peperkamp, Camillia Bouchon. 161-164
- The Role of Word-Initial Glottal Stops in Recognizing English Words. Maria Paola Bissiri, Maria Luisa Garcia Lecumberri, Martin Cooke, Jan Volín. 165-168
- Effect of Language Experience on the Categorical Perception of Cantonese Vowel Duration. Caicai Zhang, Gang Peng, William S.-Y. Wang. 169-172
- Adaptive Estimation of Zeros of Time-Varying Z-Transforms. Christian Fischer Pedersen, Ove Andersen, Paul Dalsgaard. 173-176
- Identifying Regions of Non-Modal Phonation Using Features of the Wavelet Transform. John Kane, Christer Gobl. 177-180
- Acoustic Analysis of Whispered Speech for Phoneme and Speaker Dependency. Xing Fan, Keith W. Godin, John H. L. Hansen. 181-184
- Multi-Party Speech Recovery Exploiting Structured Sparsity Models. Afsaneh Asaei, Mohammad Javad Taghizadeh, Hervé Bourlard, Volkan Cevher. 185-188
- Modulation Spectrum Analysis for Recognition of Reverberant Speech. Sri Harish Reddy Mallidi, Sriram Ganapathy, Hynek Hermansky. 189-192
- Discrete Choice Models for Non-Intrusive Quality Assessment. Petko N. Petkov, W. Bastiaan Kleijn, Bert de Vries. 193-196
- Single Channel Dereverberation Using Example-Based Speech Enhancement with Uncertainty Decoding Technique. Keisuke Kinoshita, Mehrez Souden, Marc Delcroix, Tomohiro Nakatani. 197-200
- A Statistical Room Impulse Response Model with Frequency Dependent Reverberation Time for Single-Microphone Late Reverberation Suppression. Jan S. Erkelens, Richard Heusdens. 201-204
- An Assessment of the Improvement Potential of Time-Frequency Masking for Speech Dereverberation. Chenxi Zheng, Tiago H. Falk, Wai-Yip Chan. 205-208
- Perceptual Improvement of a Two-Stage Algorithm for Speech Dereverberation. Thiago de M. Prego, Amaro A. de Lima, Sergio L. Netto. 209-212
- A Model-Based Spectral Envelope Wiener Filter for Perceptually Motivated Speech Enhancement. Najib Hadir, Friedrich Faubel, Dietrich Klakow. 213-216
- Binaural Noise-Reduction Method Based on Blind Source Separation and Perceptual Post Processing. Jorge I. Marin-Hurtado, Devangi N. Parikh, David V. Anderson. 217-220
- Region Dependent Transform on MLP Features for Speech Recognition. Tim Ng, Bing Zhang, Spyridon Matsoukas, Long Nguyen. 221-224
- Discriminant Sub-Space Projection of Spectro-Temporal Speech Features Based on Maximizing Mutual Information. Martin Heckmann, Claudius Gläser. 225-228
- Combining Feature Space Discriminative Training with Long-Term Spectro-Temporal Features for Noise-Robust Speech Recognition. Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura. 229-232
- Improved Bottleneck Features Using Pretrained Deep Neural Networks. Dong Yu, Michael L. Seltzer. 237-240
- Minimum Classification Error Based Spectro-Temporal Feature Extraction for Robust Audio Classification. Yuan-Fu Liao, Chia-Hsing Lin, We-Der Fang. 241-244
- Data-Driven Gaussian Component Selection for Fast GMM-Based Speaker Verification. Ce Zhang, Rong Zheng, Bo Xu. 245-248
- Analysis of i-vector Length Normalization in Speaker Recognition Systems. Daniel Garcia-Romero, Carol Y. Espy-Wilson. 249-252
- An Analysis Framework Based on Random Subspace Sampling for Speaker Verification. Weiwu Jiang, Zhifeng Li, Helen M. Meng. 253-256
- Factor Analysis Back Ends for MLLR Transforms in Speaker Recognition. Nicolas Scheffer, Yun Lei, Luciana Ferrer. 257-260
- Report on Performance Results in the NIST 2010 Speaker Recognition Evaluation. Craig S. Greenberg, Alvin F. Martin, Bradford Barr, George R. Doddington. 261-264
- iVector Fusion of Prosodic and Cepstral Features for Speaker Verification. Marcel Kockmann, Luciana Ferrer, Lukas Burget, Jan Cernocký. 265-268
- Visualization of Vocal Tract Shape Using Interleaved Real-Time MRI of Multiple Scan Planes. Yoon-Chul Kim, Michael I. Proctor, Shrikanth S. Narayanan, Krishna S. Nayak. 269-272
- Biomechanical Tongue Models: An Approach to Studying Inter-Speaker Variability. Ralf Winkler, Susanne Fuchs, Pascal Perrier, Mark Tiede. 273-276
- Quantifying Articulatory Distinctiveness of Vowels. Jun Wang, Jordan R. Green, Ashok Samal, David Marx. 277-280
- Direct Estimation of Articulatory Kinematics from Real-Time Magnetic Resonance Image Sequences. Michael I. Proctor, Adam C. Lammert, Athanasios Katsamanis, Louis M. Goldstein, Christina Hagedorn, Shrikanth S. Narayanan. 281-284
- Combined Optical Distance Sensing and Electropalatography to Measure Articulation. Peter Birkholz, Christiane Neuschaefer-Rube. 285-288
- Simulating Post-L F0 Bouncing by Modeling Articulatory Dynamics. Santitham Prom-on, Yi Xu, Fang Liu. 289-292
- Learning New Acoustic Events in an HMM-Based System Using MAP Adaptation. Jürgen T. Geiger, Mohamed Anouar Lakhal, Björn Schuller, Gerhard Rigoll. 293-296
- Alternative Frequency Scale Cepstral Coefficient for Robust Sound Event Recognition. Yi Ren Leng, Tran Huy Dat, Norihide Kitaoka, Haizhou Li. 297-300
- Evaluation of Abnormal Sound Detection using Multi-Stage GMM in Various Environments. Akinori Ito, Akihito Aiba, Masashi Ito, Shozo Makino. 301-304
- Unsupervised Learning of Acoustic Events Using Dynamic Time Warping and Hierarchical K-Means++ Clustering. Joerg Schmalenstroeer, Markus Bartek, Reinhold Haeb-Umbach. 305-308
- Feature Extraction Assessment for an Acoustic-Event Classification Task Using the Entropy Triangle. David Mejía-Navarrete, Ascensión Gallardo-Antolín, Carmen Peláez-Moreno, Francisco J. Valverde-Albacete. 309-312
- Unsupervised Audio Analysis for Categorizing Heterogeneous Consumer Domain Videos. Pradeep Natarajan, Stavros Tsakalidis, Vasant Manohar, Rohit Prasad, Premkumar Natarajan. 313-316
- Enriching Text-to-Speech Synthesis Using Automatic Dialog Act Tags. Vivek Kumar Rangarajan Sridhar, Ann K. Syrdal, Alistair Conkie, Srinivas Bangalore. 317-320
- Joint Target and Join Cost Weight Training for Unit Selection Synthesis. Lukas Latacz, Wesley Mattheyses, Werner Verhelst. 321-324
- Prominence-Based Prosody Prediction for Unit Selection Speech Synthesis. Andreas Windmann, Igor Jauk, Fabio Tamburini, Petra Wagner. 325-328
- Evaluating the Meaning of Synthesized Listener Vocalizations. Sathish Pammi, Marc Schröder. 329-332
- A Hybrid TTS Approach for Prosody and Acoustic Modules. Iñaki Sainz, Daniel Erro, Eva Navas, Inma Hernáez. 333-336
- Uniform Speech Parameterization for Multi-Form Segment Synthesis. Alexander Sorin, Slava Shechtman, Vincent Pollet. 337-340
- Theoretical Analysis of Musical Noise and Speech Distortion in Structure-Generalized Parametric Blind Spatial Subtraction Array. Ryoichi Miyazaki, Hiroshi Saruwatari, Kiyohiro Shikano. 341-344
- Subjective and Objective Evaluation of Speech Intelligibility Enhancement Under Constant Energy and Duration Constraints. Yan Tang, Martin Cooke. 345-348
- A Risk-Estimation-Based Comparison of Mean Square Error and Itakura-Saito Distortion Measures for Speech Enhancement. Nagarjuna Reddy Muraka, Chandra Sekhar Seelamantula. 349-352
- On Noise Tracking for Noise Floor Estimation. Mahdi Triki. 353-356
- Maximum a posteriori Estimation of Noise from Non-Acoustic Reference Signals in Very Low Signal-to-Noise Ratio Environments. Ben Milner. 357-360
- Blind Speech Prior Estimation for Generalized Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator. Ryo Wakisaka, Hiroshi Saruwatari, Kiyohiro Shikano, Tomoya Takatani. 361-364
- Harmonic Structure Transform for Speaker Recognition. Kornel Laskowski, Qin Jin. 365-368
- Combining Evidence from Spectral and Source-Like Features for Person Recognition from Humming. Hemant A. Patil, Maulik C. Madhavi, Keshab K. Parhi. 369-372
- Improvements in Speaker Characterization Using Spectral Subband Energy Based on Harmonic plus Noise Model. Yanhua Long, Zhi-Jie Yan, Frank K. Soong, Li-Rong Dai, Wu Guo. 373-376
- Implicit Segmentation in Two-Wire Speaker Recognition. Yosef A. Solewicz, Hagai Aronowitz. 377-380
- Boosting Speaker Recognition Performance with Compact Representations. Sibel Yaman, Jason W. Pelecanos, Mohamed Kamal Omar. 381-384
- Partitioning of Two-Speaker Conversation Datasets. Carlos Vaquero, Alfonso Ortega, Eduardo Lleida. 385-388
- Jaw Movement in Vowels and Liquids Forming the Syllable Nucleus. Stefan Benus, Marianne Pouplier. 389-392
- Coarticulation Across Prosodic Domains in Italian: An Ultrasound Investigation. Barbara Gili Fivela, Antonio Stella, Sonia D'Apolito, Francesco Sigona. 393-396
- Investigating the Stability of Intergestural Timing Relations. Juraj Simko, Fred Cummins, Stefan Benus. 397-400
- Speech Timing Organization for the Phonological Length Contrast in Italian Consonants. Claudio Zmarich, Barbara Gili Fivela, Pascal Perrier, Christophe Savariaux, Graziano Tisato. 401-404
- Timing in Italian VNC Sequences at Different Speech Rates. Chiara Celata, Silvia Calamai. 405-408
- Automatic Analysis of Singleton and Geminate Consonant Articulation Using Real-Time Magnetic Resonance Imaging. Christina Hagedorn, Michael I. Proctor, Louis Goldstein. 409-412
- A Two-Stage Sample-Based Phone Boundary Detector Using Segmental Similarity Features. Yih-Ru Wang. 413-416
- Iterative Improvement of Speaker Segmentation in a Noisy Environment Using High-Level Knowledge. Qiang Huang, Stephen J. Cox. 417-420
- Hierarchical Audio Segmentation with HMM and Factor Analysis in Broadcast News Domain. Diego Castán, Carlos Vaquero, Alfonso Ortega, David Martínez González, Jesús A. Villalba, Eduardo Lleida. 421-424
- Syllable Segmentation of Continuous Speech Using Auditory Attention Cues. Ozlem Kalinli. 425-428
- Exploiting Phone-Class Specific Landmarks for Refinement of Segment Boundaries in TTS Databases. Vijayaditya Peddinti, Kishore Prahallad. 429-432
- Phoneme-Level Text to Audio Synchronization on Speech Signals with Background Music. Agnès Pedone, Juan José Burred, Simon Maller, Pierre Leveau. 433-436
- Conversational Speech Transcription Using Context-Dependent Deep Neural Networks. Frank Seide, Gang Li, Dong Yu. 437-440
- Sequential Classification Criteria for NNs in Automatic Speech Recognition. Guangsen Wang, Khe Chai Sim. 441-444
- Grapheme-Based Automatic Speech Recognition Using KL-HMM. Mathew Magimai-Doss, Ramya Rasipuram, Guillermo Aradilla, Hervé Bourlard. 445-448
- Direct Error Rate Minimization of Hidden Markov Models. Joseph Keshet, Chih-Chieh Cheng, Mark Stoehr, David A. McAllester. 449-452
- On the Effectiveness of Statistical Modeling Based Template Matching Approach for Continuous Speech Recognition. Xie Sun, Xin Chen, Yunxin Zhao. 453-456
- Comparison of Smoothing Techniques for Robust Context Dependent Acoustic Modelling in Hybrid NN/HMM Systems. Guangsen Wang, Khe Chai Sim. 457-460
- Propagation of Uncertainty Through Multilayer Perceptrons for Robust Automatic Speech Recognition. Ramón Fernandez Astudillo, João Paulo da Silva Neto. 461-464
- Mapping Sparse Representation to State Likelihoods in Noise-Robust Automatic Speech Recognition. Katariina Mahkonen, Antti Hurmalainen, Tuomas Virtanen, Jort F. Gemmeke. 465-468
- Uncertainty Measures for Improving Exemplar-Based Source Separation. Heikki Kallasjoki, Ulpu Remes, Jort F. Gemmeke, Tuomas Virtanen, Kalle J. Palomäki. 469-472
- Maximum Confidence Measure Based Interaural Phase Difference Estimation for Noise Masking in Dual-Microphone Robust Speech Recognition. Hsien-Cheng Liao, Yuan-Fu Liao, Chin-Hui Lee. 473-476
- A Performance Monitoring Approach to Fusing Enhanced Spectrogram Channels in Robust Speech Recognition. Shirin Badiezadegan, Richard C. Rose. 477-480
- Generalized Variable Parameter HMMs for Noise Robust Speech Recognition. Ning Cheng, Xunying Liu, Lan Wang. 481-484
- Intersession Compensation and Scoring Methods in the i-vectors Space for Speaker Recognition. Pierre-Michel Bousquet, Driss Matrouf, Jean-François Bonastre. 485-488
- Kernel Alignment Maximization for Speaker Recognition Based on High-Level Features. Szymon Drgas, Adam Dabrowski. 489-492
- Kernel Partial Least Squares for Speaker Recognition. Balaji Vasan Srinivasan, Daniel Garcia-Romero, Dmitry N. Zotkin, Ramani Duraiswami. 493-496
- Conversational-Side-Specific Inter-Session Variability Compensation. Mohamed Kamal Omar, Jason W. Pelecanos. 497-500
- A Speaker Line-Up for the Likelihood Ratio. David A. van Leeuwen, Niko Brümmer. 501-504
- Towards Fully Bayesian Speaker Recognition: Integrating Out the Between-Speaker Covariance. Jesús A. Villalba, Niko Brümmer. 505-508
- Novel VTEO Based Mel Cepstral Features for Classification of Normal and Pathological Voices. Hemant A. Patil, Pallavi N. Baljekar. 509-512
- Temporal Performance of Dysarthric Patients in Speech and Tapping Tasks. Eiji Shimura, Kazuhiko Kakehi. 513-516
- A Comparative Acoustic Study on Speech of Glossectomy Patients and Normal Subjects. Xinhui Zhou, Maureen Stone, Carol Y. Espy-Wilson. 517-520
- Dysperiodicity Analysis of Perceptually Assessed Synthetic Speech Stimuli. Ali Alpan, Francis Grenez, Jean Schoentgen. 521-524
- Is the Perception of Voice Quality Language-Dependant? A Comparison of French and Italian Listeners and Dysphonic Speakers. Alain Ghio, Frédérique Weisz, Giovanna Baracca, Giovanna Cantarella, Danièle Robert, Virginie Woisard, Franco Fussi, Antoine Giovanni. 525-528
- Automatic Selection of Acoustic and Non-Linear Dynamic Features in Voice Signals for Hypernasality Detection. Juan R. Orozco-Arroyave, S. Murillo Rendón, Andrés Marino Álvarez-Meza, Julián D. Arias-Londoño, Edilson Delgado-Trejos, Jesus Francisco Vargas Bonilla, César Germán Castellanos-Domínguez. 529-532
- Learning from Mistakes: Expanding Pronunciation Lexicons Using Word Recognition Errors. Sravana Reddy, Evandro B. Gouvêa. 533-536
- Improving Non-Native ASR Through Stochastic Multilingual Phoneme Space Transformations. David Imseng, Hervé Bourlard, John Dines, Philip N. Garner, Mathew Magimai-Doss. 537-540
- Unsupervised Arabic Dialect Adaptation with Self-Training. Scott Novotney, Richard M. Schwartz, Sanjeev Khudanpur. 541-544
- Template-Based Automatic Speech Recognition Meets Prosody. Dino Seppi, Kris Demuynck, Dirk Van Compernolle. 545-548
- Pronunciation Learning from Continuous Speech. Ibrahim Badr, Ian McGraw, James R. Glass. 549-552
- State-Level Data Borrowing for Low-Resource Speech Recognition Based on Subspace GMMs. Yanmin Qian, Daniel Povey, Jia Liu. 553-556
- Blind Speech Separation in Multiple Environments Using a Frequency Oriented PCA Method for Convolutive Mixtures. Yasmina Benabderrahmane, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy. 557-560
- Blind Speech Separation in Time-Domain Using Block-Toeplitz Structure of Reconstructed Signal Matrices. Zbynek Koldovský, Jirí Málek, Petr Tichavský. 561-564
- Generalized Method for Solving the Permutation Problem in Frequency-Domain Blind Source Separation of Convolved Speech Signals. Auxiliadora Sarmiento, Iván Durán-Díaz, Sergio Cruces, Pablo Aguilera. 565-568
- Adaptation of Speaker-Specific Bases in Non-Negative Matrix Factorization for Single Channel Speech-Music Separation. Emad M. Grais, Hakan Erdogan. 569-572
- An Informed Source Separation System for Speech Signals. Shuhua Zhang, Laurent Girin. 573-576
- Adaptive Blocking Beamformer for Speech Separation. Ngoc Thuy Tran, William G. Cowley, André Pollok. 577-580
- Asynchronous Multimodal Text Entry Using Speech and Gesture Keyboards. Per Ola Kristensson, Keith Vertanen. 581-584
- Robust Bimodal Person Identification Using Face and Speech with Limited Training Data and Corruption of Both Modalities. Niall McLaughlin, Ji Ming, Danny Crookes. 585-588
- Toward a Multi-Speaker Visual Articulatory Feedback System. Atef Ben Youssef, Thomas Hueber, Pierre Badin, Gérard Bailly. 589-592
- Statistical Mapping Between Articulatory and Acoustic Data for an Ultrasound-Based Silent Speech Interface. Thomas Hueber, Elie-Laurent Benaroya, Bruce Denby, Gérard Chollet. 593-596
- Unsupervised Geometry Calibration of Acoustic Sensor Networks Using Source Correspondences. Joerg Schmalenstroeer, Florian Jacob, Reinhold Haeb-Umbach, Marius H. Hennecke, Gernot A. Fink. 597-600
- Investigations on Speaking Mode Discrepancies in EMG-Based Speech Recognition. Michael Wand, Matthias Janke, Tanja Schultz. 601-604
- Empirical Evaluation and Combination of Advanced Language Modeling Techniques. Tomas Mikolov, Anoop Deoras, Stefan Kombrink, Lukás Burget, Jan Cernocký. 605-608
- Personalizing Model M for Voice-Search. Geoffrey Zweig, Shuangyu Chang. 609-612
- Sentence Selection by Direct Likelihood Maximization for Language Model Adaptation. Takahiro Shinozaki, Yu Kubota, Sadaoki Furui, Eiji Utsunomiya, Yasutaka Shindoh. 613-616
- Feature Combination Approaches for Discriminative Language Models. Ebru Arisoy, Bhuvana Ramabhadran, Hong-Kwang Jeff Kuo. 617-620
- On-Line Language Model Biasing for Multi-Pass Automatic Speech Recognition. Sankaranarayanan Ananthakrishnan, Stavros Tsakalidis, Rohit Prasad, Premkumar Natarajan. 621-624
- Mandarin Word-Character Hybrid-Input Neural Network Language Model. Moonyoung Kang, Tim Ng, Long Nguyen. 625-628
- Laryngealization and Breathiness in Persian. Vahid Sadeghi. 629-632
- Age-Dependent Differences in the Neutralization of the Intervocalic Voicing Contrast: Evidence from an Apparent-Time Study on East Franconian. Viola Müller, Jonathan Harrington, Felicitas Kleber, Ulrich Reubold. 633-636
- Comparing Syllable Frequencies in Corpora of Written and Spoken Language. Barbara Samlowski, Bernd Möbius, Petra Wagner. 637-640
- Sylli: Automatic Phonological Syllabification for Italian. Luca Iacoponi, Renata Savy. 641-644
- A Preliminary Study on the Production of Signs in Brazilian Sign Language when One of the Manual Articulators is Unavailable. André N. Xavier, Plínio A. Barbosa. 645-648
- Electroglottograph and Acoustic Cues for Phonation Contrasts in Taiwan Min Falling Tones. Ho-hsien Pan, Mao-Hsu Chen, Shao-Ren Lyu. 649-652
- One-to-Many Voice Conversion Based on Tensor Representation of Speaker Space. Daisuke Saito, Keisuke Yamamoto, Nobuaki Minematsu, Keikichi Hirose. 653-656
- A Study on Bag of Gaussian Model with Application to Voice Conversion. Yu Qiao, Tong Tong, Nobuaki Minematsu. 657-660
- A Bayesian Approach to Voice Conversion Based on GMMs Using Multiple Model Structures. Lei Li, Yoshihiko Nankaku, Keiichi Tokuda. 661-664
- Quality Improvement of Voice Conversion Systems Based on Trellis Structured Vector Quantization. Mahdi Eslami, Hamid Sheikhzadeh, Abolghasem Sayadiyan. 665-668
- Voice Conversion Using GMM with Enhanced Global Variance. Hadas Benisty, David Malah. 669-672
- Spectral Envelope Transformation Using DFW and Amplitude Scaling for Voice Conversion with Parallel or Nonparallel Corpora. Elizabeth Godoy, Olivier Rosec, Thierry Chonavel. 673-676
- Sinusoidal Approach for the Single-Channel Speech Separation and Recognition Challenge. Pejman Mowlaee, Rahim Saeidi, Zheng-Hua Tan, Mads Græsbøll Christensen, Tomi Kinnunen, Pasi Fränti, Søren Holdt Jensen. 677-680
- Semi-Supervised Single-Channel Speech-Music Separation for Automatic Speech Recognition. Cemil Demir, A. Taylan Cemgil, Murat Saraclar. 681-684
- A Level-Dependent Auditory Filter-Bank for Speech Recognition in Reverberant Environments. Hari Krishna Maganti, Marco Matassoni. 685-688
- A Multichannel Feature-Based Processing for Robust Speech Recognition. Mehrez Souden, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani. 689-692
- Feature Normalization Using Structured Full Transforms for Robust Speech Recognition. Xiong Xiao, Jinyu Li, Chng Eng Siong, Haizhou Li. 693-696
- A Robust Estimation Method of Noise Mixture Model for Noise Suppression. Masakiyo Fujimoto, Shinji Watanabe, Tomohiro Nakatani. 697-700
- Multi-Task Learning for Spoken Language Understanding with Shared Slots. Xiao Li, Ye-Yi Wang, Gökhan Tür. 701-704
- Learning Weighted Entity Lists from Web Click Logs for Spoken Language Understanding. Dustin Hillard, Asli Çelikyilmaz, Dilek Z. Hakkani-Tür, Gökhan Tür. 705-708
- Bootstrapping Domain Detection Using Query Click Logs for New Domains. Dilek Z. Hakkani-Tür, Gökhan Tür, Larry P. Heck, Elizabeth Shriberg. 709-712
- Approximate Inference for Domain Detection in Spoken Language Understanding. Asli Çelikyilmaz, Dilek Z. Hakkani-Tür, Gökhan Tür. 713-716
- Speech Indexing Using Semantic Context Inference. Chien-Lin Huang, Bin Ma, Haizhou Li, Chung-Hsien Wu. 717-720
- Automatically Optimizing Utterance Classification Performance without Human in the Loop. Yun-Cheng Ju, Jasha Droppo. 721-724
- In Search of Cues Discriminating West-African Accents in French. Philippe Boula de Mareüil, Jean-Luc Rouas, Manuela Yapomo. 725-728
- Computer and Human Recognition of Regional Accents of British English. Abualsoud Hanani, Martin J. Russell, Michael J. Carey. 729-732
- Target-Aware Lattice Rescoring for Dialect Recognition. Rong Tong, Bin Ma, Haizhou Li, Chng Eng Siong. 733-736
- Effective Arabic Dialect Classification Using Diverse Phonotactic Models. Murat Akbacak, Dimitra Vergyri, Andreas Stolcke, Nicolas Scheffer, Arindam Mandal. 737-740
- Characterizing Deletion Transformations Across Dialects Using a Sophisticated Tying Mechanism. Nancy F. Chen, Wade Shen, Joseph P. Campbell. 741-744
- Dialect and Accent Recognition Using Phonetic-Segmentation Supervectors. Fadi Biadsy, Julia Hirschberg, Daniel P. W. Ellis. 745-748
- The Multi Timescale Phoneme Acquisition Model of the Self-Organizing Based on the Dynamic Features. Kouki Miyazawa, Hideaki Miura, Hideaki Kikuchi, Reiko Mazuka. 749-752
- The Time-Course of Talker-Specificity Effects for Newly-Learned Pseudowords: Evidence for a Hybrid Model of Lexical Representation. Helen Brown, M. Gareth Gaskell. 753-756
- A Parametric Approach to Intonation Acquisition Research: Validation on Child-Directed Speech Data. Britta Lintfert, Antje Schweitzer, Bernd Möbius. 757-760
- Modelling Novelty Preference in Word Learning. Maarten Versteegh, Louis ten Bosch, Lou Boves. 761-764
- Using Imitation to Learn Infant-Adult Acoustic Mappings. G. Ananthakrishnan, Giampiero Salvi. 765-768
- Thresholding Word Activations for Response Scoring - Modelling Psycholinguistic Data. Christina Bergmann, Louis ten Bosch, Lou Boves. 769-772
- Generalized Baum-Welch Algorithm and its Implication to a New Extended Baum-Welch Algorithm. Roger Hsiao, Tanja Schultz. 773-776
- Word Boundary Modelling and Full Covariance Gaussians for Arabic Speech-to-Text Systems. Frank Diehl, Mark John Francis Gales, Xunying Liu, Marcus Tomalin, Philip C. Woodland. 777-780
- A Fully Automated Derivation of State-Based Eigentriphones for Triphone Modeling with No Tied States Using Regularization. Tom Ko, Brian Mak. 781-784
- Reducing Computational Complexities of Exemplar-Based Sparse Representations with Applications to Large Vocabulary Speech Recognition. Tara N. Sainath, Bhuvana Ramabhadran, David Nahamoo, Dimitri Kanevsky. 785-788
- An i-vector Based Approach to Training Data Clustering for Improved Speech Recognition. Yu Zhang, Jian Xu, Zhi-Jie Yan, Qiang Huo. 789-792
- Rapid Training of Acoustic Models Using Graphics Processing Unit. Senaka Buthpitiya, Ian R. Lane, Jike Chong. 793-796
- User Study of Spoken Decision Support System. Teruhisa Misu, Kiyonori Ohtake, Chiori Hori, Hisashi Kawai, Satoshi Nakamura. 797-800
- Efficient Probabilistic Tracking of User Goal and Dialog History for Spoken Dialog Systems. Antoine Raux, Yi Ma. 801-804
- Tackling a Shilly-Shally Classifier for Predicting Task Success in Spoken Dialogue Interaction. Alexander Schmitt, Alexander Zgorzelski, Wolfgang Minker. 805-808
- Evaluation of Listening-Oriented Dialogue Control Rules Based on the Analysis of HMMs. Toyomi Meguro, Yasuhiro Minami, Ryuichiro Higashinaka, Kohji Dohsaka. 809-812
- Large-Scale Experiments on Data-Driven Design of Commercial Spoken Dialog Systems. David Suendermann, Jackson Liscombe, Jonathan Bloom, Grace Li, Roberto Pieraccini. 813-816
- Comparing System-Driven and Free Dialogue in In-Vehicle Interaction. Fredrik Kronlid, Jessica Villing, Alexander Berman, Staffan Larsson. 817-820
- Rapid Evaluation of Speech Representations for Spoken Term DiscoveryMichael A. Carlin, Samuel Thomas, Aren Jansen, Hynek Hermansky. 821-824 [doi]
- Phonemic Similarity Metrics to Compare Pronunciation MethodsBen Hixon, Eric Schneider, Susan L. Epstein. 825-828 [doi]
- Investigating the Effect of Number of Interlocutors on the Quality of Experience for Multi-Party Audio ConferencingJanto Skowronek, Alexander Raake. 829-832 [doi]
- On Development of Consistently Punctuated Speech CorporaJáchym Kolár, Lori Lamel. 833-836 [doi]
- A Multimodal Real-Time MRI Articulatory Corpus for Speech ResearchShrikanth Narayanan, Erik Bresch, Prasanta Kumar Ghosh, Louis Goldstein, Athanasios Katsamanis, Yoon Kim, Adam C. Lammert, Michael I. Proctor, Vikram Ramanarayanan, Yinghua Zhu. 837-840 [doi]
- Building an Audio-Visual Corpus of Australian English: Large Corpus Collection with an Economical Portable and Replicable Black BoxDenis Burnham, Dominique Estival, Steven Fazio, Jette Viethen, Felicity Cox, Robert Dale, Steve Cassidy, Julien Epps, Roberto Togneri, Michael Wagner, Yuko Kinoshita, Roland Göcke, Joanne Arciuli, Mark Onslow, Trent Lewis, Andrew Butcher, John Hajek. 841-844 [doi]
- Data-Driven UBM Generation via Tied Gaussians for GMM-Supervector Based Accent IdentificationRong Zheng, Ce Zhang, Bo Xu. 845-848 [doi]
- I3A Language Recognition System for Albayzin 2010 LREDavid Martínez González, Jesús A. Villalba, Antonio Miguel, Alfonso Ortega, Eduardo Lleida. 849-852 [doi]
- Dimensionality Reduction for Using High-Order n-Grams in SVM-Based Phonotactic Language RecognitionMikel Peñagarikano, Amparo Varona, Luis Javier Rodríguez, Germán Bordel. 853-856 [doi]
- Language Recognition via i-vectors and Dimensionality ReductionNajim Dehak, Pedro A. Torres-Carrasquillo, Douglas A. Reynolds, Réda Dehak. 857-860 [doi]
- Language Recognition in iVectors SpaceDavid Martínez González, Oldrich Plchot, Lukás Burget, Ondrej Glembek, Pavel Matejka. 861-864 [doi]
- On Mispronunciation Lexicon Generation Using Joint-Sequence Multigrams in Computer-Aided Pronunciation Training (CAPT)Xiaojun Qian, Helen M. Meng, Frank K. Soong. 865-868 [doi]
- Validating a Second Language Perception Model for Classroom Context - A Longitudinal Study within the Perceptual Assimilation ModelBianca Sisinni, Mirko Grimaldi. 869-872 [doi]
- The Role of Variability in Non-Native Perceptual Learning of a Japanese Geminate-Singleton Fricative ContrastMakiko Sadakata, James M. McQueen. 873-876 [doi]
- Fluency Changes with General Progress in L2 ProficiencyJared Bernstein, Jian Cheng, Masanori Suzuki. 877-880 [doi]
- Tongue Gestures Awareness and Pronunciation TrainingSlim Ouni. 881-884 [doi]
- Impact of Speaker Variability on Speech Perception in Non-Native ListenersWim A. van Dommelen, Valérie Hazan. 885-888 [doi]
- A Template Based Voice Trigger System Using Bhattacharyya Edit DistanceEvelyn Kurniawati, Samsudin Ng, Karthik Muralidhar, Sapna George. 889-892 [doi]
- Acoustic Look-Ahead for More Efficient Decoding in LVCSRDavid Nolden, Ralf Schlüter, Hermann Ney. 893-896 [doi]
- A New Epsilon Filter for Efficient Composition of Weighted Finite-State TransducersFrank Duckhorn, Matthias Wolff, Rüdiger Hoffmann. 897-900 [doi]
- A Bottom-Up Stepwise Knowledge-Integration Approach to Large Vocabulary Continuous Speech Recognition Using Weighted Finite State MachinesSabato Marco Siniscalchi, Torbjørn Svendsen, Chin-Hui Lee. 901-904 [doi]
- Combining Information Sources for Confidence Estimation with CRF ModelsMatthew Stephen Seigel, Philip C. Woodland. 905-908 [doi]
- Evaluation of Fast Spoken Term Detection Using a Suffix ArrayKouichi Katsurada, Shinta Sawada, Shigeki Teshima, Yurie Iribe, Tsuneo Nitta. 909-912 [doi]
- Latent Topic Modeling for Audio Corpus SummarizationTimothy J. Hazen. 913-916 [doi]
- Investigation of Spontaneous Speech Characterization Applied to Speaker Role RecognitionRichard Dufour, Yannick Estève, Paul Deléglise. 917-920 [doi]
- Zero-Resource Audio-Only Spoken Term Detection Based on a Combination of Template Matching TechniquesArmando Muscariello, Guillaume Gravier, Frédéric Bimbot. 921-924 [doi]
- Automatic Learning in Content Indexing Service Using Phonetic AlignmentYeon-Jun Kim, David C. Gibbon. 925-928 [doi]
- Leveraging Relevance Cues for Improved Spoken Document RetrievalPei-Ning Chen, Kuan-Yu Chen, Berlin Chen. 929-932 [doi]
- Spoken Lecture Summarization by Random Walk over a Graph Constructed with Automatically Extracted Key TermsYun-Nung Chen, Yu Huang, Ching-feng Yeh, Lin-Shan Lee. 933-936 [doi]
- Speaker Diarization Using a priori Acoustic InformationHagai Aronowitz. 937-940 [doi]
- Improved Overlapped Speech Handling for Speaker DiarizationKofi Boakye, Oriol Vinyals, Gerald Friedland. 941-944 [doi]
- Exploiting Intra-Conversation Variability for Speaker DiarizationStephen Shum, Najim Dehak, Ekapol Chuangsuwanich, Douglas A. Reynolds, James R. Glass. 945-948 [doi]
- Speaker Clustering Based on Non-Negative Matrix FactorizationMasafumi Nishida, Seiichi Yamamoto. 949-952 [doi]
- Information Bottleneck Features for HMM/GMM Speaker Diarization of Meetings RecordingsSree Harsha Yella, Fabio Valente. 953-956 [doi]
- Cross Likelihood Ratio Based Speaker Clustering Using Eigenvoice ModelsDavid Wang, Robbie Vogt, Sridha Sridharan, David Dean. 957-960 [doi]
- A Quantitative Investigation of the Prosody of Verum Focus in ItalianGiuseppina Turco, Michele Gubian, Jessamyn Schertz. 961-964 [doi]
- Effects of Focus on f0 and Duration in Irish (Gaelic) DeclarativesAmelie Dorn, Ailbhe Ní Chasaide. 965-968 [doi]
- The Phonology and Phonetics of Perceived Prosody: What do Listeners Imitate?Jennifer Cole, Stefanie Shattuck-Hufnagel. 969-972 [doi]
- Uncovering the Effect of Imitation on Tonal Patterns of French Accentual PhrasesAmandine Michelas, Noël Nguyen. 973-976 [doi]
- Crossmodal Prosodic and Gestural Contribution to the Perception of Contrastive FocusPilar Prieto, Cecilia Pugliesi, Joan Borràs-Comes, Ernesto Arroyo, Josep Blat. 977-980 [doi]
- Temporal Relationship Between Auditory and Visual Prosodic CuesErin Cvejic, Jeesun Kim, Chris Davis. 981-984 [doi]
- New Methods for Template Selection and Compression in Continuous Speech RecognitionXie Sun, Yunxin Zhao. 985-988 [doi]
- Structured Support Vector Machines for Noise Robust Continuous Speech RecognitionShi-Xiong Zhang, Mark J. F. Gales. 989-992 [doi]
- Continuous Digits Recognition Leveraging Invariant StructureMasayuki Suzuki, Gakuto Kurata, Masafumi Nishimura, Nobuaki Minematsu. 993-996 [doi]
- Convergence of Line Search A-Function MethodsDimitri Kanevsky, David Nahamoo, Tara N. Sainath, Bhuvana Ramabhadran. 997-1000 [doi]
- Hidden Boosted MMI and Hierarchical State Posterior Feature for Automatic Speech Recognition Based on Hidden Conditional Neural FieldsYasuhisa Fujii, Kazumasa Yamamoto, Seiichi Nakagawa. 1001-1004 [doi]
- Recognition and Real Time Performances of a Lightweight Ultrasound Based Silent Speech Interface Employing a Language ModelJun Cai, Bruce Denby, Pierre Roussel-Ragot, Gérard Dreyfus, Lise Crevier-Buchman. 1005-1008 [doi]
- Optimizing Situated Dialogue Management in Unknown EnvironmentsHeriberto Cuayáhuitl, Nina Dethlefs. 1009-1012 [doi]
- Acoustic-Similarity Based Technique to Improve Concept RecognitionOm Deshmukh, Shajith Ikbal, Ashish Verma, Etienne Marcheret. 1013-1016 [doi]
- Dialog Methods for Improved Alphanumeric String CaptureDoug Peters, Peter Stubley. 1017-1020 [doi]
- Detecting the Status of a Predictive Incremental Speech Understanding Model for Real-Time Decision-Making in a Spoken Dialogue SystemDavid DeVault, Kenji Sagae, David R. Traum. 1021-1024 [doi]
- User Simulation in Dialogue Systems Using Inverse Reinforcement LearningSenthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin. 1025-1028 [doi]
- Lossless Value Directed Compression of Complex User Goal States for Statistical Spoken Dialogue SystemsPaul A. Crook, Oliver Lemon. 1029-1032 [doi]
- Prosodic and Phonetic Features for Speaker Clustering in Speaker Diarization SystemsJanez Zibert, France Mihelic. 1033-1036 [doi]
- Diarization-Based Speaker Retrieval for Broadcast Television ArchivesMarijn Huijbregts, David A. van Leeuwen. 1037-1040 [doi]
- The Detection of Overlapping Speech with Prosodic Features for Speaker DiarizationMartin Zelenák, Javier Hernando. 1041-1044 [doi]
- LP Residual Features for Robust, Privacy-Sensitive Speaker DiarizationSree Hari Krishnan Parthasarathi, Hervé Bourlard, Daniel Gatica-Perez. 1045-1048 [doi]
- Extending the Task of Diarization to Speaker AttributionHouman Ghaemmaghami, David Dean, Robbie Vogt, Sridha Sridharan. 1049-1052 [doi]
- Comparing Multi-Stage Approaches for Cross-Show Speaker DiarizationViet-Anh Tran, Viet Bac Le, Claude Barras, Lori Lamel. 1053-1056 [doi]
- Analysing the Correspondence Between Automatic Prosodic Segmentation and Syntactic StructureGyörgy Szaszák, Katalin Nagy, András Beke. 1057-1060 [doi]
- Long-Distance Rhythmic Dependencies and their Application to Automatic Language IdentificationJoseph Tepperman, Emily Nava. 1061-1064 [doi]
- Symbolic and Direct Sequential Modeling of Prosody for Classification of Speaking-Style and NativenessAndrew Rosenberg. 1065-1068 [doi]
- Prosodic Analysis and Perception of Mandarin Utterances Conveying AttitudesWentao Gu, Ting Zhang, Hiroya Fujisaki. 1069-1072 [doi]
- Predicting Taiwan Mandarin Tone Shapes from their DurationChierh Cheng, Michele Gubian. 1073-1076 [doi]
- Variation of Accent Type and of Context - Influences on Pragmatic Focus InterpretationCharlotte Wollermann, Ulrich Schade, Bernhard Schröder. 1077-1080 [doi]
- Model Adaptation for Automatic Speech Recognition Based on Multiple Time Scale EvolutionShinji Watanabe, Atsushi Nakamura, Biing-Hwang Juang. 1081-1084 [doi]
- Integrated Online Speaker Clustering and AdaptationCatherine Breslin, K. K. Chin, Mark J. F. Gales, Kate Knill. 1085-1088 [doi]
- A Study on Speaker Normalized MLP Features in LVCSRZoltán Tüske, Christian Plahl, Ralf Schlüter. 1089-1092 [doi]
- Matrix-Variate Distribution of Training Models for Robust Speaker AdaptationYongwon Jeong, Young-Kuk Kim. 1093-1096 [doi]
- Separating Speaker and Environmental Variability Using Factored TransformsMichael L. Seltzer, Alex Acero. 1097-1100 [doi]
- Your Mobile Virtual Assistant Just Got Smarter!Mazin Gilbert, Iker Arizmendi, Enrico Bocchieri, Diamantino Caseiro, Vincent Goffin, Andrej Ljolje, Mike Phillips, Chao Wang, Jay G. Wilpon. 1101-1104 [doi]
- Topic Segmentation of TV-Streams by Mathematical Morphology and VectorizationVincent Claveau, Sébastien Lefèvre. 1105-1108 [doi]
- Probabilistic Latent Semantic Analysis for Broadcast News Story SegmentationMimi Lu, Cheung Chi Leung, Lei Xie, Bin Ma, Haizhou Li. 1109-1112 [doi]
- Hybrid Speech Recognition for Voice Search: A Comparative StudyEvandro B. Gouvêa. 1113-1116 [doi]
- A New Phonetic Candidate Generator for Improving Search Query EfficiencyBo Peng, Yao Qian, Frank K. Soong, Bo Zhang. 1117-1120 [doi]
- Towards Voice-Input Symbolic Pattern Retrieval Using Parameter-Based SearchYukiko Suzuki, Kiyoaki Aikawa. 1121-1124 [doi]
- A Language Independent Approach to Audio SearchVikram Gupta, Jitendra Ajmera, Arun Kumar, Ashish Verma. 1125-1128 [doi]
- Acquisition of Timing Patterns in Second LanguageMikhail Ordin, Leona Polyanskaya, Christiane Ulbrich. 1129-1132 [doi]
- Context-Dependent Duration Modeling with Backoff Strategy and Look-Up Tables for Pronunciation Assessment and Mispronunciation DetectionHongyan Li, Shen Huang, Shijin Wang, Bo Xu. 1133-1136 [doi]
- Perceptual Training of Vowel Length Contrast of Japanese by L2 Listeners: Effects of an Isolated Word versus a Word Embedded in SentencesMee Sonu, Keiichi Tajima, Hiroaki Kato, Yoshinori Sagisaka. 1137-1140 [doi]
- Similar Vowels in L1/L2 Production: Confused or Discerned in Early L2 English Learners with Different Amount of ExposureE-Chin Wu. 1141-1144 [doi]
- Production and Perception of Estonian Vowels by Native and Non-Native SpeakersLya Meister, Einar Meister. 1145-1148 [doi]
- New Feature Parameters for Pronunciation Evaluation in English Presentations at International ConferencesHiroshi Kibishi, Seiichi Nakagawa. 1149-1152 [doi]
- Synchronous Reading: Learning French Orthography by Audiovisual TrainingGérard Bailly, Will Barbour. 1153-1156 [doi]
- Phoneme Level Non-Native Pronunciation Analysis by an Auditory Model-Based Native Assessment SchemeChristos Koniaris, Olov Engwall. 1157-1160 [doi]
- The Open Front Vowel /æ/ in the Production and Perception of Czech Students of EnglishPavel Sturm, Radek Skarnitzl. 1161-1164 [doi]
- Error Selection for ASR-Based English Pronunciation Training in 'My Pronunciation Coach'Catia Cucchiarini, Henk van den Heuvel, Eric Sanders, Helmer Strik. 1165-1168 [doi]
- An Experimental Analysis of Pitch Patterns in Japanese Speakers of English with Verification by Speech Re-SynthesisTomoko Nariai, Kazuyo Tanaka. 1169-1172 [doi]
- An Analysis of Word Duration in Native Speakers and Japanese Speakers of EnglishTomoko Nariai, Kazuyo Tanaka, Yoshiaki Itoh. 1173-1176 [doi]
- Evaluating Artificial Bandwidth Extension by Conversational Tests in Car Using Mobile Devices with Integrated Hands-Free FunctionalityLaura Laaksonen, Ville Myllylä, Riitta Niemistö. 1177-1180 [doi]
- Low-Frequency Bandwidth Extension of Telephone Speech Using Sinusoidal Synthesis and Gaussian Mixture ModelHannu Pulakka, Ulpu Remes, Santeri Yrttiaho, Kalle J. Palomäki, Mikko Kurimo, Paavo Alku. 1181-1184 [doi]
- Memory-Based Approximation of the Gaussian Mixture Model Framework for Bandwidth Extension of Narrowband SpeechAmr H. Nour-Eldin, Peter Kabal. 1185-1188 [doi]
- Speech Enhancement by Reconstruction from Cleaned Acoustic FeaturesPhilip Harding, Ben Milner. 1189-1192 [doi]
- A Soft Decision-Based Speech Enhancement Using Acoustic Noise ClassificationJae Hun Choi, Sang-Kyun Kim, Joon-Hyuk Chang. 1193-1196 [doi]
- A Noise Estimation Method Based on Speech Presence Probability and Spectral SparsenessChao Li, Wen-Ju Liu. 1197-1200 [doi]
- Improved a posteriori Speech Presence Probability Estimation Based on Cepstro-Temporal Smoothing and Time-Frequency CorrelationChao Li, Wen-Ju Liu. 1201-1204 [doi]
- A Rapid Adaptation Algorithm for Tracking Highly Non-Stationary Noises based on Bayesian Inference for On-Line Spectral Change Point DetectionMd Foezur Rahman Chowdhury, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy. 1205-1208 [doi]
- Single Channel Speech Enhancement Using MMSE Estimation of Short-Time Modulation Magnitude SpectrumKuldip K. Paliwal, Belinda Schwerin, Kamil K. Wójcicki. 1209-1212 [doi]
- Speech Enhancement Using Masking Properties in Adverse EnvironmentsAtanu Saha, Tetsuya Shimamura. 1213-1216 [doi]
- Phoneme-Dependent NMF for Speech Enhancement in Monaural MixturesBhiksha Raj, Rita Singh, Tuomas Virtanen. 1217-1220 [doi]
- Kernel PCA for Speech EnhancementChristina Leitner, Franz Pernkopf, Gernot Kubin. 1221-1224 [doi]
- Objective Intelligibility Prediction of Speech by Combining Correlation and Distortion Based TechniquesAngel M. Gomez, Belinda Schwerin, Kuldip K. Paliwal. 1225-1228 [doi]
- Integrating Recent MLP Feature Extraction Techniques into TRAP ArchitectureFrantisek Grézl, Martin Karafiát. 1229-1232 [doi]
- Feature Frame Stacking in RNN-Based Tandem ASR Systems - Learned vs. Predefined ContextMartin Wöllmer, Björn Schuller, Gerhard Rigoll. 1233-1236 [doi]
- Improved Acoustic Feature Combination for LVCSR by Neural NetworksChristian Plahl, Ralf Schlüter, Hermann Ney. 1237-1240 [doi]
- Hierarchical Tandem Features for ASR in MandarinJoel Pinto, Mathew Magimai-Doss, Hervé Bourlard. 1241-1244 [doi]
- Analysis and Comparison of Recent MLP Features for LVCSR SystemsFabio Valente, Mathew Magimai-Doss, Wen Wang. 1245-1248 [doi]
- Deep Learning of Speech Features for Improved Phonetic RecognitionJaehyung Lee, Soo-Young Lee. 1249-1252 [doi]
- Globality-Locality Consistent Discriminant Analysis for Phone ClassificationHeyun Huang, Yang Liu, Jort F. Gemmeke, Louis ten Bosch, Bert Cranen, Lou Boves. 1253-1256 [doi]
- Front-End Compensation Methods for LVCSR Under Lombard EffectHynek Boril, Frantisek Grézl, John H. L. Hansen. 1257-1260 [doi]
- Classification of Fricatives Using Feature Extrapolation of Acoustic-Phonetic Features in Telephone SpeechJung-Won Lee, Jeung-Yoon Choi, Hong-Goo Kang. 1261-1264 [doi]
- Noise Robust Feature Extraction Based on Extended Weighted Linear Prediction in LVCSRSami Keronen, Jouni Pohjalainen, Paavo Alku, Mikko Kurimo. 1265-1268 [doi]
- Comparing Different Flavors of Spectro-Temporal Features for ASRBernd T. Meyer, Suman V. Ravuri, Marc René Schädler, Nelson Morgan. 1269-1272 [doi]
- VTLN in the MFCC Domain: Band-Limited versus Local InterpolationEhsan Variani, Thomas Schaaf. 1273-1276 [doi]
- Multistream Bandpass Modulation Features for Robust Speech RecognitionSridhar Krishna Nemala, Kailash Patil, Mounya Elhilali. 1277-1280 [doi]
- An Analysis of Automatic Speech Recognition with Multiple MicrophonesDavide Marino, Thomas Hain. 1281-1284 [doi]
- Multi-View Approach for Speaker Turn Role Labeling in TV Broadcast News ShowsGéraldine Damnati, Delphine Charlet. 1285-1288 [doi]
- Evaluation of an Integrated Authoring Tool for Building Advanced Question-Answering CharactersSudeep Gandhe, Michael Rushforth, Priti Aggarwal, David R. Traum. 1289-1292 [doi]
- Towards Unsupervised Spoken Language Understanding: Exploiting Query Click Logs for Slot FillingGökhan Tür, Dilek Z. Hakkani-Tür, Dustin Hillard, Asli Çelikyilmaz. 1293-1296 [doi]
- Web-Enhanced Content Retrieval for Information Access Dialogue SystemDonghyeon Lee, Cheongjae Lee, Minwoo Jeong, Kyungduk Kim, Seokhwan Kim, Junhwi Choi, Gary Geunbae Lee. 1297-1300 [doi]
- Uncertainty Management for On-Line Optimisation of a POMDP-Based Large-Scale Spoken Dialogue SystemLucie Daubigney, Milica Gasic, Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin, Steve Young. 1301-1304 [doi]
- Detection of Task-Incomplete Dialogs Based on Utterance-and-Behavior Tag N-Gram for Spoken Dialog SystemsSunao Hara, Norihide Kitaoka, Kazuya Takeda. 1305-1308 [doi]
- Shrinkage-Based Features for Natural Language Call RoutingRuhi Sarikaya, Stanley F. Chen, Bhuvana Ramabhadran. 1309-1312 [doi]
- Clustering with Modified Cosine Distance Learned from ConstraintsLeonid Rachevsky, Dimitri Kanevsky, Ruhi Sarikaya, Bhuvana Ramabhadran. 1313-1316 [doi]
- Using Speaker ID to Discover Repeat Callers of a Spoken Dialog SystemAndrew Fandrianto, Brian Langner, Alan W. Black. 1317-1320 [doi]
- Semantic Graph Clustering for POMDP-Based Spoken Dialog SystemsFlorian Pinault, Fabrice Lefèvre. 1321-1324 [doi]
- Learning Place-Names from Spoken Utterances and Localization Results by Mobile RobotRyo Taguchi, Yuji Yamada, Koosuke Hattori, Taizo Umezaki, Masahiro Hoguro, Naoto Iwahashi, Kotaro Funakoshi, Mikio Nakano. 1325-1328 [doi]
- Active Learning for Dialogue Act ClassificationBjörn Gambäck, Fredrik Olsson, Oscar Täckström. 1329-1332 [doi]
- Speaker Role Recognition Using Question Detection and CharacterizationThierry Bazillon, Benjamin Maza, Mickael Rouvier, Frédéric Béchet, Alexis Nasr. 1333-1336 [doi]
- Learning Score Structure from Spoken Language for a Tennis GameQiang Huang, Stephen J. Cox. 1337-1340 [doi]
- Semi-Automated Classifier Adaptation for Natural Language Call RoutingSilke M. Witt. 1341-1344 [doi]
- Interactional Style Detection for Versatile Dialogue Response Using Prosodic and Semantic FeaturesWei-Bin Liang, Chung-Hsien Wu, Chih-Hung Wang, Jhing-Fa Wang. 1345-1348 [doi]
- Quality Aspects of Multimodal Dialog Systems: Identity, Stimulation and SuccessChristine Kühnel, Benjamin Weiss, Matthias Schulz, Sebastian Möller. 1349-1352 [doi]
- Where Should Pitch Accents and Phrase Breaks Go? A Syntax Tree Transducer SolutionJoseph Tepperman, Emily Nava. 1353-1356 [doi]
- Phrasal Prominences do not need Pitch Movements: Postfocal Phrasal Heads in ItalianGiuliano Bocci, Cinzia Avesani. 1357-1360 [doi]
- Intonation of Left Dislocated Topics in Modern GreekDavid Le Gac, Hiyon Yoo. 1361-1364 [doi]
- Phrases, Pitch and Perceived Prominence in MaoriLaura Thompson, Catherine I. Watson, Ray Harlow, Jeanette King, Margaret Maclagan, Helen Charters, Peter Keegan. 1365-1368 [doi]
- Perceptual Sensitivity to Prenuclear and Nuclear Intonational PatternsTomás Dubeda. 1369-1372 [doi]
- Tonal Alignment Defined: The Case of Southern Irish EnglishRaya Kalaldeh. 1373-1376 [doi]
- Using Mutual Information to Identify Regions of Analysis for Prosodic AnalysisAndrew Rosenberg. 1377-1380 [doi]
- Prosodic Highlights in Mandarin Continuous Speech - Cross-Genre Attributes and ImplicationsChiu-yu Tseng, Zhao-yu Su, Chi-Feng Huang. 1381-1384 [doi]
- When Two Newly-Acquired Words are One: New Words Differing in Stress Alone are not Automatically Represented DifferentlySimone Sulpizio, James M. McQueen. 1385-1388 [doi]
- Automatic Determination of the Standard Chinese Prosodic Phrase Boundaries by F0 Generation ModelShehui Bu, Zhenjie Zhuo, Lingling Yang, Shuichi Itahashi. 1389-1392 [doi]
- Measuring Speakers' Similarity in Speech by Means of Prosodic Cues: Methods and PotentialCéline De Looze, Stéphane Rauzy. 1393-1396 [doi]
- Tonal Variations in Mandarin: New Evidence from Spontaneous and Read SpeechLi-chiung Yang. 1397-1400 [doi]
- Accounting for Prosodic Information to Improve ASR-Based Topic Tracking for TV Broadcast NewsCamille Guinaudeau, Julia Hirschberg. 1401-1404 [doi]
- Morpheme Conversion for Connecting Speech Recognizer and Language Analyzers in Unsegmented LanguagesKenji Imamura, Tomoko Izumi, Kugatsu Sadamitsu, Kuniko Saito, Satoshi Kobashikawa, Hirokazu Masataki. 1405-1408 [doi]
- Emotion Detection Based on Concept Inference and Spoken Sentence Analysis for Customer ServiceRen-Ying Fang, Bo-Wei Chen, Jhing-Fa Wang, Chung-Hsien Wu. 1409-1412 [doi]
- Commas Recovery with Syntactic Features in French and in CzechChristophe Cerisara, Pavel Král, Claire Gardent. 1413-1416 [doi]
- Redundancy Reduction in ASR of Spontaneous Speech Through Statistical Machine TranslationDaniele Falavigna. 1417-1420 [doi]
- From Interview to News Text: A Study of Taiwan TV Political Interviews in Newspaper ReportsChin-Chih Chiang. 1421-1424 [doi]
- Unary Data Structures for Language ModelsJeffrey Sorensen, Cyril Allauzen. 1425-1428 [doi]
- Bayesian Language Model Interpolation for Mobile Speech InputCyril Allauzen, Michael Riley. 1429-1432 [doi]
- On the Estimation of Discount Parameters for Language Model SmoothingMartin Sundermeyer, Ralf Schlüter, Hermann Ney. 1433-1436 [doi]
- N-Grams for Conditional Random Fields or a Failure-Transition(f) Posterior for Acyclic FSTsPatrick Lehnen, Stefan Hahn, Hermann Ney. 1437-1440 [doi]
- Hybrid Language Models Using Mixed Types of Sub-Lexical Units for Open Vocabulary German LVCSRM. Ali Basha Shaik, Amr El-Desoky Mousa, Ralf Schlüter, Hermann Ney. 1441-1444 [doi]
- Morpheme Based Factored Language Models for German LVCSRAmr El-Desoky Mousa, M. Ali Basha Shaik, Ralf Schlüter, Hermann Ney. 1445-1448 [doi]
- Compound Word Recombination for German LVCSRMarkus Nußbaum-Thom, Amr El-Desoky Mousa, Ralf Schlüter, Hermann Ney. 1449-1452 [doi]
- Lattice-Based Risk Minimization Training for Unsupervised Language Model AdaptationAkio Kobayashi, Takahiro Oku, Shinichi Homma, Toru Imai, Seiichi Nakagawa. 1453-1456 [doi]
- Similarity Language ModelChristian Gillot, Christophe Cerisara. 1457-1460 [doi]
- Data Sampling and Dimensionality Reduction Approaches for Reranking ASR Outputs Using Discriminative Language ModelsErinç Dikici, Murat Semerci, Murat Saraclar, Ethem Alpaydin. 1461-1464 [doi]
- Training a Language Model Using Webdata for Large Vocabulary Japanese Spontaneous Speech RecognitionRyo Masumura, Seongjun Hahm, Akinori Ito. 1465-1468 [doi]
- Large Vocabulary SOUL Neural Network Language ModelsHai-son Le, Ilya Oparin, Abdelkhalek Messaoudi, Alexandre Allauzen, Jean-Luc Gauvain, François Yvon. 1469-1472 [doi]
- Improved Spoken Query Transcription Using Co-Occurrence InformationJonathan Mamou, Abhinav Sethy, Bhuvana Ramabhadran, Ron Hoory, Paul Vozila. 1473-1476 [doi]
- Unsupervised Latent Speaker Language ModelingYik-Cheung Tam, Paul Vozila. 1477-1480 [doi]
- Measurement of Objective Intelligibility of Japanese Accented English Using ERJ (English Read by Japanese) DatabaseNobuaki Minematsu, Koji Okabe, Keisuke Ogaki, Keikichi Hirose. 1481-1484 [doi]
- From Single-Call to Multi-Call Quality: A Study on Long-Term Quality Integration in Audio-Visual Speech CommunicationSebastian Möller, Chihuy Bang, Teele Tamme, Markus Vaalgamaa, Benjamin Weiss. 1485-1488 [doi]
- Optimal Selection of Limited Vocabulary Speech CorporaHui Lin, Jeff A. Bilmes. 1489-1492 [doi]
- Open Source Multi-Language Audio Database for Spoken Language Processing ApplicationsStephen A. Zahorian, Jiang Wu, Montri Karnjanadecha, Chandra Sekhar Vootkuri, Brian Wong, Andrew Hwang, Eldar Tokhtamyshev. 1493-1496 [doi]
- The USC CARE Corpus: Child-Psychologist Interactions of Children with Autism Spectrum DisordersMatthew Black, Daniel Bone, Marian E. Williams, Phillip Gorrindo, Pat Levitt, Shrikanth S. Narayanan. 1497-1500 [doi]
- Towards a Versatile Multi-Layered Description of Speech Corpora Using Algebraic RelationsNelly Barbot, Vincent Barreaud, Olivier Boëffard, Laure Charonnat, Arnaud Delhay, Sébastien Le Maguer, Damien Lolive. 1501-1504 [doi]
- Announcing the Electromagnetic Articulography (Day 1) Subset of the mngu0 Articulatory CorpusKorin Richmond, Phil Hoole, Simon King. 1505-1508 [doi]
- A Pitch Tracking Corpus with Evaluation on Multipitch Tracking ScenarioGregor Pirker, Michael Wohlmayr, Stefan Petrik, Franz Pernkopf. 1509-1512 [doi]
- On Building and Evaluating a Broadcast-News Audio Segmentation SystemTaras Butko. 1513-1516 [doi]
- Time- and Acoustic-Mediated Alignment Algorithms for Speech Recognition EvaluationSimon Dobrisek, France Mihelic. 1517-1520 [doi]
- Effects of Shortening Speech Prompts of In-Car Voice User Interfaces on Users Mental ModelsJulia Niemann, Kati Schulz, Ina Wechsung. 1521-1524 [doi]
- Speech Transcript Evaluation for Information RetrievalLaurens van der Werff, Wessel Kraaij, Franciska de Jong. 1525-1528 [doi]
- The Albayzin 2010 Language Recognition EvaluationLuis Javier Rodríguez, Mikel Peñagarikano, Amparo Varona, Mireia Díez, Germán Bordel. 1529-1532 [doi]
- Progress and Prospects for Speech Technology: Results from Three Sexennial SurveysRoger K. Moore. 1533-1536 [doi]
- Painless WFST Cascade Construction for LVCSR - TransducersaurusJosef R. Novak, Nobuaki Minematsu, Keikichi Hirose. 1537-1540 [doi]
- On the Use of Multimodal Cues for the Prediction of Degrees of Involvement in Spontaneous ConversationCatharine Oertel, Stefan Scherer, Nick Campbell. 1541-1544 [doi]
- Anger Recognition in Spoken Dialog Using Linguistic and Para-Linguistic InformationNarichika Nomoto, Masafumi Tamoto, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi. 1545-1548 [doi]
- Recognition of Personality Traits from Human Spoken ConversationsAlexei V. Ivanov, Giuseppe Riccardi, Adam J. Sporka, Jakub Franc. 1549-1552 [doi]
- Using Multiple Databases for Training in Emotion Recognition: To Unite or to Vote?Björn Schuller, Zixing Zhang, Felix Weninger, Gerhard Rigoll. 1553-1556 [doi]
- "Would You Buy a Car from Me?" - On the Likability of Telephone VoicesFelix Burkhardt, Björn Schuller, Benjamin Weiss, Felix Weninger. 1557-1560 [doi]
- Automatic Identification of Salient Acoustic Instances in Couples' Behavioral Interactions Using Diverse Density Support Vector MachinesJames Gibson, Athanasios Katsamanis, Matthew P. Black, Shrikanth S. Narayanan. 1561-1564 [doi]
- Predicting Speaker Changes and Listener Responses with and without Eye-ContactDaniel Neiberg, Joakim Gustafson. 1565-1568 [doi]
- Emotion Classification Using Inter- and Intra-Subband Energy VariationSenaka Amarakeerthi, Tin Lay Nwe, Liyanage C. De Silva, Michael Cohen. 1569-1572 [doi]
- Emotion Classification of Infants' Cries Using Duration Ratios of Acoustic SegmentsKazuki Kitahara, Shinzi Michiwiki, Miku Sato, Shoichi Matsunaga, Masaru Yamashita, Kazuyuki Shinohara. 1573-1576 [doi]
- Vowels Formants Analysis Allows Straightforward Detection of High Arousal Acted and Spontaneous EmotionsBogdan Vlasenko, Dmytro Prylipko, David Philippou-Hübner, Andreas Wendemuth. 1577-1580 [doi]
- Intra-, Inter-, and Cross-Cultural Classification of Vocal AffectDaniel Neiberg, Petri Laukka, Hillary Anger Elfenbein. 1581-1584 [doi]
- Verifying Human Users in Speech-Based InteractionsSajad Shirali-Shahreza, Yashar Ganjali, Ravin Balakrishnan. 1585-1588 [doi]
- Automatic Assessment of Prosody in High-Stakes English TestsJian Cheng. 1589-1592 [doi]
- Improvement of Segmental Mispronunciation Detection with Prior Knowledge Extracted from Large L2 Speech CorpusDean Luo, Xuesong Yang, Lan Wang. 1593-1596 [doi]
- Off-Topic Detection in Automated Speech Assessment ApplicationsJian Cheng, Jianqiang Shen. 1597-1600 [doi]
- Towards Context-Dependent Phonetic Spelling Error Correction in Children's Freely Composed Text for Diagnostic and Pedagogical PurposesSebastian Stüker, Johanna Fay, Kay Berkling. 1601-1604 [doi]
- Factored Translation Models for Improving a Speech into Sign Language Translation SystemVerónica López-Ludeña, Rubén San Segundo, Ricardo de Córdoba, Javier Ferreiros, Juan Manuel Montero, José Manuel Pardo. 1605-1608 [doi]
- Formant Maps in Hungarian Vowels - Online Data Inventory for Research, and EducationKálmán Abari, Zsuzsanna Zsófia Rácz, Gábor Olaszy. 1609-1612 [doi]
- Automatic Subtitling of the Basque Parliament Plenary Sessions VideosGermán Bordel, Silvia Nieto, Mikel Peñagarikano, Luis Javier Rodríguez, Amparo Varona. 1613-1616 [doi]
- Generating Animated Pronunciation from Speech Through Articulatory Feature ExtractionYurie Iribe, Silasak Manosavanh, Kouichi Katsurada, Ryoko Hayashi, Chunyue Zhu, Tsuneo Nitta. 1617-1620 [doi]
- A Tale of Two Tasks: Detecting Children's Off-Task Speech in a Reading TutorWei Chen, Jack Mostow. 1621-1624 [doi]
- Problems Encountered by Japanese EL2 with English Short Vowels as Illustrated on a 3D Vowel ChartToshiko Isei-Jaakkola, Takatoshi Naka, Keikichi Hirose. 1625-1628 [doi]
- Automatic Generation of Listening Comprehension Learning Material in European PortugueseThomas Pellegrini, Rui Correia, Isabel Trancoso, Jorge Baptista, Nuno J. Mamede. 1629-1632 [doi]
- Candidate Generation for ASR Output Error Correction Using a Context-Dependent Syllable Cluster-Based Confusion MatrixChao-Hong Liu, Chung-Hsien Wu, David Sarwono, Jhing-Fa Wang. 1633-1636 [doi]
- Semi-Supervised Tree Support Vector Machine for Online Cough RecognitionHuynh Thai Hoa, An Vu Tran, Tran Huy Dat. 1637-1640 [doi]
- A Versatile Gaussian Splitting Approach to Non-Linear State Estimation and its Application to Noise-Robust ASRVolker Leutnant, Alexander Krueger, Reinhold Haeb-Umbach. 1641-1644 [doi]
- Generalized-Log Spectral Mean Normalization for Speech RecognitionHilman Ferdinandus Pardede, Koichi Shinoda. 1645-1648 [doi]
- Zero-Crossing-Based Channel Attentive Weighting of Cepstral Features for Robust Speech Recognition: The ETRI 2011 CHiME Challenge SystemYoung Ik Kim, Hoon-Young Cho, Sang-hun Kim. 1649-1652 [doi]
- Feature Compensation for Speech Recognition in Severely Adverse Environments Due to Background Noise and Channel DistortionWooil Kim, John H. L. Hansen. 1653-1656 [doi]
- Binaural Cues for Fragment-Based Speech Recognition in Reverberant Multisource EnvironmentsNing Ma, Jon Barker, Heidi Christensen, Phil D. Green. 1657-1660 [doi]
- Sub-Band Level Histogram Equalization for Robust Speech RecognitionVikas Joshi, Raghavendra Bilgi, Srinivasan Umesh, Luz García, M. Carmen Benítez. 1661-1664 [doi]
- GMM-Based Missing-Feature Reconstruction on Multi-Frame WindowsUlpu Remes, Yoshihiko Nankaku, Keiichi Tokuda. 1665-1668 [doi]
- Improvements of a Dual-Input DBN for Noise Robust ASRYang Sun, Jort F. Gemmeke, Bert Cranen, Louis ten Bosch, Lou Boves. 1669-1672 [doi]
- Denoising Using Optimized Wavelet Filtering for Automatic Speech RecognitionRandy Gomez, Tatsuya Kawahara. 1673-1676 [doi]
- Noise Robust Speaker-Independent Speech Recognition with Invariant-Integration Features Using Power-Bias SubtractionFlorian Müller, Alfred Mertins. 1677-1680 [doi]
- Semi-Automatic Acoustic Model Generation from Large Unsynchronized Audio and Text ChunksMichele Alessandrini, Giorgio Biagetti, Alessandro Curzi, Claudio Turchetti. 1681-1684 [doi]
- Unsupervised Testing Strategies for ASRBrian Strope, Doug Beeferman, Alexander Gruenstein, Xin Lei. 1685-1688 [doi]
- Acoustic Model Training with Detecting Transcription Errors in the Training DataGakuto Kurata, Nobuyasu Itoh, Masafumi Nishimura. 1689-1692 [doi]
- Towards Unsupervised Training of Speaker Independent Acoustic ModelsAren Jansen, Kenneth Church. 1693-1696 [doi]
- Acoustic Modeling with Bootstrap and Restructuring Based on Full CovarianceXiaodong Cui, Xin Chen, Jian Xue, Peder A. Olsen, John R. Hershey, Bowen Zhou. 1697-1700 [doi]
- An i-vector Based Approach to Acoustic Sniffing for Irrelevant Variability Normalization Based Acoustic Model Training and Speech RecognitionJian Xu, Yu Zhang, Zhi-Jie Yan, Qiang Huo. 1701-1704 [doi]
- Log-Linear Optimization of Second-Order Polynomial Features with Subsequent Dimension Reduction for Speech RecognitionMuhammad Ali Tahir, Ralf Schlüter, Hermann Ney. 1705-1708 [doi]
- Genre Categorization and Modeling for Broadcast Speech TranscriptionQingqing Zhang, Lori Lamel, Jean-Luc Gauvain. 1709-1712 [doi]
- Individual Error Minimization Learning Framework and its Applications to Speech Recognition and Utterance VerificationSunghwan Shin, Ho-Young Jung, Biing-Hwang Juang. 1713-1716 [doi]
- Effective Triphone Mapping for Acoustic Modeling in Speech RecognitionSakhia Darjaa, Milos Cernak, Marián Trnka, Milan Rusko, Róbert Sabo. 1717-1720 [doi]
- Analysis of Dialectal Influence in Pan-Arabic ASRUdhyakumar Nallasamy, Michael Garbus, Florian Metze, Qin Jin, Thomas Schaaf, Tanja Schultz. 1721-1724 [doi]
- Connected Digit Recognition by Means of Reservoir ComputingAzarakhsh Jalalvand, Fabian Triefenbach, David Verstraeten, Jean-Pierre Martens. 1725-1728 [doi]
- Large Margin - Minimum Classification Error Using Sum of Shifted Sigmoids as the Loss FunctionMadhavi Vedula Ratnagiri, Biing-Hwang Juang, Lawrence R. Rabiner. 1729-1732 [doi]
- Representing Phonological Features Through a Two-Level Finite State ModelJavier Mikel Olaso, M. Inés Torres, Raquel Justo. 1733-1736 [doi]
- Optimization of the Gaussian Mixture Model Evaluation on GPUJan Vanek, Jan Trmal, Josef V. Psutka, Josef Psutka. 1737-1740 [doi]
- Monaural Voiced Speech Segregation Based on Pitch and Comb FilterXueliang Zhang, Wenju Liu. 1741-1744 [doi]
- Fast and Simple Iterative Algorithm of Lp-Norm Minimization for Under-Determined Speech SeparationYasuharu Hirasawa, Naoki Yasuraoka, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno. 1745-1748 [doi]
- Monaural Speech Separation Based on a 2D Processing and Harmonic AnalysisAzam Rabiee, Saeed Setayeshi, Soo-Young Lee. 1749-1752 [doi]
- Underdetermined Blind Source Separation with Fuzzy Clustering for Arbitrarily Arranged SensorsIngrid Jafari, Serajul Haque, Roberto Togneri, Sven Nordholm. 1753-1756 [doi]
- On Initial Seed Selection for Frequency Domain Blind Speech SeparationDang Hai Tran Vu, Reinhold Haeb-Umbach. 1757-1760 [doi]
- Spatial Filter Calibration Based on Minimization of Modified LSDNobuaki Tanaka, Tetsuji Ogawa, Tetsunori Kobayashi. 1761-1764 [doi]
- Probabilistic Spectrum Envelope: Categorized Audio-Features Representation for NMF-Based Sound DecompositionToru Nakashika, Tetsuya Takiguchi, Yasuo Ariki. 1765-1768 [doi]
- A High Resolution Multiple Source Localization Based on Generalized Cumulant Structure (GCS) MatrixJinho Choi, Chang D. Yoo. 1769-1772 [doi]
- Single Channel Speech Music Separation Using Nonnegative Matrix Factorization with Sliding Windows and Spectral MasksEmad M. Grais, Hakan Erdogan. 1773-1776 [doi]
- Perceptually-Inspired Processing for Multichannel Wiener FilterJorge I. Marin-Hurtado, David V. Anderson. 1777-1780 [doi]
- Speech Recognition in Mixed Sound of Speech and Music Based on Vector Quantization and Non-Negative Matrix FactorizationShoichi Nakano, Kazumasa Yamamoto, Seiichi Nakagawa. 1781-1784 [doi]
- Reduction of Highly Nonstationary Ambient Noise by Integrating Spectral and Locational Characteristics of Speech and Noise for Robust ASRTomohiro Nakatani, Shoko Araki, Marc Delcroix, Takuya Yoshioka, Masakiyo Fujimoto. 1785-1788 [doi]
- Voice Processing by Dynamic Glottal Models with Applications to Speech EnhancementCarlo Drioli, Andrea Calanca. 1789-1792 [doi]
- Supervised Sparse Coding Strategy in Cochlear ImplantsJinqiu Sang, Guoping Li, Hongmei Hu, Mark E. Lutman, Stefan Bleeck. 1793-1796 [doi]
- Continuous Control of the Degree of Articulation in HMM-Based Speech SynthesisBenjamin Picart, Thomas Drugman, Thierry Dutoit. 1797-1800 [doi]
- Estimation of Window Coefficients for Dynamic Feature Extraction for HMM-Based Speech SynthesisLing-Hui Chen, Yoshihiko Nankaku, Heiga Zen, Keiichi Tokuda, Zhen-Hua Ling, Li-Rong Dai. 1801-1804 [doi]
- Inverse Filtering Based Harmonic Plus Noise Excitation Model for HMM-Based Speech SynthesisZhengqi Wen, Jianhua Tao. 1805-1808 [doi]
- Improved HNM-Based Vocoder for Statistical SynthesizersDaniel Erro, Iñaki Sainz, Eva Navas, Inma Hernáez. 1809-1812 [doi]
- A Statistical Phrase/Accent Model for Intonation ModelingGopala Krishna Anumanchipalli, Luís C. Oliveira, Alan W. Black. 1813-1816 [doi]
- Intermediate-State HMMs to Capture Continuously-Changing Signal FeaturesGustav Eje Henter, W. Bastiaan Kleijn. 1817-1820 [doi]
- Automatic Sentence Selection from Speech Corpora Including Diverse Speech for Improved HMM-TTS Synthesis QualityNorbert Braunschweiler, Sabine Buchholz. 1821-1824 [doi]
- Phonological Knowledge Guided HMM State Mapping for Cross-Lingual Speaker AdaptationHui Liang, John Dines. 1825-1828 [doi]
- Reformulating Prosodic Break Model into Segmental HMMs and Information FusionNicolas Obin, Pierre Lanchantin, Anne Lacheret, Xavier Rodet. 1829-1832 [doi]
- Multipulse Sequences for Residual Signal ModelingRanniery Maia, Heiga Zen, Kate Knill, Mark J. F. Gales, Sabine Buchholz. 1833-1836 [doi]
- Can Objective Measures Predict the Intelligibility of Modified HMM-Based Synthetic Speech in Noise?Cassia Valentini-Botinhao, Junichi Yamagishi, Simon King. 1837-1840 [doi]
- Speech Synthesis Based on Articulatory-Movement HMMs with Voice-Source CodebooksTsuneo Nitta, Takayuki Onoda, Masashi Kimura, Yurie Iribe, Kouichi Katsurada. 1841-1844 [doi]
- Large-Scale Subjective Evaluations of Speech Rate Control Methods for HMM-Based Speech SynthesizersTsuneo Kato, Makoto Yamada, Nobuyuki Nishizawa, Keiichiro Oura, Keiichi Tokuda. 1845-1848 [doi]
- HMM-Based Emphatic Speech Synthesis Using Unsupervised Context LabelingYu Maeno, Takashi Nose, Takao Kobayashi, Yusuke Ijima, Hideharu Nakajima, Hideyuki Mizuno, Osamu Yoshioka. 1849-1852 [doi]
- Chinese and Italian Speech Rhythm: Normalization and the CCI AlgorithmChiara Bertini, Pier Marco Bertinetto, Na Zhi. 1853-1856 [doi]
- Rhythm Metrics on Syllables and Feet do not Work as ExpectedPaolo Mairano, Antonio Romano. 1857-1860 [doi]
- Applying Rhythm Features to Automatically Assess Non-Native SpeechLei Chen, Klaus Zechner. 1861-1864 [doi]
- Prosodic Synchrony in Co-Operative Task-Based Dialogues: A Measure of Agreement and DisagreementBrian Vaughan. 1865-1868 [doi]
- Low and High, Short and Long by Crook or by Hook?Oliver Niebuhr, Astrid Wolf. 1869-1872 [doi]
- Estimating Speaking Rate by Means of Rhythmicity ParametersChristian Heinrich, Florian Schiel. 1873-1876 [doi]
- Comparing Word and Syllable Prominence Rated by Naïve ListenersDenis Arnold, Bernd Möbius, Petra Wagner. 1877-1880 [doi]
- L1/L2 Perception of Lexical Stress with F0 Peak-Delay: Effect of an Extra Syllable AddedShinichi Tokuma, Yi Xu. 1881-1884 [doi]
- Letter-to-Phoneme Conversion Based on Two-Stage Neural Network Focusing on Letter and Phoneme ContextsKheang Seng, Yurie Iribe, Tsuneo Nitta. 1885-1888 [doi]
- An International English Speech Corpus for Longitudinal Study of Accent DevelopmentRosemary Orr, Hugo Quené, Roeland van Beek, Thari Diefenbach, David A. van Leeuwen, Marijn Huijbregts. 1889-1892 [doi]
- A Corpus-Based Study of English Pronunciation VariationsSunHee Kim, Kyuwhan Lee, Minhwa Chung. 1893-1896 [doi]
- Long Term Average Speech Spectra in Yolngu Matha and Pitjantjatjara Speaking Females and MalesHywel Stoakes, Andrew Butcher, Janet Fletcher, Marija Tabain. 1897-1900 [doi]
- Context and Speaker Dependency in the Relation of Vowel Formants and Subglottal Resonances - Evidence from HungarianTekla Etelka Gráczi, Steven M. Lulich, Tamás Gábor Csapó, András Beke. 1901-1904 [doi]
- Event Selection from Phone Posteriorgrams Using Matched FiltersKeith Kintzley, Aren Jansen, Hynek Hermansky. 1905-1908 [doi]
- A Piecewise Aggregate Approximation Lower-Bound Estimate for Posteriorgram-Based Dynamic Time WarpingYaodong Zhang, James R. Glass. 1909-1912 [doi]
- OOV Detection and Recovery Using Hybrid Models with Different FragmentsLong Qin, Ming Sun, Alexander I. Rudnicky. 1913-1916 [doi]
- AUC Optimization Based Confidence Measure for Keyword SpottingHaiyang Li, Jiqing Han, Tieran Zheng. 1917-1920 [doi]
- An Empirical Study of Multilingual Spoken Term DetectionZejun Ma, Xiaorui Wang, Bo Xu. 1921-1924 [doi]
- Fusing Multiple Confidence Measures for Chinese Spoken Term DetectionZejun Ma, Xiaorui Wang, Bo Xu. 1925-1928 [doi]
- Response Probability Based Decoding Algorithm for Large Vocabulary Continuous Speech RecognitionZhanlei Yang, Hao Chao, Wenju Liu. 1929-1932 [doi]
- Combining Lattice-Based Language Dependent and Independent Approaches for Out-of-Language Detection in LVCSRYuxiang Shan, Yan Deng, Jia Liu. 1933-1936 [doi]
- Evaluation of Tree-Trellis Based Decoding in Over-Million LVCSRNaoaki Ito, Yoshihiko Nankaku, Akinobu Lee. 1937-1940 [doi]
- Lattice Based Discriminative Model Combination Using Automatically Induced Phonetic ContextsHao Huang, Bing Hu Li. 1941-1944 [doi]
- Predicting Human Perceived Accuracy of ASR SystemsTaniya Mishra, Andrej Ljolje, Mazin Gilbert. 1945-1948 [doi]
- Cross-Lingual Study of ASR Errors: On the Role of the Context in Human Perception of Near-HomophonesIoana Vasilescu, Dahbia Yahia, Natalie D. Snoeren, Martine Adda-Decker, Lori Lamel. 1949-1952 [doi]
- Performance Prediction of Speech Recognition Using Average-Voice-Based Speech SynthesisTatsuhiko Saito, Takashi Nose, Takao Kobayashi, Yohei Okato, Akio Horii. 1953-1956 [doi]
- Confidence Measures for Turkish Call Center ConversationsAli Haznedaroglu, Levent M. Arslan. 1957-1960 [doi]
- Spoken Document Confidence Estimation Using Contextual CoherenceTaichi Asami, Narichika Nomoto, Satoshi Kobashikawa, Yoshikazu Yamaguchi, Hirokazu Masataki, Satoshi Takahashi. 1961-1964 [doi]
- Fundamental Frequency Estimation Using Modified Higher Order Moments and Multiple WindowsAlipah Pawi, Saeed Vaseghi, Ben Milner, Seyed Ghorshi. 1965-1968 [doi]
- EM-Based Gain Adaptation for Probabilistic Multipitch TrackingMichael Wohlmayr, Franz Pernkopf. 1969-1972 [doi]
- Joint Robust Voicing Detection and Pitch Estimation Based on Residual HarmonicsThomas Drugman, Abeer Alwan. 1973-1976 [doi]
- Epoch Extraction in High Pass Filtered Speech Using Hilbert EnvelopeD. Govind, S. R. Mahadeva Prasanna, Debadatta Pati. 1977-1980 [doi]
- Robust HNR-Based Closed-Loop Pitch and Harmonic Parameters EstimationAlexander Pavlovets, Alexander A. Petrovsky. 1981-1984 [doi]
- Exploring Bessel Features for Detection of Glottal Closure InstantsChetana Prakash, N. Dhananjaya, Suryakanth V. Gangashetty. 1985-1988 [doi]
- Evaluation of Glottal Epoch Detection Algorithms on Different Voice TypesJoão P. Cabral, John Kane, Christer Gobl, Julie Carson-Berndsen. 1989-1992 [doi]
- A Divide et impera Algorithm for Optimal Pitch StylizationAntonio Origlia, Giovanni Abete, Francesco Cutugno, Iolanda Alfano, Renata Savy, Bogdan Ludusan. 1993-1996 [doi]
- Singing Voice Analysis Using Relative Harmonic DelaysRicardo Sousa, Aníbal Ferreira. 1997-2000 [doi]
- Singing Voice Synthesis: Singer-Dependent Vibrato Modeling and Coherent Processing of Spectral EnvelopeSiu Wa Lee, Minghui Dong. 2001-2004 [doi]
- Chorus Digitalis: Experiments in Chironomic Choir SingingSylvain Le Beux, Lionel Feugère, Christophe d'Alessandro. 2005-2008 [doi]
- Prominence Model for Prosodic Features in Automatic Lexical Stress and Pitch Accent DetectionKun Li, Shuang Zhang, Mingxing Li, Wai Kit Lo, Helen M. Meng. 2009-2012 [doi]
- Hierarchical Stress Modeling in Mandarin Text-to-SpeechYa Li, Jianhua Tao, Xiaoying Xu. 2013-2016 [doi]
- Automatic Prosodic Events Detection by Using Syllable-Based Acoustic, Lexical and Syntactic FeaturesChong-Jia Ni, Wenju Liu, Bo Xu. 2017-2020 [doi]
- Using Dynamic Time Warping to Compute Prosodic Similarity MeasuresAlbert Rilliard, Alexandre Allauzen, Philippe Boula de Mareüil. 2021-2024 [doi]
- Applying the Quantitative Target Approximation Model (qTA) to German and Brazilian PortuguesePlínio Almeida Barbosa, Hansjörg Mixdorff, Sandra Madureira. 2025-2028 [doi]
- Stylization and Trajectory Modelling of Short and Long Term Speech Prosody VariationsNicolas Obin, Anne Lacheret, Xavier Rodet. 2029-2032 [doi]
- Toward a Continuous Modeling of French Prosodic Structure: Using Acoustic Features to Predict Prominence Location and Prominence DegreeMathieu Avanzi, Nicolas Obin, Anne Lacheret-Dujour, Bernard Victorri. 2033-2036 [doi]
- Optimal Models of Prosodic Prominence Using the Bayesian Information CriterionTim Mahrt, Jui-Ting Huang, Yoonsook Mo, Margaret M. Fleck, Mark Hasegawa-Johnson, Jennifer Cole. 2037-2040 [doi]
- Quantitative Analysis of Tone Coarticulation in MandarinHussein Hussein, Hansjörg Mixdorff, Hue San Do, Rüdiger Hoffmann. 2041-2044 [doi]
- Tracking Pitch Contours Using Minimum Jerk TrajectoriesDaniel Neiberg, G. Ananthakrishnan, Joakim Gustafson. 2045-2048 [doi]
- On the Use of Linguistic Features in an Automatic System for Speech Analytics of Telephone ConversationsBenjamin Maza, Marc El-Bèze, Georges Linares, Renato de Mori. 2049-2052 [doi]
- Determining what Questions to Ask, with the Help of Spectral Graph TheoryAbe Kazemzadeh, Sungbok Lee, Panayiotis G. Georgiou, Shrikanth S. Narayanan. 2053-2056 [doi]
- 'Are You Sure You're Paying Attention?' - 'Uh-Huh' Communicating Understanding as a Marker of AttentivenessHendrik Buschmeier, Zofia Malisz, Marcin Wlodarczak, Stefan Kopp, Petra Wagner. 2057-2060 [doi]
- Projectability of Transition-Relevance Places Using Prosodic Features in Japanese Spontaneous ConversationYuichi Ishimoto, Mika Enomoto, Hitoshi Iida. 2061-2064 [doi]
- Measuring Final Lengthening for Speaker-Change PredictionAnna Hjalmarsson, Kornel Laskowski. 2065-2068 [doi]
- Incremental Learning and Forgetting in Stochastic Turn-Taking ModelsKornel Laskowski, Jens Edlund, Mattias Heldner. 2069-2072 [doi]
- Reinforcement Learning of Argumentation Dialogue Policies in NegotiationKallirroi Georgila, David R. Traum. 2073-2076 [doi]
- Topic Switching Strategies for Spoken Dialogue SystemsTobias Heinroth, Savina Koleva, Wolfgang Minker. 2077-2080 [doi]
- Unsupervised Clustering of Utterances Using Non-Parametric Bayesian MethodsRyuichiro Higashinaka, Noriaki Kawamae, Kugatsu Sadamitsu, Yasuhiro Minami, Toyomi Meguro, Kohji Dohsaka, Hirohito Inagaki. 2081-2084 [doi]
- OOV Sensitive Named-Entity Recognition in SpeechCarolina Parada, Mark Dredze, Frederick Jelinek. 2085-2088 [doi]
- Speech Translation with Grammar Driven Probabilistic Phrasal Bilexica ExtractionMarkus Saers, Dekai Wu, Chi-kiu Lo, Karteek Addanki. 2089-2092 [doi]
- An Efficient Unified Extraction Algorithm for Bilingual DataChristoph Tillmann, Sanjika Hewavitharana. 2093-2096 [doi]
- Using Features from Topic Models to Alleviate Over-Generation in Hierarchical Phrase-Based TranslationSongfang Huang, Bowen Zhou. 2097-2100 [doi]
- An Empirical Study on Improving Hierarchical Phrase-Based Translation Using Alignment FeaturesSongfang Huang, Bowen Zhou. 2101-2104 [doi]
- Robust Speech Translation by Domain AdaptationXiaodong He, Li Deng. 2105-2108 [doi]
- Enhancements to the Training Process of Classifier-Based Speech Translator via Topic ModelingEmil Ettelaie, Panayiotis G. Georgiou, Shrikanth Narayanan. 2109-2112 [doi]
- A Scalable Approach to Building a Parallel Corpus from the WebVivek Kumar Rangarajan Sridhar, Luciano Barbosa, Srinivas Bangalore. 2113-2116 [doi]
- Spoken Term Detection Results Using Plural Subword Models by Estimating Detection Performance for Each QueryYoshiaki Itoh, Kohei Iwata, Masaaki Ishigame, Kazuyo Tanaka, Shi-wook Lee. 2117-2120 [doi]
- SpeechForms: From Web to Speech and BackLuciano Barbosa, Diamantino Caseiro, Giuseppe Di Fabbrizio. 2121-2124 [doi]
- Image Processing Filters for Line Detection-Based Spoken Term DetectionKazuyuki Noritake, Hiroaki Nanjo, Takehiko Yoshimi. 2125-2128 [doi]
- Using Latent Topic Features for Named Entity Extraction in Search QueriesJoe Polifroni, François Mairesse. 2129-2132 [doi]
- Language Model Expansion Using Webdata for Spoken Document RetrievalRyo Masumura, Seongjun Hahm, Akinori Ito. 2133-2136 [doi]
- Effects of Query Expansion for Spoken Document Passage RetrievalTomoyosi Akiba, Koichiro Honda. 2137-2140 [doi]
- Unsupervised Hidden Markov Modeling of Spoken Queries for Spoken Term Detection without Speech RecognitionChun-an Chan, Lin-Shan Lee. 2141-2144 [doi]
- Topic Identification from Audio Recordings Using Rich Recognition Results and Neural Network Based ClassifiersRoberto Gemello, Franco Mana, Pier Domenico Batzu. 2145-2148 [doi]
- A Grammar Based Approach to Style Specific Phrase PredictionAlok Parlikar, Alan W. Black. 2149-2152 [doi]
- Unsupervised Features from Text for Speech Synthesis in a Speech-to-Speech Translation SystemOliver Watts, Bowen Zhou. 2153-2156 [doi]
- Unsupervised Continuous-Valued Word Features for Phrase-Break Prediction without a Part-of-Speech TaggerOliver Watts, Junichi Yamagishi, Simon King. 2157-2160 [doi]
- Albayzín 2010: A Spanish Text to Speech EvaluationFrancisco Campillo, Francisco Méndez Pazó, Montserrat Arza, Laura Docío Fernández, Antonio Bonafonte, Eva Navas, Iñaki Sainz. 2161-2164 [doi]
- Combining Active and Semi-Supervised Learning for Homograph Disambiguation in Mandarin Text-to-Speech SynthesisBinbin Shen, Zhiyong Wu, Yongxin Wang, Lianhong Cai. 2165-2168 [doi]
- Automatically Creating a Diphone Set from a Speech DatabaseThomas Ewender, Beat Pfister. 2169-2172 [doi]
- Automatic Viseme Clustering for Audiovisual Speech SynthesisWesley Mattheyses, Lukas Latacz, Werner Verhelst. 2173-2176 [doi]
- Perceptual Quality Dimensions of Text-to-Speech SystemsFlorian Hinterleitner, Sebastian Möller, Christoph Norrenbrock, Ulrich Heute. 2177-2180 [doi]
- A Pointwise Approach to Pronunciation Estimation for a TTS Front-EndShinsuke Mori, Graham Neubig. 2181-2184 [doi]
- Correlating Text with ProsodyMohamed Abou-Zleikha, Julie Carson-Berndsen. 2185-2188 [doi]
- "What is... Dengue Fever?" - Modeling and Predicting Pronunciation Errors in a Text-to-Speech SystemAndrew Rosenberg, Raul Fernandez, Bhuvana Ramabhadran. 2189-2192 [doi]
- Aperiodicity Analysis for Quality Estimation of Text-to-Speech SignalsChristoph Norrenbrock, Ulrich Heute, Florian Hinterleitner, Sebastian Möller. 2193-2196 [doi]
- Parallels in Infants' Attention to Speech Articulation and to Physical Changes in Speech-Unrelated ObjectsEeva Klintfors, Ellen Marklund, Francisco Lacerda. 2197-2200 [doi]
- Speech Events are Recoverable from Unlabeled Articulatory Data: Using an Unsupervised Clustering Approach on Data Obtained from Electromagnetic Midsaggital Articulography (EMA)Daniel Duran, Jagoda Bruni, Grzegorz Dogil, Hinrich Schütze. 2201-2204 [doi]
- Children's Recognition of their own Voice: Influence of Phonological ImpairmentSofia Strömbergsson. 2205-2208 [doi]
- Evaluation of Bone-Conducted Ultrasonic Hearing-Aid Regarding Transmission of Speaker Discrimination InformationTakayuki Kagomiya, Seiji Nakagawa. 2209-2212 [doi]
- Impact of Different Feedback Mechanisms in EMG-Based Speech RecognitionChristian Herff, Matthias Janke, Michael Wand, Tanja Schultz. 2213-2216 [doi]
- Phonotactic Constraints and the Segmentation of Cantonese SpeechMichael C. W. Yip. 2217-2220 [doi]
- Reaction Time and Decision Difficulty in the Perception of IntonationKatrin Schneider, Grzegorz Dogil, Bernd Möbius. 2221-2224 [doi]
- Processing of Stress Related Acoustic Cues as Indexed by ERPsFerenc Honbolygo, Valéria Csépe. 2225-2228 [doi]
- On the Relationship Between Perceived Accentedness, Acoustic Similarity, and Processing Difficulty in Foreign-Accented SpeechMarijt J. Witteman, Andrea Weber, James M. McQueen. 2229-2232 [doi]
- The Perception Boundary Between Single and Geminate Stops in 3- and 4-Mora Japanese WordsShigeaki Amano, Yukari Hirata. 2233-2236 [doi]
- Correlation Analysis of Acoustic Features with Perceptual Voice Quality Similarity for Similar Speaker SelectionYusuke Ijima, Mitsuaki Isogai, Hideyuki Mizuno. 2237-2240 [doi]
- Can Audio-Visual Speech Recognition Outperform Acoustically Enhanced Speech Recognition in Automotive Environment?Rajitha Navarathna, Tristan Kleinschmidt, David Dean, Sridha Sridharan, Patrick Lucey. 2241-2244 [doi]
- A Multimodal Approach to Dictation of Handwritten Historical DocumentsVicent Alabau, Verónica Romero, Antonio L. Lagarda, Carlos D. Martínez-Hinarejos. 2245-2248 [doi]
- Weight Optimization for Bimodal Unit-Selection Talking Head SynthesisAsterios Toutios, Utpala Musti, Slim Ouni, Vincent Colotte. 2249-2252 [doi]
- Modality Selection and Perceived Mental Effort in a Mobile ApplicationStefan Schaffer, Benjamin Jöckel, Ina Wechsung, Robert Schleicher, Sebastian Möller. 2253-2256 [doi]
- A Cross-Lingual Spoken Content Search SystemJitendra Ajmera, Ashish Verma. 2257-2260 [doi]
- NeMo: A Platform for Multilingual News MonitoringChristian Girardi, Roberto Gretter, Daniele Falavigna, Fabio Brugnara, Diego Giuliani, Marcello Federico. 2261-2264 [doi]
- Unsupervised Learning of Acoustic Unit Descriptors for Audio Content Representation and ClassificationSourish Chaudhuri, Mark Harvilla, Bhiksha Raj. 2265-2268 [doi]
- Conditioned Hidden Markov Model Fusion for Multimodal ClassificationMichael Glodek, Stefan Scherer, Friedhelm Schwenker. 2269-2272 [doi]
- Distant Speech Recognition in a Smart Home: Comparison of Several Multisource ASRs in Realistic ConditionsBenjamin Lecouteux, Michel Vacher, François Portet. 2273-2276 [doi]
- A Robust Approach to Mining Repeated Sequence in Audio StreamJiansong Chen, Lei Zhu, Bailan Feng, Peng Ding, Bo Xu. 2277-2280 [doi]
- Accelerated Parallelizable Neural Network Learning Algorithm for Speech RecognitionDong Yu, Li Deng. 2281-2284 [doi]
- Deep Convex Net: A Scalable Architecture for Speech Pattern ClassificationDong Yu, Li Deng. 2285-2288 [doi]
- Modeling Broad Context for Tone Recognition with Conditional Random FieldsSiwei Wang, Gina-Anne Levow. 2289-2292 [doi]
- Improved Tonal Language Speech Recognition by Integrating Spectro-Temporal Evidence and Pitch Information with Properly Chosen Tonal Acoustic UnitsShang-wen Li, Yow-Bang Wang, Liang-Che Sun, Lin-Shan Lee. 2293-2296 [doi]
- Kullback-Leibler Divergence-Based ASR Training Data SelectionEvandro Gouvêa, Marelie H. Davel. 2297-2300 [doi]
- Articulatory Feature Classification Using Nearest NeighborsArild Brandrud Næss, Karen Livescu, Rohit Prabhavalkar. 2301-2304 [doi]
- Continuous Episodic Memory Based Speech Recognition Using Articulatory DynamicsSébastien Demange, Slim Ouni. 2305-2308 [doi]
- Graphone Model Interpolation and Arabic Pronunciation GenerationT. Li, Philip C. Woodland, Frank Diehl, Mark J. F. Gales. 2309-2312 [doi]
- Grapheme-to-Phoneme Conversion Using Conditional Random FieldsIrina Illina, Dominique Fohr, Denis Jouvet. 2313-2316 [doi]
- Bilingual Acoustic Model Adaptation by Unit Merging on Different Levels and Cross-Level IntegrationChing-feng Yeh, Chao-Yu Huang, Lin-Shan Lee. 2317-2320 [doi]
- A Qualitative Evaluation of Phoneme-to-Phoneme TechnologyMarijn Schraagen, Gerrit Bloothooft. 2321-2324 [doi]
- Cheap Bootstrap of Multi-Lingual Hidden Markov ModelsDaniele Falavigna, Roberto Gretter. 2325-2328 [doi]
- Adaptive Stream Fusion in Multistream Recognition of SpeechNima Mesgarani, Samuel Thomas, Hynek Hermansky. 2329-2332 [doi]
- Unsupervised Audio Patterns Discovery Using HMM-Based Self-Organized UnitsMan-Hung Siu, Herbert Gish, Steve Lowe, Arthur Chan. 2333-2336 [doi]
- Nearest Neighbors with Learned Distances for Phonetic Frame ClassificationJohn Labiak, Karen Livescu. 2337-2340 [doi]
- i-vector Based Speaker Recognition on Short UtterancesAhilan Kanagasundaram, Robbie Vogt, David Dean, Sridha Sridharan, Michael Mason. 2341-2344 [doi]
- Study of Overlapped Speech Detection for NIST SRE Summed Channel Speaker RecognitionHanwu Sun, Bin Ma. 2345-2348 [doi]
- Super-Dirichlet Mixture Models Using Differential Line Spectral Frequencies for Text-Independent Speaker IdentificationZhanyu Ma, Arne Leijon. 2349-2352 [doi]
- Comparison of Voice Activity Detectors for Interview Speech in NIST Speaker Recognition EvaluationHon-Bill Yu, Man-Wai Mak. 2353-2356 [doi]
- Eigen-Voice Based Anchor Modeling System for Speaker Identification Using MLLR Super-VectorAchintya Kumar Sarkar, Srinivasan Umesh. 2357-2360 [doi]
- Automatic Detection of Speaker Attributes Based on Utterance TextWen Wang, Andreas Kathol, Harry Bratt. 2361-2364 [doi]
- Comparison of Speaker Recognition Approaches for Real ApplicationsSandro Cumani, Pier Domenico Batzu, Daniele Colibro, Claudio Vair, Pietro Laface, Vasileios Vasilakakis. 2365-2368 [doi]
- Modeling Speaker Personality Using VoiceTim Polzehl, Sebastian Möller, Florian Metze. 2369-2372 [doi]
- Structural Joint Factor Analysis for Speaker RecognitionMarc Ferras, Koichi Shinoda, Sadaoki Furui. 2373-2376 [doi]
- Acoustic Forest for SMAP-Based Speaker VerificationSangeeta Biswas, Marc Ferras, Koichi Shinoda, Sadaoki Furui. 2377-2380 [doi]
- Mixture of Auto-Associative Neural Networks for Speaker VerificationGarimella S. V. S. Sivaram, Samuel Thomas, Hynek Hermansky. 2381-2384 [doi]
- Stop Consonant Recognition by Temporal Fine Structure of BurstSeppo Fagerlund, Unto K. Laine. 2385-2388 [doi]
- Phonetic Classification Using Controlled Random WalksKatrin Kirchhoff, Andrei Alexandrescu. 2389-2392 [doi]
- Keyphrase Cloud Generation of Broadcast NewsLuís Marujo, Márcio Viveiros, João Paulo Neto. 2393-2396 [doi]
- Optimized Feature Extraction and HMMs in Subword DetectorsAlfonso M. Canterla, Magne Hallstein Johnsen. 2397-2400 [doi]
- Real-World Speech/Non-Speech Audio Classification Based on Sparse Representation Features and GPCsZiqiang Shi, Jiqing Han, Tieran Zheng. 2401-2404 [doi]
- Privacy Preserving Speaker Verification Using Adapted GMMsManas A. Pathak, Bhiksha Raj. 2405-2408 [doi]
- Clustering Expressive Speech Styles in Audiobooks Using Glottal Source ParametersÉva Székely, João P. Cabral, Peter Cahill, Julie Carson-Berndsen. 2409-2412 [doi]
- On the Use of the Rhythmogram for Automatic Syllabic Prominence DetectionBogdan Ludusan, Antonio Origlia, Francesco Cutugno. 2413-2416 [doi]
- Speech Modulation Features for Robust Nonnative Speech Accent DetectionSethserey Sam, Xiong Xiao, Laurent Besacier, Eric Castelli, Haizhou Li, Chng Eng Siong. 2417-2420 [doi]
- Frame-Level Vocal Effort Likelihood Space Modeling for Improved Whisper-Island DetectionChi Zhang, John H. L. Hansen. 2421-2424 [doi]
- Speaker Identification for Whispered Speech Using a Training Feature Transformation from Neutral to WhisperXing Fan, John H. L. Hansen. 2425-2428 [doi]
- An Accurate and Robust Gender Identification AlgorithmAndrea DeMarco, Stephen J. Cox. 2429-2432 [doi]
- Deep Belief Networks for Automatic Music Genre ClassificationXiaohong Yang, Qingcai Chen, Shusen Zhou, Xiaolong Wang. 2433-2436 [doi]
- Image Representation of the Subband Power Distribution for Robust Sound ClassificationJonathan Dennis, Tran Huy Dat, Haizhou Li. 2437-2440 [doi]
- Acoustic and Visual Cues of Turn-Taking Dynamics in Dyadic InteractionsBo Xiao, Viktor Rozgic, Athanasios Katsamanis, Brian R. Baucom, Panayiotis G. Georgiou, Shrikanth Narayanan. 2441-2444 [doi]
- Pointing Gestures do not Influence the Perception of Lexical StressAlexandra Jesse, Holger Mitterer. 2445-2448 [doi]
- Relationships Between Phonetic Features and Speech Perception - A Statistical Investigation from a Large Anechoic British English CorpusIan R. Cushing, Francis F. Li, Ken Worrall, Tim D. Jackson. 2449-2452 [doi]
- The Representation of Speech in a Nonlinear Auditory Model: Time-Domain Analysis of Simulated Auditory-Nerve Firing PatternsGuy J. Brown, Tim Jürgens, Ray Meddis, Matthew Robertson, Nicholas R. Clark. 2453-2456 [doi]
- An Automatic Voice Pleasantness Classification System Based on Prosodic and Acoustic Patterns of Voice PreferenceLuís Pinto Coelho, Daniela Braga, Miguel Sales Dias, Carmen García-Mateo. 2457-2460 [doi]
- Contributions of F1 and F2 (F2') to the Perception of Plosive ConsonantsRené Carré, Pierre L. Divenyi, Willy Serniclaes, Emmanuel Ferragne, Egidio Marsico, Viet Son Nguyen. 2461-2464 [doi]
- Auditory Speech Processing is Affected by Visual Speech in the PeripheryJeesun Kim, Chris Davis. 2465-2468 [doi]
- Visual Speech Speeds Up Auditory Identification ResponsesTim Paris, Jeesun Kim, Chris Davis. 2469-2472 [doi]
- Agglomerative Hierarchical Clustering of Emotions in Speech Based on Subjective Relative SimilarityRyoichi Takashima, Tohru Nagano, Ryuki Tachibana, Masafumi Nishimura. 2473-2476 [doi]
- Optimal Syllabic Rates and Processing Units in Perceiving Mandarin Spoken SentencesGuangting Mai, Gang Peng. 2477-2480 [doi]
- Cross-Lingual Speaker Discrimination Using Natural and Synthetic SpeechMirjam Wester, Hui Liang. 2481-2484 [doi]
- Robust Audio Fingerprinting Based on Local Spectral Luminance Maxima SchemeYongzhe Shi, Weiqiang Zhang, Jia Liu. 2485-2488 [doi]
- Entropy-Rate Driven Inference of Stochastic GrammarsUnto K. Laine. 2489-2492 [doi]
- An Efficient Pre-Processing Scheme to Improve the Sound Source Localization System in Noisy EnvironmentSheng-Chieh Lee, K. Bharanitharan, Bo-Wei Chen, Jhing-Fa Wang, Chung-Hsien Wu, Min-Jian Liao. 2493-2496 [doi]
- A Study on Auditory Feature Spaces for Speech-Driven Lip AnimationGuylaine Le-Jan, Yannick Benezeth, Guillaume Gravier, Frédéric Bimbot. 2497-2500 [doi]
- Phase-Only Speech Reconstruction Using Very Short FramesErfan Loweimi, Seyed Mohammad Ahadi, Hamid Sheikhzadeh. 2501-2504 [doi]
- Frequency-Warped and Stabilized Time-Varying Cepstral CoefficientsTrond Skogstad, Torbjørn Svendsen. 2505-2508 [doi]
- Using Human Perception for Automatic Accent AssessmentFreddy William, Abhijeet Sangwan, John H. L. Hansen. 2509-2512 [doi]
- A Study of the Effectiveness of Articulatory Strokes for Phonemic RecognitionCarlos Molina, Sungbok Lee, Shrikanth S. Narayanan, Néstor Becerra Yoma. 2513-2516 [doi]
- Auditory Filterbank Improves Voice MorphingErika Okamoto, Toshio Irino, Ryuichi Nisimura, Hideki Kawahara. 2517-2520 [doi]