Journal: Speech Communication

Volume 53, Issue 9-10

1059 -- 1061Björn Schuller, Anton Batliner, Stefan Steidl. Introduction to the special issue on sensing emotion and affect - Facing realism in speech processing
1062 -- 1087Björn Schuller, Anton Batliner, Stefan Steidl, Dino Seppi. Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge
1088 -- 1103Raul Fernandez, Rosalind W. Picard. Recognizing affect from speech prosody using hierarchical graphical models
1104 -- 1114Simon Worgan, Roger K. Moore. Towards the detection of social dominance in dialogue
1115 -- 1136Katherine Forbes-Riley, Diane J. Litman. Benefits and challenges of real-time uncertainty detection and adaptation in a spoken dialogue computer tutor
1137 -- 1148Jaime C. Acosta, Nigel G. Ward. Achieving rapport with turn-by-turn, user-responsive emotional coloring
1149 -- 1161Ammar Mahdhaoui, Mohamed Chetouani. Supervised and semi-supervised infant-directed speech classification for parent-infant interaction analysis
1162 -- 1171Chi-Chun Lee, Emily Mower, Carlos Busso, Sungbok Lee, Shrikanth Narayanan. Emotion recognition using a hierarchical binary decision tree approach
1172 -- 1185Marcel Kockmann, Lukás Burget, Jan Cernocký. Application of speaker- and language identification state-of-the-art techniques for emotion recognition
1186 -- 1197Elif Bozkurt, Engin Erzin, Çigdem Eroglu Erdem, A. Tanju Erdem. Formant position based weighted spectral features for emotion recognition
1198 -- 1209Tim Polzehl, Alexander Schmitt, Florian Metze, Michael Wagner. Anger recognition in speech using acoustic and linguistic cues
1210 -- 1228Ramón López-Cózar, Jan Silovský, Martin Kroul. Enhancement of emotion detection in spoken dialogue systems by combining several information sources

Volume 53, Issue 8

991 -- 1001Philip N. Garner. Cepstral normalisation and the signal to noise ratio spectrum in automatic speech recognition
1002 -- 1025Luis Fernando D'Haro, Ricardo de Córdoba, Rubén San Segundo, Javier Ferreiros, José Manuel Pardo. Design and evaluation of acceleration strategies for speeding up the development of dialog applications
1026 -- 1041Tatyana Polyakova, Antonio Bonafonte. Introducing nativization to Spanish TTS systems
1042 -- 1058Lisa Davidson. Characteristics of stop releases in American English spontaneous speech

Volume 53, Issue 7

955 -- 972Trevor H. Chen, Dominic W. Massaro. Evaluation of synthetic and natural Mandarin visual speech: Initial consonants, single vowels, and syllables
973 -- 985Takashi Nose, Takao Kobayashi. Speaker-independent HMM-based voice conversion using adaptive quantization of the fundamental frequency
986 -- 990Jianxin Peng, Chengxun Bei, Haitao Sun. Relationship between Chinese speech intelligibility and speech transmission index in rooms based on auralization

Volume 53, Issue 6

801 -- 806David R. Beukelman, Jana Childes, Tom Carrell, Trisha Funk, Laura J. Ball, Gary L. Pattee. Perceived attention allocation of listeners who transcribe the speech of speakers with amyotrophic lateral sclerosis
807 -- 817Amy Irwin, Michael Pilling, Sharon M. Thomas. An analysis of British regional accent and contextual cue effects on speechreading performance
818 -- 829Stephen So, Kuldip K. Paliwal. Modulation-domain Kalman filtering for single-channel speech enhancement
830 -- 841Florian Müller, Alfred Mertins. Contextual invariant-integration features for improved speaker-independent speech recognition
842 -- 854Yongqiang Feng, Grace J. Hao, Steve A. Xue, Ludo Max. Detecting anticipatory effects in speech articulation by means of spectral coefficient analyses
855 -- 866Thomas Drugman, Baris Bozkurt, Thierry Dutoit. Causal-anticausal decomposition of speech using complex cepstrum for glottal source estimation
877 -- 888Joseph D. W. Stephens, Lori L. Holt. A standard set of American-English voiced stop-consonant stimuli from morphed natural speech
889 -- 902Eren Akdemir, Tolga Çiloglu. Bimodal automatic speech segmentation based on audio and visual information fusion
903 -- 913Garreth Prendergast, Sam R. Johnson, Gary G. R. Green. Extracting amplitude modulations from speech in the time domain
914 -- 923Kai Yu, Heiga Zen, François Mairesse, Steve Young. Context adaptive training with factorized decision trees for HMM-based statistical parametric speech synthesis
924 -- 940Kalle J. Palomäki, Guy J. Brown. A computational model of binaural speech recognition: Role of across-frequency vs. within-frequency processing and internal noise
941 -- 954Frank Zimmerer, Mathias Scharinger, Henning Reetz. When BEAT becomes HOUSE: Factors of word final /t/-deletion in German

Volume 53, Issue 5

591 -- 591Martin Heckmann, Bhiksha Raj, Paris Smaragdis. Preface
592 -- 605Mathias Dietz, Stephan Dieter Ewert, Volker Hohmann. Auditory model based direction estimation of concurrent speakers from binaural signals
606 -- 621Ron J. Weiss, Michael I. Mandel, Daniel P. W. Ellis. Combining localization cues and source model constraints for binaural source separation
622 -- 642Yan-Chen Lu, Martin Cooke. Motion strategies for binaural localisation of speech sources in azimuth and distance by artificial listeners
643 -- 657Ramin Pichevar, Hossein Najaf-Zadeh, Louis Thibault, Hassan Lahdili. Auditory-inspired sparse representation of audio signals
658 -- 676Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Alain de Cheveigné, Shigeki Sagayama. Computational auditory induction as a missing-data model-fitting problem with Bregman divergence
677 -- 689Junfeng Li, Shuichi Sakamoto, Satoshi Hongo, Masato Akagi, Yôiti Suzuki. Two-stage binaural speech enhancement with Wiener filter for high-quality speech communication
690 -- 706Jörg-Hendrik Bach, Jörn Anemüller, Birger Kollmeier. Robust speech detection in real acoustic backgrounds with perceptually motivated features
707 -- 715Hui Yin, Volker Hohmann, Climent Nadeu. Acoustic features for speech recognition based on Gammatone filterbank and instantaneous frequency
716 -- 725Yotaro Kubo, Shigeki Okawa, Akira Kurematsu, Katsuhiko Shirai. Temporal AM-FM combination for robust speech recognition
726 -- 735Maria E. Markaki, Yannis Stylianou. Discrimination of speech from nonspeech in broadcast news based on modulation frequency features
736 -- 752Martin Heckmann, Xavier Domont, Frank Joublin, Christian Goerick. A hierarchical framework for spectro-temporal feature extraction
753 -- 767Bernd T. Meyer, Birger Kollmeier. Robustness of spectro-temporal features against intrinsic and extrinsic variations in automatic speech recognition
768 -- 785Siqing Wu, Tiago H. Falk, Wai-Yip Chan. Automatic speech emotion recognition using modulation spectral features
786 -- 800Francesc Alías, Lluís Formiga, Xavier Llorà. Efficient and reliable perceptual weight tuning for unit-selection text-to-speech synthesis based on active interactive genetic algorithms: A proof-of-concept

Volume 53, Issue 4

451 -- 464Wooil Kim, John H. L. Hansen. Variational noise model composition through model perturbation for robust speech recognition with time-varying background noise
465 -- 494Kuldip K. Paliwal, Kamil K. Wójcicki, Benjamin J. Shannon. The importance of phase in speech enhancement
495 -- 507Ching-Ta Lu. Enhancement of single channel speech using perceptual-decision-directed approach
508 -- 523Jie Gao, QingWei Zhao, YongHong Yan. Towards precise and robust automatic synchronization of live speech and its transcripts
524 -- 539Tariqullah Jan, Wenwu Wang, DeLiang Wang. A multistage approach to blind separation of convolutive speech mixtures
540 -- 551Phu Ngoc Le, Eliathamby Ambikairajah, Julien Epps, Vidhyasaharan Sethu, Eric H. C. Choi. Investigation of spectral centroid features for cognitive load classification
552 -- 566Milan Legát, Jindrich Matousek, Daniel Tihelka. On the detection of pitch marks using a robust multi-phase algorithm
567 -- 589G. Ananthakrishnan, Olov Engwall. Mapping between acoustic and articulatory gestures

Volume 53, Issue 3

269 -- 282Eero Väyrynen, Juhani Toivanen, Tapio Seppänen. Classification of emotion in spoken Finnish using vowel-length segments: Increasing reliability with a fusion technique
283 -- 291Koichi Shinoda, Yasushi Watanabe, Kenji Iwata, Yuan Liang, Ryuta Nakagawa, Sadaoki Furui. Semi-synchronous speech and pen input for mobile user interfaces
292 -- 310Bianca Vieru, Philippe Boula de Mareüil, Martine Adda-Decker. Characterisation and identification of non-native French accents
311 -- 326Catherine Mayo, Robert A. J. Clark, Simon King. Listeners' weighting of acoustic cues to synthetic speech naturalness: A multidimensional scaling analysis
327 -- 339Kuldip K. Paliwal, Belinda Schwerin, Kamil K. Wójcicki. Role of modulation magnitude and phase spectrum towards speech intelligibility
340 -- 354Jianfen Ma, Philipos C. Loizou. SNR loss: A new objective measure for predicting the intelligibility of noise-suppressed speech
355 -- 378Stephen So, Kuldip K. Paliwal. Suppressing the influence of additive noise on the Kalman gain for low residual noise speech enhancement
379 -- 389Jean C. Krause, Katherine A. Pelley-Lopez, Morgan P. Tessler. A method for transcribing the manual components of Cued Speech
390 -- 402Jedrzej Kocinski, Pawel Libiszewski, Aleksander Sek. Spatial efficiency of blind source separation based on decorrelation - subjective and objective assessment
403 -- 416Anthony P. Stark, Kuldip K. Paliwal. MMSE estimation of log-filterbank energies for robust speech recognition
417 -- 430I. Hoonhorst, Victoria Medina, C. Colin, E. Markessis, Monique Radeau, P. Deltenre, Willy Serniclaes. Categorical perception of voicing, colors and facial expressions: A developmental study
431 -- 441Rupal Patel, Catherine McNab. Displaying prosodic text to enhance expressive oral reading
442 -- 450Adriana Stan, Junichi Yamagishi, Simon King, Matthew P. Aylett. The Romanian speech synthesis (RSS) corpus: Building a high quality HMM-based speech synthesis system using a high sampling rate

Volume 53, Issue 2

143 -- 153Marijn Huijbregts, Franciska de Jong. Robust speech/non-speech classification in heterogeneous multimedia content
154 -- 174P. Krishnamoorthy, S. R. Mahadeva Prasanna. Enhancement of noisy speech by temporal and spectral processing
175 -- 184Ruili Wang, Jingli Lu. Investigation of golden speakers for second language learners from imitation preference perspective by voice modification
185 -- 194B. E. Lobdell, J. B. Allen, Mark Hasegawa-Johnson. Intelligibility predictors and neural representation of speech
195 -- 209Abeer Alwan, Jintao Jiang, Willa Chen. Perception of place of articulation for plosives and fricatives in noise
210 -- 219Adam Borowicz, Alexander A. Petrovsky. Signal subspace approach for psychoacoustically motivated speech enhancement
220 -- 228Julia Feld, Mitchell Sommers. There goes the neighborhood: Lipreading and the structure of the mental lexicon
229 -- 241Peng Dai, Ing Yann Soon. A temporal warped 2D psychoacoustic modeling for robust speech recognition system
242 -- 256Geoffrey Stewart Morrison. A comparison of procedures for the calculation of forensic likelihood ratios from acoustic-phonetic data: Multivariate kernel density (MVKD) versus Gaussian mixture model-universal background model (GMM-UBM)
257 -- 268Yun Lei, John H. L. Hansen. Mismatch modeling and compensation for robust speaker verification

Volume 53, Issue 1

1 -- 11Wooil Kim, Richard M. Stern. Mask classification for missing-feature reconstruction for robust speech recognition in unknown background noise
12 -- 22Monja A. Knoll, Lisa Scharrer, Alan Costall. "Look at the shark": Evaluation of student- and actress-produced standardised sentences of infant- and foreigner-directed speech
23 -- 35Anna Hjalmarsson. The additive effect of turn-taking cues in human and synthetic voice
36 -- 50Hiroki Mori, Tomoyuki Satake, Makoto Nakamura, Hideki Kasuya. Constructing a spoken dialogue corpus for studying paralinguistic information in expressive conversation and analyzing its statistical/acoustic characteristics
51 -- 61Anthony P. Stark, Kuldip K. Paliwal. Use of speech presence uncertainty with MMSE spectral energy estimation for robust automatic speech recognition
62 -- 74Martin Raab, Rainer Gruhn, Elmar Nöth. A scalable architecture for multilingual speech recognition on embedded devices
75 -- 84Linsen Loots, Thomas Niesler. Automatic conversion between pronunciations of different English accents
85 -- 97Alexandros Lazaridis, Iosif Mporas, Todor Ganchev, George K. Kokkinakis, Nikos Fakotakis. Improving phone duration modelling using support vector regression fusion
98 -- 109Prasanta Kumar Ghosh, Shrikanth S. Narayanan. Joint source-filter optimization for robust glottal source estimation in the presence of shimmer and jitter
110 -- 118Vijendra Raj Apsingekar, Phillip L. De Leon. Speaker verification score normalization using speaker model clusters
119 -- 130Man-Wai Mak, Wei Rao. Utterance partitioning with acoustic vector resampling for GMM-SVM speaker verification
131 -- 141A. Alpan, Y. Maryn, Abdellah Kacha, Francis Grenez, Jean Schoentgen. Multi-band dysperiodicity analyses of disordered connected speech