Speech Communication - researchr journal

1059	--	1061	Björn Schuller, Anton Batliner, Stefan Steidl. Introduction to the special issue on sensing emotion and affect - Facing realism in speech processing
1062	--	1087	Björn Schuller, Anton Batliner, Stefan Steidl, Dino Seppi. Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge
1088	--	1103	Raul Fernandez, Rosalind W. Picard. Recognizing affect from speech prosody using hierarchical graphical models
1104	--	1114	Simon Worgan, Roger K. Moore. Towards the detection of social dominance in dialogue
1115	--	1136	Katherine Forbes-Riley, Diane J. Litman. Benefits and challenges of real-time uncertainty detection and adaptation in a spoken dialogue computer tutor
1137	--	1148	Jaime C. Acosta, Nigel G. Ward. Achieving rapport with turn-by-turn, user-responsive emotional coloring
1149	--	1161	Ammar Mahdhaoui, Mohamed Chetouani. Supervised and semi-supervised infant-directed speech classification for parent-infant interaction analysis
1162	--	1171	Chi-Chun Lee, Emily Mower, Carlos Busso, Sungbok Lee, Shrikanth Narayanan. Emotion recognition using a hierarchical binary decision tree approach
1172	--	1185	Marcel Kockmann, Lukás Burget, Jan Cernocký. Application of speaker- and language identification state-of-the-art techniques for emotion recognition
1186	--	1197	Elif Bozkurt, Engin Erzin, Çigdem Eroglu Erdem, A. Tanju Erdem. Formant position based weighted spectral features for emotion recognition
1198	--	1209	Tim Polzehl, Alexander Schmitt, Florian Metze, Michael Wagner. Anger recognition in speech using acoustic and linguistic cues
1210	--	1228	Ramón López-Cózar, Jan Silovský, Martin Kroul. Enhancement of emotion detection in spoken dialogue systems by combining several information sources

991	--	1001	Philip N. Garner. Cepstral normalisation and the signal to noise ratio spectrum in automatic speech recognition
1002	--	1025	Luis Fernando D'Haro, Ricardo de Córdoba, Rubén San Segundo, Javier Ferreiros, José Manuel Pardo. Design and evaluation of acceleration strategies for speeding up the development of dialog applications
1026	--	1041	Tatyana Polyakova, Antonio Bonafonte. Introducing nativization to Spanish TTS systems
1042	--	1058	Lisa Davidson. Characteristics of stop releases in American English spontaneous speech

955	--	972	Trevor H. Chen, Dominic W. Massaro. Evaluation of synthetic and natural Mandarin visual speech: Initial consonants, single vowels, and syllables
973	--	985	Takashi Nose, Takao Kobayashi. Speaker-independent HMM-based voice conversion using adaptive quantization of the fundamental frequency
986	--	990	Jianxin Peng, Chengxun Bei, Haitao Sun. Relationship between Chinese speech intelligibility and speech transmission index in rooms based on auralization

801	--	806	David R. Beukelman, Jana Childes, Tom Carrell, Trisha Funk, Laura J. Ball, Gary L. Pattee. Perceived attention allocation of listeners who transcribe the speech of speakers with amyotrophic lateral sclerosis
807	--	817	Amy Irwin, Michael Pilling, Sharon M. Thomas. An analysis of British regional accent and contextual cue effects on speechreading performance
818	--	829	Stephen So, Kuldip K. Paliwal. Modulation-domain Kalman filtering for single-channel speech enhancement
830	--	841	Florian Müller, Alfred Mertins. Contextual invariant-integration features for improved speaker-independent speech recognition
842	--	854	Yongqiang Feng, Grace J. Hao, Steve A. Xue, Ludo Max. Detecting anticipatory effects in speech articulation by means of spectral coefficient analyses
855	--	866	Thomas Drugman, Baris Bozkurt, Thierry Dutoit. Causal-anticausal decomposition of speech using complex cepstrum for glottal source estimation
877	--	888	Joseph D. W. Stephens, Lori L. Holt. A standard set of American-English voiced stop-consonant stimuli from morphed natural speech
889	--	902	Eren Akdemir, Tolga Çiloglu. Bimodal automatic speech segmentation based on audio and visual information fusion
903	--	913	Garreth Prendergast, Sam R. Johnson, Gary G. R. Green. Extracting amplitude modulations from speech in the time domain
914	--	923	Kai Yu, Heiga Zen, François Mairesse, Steve Young. Context adaptive training with factorized decision trees for HMM-based statistical parametric speech synthesis
924	--	940	Kalle J. Palomäki, Guy J. Brown. A computational model of binaural speech recognition: Role of across-frequency vs. within-frequency processing and internal noise
941	--	954	Frank Zimmerer, Mathias Scharinger, Henning Reetz. When BEAT becomes HOUSE: Factors of word final /t/-deletion in German

591	--	0	Martin Heckmann, Bhiksha Raj, Paris Smaragdis. Preface
592	--	605	Mathias Dietz, Stephan Dieter Ewert, Volker Hohmann. Auditory model based direction estimation of concurrent speakers from binaural signals
606	--	621	Ron J. Weiss, Michael I. Mandel, Daniel P. W. Ellis. Combining localization cues and source model constraints for binaural source separation
622	--	642	Yan-Chen Lu, Martin Cooke. Motion strategies for binaural localisation of speech sources in azimuth and distance by artificial listeners
643	--	657	Ramin Pichevar, Hossein Najaf-Zadeh, Louis Thibault, Hassan Lahdili. Auditory-inspired sparse representation of audio signals
658	--	676	Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Alain de Cheveigné, Shigeki Sagayama. Computational auditory induction as a missing-data model-fitting problem with Bregman divergence
677	--	689	Junfeng Li, Shuichi Sakamoto, Satoshi Hongo, Masato Akagi, Yôiti Suzuki. Two-stage binaural speech enhancement with Wiener filter for high-quality speech communication
690	--	706	Jörg-Hendrik Bach, Jörn Anemüller, Birger Kollmeier. Robust speech detection in real acoustic backgrounds with perceptually motivated features
707	--	715	Hui Yin, Volker Hohmann, Climent Nadeu. Acoustic features for speech recognition based on Gammatone filterbank and instantaneous frequency
716	--	725	Yotaro Kubo, Shigeki Okawa, Akira Kurematsu, Katsuhiko Shirai. Temporal AM-FM combination for robust speech recognition
726	--	735	Maria E. Markaki, Yannis Stylianou. Discrimination of speech from nonspeech in broadcast news based on modulation frequency features
736	--	752	Martin Heckmann, Xavier Domont, Frank Joublin, Christian Goerick. A hierarchical framework for spectro-temporal feature extraction
753	--	767	Bernd T. Meyer, Birger Kollmeier. Robustness of spectro-temporal features against intrinsic and extrinsic variations in automatic speech recognition
768	--	785	Siqing Wu, Tiago H. Falk, Wai-Yip Chan. Automatic speech emotion recognition using modulation spectral features
786	--	800	Francesc Alías, Lluís Formiga, Xavier Llorà. Efficient and reliable perceptual weight tuning for unit-selection text-to-speech synthesis based on active interactive genetic algorithms: A proof-of-concept

451	--	464	Wooil Kim, John H. L. Hansen. Variational noise model composition through model perturbation for robust speech recognition with time-varying background noise
465	--	494	Kuldip K. Paliwal, Kamil K. Wójcicki, Benjamin J. Shannon. The importance of phase in speech enhancement
495	--	507	Ching-Ta Lu. Enhancement of single channel speech using perceptual-decision-directed approach
508	--	523	Jie Gao, QingWei Zhao, YongHong Yan. Towards precise and robust automatic synchronization of live speech and its transcripts
524	--	539	Tariqullah Jan, Wenwu Wang, DeLiang Wang. A multistage approach to blind separation of convolutive speech mixtures
540	--	551	Phu Ngoc Le, Eliathamby Ambikairajah, Julien Epps, Vidhyasaharan Sethu, Eric H. C. Choi. Investigation of spectral centroid features for cognitive load classification
552	--	566	Milan Legát, Jindrich Matousek, Daniel Tihelka. On the detection of pitch marks using a robust multi-phase algorithm
567	--	589	G. Ananthakrishnan, Olov Engwall. Mapping between acoustic and articulatory gestures

269	--	282	Eero Väyrynen, Juhani Toivanen, Tapio Seppänen. Classification of emotion in spoken Finnish using vowel-length segments: Increasing reliability with a fusion technique
283	--	291	Koichi Shinoda, Yasushi Watanabe, Kenji Iwata, Yuan Liang, Ryuta Nakagawa, Sadaoki Furui. Semi-synchronous speech and pen input for mobile user interfaces
292	--	310	Bianca Vieru, Philippe Boula de Mareüil, Martine Adda-Decker. Characterisation and identification of non-native French accents
311	--	326	Catherine Mayo, Robert A. J. Clark, Simon King. Listeners' weighting of acoustic cues to synthetic speech naturalness: A multidimensional scaling analysis
327	--	339	Kuldip K. Paliwal, Belinda Schwerin, Kamil K. Wójcicki. Role of modulation magnitude and phase spectrum towards speech intelligibility
340	--	354	Jianfen Ma, Philipos C. Loizou. SNR loss: A new objective measure for predicting the intelligibility of noise-suppressed speech
355	--	378	Stephen So, Kuldip K. Paliwal. Suppressing the influence of additive noise on the Kalman gain for low residual noise speech enhancement
379	--	389	Jean C. Krause, Katherine A. Pelley-Lopez, Morgan P. Tessler. A method for transcribing the manual components of Cued Speech
390	--	402	Jedrzej Kocinski, Pawel Libiszewski, Aleksander Sek. Spatial efficiency of blind source separation based on decorrelation - subjective and objective assessment
403	--	416	Anthony P. Stark, Kuldip K. Paliwal. MMSE estimation of log-filterbank energies for robust speech recognition
417	--	430	I. Hoonhorst, Victoria Medina, C. Colin, E. Markessis, Monique Radeau, P. Deltenre, Willy Serniclaes. Categorical perception of voicing, colors and facial expressions: A developmental study
431	--	441	Rupal Patel, Catherine McNab. Displaying prosodic text to enhance expressive oral reading
442	--	450	Adriana Stan, Junichi Yamagishi, Simon King, Matthew P. Aylett. The Romanian speech synthesis (RSS) corpus: Building a high quality HMM-based speech synthesis system using a high sampling rate

143	--	153	Marijn Huijbregts, Franciska de Jong. Robust speech/non-speech classification in heterogeneous multimedia content
154	--	174	P. Krishnamoorthy, S. R. Mahadeva Prasanna. Enhancement of noisy speech by temporal and spectral processing
175	--	184	Ruili Wang, Jingli Lu. Investigation of golden speakers for second language learners from imitation preference perspective by voice modification
185	--	194	B. E. Lobdell, J. B. Allen, Mark Hasegawa-Johnson. Intelligibility predictors and neural representation of speech
195	--	209	Abeer Alwan, Jintao Jiang, Willa Chen. Perception of place of articulation for plosives and fricatives in noise
210	--	219	Adam Borowicz, Alexander A. Petrovsky. Signal subspace approach for psychoacoustically motivated speech enhancement
220	--	228	Julia Feld, Mitchell Sommers. There goes the neighborhood: Lipreading and the structure of the mental lexicon
229	--	241	Peng Dai, Ing Yann Soon. A temporal warped 2D psychoacoustic modeling for robust speech recognition system
242	--	256	Geoffrey Stewart Morrison. A comparison of procedures for the calculation of forensic likelihood ratios from acoustic-phonetic data: Multivariate kernel density (MVKD) versus Gaussian mixture model-universal background model (GMM-UBM)
257	--	268	Yun Lei, John H. L. Hansen. Mismatch modeling and compensation for robust speaker verification

1	--	11	Wooil Kim, Richard M. Stern. Mask classification for missing-feature reconstruction for robust speech recognition in unknown background noise
12	--	22	Monja A. Knoll, Lisa Scharrer, Alan Costall. Look at the shark: Evaluation of student- and actress-produced standardised sentences of infant- and foreigner-directed speech
23	--	35	Anna Hjalmarsson. The additive effect of turn-taking cues in human and synthetic voice
36	--	50	Hiroki Mori, Tomoyuki Satake, Makoto Nakamura, Hideki Kasuya. Constructing a spoken dialogue corpus for studying paralinguistic information in expressive conversation and analyzing its statistical/acoustic characteristics
51	--	61	Anthony P. Stark, Kuldip K. Paliwal. Use of speech presence uncertainty with MMSE spectral energy estimation for robust automatic speech recognition
62	--	74	Martin Raab, Rainer Gruhn, Elmar Nöth. A scalable architecture for multilingual speech recognition on embedded devices
75	--	84	Linsen Loots, Thomas Niesler. Automatic conversion between pronunciations of different English accents
85	--	97	Alexandros Lazaridis, Iosif Mporas, Todor Ganchev, George K. Kokkinakis, Nikos Fakotakis. Improving phone duration modelling using support vector regression fusion
98	--	109	Prasanta Kumar Ghosh, Shrikanth S. Narayanan. Joint source-filter optimization for robust glottal source estimation in the presence of shimmer and jitter
110	--	118	Vijendra Raj Apsingekar, Phillip L. De Leon. Speaker verification score normalization using speaker model clusters
119	--	130	Man-Wai Mak, Wei Rao. Utterance partitioning with acoustic vector resampling for GMM-SVM speaker verification
131	--	141	A. Alpan, Y. Maryn, Abdellah Kacha, Francis Grenez, Jean Schoentgen. Multi-band dysperiodicity analyses of disordered connected speech