The 9th ISCA Speech Synthesis Workshop, Sunnyvale, CA, USA, 13-15 September 2016

The 9th ISCA Speech Synthesis Workshop, Sunnyvale, CA, USA, 13-15 September 2016. ISCA, 2016. [doi]

Conference: ssw2016

Siri's voice gets deep learning. Alex Acero. [doi]

Large-scale finite element simulations of the physics of voice. Oriol Guasch. [doi]

End-to-end Learning for Text and Speech. Quoc V. Le. [doi]

Automatic, model-based detection of pause-less phrase boundaries from fundamental frequency and duration features. Mahsa Sadat Elyasi Langarani, Jan P. H. van Santen. 1-6 [doi]

Synthesising Filled Pauses: Representation and Datamixing. Rasmus Dall, Marcus Tomalin, Mirjam Wester. 7-13 [doi]

Emphasis recreation for TTS using intonation atoms. Pierre-Edouard Honnet, Philip N. Garner. 14-20 [doi]

Prediction of Emotions from Text using Sentiment Analysis for Expressive Speech Synthesis. Eva Vanmassenhove, João P. Cabral, Fasih Haider. 21-26 [doi]

Non-filter waveform generation from cepstrum using spectral phase reconstruction. Yasuhiro Hamada, Nobutaka Ono, Shigeki Sagayama. 27-31 [doi]

Investigating Spectral Amplitude Modulation Phase Hierarchy Features in Speech Synthesis. Alexandros Lazaridis, Milos Cernak, Pierre-Edouard Honnet, Philip N. Garner. 32-37 [doi]

Multidimensional scaling of systems in the Voice Conversion Challenge 2016. Mirjam Wester, Zhizheng Wu, Junichi Yamagishi. 38-43 [doi]

An Automatic Voice Conversion Evaluation Strategy Based on Perceptual Background Noise Distortion and Speaker Similarity. Dong-Yan Huang, Lei Xie, Yvonne Siu Wa Lee, Jie Wu, Huaiping Ming, Xiaohai Tian, Shaofei Zhang, Chuang Ding, Mei Li, Nguyen Quy Hy, Minghui Dong, Haizhou Li. 44-51 [doi]

Nonaudible murmur enhancement based on statistical voice conversion and noise suppression with external noise monitoring. Yusuke Tajiri, Tomoki Toda. 52-58 [doi]

Prosodic and Spectral iVectors for Expressive Speech Synthesis. Igor Jauk, Antonio Bonafonte. 59-63 [doi]

Development of a statistical parametric synthesis system for operatic singing in German. Michael Pucher, Fernando Villavicencio, Junichi Yamagishi. 64-69 [doi]

DNN-based Speech Synthesis for Indian Languages from ASCII text. Srikanth Ronanki, Siva Reddy Gangireddy, Bajibabu Bollepalli, Simon King. 70-75 [doi]

Experiments with Cross-lingual Systems for Synthesis of Code-Mixed Text. Sunayana Sitaram, Sai Krishna Rallabandi, Shruti Rijhwani, Alan W. Black. 76-81 [doi]

Jerk Minimization for Acoustic-To-Articulatory Inversion. Avni Rajpal, Hemant A. Patil. 82-87 [doi]

How to select a good voice for TTS. SunHee Kim. 88-92 [doi]

WikiSpeech - enabling open source text-to-speech for Wikipedia. John Andersson, Sebastian Berlin, André Costa, Harald Berthelsen, Hanna Lindgren, Nikolaj Lindberg, Jonas Beskow, Jens Edlund, Joakim Gustafson. 93-99 [doi]

Parallel and cascaded deep neural networks for text-to-speech synthesis. Manuel Sam Ribeiro, Oliver Watts, Junichi Yamagishi. 100-105 [doi]

Temporal modeling in neural network based statistical parametric speech synthesis. Keiichi Tokuda, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku. 106-111 [doi]

Multi-output RNN-LSTM for multiple speaker speech synthesis with α-interpolation model. Santiago Pascual, Antonio Bonafonte. 112-117 [doi]

A Comparative Study of the Performance of HMM, DNN, and RNN based Speech Synthesis Systems Trained on Very Large Speaker-Dependent Corpora. Xin Wang, Shinji Takaki, Junichi Yamagishi. 118-121 [doi]

Prosodic Reading Tutor of Japanese, Suzuki-kun: The first and only educational tool to teach the formal Japanese. Nobuaki Minematsu, Daisuke Saito, Nobuyuki Nishizawa. 122 [doi]

Aliasing-free L-F model and its application to an interactive MATLAB tool and test signal generation for speech analysis procedures. Hideki Kawahara. 123 [doi]

A Demonstration of the Merlin Open Source Neural Network Speech Synthesis System. Srikanth Ronanki, Zhizheng Wu, Oliver Watts, Simon King. 124 [doi]

WaveNet: A Generative Model for Raw Audio. Aäron Van Den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew W. Senior, Koray Kavukcuoglu. 125 [doi]

Demo of Idlak Tangle, An Open Source DNN-Based Parametric Speech Synthesiser. Blaise Potard, Matthew P. Aylett, David A. Baude. 126 [doi]

Non-intrusive Quality Assessment of Synthesized Speech using Spectral Features and Support Vector Regression. Meet H. Soni, Hemant A. Patil. 127-133 [doi]

Novel Pre-processing using Outlier Removal in Voice Conversion. Sushant V. Rao, Nirmesh J. Shah, Hemant A. Patil. 134-139 [doi]

Emotional Voice Conversion Using Neural Networks with Different Temporal Scales of F0 based on Wavelet Transform. Zhaojie Luo, Tetsuya Takiguchi, Yasuo Ariki. 140-145 [doi]

Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech. Cassia Valentini-Botinhao, Xin Wang, Shinji Takaki, Junichi Yamagishi. 146-152 [doi]

Speaker Adaptation of Various Components in Deep Neural Network based Speech Synthesis. Shinji Takaki, Sangjin Kim, Junichi Yamagishi. 153-159 [doi]

Mandarin Prosodic Phrase Prediction based on Syntactic Trees. Zhengchen Zhang, Fuxiang Wu, Chenyu Yang, Minghui Dong, Fugen Zhou. 160-165 [doi]

Investigating Very Deep Highway Networks for Parametric Speech Synthesis. Xin Wang, Shinji Takaki, Junichi Yamagishi. 166-171 [doi]

Contextual Representation using Recurrent Neural Network Hidden State for Statistical Parametric Speech Synthesis. Sivanand Achanta, Rambabu Banoth, Ayushi Pandey, Anandaswarup Vadapalli, Suryakanth V. Gangashetty. 172-177 [doi]

Wide Passband Design for Cosine-Modulated Filter Banks in Sinusoidal Speech Synthesis. Nobuyuki Nishizawa, Tomonori Yazaki. 178-183 [doi]

Utterance Selection Techniques for TTS Systems Using Found Speech. Pallavi Baljekar, Alan W. Black. 184-189 [doi]

Open-Source Consumer-Grade Indic Text To Speech. Andrew Wilkinson, Alok Parlikar, Sunayana Sitaram, Tim White, Alan W. Black, Suresh Bazaj. 190-195 [doi]

On the impact of phoneme alignment in DNN-based speech synthesis. Mei Li, Zhizheng Wu, Lei Xie. 196-201 [doi]

Merlin: An Open Source Neural Network Speech Synthesis System. Zhizheng Wu, Oliver Watts, Simon King. 202-207 [doi]

A hybrid harmonics-and-bursts modelling approach to speech synthesis. Jonas Beskow, Harald Berthelsen. 208-213 [doi]

A Pulse Model in Log-domain for a Uniform Synthesizer. Gilles Degottex, Pierre Lanchantin, Mark J. F. Gales. 214-220 [doi]

Using instantaneous frequency and aperiodicity detection to estimate F0 for high-quality speech synthesis. Hideki Kawahara, Yannis Agiomyrgiannakis, Heiga Zen. 221-228 [doi]

Wideband Harmonic Model: Alignment and Noise Modeling for High Quality Speech Synthesis. Slava Shechtman, Alexander Sorin. 229-234 [doi]
