An engineering blog for engineers working in Audiovisuals, Telecommunications, Electronics, ICT, Computer Engineering, Multimedia and Telematics.

23 July 2010 | Posted by Redacción Ingeniería

New PhD thesis presented on quality voice modelling

Last Wednesday, July 14th, a new PhD thesis was presented within the research group on media technologies. The author is Carlos Monzo and the title is “Quality voice modelling for expressive speech synthesis”. The advisors of this work are Dr. Joan Claudi Socoró and Dr. Ignasi Iriondo. We congratulate them all on this excellent work!

Abstract: The final goal of the thesis is the generation of expressive speech styles in Text-to-Speech (TTS) systems aimed at Expressive Speech Synthesis (ESS), so that an oral message can be communicated with a certain expressivity that the listener is able to perceive and interpret correctly. The search for improved naturalness has required a better characterization of emotional or expressive speech, so we have investigated parametrizations that could perform this task. These are Voice Quality (VoQ) parameters, whose main feature is that they characterize speech in an individual way, identifying each factor that makes it unique. Once the parameters have been selected, the VoQ modelling is addressed, so that each parameter can be extracted from the voice signal and later modified during synthesis. In addition, variations are proposed for the traditionally used parameters involved, adapting their definitions to the expressive speech context.
