Blog by the Media Technologies research group (GTM). Researching interactions between humans, machines and their environments.

06 November 2019 | Posted by Editorial Team GTM

GTM participates in the 10th ISCA Speech Synthesis Workshop (SSW10)

Figure 1 – Marc Freixes presenting at the 10th ISCA Speech Synthesis Workshop (SSW10)

 

GTM member Marc Freixes attended the 10th ISCA Speech Synthesis Workshop (SSW10), held in Vienna (Austria) from 20 to 22 September 2019. He presented a paper written in collaboration with Dr. Marc Arnela, Prof. Francesc Alías and Dr. Joan Claudi Socoró within the framework of the GENIOVOX project.

The work aimed to add expressiveness to the 3D numerical synthesis of the vowel [a]. An LF (Liljencrants-Fant) model was used to generate a glottal source signal similar to the one produced by vocal fold vibration. The propagation of this signal through a realistic 3D vocal tract was then simulated with the finite element method (FEM).

A parallel speech corpus containing a neutral style and two tense voice emotional styles (aggressive and happy) was analysed using the GlottDNN vocoder, which decomposes the speech signal into a glottal source signal and a vocal tract response. The glottal source characteristics of the emotional styles were compared with those of neutral speech, and the differences were used to derive the LF model control parameters needed to approximate these styles.
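As the paper title indicates, the glottal source analysis focuses on spectral tilt. To give a rough idea of what such a measurement involves, here is a minimal Python sketch. It is not the procedure used in the paper: it builds a toy harmonic source instead of using GlottDNN output, and it estimates the tilt as the least-squares slope of the harmonic levels in dB per octave. All function names and the slope values (-12 and -6 dB/octave) are assumptions chosen only for illustration.

```python
import math


def synth_harmonic_source(f0, fs, duration, tilt_db_per_octave, n_harmonics):
    """Toy source: harmonics of f0 whose level falls by a fixed dB/octave slope."""
    n_samples = int(round(fs * duration))
    signal = [0.0] * n_samples
    for k in range(1, n_harmonics + 1):
        # Harmonic k lies log2(k) octaves above the fundamental.
        amp = 10 ** (tilt_db_per_octave * math.log2(k) / 20)
        w = 2 * math.pi * k * f0 / fs
        for n in range(n_samples):
            signal[n] += amp * math.sin(w * n)
    return signal


def harmonic_levels_db(signal, f0, fs, n_harmonics):
    """Level in dB of each harmonic, by projecting onto cos/sin at k * f0.

    Assumes the signal spans a whole number of f0 periods, so that the
    harmonics do not leak into each other.
    """
    n_samples = len(signal)
    levels = []
    for k in range(1, n_harmonics + 1):
        w = 2 * math.pi * k * f0 / fs
        re = sum(x * math.cos(w * n) for n, x in enumerate(signal))
        im = sum(x * math.sin(w * n) for n, x in enumerate(signal))
        amp = 2 * math.hypot(re, im) / n_samples
        levels.append(20 * math.log10(amp))
    return levels


def spectral_tilt_db_per_octave(levels_db):
    """Least-squares slope of harmonic level (dB) against log2(harmonic number)."""
    xs = [math.log2(k) for k in range(1, len(levels_db) + 1)]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(levels_db) / len(levels_db)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, levels_db))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Illustrative slopes only; they are not values reported in the paper.
    for label, slope in [("steeper tilt", -12.0), ("flatter tilt", -6.0)]:
        sig = synth_harmonic_source(100.0, 16000, 0.1, slope, 20)
        tilt = spectral_tilt_db_per_octave(harmonic_levels_db(sig, 100.0, 16000, 20))
        print(f"{label}: estimated tilt = {tilt:.2f} dB/octave")
```

In this toy setting the estimate should recover the slope used for synthesis. A flatter tilt means relatively more energy in the higher harmonics, and this is the kind of source difference that the LF model parameters can be adjusted to reflect.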

 

The reference of the paper is:

Marc Freixes, Marc Arnela, Francesc Alías, and Joan Claudi Socoró (2019); “GlottDNN-based spectral tilt analysis of tense voice emotional styles for the expressive 3D numerical synthesis of vowel [a]”. Proceedings of the 10th ISCA Speech Synthesis Workshop, 132-136, 20-22 September 2019, Vienna (Austria). DOI: 10.21437/SSW.2019-24.

Link: https://www.isca-speech.org/archive/SSW_2019/pdfs/SSW10_O_4-3.pdf

 

Figure 2 – Workflow diagram used for the analysis and comparison of expressive natural speech with respect to synthetic speech generated with a 3D FEM-based acoustic model that uses an LF model as glottal excitation.

This research has been supported by the Agencia Estatal de Investigación (AEI) and FEDER, EU, through project GENIOVOX TEC2016-81107-P. Prof. Francesc Alías also acknowledges the support from the Obra Social “La Caixa” for grant ref. 2018-URL-IR2nQ-029.

 

 
