GTM participates in the IberSPEECH2020 conference
Although initially planned as an in-person event in Valladolid, Spain, in November 2020, the IberSPEECH2020 conference finally took place online on March 24 and 25, 2021, as a consequence of the COVID-19 pandemic. The conference included two keynotes, five oral sessions, the presentation of the results of the latest edition of the Albayzín Challenges, and two special sessions: one on R&D projects and another on recent PhD theses in speech and language technologies. The Spanish Thematic Network on Speech Technology (Red Temática en Tecnologías del Habla, RTTH) met during the conference, after becoming an Association. Prof. Francesc Alías, Dr. Marc Arnela, and Marc Freixes attended the meeting as the current GTM representatives in this association. These GTM researchers also attended the conference, where they chaired one session and presented two papers:
- Prof. Francesc Alías chaired the “R&D Projects” session and presented a paper in that session about the main results of the GENIOVOX Project (“Computational synthesis of expressive voice”) developed in collaboration with Prof. Oriol Guasch, Dr. Joan Claudi Socoró, Dr. Marc Arnela, Marc Freixes and Dr. Arnau Pont.
This project, funded by the Ministerio de Economía, Industria y Competitividad (Plan Nacional de I+D Excelencia, ref. TEC2016-81107-P), was carried out in the period 2016-2019. Its two main computational objectives were the simulation of diphthongs and hiatuses in three-dimensional (3D) geometries using the finite element method (FEM), and the development of new techniques to simulate fricative consonants without having to rely on supercomputing facilities. Moreover, the investigations included a first attempt to incorporate expressive effects in the numerical simulations by modifying the 3D vocal tract geometry and the glottal source model. Specifically, vowel sounds were computationally generated by convolving the impulse response of 3D FEM vocal tracts with glottal pulses that incorporated tense, neutral and lax phonations from pre-recorded expressive speech corpora.
Reference and link to the paper:
Oriol Guasch, Francesc Alías, Marc Arnela, Joan Claudi Socoró, Marc Freixes and Arnau Pont (2021); "GENIOVOX Project: Computational generation of expressive voice", Proc. of IberSPEECH2021, pp. 151-154, 24-25 March 2021, Valladolid, Spain [pdf]
The authors acknowledge the Agencia Estatal de Investigación (AEI) and FEDER, EU, for funding the project GENIOVOX (ref. TEC2016-81107-P).
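As an illustration of the convolution step mentioned in the GENIOVOX summary above, the following minimal Python sketch convolves a synthetic glottal pulse train with a vocal tract impulse response. It is not the project's code: the Rosenberg-style pulse, the single damped resonance standing in for a 3D FEM impulse response, and all names and parameter values (`glottal_pulse`, `resonator_impulse_response`, the open and closing phase lengths used for "tense" and "lax") are assumptions chosen only to make the idea concrete.

```python
import math


def glottal_pulse(n_open, n_close):
    """Rosenberg-style pulse: raised-cosine opening, quarter-cosine closing."""
    opening = [0.5 * (1.0 - math.cos(math.pi * n / n_open)) for n in range(n_open)]
    closing = [math.cos(0.5 * math.pi * n / n_close) for n in range(n_close)]
    return opening + closing


def pulse_train(pulse, period, n_periods):
    """Repeat one pulse every `period` samples, with zeros in the closed phase."""
    if len(pulse) > period:
        raise ValueError("pulse must fit inside one period")
    train = []
    for _ in range(n_periods):
        train.extend(pulse + [0.0] * (period - len(pulse)))
    return train


def resonator_impulse_response(freq_hz, bandwidth_hz, fs, length):
    """Damped sinusoid: a crude stand-in for a simulated vocal tract response."""
    decay = math.pi * bandwidth_hz / fs
    omega = 2.0 * math.pi * freq_hz / fs
    return [math.exp(-decay * n) * math.sin(omega * n) for n in range(length)]


def convolve(x, h):
    """Direct-form linear convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        if xi == 0.0:
            continue  # closed-phase samples contribute nothing
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


FS = 16000        # sampling rate in Hz (assumed)
PERIOD = 160      # samples per glottal cycle, i.e. a 100 Hz pitch (assumed)

# Hypothetical phonation settings: a shorter open phase for "tense",
# a longer one for "lax". These values are illustrative only.
tense_source = pulse_train(glottal_pulse(40, 16), PERIOD, 5)
lax_source = pulse_train(glottal_pulse(80, 40), PERIOD, 5)

# One resonance near 700 Hz as a placeholder for the vocal tract response.
vt_response = resonator_impulse_response(700.0, 100.0, FS, 400)

vowel_tense = convolve(tense_source, vt_response)
vowel_lax = convolve(lax_source, vt_response)
```

In this sketch the same vocal tract response is reused for both phonation types, so any difference between `vowel_tense` and `vowel_lax` comes from the glottal source alone; changing `vt_response` instead would mimic a change in vocal tract geometry.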
- Marc Freixes presented a paper developed in collaboration with Prof. Francesc Alías and Dr. Joan Claudi Socoró in the framework of the Aniveu project. The motivation behind this work is the numerical simulation of expressive voice. As a step towards this goal, the paper analyses the contribution of vocal tract (VT) and glottal source spectral (GSS) cues to the production of happy and aggressive vowels with respect to neutral vowels. A parallel speech corpus containing neutral speech and these two tense-voice emotional styles was analysed using the GlottDNN vocoder, which decomposes the speech signal into a glottal source signal and a vocal tract response (see Figure 1). The prosody of the expressive utterances was transplanted onto their neutral counterparts, which were then resynthesised with the expressive GSS and/or VT as well. Objective and subjective evaluations show that both GSS and VT contribute significantly to conveying the studied expressive styles, with the effect of the VT prevailing over that of the GSS. Incorporating both GSS and VT increases the perceived emotional intensity by 55.3% for happy and by 62.8% for aggressive compared with the baseline, which only considers the expressive prosody.
Figure 1. Diagram of the framework proposed to study the contribution of vocal tract and glottal source spectral cues in the generation of tense voice expressive vowels
Reference and link to the paper:
Marc Freixes, Francesc Alías and Joan Claudi Socoró (2021); “Contribution of vocal tract and glottal source spectral cues in the generation of happy and aggressive [a] vowels”, Proc. of IberSPEECH2021, pp. 240-244, 24-25 March 2021, Valladolid, Spain [pdf]
The research that has led to the results reported in this work has been funded by the SUR/DEC from the Government of Catalonia and the Ramon Llull University (ref. 2020-URL-Proj-056). The authors would also like to thank the participants in the perceptual test for their collaboration in this work.
Finally, it is also worth mentioning that Prof. Francesc Alías collaborated in the organization of the conference as General Co-Chair and Publication Co-Chair. He was also a member of the “Best Paper Award” and “Best PhD Award” committees, and serves as one of the Guest Editors of the upcoming "IberSPEECH 2020: Speech and Language Technologies for Iberian Languages" Special Issue of the Applied Sciences journal.