LSTM deep neural networks postfiltering for improving the quality of synthetic voices

Coto Jiménez, Marvin; Goddard Close, John

dc.creator	Coto Jiménez, Marvin
dc.creator	Goddard Close, John
dc.date.accessioned	2022-03-25T20:06:21Z
dc.date.available	2022-03-25T20:06:21Z
dc.date.issued	2016
dc.description	Part of the Lecture Notes in Computer Science book series (LNCS, volume 9703).	es_ES
dc.description.abstract	Recent developments in speech synthesis have produced systems capable of providing intelligible speech, and researchers now strive to create models that more accurately mimic human voices. One such development is the incorporation of multiple linguistic styles in various languages and accents. HMM-based speech synthesis is of great interest to researchers, due to its ability to produce sophisticated features with a small footprint. Despite such progress, its quality has not yet reached the level of the currently predominant unit-selection approaches, which select and concatenate recordings of real speech. Recent efforts have been made in the direction of improving HMM-based systems. In this paper, we present the application of long short-term memory deep neural networks as a postfiltering step in HMM-based speech synthesis. Our motivation stems from a desire to obtain spectral characteristics closer to those of natural speech. The results described in the paper indicate that HMM-based voices can be improved using this approach.	es_ES
dc.description.procedence	UCR::Vicerrectoría de Docencia::Ingeniería::Facultad de Ingeniería::Escuela de Ingeniería Eléctrica	es_ES
dc.description.sponsorship	Universidad de Costa Rica/[]/UCR/Costa Rica	es_ES
dc.description.sponsorship	Consejo Nacional de Ciencia y Tecnología/[CB-2012-01, No.182432]/CONACyT/México	es_ES
dc.identifier.citation	https://link.springer.com/chapter/10.1007/978-3-319-39393-3_28	es_ES
dc.identifier.doi	10.1007/978-3-319-39393-3_28
dc.identifier.isbn	978-3-319-39393-3
dc.identifier.uri	https://hdl.handle.net/10669/86292
dc.language.iso	eng	es_ES
dc.source	Pattern Recognition (pp. 280-289). Guanajuato, Mexico: Springer, Cham	es_ES
dc.subject	Long short-term memory (LSTM)	es_ES
dc.subject	Hidden Markov Models (HMM)	es_ES
dc.subject	Speech synthesis	es_ES
dc.subject	Statistical parametric speech synthesis	es_ES
dc.subject	Postfiltering	es_ES
dc.subject	Deep learning	es_ES
dc.title	LSTM deep neural networks postfiltering for improving the quality of synthetic voices	es_ES
dc.type	conference paper	es_ES

Files

Original bundle

Name: HMM2.pdf
Size: 2.38 MB
Format: Adobe Portable Document Format

Collections

Ingeniería eléctrica