LSTM deep neural networks postfiltering for improving the quality of synthetic voices
dc.creator | Coto Jiménez, Marvin | |
dc.creator | Goddard Close, John | |
dc.date.accessioned | 2022-03-25T20:06:21Z | |
dc.date.available | 2022-03-25T20:06:21Z | |
dc.date.issued | 2016 | |
dc.description | Part of the Lecture Notes in Computer Science book series (LNCS, volume 9703). | es_ES |
dc.description.abstract | Recent developments in speech synthesis have produced systems capable of providing intelligible speech, and researchers now strive to create models that more accurately mimic human voices. One such development is the incorporation of multiple linguistic styles in various languages and accents. HMM-based speech synthesis is of great interest to researchers, due to its ability to produce sophisticated features with a small footprint. Despite such progress, its quality has not yet reached the level of the currently predominant unit-selection approaches, which select and concatenate recordings of real speech. Recent efforts have been made in the direction of improving HMM-based systems. In this paper, we present the application of long short-term memory deep neural networks as a postfiltering step in HMM-based speech synthesis. Our motivation stems from a desire to obtain spectral characteristics closer to those of natural speech. The results described in the paper indicate that HMM-based voices can be improved using this approach. | es_ES |
dc.description.procedence | UCR::Vicerrectoría de Docencia::Ingeniería::Facultad de Ingeniería::Escuela de Ingeniería Eléctrica | es_ES |
dc.description.sponsorship | Universidad de Costa Rica/[]/UCR/Costa Rica | es_ES |
dc.description.sponsorship | Consejo Nacional de Ciencia y Tecnología/[CB-2012-01, No.182432]/CONACyT/México | es_ES |
dc.identifier.citation | https://link.springer.com/chapter/10.1007/978-3-319-39393-3_28 | es_ES |
dc.identifier.doi | 10.1007/978-3-319-39393-3_28 | |
dc.identifier.isbn | 978-3-319-39393-3 | |
dc.identifier.uri | https://hdl.handle.net/10669/86292 | |
dc.language.iso | eng | es_ES |
dc.source | Pattern Recognition (pp. 280-289). Guanajuato, Mexico: Springer, Cham | es_ES |
dc.subject | Long short-term memory (LSTM) | es_ES |
dc.subject | Hidden Markov Models (HMM) | es_ES |
dc.subject | Speech synthesis | es_ES |
dc.subject | Statistical parametric speech synthesis | es_ES |
dc.subject | Postfiltering | es_ES |
dc.subject | Deep learning | es_ES |
dc.title | LSTM deep neural networks postfiltering for improving the quality of synthetic voices | es_ES |
dc.type | conference paper | es_ES |