
Reconstructing fundamental frequency from noisy speech using initialized autoencoders

dc.creator: Zeledón Córdoba, Marisol
dc.creator: Sánchez Solís, Joseline
dc.creator: Coto Jiménez, Marvin
dc.date.accessioned: 2022-03-22T16:20:29Z
dc.date.available: 2022-03-22T16:20:29Z
dc.date.issued: 2020-10
dc.description.abstract: In this paper, we present a new approach for fundamental frequency (f0) detection in noisy speech, based on Long Short-term Memory Neural Networks (LSTM). f0 is one of the most important parameters of human speech. Its detection is relevant in many speech signal processing areas and remains an important challenge for severely degraded signals. In previous references for f0 detection in speech enhancement and noise reduction tasks, LSTM has been initialized with random weights, following a back-propagation through time algorithm to adjust them. Our proposal is an alternative for a more efficient initialization, based on the weights of an Autoassociative network. This initialization is a better starting point for the f0 detection in noisy speech. We show the advantages of pre-training using objective measures for the parameter and the training process, with artificial and natural noise added at different signal-to-noise levels. Results show the performance of the LSTM increases in comparison to the random initialization, and represents a significant improvement in comparison with traditional initialization of neural networks for f0 detection in noisy conditions. [es_ES]
dc.description.procedence: UCR::Vicerrectoría de Docencia::Ingeniería::Facultad de Ingeniería::Escuela de Ingeniería Eléctrica [es_ES]
dc.description.sponsorship: Universidad de Costa Rica/[322-B9-105]/UCR/Costa Rica [es_ES]
dc.identifier.citation: https://ieeexplore.ieee.org/abstract/document/9387643 [es_ES]
dc.identifier.codproyecto: 322-B9-105
dc.identifier.doi: 10.1109/TLA.2020.9387643
dc.identifier.issn: 1548-0992
dc.identifier.uri: https://hdl.handle.net/10669/86261
dc.language.iso: eng [es_ES]
dc.source: IEEE Latin America Transactions, vol.18(10), pp.1724-1731. [es_ES]
dc.subject: Deep learning [es_ES]
dc.subject: Fundamental frequency [es_ES]
dc.subject: Long short-term memory (LSTM) [es_ES]
dc.subject: NEURAL NETWORKS [es_ES]
dc.title: Reconstructing fundamental frequency from noisy speech using initialized autoencoders [es_ES]
dc.type: original article [es_ES]
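
The abstract mentions noise added to speech at different signal-to-noise levels. This record does not describe how the authors mixed the signals, so the following is only a minimal sketch of the common textbook definition, SNR_dB = 10 log10(P_clean / P_noise). The function names, the 120 Hz test tone, the 16 kHz sample rate, and the SNR values are all hypothetical choices made for illustration, not taken from the paper.

```python
import math
import random


def power(signal):
    """Mean power: the average of the squared samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so that the mixture has the requested SNR (dB)."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_power = power(noise)
    if noise_power == 0:
        raise ValueError("noise must not be silent")
    # SNR_dB = 10 * log10(P_clean / P_scaled_noise); solve for the noise gain.
    target_noise_power = power(clean) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR of a mixture, computed against the clean reference signal."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(power(clean) / power(residual))


# Hypothetical stand-in for voiced speech: one second of a 120 Hz tone at 16 kHz.
rng = random.Random(0)
sample_rate = 16000
clean = [math.sin(2 * math.pi * 120 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

# Illustrative SNR levels only; the levels used in the paper are not listed here.
for snr_db in (-10, 0, 10):
    noisy = mix_at_snr(clean, noise, snr_db)
    print(snr_db, round(measured_snr_db(clean, noisy), 3))
```

Scaling the noise rather than the speech keeps the clean reference unchanged across SNR levels, which is convenient when the same target f0 contour is compared against several degraded versions.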

Files

Original bundle

Name: IEEE_Latam.pdf
Size: 953.82 KB
Format: Adobe Portable Document Format
Description: Main article

License bundle

Name: license.txt
Size: 3.5 KB
Format: Item-specific license agreed upon to submission
Description: