An experimental study on fundamental frequency detection in reverberated speech with pre-trained recurrent neural networks

Authors

Alfaro Picado, Andrei Fabian
Solís Cerdas, Stacy Daniela
Coto Jiménez, Marvin

Abstract

The detection of the fundamental frequency (f0) in speech signals is relevant in areas such as automatic speech recognition and identification, with potential applications in virtual assistants, assistive technology devices, and biomedical systems. The extraction of this parameter is known to degrade under adverse conditions, for example when reverberation or background noise is present. In this paper, we present a new method to improve the detection of f0 in reverberated speech signals, based on pre-trained Long Short-Term Memory (LSTM) neural networks. In previous works, LSTM networks have been trained starting from randomly initialized weights. We propose instead an initialization in the form of an auto-associative memory, which learns the identity function from non-reverberated data. The advantages of our proposal are shown using different objective quality measures, in particular in the detection of segments with and without f0.
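
The two-stage idea described in the abstract (first learn the identity function on clean data, then fine-tune on reverberated input) can be illustrated with a minimal sketch. This is not the authors' implementation: a small linear map stands in for the LSTM, the `smear` function is a hypothetical toy substitute for reverberation, and the data are random numbers rather than speech features. All names, sizes, and learning rates below are assumptions chosen for illustration only.

```python
import random

def matvec(W, x):
    """Multiply weight matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def mse(W, inputs, targets):
    """Mean over samples of the squared error between W @ x and the target."""
    total = 0.0
    for x, y in zip(inputs, targets):
        total += sum((p - t) ** 2 for p, t in zip(matvec(W, x), y))
    return total / len(inputs)

def train(W, inputs, targets, lr=0.1, epochs=200):
    """Batch gradient descent on the MSE; returns a new weight matrix."""
    W = [row[:] for row in W]  # copy so the caller's weights are untouched
    n = len(inputs)
    dim_out, dim_in = len(W), len(W[0])
    for _ in range(epochs):
        grad = [[0.0] * dim_in for _ in range(dim_out)]
        for x, y in zip(inputs, targets):
            pred = matvec(W, x)
            for i in range(dim_out):
                err = pred[i] - y[i]
                for j in range(dim_in):
                    grad[i][j] += 2.0 * err * x[j] / n
        for i in range(dim_out):
            for j in range(dim_in):
                W[i][j] -= lr * grad[i][j]
    return W

def smear(x, decay=0.5):
    """Toy distortion (assumption): each value leaks into the next one."""
    out, prev = [], 0.0
    for v in x:
        out.append(v + decay * prev)
        prev = v
    return out

rng = random.Random(0)
dim = 3
# Hypothetical "clean" feature vectors and their distorted counterparts.
clean = [[rng.random() for _ in range(dim)] for _ in range(50)]
reverberant = [smear(x) for x in clean]

# Conventional starting point: small random weights.
random_init = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]

# Stage 1: auto-associative pre-training (clean -> clean, identity function).
pretrained = train(random_init, clean, clean)

# Stage 2: fine-tune from the pre-trained weights (reverberant -> clean).
finetuned = train(pretrained, reverberant, clean, epochs=20)

# Baseline: same fine-tuning budget, starting from the random weights.
baseline = train(random_init, reverberant, clean, epochs=20)

print("identity loss before/after stage 1:",
      mse(random_init, clean, clean), mse(pretrained, clean, clean))
print("enhancement loss, pre-trained init:", mse(finetuned, reverberant, clean))
print("enhancement loss, random init:     ", mse(baseline, reverberant, clean))
```

Comparing the last two printed values shows how the two starting points behave under the same fine-tuning budget in this toy setting; it says nothing about the results reported in the paper, which are obtained with LSTM networks on real speech data.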

Description

Part of the Communications in Computer and Information Science book series (CCIS, volume 1087).

Keywords

Deep learning, Fundamental frequency, Long short-term memory (LSTM), Reverberation

Citation

https://link.springer.com/chapter/10.1007/978-3-030-41005-6_24
