Exploring Discrete Wavelet Transforms for Bimodal Speech Recognition

Branko Marković; Veljko Lončarević; Jovan Galić

doi:10.2298/SJEE250911005M

Published: Nov 18, 2025

Keywords:

Bimodal speech recognition, Discrete wavelet transformations, Daubechies, Symlets, Coiflets, Biorthogonal, Dynamic time warping.

Branko Marković

https://orcid.org/0000-0003-3924-307X

Veljko Lončarević

https://orcid.org/0009-0007-4296-2709

Jovan Galić

https://orcid.org/0000-0002-2487-7136

Abstract

Discrete Wavelet Transforms (DWTs) provide time–frequency representations that are well suited for nonstationary signals such as speech. This study presents a comparison of four wavelet families (Daubechies, Symlets, Coiflets, and Biorthogonal) for bimodal automatic speech recognition across two speech modes (normal and whispered). Experiments use the Whi-Spe database comprising ten speakers (five female and five male). A Dynamic Time Warping (DTW) back-end performs sequence alignment and recognition. Results are reported via summary tables, histograms, and confusion matrices and reveal systematic differences among the wavelet families, identifying the most effective transform for bimodal recognition. These findings provide practical guidance for selecting wavelet-based front ends in whisper-robust automatic speech recognition (ASR) systems.
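To illustrate the kind of pipeline the abstract describes (a wavelet front end followed by DTW template matching), here is a minimal sketch in Python. It is not the authors' implementation: the single-level Haar transform (equivalent to Daubechies-1) stands in for the wavelet families compared in the study, the absolute-difference local cost and the three-step DTW recursion are assumptions, and the names `haar_dwt`, `dtw_distance`, and `recognize` are hypothetical.

```python
import math


def haar_dwt(signal):
    """Single-level Haar (Daubechies-1) DWT: returns (approximation, detail)."""
    samples = list(signal)
    if len(samples) % 2:
        samples.append(samples[-1])  # pad odd-length input by repeating the last sample
    s = math.sqrt(2.0)
    approx = [(samples[i] + samples[i + 1]) / s for i in range(0, len(samples), 2)]
    detail = [(samples[i] - samples[i + 1]) / s for i in range(0, len(samples), 2)]
    return approx, detail


def dtw_distance(a, b):
    """Classic DTW with absolute-difference local cost (an assumed choice)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


def recognize(query, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))
```

In a setup like this, each reference word would be stored as a feature sequence derived from its wavelet coefficients, and an unknown utterance (normal or whispered) would be assigned the label of its nearest template. Real systems typically use multi-level decompositions and vector-valued frames rather than the scalar sequences shown here.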

Issue

Vol 22 No 3 (2025)

Section

Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
