ENHANCING SINGLE CHANNEL SPEECH PROCESSING WITH ADVANCED NOISE REDUCTION TECHNIQUES
Author Name: Shreyaa V and Deepa D

Abstract: This study evaluates how effectively a CNN-based U-Net architecture enhances speech signals in the presence of background noise. The primary aim is to quantify improvements in speech clarity and intelligibility for children who have difficulty perceiving speech in urban noise. The analysis used a dataset that combines clean speech from the RAVDESS dataset with urban noise from UrbanSound8K. The audio samples were transformed with the Short-Time Fourier Transform (STFT) and fed into a U-Net model. Objective metrics, namely signal-to-noise ratio (SNR), Itakura-Saito distance, root-mean-square error (RMSE), and short-time objective intelligibility (STOI), were computed to assess the enhancement, and the results were compared with traditional noise reduction methods such as Wiener filtering and spectral subtraction. The STOI score rose from 0.71 to 0.83, indicating a marked improvement in speech intelligibility. Subjective evaluation with the Mean Opinion Score (MOS) showed an overall positive perception of the enhanced audio quality, supporting the objective finding that the U-Net model reduces noise and improves speech clarity. The results indicate that the CNN-based U-Net architecture significantly improves speech quality in noisy environments compared to traditional methods. These findings suggest potential applications in hearing aids and other audio processing technologies that aim to improve communication in challenging acoustic settings.
Key Words: Speech enhancement, CNN, U-Net, SNR, noise reduction, urban noise, audio processing, hearing aids, intelligibility, machine learning.

Published On: 2024-11-25
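
The abstract names SNR and RMSE among its objective metrics but does not state the formulas used. As an illustration only, the sketch below assumes the common time-domain definitions, where the error is the difference between the clean reference and the processed signal. The function names `snr_db` and `rmse`, the synthetic sine-plus-noise signals, and the 16 kHz sample rate are all hypothetical choices for this example and are not taken from the study.

```python
import numpy as np


def snr_db(clean, estimate):
    """SNR in dB, treating (clean - estimate) as the residual noise.

    Assumed definition: 10 * log10(sum(clean^2) / sum((clean - estimate)^2)).
    """
    clean = np.asarray(clean, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    residual_power = np.sum((clean - estimate) ** 2)
    if residual_power == 0.0:
        return float("inf")  # estimate is identical to the reference
    return float(10.0 * np.log10(np.sum(clean ** 2) / residual_power))


def rmse(clean, estimate):
    """Root-mean-square error between the reference and the estimate."""
    clean = np.asarray(clean, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.sqrt(np.mean((clean - estimate) ** 2)))


# Synthetic stand-ins (not the RAVDESS / UrbanSound8K data from the study).
sample_rate = 16000  # assumed sample rate for the illustration
t = np.arange(sample_rate) / sample_rate
rng = np.random.default_rng(0)

clean = np.sin(2.0 * np.pi * 220.0 * t)       # one second of a 220 Hz tone
noise = 0.3 * rng.standard_normal(t.shape)    # white noise as a placeholder
noisy = clean + noise
# Placeholder for an enhanced signal: the residual noise amplitude is halved.
enhanced = clean + 0.5 * noise

snr_gain = snr_db(clean, enhanced) - snr_db(clean, noisy)
print(f"SNR noisy:    {snr_db(clean, noisy):.2f} dB")
print(f"SNR enhanced: {snr_db(clean, enhanced):.2f} dB")
print(f"SNR gain:     {snr_gain:.2f} dB")
print(f"RMSE noisy:    {rmse(clean, noisy):.4f}")
print(f"RMSE enhanced: {rmse(clean, enhanced):.4f}")
```

Because the placeholder halves the residual noise amplitude, the SNR gain is 20·log10(2), about 6.02 dB, and the RMSE is exactly halved. In an actual evaluation, `enhanced` would be the time-domain output of the enhancement model, aligned sample-for-sample with the clean reference.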