A Multifaceted Feature Extraction Approach for Noise-Robust Punjabi Spoken Digit Recognition System Under Low-Resource Conditions

Puneet Bawa, Virender Kadyan, Gunjan Chhabra

2024 11th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (2024)


Abstract

The development of optimal solutions for unfavourable acoustic conditions has become a crucial step in guaranteeing the durability and dependability of automatic speech recognition (ASR) systems across many real-world applications. This study aims to enhance the efficiency of low-resource ASR systems in scenarios where conventional systems exhibit sub-optimal performance. The proposed methodology targets Punjabi spoken digit recognition in noisy situations with limited resources, using multidimensional feature extraction and model adaptation methods. Initially, a baseline system is developed under clean environmental conditions. Subsequently, a singular enhancement of the spoken digit-based system is carried out, incorporating phonetically diverse and continuous Punjabi phrases. Moreover, research on noise augmentation demonstrates that the approaches used for feature extraction have a subtle but significant effect. The vulnerability of Mel-Frequency Cepstral Coefficients (MFCC) to noise is found to be remarkable; however, Gammatone Frequency Cepstral Coefficients (GFCC) and the combined feature extraction approach of MFCC and GFCC (MF-GFCC) exhibit excellent resistance. The multifaceted approach undergoes a battery of rigorous tests at various noise levels, aiming to simulate the conditions often seen in real-world situations. The efficacy of the suggested enhancements is validated using the Word Error Rate (WER) as the performance evaluation metric. The findings reveal a lower WER of 13.25% under high noise conditions and 13.30% under low noise conditions. These results not only provide valuable insights for improving speech recognition in challenging real-world situations, but also have ramifications for various languages whose limitations are comparable to those of English.
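As a point of reference for the metric reported above, WER is conventionally defined as (S + D + I) / N, where S, D, and I are the word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, and N is the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the English example transcripts are hypothetical and are not taken from the paper's Punjabi dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance divided by reference length.

    Hypothetical helper for illustration only; the paper's own scoring
    tool is not specified in the abstract.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein recurrence).
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref_words)


# Made-up digit sequences: one substitution out of four reference words.
example_wer = word_error_rate("one two three four", "one two five four")
```

Under this definition, a WER of 13.25% corresponds to roughly 13 word errors per 100 reference words.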


Keywords

Speech Recognition, Spoken Digit, Noisy environment, Low-resource conditions
