A proposed method to improve the WER of an ASR system in the noisy reverberant room.

Authors :
Sadeghi, Mohammad Ebrahim
Sheikhzadeh, Hamid
Emadi, Mohammad Javad
Source :
Journal of the Franklin Institute. Jan2024, Vol. 361 Issue 1, p99-109. 11p.
Publication Year :
2024

Abstract

This paper proposes a novel approach to reducing the word error rate (WER) of an automatic speech recognition (ASR) system in a noisy reverberant room. This research utilizes the integration of beamforming, dereverberation, and ambisonic. Based on the demonstrated formula, the proposed system synthesizes the signal of desired points on the sphere surface from a combination of 32 signals of a uniform spherical microphone array (USMA). This method uses the non-parametric sound field reproduction technique in the spherical harmonics domain (SHD). Also, the suggested new geometry determines the place of the desired points. In addition to improving the dereverberation performance, the proposed method also improves the performance of the beamformer in terms of directivity factor (DF) and white noise gain (WNG). The results show that objective metrics such as PESQ are significantly improved, and the WER of the Kaldi and the WeNet ASR systems is reduced considerably.

• We propose a simplified formula to synthesize the sound field of a point in space.
• We present a new geometry to overcome the diffuse noise on the WPE algorithm.
• We propose a method to rotate the beam pattern of a fixed beamformer without deformation.
• The proposed approach improves speech quality in a noisy reverberant environment.
• This approach combines beamforming, dereverberation and sound field reproduction.

[ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0016-0032
Volume :
361
Issue :
1
Database :
Academic Search Index
Journal :
Journal of the Franklin Institute
Publication Type :
Periodical
Accession number :
174816043
Full Text :
https://doi.org/10.1016/j.jfranklin.2023.11.039