Multilevel structured convolution neural network for speech keyword location and recognition: MSS‐Net
- Source :
- The Journal of Engineering, Vol 2021, Iss 10, Pp 582-593 (2021)
- Publication Year :
- 2021
- Publisher :
- Wiley, 2021.
Abstract
- Language keyword extraction has broad application prospects. Most current keyword search networks are based on speech recognition followed by text search. This approach fails for languages that are difficult to transcribe as text, and it cannot locate where the keywords appear in the audio. This paper proposes a multi‐level speech keyword location and recognition system (MSS‐Net) that locates the interval in which a keyword occurs while recognizing the keyword. MSS‐Net contains three levels. The first level is the voice start and end detection module, which detects the start and end positions of valid audio. When valid audio is detected, the second level, the keyword detection module, is activated; it detects keyword information in the audio signal and locates the keyword position interval. The third level is the keyword recognition module, which recognizes the audio in the detected position interval to obtain the specific keyword. Comparative experiments show that MSS‐Net locates keywords more accurately, with a network recall rate of 94.91% and an accuracy rate of 96.32%, both better than the other network models in the experiment. In addition, the network has strong anti‐interference ability and suppresses everyday background noise well. An ablation experiment verified the effect of the hierarchical network structure on the performance of speech recognition tasks.
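The three-level control flow described in the abstract can be sketched as a simple cascade. This is only an illustration under stated assumptions, not the MSS‐Net implementation: the first level is replaced by a plain frame-energy threshold, and the second and third levels are passed in as hypothetical callables (`detect_keywords`, `recognize`) standing in for the paper's convolutional modules. All function names, parameters, and threshold values are invented for the example.

```python
from typing import Callable, List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end) sample indices, end exclusive


def detect_valid_audio(samples: List[float], frame_len: int = 4,
                       threshold: float = 0.1) -> Optional[Interval]:
    """Level 1 stand-in: energy-threshold endpoint detection.

    Assumption: a frame counts as valid audio when its mean energy exceeds
    `threshold`. Returns the span from the first to the last such frame,
    or None if no frame qualifies.
    """
    active = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy > threshold:
            active.append((start, start + len(frame)))
    if not active:
        return None
    return active[0][0], active[-1][1]


def run_cascade(samples: List[float],
                detect_keywords: Callable[[List[float]], List[Interval]],
                recognize: Callable[[List[float]], str],
                ) -> List[Tuple[Interval, str]]:
    """Run levels 2 and 3 only if level 1 finds valid audio."""
    span = detect_valid_audio(samples)
    if span is None:
        return []  # later levels are never activated on silence
    lo, hi = span
    results = []
    # Level 2 (hypothetical callable): intervals relative to the valid span.
    for s, e in detect_keywords(samples[lo:hi]):
        # Level 3 (hypothetical callable): label the audio in that interval.
        label = recognize(samples[lo + s:lo + e])
        results.append(((lo + s, lo + e), label))
    return results
```

With stub callables such as `lambda seg: [(2, 6)]` for detection and `lambda seg: "kw"` for recognition, `run_cascade` returns keyword intervals in the coordinates of the original signal, and returns an empty list when the input is silent.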
Details
- Language :
- English
- ISSN :
- 2051-3305
- Volume :
- 2021
- Issue :
- 10
- Database :
- OpenAIRE
- Journal :
- The Journal of Engineering
- Accession number :
- edsair.doi.dedup.....e56658c80b1d4f16c26c2860529618e5