Domain-Specific Chinese Word Segmentation Based on Bi-Directional Long-Short Term Memory Model

Authors :
Yan Xiang
Dangguo Shao
Zhaoqiang Yang
Na Zheng
Zhengtao Yu
Zhenhua Chen
Yantuan Xian
Source :
IEEE Access, Vol 7, Pp 12993-13002 (2019)
Publication Year :
2019
Publisher :
IEEE, 2019.

Abstract

Most current word segmentation methods are rule-based or rely on traditional machine learning. General-purpose word segmentation tools perform poorly in specialized fields such as metallurgy, and domain-specific Chinese word segmentation has rarely been studied. In recent years, with the development of deep learning, neural networks have proved effective for Chinese word segmentation. However, this performance relies on large-scale training data: neural networks with conventional architectures cannot achieve the desired results on low-resource datasets because labeled training data are lacking. Taking metallurgy as an example, this paper proposes a domain-specific Chinese word segmentation method based on a bi-directional long short-term memory (Bi-directional LSTM) model. First, the word segmentation model is obtained by training the Bi-directional LSTM model on knowledge from inside and outside the domain. Then, a series of parameters is tuned, and the label probability of each word is combined with a weight. Finally, the segmentation result is obtained through a label inference layer. The experimental results show that the proposed method achieves better word segmentation in the metallurgical field.
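
The last two steps of the abstract (weighting label probabilities, then inferring labels) can be illustrated with a minimal sketch. The abstract does not give the tag scheme, the weighting formula, or the inference algorithm, so everything here is an assumption made for illustration: a B/M/E/S character tagging scheme, a linear interpolation with a hypothetical weight `w` between in-domain and out-of-domain probabilities, and Viterbi decoding with hard transition constraints. The function names `combine`, `viterbi`, and `tags_to_words` are likewise hypothetical, and the probabilities would in practice come from a trained Bi-directional LSTM rather than being written by hand.

```python
import math

TAGS = ["B", "M", "E", "S"]  # assumed scheme: Begin, Middle, End, Single

# Tags allowed to follow each tag under the assumed B/M/E/S scheme.
ALLOWED = {
    "B": {"M", "E"},
    "M": {"M", "E"},
    "E": {"B", "S"},
    "S": {"B", "S"},
}
START = {"B", "S"}  # a sentence can only start a word
END = {"E", "S"}    # a sentence can only end a word


def combine(p_in, p_out, w):
    """Linearly interpolate per-character tag probabilities (assumed formula)."""
    return [
        {t: w * a[t] + (1.0 - w) * b[t] for t in TAGS}
        for a, b in zip(p_in, p_out)
    ]


def viterbi(probs):
    """Return the most probable tag sequence that respects ALLOWED."""
    n = len(probs)
    if n == 0:
        return []
    neg_inf = float("-inf")
    score = [{t: neg_inf for t in TAGS} for _ in range(n)]
    back = [{t: None for t in TAGS} for _ in range(n)]
    for t in START:
        score[0][t] = math.log(probs[0][t] + 1e-12)
    for i in range(1, n):
        for t in TAGS:
            best_prev, best = None, neg_inf
            for p in TAGS:
                if t in ALLOWED[p] and score[i - 1][p] > best:
                    best_prev, best = p, score[i - 1][p]
            if best_prev is not None:
                score[i][t] = best + math.log(probs[i][t] + 1e-12)
                back[i][t] = best_prev
    # Pick the best legal final tag, then follow the back-pointers.
    tags = [max(END, key=lambda t: score[n - 1][t])]
    for i in range(n - 1, 0, -1):
        tags.append(back[i][tags[-1]])
    tags.reverse()
    return tags


def tags_to_words(chars, tags):
    """Cut the character sequence after every E or S tag."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:
        words.append(current)
    return words
```

With hand-written probabilities for the four characters of 高炉炼铁, calling `tags_to_words(chars, viterbi(combine(p_in, p_out, w)))` returns the segmentation favored by whichever source the weight `w` emphasizes. The hard transition constraints guarantee that the decoded tags always form complete words, which per-character argmax alone does not.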

Details

Language :
English
ISSN :
2169-3536
Volume :
7
Database :
OpenAIRE
Journal :
IEEE Access
Accession number :
edsair.doi.dedup.....7dba26a3835fbca2402d799b7169b9c7