1. Text Classification Based on Latent Semantic Indexing and Graph Embedding
- Author
Zhilong Zhen and Juxiao Zhang
- Subjects
0209 industrial biotechnology; Information retrieval; Semantic feature; Computer science; Graph embedding; 02 engineering and technology; Graph; ComputingMethodologies_PATTERNRECOGNITION; 020901 industrial engineering & automation; Semantic similarity; 0202 electrical engineering, electronic engineering, information engineering; Graph (abstract data type); 020201 artificial intelligence & image processing; Latent semantic indexing
- Abstract
Text classification has become a key technology for processing and organizing large collections of documents. To improve classification performance, we propose a method based on latent semantic indexing and graph embedding. The method first transforms the original document space into a semantic feature space by means of latent semantic indexing. Graph embedding is then applied in this space to construct an adjacency graph that reflects semantic similarity between documents, together with its complementary graph, which reflects non-neighboring relationships. In this way the proposed method aims to capture both the global and the local structure of the documents. Experimental results with a $k$-nearest-neighbor classifier on the 20-Newsgroups dataset show that the method can improve text classification performance.
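The graph construction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the similarity measure, the neighborhood size, or the edge weights, so cosine similarity, binary weights, and the helper names `cosine` and `knn_graphs` are all assumptions. The short input vectors are stand-ins for documents already projected into the semantic feature space.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_graphs(vectors, k):
    """Return (adjacency, complement) as symmetric 0/1 matrices.

    adjacency[i][j] = 1 if j is among the k most similar documents to i,
    or i is among the k most similar to j. The complement links every
    other pair of distinct documents, i.e. the non-neighbors.
    """
    n = len(vectors)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        # Rank all other documents by similarity to document i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine(vectors[i], vectors[j]),
            reverse=True,
        )
        for j in others[:k]:
            adj[i][j] = adj[j][i] = 1  # symmetrize the neighbor relation
    comp = [[int(i != j and adj[i][j] == 0) for j in range(n)] for i in range(n)]
    return adj, comp

# Hypothetical 2-dimensional semantic vectors for four documents.
docs = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
adj, comp = knn_graphs(docs, k=1)
```

With these toy vectors the first two documents become neighbors of each other, as do the last two, while the complementary graph connects each document to the two documents of the other pair.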
- Published
- 2018