An entropy-based corpus method for improving keyword extraction: An example of sustainability corpus.
- Source :
- Engineering Applications of Artificial Intelligence. Jul2024:Part A, Vol. 133, pN.PAG-N.PAG. 1p.
- Publication Year :
- 2024
Abstract
- Natural language processing (NLP), a subfield of artificial intelligence (AI), has increasingly influenced corpus-based methods, in which keyword extraction is an integral technique that often relies on complex NLP algorithms or models. With growing concern for sustainability issues, keyword extraction affects information acquisition in decision-making and policy development. However, traditional corpus-based keyword extraction methods have limitations, such as the inability to automatically exclude meaningless words, evaluate the relative importance of keyword parameters, and integrate parameters for comprehensive keyword evaluation. To address these limitations, this paper proposes an entropy-based corpus method. The proposed method first optimizes the keyword list by excluding function and generic words using a machine-based technique (word types decrease by 5.76%; total words decrease by 72.2%). Second, it calculates the objective weights of the log-likelihood (0.5518), frequency (0.4048), and range (0.0433) parameters to define their relative importance, facilitating parameter integration before evaluating keyword importance. Third, it calculates the aggregated value of each keyword to assess its level of importance. As a result, it streamlines the manual word selection process and comprehensively evaluates the importance of keywords. Compared with four traditional methods, the keyword extraction results of the proposed method, which account for only 1.77% of the original list, better reflect the linguistic patterns of the target corpus, potentially facilitating future corpus-based keyword analysis research. [ABSTRACT FROM AUTHOR]
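The weighting and aggregation steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so this assumes the standard Shannon entropy weight method for the objective weights and a min-max normalized weighted sum for the aggregated value; the paper's actual procedure may differ. The function names and the four-keyword toy table are hypothetical and are not data from the study.

```python
import math

def entropy_weights(matrix):
    """Objective weight per column, using the standard entropy weight method (assumed).

    Assumes non-negative values, at least two rows, a positive column sum,
    and at least one column that is not perfectly uniform.
    """
    n = len(matrix)                # number of keywords (rows)
    m = len(matrix[0])             # number of parameters (columns)
    k = 1.0 / math.log(n)          # scales each column entropy into [0, 1]
    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [v / total for v in column]
        entropy = -k * sum(p * math.log(p) for p in shares if p > 0)
        divergences.append(1.0 - entropy)   # more dispersion, more weight
    norm = sum(divergences)
    return [d / norm for d in divergences]

def min_max(matrix):
    """Rescale every column to [0, 1]; a constant column becomes all zeros."""
    columns = list(zip(*matrix))
    scaled = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(row) for row in zip(*scaled)]

# Hypothetical toy table: keyword -> (log-likelihood, frequency, range)
table = {
    "sustainability": (950.0, 420, 30),
    "emission": (610.0, 300, 28),
    "policy": (120.0, 150, 29),
    "report": (15.0, 90, 30),
}

words = list(table)
matrix = [list(table[w]) for w in words]

weights = entropy_weights(matrix)
scaled = min_max(matrix)

# Aggregated value: weighted sum of the normalized parameters (assumed form)
scores = {
    w: sum(wt * v for wt, v in zip(weights, row))
    for w, row in zip(words, scaled)
}
ranking = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

for word, score in ranking:
    print(f"{word}: {score:.4f}")
```

In this toy table the range column is nearly uniform across keywords, so it carries little information and receives the smallest weight. That is the same qualitative ordering as the weights reported in the abstract (log-likelihood, then frequency, then range), although the numbers here are illustrative only.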
Details
- Language :
- English
- ISSN :
- 09521976
- Volume :
- 133
- Database :
- Academic Search Index
- Journal :
- Engineering Applications of Artificial Intelligence
- Publication Type :
- Academic Journal
- Accession number :
- 177605425
- Full Text :
- https://doi.org/10.1016/j.engappai.2024.108049