An entropy-based corpus method for improving keyword extraction: An example of sustainability corpus.

Authors :
Chen, Liang-Ching
Chang, Kuei-Hu
Source :
Engineering Applications of Artificial Intelligence. Jul2024:Part A, Vol. 133, pN.PAG-N.PAG. 1p.
Publication Year :
2024

Abstract

Natural language processing (NLP), a subfield of artificial intelligence (AI), has progressively influenced corpus-based methods, with keyword extraction often relying on complex NLP algorithms or models as an integral technique within corpus-based methods. With growing concern for sustainability issues, keyword extraction affects information acquisition in decision-making and policy development. However, traditional corpus-based keyword extraction methods have limitations, such as the inability to automatically exclude meaningless words, evaluate the relative importance of keyword parameters, and integrate parameters for comprehensive keyword evaluation. To address these limitations, this paper proposes an entropy-based corpus method. The proposed method first optimizes the keyword list by excluding function and generic words using a machine-based technique (word types decrease by 5.76%; total words decrease by 72.2%). Second, it calculates the objective weights of the log-likelihood (0.5518), frequency (0.4048), and range (0.0433) parameters to define their relative importance, facilitating parameter integration before evaluating keyword importance. Then, it calculates the aggregated value of each keyword to assess its level of importance. As a result, it streamlines the manual word selection process and comprehensively evaluates the importance of keywords. Compared with four traditional methods, the keyword extraction results of the proposed method, which account for only 1.77% of the original list, better reflect the linguistic patterns of the target corpus, potentially facilitating future corpus-based keyword analysis research. [ABSTRACT FROM AUTHOR]
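
The weighting and aggregation steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so this assumes the textbook Shannon entropy weight method followed by a weighted sum of max-normalized parameter values; the paper's exact procedure may differ. The function names `entropy_weights` and `aggregate_scores` and the sample keyword values are hypothetical and are not taken from the paper.

```python
import math


def entropy_weights(matrix):
    """Objective weight per column via the standard entropy weight method.

    matrix: one row per keyword, one column per parameter, with
    non-negative values (assumed here: log-likelihood, frequency, range).
    """
    n, m = len(matrix), len(matrix[0])
    if n < 2:
        raise ValueError("need at least two keywords")
    k = 1.0 / math.log(n)  # scales each column's entropy into [0, 1]
    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        if total <= 0:
            raise ValueError("each column needs a positive sum")
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0:  # 0 * log(0) is treated as 0
                entropy -= p * math.log(p)
        # A column whose values vary more has lower entropy, so it
        # discriminates better between keywords and gets more weight.
        divergences.append(1.0 - k * entropy)
    spread = sum(divergences)
    if spread <= 0:  # every column is constant: fall back to equal weights
        return [1.0 / m] * m
    return [d / spread for d in divergences]


def aggregate_scores(matrix, weights):
    """Weighted sum of max-normalized columns, one score per keyword."""
    m = len(weights)
    maxima = [max(row[j] for row in matrix) for j in range(m)]
    return [
        sum(w * (row[j] / maxima[j]) for j, w in enumerate(weights))
        for row in matrix
    ]


# Made-up illustration data: (log-likelihood, frequency, range) per keyword.
keywords = ["sustainability", "carbon", "policy", "report"]
values = [
    [250.0, 120.0, 40.0],
    [90.0, 60.0, 35.0],
    [30.0, 45.0, 38.0],
    [5.0, 20.0, 36.0],
]

weights = entropy_weights(values)
scores = aggregate_scores(values, weights)
ranked = sorted(zip(keywords, scores), key=lambda pair: pair[1], reverse=True)
```

In this made-up data the range column is nearly constant across keywords, so it receives the smallest weight, which is qualitatively the same pattern as the reported weights (log-likelihood largest, range smallest).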

Details

Language :
English
ISSN :
09521976
Volume :
133
Database :
Academic Search Index
Journal :
Engineering Applications of Artificial Intelligence
Publication Type :
Academic Journal
Accession number :
177605425
Full Text :
https://doi.org/10.1016/j.engappai.2024.108049