Classification of tourism destination review texts based on sentiment using k-nearest neighbor with information gain feature selection.

Authors :
Husni
Kifli, Abd
Syakur, Muhammad Ali
Muntasa, Arif
Rachman, Eka Mala Sari
Fauzan, Hermawan Bin
Rachmad, Aeri
Source :
AIP Conference Proceedings; 2024, Vol. 3176 Issue 1, p1-8, 8p
Publication Year :
2024

Abstract

Sentiment analysis of tourism destination reviews can give managers feedback for improving the quality of tourism services. Many methods have been used to classify review text by sentiment. k-Nearest Neighbor (KNN) is a classification method that is widely used in sentiment analysis. This simple approach is capable of providing very high accuracy. The main drawback of KNN is its long computing time, so by default it is not recommended for big data computing. This article explains how KNN is combined with Information Gain (IG) feature selection so that only the best terms (words) in the dataset are involved in the computation. This research analyzes the review text of a tourism destination on Madura Island, downloaded from Google Maps. The reviews were preprocessed using case-folding, cleansing, normalization, tokenization, stop-word removal, and stemming. Each term is weighted using TF-IDF, and the feature set is then restricted using IG. Testing shows that the KNN classifier without IG achieves its best accuracy of 98.4% (oversampling only) when k = 1, whereas KNN combined with IG achieves its best accuracy of 97.6% (oversampling plus feature selection) when k = 3 and the IG threshold is 0.0008. The combination of KNN and IG is recommended for classifying large-scale review texts by sentiment, because reducing the number of features can shorten computing time while maintaining the accuracy of the classifier. [ABSTRACT FROM AUTHOR]
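
The pipeline described in the abstract (TF-IDF weighting, IG-based feature restriction, then KNN) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the toy reviews, the function names, the TF-IDF variant, the treatment of each term as a present/absent feature for IG, the cosine similarity measure, and the threshold of 0.3 are all assumptions made for illustration. The reviews are assumed to be preprocessed into tokens already, and the oversampling step is omitted.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(docs, labels, term):
    """IG of the binary feature 'term occurs in the document' (assumed formulation)."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    return (entropy(labels)
            - len(present) / n * entropy(present)
            - len(absent) / n * entropy(absent))

def select_terms(docs, labels, threshold):
    """Keep only terms whose IG exceeds the threshold."""
    vocab = {t for d in docs for t in d}
    return {t for t in vocab if information_gain(docs, labels, t) > threshold}

def fit_idf(docs):
    """Inverse document frequency; the '+ 1' smoothing is an assumed variant."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log(n / df[t]) + 1.0 for t in df}

def vectorize(doc, idf, keep):
    """Sparse TF-IDF vector restricted to the selected terms."""
    counts = Counter(t for t in doc if t in keep and t in idf)
    return {t: (c / len(doc)) * idf[t] for t, c in counts.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors (assumed similarity measure)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_predict(train_vecs, train_labels, query, k):
    """Majority vote among the k most similar training vectors."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(pair[0], query), reverse=True)
    votes = Counter(y for _, y in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical, already-preprocessed toy reviews (not data from the paper).
train_docs = [
    ["beach", "clean", "beautiful"],
    ["beautiful", "view", "friendly"],
    ["clean", "friendly", "staff"],
    ["beach", "dirty", "crowded"],
    ["dirty", "toilet", "expensive"],
    ["crowded", "expensive", "parking"],
]
train_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]

# Threshold chosen for this toy data; the abstract reports 0.0008 on the real dataset.
selected = select_terms(train_docs, train_labels, threshold=0.3)
idf = fit_idf(train_docs)
train_vecs = [vectorize(d, idf, selected) for d in train_docs]

query = vectorize(["beach", "dirty", "expensive"], idf, selected)
print(sorted(selected))
print(knn_predict(train_vecs, train_labels, query, k=3))
```

In this toy data the term "beach" appears once in each class, so its information gain is about zero and it is dropped. Every document vector then has fewer dimensions, which is the source of the computing-time saving the abstract describes.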

Details

Language :
English
ISSN :
0094243X
Volume :
3176
Issue :
1
Database :
Complementary Index
Journal :
AIP Conference Proceedings
Publication Type :
Conference
Accession number :
178717858
Full Text :
https://doi.org/10.1063/5.0222729