Arabic Text Mining Using Rule Based Classification.

Authors :: Thabtah, Fadi; Gharaibeh, Omar; Al-Zubaidy, Rashid
Source :: Journal of Information & Knowledge Management; Mar2012, Vol. 11 Issue 1, p1250006-1-1250006-10, 10p, 4 Charts, 4 Graphs
Publication Year :: 2012
Abstract: A well-known problem in the domain of text mining is text classification, which concerns mapping textual documents into one or more predefined categories based on their content. Text classification has recently attracted many researchers because of the massive amounts of online documents and text archives that hold essential information for decision-making processes. In this field, most research focuses on classifying English documents, while only limited studies have been conducted on other languages such as Arabic. In this respect, the paper investigates the problem of Arabic text classification comprehensively. More specifically, the study measures the performance of different rule based classification approaches adopted from machine learning and data mining on the problem of Arabic text classification. In particular, four different rule based classification approaches, Decision trees (C4.5), Rule Induction (RIPPER), Hybrid (PART) and Simple Rule (One Rule), are evaluated against the published Corpus of Contemporary Arabic text collection. The experimentation is carried out by employing a modified version of the WEKA business intelligence tool. By analysing the results produced from the experimentation, we determine the most suitable classification algorithms for classifying Arabic texts. [ABSTRACT FROM AUTHOR]