1. Performance Assessment of Multiple Machine Learning Classifiers for Detecting the Phishing URLs
- Author
Sheikh Shah Mohammad Motiur Rahman, Tapushe Rabaya Toma, Khalid Been Md. Badruzzaman Biplob, Fatama Binta Rafiq, and Syeda Sumbul Hossain
- Subjects
Computer science, business.industry, Confusion matrix, Machine learning, computer.software_genre, Phishing, Field (computer science), Random forest, Support vector machine, Tree (data structure), ComputingMethodologies_PATTERNRECOGNITION, Feature (computer vision), Artificial intelligence, Gradient boosting, business, computer
- Abstract
In the field of information security, the detection and prevention of phishing URLs has recently become a pressing concern. Researchers have already proposed several anti-phishing systems to detect phishing attacks. The performance of those systems can suffer when the machine learning classifier and the type of feature set are poorly chosen. This paper presents a detailed investigation of six machine learning classifiers (KNN, DT, SVM, RF, ERT and GBT) on three publicly available datasets with multidimensional feature sets. The performance of the classifiers is evaluated using the confusion matrix, precision, recall, F1-score, accuracy and misclassification rate. The best results were obtained with Random Forest and Extremely Randomized Tree on datasets one and three (binary-class feature sets), with 97% and 98% accuracy, respectively. On the multiclass feature set (dataset two), Gradient Boosting Tree gives the highest performance, with 92% accuracy.
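The evaluation measures named in the abstract can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch in plain Python, not the authors' implementation: the function name `binary_metrics`, the label encoding (1 = phishing, 0 = legitimate) and the sample labels are assumptions made purely for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Derive common evaluation measures from a binary confusion matrix.

    Hypothetical helper for illustration; `positive` marks the phishing class.
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # phishing URL correctly flagged
        elif pred == positive:
            fp += 1  # legitimate URL wrongly flagged
        elif truth == positive:
            fn += 1  # phishing URL missed
        else:
            tn += 1  # legitimate URL correctly passed

    total = tp + fp + fn + tn
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / total if total else 0.0

    return {
        "confusion_matrix": [[tn, fp], [fn, tp]],  # rows: actual, cols: predicted
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
        "misclassification_rate": 1.0 - accuracy if total else 0.0,
    }


# Made-up labels for illustration only; not data from the paper.
sample_true = [1, 1, 1, 0, 0, 0, 1, 0]
sample_pred = [1, 1, 0, 0, 0, 1, 1, 0]
sample_report = binary_metrics(sample_true, sample_pred)
```

The misclassification rate is simply one minus the accuracy, so the two measures carry the same information. Precision, recall and F1 add detail about which kind of error a classifier makes.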
- Published
- 2020