Visualizing and Interpreting RNN Models in URL-based Phishing Detection
- Source :
- SACMAT
- Publication Year :
- 2020
- Publisher :
- ACM, 2020.
Abstract
- Existing studies have demonstrated that phishing detection based solely on URL features can be very effective when using traditional machine learning techniques. In this paper, we explore the deep learning approach and build four RNN (Recurrent Neural Network) models that use only the lexical features of URLs to detect phishing attacks. We collect 1.5 million URLs as the dataset and show that our RNN models can achieve a detection accuracy higher than 99% without requiring any expert knowledge to manually identify the features. However, it is well known that RNNs and other deep learning techniques are still largely black boxes. Understanding the internals of deep learning models is important and highly desirable for improving and properly applying the models. Therefore, in this work, we further develop several unique visualization techniques to closely interpret how RNN models work internally in achieving this phishing detection performance. In particular, we identify and answer six important research questions, showing that our four RNN models (1) are complementary to each other and can be combined into an ensemble model with even better accuracy, (2) can capture the relevant features that were manually extracted and used in the traditional machine learning approach for phishing detection, and (3) can help identify useful new features to enhance the accuracy of the traditional machine learning approach. Our techniques and experience in this work could help researchers effectively apply deep learning techniques to other real-world security or privacy problems.
- Subjects :
- 021110 strategic, defence & security studies
Creative visualization
Ensemble forecasting
Computer science
business.industry
Deep learning
media_common.quotation_subject
0211 other engineering and technologies
020206 networking & telecommunications
Access control
02 engineering and technology
Phishing detection
Machine learning
computer.software_genre
Phishing
Visualization
Recurrent neural network
0202 electrical engineering, electronic engineering, information engineering
Artificial intelligence
business
computer
media_common
Details
- Database :
- OpenAIRE
- Journal :
- Proceedings of the 25th ACM Symposium on Access Control Models and Technologies
- Accession number :
- edsair.doi...........c49efa5bd17ca113967fc264f37d77ab
- Full Text :
- https://doi.org/10.1145/3381991.3395602