Descriptor: "phishing detection" / Journal: expert systems with applications - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"phishing detection"' showing total 6 results

1. Look before you leap: Detecting phishing web pages by exploiting raw URL and HTML characteristics.

Author: Opara, Chidimma, Chen, Yingke, and Wei, Bo
Subjects: *UNIFORM Resource Locators, *WEBSITES, *ARTIFICIAL neural networks, *PHISHING, *INTERNET fraud, *SPAM email, *EMAIL security, *EMAIL
Abstract: Phishing websites distribute unsolicited content and are frequently used to commit email and internet fraud. Detecting them before any user information is submitted is critical. Several efforts have been made to detect these phishing websites in recent years. Most existing approaches use hand-crafted lexical and statistical features from a website's textual content to train classification models to detect phishing web pages. However, these phishing detection approaches have limitations, including (1) the tediousness of extracting hand-crafted features, which requires specialized domain knowledge to determine which features are useful for a particular platform; and (2) the difficulty that models built on hand-crafted features have in capturing the semantic patterns in words and characters in URL and HTML content. To address these challenges, this paper proposes WebPhish, an end-to-end deep neural network trained using embedded raw URLs and HTML content to detect website phishing attacks. First, the proposed model employs an embedding technique to map the characters to corresponding dense vectors. Then, the concatenation layer merges the URL and HTML embedding matrices. Following that, convolutional layers are used to model the semantic dependencies. Extensive experiments were conducted with real-world phishing data, which yielded an accuracy of 98.1%, showing that WebPhish outperforms baseline detection approaches in identifying phishing pages.
• WebPhish employs both raw URL and HTML content to detect phishing web pages.
• WebPhish uses character-level embeddings to enable the feature vectors to generalize to new data.
• Extensive experiments conducted on a real-world dataset yield 98.1% accuracy.
• Its application on the airline Twitter dataset shows the proposed model's flexibility.
[ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
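
The abstract above describes feeding raw URL and HTML characters into an embedding layer. As a rough illustration of the step that would come before such a layer, the sketch below maps characters to integer indices and pads them to a fixed length. The function names, the reserved indices, and the lengths are assumptions made for this example; the abstract does not specify them, and this is not the authors' WebPhish code.

```python
from typing import Dict, List

# Reserved indices (assumed for this sketch): 0 pads short inputs,
# 1 stands in for characters not seen when the vocabulary was built.
PAD, UNK = 0, 1


def build_vocab(samples: List[str]) -> Dict[str, int]:
    """Assign an integer index to every distinct character in the samples."""
    vocab: Dict[str, int] = {}
    for text in samples:
        for ch in text:
            if ch not in vocab:
                vocab[ch] = len(vocab) + 2  # indices 0 and 1 are reserved
    return vocab


def encode(text: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Truncate or pad text to max_len and map each character to its index."""
    ids = [vocab.get(ch, UNK) for ch in text[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


if __name__ == "__main__":
    # Hypothetical inputs; real pages would be far longer.
    url = "http://example.com/login"
    html = "<form action='/submit'></form>"
    vocab = build_vocab([url, html])
    print(encode(url, vocab, 32))
    print(encode(html, vocab, 48))
```

Index sequences of this kind are what an embedding layer would turn into dense vectors; the embedding, concatenation, and convolution stages themselves are not shown here.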

2. Hybrid phishing detection using joint visual and textual identity.

Author: Tan, Colin Choon Lin, Chiew, Kang Leng, Yong, Kelvin S.C., Sebastian, Yakub, Than, Joel Chia Ming, and Tiong, Wei King
Subjects: *PHISHING, *BRAND identification, *CONTENT analysis, *LANDSCAPE changes, *COMPUTER vision
Abstract: In recent years, phishing attacks have evolved considerably, causing existing adversarial features that were widely utilised for detecting phishing websites to become less discriminative. These developments have fuelled growing interest among security researchers in an anti-phishing strategy known as the identity-based detection technique. Identity-based detection techniques have consistently achieved high true positive rates in a rapidly changing phishing landscape, owing to their capitalisation on fundamental brand identity relations that are inherent in most legitimate webpages. However, existing identity-based techniques often suffer from higher false positive rates due to complexities and challenges in establishing the webpage's brand identity. To close the existing performance gap, this paper proposes a new hybrid identity-based phishing detection technique that leverages webpage visual and textual identity. Extending earlier anti-phishing work based on the website logo as visual identity, our method incorporates novel image features that mimic human vision to enhance the logo detection accuracy. The proposed hybrid technique integrates the visual identity with a textual identity, namely, brand-specific keywords derived from the webpage content using textual analysis methods. We empirically demonstrated on multiple benchmark datasets that this joint visual-textual identity detection approach significantly improves phishing detection performance with an overall accuracy of 98.6%. Benchmarking results against an existing technique showed comparable true positive rates and a reduction of up to 3.4% in false positive rates, thus affirming our objective of reducing the misclassification of legitimate webpages without sacrificing the phishing detection performance. The proposed hybrid identity-based technique is proven to be a significant and practical contribution that will enrich the anti-phishing community with improved defence strategies against rapidly evolving phishing schemes.
• Remain sustainable against evolving phishing threats by using identity-based method.
• Achieve reduced false positives thus improving practicality of proposed solution.
• Leverage hybrid identities to attain robustness across diverse range of websites.
• Exploit novel image features based on human vision to enhance website logo detection.
• Enrich website visual analysis via active browser rendering and DOM manipulation.
[ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

3. Multi-scale semantic deep fusion models for phishing website detection.

Author: Liu, Dong-Jie, Geng, Guang-Gang, and Zhang, Xin-Chang
Subjects: *PHISHING, *DEEP learning, *MULTISCALE modeling, *MEDIUM density fiberboard
Abstract: In view of the semantic counterfeiting characteristics of phishing websites and their multi-scale composition, this paper fully considers the semantic information at different scales and proposes three semantic-based phishing detection models at different depths using various deep learning methods. The three proposed models are the Multi-scale Data-layer Fusion (MDF) model, the Multi-scale Feature-layer Fusion (MFF) model and the Multi-scale In-depth Fusion (MIF) model. Experimental results on a constructed complex dataset show that all three models have good recognition capabilities and that the MIF model achieves the best performance, with an F1-Measure of 0.9830, an AUC value of 0.9993 and a false positive rate of 0.0047. Further comparison with both visual and text methods, together with an active discovery experiment lasting 6 months in which 3016 phishing websites were detected in a real network environment, shows that the proposed model is both competitive and practical for real detection scenarios.
• Semantic information at different scales is mined and fused at different depths.
• Three semantic-based deep phishing detection models are proposed.
• Various comparative experiments are carried out and prove the effectiveness.
• A phishing discovery experiment in reality detected 3016 phishing websites.
[ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

4. Phishing websites detection using a novel multipurpose dataset and web technologies features.

Author: Sánchez-Paniagua, Manuel, Fidalgo, Eduardo, Alegre, Enrique, and Alaiz-Rodríguez, Rocío
Subjects: *PHISHING, *WEBSITES, *SOCIAL engineering (Fraud), *CYBERTERRORISM, *EMAIL
Abstract: Phishing attacks are one of the most challenging social engineering cyberattacks due to the large number of entities involved in online transactions and services. In these attacks, criminals deceive users to hijack their credentials or sensitive data through a login form which replicates the original website and submits the data to a malicious server. Many anti-phishing techniques have been developed in recent years, using different resources such as the URL and HTML code from legitimate index websites and phishing ones. These techniques have some limitations when predicting legitimate login websites, since, usually, no login forms are present in the legitimate class used for training the proposed model. Hence, in this work we present a methodology for phishing website detection in real scenarios, which uses URL, HTML, and web technology features. Since no updated, multipurpose dataset exists for this task, we crafted the Phishing Index Login Websites Dataset (PILWD), an offline phishing dataset composed of 134,000 verified samples, that offers researchers a wide variety of data to test and compare their approaches. Since approximately three-quarters of collected phishing samples request the introduction of credentials, we decided to crawl legitimate login websites to match the phishing standpoint. The developed approach is independent of third-party services and the method relies on a new set of features used for the very first time in this problem, some of them extracted from the web technologies used by each specific website. Experimental results show that phishing websites can be detected with 97.95% accuracy using a LightGBM classifier and the complete set of the 54 features selected, when evaluated on the PILWD dataset.
• Using legitimate homepage websites fosters false positives during login classification.
• Proposed web technology features improve phishing detection accuracy.
• Legitimate login websites ensure generalization in practical scenarios.
• PILWD-134K: A publicly available dataset for phishing detection benchmarking.
[ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

5. An improved ELM-based and data preprocessing integrated approach for phishing detection considering comprehensive features.

Author: Yang, Liqun, Zhang, Jiawei, Wang, Xiaozhe, Li, Zhi, Li, Zhoujun, and He, Yueying
Subjects: *PHISHING, *MACHINE learning, *MATRIX inversion, *ALGORITHMS
Abstract: In this paper, a novel approach based on non-inverse matrix online sequence extreme learning machine (NIOSELM) for phishing detection is presented, which takes into account three types of features to comprehensively characterize a website. For the NIOSELM algorithm, we use the Sherman-Morrison-Woodbury equation to avoid the matrix inversion operation, and introduce the idea of online sequence extreme learning machine (OSELM) to update the training model. In order to reduce the dependence of the detection model on the majority class, we use the Adaptive Synthetic Sampling (ADASYN) algorithm to generate synthetic minority class samples to balance the distribution between the samples of the majority and minority classes. Furthermore, an improved denoising auto-encoder (SDAE) is designed to reduce the dimension of the experimental dataset. The experimental results show the efficiency and feasibility of the proposed detection mechanism. Moreover, the overall detection performance of NIOSELM is better than that of other existing methods, especially in training speed and detection accuracy.
• Define three types of features extracted from URLs, domains, etc.
• Exploit a method to balance the majority and minority class samples.
• Adopt an improved DAE-based method to reduce the dimension of the dataset.
• Boost the detection performance by using the improved ELM-based classifier.
• Do experiments to verify the feasibility and effectiveness of the proposed approach.
[ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF
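
The abstract above mentions using the Sherman-Morrison-Woodbury equation to avoid explicit matrix inversion. As a hedged illustration of the idea only, the sketch below implements the rank-one special case (the Sherman-Morrison formula), which updates a known inverse of A to the inverse of A + u vᵀ without inverting again. The helper names are invented for this example, and this is not the NIOSELM algorithm from the paper.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def mat_vec(m: Matrix, v: Vector) -> Vector:
    """Return the product m @ v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def vec_mat(v: Vector, m: Matrix) -> Vector:
    """Return the product v^T @ m as a flat list."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def sherman_morrison(a_inv: Matrix, u: Vector, v: Vector) -> Matrix:
    """Given A^{-1}, return (A + u v^T)^{-1} using the rank-one update

        (A + u v^T)^{-1} = A^{-1} - (A^{-1} u)(v^T A^{-1}) / (1 + v^T A^{-1} u)
    """
    a_inv_u = mat_vec(a_inv, u)   # A^{-1} u
    v_a_inv = vec_mat(v, a_inv)   # v^T A^{-1}
    denom = 1.0 + sum(vi * xi for vi, xi in zip(v, a_inv_u))
    if abs(denom) < 1e-12:
        raise ValueError("A + u v^T is singular (or nearly so)")
    n = len(a_inv)
    return [
        [a_inv[i][j] - a_inv_u[i] * v_a_inv[j] / denom for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    # A + u v^T = [[2, 0], [0, 1]], so the result should be [[0.5, 0], [0, 1]].
    print(sherman_morrison(identity, [1.0, 0.0], [1.0, 0.0]))
```

Each update costs on the order of n² operations rather than the n³ of a fresh inversion, which is the usual motivation for this family of identities in sequential learning.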

6. Phishing detection based Associative Classification data mining

Author: Neda Abdelhamid, Fadi Thabtah, and Aladdin Ayesh
Subjects: Password, Computer science, General Engineering, Phishing detection, Classification, Phishing, Online community, Internet security, Computer Science Applications, Forged websites, Artificial Intelligence, Data mining, Classifier (UML), Associative property
Abstract: Website phishing is considered one of the crucial security challenges for the online community due to the massive number of online transactions performed on a daily basis. Website phishing can be described as mimicking a trusted website to obtain sensitive information from online users, such as usernames and passwords. Black lists, white lists and the utilisation of search methods are examples of solutions to minimise the risk of this problem. One intelligent approach based on data mining, called Associative Classification (AC), seems a potential solution that may effectively detect phishing websites with high accuracy. According to experimental studies, AC often extracts classifiers containing simple “If-Then” rules with a high degree of predictive accuracy. In this paper, we investigate the problem of website phishing using a developed AC method called Multi-label Classifier based Associative Classification (MCAC) to assess its applicability to the phishing problem. We also want to identify features that distinguish phishing websites from legitimate ones. In addition, we survey intelligent approaches used to handle the phishing problem. Experimental results using real data collected from different sources show that AC, particularly MCAC, detects phishing websites with higher accuracy than other intelligent algorithms. Further, MCAC generates new hidden knowledge (rules) that other algorithms are unable to find, and this has improved its classifier's predictive performance.
Published: 2014
Full Text: View/download PDF
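
The abstract above refers to classifiers built from simple "If-Then" rules. As a small, hedged illustration of how such a rule is commonly scored in associative classification, the sketch below computes the support and confidence of one rule over a toy dataset. The feature name `has_ip`, the toy records, and the function name are invented for this example; this is not the MCAC algorithm.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def rule_stats(
    records: List[Record],
    antecedent: Dict[str, str],
    label_key: str,
    label: str,
) -> Tuple[float, float]:
    """Return (support, confidence) for the rule: IF antecedent THEN label.

    support    = fraction of all records matching antecedent AND label
    confidence = fraction of antecedent-matching records that carry the label
    """
    matches = [
        r for r in records
        if all(r.get(k) == v for k, v in antecedent.items())
    ]
    hits = [r for r in matches if r.get(label_key) == label]
    support = len(hits) / len(records) if records else 0.0
    confidence = len(hits) / len(matches) if matches else 0.0
    return support, confidence


if __name__ == "__main__":
    # Toy data with a hypothetical feature: does the URL contain an IP address?
    data = [
        {"has_ip": "yes", "class": "phishing"},
        {"has_ip": "yes", "class": "phishing"},
        {"has_ip": "yes", "class": "legitimate"},
        {"has_ip": "no", "class": "legitimate"},
    ]
    # Rule: IF has_ip = yes THEN class = phishing
    print(rule_stats(data, {"has_ip": "yes"}, "class", "phishing"))
```

An associative classifier would typically keep only rules whose support and confidence exceed chosen thresholds and then rank them for prediction; those steps are omitted here.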
