Original Article

Explainable Phishing URL Detection Using Ensemble Learning and SHAP-Based Feature Attribution

Arpita Ghetiya¹, Harsh Aghera², Tejaswi Telkar³
¹ ² ³ Department of CSE, Dayananda Sagar University, Bangalore, Karnataka, India.

Published Online: May-August 2026

Pages: 58-63

Abstract

Phishing attacks continue to rely heavily on deceptive URLs, making them a persistent source of credential theft and financial fraud, with over 1.35 million incidents reported globally in 2023. While machine learning models have shown strong performance in detecting such threats, their lack of transparency often makes it difficult for analysts and end-users to understand why a particular URL is flagged, which can reduce trust in real-world deployments. In this work, we present XPhishNet, an explainable framework for phishing URL detection that combines a Random Forest classifier with SHAP (SHapley Additive exPlanations) to provide clear, instance-level explanations alongside prediction outcomes. The system uses a set of 32 features derived from lexical patterns, host-based properties, and content-level characteristics of URLs. Experiments were conducted on a dataset of 280,945 labeled URLs collected from the PhiUSIIL and PhishTank repositories. Among the evaluated models, Random Forest consistently achieved the best performance across accuracy, precision, recall, and F1-score under stratified 5-fold cross-validation. Further analysis using SHAP highlights domain age, the use of IP addresses, and subdomain depth as the most influential indicators of phishing activity. To improve usability, a lightweight module is introduced to translate feature importance scores into simple, human-readable alerts without relying on large language model infrastructure. The reliability of these explanations is supported by strong agreement with a permutation-based feature importance baseline, with Kendall’s τ measured at 0.89. Overall, the proposed approach balances detection performance with interpretability, making it more suitable for practical and compliance-driven cybersecurity applications.
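
The alert-translation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names, the alert wording, the `top_alerts` helper, and the attribution values in the usage note are all hypothetical. The sketch assumes that per-URL attribution scores (for example, SHAP values) have already been computed, with positive values pushing the prediction toward the phishing class. The subdomain count is a naive label count that ignores multi-part public suffixes such as `co.uk`.

```python
import ipaddress
from typing import Dict, List
from urllib.parse import urlsplit

# Hypothetical alert wording keyed by hypothetical feature names;
# the abstract does not specify either.
ALERT_TEMPLATES: Dict[str, str] = {
    "uses_ip_address": "The link points to a raw IP address instead of a domain name.",
    "subdomain_depth": "The link has an unusually deep chain of subdomains.",
    "domain_age_days": "The domain was registered very recently.",
}


def uses_ip_address(url: str) -> bool:
    """Return True if the URL's host is a literal IPv4 or IPv6 address."""
    host = urlsplit(url).hostname or ""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def subdomain_depth(url: str) -> int:
    """Count host labels to the left of the last two (naive, see lead-in)."""
    host = urlsplit(url).hostname or ""
    if not host or uses_ip_address(url):
        return 0
    return max(len(host.split(".")) - 2, 0)


def top_alerts(attributions: Dict[str, float], k: int = 2) -> List[str]:
    """Return alert text for the k features with the largest positive attribution."""
    positive = [
        (name, score)
        for name, score in attributions.items()
        if score > 0 and name in ALERT_TEMPLATES
    ]
    # Largest positive contribution first.
    positive.sort(key=lambda item: item[1], reverse=True)
    return [ALERT_TEMPLATES[name] for name, _ in positive[:k]]
```

For instance, calling `top_alerts({"uses_ip_address": 0.31, "subdomain_depth": 0.12, "domain_age_days": -0.05})` with these made-up scores would return the IP-address alert followed by the subdomain alert, and would skip the negatively attributed feature. A template lookup of this kind needs no language model, which matches the lightweight design the abstract describes.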

https://test.indjcst.com/archives/10.59256/indjcst.20260502006
