Explainable Phishing URL Detection Using Ensemble Learning and SHAP-Based Feature Attribution
Published Online: May-August 2026
Pages: 58-63
DOI: https://www.doi.org/10.59256/indjcst.20260502006
Abstract
Phishing attacks continue to rely heavily on deceptive URLs, making them a persistent source of credential theft and financial fraud, with over 1.35 million incidents reported globally in 2023. Machine learning models detect such threats well, but their lack of transparency makes it difficult for analysts and end users to understand why a particular URL is flagged, which can reduce trust in real-world deployments.

In this work, we present XPhishNet, an explainable framework for phishing URL detection that combines a Random Forest classifier with SHAP (SHapley Additive exPlanations) to provide instance-level explanations alongside each prediction. The system uses 32 features derived from lexical patterns, host-based properties, and content-level characteristics of URLs. Experiments were conducted on a dataset of 280,945 labeled URLs collected from the PhiUSIIL and PhishTank repositories. Among the evaluated models, Random Forest consistently achieved the best accuracy, precision, recall, and F1-score under stratified 5-fold cross-validation. SHAP analysis identifies domain age, the use of IP addresses, and subdomain depth as the most influential indicators of phishing activity. To improve usability, a lightweight module translates feature importance scores into simple, human-readable alerts without relying on large language model infrastructure. The reliability of these explanations is supported by strong agreement with a permutation-based feature importance baseline, with Kendall’s τ measured at 0.89. Overall, the proposed approach balances detection performance with interpretability, making it better suited to practical and compliance-driven cybersecurity applications.
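
The two explanation steps mentioned in the abstract, turning attribution scores into plain-language alerts and checking agreement between two importance rankings, can be illustrated with a minimal standard-library sketch. Everything named here is an assumption made for illustration: the feature names, the alert wording, the `explain` and `kendall_tau` helpers, and the example scores are hypothetical and are not taken from XPhishNet. The sketch also assumes that positive attributions push a prediction towards the phishing class, and it computes the simple tau-a variant without tie correction.

```python
from itertools import combinations

# Hypothetical alert templates keyed by hypothetical feature names.
# The abstract does not list the actual feature names or alert wording.
ALERT_TEMPLATES = {
    "domain_age_days": "The domain was registered very recently.",
    "uses_ip_address": "The URL uses a raw IP address instead of a domain name.",
    "subdomain_depth": "The URL has an unusually deep chain of subdomains.",
}


def explain(attributions, top_k=2):
    """Return alert strings for the top_k features with the largest positive attribution.

    `attributions` maps feature name -> signed score (for example a SHAP value).
    Positive scores are assumed to push the prediction towards 'phishing'.
    """
    positive = [(name, value) for name, value in attributions.items() if value > 0]
    positive.sort(key=lambda item: item[1], reverse=True)
    return [
        ALERT_TEMPLATES.get(name, f"Feature '{name}' looks suspicious.")
        for name, _ in positive[:top_k]
    ]


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score dicts over the same features (no tie correction)."""
    features = sorted(scores_a)
    concordant = discordant = 0
    for f, g in combinations(features, 2):
        # Same sign of the two differences means the pair is ranked the same way.
        product = (scores_a[f] - scores_a[g]) * (scores_b[f] - scores_b[g])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(features) * (len(features) - 1) // 2
    return (concordant - discordant) / n_pairs


# Made-up per-URL attributions for one flagged URL.
url_attributions = {
    "domain_age_days": 0.31,
    "uses_ip_address": 0.22,
    "subdomain_depth": -0.05,
    "url_length": 0.02,
}
alerts = explain(url_attributions)

# Made-up global importances from two methods (e.g. mean |SHAP| vs permutation).
shap_importance = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}
perm_importance = {"a": 0.5, "b": 0.1, "c": 0.3, "d": 0.05}
tau = kendall_tau(shap_importance, perm_importance)
```

A template lookup of this kind needs no model inference at explanation time, which is one plausible reading of "without relying on large language model infrastructure". In practice a library routine such as `scipy.stats.kendalltau` would normally replace the hand-written `kendall_tau`, since it also handles ties.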