Abstract

Background

Phishing is now regarded as one of the most prevalent cyberattacks and routinely compromises the security of communication and internet networks. Phishing websites are created to steal users’ financial information. Fake websites are frequently created and circulated online, causing users to lose essential assets. Phishing websites can result in monetary loss, intellectual property theft, reputational damage, and disruption of regular business activities. Over the past decade, a number of anti-phishing techniques have been proposed to detect and reduce these attacks, but they remain imprecise and ineffective. Deep learning (DL), which can learn the intrinsic features of websites and recognize phishing websites, is one of the innovative techniques used to address this problem.

Methods

In this study, we propose a novel phishing detection method, OptSHQCNN. The methodology has two phases: pre-deployment and post-deployment. In the pre-deployment phase, the dataset is first preprocessed, which includes balancing the data and handling invalid features, irrelevant features, and missing values. The convolutional block attention module (CBAM) then extracts the main characteristics from web page code and links. In the third stage, the red kite optimization algorithm (RKOA) selects the most significant attributes. In the final stage, the data are classified using the shallow hybrid quantum-classical convolutional neural network (SHQCNN) model. To improve classification effectiveness, the hyperparameters of the SHQCNN model are fine-tuned using the shuffled shepherd optimization algorithm (SSOA).

Results

In the post-deployment phase, the URL is encoded using Optimized Bidirectional Encoder Representations from Transformers (OptBERT), after which the features are extracted. The extracted features are fed into the trained classifier, which outputs a prediction of “phishing” or “legitimate”. The results show that the proposed technique reaches above 99% accuracy, precision, recall, and F1-score, outperforming other popular phishing detection methods. These findings can support the development of security plugins for clients, browsers, and instant messaging applications that run on network edges, PCs, smartphones, and other personal terminals.

Cite this as

Meda S, Srinivas VS, Rao KCB, Ramesh R, Yamarthi NR. 2025. A dual-phase deep learning framework for advanced phishing detection using the novel OptSHQCNN approach. PeerJ Computer Science 11:e3014 https://doi.org/10.7717/peerj-cs.3014

Introduction

Motivation and our research contributions

  • To extract primary features, the convolutional block attention module (CBAM) is utilized to efficiently capture significant characteristics from URLs and webpage code for accurate phishing detection.

  • The red kite optimization algorithm (RKOA) is used to select the most essential features, minimizing the feature set and improving the computational efficiency of the model.

  • To classify phishing websites, a novel shallow hybrid quantum-classical convolutional neural network (SHQCNN) approach is introduced, leveraging the strengths of both quantum and classical computation for enhanced detection capabilities.

  • The shuffled shepherd optimization algorithm (SSOA) is utilized to fine-tune the hyperparameters of the SHQCNN approach.

  • For URL encoding in the post-deployment phase, Optimized Bidirectional Encoder Representations from Transformers (OptBERT) is utilized to facilitate high-quality feature extraction that supports accurate predictions.
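The post-deployment flow named in the last contribution (encode the URL, extract features, pass them to the trained classifier, and emit a label) can be sketched as a simple composition of three steps. This is only an illustrative skeleton: `predict_url`, the `threshold` parameter, and the three toy stand-in functions are hypothetical names introduced here. They are not OptBERT, CBAM, or SHQCNN, and they do not reproduce the authors' implementation.

```python
from typing import Callable, List


def predict_url(
    url: str,
    encode: Callable[[str], List[int]],
    extract: Callable[[List[int]], List[float]],
    classify: Callable[[List[float]], float],
    threshold: float = 0.5,  # assumed decision threshold, not from the paper
) -> str:
    """Run one URL through encode -> extract -> classify and return a label."""
    token_ids = encode(url)          # stands in for the OptBERT encoding step
    features = extract(token_ids)    # stands in for feature extraction
    score = classify(features)       # stands in for the trained classifier
    return "phishing" if score >= threshold else "legitimate"


# Toy stand-ins so the skeleton runs; real components would replace these.
def toy_encode(url: str) -> List[int]:
    # Character codes as a trivial "encoding".
    return [ord(ch) for ch in url]


def toy_extract(token_ids: List[int]) -> List[float]:
    # Two toy features: sequence length and number of '@' characters.
    return [float(len(token_ids)), float(token_ids.count(ord("@")))]


def toy_classify(features: List[float]) -> float:
    # Toy rule: any '@' in the URL is scored as suspicious.
    return 1.0 if features[1] > 0 else 0.0


if __name__ == "__main__":
    label = predict_url(
        "http://bank.example@evil.example/login",
        toy_encode, toy_extract, toy_classify,
    )
    print(label)  # prints "phishing"
```

Passing the three stages as callables keeps the skeleton independent of any particular encoder or classifier, so each stage can be swapped without touching the control flow.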

Result and discussions

Experimental setup

Description of the dataset

  • ISCX-URL-2016: This dataset includes 114,400 URLs categorized as phishing, spam, malware, or vandalism. It contains features like URL length, special characters, and linguistic properties, aiding in the identification of phishing websites.

  • URL-Based Phishing dataset: From Kaggle, this dataset contains 11,054 records with 33 attributes, including URL structure and domain features. It helps distinguish between phishing and benign URLs.

  • Mendeley_2020: Divided into small (58,645 instances) and large (88,647 instances) subsets, this dataset includes 111 features per URL, including domain age, DNS records, and WHOIS information, useful for identifying phishing websites.
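The lexical attributes mentioned for these datasets (URL length, special characters, URL structure) are the kind of features that can be computed directly from the URL string. The following is a minimal sketch using only the Python standard library. The function name `lexical_url_features`, the chosen set of special characters, and the feature names are assumptions made for illustration and do not correspond to the exact columns of any of the three datasets.

```python
from urllib.parse import urlsplit

# Assumed set of "special" characters; the datasets may count a different set.
SPECIAL_CHARS = "@-_?=&%~"


def lexical_url_features(url: str) -> dict:
    """Compute a few simple string-level features of a URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is None when the URL has no netloc
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_special_chars": sum(url.count(ch) for ch in SPECIAL_CHARS),
        "has_at_symbol": "@" in url,
        "uses_https": parts.scheme == "https",
    }


if __name__ == "__main__":
    for name, value in lexical_url_features("http://example.com/login?id=1").items():
        print(f"{name}: {value}")
```

Features such as domain age, DNS records, and WHOIS information require external lookups and are not covered by this sketch.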

Evaluation criteria

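$$\mathrm{Acc} = \frac{T_p + T_n}{T_p + T_n + F_p + F_n}$$

$$\mathrm{Prec} = \frac{T_p}{F_p + T_p}$$

$$\mathrm{Rec} = \frac{T_p}{T_p + F_n}$$

$$\mathrm{FPR} = \frac{F_p}{T_n + F_p}$$

$$F1 = \frac{2 \times \mathrm{Prec} \times \mathrm{Rec}}{\mathrm{Prec} + \mathrm{Rec}}$$

$$\mathrm{MCC} = \frac{T_p T_n - F_p F_n}{\sqrt{(T_p + F_p)(T_p + F_n)(T_n + F_p)(T_n + F_n)}}$$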
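The six evaluation criteria above depend only on the four confusion-matrix counts (true positives, true negatives, false positives, false negatives), so they can be computed together. The following is a minimal sketch. The function name `classification_metrics`, the convention of returning 0.0 when a denominator is zero, and the example counts are assumptions introduced here; the counts are not results from the study.

```python
import math


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, precision, recall, FPR, F1, and MCC from counts."""

    def safe_div(num: float, den: float) -> float:
        # Assumed convention: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    acc = safe_div(tp + tn, tp + tn + fp + fn)
    prec = safe_div(tp, fp + tp)
    rec = safe_div(tp, tp + fn)
    fpr = safe_div(fp, tn + fp)
    f1 = safe_div(2 * prec * rec, prec + rec)
    mcc = safe_div(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "accuracy": acc,
        "precision": prec,
        "recall": rec,
        "fpr": fpr,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the formulas.
    metrics = classification_metrics(tp=90, tn=85, fp=15, fn=10)
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```

With these hypothetical counts, accuracy is 175/200 = 0.875, recall is 90/100 = 0.9, and FPR is 15/100 = 0.15. MCC is the only one of the six that uses all four counts in both numerator and denominator, which is why it is often reported alongside accuracy on imbalanced data.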

Ablation study

Discussions

Existing limitations

Overcoming limitations with the proposed method

Benefits of the proposed method

  • High accuracy and efficiency: Over 99% accuracy, precision, recall, and F1-score are attained by combining sophisticated feature extraction and selection approaches.

  • Reduced computational complexity: Optimization methods such as RKOA and SSOA streamline the model’s operations, which makes it suitable for real-time detection scenarios.

  • Scalability and adaptability: The model may generalize over a variety of datasets by utilizing OptBERT for URL encoding, which overcomes the drawbacks of domain-specific methods.

  • Effective feature selection: By ensuring that only the most pertinent features are taken into consideration, the CBAM model lowers noise and improves classification results.

  • Broad applicability: The two-stage framework can be expanded to a number of real-world uses, including personal terminals, network edges, and browser plugins.

Proposed limitations

Conclusion and future scope

Additional Information and Declarations

Competing Interests

Author Contributions

Data Availability

Funding

The authors received no funding for this work.

Acknowledgements

We declare that this manuscript is original, has not been published before, and is not currently under consideration for publication elsewhere.
