| Article name |
Machine Learning-Based Phishing URL Classification with Grid Search Hyperparameter Tuning
|
| Article type |
Research article
|
| Authors |
Triyaporn Kongaon(1) and Nithizethe Mhuadthongon(1*)
|
| Affiliation |
School of Science and Technology, Sukhothai Thammathirat Open University, Nonthaburi 11120, Thailand(1) *Corresponding author: Nithizethe.Mhu@stou.ac.th
|
| Journal name |
Vol. 12 No. 2 (2026): May - August
|
| Abstract |
The increasing use of the internet has led to new forms of data theft, which are considered a form of cybercrime. Phishing attacks are one method used to deceive users into disclosing personal information, and URL phishing is the most common type of such attacks. This research presents an approach for detecting phishing websites by developing a machine learning model for phishing URL classification, with hyperparameters tuned by grid search across five algorithms, in order to identify the model with the best phishing website classification performance. The evaluated algorithms are Neural Networks, Logistic Regression, Naive Bayes, Support Vector Machines using the SMO algorithm, and Decision Trees. Weka was used as the simulation and testing tool, and model performance was measured using five-fold cross-validation. The results indicate that the Support Vector Machine model using the SMO algorithm achieved the best performance, with an accuracy of 95.55%, precision of 95.60%, recall of 95.60%, and an F1 score of 95.50%. These results demonstrate that the developed model can accurately classify phishing URLs and can serve as a prototype for the future development of automated phishing detection systems.
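The general idea of grid search scored by five-fold cross-validation can be illustrated with a minimal sketch. This is not the authors' Weka workflow: it assumes a toy k-nearest-neighbour classifier in place of the five algorithms named in the abstract, and the two features (URL length and number of dots), the data values, and the grid are all hypothetical, chosen only to keep the example self-contained.

```python
from itertools import product
from statistics import mean


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k folds, assigning them round-robin."""
    return [list(range(i, n, k)) for i in range(k)]


def knn_predict(train, query, k, p):
    """Binary k-NN: majority vote among the k nearest training points.

    Distances are Minkowski sums of order p (the root is skipped because
    it does not change the ordering of neighbours).
    """
    dists = sorted(
        (sum(abs(a - b) ** p for a, b in zip(x, query)), label)
        for x, label in train
    )
    votes = [label for _, label in dists[:k]]
    return int(sum(votes) * 2 > len(votes))


def cross_val_accuracy(data, k_folds, **params):
    """Mean accuracy over k_folds held-out folds for one parameter setting."""
    scores = []
    for held_out in k_fold_indices(len(data), k_folds):
        held = set(held_out)
        train = [data[i] for i in range(len(data)) if i not in held]
        test = [data[i] for i in held_out]
        correct = sum(knn_predict(train, x, **params) == y for x, y in test)
        scores.append(correct / len(test))
    return mean(scores)


def grid_search(data, grid, k_folds=5):
    """Try every combination in the grid and keep the best mean CV accuracy."""
    best_params, best_score = None, -1.0
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cross_val_accuracy(data, k_folds, **params)
        if score > best_score:  # ties keep the earlier combination
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical toy data: ((url_length, dot_count), label), label 1 = phishing.
LEGIT = [((20 + i, 1 + i % 2), 0) for i in range(10)]
PHISHING = [((70 + 3 * i, 4 + i % 3), 1) for i in range(10)]
DATA = LEGIT + PHISHING

# Hypothetical grid: neighbour count k and Minkowski order p.
GRID = {"k": [1, 3, 5], "p": [1, 2]}

best_params, best_score = grid_search(DATA, GRID)
print(best_params, best_score)
```

The same loop structure applies to any learner: only `knn_predict` and the grid change. In practice the grid would hold the hyperparameters of the chosen algorithm, and the score could be any of the metrics the abstract reports rather than accuracy alone.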
|
| Keywords |
Machine Learning; URL Phishing; Support Vector Machine; Hyperparameter; Grid Search
|
| Page number |
161-181
|
| ISSN |
ISSN 3027-7280 (Online)
|
| DOI |
|
| ORCID_ID |
0009-0000-3031-9985
|
| Article file |
https://mitij.mju.ac.th/ARTICLE/R69059.pdf
|