Comparative Analysis of K-Nn, Naïve Bayes, and logistic regression for credit card fraud detection

Kavita Arora; Sonal Pathak; Nguyen Thi Dieu Linh

doi:10.16925/2357-6014.2023.03.05

Research Articles

Dr. Kavita Arora

Manav Rachna International Institute of Research and Studies

Dr. Sonal Pathak

Manav Rachna International Institute of Research and Studies

Nguyen Thi Dieu Linh

Hanoi University of Industry

Introduction: This paper presents the results of a comparative study of three machine learning algorithms, namely K-NN, Naïve Bayes, and Logistic Regression, for credit card fraud detection. The study was carried out in 2022-23 at Manav Rachna International Institute of Research and Studies on a dataset taken from UCI.com.

Problem: Credit card fraud remains widespread, and its methods are increasingly varied. Fraud cases frequently cause banks and financial institutions irreparable harm that cannot be compensated in monetary terms. To prevent the various forms of credit card fraud, we must be able to identify the methods fraudsters most often use. This scheme uses machine learning techniques to provide such financial institutions and banks with complete and appropriate information, not only about the methods that fraudsters often use but also about ways to protect against such fraud.

Objective: The present paper discusses machine learning models based on classification and regression, namely K-Nearest Neighbors, Naïve Bayes, and Logistic Regression. Logistic Regression achieves a classification accuracy of 80%, with a Precision of 78%, Recall of 100%, and F1-Score of 88% for fraudulent credit card transactions.
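As a quick consistency check on the reported figures, the F1-Score can be recomputed from Precision and Recall as their harmonic mean. The sketch below is illustrative only; the function name `f1_from_precision_recall` is an assumption and is not taken from the paper.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1-Score)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract for Logistic Regression on the fraud class.
reported_precision = 0.78
reported_recall = 1.00

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")
```

With a Precision of 0.78 and a Recall of 1.00, the harmonic mean is about 0.876, which rounds to the 88% stated above.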

Methodology: The comparative analysis demonstrates that, on the Precision, Recall, and Accuracy parameters, K-Nearest Neighbor is a better approach for detecting fraudulent transactions than Logistic Regression and Naïve Bayes.
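For readers unfamiliar with K-Nearest Neighbor, the idea can be sketched in a few lines: a transaction receives the majority label of the k training transactions closest to it. This is a minimal illustration, not the authors' implementation; the two features (scaled amount and scaled hour of day), the toy data, and the function name `knn_predict` are all hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query point.
    ranked = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], query))
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, already-scaled features: (amount, hour of day).
train_X = [(0.10, 0.50), (0.20, 0.40), (0.15, 0.60),
           (0.90, 0.05), (0.95, 0.10), (0.85, 0.00)]
train_y = ["valid", "valid", "valid", "fraud", "fraud", "fraud"]

print(knn_predict(train_X, train_y, (0.88, 0.08)))  # close to the fraud cluster
print(knn_predict(train_X, train_y, (0.12, 0.55)))  # close to the valid cluster
```

Because the vote depends on raw distances, features would normally need to be scaled to comparable ranges first; the toy data above assumes that has already been done.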

Results: Accuracy is marginally higher for Logistic Regression, but the False Positive parameters fail to account for the imbalanced data and therefore distort the results and accuracy of Logistic Regression; K-Nearest Neighbor is deemed a better fit for such cases.
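The effect of class imbalance on accuracy can be illustrated with a small worked example. The confusion-matrix counts below are hypothetical and are not results from the paper; they only show how a model that never flags fraud can still score a high accuracy with zero false positives.

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, and recall for the positive (fraud) class."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical data: 1,000 transactions, 10 of them fraudulent.
# Model A labels every transaction as valid: no false positives, no frauds caught.
always_valid = confusion_metrics(tp=0, fp=0, fn=10, tn=990)
# Model B catches 9 of the 10 frauds at the cost of 30 false alarms.
catches_fraud = confusion_metrics(tp=9, fp=30, fn=1, tn=960)

for name, m in (("always valid", always_valid), ("catches fraud", catches_fraud)):
    print(f"{name}: accuracy={m['accuracy']:.3f}, "
          f"precision={m['precision']:.3f}, recall={m['recall']:.3f}")
```

In this made-up case the first model has the higher accuracy yet a recall of zero, which is why accuracy and false-positive counts alone are a weak basis for comparing classifiers on imbalanced fraud data.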

Conclusion: This scheme describes an automated fraud classification system that uses machine learning techniques, namely K-Nearest Neighbor, Logistic Regression, and Naïve Bayes, to produce a model that can distinguish valid from invalid credit card transactions.

Originality: In this research, the most relevant features are used, accuracy is visualized with the confusion matrix, and accuracy calculations are obtained from the dataset used.

Limitations: Deep learning techniques could have been used to obtain even better results.

Keywords: Naïve Bayes, K-Nearest Neighbor, fraud detection, logistic regression, machine learning

How to Cite

K. Arora, S. Pathak, and N. T. Dieu Linh, “Comparative Analysis of K-Nn, Naïve Bayes, and logistic regression for credit card fraud detection”, ing. Solidar, vol. 19, no. 3, pp. 1–22, Sep. 2023, doi: 10.16925/2357-6014.2023.03.05.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Cession of rights and ethical commitment

As the author of the article, I declare that it is an original, unpublished work created exclusively by me, that it has not been submitted for simultaneous evaluation by another publication, and that there is no impediment of any kind to the concession of the rights provided for in this contract.

In this sense, I commit to awaiting the result of the evaluation by the journal Ingeniería Solidaria before considering its submission to another medium; if that publication's response is positive, I additionally commit to answering for any action involving claims, plagiarism, or any other kind of claim that could be made by third parties.

At the same time, as the author or co-author, I declare that I am completely in agreement with the conditions presented in this work and that I cede all patrimonial rights, in other words, regarding reproduction, public communication, distribution, dissemination, transformation, making it available and all forms of exploitation of the work using any medium or procedure, during the term of the legal protection of the work and in every country in the world, to the Universidad Cooperativa de Colombia Press.

Issue

Vol. 19 No. 3 (2023)

Published

2023-09-22
