Imbalanced credit card fraud detection data: A solution based on hybrid neural network and clustering-based undersampling technique

Huajie Huang, Bo Liu, Xiaoyu Xue, Jiuxin Cao, Xinyi Chen

Applied Soft Computing (2024)

Abstract
With rapid economic development, the credit card business has enjoyed sustained growth, and fraud has become correspondingly frequent. In recent years, intelligent techniques have been applied to fraud detection, but their reliability still leaves considerable room for improvement. Most existing studies design models around transaction information only; however, a user's background information and economic status may help identify abnormal behavior. In view of this, we extract valuable features from individual and transaction information that reflect personal background and economic status. Meanwhile, to address both fraud detection and class imbalance, we construct a fraud detection framework that learns user features and transaction features: a hybrid neural network with a clustering-based undersampling technique on identity and transaction features (HNN-CUHIT). To test the performance of HNN-CUHIT in credit card fraud detection, we conduct experiments on a real dataset from a city bank during the SARS-CoV-2 pandemic in 2020. In the class-imbalance experiment, model performance is best when the ratio of normal to fraud samples is 1:1; at that ratio, the F1-score is 0.0572 for HNN-CUHIT and 0.0454 for CNN with ROS. In the fraud detection experiment, HNN-CUHIT achieves the best F1-score, 0.0416, compared with 0.0360, 0.0284, and 0.0396 for LR, RF, and CNN, respectively. According to the experimental results, HNN-CUHIT outperforms the other machine learning models both in handling class imbalance and in fraud detection. Our work provides a new approach to detecting credit card fraud in the finance field.
Keywords
Credit Card Fraud Detection, Imbalanced Class Problem, Clustering-based Undersampling, User Feature
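
The abstract does not say which clustering algorithm or sampling rule HNN-CUHIT uses, so the following is only a minimal sketch of the general idea of clustering-based undersampling. It assumes plain k-means on the majority (normal) class and keeps a share of each cluster proportional to its size. The names `kmeans_labels` and `cluster_undersample`, the parameter `k`, and the proportional rule are all hypothetical choices for illustration, not the authors' method.

```python
import random


def _dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points, k, iters=20, seed=0):
    """Plain k-means (an assumed stand-in clustering step); returns one label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: _dist2(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels


def cluster_undersample(majority, n_keep, k=3, seed=0):
    """Keep n_keep majority samples, drawn from each cluster in proportion to its size."""
    total = len(majority)
    if not 0 < n_keep <= total:
        raise ValueError("n_keep must be between 1 and len(majority)")
    k = min(k, total)
    labels = kmeans_labels(majority, k, seed=seed)
    clusters = [[i for i, lab in enumerate(labels) if lab == c]
                for c in range(k)]
    # Largest-remainder allocation, in integer arithmetic, so quotas sum to n_keep.
    quotas = [(n_keep * len(m)) // total for m in clusters]
    remainders = [(n_keep * len(m)) % total for m in clusters]
    leftover = n_keep - sum(quotas)
    by_remainder = sorted(range(k), key=lambda c: remainders[c], reverse=True)
    for c in by_remainder[:leftover]:
        quotas[c] += 1
    rng = random.Random(seed)
    kept = []
    for members, quota in zip(clusters, quotas):
        kept.extend(rng.sample(members, quota))
    return [majority[i] for i in sorted(kept)]


# Synthetic two-feature data, purely illustrative: 200 normal rows, 10 fraud rows.
_rng = random.Random(1)
normal = [(_rng.gauss(0, 1), _rng.gauss(0, 1)) for _ in range(200)]
fraud = [(_rng.gauss(3, 1), _rng.gauss(3, 1)) for _ in range(10)]

# Undersample the normal class down to the fraud count, giving a 1:1 ratio.
balanced_normal = cluster_undersample(normal, n_keep=len(fraud), k=4)
```

Sampling within clusters, rather than uniformly over the whole majority class, is meant to keep the reduced normal set spread across the different regions of normal behavior. Setting `n_keep` to the fraud count mirrors the 1:1 ratio that the abstract reports as optimal.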