Selecting Features for Detecting Credit Card Fraud Using SHAP Values

Author	Huanjing Wang
Co-Author(s)	Taghi M. Khoshgoftaar; Qianxin Liang
Abstract	Credit card fraud detection is essential not only for protecting customers and financial institutions but also for maintaining the trust and reliability of the entire financial system. Machine learning techniques play a central role in this task, offering powerful tools for identifying fraudulent transactions accurately and efficiently. This study employs a feature selection technique based on SHAP (SHapley Additive exPlanations) values, in which the top features are selected according to their SHAP values. Various classification models are investigated, including Decision Tree, Random Forest, XGBoost, and Logistic Regression. Models are evaluated using the Area Under the Precision-Recall Curve (AUPRC) metric. All experiments are conducted on the Kaggle Credit Card Fraud Detection Dataset. In our investigation, Decision Tree serves as the learner in the SHAP-value-based feature selection process. The fraud detection models built with Random Forest and XGBoost outperform Decision Tree, which in turn outperforms Logistic Regression. Our findings indicate that the classifier used in the model construction phase does not necessarily have to match the learner used in the feature selection stage.
Keywords	SHAP, Feature Selection, Credit Card Fraud Detection, Machine Learning

		Article #: RQD2024-144

Proceedings of 29th ISSAT International Conference on Reliability & Quality in Design
August 8-10, 2024

	International Society of Science and Applied Technologies
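
The two building blocks named in the abstract, ranking features by SHAP values and scoring a model with AUPRC, can be illustrated with a minimal, dependency-free sketch. Everything here is an assumption made for illustration and is not taken from the paper: the hand-written `toy_tree` stands in for a trained Decision Tree, the four-row dataset is invented, and the Shapley values are computed by brute-force subset enumeration, which is only feasible for a handful of features. In practice one would more likely rely on a library explainer for tree models and a library implementation of average precision.

```python
from itertools import combinations
from math import factorial


def toy_tree(x):
    """Hypothetical hand-written decision tree returning a fraud score.

    It splits on features 0 and 1 and never looks at feature 2.
    """
    if x[0] > 0.5:
        return 0.9 if x[1] > 0.5 else 0.6
    return 0.1


def shapley_values(model, x, background):
    """Exact Shapley values for model(x); absent features are filled
    in from each background row and the outputs are averaged."""
    d = len(x)

    def value(subset):
        total = 0.0
        for b in background:
            z = [x[j] if j in subset else b[j] for j in range(d)]
            total += model(z)
        return total / len(background)

    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for s in combinations(others, size):
                s = set(s)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def rank_features(model, rows, background):
    """Rank features by mean absolute Shapley value, most important first."""
    d = len(rows[0])
    importance = [0.0] * d
    for x in rows:
        for j, p in enumerate(shapley_values(model, x, background)):
            importance[j] += abs(p) / len(rows)
    order = sorted(range(d), key=lambda j: importance[j], reverse=True)
    return order, importance


def average_precision(y_true, scores):
    """Average precision, a step-wise estimate of AUPRC.

    Rows with tied scores are handled together as one threshold.
    """
    n_pos = sum(y_true)
    ap, tp, seen, prev_recall = 0.0, 0, 0, 0.0
    for t in sorted(set(scores), reverse=True):
        group = [y for s, y in zip(scores, y_true) if s == t]
        tp += sum(group)
        seen += len(group)
        recall = tp / n_pos
        ap += (recall - prev_recall) * (tp / seen)
        prev_recall = recall
    return ap


# Invented toy data: three features, label 1 = fraud.
rows = [
    [0.9, 0.8, 0.3],
    [0.7, 0.2, 0.9],
    [0.1, 0.9, 0.5],
    [0.2, 0.1, 0.1],
]
labels = [1, 1, 0, 0]

order, importance = rank_features(toy_tree, rows, background=rows)
top_k = order[:2]  # keep the two highest-ranked features
ap = average_precision(labels, [toy_tree(x) for x in rows])
print("feature ranking:", order)
print("selected features:", top_k)
print("AUPRC (average precision):", ap)
```

Because `toy_tree` never reads feature 2, that feature's Shapley value is zero for every row and it falls to the bottom of the ranking. The selected subset `top_k` could then be passed to any classifier, which mirrors the abstract's point that the learner used for feature selection need not be the one used to build the final model.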