Credit Card Fraud Detection XGBOOST Guide


Overview
Loading of Libraries
Split Data
Imbalance Dataset
Setting Parameters for XGBoost
Train Model
Predict
Confusion Matrix
Importance Matrix
ROC Curve

Credit Card Fraud Detection - XGBOOST ∗

Javier Ng EY

The dataset contains transactions made by credit cards in September 2013 by European cardholders. It covers transactions that occurred over two days, with 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.

Keywords: xgboost, machine learning, R

Overview

Loading of Libraries

Split Data

Split data into train, test and cv using caret.
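The code for this step did not survive in this copy. As a language-agnostic illustration, written in Python rather than the R/caret used in the guide, a stratified three-way split could be sketched as below. The function name `stratified_split`, the 60/20/20 fractions, the seed, and the toy labels are all assumptions for illustration, not the author's settings.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=42):
    """Return (train, test, cv) index lists that each preserve the class ratio."""
    rng = random.Random(seed)

    # Group row indices by class so each class is split separately.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test, cv = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = int(round(fractions[0] * len(indices)))
        n_test = int(round(fractions[1] * len(indices)))
        train.extend(indices[:n_train])
        test.extend(indices[n_train:n_train + n_test])
        cv.extend(indices[n_train + n_test:])  # remainder goes to cv
    return train, test, cv

# Toy labels (hypothetical): 1 = fraud, 0 = not fraud, 2% positives.
labels = [1] * 20 + [0] * 980
train_idx, test_idx, cv_idx = stratified_split(labels)
```

Splitting each class separately matters here because, with so few fraud cases, a purely random split could leave one partition with almost no positives.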

∗PDF is available on EY FDA shared drive. Current version: March 11, 2019; Corresponding author: javier.ng@sg.ey.com.

Imbalance Dataset

Using SMOTE to make the dataset more balanced.
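The core idea of SMOTE is to create synthetic minority-class rows by interpolating between a minority point and one of its nearest minority neighbours. The following is a minimal Python sketch of that idea only (the guide uses R, and its actual SMOTE call and settings are not preserved here). The function name `smote`, the value of `k`, and the toy points are assumptions.

```python
import math
import random

def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority points by interpolating toward nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

# Toy minority-class points (hypothetical two-feature data).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_synthetic=6, k=2)
```

Because every synthetic point lies on a segment between two real minority points, the new rows stay inside the region the minority class already occupies. Oversampling is normally applied to the training partition only, so that evaluation data remains untouched.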

Setting Parameters for XGBoost

Train Model

Train the model with the parameters set above and nrounds = 25 (increasing nrounds further does not improve the model). The plots show train and cv AUC increasing in the early rounds and stagnating in later rounds, as expected.
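The observation that more rounds stop helping can be turned into a simple stopping rule: track the cv metric per round and stop once it has not improved for a few rounds. The Python sketch below illustrates that rule. The function name `best_round`, the `patience` value, and the AUC numbers are hypothetical and are not results from the guide.

```python
def best_round(cv_auc, patience=3, min_delta=0.0):
    """Return (round, score) for the best cv AUC, scanning rounds in order and
    stopping once `patience` rounds have passed without an improvement."""
    best_idx, best_score = 0, float("-inf")
    for idx, score in enumerate(cv_auc):
        if score > best_score + min_delta:
            best_idx, best_score = idx, score
        elif idx - best_idx >= patience:
            break  # no improvement for `patience` rounds
    return best_idx + 1, best_score  # rounds are reported 1-based

# Hypothetical per-round cv AUC values.
cv_auc = [0.90, 0.93, 0.95, 0.96, 0.96, 0.959, 0.96, 0.958]
round_no, score = best_round(cv_auc)
```

XGBoost's own `early_stopping_rounds` option applies a comparable rule during training, which avoids choosing nrounds by hand.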

Predict

Predict on the dataset and set a classification threshold of 0.5.
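Applying the threshold turns each predicted probability into a class label. A minimal Python sketch follows. The function name `classify` and the probabilities are hypothetical, and treating a score of exactly 0.5 as "not fraud" is an assumption, since the guide does not say how ties are handled.

```python
def classify(probabilities, threshold=0.5):
    """Label a transaction as fraud (1) when its predicted probability exceeds the threshold."""
    return [1 if p > threshold else 0 for p in probabilities]

# Hypothetical predicted fraud probabilities.
probs = [0.02, 0.51, 0.49, 0.97, 0.50]
predicted = classify(probs)
```

Lowering the threshold flags more transactions as fraud, trading more false alarms for fewer missed frauds.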

Confusion Matrix

Sensitivity (TPR) = 0.9991, Specificity (TNR) = 0.9135, Accuracy = 0.9989

[Figure: Confusion matrix, True vs. Predicted class (Fraud, Not Fraud). Title: XGBoost with min logloss at round: 139. Sensitivity: 0.9991, Specificity: 0.9135. Cell labels as extracted: 568039, 5495.]
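The three metrics follow directly from the four cells of the confusion matrix. The Python sketch below shows the arithmetic. The cell counts are an assumed reading of the fused figure labels "568039" and "5495" as 56803, 9, 54 and 95, with "Not Fraud" treated as the positive class. That reading is an inference rather than something the guide states, although it does reproduce the reported figures.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy

# Assumed cell counts, positive class = "Not Fraud":
#   tp: not fraud predicted as not fraud   fn: not fraud predicted as fraud
#   fp: fraud predicted as not fraud       tn: fraud predicted as fraud
tp, fn, fp, tn = 56803, 54, 9, 95
sens, spec, acc = confusion_metrics(tp, fn, fp, tn)
print(round(sens, 4), round(spec, 4), round(acc, 4))  # 0.9991 0.9135 0.9989
```

Under this reading, 95 of 104 fraud cases are caught, so the figure reported as specificity is the share of frauds detected. With a class this rare, that number is more informative than the overall accuracy.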

Importance Matrix

Identify features that are most important

[Figure: feature importance bar chart. Features as listed in the plot, in extraction order: V28, V16, V27, V22, V24, V25, V21, V20, V15, V19, V23, Time, V18, V13, Amount, V17, V26, V12, V10, V11, V14. Importance axis from 0.0 to 0.6.]
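In XGBoost, gain-based importance is the share of the model's total split gain that each feature contributes. The Python sketch below shows that aggregation on made-up split records. The function name `gain_importance` and the gain values are hypothetical, and the feature names are borrowed from the plot purely for illustration.

```python
from collections import defaultdict

def gain_importance(splits):
    """Sum gain per feature, normalise to fractions of the total, sort descending."""
    totals = defaultdict(float)
    for feature, gain in splits:
        totals[feature] += gain
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(feature, gain / grand_total) for feature, gain in ranked]

# Hypothetical (feature, gain) records, one per tree split.
splits = [("V14", 6.0), ("V10", 2.0), ("V14", 1.0), ("Amount", 1.0)]
importance = gain_importance(splits)
```

The fractions sum to 1, which matches the 0.0 to 0.6 scale on the importance axis: a single feature can account for more than half of the total.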

ROC Curve

AUC = 0.956

[Figure: ROC Curve w/ Thresholds. x-axis: Average false positive rate (0.0 to 1.0); y-axis: Average true positive rate (0.0 to 1.0); threshold labels from 0.05 to 0.95.]
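The AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The Python sketch below computes it that way on a toy example. The function name `roc_auc` and the data are hypothetical, and the pairwise loop is written for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical labels and scores).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)  # 3 of 4 pairs ranked correctly -> 0.75
```

Unlike accuracy, this measure does not depend on the 0.5 threshold or on the class ratio, which is why it is a common summary for imbalanced problems.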
