
Credit Card Fraud Detection
In this project I built machine learning models to identify fraud in European credit card transactions. I also made several data visualizations to reveal patterns and structure in the data. I was able to accurately identify fraudulent transactions using a random forest model. I also calculated mutual information values to identify the variables most strongly associated with fraud. On a test set consisting of 20% of the original data, the predictions from the random forest model had an F1 score of 0.869 and a Matthews correlation coefficient (MCC) of 0.869. I also trained logistic regression and linear support vector classifier models, but both underperformed the random forest. To tune a given model, I ran a grid search over its hyperparameters with 5-fold cross-validation.
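The workflow described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the project's actual code: the data here is a synthetic, imbalanced stand-in generated with `make_classification`, and the parameter grid, the random seeds, the stratified split, and the choice to tune the random forest with F1 as the search metric are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics import f1_score, matthews_corrcoef
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced stand-in for the transaction data (assumption:
# the real dataset is not reproduced here). Roughly 5% positive class.
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=4,
    weights=[0.95, 0.05],
    random_state=0,
)

# Hold out 20% of the data as a test set; stratifying on the label is an
# assumption that keeps the rare class represented in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Mutual information between each feature and the label, estimated on
# the training data only. Higher values indicate stronger dependence.
mi = mutual_info_classif(X_train, y_train, random_state=0)
ranked_features = sorted(range(len(mi)), key=lambda i: mi[i], reverse=True)

# Grid search with 5-fold cross-validation. The grid below is a small
# hypothetical example, not the grid used in the project.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="f1",
    cv=5,
)
search.fit(X_train, y_train)

# Evaluate the refit best model on the held-out test set.
y_pred = search.predict(X_test)
f1 = f1_score(y_test, y_pred)
mcc = matthews_corrcoef(y_test, y_pred)

print("Features ranked by mutual information:", ranked_features)
print("Best parameters:", search.best_params_)
print(f"Test F1: {f1:.3f}, test MCC: {mcc:.3f}")
```

F1 and MCC are used here rather than accuracy because fraud is rare: a model that labels every transaction as legitimate would score high accuracy while catching nothing. The scores printed by this sketch come from synthetic data and are not comparable to the results reported above.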
See Project