Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques: A Guide to Data Science for Fraud Detection (Wiley and SAS Business Series) by Baesens Bart & Van Vlasselaer Veronique & Verbeke Wouter
Author: Baesens, Bart & Van Vlasselaer, Veronique & Verbeke, Wouter
Language: eng
Format: azw3
ISBN: 9781119146834
Publisher: Wiley
Published: 2015-07-26T16:00:00+00:00
Figure 4.37 Calculating Predictions Using a Cut-Off
A confusion matrix can now be calculated as shown in Table 4.5.
Table 4.5 Confusion Matrix
                                         Actual status
                                         Positive (Fraud)         Negative (No Fraud)
Predicted status   Positive (Fraud)      True Positive (John)     False Positive (Sophie)
                   Negative (No Fraud)   False Negative (David)   True Negative (Emma, Bob)
Based on this matrix, one can now calculate the following performance measures:
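In terms of the four cells of Table 4.5, the standard definitions of these measures can be written as below. The abbreviations TP, FP, FN, and TN (the counts of true positives, false positives, false negatives, and true negatives) are shorthand assumed here.

```latex
\begin{align*}
\text{Classification accuracy} &= \frac{TP + TN}{TP + FP + FN + TN} \\
\text{Classification error} &= \frac{FP + FN}{TP + FP + FN + TN} \\
\text{Sensitivity (recall, hit rate)} &= \frac{TP}{TP + FN} \\
\text{Specificity} &= \frac{TN}{FP + TN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{F-measure} &= \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align*}
```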
The classification accuracy is the percentage of correctly classified observations. The classification error is its complement and is also referred to as the misclassification rate. The sensitivity, recall, or hit rate measures how many of the fraudsters are correctly labeled by the model as fraudsters. The specificity measures how many of the nonfraudsters are correctly labeled by the model as nonfraudsters. The precision indicates how many of the predicted fraudsters are actually fraudsters. The F-measure is the harmonic mean of precision and recall.
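As a minimal sketch, these measures can be computed for the five customers in Table 4.5. The counts are read directly from the table; the function name `classification_measures` and the convention of returning 0 when a denominator is 0 are assumptions made for illustration.

```python
# Counts read off Table 4.5: John is a true positive, Sophie a false positive,
# David a false negative, and Emma and Bob are true negatives.
tp, fp, fn, tn = 1, 1, 1, 2


def classification_measures(tp, fp, fn, tn):
    """Return the cut-off dependent measures for one confusion matrix.

    Assumption: a ratio whose denominator is 0 is reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    # F-measure: harmonic mean of precision and recall (sensitivity).
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "error": 1 - accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


measures = classification_measures(tp, fp, fn, tn)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

For Table 4.5 this gives an accuracy of 60 percent, an error of 40 percent, a sensitivity and precision of 50 percent each, a specificity of about 67 percent, and an F-measure of 0.5.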
Note that all these classification measures depend on the cut-off. For example, for a cut-off of 0, the classification accuracy becomes 40 percent, the error 60 percent, the sensitivity 100 percent, the specificity 0, the precision 40 percent, and the F-measure 0.57. For a cut-off of 1, the accuracy becomes 60 percent, the error 40 percent, the sensitivity 0, the specificity 100 percent, and both the precision and the F-measure 0. Given this dependence, it would be useful to have a performance measure that is independent of the cut-off. One could construct a table with the sensitivity, specificity, and 1-specificity for various cut-offs, as shown in Table 4.6.
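The construction of such a table can be sketched as follows. The actual labels come from Table 4.5, but the fraud scores are hypothetical values chosen only so that a cut-off of 0.5 reproduces that table. The rule "predict fraud when the score exceeds the cut-off" and the helper name `rates_at` are likewise assumptions.

```python
# Actual status from Table 4.5 (True = fraud).
actual = {"John": True, "Sophie": False, "David": True, "Emma": False, "Bob": False}

# Hypothetical scores: illustrative only, picked to match Table 4.5 at cut-off 0.5.
scores = {"John": 0.72, "Sophie": 0.56, "David": 0.44, "Emma": 0.18, "Bob": 0.36}


def rates_at(cutoff):
    """Return (sensitivity, specificity, 1 - specificity) for one cut-off.

    Assumption: a customer is predicted as fraud when score > cutoff.
    """
    tp = sum(1 for n in actual if actual[n] and scores[n] > cutoff)
    fn = sum(1 for n in actual if actual[n] and scores[n] <= cutoff)
    tn = sum(1 for n in actual if not actual[n] and scores[n] <= cutoff)
    fp = sum(1 for n in actual if not actual[n] and scores[n] > cutoff)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, 1 - specificity


# One row per cut-off, in the layout of a table for ROC analysis.
print("cut-off  sensitivity  specificity  1-specificity")
for cutoff in (0.0, 0.25, 0.5, 0.75, 1.0):
    sens, spec, one_minus_spec = rates_at(cutoff)
    print(f"{cutoff:7.2f}  {sens:11.2f}  {spec:11.2f}  {one_minus_spec:13.2f}")
```

The rows for cut-offs 0 and 1 reproduce the extremes quoted in the text: sensitivity 100 percent with specificity 0, and sensitivity 0 with specificity 100 percent.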
Table 4.6 Table for ROC Analysis