Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques by Baesens Bart & Van Vlasselaer Veronique & Verbeke Wouter
Author: Baesens, Bart & Van Vlasselaer, Veronique & Verbeke, Wouter
Language: eng
Format: epub
ISBN: 9781119146834
Publisher: Wiley
Published: 2015-07-26T16:00:00+00:00
Figure 4.37 Calculating Predictions Using a Cut-Off
A confusion matrix can now be calculated as shown in Table 4.5.
Table 4.5 Confusion Matrix
Predicted Status    | Actual Status: Positive (Fraud) | Actual Status: Negative (No Fraud)
Positive (Fraud)    | True Positive (John)            | False Positive (Sophie)
Negative (No Fraud) | False Negative (David)          | True Negative (Emma, Bob)
Based on this matrix, one can now calculate the following performance measures:
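With TP, FP, FN, and TN denoting the number of true positives, false positives, false negatives, and true negatives in Table 4.5, the standard definitions of these measures can be written as follows. This is a sketch based on the usual textbook forms. In particular, the F-measure is assumed to be the harmonic mean of precision and recall, which is consistent with the value of 0.57 reported for a cut-off of 0.

```latex
\begin{align*}
\text{Classification accuracy} &= \frac{TP + TN}{TP + FP + FN + TN} \\
\text{Classification error} &= \frac{FP + FN}{TP + FP + FN + TN} \\
\text{Sensitivity} = \text{Recall} = \text{Hit rate} &= \frac{TP}{TP + FN} \\
\text{Specificity} &= \frac{TN}{FP + TN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{F-measure} &= \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align*}
```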
The classification accuracy is the percentage of correctly classified observations. The classification error is its complement and is also referred to as the misclassification rate. The sensitivity, recall, or hit rate measures the fraction of fraudsters that the model correctly labels as fraudsters. The specificity measures the fraction of nonfraudsters that the model correctly labels as nonfraudsters. The precision indicates the fraction of predicted fraudsters that are actually fraudsters. The F-measure is the harmonic mean of precision and recall.
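As a minimal sketch, the measures can be computed in Python for the five customers of Table 4.5. The dictionaries `actual` and `predicted` simply transcribe that table. The function names, and the convention of returning 0 when a denominator is 0, are assumptions made for this illustration.

```python
# Labels transcribed from Table 4.5 (1 = fraud, 0 = no fraud).
actual = {"John": 1, "Sophie": 0, "David": 1, "Emma": 0, "Bob": 0}
predicted = {"John": 1, "Sophie": 1, "David": 0, "Emma": 0, "Bob": 0}


def confusion_counts(actual, predicted):
    """Return the counts (TP, FP, FN, TN) for two label dictionaries."""
    tp = sum(1 for k in actual if actual[k] == 1 and predicted[k] == 1)
    fp = sum(1 for k in actual if actual[k] == 0 and predicted[k] == 1)
    fn = sum(1 for k in actual if actual[k] == 1 and predicted[k] == 0)
    tn = sum(1 for k in actual if actual[k] == 0 and predicted[k] == 0)
    return tp, fp, fn, tn


def ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is 0 (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def classification_measures(tp, fp, fn, tn):
    """Compute the performance measures from the confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, total),
        "error": ratio(fp + fn, total),
        "sensitivity": recall,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        # F-measure taken as the harmonic mean of precision and recall.
        "f_measure": ratio(2 * precision * recall, precision + recall),
    }


counts = confusion_counts(actual, predicted)
for name, value in classification_measures(*counts).items():
    print(f"{name}: {value:.2f}")
```

For Table 4.5 the counts are TP = 1, FP = 1, FN = 1, and TN = 2, so the accuracy is 3/5 and both sensitivity and precision are 1/2.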
Note that all these classification measures depend on the cut-off. For example, for a cut-off of 0 (1), the classification accuracy becomes 40 percent (60 percent), the error 60 percent (40 percent), the sensitivity 100 percent (0), the specificity 0 (100 percent), the precision 40 percent (0), and the F-measure 0.57 (0). Given this dependence, it is desirable to have a performance measure that is independent of the cut-off. One could construct a table with the sensitivity, specificity, and 1-specificity for various cut-offs, as shown in Table 4.6.
Table 4.6 Table for ROC Analysis
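The cut-off dependence, and the construction of a table of this kind, can be sketched in Python. The fraud scores below are hypothetical: they do not come from the text and were chosen only so that an assumed cut-off of 0.50 reproduces Table 4.5. The function names and the rule "flag as fraud when the score is at least the cut-off" are likewise assumptions.

```python
# Hypothetical scores (NOT from the source), chosen so that a cut-off of 0.50
# reproduces Table 4.5. Each entry: (name, score, label), 1 = fraud.
customers = [
    ("John", 0.72, 1),
    ("Sophie", 0.56, 0),
    ("David", 0.44, 1),
    ("Emma", 0.18, 0),
    ("Bob", 0.36, 0),
]


def ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is 0 (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def measures_at(cutoff, data=customers):
    """Flag a customer as fraud when score >= cutoff; return the measures."""
    tp = fp = fn = tn = 0
    for _name, score, label in data:
        flagged = score >= cutoff
        if flagged and label == 1:
            tp += 1
        elif flagged and label == 0:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "one_minus_specificity": 1.0 - specificity,
        "precision": precision,
        "f_measure": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# One row per cut-off: the columns used for ROC analysis.
print("cut-off  sensitivity  specificity  1-specificity")
for cutoff in (0.0, 0.25, 0.5, 0.75, 1.0):
    m = measures_at(cutoff)
    print(f"{cutoff:7.2f}  {m['sensitivity']:11.2f}  "
          f"{m['specificity']:11.2f}  {m['one_minus_specificity']:13.2f}")
```

Because every hypothetical score lies strictly between 0 and 1, a cut-off of 0 flags all five customers and a cut-off of 1 flags none. These two extremes therefore depend only on the labels, and they reproduce the accuracy, sensitivity, specificity, precision, and F-measure values quoted in the text.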