Tree-Based Machine Learning Methods in SAS Viya by Dr. Sharad Saxena

Author: Dr. Sharad Saxena
Language: eng
Format: epub
Publisher: SAS Institute
Published: 2022-02-21


Figure 5.17: Event Classification in Confusion Matrix

The counts in the confusion matrix need to be adjusted if the class proportions in the training sample (ρ_j) differ from those in the population (π_j). This is shown on the right side of Figure 5.17. It is common practice to balance the classes in the training sample, particularly when the outcome is rare.

A confusion matrix presupposes an allocation (decision) rule. A primitive rule allocates cases to the class with the greatest posterior probability. For binary targets, this corresponds to a 50% cutoff on the posterior probability.
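The allocation rule described above can be sketched in a few lines of Python. This is only an illustration, not Model Studio code: the function names `allocate` and `allocate_binary` are hypothetical, and the tie-breaking behavior (ties go to the first class, or to the non-event) is an assumption that the text does not specify.

```python
def allocate(posteriors):
    """Return the index of the class with the greatest posterior probability.

    Ties are resolved in favor of the lowest class index (an assumption).
    """
    return max(range(len(posteriors)), key=lambda j: posteriors[j])


def allocate_binary(p_event, cutoff=0.5):
    """Classify as the event (1) when the event posterior exceeds the cutoff."""
    return 1 if p_event > cutoff else 0


# For a binary target, the two rules agree:
# posteriors are listed as [P(non-event), P(event)].
p_event = 0.7
print(allocate([1 - p_event, p_event]), allocate_binary(p_event))
```

Changing `cutoff` yields a different allocation rule and therefore a different confusion matrix for the same model.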

Model Studio automatically adjusts assessment measures, assessment graphs, and prediction estimates for this sampling bias. After running the pipeline, you can examine the score code. The score code contains a section titled Adjust Posterior Probabilities. This code block modifies the posterior probability by multiplying it by the ratio of the actual (population) probability to the event-based sampling proportion specified above.
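The following Python sketch illustrates the general idea of such an adjustment for a binary target. It is not the score code that Model Studio generates. It assumes the standard prior correction: each class posterior is multiplied by the ratio of its population proportion (π) to its training-sample proportion (ρ), and the results are rescaled so that they sum to 1. The rescaling step and the names `adjust_posterior`, `pi_event`, and `rho_event` are assumptions made for illustration.

```python
def adjust_posterior(p_event, pi_event, rho_event):
    """Correct an event posterior estimated on a resampled training set.

    p_event   : posterior probability of the event from the model
    pi_event  : proportion of events in the population
    rho_event : proportion of events in the training sample
    """
    w_event = pi_event / rho_event                  # ratio for the event class
    w_nonevent = (1 - pi_event) / (1 - rho_event)   # ratio for the non-event class
    numerator = p_event * w_event
    # Rescale so the two adjusted posteriors sum to 1 (assumed step).
    return numerator / (numerator + (1 - p_event) * w_nonevent)


# A model trained on a balanced sample (50% events) when the population
# has 5% events: a raw posterior of 0.5 shrinks substantially.
print(adjust_posterior(0.5, pi_event=0.05, rho_event=0.5))
```

When the sample and population proportions are equal, both ratios are 1 and the posterior is returned unchanged.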

Tree Accuracy

The accuracy of a tree can be calculated as a weighted average of the accuracy in each leaf. The weights are the node sizes as shown in Figure 5.18.

Accuracy is obtained by multiplying the proportion of observations falling into each leaf by the proportion of those correctly classified in the leaf, and then summing these products over all leaves.

For example, consider the training accuracy of the tree shown in Figure 5.19. In this tree, 42% of the observations fall into the left branch and all are classified as a 1. Of those, 85% are classified correctly. Consequently, the left branch contributes (0.42)(0.85) to the accuracy. Similarly, 58% of the observations fall into the right branch and all are classified as a 0. Of those, 91% are classified correctly, so the right branch contributes (0.58)(0.91). The training accuracy of the tree is (0.42)(0.85) + (0.58)(0.91) ≈ 0.88. The validation accuracy is calculated in a similar manner.
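The weighted-average calculation can be checked with a short Python sketch. The leaf values are the ones quoted in the example; the function name `tree_accuracy` and the list-of-pairs layout are hypothetical choices made for illustration.

```python
def tree_accuracy(leaves):
    """Weighted average of per-leaf accuracy.

    leaves: iterable of (share_of_observations, leaf_accuracy) pairs,
            where the shares are expected to sum to 1.
    """
    return sum(share * accuracy for share, accuracy in leaves)


# (share of observations, proportion correctly classified) per leaf
leaves = [
    (0.42, 0.85),  # left branch, classified as 1
    (0.58, 0.91),  # right branch, classified as 0
]

print(round(tree_accuracy(leaves), 2))  # 0.88
```

The same function applies to validation data by substituting the validation shares and per-leaf accuracies.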

Figure 5.18: Accuracy in a Decision Tree


