Supervised Machine Learning with Python by Taylor Smith

Author: Taylor Smith
Language: eng
Format: mobi
Tags: COM018000 - COMPUTERS / Data Processing, COM051360 - COMPUTERS / Programming Languages / Python, COM051300 - COMPUTERS / Programming / Algorithms
Publisher: Packt
Published: 2019-05-21T08:05:29+00:00


The information gain metric is used by CART trees in a classification context. It measures the difference in the Gini or entropy before and after a split to determine whether the split "taught" us anything.

If you remember from the last section, uncertainty is essentially the level of impurity (Gini or entropy) in a node. When we compute InformationGain using either metric, our uncertainty is the metric evaluated pre-split:

def __init__(self, metric):
    # fail with a KeyError if an unsupported metric is passed
    self.crit = {'gini': gini_impurity,
                 'entropy': entropy}[metric]
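The `gini_impurity` and `entropy` functions referenced in the criterion lookup are not shown in this excerpt. A minimal sketch of both, assuming the target is a 1-D NumPy array of class labels, might look like this:

```python
import numpy as np

def gini_impurity(target):
    # Gini impurity: 1 - sum(p_k^2) over the class proportions p_k.
    # 0.0 means the node is pure; higher means more mixed.
    _, counts = np.unique(target, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(target):
    # Shannon entropy: -sum(p_k * log2(p_k)); 0.0 for a pure node.
    _, counts = np.unique(target, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))
```

For a perfectly mixed binary node such as `[0, 0, 1, 1]`, `gini_impurity` returns 0.5 and `entropy` returns 1.0, while both return 0.0 for a pure node.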

To compute uncertainty, we pass in a node and compute the Gini, for instance, on all of the samples inside the node before the split. Then, when we call the object to actually compute InformationGain, we pass in a mask indicating whether each sample goes left or right. We compute the Gini on the left and right sides, and return the information gain:

def __call__(self, target, mask, uncertainty):
    """Compute the information gain of a split."""
    # (body reconstructed) weight each child's impurity
    # by its share of the parent's samples
    left, right = target[mask], target[~mask]
    p = float(left.shape[0]) / target.shape[0]
    return uncertainty - p * self.crit(left) - (1. - p) * self.crit(right)


