Mastering pandas by Femi Anthony

Author: Femi Anthony
Language: eng
Format: epub, azw3
Publisher: Packt Publishing
Published: 2015-06-22T07:00:00+00:00


The median

The median is the value that divides the sorted data values into two halves: half of the values lie at or below it and half lie at or above it. When the number of values in the dataset is even, the median is the average of the two middle values. Compared with the mean, the median is less affected by outliers and skewed data.
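The definition can be sketched in a few lines of plain Python. The helper name median_by_hand and the sample lists below are illustrative assumptions, not taken from the book; in pandas, Series.median() should give the same results for these inputs.

```python
import statistics

def median_by_hand(values):
    """Return the median by sorting and picking the middle value(s)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical sample data, chosen only to exercise both branches.
odd_data = [3, 1, 4, 1, 5]          # sorted: [1, 1, 3, 4, 5]
even_data = [3, 1, 4, 1, 5, 9]      # sorted: [1, 1, 3, 4, 5, 9]
with_outlier = [3, 1, 4, 1, 500]    # same as odd_data, but 5 replaced by 500

odd_median = median_by_hand(odd_data)
even_median = median_by_hand(even_data)
outlier_median = median_by_hand(with_outlier)
outlier_mean = statistics.mean(with_outlier)

print(odd_median, even_median)        # middle value; average of two middle values
print(outlier_median, outlier_mean)   # the median stays put, the mean does not
```

Replacing 5 with 500 leaves the median unchanged while pulling the mean far above most of the values, which is the sense in which the median is less affected by outliers.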

The mode

The mode is the most frequently occurring value in the dataset. It is most commonly used for categorical data, to find out which category is most common. One downside of the mode is that it is not necessarily unique. A distribution with two modes is described as bimodal, and one with many modes is described as multimodal. Here is an illustration of a bimodal distribution with modes at 2 and 7, since each occurs four times in the dataset:

In [4]: import matplotlib.pyplot as plt
        %matplotlib inline

In [5]: plt.hist([7,0,1,2,3,7,1,2,3,4,2,7,6,5,2,1,6,8,9,7])
        plt.xlabel('x')
        plt.ylabel('Count')
        plt.title('Bimodal distribution')
        plt.show()
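The modes of the same dataset can also be checked numerically rather than read off the histogram. This is a minimal sketch using only the standard library; the variable names are illustrative, and in pandas Series.mode() should report the same two values.

```python
from collections import Counter
import statistics

# The same values that are passed to plt.hist in the illustration.
data = [7, 0, 1, 2, 3, 7, 1, 2, 3, 4, 2, 7, 6, 5, 2, 1, 6, 8, 9, 7]

# Count how often each value occurs.
counts = Counter(data)

# The highest count, and every value that reaches it.
highest = max(counts.values())
modes = sorted(value for value, count in counts.items() if count == highest)

print(modes, highest)

# statistics.multimode (Python 3.8+) returns all modes, in order of first appearance.
print(statistics.multimode(data))
```

Collecting every value that reaches the highest count, instead of stopping at the first one, is what makes the non-uniqueness of the mode visible.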
