Data Science With Python: The Absolute Beginners Guide To Learn Data Science With Python by George Henry

Author:George, Henry [George, Henry]
Language: eng
Format: epub
Published: 2020-01-22T16:00:00+00:00


Slicing And Indexing NumPy Arrays

Indexing lets us get a view of an ndarray by specifying either an index or a slice of the rows and columns we want to select.

A working array is defined like this:

In: import numpy as np

In: M = np.arange(10*10, dtype=int).reshape(10,10)

The array is two-dimensional, 10x10. We can begin by slicing it along one dimension; the notation is exactly the same as that used for a Python list:

[start_index_included:end_index_excluded:step]

Let’s assume that we want the even-numbered rows from 2 to 8:

In: M[2:9:2,:]

Out: array([[20, 21, 22, 23, 24, 25, 26, 27, 28, 29],

[40, 41, 42, 43, 44, 45, 46, 47, 48, 49],

[60, 61, 62, 63, 64, 65, 66, 67, 68, 69],

[80, 81, 82, 83, 84, 85, 86, 87, 88, 89]])

Once the rows have been sliced, the columns can be sliced as well, here taking the columns from index 5 onward:

In: M[2:9:2,5:]

Out: array([[25, 26, 27, 28, 29],

[45, 46, 47, 48, 49],

[65, 66, 67, 68, 69],

[85, 86, 87, 88, 89]])
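The two slices above can be checked with a short script. The sketch below rebuilds the same M and, as an addition that is not part of the original text, uses np.shares_memory to illustrate that basic slicing returns a view rather than a copy. The names rows, block and safe are illustrative only.

```python
import numpy as np

# Same working array as in the text: values 0..99 laid out as 10 rows x 10 columns.
M = np.arange(10 * 10, dtype=int).reshape(10, 10)

# Rows 2, 4, 6, 8 (start 2, stop before 9, step 2), all columns.
rows = M[2:9:2, :]

# Same rows, but only the columns from index 5 onward.
block = M[2:9:2, 5:]

print(rows.shape)   # (4, 10)
print(block.shape)  # (4, 5)

# Basic slicing gives a view: the slice shares memory with M.
print(np.shares_memory(M, block))  # True

# Writing through a view would change M, so take a copy when that is not wanted.
safe = block.copy()
safe[0, 0] = -1
print(M[2, 5])  # 25, M is unchanged because only the copy was modified
```

Because a slice is a view, an assignment such as block[0, 0] = -1 would also change M[2, 5]; the explicit copy avoids that side effect.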

As with lists, we can use negative index values to count from the end. A negative step reverses the order of the output array, as in the next example, where we start from the column at index 5 and count backward to index 0:

In: M[2:9:2,5::-1]

Out: array([[25, 24, 23, 22, 21, 20],

[45, 44, 43, 42, 41, 40],

[65, 64, 63, 62, 61, 60],

[85, 84, 83, 82, 81, 80]])
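The example above uses a negative step; negative index values themselves work in the same way as with lists. The following is a minimal sketch on the same M, with variable names (last_row, last_cols, flipped) chosen here purely for illustration.

```python
import numpy as np

M = np.arange(10 * 10, dtype=int).reshape(10, 10)

# Index -1 is the last row, exactly as with a Python list.
last_row = M[-1, :]

# The last three columns of every row.
last_cols = M[:, -3:]

# A negative step on the rows reverses their order.
flipped = M[::-1, :]

print(last_row)         # [90 91 92 93 94 95 96 97 98 99]
print(last_cols.shape)  # (10, 3)
print(flipped[0, 0])    # 90
```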

Boolean indexes can also be created to point out the rows and columns we want to select. For example, two Boolean variables, row_index and col_index, can select every row from 2 to 8 and every column from index 5:

In: row_index = (M[:,0]>=20) & (M[:,0]<=80)

col_index = M[0,:]>=5

M[row_index,:][:,col_index]

Out: array([[25, 26, 27, 28, 29],

[35, 36, 37, 38, 39],

[45, 46, 47, 48, 49],

[55, 56, 57, 58, 59],

[65, 66, 67, 68, 69],

[75, 76, 77, 78, 79],

[85, 86, 87, 88, 89]])

Boolean indexes for the rows and for the columns cannot be combined inside the same set of square brackets to select a block like this, although a Boolean index on one dimension can be combined with ordinary integer indexes or slices on the other. As a consequence, the Boolean selection is applied first to the rows; then the square brackets are reopened and a second selection is applied to that result, this time on the columns.
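The two-step selection can be compared with a one-step alternative that the text does not cover: np.ix_, which turns the two Boolean vectors into an open mesh so that they select a rectangular block together. The sketch below is an illustration under that assumption, and the names two_step and one_step are hypothetical.

```python
import numpy as np

M = np.arange(10 * 10, dtype=int).reshape(10, 10)

row_index = (M[:, 0] >= 20) & (M[:, 0] <= 80)  # True for rows 2..8
col_index = M[0, :] >= 5                       # True for columns 5..9

# Two-step selection, as described in the text: rows first, then columns.
two_step = M[row_index, :][:, col_index]

# One-step alternative (an addition, not from the text): np.ix_ accepts
# Boolean vectors and builds index arrays that select the full block.
one_step = M[np.ix_(row_index, col_index)]

print(two_step.shape)                      # (7, 5)
print(np.array_equal(two_step, one_step))  # True
```

Both forms return a new array rather than a view, since Boolean indexing always copies the selected elements.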

If a global selection is required on the array elements, a mask of Boolean values can be used, like this:

In: mask = (M>=20) & (M<=90) & ((M / 10.) % 1 >= 0.5)

M[mask]

Out: array([25, 26, 27, 28, 29, 35, 36, 37, 38, 39, 45, 46, 47, 48, 49,

55, 56, 57, 58, 59, 65, 66, 67, 68, 69, 75, 76, 77, 78, 79,

85, 86, 87, 88, 89])

This is a useful approach when you need to operate on the part of the array that the mask selected, for example M[mask]=0.
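A minimal sketch of that kind of assignment follows. It works on a copy, here called Z (a name introduced only for this illustration), so that the original M remains available for comparison.

```python
import numpy as np

M = np.arange(10 * 10, dtype=int).reshape(10, 10)
mask = (M >= 20) & (M <= 90) & ((M / 10.) % 1 >= 0.5)

# Work on a copy so the original M stays intact.
Z = M.copy()
Z[mask] = 0  # assignment through a Boolean mask modifies Z in place

print(int(mask.sum()))  # 35, the number of selected elements
print(Z[2].tolist())    # [20, 21, 22, 23, 24, 0, 0, 0, 0, 0]
print(M[2, 5])          # 25, the original array is untouched
```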

Another way of pointing out the elements that we want to be selected from the array is to provide a column or row index that consists of integers. We can define indexes like this by using a np.where()


