Python for Data Science by Yuli Vasiliev

Author: Yuli Vasiliev
Language: eng
Format: epub
ISBN: 9781718502215
Publisher: No Starch Press
Published: 2022-05-16T00:00:00+00:00


Exercise #10: Excluding Total Rows from the DataFrame

Having rows for totals in a DataFrame allows you to use it as a report without having to add further steps. However, if you’re going to use the DataFrame in further aggregation operations, you may need to exclude rows for totals.

Try filtering the df_totals DataFrame created in the previous section, excluding the grand total and subtotal rows. Use the slicing techniques discussed in this chapter.
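As a starting point, here is a minimal sketch of one possible approach. The df_totals DataFrame from the previous section is not shown here, so the layout below is an assumption: a (Date, Region) MultiIndex in which subtotal and grand total rows are labeled 'All'. The sample values are made up for illustration, and your own DataFrame may need a different label or index level.

```python
import pandas as pd

# Hypothetical stand-in for df_totals: the 'All' labels and the values
# are assumptions, not the data from the previous section.
idx = pd.MultiIndex.from_tuples(
    [('2022-02-04', 'East'), ('2022-02-04', 'West'), ('2022-02-04', 'All'),
     ('2022-02-05', 'East'), ('2022-02-05', 'West'), ('2022-02-05', 'All'),
     ('All', 'All')],
    names=['Date', 'Region'])
df_totals = pd.DataFrame(
    {'Total': [111.0, 199.0, 310.0, 24.0, 20.0, 44.0, 354.0]}, index=idx)

# Every subtotal row and the grand total row carry 'All' in the Region level,
# so a Boolean mask on that level is enough to drop them.
is_total = df_totals.index.get_level_values('Region') == 'All'
df_detail = df_totals[~is_total]
print(df_detail)
```

With the total rows removed, aggregating df_detail again no longer double-counts the subtotals.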

Selecting All Rows in a Group

In addition to aiding with aggregation, the groupby() method also lets you select all the rows belonging to a certain group. To accomplish this, the GroupBy object returned by groupby() provides the get_group() method. Here’s how it works:

group = df_result.groupby(['Date','Region'])

group.get_group(('2022-02-04','West'))

You group the df_result DataFrame by Date and Region, passing the column names in as a list to groupby(), just like you did previously. Then you invoke the get_group() method on the resulting GroupBy object, passing a tuple that holds the key of the desired group: one value for each grouping column, in the same order. This returns the following DataFrame:

         Date Region  Total
0  2022-02-04   West   87.0
1  2022-02-04   West  112.0
2  2022-02-04   West   20.0
3  2022-02-04   West   24.0

As you can see, the result set isn’t an aggregation. Rather, it includes all the order rows related to the specified date and region.
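To try this without the earlier sections at hand, the following self-contained sketch rebuilds the idea on a small made-up DataFrame. The sample rows are an assumption for illustration and are not the book's df_result data. It also cross-checks get_group() against an equivalent Boolean mask.

```python
import pandas as pd

# Hypothetical order data standing in for df_result (illustrative values only).
df_result = pd.DataFrame({
    'Date': ['2022-02-04', '2022-02-04', '2022-02-04', '2022-02-05', '2022-02-05'],
    'Region': ['West', 'West', 'East', 'West', 'East'],
    'Total': [87.0, 112.0, 111.0, 20.0, 24.0],
})

group = df_result.groupby(['Date', 'Region'])

# get_group() returns the original, unaggregated rows for one group key.
west_rows = group.get_group(('2022-02-04', 'West'))
print(west_rows)

# The same rows can be selected with a Boolean mask, a handy cross-check.
mask = (df_result['Date'] == '2022-02-04') & (df_result['Region'] == 'West')
same_rows = west_rows.equals(df_result[mask])
print(same_rows)
```

The rows keep their original index labels, which makes it easy to trace them back to the source DataFrame.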


