Hands-On Automated Machine Learning: A beginner's guide to building automated machine learning systems using AutoML and Python by Sibanjan Das & Umit Mert Cakmak

Author: Sibanjan Das & Umit Mert Cakmak
Language: eng
Format: epub
Tags: COM037000 - COMPUTERS / Machine Theory, COM004000 - COMPUTERS / Intelligence (AI) and Semantics, COM018000 - COMPUTERS / Data Processing
Publisher: Packt Publishing
Published: 2018-04-25T22:00:00+00:00


Sometimes the raw data does not contain enough information to build a good model. In such cases, we need to create new features. The following section describes a few methods for doing so.

Feature generation

Creating new features from existing ones is an art, and it can be done in many different ways.

The objective of feature creation is to give ML algorithms predictors that make it easier for them to detect patterns and derive better relationships from the data.

For example, in HR attrition problems, the length of time an employee has stayed with an organization is an important attribute. Sometimes the dataset does not include the duration of stay as a feature, but it does include the employee start date. In such cases, we can derive the duration of stay by subtracting the employee start date from the current date.
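The subtraction described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the record layout, the field names (`employee_id`, `start_date`, `tenure_days`), the helper `tenure_in_days`, and the sample dates are all hypothetical and do not come from any particular dataset.

```python
from datetime import date


def tenure_in_days(start_date: date, as_of: date) -> int:
    """Return the number of days between the start date and the reference date."""
    return (as_of - start_date).days


# Hypothetical employee records; the IDs and dates are made up for illustration.
employees = [
    {"employee_id": 1, "start_date": date(2015, 3, 1)},
    {"employee_id": 2, "start_date": date(2017, 11, 15)},
]

# A fixed reference date keeps the example reproducible.
# In practice, date.today() would play the role of the "current date".
reference_date = date(2018, 4, 25)

# Derive the new feature and store it alongside the original fields.
for record in employees:
    record["tenure_days"] = tenure_in_days(record["start_date"], reference_date)

for record in employees:
    print(record["employee_id"], record["tenure_days"])
```

With tabular data, the same idea is usually applied to a whole date column at once, for example with a data-frame library, rather than record by record.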

The following sections cover some of the ways to generate new features from the data. This is not an exhaustive list, just a few methods that can be employed. One needs to think through the problem statement, explore the data, and be creative to discover new ways to build features:

Numerical feature generation: Generating new features from numerical data is somewhat easier than from other data types. Even if we don't understand the meaning of the numerical features, we can apply operations such as adding two or more numbers, computing relative differences, and multiplying or dividing the numbers. After this task, we identify which of the generated features are important and discard the others. Though it is resource intensive, this approach helps discover new features when we don't know of a direct way to derive them.
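As a rough sketch of the brute-force approach to numerical features, the snippet below applies the listed operations to every pair of numeric columns in a single record. The function name `generate_pairwise_features`, the column names, and the sample values are assumptions made for illustration. The choice to mark a division by zero as `None` is also an assumption, not something the text prescribes.

```python
from itertools import combinations


def generate_pairwise_features(row, columns):
    """Build sum, difference, product, ratio and relative difference for each column pair."""
    features = {}
    for a, b in combinations(columns, 2):
        x, y = row[a], row[b]
        features[f"{a}_plus_{b}"] = x + y
        features[f"{a}_minus_{b}"] = x - y
        features[f"{a}_times_{b}"] = x * y
        # Guard against division by zero; None marks an undefined value.
        features[f"{a}_div_{b}"] = x / y if y != 0 else None
        features[f"{a}_reldiff_{b}"] = (x - y) / y if y != 0 else None
    return features


# Hypothetical record with three numeric columns.
row = {"salary": 50000.0, "bonus": 5000.0, "years": 4.0}
new_features = generate_pairwise_features(row, ["salary", "bonus", "years"])

for name, value in new_features.items():
    print(name, value)
```

The number of candidates grows quadratically with the number of columns, which is why the approach is resource intensive. The sketch covers only the generation step; deciding which of the generated features to keep still requires a separate importance or selection step.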
