Automated Machine Learning with Microsoft Azure by Dennis Michael Sawyers
Author: Dennis Michael Sawyers
Language: eng
Format: epub
Publisher: Packt Publishing Pvt Ltd
Published: 2021-03-26T00:00:00+00:00
You now have the OJ Sales data prepped for the accelerator. In order to bring your own data into the accelerator, there are a few important caveats you need to follow. Most importantly, the OJ Sales data comes presplit based on store and orange juice brand. You will need to mimic this structure using your own data in a new Jupyter notebook.
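As a rough illustration of what "presplit" means here, the sketch below writes one CSV file per combination of grouping columns from a single pandas dataframe. It is not part of the accelerator: the function name write_presplit_files, the sample column names, and the file-naming scheme are all assumptions made for this example.

```python
import os
import tempfile

import pandas as pd


def write_presplit_files(df, group_columns, output_dir):
    """Write one CSV per unique combination of group_columns and return the paths.

    Hypothetical helper for illustration only; not part of the MMSA.
    """
    os.makedirs(output_dir, exist_ok=True)
    paths = []
    for keys, group in df.groupby(group_columns):
        # Depending on the pandas version, a single grouping column may
        # yield a scalar key instead of a tuple, so normalize it.
        if not isinstance(keys, tuple):
            keys = (keys,)
        file_name = "_".join(str(key) for key in keys) + ".csv"
        path = os.path.join(output_dir, file_name)
        group.to_csv(path, index=False)
        paths.append(path)
    return paths


# Tiny made-up dataset standing in for your own single-file data.
sample = pd.DataFrame({
    "Store": [1, 1, 2, 2],
    "Brand": ["A", "B", "A", "A"],
    "Quantity": [10, 12, 7, 9],
})

output_dir = tempfile.mkdtemp()
written = write_presplit_files(sample, ["Store", "Brand"], output_dir)
file_names = sorted(os.path.basename(path) for path in written)
print(file_names)
```

Each resulting file holds the rows for exactly one store and brand pair, which is the shape the OJ Sales file dataset already has.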
Prepping a pandas dataframe
How to bring your own data into the MMSA is not obvious. OJ Sales, after all, is a file dataset consisting of 11,793 files. You are much more likely to use data that consists of a single file or comes from a single table within a database. Moreover, you are most likely to read it in via pandas, the most widely used Python package for working with tabular data. To learn how to use pandas dataframes with the MMSA, perform the following steps:
Download the ManyModelsSampleData.csv file from the Automated-Machine-Learning-on-Microsoft-Azure GitHub repository.
Navigate to your Jupyter environment.
Open the solution-accelerator-many-models folder.
Click the Upload button in the top-left corner of your screen. Upload the ManyModelsSampleData.csv file to your Jupyter environment.
Create a new Jupyter notebook and open it. Rename it 01_Data_PreparationMy-Data.ipynb.
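To load all of the libraries you need, use the following code:
import pandas as pd  # dataframes
import numpy as np  # numerical arrays
import os  # file and path handling
import datetime as dt  # string-to-datetime conversion
from azureml.core import Workspace, Dataset, Datastore  # Azure ML SDK objects
from scripts.helper import split_data  # MMSA helper that splits data by date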
You should recognize pandas, numpy, Workspace, Dataset, and Datastore from Chapter 4, Building an AutoML Regression Solution. You've also used os in Chapter 6, Building an AutoML Forecasting Solution.
New to this script is split_data, which is a helper function. Helper functions are reusable functions written for a program to reduce complexity. The MMSA has a few helper functions, and split_data is used to divide data into training and inference data based on a date you pass in.
Another new import is datetime, a module from the Python standard library that lets you convert string objects into proper Python datetime objects. This matters because split_data requires datetime objects to function properly.
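To make the idea of a date-based split concrete, here is a minimal sketch that uses only the standard library. It converts date strings into datetime objects and then divides rows around a cut-off date. It is not the implementation of split_data: the function split_rows_by_date, the sample rows, and the choice to place rows dated exactly on the cut-off into the inference set are all assumptions made for this illustration.

```python
import datetime as dt


def split_rows_by_date(rows, time_column, split_date):
    """Return (train, inference) lists of rows divided around split_date.

    Hypothetical stand-in for a date-based split; rows dated before
    split_date go to training, and all other rows go to inference.
    """
    train = [row for row in rows if row[time_column] < split_date]
    inference = [row for row in rows if row[time_column] >= split_date]
    return train, inference


# Made-up rows with dates stored as strings, as they often are in CSV files.
raw_rows = [
    {"Date": "2021-01-07", "Sales": 100},
    {"Date": "2021-01-14", "Sales": 120},
    {"Date": "2021-01-21", "Sales": 90},
    {"Date": "2021-01-28", "Sales": 110},
]

# Convert each date string into a proper datetime object before comparing.
rows = [
    {**row, "Date": dt.datetime.strptime(row["Date"], "%Y-%m-%d")}
    for row in raw_rows
]

split_date = dt.datetime(2021, 1, 21)
train_rows, inference_rows = split_rows_by_date(rows, "Date", split_date)
print(len(train_rows), len(inference_rows))
```

Comparing a string such as "2021-01-07" with a datetime object raises a TypeError in Python, which is one reason the conversion step has to happen before any date-based split.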
