Practical DataOps by Harvinder Atwal

Author: Harvinder Atwal, Date: January 2, 2020

Language: eng
Format: epub
ISBN: 9781484251041
Publisher: Apress

Concept Drift

Machine learning and deep learning models learn rules from historical input and output data and use them to make predictions from new input data. The model assumes that the relationship between inputs and outputs in new data is the same as in the historical data, which is why it is expected to make useful predictions for unseen data. That assumption may hold in some cases, for instance where the relationship is effectively fixed by nature, as in image recognition of cats. However, other relationships, such as customer purchasing behavior, spam email detection, or product quality as machinery wears, will eventually change, a process known as concept drift.

A passive strategy for dealing with concept drift is to retrain models periodically on a window of recent data. However, this is not an option in some circumstances because of negative feedback loops, in which the model's own predictions bias the data it is later trained on. Imagine a recommender system where specific customers see recommendations for product X based on previous purchasing relationships. The data for these customers is now biased because the recommendations make them even more likely to buy product X. A model trained on this data will boost the recommendation for product X further, even if in the outside world the original relationship has changed and customers now prefer product Y to product X.
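The passive strategy described above, periodic retraining on a window of recent data, can be sketched as follows. This is a minimal illustration rather than a production pattern: the class name `WindowedMeanModel`, its parameters `window_size` and `retrain_every`, and the toy "model" (the mean of recent targets) are all assumptions made for the example, not anything defined in the book.

```python
from collections import deque
from statistics import mean


class WindowedMeanModel:
    """Toy model (hypothetical): predicts the mean of the targets in a recent window."""

    def __init__(self, window_size, retrain_every):
        # deque with maxlen silently drops the oldest observation when full
        self.window = deque(maxlen=window_size)
        self.retrain_every = retrain_every
        self.seen = 0
        self.prediction = None  # no prediction until the first retrain

    def observe(self, y):
        """Record a new target value and retrain on a fixed schedule."""
        self.window.append(y)
        self.seen += 1
        if self.seen % self.retrain_every == 0:
            self.retrain()

    def retrain(self):
        # "Training" here is just averaging the window contents
        self.prediction = mean(self.window)


# Simulated stream whose target shifts from 1.0 to 5.0 halfway through
model = WindowedMeanModel(window_size=50, retrain_every=10)
stream = [1.0] * 100 + [5.0] * 100
for y in stream:
    model.observe(y)
```

Because the window only holds the most recent 50 observations, the older regime eventually falls out of the training data and the model follows the new one. The sketch does nothing about the feedback problem: if the observed targets were themselves influenced by the model's predictions, the window would contain biased data.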

One solution to the negative feedback problem is to randomly hold out some instances from model predictions. The holdout provides a baseline for measuring model performance and an unbiased dataset for model training. Another strategy is to use a trigger to initiate a model update. Concept drift can be challenging to discover, and there are multiple algorithms for detecting it. Some of the commonly used ones are the Drift Detection Method (DDM), the Early Drift Detection Method (EDDM), the Geometric Moving Average Detection Method (GMADM), and the Exponentially Weighted Moving Average (EWMA) chart detection method. These algorithms can be built into continuous diagnostic monitoring of performance to raise an alert when a model requires retraining. Predictions can then be turned off for some or all instances to generate unbiased data for retraining and avoid negative feedback loops. Because retraining happens only when the model is not performing well, the cost of training and the opportunity cost of lost prediction benefit are lower than with the other two methods (periodic retraining and a permanent random holdout).
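As an illustration of a trigger-based approach, the following is a simplified sketch of the Drift Detection Method as it is commonly described: track the running error rate p and its standard deviation s = sqrt(p(1 - p) / n), remember the point where p + s was lowest, and signal a warning or a drift when p + s rises two or three minimum standard deviations above the minimum error rate. The class name `DDM`, the parameter names, the 30-sample warm-up default, and the simulated error stream are assumptions for the example; it is not the book's implementation.

```python
import math


class DDM:
    """Simplified Drift Detection Method over a stream of 0/1 prediction errors."""

    def __init__(self, min_samples=30, warning_level=2.0, drift_level=3.0):
        self.min_samples = min_samples      # warm-up before any check is made
        self.warning_level = warning_level  # multiples of s_min for a warning
        self.drift_level = drift_level      # multiples of s_min for a drift
        self.reset()

    def reset(self):
        self.n = 0
        self.errors = 0
        self.p_min = math.inf
        self.s_min = math.inf

    def update(self, error):
        """error is 1 for a wrong prediction, 0 for a correct one.

        Returns "stable", "warning", or "drift".
        """
        self.n += 1
        self.errors += error
        p = self.errors / self.n
        s = math.sqrt(p * (1 - p) / self.n)
        if self.n < self.min_samples:
            return "stable"
        # Remember the best (lowest) p + s seen so far
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + self.drift_level * self.s_min:
            self.reset()  # start fresh statistics after a drift
            return "drift"
        if p + s > self.p_min + self.warning_level * self.s_min:
            return "warning"
        return "stable"


# Simulated errors: about 10% wrong for 500 predictions, then about 50% wrong
stream = [1 if i % 10 == 9 else 0 for i in range(500)] + [i % 2 for i in range(500)]

detector = DDM()
drift_at = None
for i, err in enumerate(stream):
    if detector.update(err) == "drift":
        drift_at = i  # index in the stream where retraining would be triggered
        break
```

In a monitoring setup, a "drift" result would be the signal to start collecting unbiased data and retrain, while "warning" could be used to begin buffering recent instances. Note that this sketch needs true labels for each prediction, which in practice may arrive with a delay.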
