Applied Text Analysis with Python by Benjamin Bengfort, Tony Ojeda, and Rebecca Bilbro
Author: Benjamin Bengfort, Tony Ojeda, Rebecca Bilbro
Language: eng
Format: mobi, epub
Publisher: O'Reilly Media, Inc.
Published: 2016-12-19T05:00:00+00:00
Figure 2-1. The Model Selection Triple
In the feature extraction phase, which we will begin to explore in our discussion of vectorization in this chapter, the goal is to analyze, extract, and select a sufficiently hearty set of features with which to model the data. In the second phase, a set of algorithms is selected from a model family; these can then be used, evaluated, and compared in parallel. Finally, we conduct tuning by adjusting the model hyperparameters to identify the combination that results in the most predictive fitted model.
Together, these tasks allow data scientists to define and describe a learning model that effectively leverages specific data (feature engineering) with a specific interaction between variables and the target of interest (algorithm selection), and then to optimize the behavior of that model during learning and prediction (hyperparameter tuning). Applied methodologies for all three workflows usually combine heuristics or rules of thumb for specific algorithms, which can loosely be described as intuition, with automatic optimization and search techniques.
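To make the three phases concrete, here is a minimal sketch that treats model selection as a search over (feature extractor, algorithm, hyperparameters) triples. It is not the authors' implementation: the toy corpus, the two extractors, the two classifiers, and names such as `evaluate_triples` are assumptions made up for illustration, and it uses only the standard library so that every step is visible. In practice a library such as scikit-learn would likely fill these roles.

```python
import math
from collections import Counter
from itertools import product

# Toy labeled corpus (hypothetical data, for illustration only).
TRAIN = [
    ("the team won the match", "sports"),
    ("a late goal decided the game", "sports"),
    ("the striker scored a goal", "sports"),
    ("the senate passed the bill", "politics"),
    ("voters elected a new senate", "politics"),
    ("the president signed the bill", "politics"),
]
TEST = [
    ("the team scored a late goal", "sports"),
    ("the president addressed the senate", "politics"),
]

# Phase 1: feature extraction. Two candidate ways to vectorize a text.
def word_counts(text):
    return Counter(text.split())

def char_trigrams(text):
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

# Phase 2: algorithms. Each fit function returns a predict function.
def fit_naive_bayes(examples, alpha=1.0):
    """Multinomial naive Bayes with additive smoothing (alpha > 0)."""
    class_counts, feature_counts, vocab = Counter(), {}, set()
    for feats, label in examples:
        class_counts[label] += 1
        feature_counts.setdefault(label, Counter()).update(feats)
        vocab.update(feats)
    total = sum(class_counts.values())

    def predict(feats):
        def score(label):
            denom = sum(feature_counts[label].values()) + alpha * len(vocab)
            s = math.log(class_counts[label] / total)
            for f, n in feats.items():
                if f in vocab:  # ignore features never seen in training
                    s += n * math.log((feature_counts[label][f] + alpha) / denom)
            return s
        return max(sorted(class_counts), key=score)
    return predict

def fit_centroid(examples):
    """Assign the class whose summed feature vector is closest by cosine."""
    centroids = {}
    for feats, label in examples:
        centroids.setdefault(label, Counter()).update(feats)

    def cosine(a, b):
        dot = sum(n * b[f] for f, n in a.items())
        norm = math.sqrt(sum(n * n for n in a.values())) * \
            math.sqrt(sum(n * n for n in b.values()))
        return dot / norm if norm else 0.0

    def predict(feats):
        return max(sorted(centroids), key=lambda c: cosine(feats, centroids[c]))
    return predict

EXTRACTORS = {"words": word_counts, "char_trigrams": char_trigrams}

# Phase 3: hyperparameters. Each algorithm carries its own small grid.
ALGORITHMS = {
    "naive_bayes": (fit_naive_bayes, [{"alpha": 0.1}, {"alpha": 1.0}]),
    "centroid": (fit_centroid, [{}]),
}

def evaluate_triples(train, test):
    """Score every (extractor, algorithm, hyperparameters) combination."""
    results = []
    for (fx_name, fx), (algo_name, (fit, grid)) in product(
            EXTRACTORS.items(), ALGORITHMS.items()):
        train_feats = [(fx(text), label) for text, label in train]
        for params in grid:
            predict = fit(train_feats, **params)
            correct = sum(predict(fx(text)) == label for text, label in test)
            results.append(((fx_name, algo_name, params), correct / len(test)))
    return results

results = evaluate_triples(TRAIN, TEST)
best_triple, best_accuracy = max(results, key=lambda r: r[1])
for triple, accuracy in results:
    print(triple, accuracy)
```

Even this tiny grid yields six candidate triples, and each added extractor, algorithm, or hyperparameter value multiplies the search space. A two-example test set says little about real predictive quality; a realistic workflow would score each triple with cross-validation on a much larger corpus.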
While the workflow it describes is one with which many machine learning practitioners are likely familiar, the model selection triple was first explicitly described in a 2015 SIGMOD paper by Kumar et al.¹ In their paper, which concerns the development of next-generation database systems built to anticipate predictive modeling, the authors cogently express that such systems are badly needed due to the highly experimental nature of machine learning in practice. “Model selection,” they explain, “is iterative and exploratory because the space of [model selection triples] is usually infinite, and it is generally impossible for analysts to know a priori which [combination] will yield satisfactory accuracy and/or insights.”
Indeed, the process of model selection is complex, iterative, and substantially more intricate than, say, the choice of a support vector machine over a decision tree classifier. Our model selection triple workflow aims to treat these iterations as central to the science of machine learning. It is a workflow that, thanks to the robust and secure foundational data layer, can afford to enable optimization by facilitating rather than limiting those iterations.
