Data Smart by John W. Foreman

Author: John W. Foreman
Language: eng
Format: epub, mobi, pdf
Publisher: John Wiley & Sons
Published: 2013-10-18T04:00:00+00:00


Don't Kid Yourself

Folks who don't know how AI models work often experience some combination of awe and creepiness when hearing about how these models can predict the future. But to paraphrase the great 1992 film Sneakers, “Don't kid yourself. It's not that [intelligent].”

Why? Because AI models are no smarter than the sum of their parts. At a simplistic level, you feed a supervised AI algorithm some historical data, purchases at Target for example, and you tell the algorithm, “Hey, these purchases were from pregnant people, and these other purchases were from not-so-pregnant people.” The algorithm munches on the data and out pops a model. In the future, you feed the model a customer's purchases and ask, “Is this person pregnant?” and the model answers, “No, that's a 26-year-old dude living in his mom's basement.”

That's extremely helpful, but the model isn't a magician. It just cleverly turns past data into a formula or set of rules that it uses to predict a future case. As we saw in the case of naïve Bayes in Chapter 3, it's the AI model's ability to recall this data and its associated decision rules, probabilities, or coefficients that makes it so effective.

We do this all the time in our own non-artificially intelligent lives. For example, using personal historical data, my brain knows that when I eat a sub sandwich with brown-looking alfalfa sprouts on it, there's a good chance I may be ill in a few hours. I've taken past data (I got sick) and trained my brain on it, so now I have a rule, formula, model, whatever you'd like to call it: brown sprouts = gastrointestinal nightmare.
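To make that "a model is just a rule learned from past data" idea concrete, here's a minimal Python sketch of the sprout model. The sandwich history and the 50 percent threshold are made up purely for illustration; "training" is nothing more than tallying a sick-rate per condition and wrapping it in a decision rule.

```python
# Made-up past sandwich outcomes: (sprouts_looked_brown, got_sick)
history = [(True, True), (True, True), (True, False),
           (False, False), (False, False)]

def train(history):
    """'Train' the world's simplest model: the observed probability
    of getting sick, split by whether the sprouts looked brown."""
    rates = {}
    for condition in (True, False):
        outcomes = [sick for brown, sick in history if brown == condition]
        rates[condition] = sum(outcomes) / len(outcomes)
    return rates

model = train(history)

def should_skip(brown_sprouts):
    # Decision rule: skip the sandwich if past data says sickness
    # is more likely than not under this condition.
    return model[brown_sprouts] > 0.5
```

On this toy history, `should_skip(True)` comes back `True` (two of three brown-sprout sandwiches ended badly) and `should_skip(False)` comes back `False`.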

In this chapter, we're going to implement two different regression models just to see how straightforward AI can be. Regression is the granddaddy of supervised predictive modeling; research on it dates back to the turn of the 19th century. It's an oldie, but its pedigree contributes to its power—regression has had time to build up all sorts of rigor around it in ways that some newer AI techniques have not. In contrast to the MacGyver feel of naïve Bayes in Chapter 3, you'll feel the weight of the statistical rigor of regression in this chapter, particularly when we investigate significance testing.

As with the naïve Bayes model in Chapter 3, we'll use these models for classification. However, as you'll see, the problem at hand is very different from the bag-of-words document classification problem we encountered earlier.
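As a preview of regression used as a classifier, here's a pure-Python sketch of logistic regression fit by plain gradient descent. This is my own illustration, not the chapter's spreadsheet build, and the purchase features and labels are invented toy data: each shopper is a pair of 0/1 flags (bought prenatal vitamins, bought beer), labeled 1 for pregnant.

```python
import math

# Toy training data, made up for illustration:
# features = [bought_prenatal_vitamins, bought_beer], label 1 = pregnant.
X = [[1, 0], [1, 0], [1, 1], [0, 1], [0, 1], [0, 0]]
y = [1, 1, 1, 0, 0, 0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Fit coefficients with simple stochastic gradient ascent
# on the log-likelihood.
w = [0.0, 0.0]
b = 0.0
rate = 0.5
for _ in range(2000):
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
        err = yi - p  # gradient of the log-likelihood w.r.t. the linear score
        w = [wj + rate * err * xj for wj, xj in zip(w, xi)]
        b += rate * err

def predict(x):
    """Return the model's probability that this shopper is pregnant."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
```

Classification then falls out of the fitted formula: `predict([1, 0])` (vitamins, no beer) lands well above 0.5, while `predict([0, 1])` lands below it, so thresholding the probability at 0.5 gives a yes/no answer.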



