Reprogramming The American Dream by Kevin Scott & Greg Shaw

Author: Kevin Scott & Greg Shaw
Language: English
Publisher: Harper Business
Published: April 6, 2020


How Models Learn

The machine learning approach I just described is an example of supervised learning. With supervised learning, a human being labels all the data required to train a model. You can think of this labeling process as a way to teach a model how to recognize patterns in data. To borrow an analogy from Sean Gerrish’s book How Smart Machines Think, it’s a bit like teaching a young child about the world through flash cards. The cards are the labeled training data, and the training process is repeatedly showing the cards to the child until they grasp the pattern. Any child is a far more advanced learning machine than a machine learning algorithm, so you don’t want to carry this analogy too far. But at a high level, this is the essence of supervised machine learning.
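
To make the flash-card analogy concrete, here is a minimal sketch of that training loop in Python. The scikit-learn library and its LogisticRegression model are my choice of stand-in for "some trainable model," not something the text prescribes, and the tiny feature vectors and labels are invented purely for illustration.

# A toy supervised-learning loop: labeled examples in, a trained model out.
from sklearn.linear_model import LogisticRegression

# Each row is one "flash card": a few invented numeric features describing an
# object (has a handle?, roundness score, height-to-width ratio).
features = [
    [1, 0.9, 1.2],  # handle, very round, squat            -> bucket
    [1, 0.8, 1.0],  # handle, round, squat                 -> bucket
    [0, 0.2, 3.0],  # no handle, boxy, tall                -> not a bucket
    [0, 0.7, 0.1],  # no handle, round, flat (a plate)     -> not a bucket
]
labels = [1, 1, 0, 0]  # 1 = "bucket", 0 = "not a bucket"

# "Showing the cards" over and over is what fit() does internally: it keeps
# adjusting the model's parameters until its guesses match the labels.
model = LogisticRegression()
model.fit(features, labels)

# Ask the trained model about an object it has not seen before.
print(model.predict([[1, 0.85, 1.1]]))  # expected: [1], i.e. "bucket"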

Your model is only as good as your data. If you are teaching it to recognize a bucket, you must figure out what readily definable features of your data are going to help your model learn. For e-mail, that might be easy because e-mails have lots of easily discernible structure. It’s a lot tougher to say what the features might be for recognizing buckets. A human, when asked to describe useful features of buckets, might say things like “they sometimes have handles” or “they’re kind of cylindrical and have holes in the top” or “they’re usually made from plastic or metal.” But for a computer, those aren’t readily definable features. They are difficult to describe directly to a machine that knows nothing at all about many things we take for granted, like handles, holes, and materials. In fact, the difficulty of describing such complex features to machines is one of the reasons machine learning was invented in the first place!
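
To see why e-mail is the easy case, here is a hedged Python sketch of hand-written feature extraction. The message fields and the particular features are hypothetical, chosen only to show what "readily definable" looks like when the structure is already there in the data.

# Hand-written feature extraction for e-mail, where the useful structure is
# sitting right there in the message. Field names and features are hypothetical.
def email_features(message: dict) -> list:
    subject = message.get("subject", "")
    body = message.get("body", "")
    return [
        len(body),                           # how long is the message?
        body.lower().count("http"),          # how many links does it contain?
        int(subject.isupper()),              # IS THE SUBJECT LINE SHOUTING?
        int("unsubscribe" in body.lower()),  # a common bulk-mail tell-tale
    ]

print(email_features({"subject": "FREE OFFER",
                      "body": "Click http://example.com to unsubscribe"}))

For a photo of a bucket there is no equivalent shortcut: the raw input is just a grid of pixel values, and "has a handle" appears nowhere in that grid, which is exactly the gap described above.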

Once you’ve figured out features, you might have to label a hundred thousand different buckets of all different shapes, sizes, and colors, from different perspectives and under different lighting conditions. If you don’t provide enough variety and quantity of training examples, the model won’t be able to generalize: to recognize buckets it has never seen before, or to say that a picture of a dog is not a bucket. If you have biased data, you will likely train a biased model. If, for instance, you mostly used pictures of red buckets in training, your model might not recognize blue buckets as buckets at all, which is problematic in a world with both red and blue buckets.
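
The red-bucket problem is easy to reproduce. The sketch below again leans on scikit-learn as a stand-in and uses invented numbers, imagining each photo boiled down to a redness score, a blueness score, and a has-a-handle flag.

# Skewed training data yields a skewed model. Every positive example below is
# a red bucket, and every example (bucket or not) has a handle, so the handle
# column carries no signal; color is the only cue left for the model to use.
from sklearn.linear_model import LogisticRegression

X_train = [
    [0.9, 0.1, 1],  # red bucket
    [0.8, 0.2, 1],  # red bucket
    [0.2, 0.2, 1],  # gray mug: handle, but not a bucket
    [0.1, 0.8, 1],  # blue watering can: handle, but not a bucket
]
y_train = [1, 1, 0, 0]

model = LogisticRegression().fit(X_train, y_train)

# A blue bucket is plenty bucket-like, but its color resembles nothing that
# was ever labeled "bucket" during training, so it is likely to be rejected.
print(model.predict([[0.1, 0.9, 1]]))  # likely [0]: the blue bucket is missed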

Some of the biggest challenges in modern supervised machine learning arise from the need to make your training data representative enough of the domain you want the AI to learn. A tremendous amount of human effort goes into feature engineering and labeling data. You must make sure that the features you choose are predictive, that the labels are accurate, and that the training data is representative. Human developers, data scientists, machine teachers, and data workers also must manage bias in the data they are feeding the AI. Machine-learned models learn what we teach them.
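
A small but real slice of that human effort looks like ordinary bookkeeping: before training, audit what the labeled data actually contains. The record format below is hypothetical, and a genuine audit would cover many more attributes (lighting, viewpoint, image source, and so on) than this sketch does.

# Audit a labeled dataset before training: are the labels balanced, and do the
# positive examples cover the variety we expect? Record format is hypothetical.
from collections import Counter

def audit(examples):
    labels = Counter(e["label"] for e in examples)
    bucket_colors = Counter(e["color"] for e in examples if e["label"] == "bucket")
    print("label counts: ", dict(labels))
    print("bucket colors:", dict(bucket_colors))

audit([
    {"label": "bucket", "color": "red"},
    {"label": "bucket", "color": "red"},
    {"label": "bucket", "color": "blue"},
    {"label": "not_bucket", "color": "gray"},
])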


