Artificial Intelligence by Margaret A. Boden
Author: Margaret A. Boden
Language: English
Format: epub
Publisher: Oxford University Press
Published: 2018-06-23
Backprop and brains—and deep learning
PDP enthusiasts argue that their networks are more biologically realistic than symbolic AI. It’s true that PDP is inspired by brains, and that some neuroscientists use it to model neural functioning. However, ANNs differ significantly from what lies inside our heads.
One difference between (most) ANNs and brains is back-propagation, or backprop. This is a learning rule—or rather, a general class of learning rules—that’s frequently used in PDP. Anticipated by Paul Werbos in 1974, it was defined more usably in the mid-1980s by David Rumelhart, Geoffrey Hinton, and Ronald Williams. It addresses the problem of credit assignment.
This problem arises across all types of AI, especially when the system is continually changing. Given a complex AI system that’s successful, just which parts of it are most responsible for the success? In evolutionary AI, credit is often assigned by the ‘bucket-brigade’ algorithm (see Chapter 5). In PDP systems with deterministic (not stochastic) units, credit is typically assigned by backprop.
The backprop algorithm traces responsibility back from the output layer into the hidden layers, identifying the individual units that need to be adapted. (The weights are updated to minimize prediction errors.) The algorithm needs to know the precise state of the output layer when the network is giving the right answer. (So backprop is supervised learning.) Unit-by-unit comparisons are made between this exemplary output and the output actually obtained from the network. Any difference between an output unit’s activity in the two cases counts as an error.
The algorithm assumes that error in an output unit is due to error(s) in the units connected to it. Working backwards through the system, it attributes a specific amount of error to each unit in the first hidden layer, depending on the connection weight between it and the output unit. Blame is shared between all the hidden units connected to the mistaken output unit. (If a hidden unit is linked to several output units, its mini-blames are summed.) Proportional weight changes are then made to the connections between the hidden layer and the preceding layer.
That layer may be another (and another …) stratum of hidden units. But ultimately it will be the input layer, and the weight changes will stop. This process is iterated until the discrepancies at the output layer are minimized.
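The procedure described above can be sketched in code. The following is a minimal illustration, not Boden's own presentation: a tiny 2-2-1 sigmoid network, trained by backprop on logical AND. The network sizes, learning rate, and epoch count are all assumptions chosen for the example; the error terms show blame being passed from the output unit back to the hidden units in proportion to the connection weights, exactly as the text describes.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical 2-2-1 network: two inputs, two hidden units, one output.
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
b1 = [0.0, 0.0]
W2 = [random.uniform(-1, 1) for _ in range(2)]
b2 = 0.0

def forward(x):
    h = [sigmoid(sum(W1[j][i] * x[i] for i in range(2)) + b1[j])
         for j in range(2)]
    y = sigmoid(sum(W2[j] * h[j] for j in range(2)) + b2)
    return h, y

def train_step(x, target, lr=0.5):
    global b2
    h, y = forward(x)
    # Error at the output unit: the difference between the exemplary
    # output and the output actually obtained, scaled by the sigmoid slope.
    delta_out = (y - target) * y * (1 - y)
    # Blame each hidden unit in proportion to its connection weight
    # to the mistaken output unit.
    delta_hidden = [delta_out * W2[j] * h[j] * (1 - h[j]) for j in range(2)]
    # Proportional weight changes, working backwards through the layers.
    for j in range(2):
        W2[j] -= lr * delta_out * h[j]
        for i in range(2):
            W1[j][i] -= lr * delta_hidden[j] * x[i]
        b1[j] -= lr * delta_hidden[j]
    b2 -= lr * delta_out

data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]  # logical AND
for epoch in range(5000):
    for x, t in data:
        train_step(x, t)

for x, t in data:
    _, y = forward(x)
    print(x, round(y, 2))
```

Iterating the step over the training set drives the output-layer discrepancies towards zero, so after training the network's outputs sit close to 0 or 1 as required.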
For many years, backprop was used only on networks with a single hidden layer. Multilayer networks were rare: they are difficult to analyse, and even to experiment with. Recently, however, the advent of deep learning has caused huge excitement—and some irresponsible hype. Here, a system learns structure reaching deep into a domain, as opposed to merely superficial patterns. In other words, it discovers a multilevel knowledge representation, not a single-level one.
Deep learning is exciting because it promises to enable ANNs, at last, to deal with hierarchy. Since the early 1980s, connectionists such as Hinton and Jeff Elman had struggled to represent hierarchy—by combining localist and distributed representations, or by defining recurrent nets. (Recurrent nets, in effect, perform as a sequence of discrete steps, feeding each step's hidden state back in at the next. Recent versions, using deep learning, can sometimes predict the next word in a sentence, or even the next ‘thought’ in a paragraph.)
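A single step of such a recurrent net can be sketched as follows. This is a generic, untrained Elman-style recurrence, not any specific system mentioned in the text; the vocabulary size, hidden size, and random weights are all assumptions. Each step mixes the current input symbol with the previous hidden state, then emits a probability distribution over the next symbol.

```python
import math
import random

random.seed(1)

# Hypothetical sizes: a 3-symbol vocabulary and 4 hidden units.
V, H = 3, 4
Wxh = [[random.uniform(-0.5, 0.5) for _ in range(V)] for _ in range(H)]
Whh = [[random.uniform(-0.5, 0.5) for _ in range(H)] for _ in range(H)]
Why = [[random.uniform(-0.5, 0.5) for _ in range(H)] for _ in range(V)]

def step(x_onehot, h_prev):
    # New hidden state: current input combined with the fed-back state.
    h = [math.tanh(sum(Wxh[j][i] * x_onehot[i] for i in range(V)) +
                   sum(Whh[j][k] * h_prev[k] for k in range(H)))
         for j in range(H)]
    # Output scores, softmaxed into a next-symbol distribution.
    scores = [sum(Why[o][j] * h[j] for j in range(H)) for o in range(V)]
    exps = [math.exp(s) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]
    return h, probs

h = [0.0] * H
for symbol in [0, 2, 1]:  # a toy sequence of symbol indices
    x = [1.0 if i == symbol else 0.0 for i in range(V)]
    h, probs = step(x, h)
print([round(p, 2) for p in probs])  # distribution over the next symbol
```

Training such a net (by backprop unrolled through the time steps) is what lets the deep-learning versions predict the next word; the untrained sketch only shows the recurrent structure itself.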