Deep Belief Nets in C++ and CUDA C: Volume 1 by Timothy Masters
The following notation will be used:

W         Weight matrix, a column for each visible neuron and a row for each hidden neuron
b         Column vector of visible neuron biases
c         Column vector of hidden neuron biases
K         The number of Monte Carlo iterations to perform
x         The training case being processed (column vector)
q_data    Vector of probabilities under the data distribution that each hidden neuron will be one (as opposed to zero)
h_data    Hidden neuron activation vector under the data distribution, zero or one
p_model   Vector of reconstruction probabilities under the model distribution that each visible neuron will be one (as opposed to zero)
v_model   Reconstructed visible neuron activation vector, zero or one
q_model   Vector of probabilities under the model distribution that each hidden neuron will be one (as opposed to zero)
h_model   Hidden neuron activation vector under the model distribution, zero or one
It is to be understood that p_model is a vector whose length equals the number of inputs (visible neurons), and it contains probabilities computed by Equation 3-2 or 3-4. Each element of v_model is individually sampled as 0 or 1 according to these probabilities. The hidden neuron probabilities and activations are defined similarly.
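As a concrete illustration of this sampling step, here is a minimal C++ sketch. It assumes f is the logistic function (as is standard for binary RBMs) and uses the standard library's Bernoulli generator; the helper names are hypothetical, not the book's code, and they are reused in the sketches that follow.

#include <cmath>
#include <random>
#include <vector>

// Logistic activation f(t) = 1 / (1 + exp(-t)), as used in Equations 3-3 and 3-4
double logistic(double t)
{
    return 1.0 / (1.0 + std::exp(-t));
}

// Sample each neuron's 0/1 activation from its probability
// (a hypothetical helper; the book's actual code may differ)
std::vector<double> sample_binary(const std::vector<double>& probs, std::mt19937& rng)
{
    std::vector<double> act(probs.size());
    for (std::size_t i = 0; i < probs.size(); ++i) {
        std::bernoulli_distribution coin(probs[i]);
        act[i] = coin(rng) ? 1.0 : 0.0;
    }
    return act;
}

With the notation above, the algorithm proceeds as follows: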
v_data = x
q_data = f(c + W v_data)                        Equation 3-3
Optionally compute the reconstruction error using the slow, accurate method.
q_model = q_data                                The MC chain loop below initializes from the data
k = 0
while k < K                                     K must be at least 1
   Sample h_model from q_model                  This sampling is critical; we must not use q_model itself
   p_model = f(b + W' h_model)                  Equation 3-4
   If k = 0, optionally compute the reconstruction error using the fast method.
   if mean field
      q_model = f(c + W p_model)
   else
      Sample v_model from p_model
      q_model = f(c + W v_model)
   k = k + 1
end while
if mean field
   Visible bias gradient = p_model − v_data
   Hidden bias gradient = q_model − q_data
   Weight gradient = q_model p_model' − q_data v_data'      This outer product is a matrix
else
   Visible bias gradient = v_model − v_data
   Sample h_data from q_data
   Hidden bias gradient = q_model − h_data
   Weight gradient = q_model v_model' − h_data v_data'
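Translated into C++, the algorithm might look like the following sketch. This is a minimal single-case illustration under stated assumptions, not the book's implementation: W is stored row-major with a row per hidden neuron and a column per visible neuron, and logistic() and sample_binary() are the hypothetical helpers from the earlier sketch. The optional reconstruction-error steps are omitted here and treated separately below.

#include <random>
#include <vector>

void cd_k_gradient(const std::vector<double>& W,   // weights, nhid x nvis, row-major
                   const std::vector<double>& b,   // visible biases, nvis
                   const std::vector<double>& c,   // hidden biases, nhid
                   const std::vector<double>& x,   // training case, nvis
                   int nvis, int nhid, int K, bool mean_field,
                   std::mt19937& rng,
                   std::vector<double>& grad_W,    // nhid x nvis, row-major
                   std::vector<double>& grad_b,    // nvis
                   std::vector<double>& grad_c)    // nhid
{
    // q_data = f(c + W v_data), with v_data = x   (Equation 3-3)
    std::vector<double> q_data(nhid);
    for (int i = 0; i < nhid; ++i) {
        double sum = c[i];
        for (int j = 0; j < nvis; ++j)
            sum += W[i * nvis + j] * x[j];
        q_data[i] = logistic(sum);
    }

    std::vector<double> q_model = q_data;          // chain initializes from the data
    std::vector<double> p_model(nvis), v_model(nvis), h_model(nhid);

    for (int k = 0; k < K; ++k) {                  // K must be at least 1
        h_model = sample_binary(q_model, rng);     // critical: sample, do not reuse q_model

        // p_model = f(b + W' h_model)             (Equation 3-4)
        for (int j = 0; j < nvis; ++j) {
            double sum = b[j];
            for (int i = 0; i < nhid; ++i)
                sum += W[i * nvis + j] * h_model[i];
            p_model[j] = logistic(sum);
        }

        // Go back up to the hidden layer, from probabilities (mean field) or samples
        if (!mean_field)
            v_model = sample_binary(p_model, rng);
        const std::vector<double>& vis = mean_field ? p_model : v_model;
        for (int i = 0; i < nhid; ++i) {
            double sum = c[i];
            for (int j = 0; j < nvis; ++j)
                sum += W[i * nvis + j] * vis[j];
            q_model[i] = logistic(sum);
        }
    }

    // Positive (data-side) hidden term: q_data for mean field, else a sample from it
    std::vector<double> h_data;
    const std::vector<double>* pos_h = &q_data;
    if (!mean_field) {
        h_data = sample_binary(q_data, rng);
        pos_h = &h_data;
    }
    const std::vector<double>& neg_v = mean_field ? p_model : v_model;

    grad_b.resize(nvis);
    grad_c.resize(nhid);
    grad_W.resize(static_cast<std::size_t>(nhid) * nvis);
    for (int j = 0; j < nvis; ++j)
        grad_b[j] = neg_v[j] - x[j];               // visible bias gradient
    for (int i = 0; i < nhid; ++i)
        grad_c[i] = q_model[i] - (*pos_h)[i];      // hidden bias gradient
    for (int i = 0; i < nhid; ++i)                 // weight gradient: difference of outer products
        for (int j = 0; j < nvis; ++j)
            grad_W[i * nvis + j] = q_model[i] * neg_v[j] - (*pos_h)[i] * x[j];
}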
A few things should be noted about this algorithm. First, the weight gradient is a matrix that, like W, has a row for each hidden neuron and a column for each visible neuron. The products given in Equation 3-12 are represented efficiently in the algorithm as outer products: a column vector of hidden-neuron terms times a row vector of visible-neuron terms.
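Written element by element, each of those outer products has the form (q v')_ij = q_i * v_j. For example, the (i, j) element of the binary-sampling weight gradient is q_model[i] * v_model[j] − h_data[i] * v_data[j], which lands in row i (hidden) and column j (visible), matching the layout of W.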
There are two different places in the algorithm in which one can compute the reconstruction error. This error has no use in the training algorithm itself, but it is nice to display it for the user. Regardless of which place we choose, Equations 3-3 and 3-4 are used to jump from the visible layer to the hidden layer and then bounce back to the visible layer. The reconstruction error will compare the original data with the reconstructed data. The only question is whether we use the raw probabilities from these equations or samples based on the probabilities.
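As a minimal sketch, here is one way such an error might be computed, assuming a mean squared error measure (the text above does not pin down a particular measure). The fast method would pass the p_model (or v_model) produced at k = 0 of the chain; the slow, accurate method would pass probabilities from a separate pass devoted to the comparison. Either way, the reconstruction is compared with the original training case x.

#include <vector>

// Hypothetical helper: mean squared reconstruction error between the original
// training case and its reconstruction, which may be the raw probabilities
// p_model or the sampled activations v_model.
double reconstruction_error(const std::vector<double>& x,
                            const std::vector<double>& recon)
{
    double err = 0.0;
    for (std::size_t j = 0; j < x.size(); ++j) {
        double diff = recon[j] - x[j];
        err += diff * diff;
    }
    return err / static_cast<double>(x.size());
}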