Statistical Reinforcement Learning: Modern Machine Learning Approaches by Masashi Sugiyama
Author: Masashi Sugiyama
Language: rus
Format: mobi, epub
Publisher: CRC Press
Published: 2015-03-26T22:00:00+00:00
FIGURE 6.4: Probability density functions of Gaussian and Laplacian
distributions. (Plot legend: Gaussian density, Laplacian density.)

FIGURE 6.5: Example of training samples with Laplacian noise. The horizontal
axis is the height of the end effector. The solid line denotes the noiseless
immediate reward and “◦” denotes a noisy training sample. (Plot legend: True,
Sample with noise. Vertical axis: Immediate reward.)
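The contrast between the two densities in Figure 6.4 can be checked numerically. The following is a minimal sketch, assuming zero mean and unit variance for both distributions; the parameters used for the figure are not given in this excerpt, and the function names `gaussian_pdf` and `laplacian_pdf` are illustrative.

```python
import math


def gaussian_pdf(x, mean=0.0, std=1.0):
    """Gaussian density with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def laplacian_pdf(x, mean=0.0, std=1.0):
    """Laplacian density parameterized by its standard deviation.

    The Laplace scale b relates to the variance by var = 2 * b**2.
    """
    b = std / math.sqrt(2.0)
    return math.exp(-abs(x - mean) / b) / (2.0 * b)


# Assumed evaluation points: the center and two points further out in the tail.
for x in (0.0, 2.0, 4.0):
    print(f"x={x:3.1f}  gaussian={gaussian_pdf(x):.5f}  "
          f"laplacian={laplacian_pdf(x):.5f}")
```

With matched variance, the Laplacian density has a sharper peak at the center and heavier tails far from it, which is why noise drawn from it produces occasional large outliers.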
FIGURE 6.6: Average and standard deviation of the sum of rewards over 50
runs for the acrobot swinging-up simulation. The best method in terms of the
mean value and comparable methods according to the t-test at the significance
level 5% are specified by “◦.” (Panels: (a) No noise, (b) Laplacian noise.
Horizontal axis: Iteration. Vertical axis: Sum of rewards. Plot legend: LSPI,
LAPI.)
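The comparison reported in Figure 6.6 can be sketched as follows. This is only an illustration under stated assumptions: the excerpt does not say which variant of the t-test was used, so Welch's unequal-variance form is assumed here, and the per-run rewards for the two hypothetical methods A and B are synthetic, not the book's data.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for two samples."""
    va = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    vb = statistics.variance(b) / len(b)  # squared standard error of mean(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Synthetic sums of rewards over 50 runs per method (assumed values).
rng = random.Random(0)
runs_a = [rng.gauss(8.0, 2.0) for _ in range(50)]
runs_b = [rng.gauss(5.0, 2.0) for _ in range(50)]

t, df = welch_t(runs_a, runs_b)
print(f"A: mean={statistics.mean(runs_a):.2f} std={statistics.stdev(runs_a):.2f}")
print(f"B: mean={statistics.mean(runs_b):.2f} std={statistics.stdev(runs_b):.2f}")
print(f"t={t:.2f}, df={df:.1f}")
```

With 50 runs per method the degrees of freedom are close to 98, for which the two-sided 5% critical value is about 1.98. Under this sketch, a method whose |t| against the best method stays below that value would be treated as comparable.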
Figure 6.7 and Figure 6.8 depict motion examples of the acrobot learned
by LSPI and LAPI, respectively, in the Laplacian-noise environment. When LSPI
is used (Figure 6.7), the second joint is swung hard in order to lift the end
effector. However, the end effector tends to stay below the horizontal bar, so
LSPI obtains only a small amount of reward. This is most likely due to the
outliers in the noisy reward samples. In contrast, when LAPI is used
(Figure 6.8), the end effector goes beyond the bar, so a large amount of reward
is obtained even in the presence of outliers.
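The effect of outliers on the two fitting criteria can be illustrated with a toy problem. The sketch below is not LSPI or LAPI themselves. It assumes only that LSPI fits with a squared loss and LAPI with an absolute loss (as the names least-squares and least absolute policy iteration suggest), and it fits a single constant to made-up reward samples, for which the squared-loss minimizer is the mean and the absolute-loss minimizer is the median.

```python
import statistics

# Made-up noisy rewards observed at one state; the last value is an outlier.
samples = [0.9, 1.1, 1.0, 0.95, 1.05, 10.0]


def squared_loss(c, data):
    """Sum of squared residuals for the constant prediction c."""
    return sum((y - c) ** 2 for y in data)


def absolute_loss(c, data):
    """Sum of absolute residuals for the constant prediction c."""
    return sum(abs(y - c) for y in data)


# Closed-form minimizers for a constant model.
least_squares_fit = statistics.mean(samples)     # minimizes squared_loss
least_absolute_fit = statistics.median(samples)  # minimizes absolute_loss

print(f"least-squares fit:  {least_squares_fit:.3f}")
print(f"least-absolute fit: {least_absolute_fit:.3f}")
```

The single outlier pulls the least-squares estimate far from the bulk of the samples, while the least-absolute estimate stays near 1. The same mechanism, applied to value function fitting, is a plausible reading of why LSPI degrades under Laplacian noise while LAPI does not.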