Statistical Reinforcement Learning: Modern Machine Learning Approaches by Masashi Sugiyama

Author: Masashi Sugiyama
Language: rus
Format: mobi, epub
Publisher: CRC Press
Published: 2015-03-26T22:00:00+00:00


FIGURE 6.4: Probability density functions of Gaussian and Laplacian distributions.

FIGURE 6.5: Example of training samples with Laplacian noise. The horizontal axis is the height of the end effector and the vertical axis is the immediate reward. The solid line denotes the noiseless immediate reward and “◦” denotes a noisy training sample.
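The difference between the two densities of Figure 6.4 can be checked numerically. The following is a minimal sketch, not code from the book: the function names are hypothetical, and it assumes both densities are centered at zero with unit variance (the excerpt does not state the parameters used in the figure).

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def laplacian_pdf(x, mu=0.0, b=1.0):
    """Density of a Laplacian with location mu and scale b (variance 2 * b**2)."""
    return math.exp(-abs(x - mu) / b) / (2.0 * b)

# Assumption: match the variances (both 1) so the two shapes are comparable.
sigma = 1.0
b = sigma / math.sqrt(2.0)

for x in (0.0, 1.0, 2.0, 4.0):
    g = gaussian_pdf(x, 0.0, sigma)
    l = laplacian_pdf(x, 0.0, b)
    print(f"x={x:3.1f}  gaussian={g:.5f}  laplacian={l:.5f}")
```

Under these assumptions the Laplacian density is higher at the center and in the tails, while the Gaussian is higher in between. The heavier tails are why Laplacian noise produces occasional large deviations of the kind seen in the noisy samples of Figure 6.5.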

FIGURE 6.6: Average and standard deviation of the sum of rewards over 50 runs for the acrobot swinging-up simulation, plotted against the iteration number for LSPI and LAPI: (a) no noise; (b) Laplacian noise. The best method in terms of the mean value and comparable methods according to the t-test at the significance level 5% are specified by “◦.”
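A comparison of this kind can be sketched as a two-sample t-test on the per-run sums of rewards. The excerpt does not say which variant of the t-test was used, so the sketch below assumes Welch's unequal-variance form; the function name and the sample values are hypothetical and are not data from the book.

```python
import math
import statistics

def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(a), len(b)
    va = statistics.variance(a) / na  # squared standard error of mean(a)
    vb = statistics.variance(b) / nb  # squared standard error of mean(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical sums of rewards from a handful of runs of two methods.
method_a = [10.0, 11.0, 12.0, 13.0]
method_b = [1.0, 2.0, 3.0, 4.0]

t_stat, dof = welch_t(method_a, method_b)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof:.1f}")
```

To decide at the 5% level, |t| is compared with the two-sided critical value for the computed degrees of freedom, taken from a table or, if SciPy is available, from `scipy.stats.t.ppf(0.975, dof)`. Methods whose difference from the best mean is not significant would be the ones marked as comparable.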

Figure 6.7 and Figure 6.8 depict motion examples of the acrobot learned by LAPI and LSPI in the Laplacian-noise environment. When LSPI is used (Figure 6.7), the second joint is swung hard in order to lift the end effector. However, the end effector tends to stay below the horizontal bar, and therefore only a small amount of reward can be obtained by LSPI. This is probably due to the outliers in the training samples. On the other hand, when LAPI is used (Figure 6.8), the end effector goes beyond the bar, and therefore a large amount of reward can be obtained even in the presence of outliers.
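The sensitivity to outliers can be illustrated on a much simpler problem than policy iteration. The sketch below assumes, as the names suggest, that LSPI fits with a squared loss and LAPI with an absolute loss; it is not the book's algorithm. It estimates a single constant from noisy samples, where the squared-loss minimizer is the mean and an absolute-loss minimizer is the median. The sample values are hypothetical.

```python
import statistics

def squared_loss(theta, samples):
    """Sum of squared residuals for a constant estimate theta."""
    return sum((r - theta) ** 2 for r in samples)

def absolute_loss(theta, samples):
    """Sum of absolute residuals for a constant estimate theta."""
    return sum(abs(r - theta) for r in samples)

# Hypothetical noisy samples around a true value of 1.0, with one outlier.
samples = [0.9, 1.1, 1.0, 0.95, 1.05, 9.0]

ls_estimate = statistics.mean(samples)     # minimizes the squared loss
lad_estimate = statistics.median(samples)  # minimizes the absolute loss

print(f"least squares estimate:  {ls_estimate:.3f}")
print(f"least absolute estimate: {lad_estimate:.3f}")
```

The single outlier pulls the least-squares estimate far from the true value, while the least-absolute estimate stays close to it. The same mechanism, applied to value function fitting, is consistent with the behavior described for the two methods above.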


