Multi-Agent Machine Learning by Schwartz H. M
Author: Schwartz, H. M.
Language: eng
Format: epub
ISBN: 9781118362082
Publisher: Wiley
Published: 2014-05-19
4.10 Policy Hill Climbing
Policy hill climbing (PHC) is a simple, practical algorithm that can play mixed strategies. It was first proposed by Bowling and Veloso (2002). PHC requires little information: the player needs to know neither its own recently executed actions nor its opponent's current strategy. The algorithm is a simple modification of single-agent Q-learning that performs hill climbing in the space of mixed strategies. It has two parts. The first is the reinforcement learning part, in which Q-learning maintains the values of the actions in each state. The second is the game-theoretic part, which maintains the current strategy in each state of the system.
The policy is improved by increasing the probability of selecting the highest-valued action, using a small learning rate $\delta \in (0, 1]$. When $\delta = 1$, the algorithm is equivalent to Q-learning, because the policy moves to the greedy policy, which executes the highest-valued action with probability 1. The PHC algorithm is rational: it converges to the optimal solution when the other players follow a fixed (stationary) strategy. However, if the other players are also learning, PHC may not converge to a stationary policy, although its average reward will converge to the reward of a Nash equilibrium. The PHC algorithm is illustrated in Algorithm 4.4.
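The two parts described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reproduction of Algorithm 4.4. The class name `PHCAgent`, the parameter names `alpha`, `gamma`, `delta`, and `epsilon`, the epsilon-random exploration, and the per-action step cap `delta / (n_actions - 1)` are all choices made here for illustration. The cap follows one common formulation of PHC and may differ in detail from the book's listing.

```python
import random


class PHCAgent:
    """Illustrative policy hill climbing agent (names and defaults are assumptions)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 delta=0.01, epsilon=0.05, seed=None):
        if n_actions < 2:
            raise ValueError("PHC needs at least two actions")
        self.n_actions = n_actions
        self.alpha = alpha      # Q-learning step size
        self.gamma = gamma      # discount factor
        self.delta = delta      # policy learning rate in (0, 1]
        self.epsilon = epsilon  # exploration probability (an assumption)
        self.rng = random.Random(seed)
        # Part 1 state: action values Q(s, a), initialised to zero.
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        # Part 2 state: mixed strategy pi(s, a), initialised to uniform.
        self.pi = [[1.0 / n_actions] * n_actions for _ in range(n_states)]

    def act(self, state):
        """Sample an action from the current mixed strategy, with exploration."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self.rng.choices(range(self.n_actions), weights=self.pi[state])[0]

    def update(self, s, a, r, s_next):
        """Apply one Q-learning step, then one hill-climbing step on pi(s, .)."""
        # Part 1: standard Q-learning update of the value of (s, a).
        target = r + self.gamma * max(self.q[s_next])
        self.q[s][a] = (1 - self.alpha) * self.q[s][a] + self.alpha * target

        # Part 2: shift probability mass toward the highest-valued action.
        # Ties are broken in favour of the lowest action index.
        best = max(range(self.n_actions), key=lambda b: self.q[s][b])
        step = self.delta / (self.n_actions - 1)
        moved = 0.0
        for b in range(self.n_actions):
            if b != best:
                # Never remove more mass than the action currently holds,
                # so pi(s, .) stays a valid probability distribution.
                d = min(self.pi[s][b], step)
                self.pi[s][b] -= d
                moved += d
        self.pi[s][best] += moved
```

In a typical loop, the agent calls `act(state)`, receives a reward and next state from the environment, and then calls `update(state, action, reward, next_state)`. With two actions and `delta=1.0`, a single update moves all probability onto the currently highest-valued action, which matches the greedy behaviour of Q-learning. With a small `delta`, the policy drifts toward that action gradually.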
