A Markov Decision Process (MDP) is a natural framework for
formulating sequential decision-making problems under uncertainty.
In recent years, researchers have greatly advanced algorithms for
learning and acting in MDPs. This book reviews such algorithms,
beginning with well-known dynamic programming methods for solving
MDPs such as policy iteration and value iteration, then describing
approximate dynamic programming methods such as trajectory-based
value iteration, and finally moving to reinforcement learning
methods such as Q-Learning, SARSA, and least-squares policy
iteration. It describes algorithms in a unified framework, giving
pseudocode together with memory and iteration complexity analysis
for each. Empirical evaluations of these techniques, with four
representations across four domains, provide insight into how these
algorithms perform with various feature sets in terms of running
time and performance. This tutorial provides practical guidance for
researchers seeking to extend DP and RL techniques to larger
domains through linear value function approximation. The practical
algorithms and empirical successes outlined also form a guide for
practitioners trying to weigh computational costs, accuracy
requirements, and representational concerns. Decision making in
large domains will always be challenging, but with the tools
presented here this challenge is not insurmountable.
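
To make the dynamic programming starting point concrete, the following is a minimal sketch of tabular value iteration on a small finite MDP. The NumPy array layout (transition and reward arrays indexed by action and state) and the tiny two-state example are illustrative assumptions for this sketch, not material taken from the book.

import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    # Tabular value iteration (a sketch; the array layout is an assumption).
    # P: transition probabilities, shape (A, S, S), P[a, s, s'] = Pr(s' | s, a).
    # R: expected immediate rewards, shape (A, S), R[a, s] = E[r | s, a].
    # Returns the optimal state values and a greedy policy.
    V = np.zeros(P.shape[1])
    while True:
        # Bellman optimality backup:
        # Q[a, s] = R[a, s] + gamma * sum_s' P[a, s, s'] * V[s']
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new

# Illustrative two-state, two-action MDP (numbers chosen arbitrarily).
P = np.array([[[0.9, 0.1], [0.0, 1.0]],
              [[0.2, 0.8], [0.5, 0.5]]])
R = np.array([[0.0, 1.0],
              [0.5, 0.0]])
V, policy = value_iteration(P, R)

The same backup structure underlies the approximate methods the book then develops, where the per-state value table is replaced by a linear combination of features.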