Event-learning and robust policy heuristics
Authors:András Lőrincz  Imre Pólik  István Szita
Institution:Department of Information Systems, Eötvös Loránd University, Pázmány Péter sétány 1/C, H-1117, Budapest, Hungary
Abstract:In this paper we introduce a novel reinforcement learning algorithm called event-learning. The algorithm uses events, i.e., ordered pairs of two consecutive states. We define the event-value function and derive the corresponding learning rules. Combining our method with a well-known robust control method, the SDS algorithm, we introduce Robust Policy Heuristics (RPH). It is shown that RPH, a fast-adapting non-Markovian policy, is particularly useful for coarse models of the environment and could be useful for some partially observed systems. RPH may help alleviate the ‘curse of dimensionality’ problem. Event-learning and RPH can be used to separate the time scales of value-function learning and adaptation. We argue that the definition of modules is straightforward for event-learning and that event-learning makes planning feasible in the RL framework. Computer simulations of a rotational inverted pendulum with coarse discretization demonstrate the principle.
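As a rough illustration of the idea in the abstract, the following is a minimal tabular sketch of an event-value update, assuming a Q-learning-style bootstrap over (state, desired-next-state) pairs. The function name `event_learning_update` and the parameters `alpha` and `gamma` are illustrative assumptions, not the authors' exact rule.

```python
from collections import defaultdict

def event_learning_update(E, x, y_desired, x_next, reward,
                          states, alpha=0.1, gamma=0.9):
    """One sketched tabular update of an event-value table.

    E maps an event (state, desired_next_state) to a value.
    The bootstrap target uses the best event value available
    from the actually reached state x_next.
    """
    best_next = max(E[(x_next, y)] for y in states)
    target = reward + gamma * best_next
    E[(x, y_desired)] += alpha * (target - E[(x, y_desired)])
    return E

# Toy usage on a two-state problem:
states = [0, 1]
E = defaultdict(float)
event_learning_update(E, x=0, y_desired=1, x_next=1, reward=1.0, states=states)
```

In this sketch the policy layer (e.g., the SDS controller of RPH) would be responsible for actually steering the system toward the desired next state, which is what lets the value-learning and adaptation time scales be separated.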
Keywords:Reinforcement learning  Event-learning  Robust control  Continuous SDS controller
This article is indexed in ScienceDirect and other databases.