Publication year: 2003
In this paper we introduce a novel reinforcement learning algorithm called event-learning. The algorithm operates on events, defined as ordered pairs of two consecutive states. We define an event-value function and derive learning rules for it. Combining our method with a well-known robust control method, the SDS algorithm, we introduce Robust Policy Heuristics (RPH). We show that RPH, a fast-adapting non-Markovian policy, is particularly useful for coarse models of the environment and could be useful for some partially observed systems. RPH may help alleviate the ‘curse of dimensionality’ problem. Event-learning and RPH can be used to separate the time scales of value-function learning and of adaptation. We argue that defining modules is straightforward in event-learning and that event-learning makes planning feasible in the RL framework. Computer simulations of a rotational inverted pendulum with coarse discretization demonstrate the principle.
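The idea of attaching values to events rather than to state-action pairs can be illustrated with a small tabular sketch. The abstract does not state the learning rules themselves, so the update below is an assumption: a SARSA-style temporal-difference step applied to a table indexed by (state, next state) pairs. The names `make_event_table` and `update_event_value`, and the parameters `alpha` and `gamma`, are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def make_event_table():
    """Return an empty event-value table; unseen events default to 0.0."""
    return defaultdict(float)


def update_event_value(E, event, reward, next_event, alpha=0.1, gamma=0.9):
    """Apply one TD step to the value of `event`.

    An event is an ordered pair (state, next_state). `next_event` must start
    where `event` ends, so that the two events are consecutive.
    NOTE: this SARSA-style rule is an assumed stand-in, not the paper's rule.
    """
    if event[1] != next_event[0]:
        raise ValueError("events must be consecutive: event[1] == next_event[0]")
    # Bootstrapped target: immediate reward plus discounted value of the next event.
    target = reward + gamma * E[next_event]
    # Move the current estimate a fraction alpha toward the target.
    E[event] += alpha * (target - E[event])
    return E[event]


# Usage: states are plain integers, for example indices of a coarse discretization.
E = make_event_table()
update_event_value(E, event=(0, 1), reward=1.0, next_event=(1, 2), alpha=0.5)
```

Indexing the table by pairs of states keeps the learned values free of any reference to the low-level control action, which is consistent with the abstract's claim that value learning and controller adaptation can proceed on separate time scales.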