Hierarchical Deep Q-Network from imperfect demonstrations in Minecraft
Affiliation: 1. Artificial Intelligence Research Institute FRC CSC RAS, Russia; 2. Moscow Institute of Physics and Technology, Russia; 3. Higher School of Economics, Russia; 4. Moscow Aviation Institute, Russia
Abstract: We present Hierarchical Deep Q-Network (HDQfD), which won first place in the MineRL competition. HDQfD learns from imperfect demonstrations and exploits the hierarchical structure of expert trajectories. We introduce a procedure for extracting an effective sequence of meta-actions and subgoals from the demonstration data. We also present a structured, task-dependent replay buffer and an adaptive prioritization technique that together allow the HDQfD agent to gradually erase poor-quality expert data from the buffer. This paper details the HDQfD algorithm and reports experimental results in the Minecraft domain.
Keywords: Reinforcement learning; Minecraft; Demonstrations
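
As a rough illustration of what the abstract means by gradually phasing expert data out of a replay buffer, the sketch below keeps expert and agent transitions in separate pools and lowers the share of expert samples per batch over time. It is not the HDQfD procedure: the abstract describes an adaptive prioritization technique, whereas this sketch substitutes a plain linear schedule, and every name in it (`MixedReplayBuffer`, `demo_ratio`, `decay_steps`) is hypothetical.

```python
import random
from collections import deque


class MixedReplayBuffer:
    """Toy buffer with separate expert and agent pools (hypothetical design).

    Assumption: the expert share of each batch decays linearly with the
    number of sample() calls. The paper's adaptive prioritization is not
    reproduced here.
    """

    def __init__(self, capacity=10_000, ratio_start=1.0, ratio_end=0.0,
                 decay_steps=1_000, seed=0):
        self.demo = []                        # expert transitions
        self.agent = deque(maxlen=capacity)   # agent transitions, oldest dropped first
        self.ratio_start = ratio_start
        self.ratio_end = ratio_end
        self.decay_steps = decay_steps
        self.samples_drawn = 0                # number of sample() calls so far
        self.rng = random.Random(seed)

    def add_demo(self, transition):
        self.demo.append(transition)

    def add_agent(self, transition):
        self.agent.append(transition)

    def demo_ratio(self):
        """Current fraction of each batch taken from the expert pool."""
        progress = min(1.0, self.samples_drawn / self.decay_steps)
        return self.ratio_start + progress * (self.ratio_end - self.ratio_start)

    def sample(self, batch_size):
        if not self.demo and not self.agent:
            raise ValueError("cannot sample from an empty buffer")
        n_demo = round(batch_size * self.demo_ratio())
        if not self.agent:    # nothing collected yet: fall back to expert data
            n_demo = batch_size
        if not self.demo:     # no expert data at all: use agent data only
            n_demo = 0
        batch = [self.rng.choice(self.demo) for _ in range(n_demo)]
        batch += [self.rng.choice(self.agent) for _ in range(batch_size - n_demo)]
        self.samples_drawn += 1
        return batch
```

With `decay_steps=2`, for example, the first batch is drawn entirely from expert data, the second is split evenly, and the third contains agent data only. A per-task variant of the same idea would keep one such buffer per subtask.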
This article is indexed in ScienceDirect and other databases.