Journal of University of Science and Technology of China ›› 2019, Vol. 49 ›› Issue (10): 812-819. DOI: 10.3969/j.issn.0253-2778.2019.10.006

• Original Article •

Robot control policy transfer based on progressive neural network

SUI Hongjian, SHANG Weiwei, LI Xiang, CONG Shuang

  1. Department of Automation, University of Science and Technology of China, Hefei, Anhui 230027, China
  • Received: 2018-12-24  Revised: 2019-05-16  Accepted: 2019-05-16  Online: 2019-10-31  Published: 2019-05-16
  • Corresponding author: SHANG Weiwei
  • About the author: SUI Hongjian, male, born in 1993, master's student; research interest: robot transfer learning. E-mail: suihj@mail.ustc.edu.cn
  • Funding:
    Supported by the National Natural Science Foundation of China (51675501).




Abstract: In the field of robotic control, solving complicated control tasks with deep learning techniques is appealing, but collecting enough robot operating data to train deep learning models is difficult. Therefore, a transfer algorithm based on the progressive neural network (PNN) is proposed, built on the deep deterministic policy gradient (DDPG) framework. By combining the pretrained models in the model pool with the control model of the target task through a novel structure, the algorithm transfers control policies from the source task to the target task. The results of two simulation experiments show that the algorithm successfully transfers control policies learned in previous tasks to the control model of the target task, and that, compared with other baseline methods, it takes remarkably less time to learn the target task.
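The transfer scheme described in the abstract, in which a frozen, pretrained column feeds its hidden activations into the layers of a new target-task column, can be sketched with a minimal NumPy example. The class names (`Column`, `ProgressiveColumn`), the layer sizes, and the ReLU/linear activations below are illustrative assumptions, not the paper's actual architecture; in the paper, the new column would serve as the DDPG policy network for the target task.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

class Column:
    """One column of a progressive neural network: a small MLP whose
    hidden activations can later be reused through lateral connections."""
    def __init__(self, sizes, rng):
        # W[i] maps layer i (sizes[i]) to layer i+1 (sizes[i+1])
        self.W = [rng.standard_normal((sizes[i + 1], sizes[i])) * 0.1
                  for i in range(len(sizes) - 1)]

    def forward(self, x):
        acts, h = [], x
        for i, W in enumerate(self.W):
            h = W @ h
            if i < len(self.W) - 1:   # hidden layers use ReLU, output is linear
                h = relu(h)
            acts.append(h)
        return acts                   # activations of every layer

class ProgressiveColumn(Column):
    """Target-task column: layer i additionally receives the frozen source
    column's layer i-1 activation through a lateral adapter U[i]."""
    def __init__(self, sizes, rng):
        super().__init__(sizes, rng)
        self.U = [None] + [rng.standard_normal((sizes[i + 1], sizes[i])) * 0.1
                           for i in range(1, len(sizes) - 1)]

    def forward(self, x, source_acts):
        acts, h = [], x
        for i, W in enumerate(self.W):
            h = W @ h
            if i > 0:                 # add the lateral contribution
                h = h + self.U[i] @ source_acts[i - 1]
            if i < len(self.W) - 1:
                h = relu(h)
            acts.append(h)
        return acts

# A frozen source column stands in for a pretrained model from the model pool
rng = np.random.default_rng(0)
sizes = [4, 16, 16, 2]                # e.g. 4-D state in, 2-D action out
source = Column(sizes, rng)
target = ProgressiveColumn(sizes, rng)

state = rng.standard_normal(4)
action = target.forward(state, source.forward(state))[-1]
```

During training, only the new column's weights `W` and the adapters `U` would be updated; the source column stays frozen, which is what lets the transfer reuse previously learned policies without overwriting them.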

Key words: robot control, transfer learning, deep reinforcement learning, progressive neural network

CLC number: