Robot control policy transfer based on progressive neural network

doi:10.3969/j.issn.0253-2778.2019.10.006

Journal of University of Science and Technology of China, 2019, Vol. 49, Issue 10: 812-819. DOI: 10.3969/j.issn.0253-2778.2019.10.006

• Original Paper •

SUI Hongjian, SHANG Weiwei, LI Xiang, CONG Shuang

Department of Automation, University of Science and Technology of China, Hefei 230027

Received: 2018-12-24 Revised: 2019-05-16 Accepted: 2019-05-16 Online: 2019-10-31 Published: 2019-05-16

Abstract

In robotic control, solving complicated control tasks with deep learning techniques is appealing. However, collecting enough robot operating data to train deep learning models is difficult. This paper therefore proposes a transfer approach based on the progressive neural network (PNN) and deep deterministic policy gradient (DDPG). By linking the current task model to the pretrained task models in a model pool through a novel structure, the control strategy learned by the pretrained models is transferred to the current task model. Simulation experiments show that the proposed approach successfully transfers control policies learned on the source task to the current task. Compared with the baselines, it takes remarkably less time to reach the same performance in all experiments.
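The abstract does not spell out the linking structure, so the following is only a minimal sketch of the generic progressive-network idea it builds on: a new policy column for the current task reads the hidden features of frozen, pretrained columns through extra lateral weights. All names (`Column`, `ProgressiveColumn`, `laterals`), the two-layer layout, and the tanh activations are assumptions made for illustration, not the authors' architecture, and DDPG training of the new column is omitted.

```python
import math
import random

def linear(weights, bias, x):
    """Affine map; weights is a list of rows, x is a list of floats."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialisation)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

class Column:
    """Two-layer deterministic policy: state -> hidden (tanh) -> action (tanh)."""
    def __init__(self, state_dim, hidden_dim, action_dim, rng):
        self.w1, self.b1 = init_layer(hidden_dim, state_dim, rng)
        self.w2, self.b2 = init_layer(action_dim, hidden_dim, rng)

    def hidden(self, state):
        return [math.tanh(v) for v in linear(self.w1, self.b1, state)]

    def act(self, state):
        return [math.tanh(v)
                for v in linear(self.w2, self.b2, self.hidden(state))]

class ProgressiveColumn(Column):
    """Current-task column whose output layer also reads source-column features."""
    def __init__(self, state_dim, hidden_dim, action_dim, sources, rng):
        super().__init__(state_dim, hidden_dim, action_dim, rng)
        self.sources = sources  # pretrained columns from the model pool, kept frozen
        # One lateral weight matrix per source column (source hidden -> action).
        self.laterals = [init_layer(action_dim, len(src.b1), rng)[0]
                         for src in sources]

    def act(self, state):
        pre = linear(self.w2, self.b2, self.hidden(state))
        for src, u in zip(self.sources, self.laterals):
            lateral = linear(u, [0.0] * len(pre), src.hidden(state))
            pre = [p + l for p, l in zip(pre, lateral)]
        return [math.tanh(v) for v in pre]

    def trainable_parameters(self):
        """Only the new column and its lateral weights would be updated."""
        return [self.w1, self.b1, self.w2, self.b2, *self.laterals]

# Usage: one pretrained source column, one new column for the current task.
rng = random.Random(0)
source = Column(state_dim=3, hidden_dim=8, action_dim=2, rng=rng)
target = ProgressiveColumn(3, 8, 2, sources=[source], rng=rng)
action = target.act([0.1, -0.2, 0.3])
```

In this sketch the source column's weights never appear in `trainable_parameters()`, which is what keeps the pretrained policy intact while the new column learns how much of its features to reuse.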

Key words: robot control, transfer learning, deep reinforcement learning, progressive neural network

CLC Number: TP242

SUI Hongjian, SHANG Weiwei, LI Xiang, CONG Shuang. Robot control policy transfer based on progressive neural network[J]. Journal of University of Science and Technology of China, 2019, 49(10): 812-819.

()
()

[1]	QU Zhaowei,ZHAO Yanjiao,WANG Xiaoru. A multi-domain sentiment classification model based on sample filtering and transfer learning [J]. Journal of University of Science and Technology of China, 2019, 49(1): 8-14.
[2]	YANG Ziwen, CHEN Lei, PU Jianyu. Recognizing emotions from abstract paintings using convolutional neural network with two-layer transfer learning scheme [J]. Journal of University of Science and Technology of China, 2019, 49(1): 40-48.

Robot control policy transfer based on progressive neural network

PDF (PC)

Knowledge

Abstract

Cite this article

share this article

References

Related Articles 2

Recommended Articles

Metrics

Comments