[1] LEVINE S, FINN C, DARRELL T, et al. End-to-end training of deep visuomotor policies[J]. The Journal of Machine Learning Research, 2016, 17(1): 1334-1373. [2] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Playing Atari with deep reinforcement learning[J]. Computer Science, 2013,arXiv:1312.5602. [3] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529. [4] RABINOWITZ N C, DESJARDINS G, RUSU A A, et al. Progressive neural networks: U.S. Patent Application 15/396,319[P]. 2017-11-23. [5] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[C]//Advances in neural information processing systems. 2012,25(2): 1097-1105. [6] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[J]. Computer Science, 2014, arXiv preprint arXiv:1409.1556. [7] SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. Boston, USA: IEEE, 2015: 1-9. [8] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. Las Vegas, USA: IEEE, 2016: 770-778. [9] FINN C, LEVINE S. Deep visual foresight for planning robot motion[C]//IEEE International Conference on Robotics and Automation. Ningbo, China: IEEE, 2017: 2786-2793. [10] YAHYA A, LI A, KALAKRISHNAN M, et al. Collective robot reinforcement learning with distributed asynchronous guided policy search[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems . Vancouver, Canada:IEEE, 2017: 79-86. [11] MNIH V, BADIA A P, MIRZA M, et al. Asynchronous methods for deep reinforcement learning[C]//International conference on machine learning. New York, USA: IEEE, 2016: 1928-1937. [12] LILLICRAP T P , HUNT J J , PRITZEL A , et al. Continuous control with deep reinforcement learning[J]. Computer Science, 2015, 8(6):A187. [13] SUTTON R S, BARTO A G. Reinforcement Learning: An Introduction[M]. MIT press, 2018. [14] SILVER D, LEVER G, HEESS N, et al. Deterministic policy gradient algorithms[C]//International Conference on Machine Learning. Beijing, China: IEEE, 2014: 387-395.
(上接第804页)
[12] ZOU Q, ZHANG H, WEN C K, et al. Concise derivation for generalized approximate message passing using expectation propagation[J]. IEEE Signal Processing Letters, 2018, 25(12): 1835-1839. [13] MINKA T P. Expectation propagation for approximate Bayesian inference[C]//Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence. Pittsburgh: Morgan Kaufmann Publishers Inc., 2001: 362-369. [14] RASMUSSEN CE, WILLIAMS K I. Gaussian Process for Machine Learning[M]. The MIT Press, 2006. [15] VILA J, SCHNITER P, RANGAN S, et al. Adaptive damping and mean removal for the generalized approximate message passing algorithm[C]//2015 IEEE International Conference on Acoustics, Speech and Signal Processing. Brisbane, Australia: IEEE, 2015: 2021-2025. [16] CALTAGIRONE F, ZDEBOROV L, KRZAKALA F. On convergence of approximate message passing[C]//2014 IEEE International Symposium on Information Theory. Honolulu, USA:IEEE, 2014: 1812-1816. [17] SCHNITER P, RANGAN S. Compressive phase retrieval via generalized approximate message passing[J]. IEEE Transactions on Signal Processing, 2015, 63(4): 1043-1055. [18] BEYME S, LEUNG C. Efficient computation of DFT of Zadoff-Chu sequences[J]. Electronics letters, 2009, 45(9): 461-463.
() () |