Abstract
We present a novel solution to the problem of simulation-to-real transfer, which builds on recent advances in robot skill decomposition. Rather than focusing on minimizing the simulation-reality gap, we learn a set of diverse policies that are parameterized in a way that makes them easily reusable. This diversity and parameterization of low-level skills allows us to find a transferable policy that can use combinations and variations of different skills to solve more complex, high-level tasks. In particular, we first use simulation to jointly learn a policy for a set of low-level skills, and a "skill embedding" parameterization which can be used to compose them. Later, we learn high-level policies which actuate the low-level policies via this skill embedding parameterization. The high-level policies encode how and when to reuse the low-level skills together to achieve specific high-level tasks. Importantly, our method learns to control a real robot in joint-space to achieve these high-level tasks with little or no on-robot time, despite the fact that the low-level policies may not be perfectly transferable from simulation to reality, and that the low-level skills were not trained on any examples of high-level tasks. We illustrate the principles of our method using informative simulation experiments. We then verify its usefulness for real robotics problems by learning, transferring, and composing free-space and contact motion skills on a Sawyer robot using only joint-space control. We experiment with several techniques for composing pre-learned skills, and find that our method allows us to use both learning-based approaches and efficient search-based planning to achieve high-level tasks using only pre-learned skills.
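The hierarchy described above can be illustrated with a minimal sketch (this is a hypothetical illustration, not the authors' implementation): a pre-trained low-level policy conditions its joint-space actions on a latent skill embedding z, while a high-level policy chooses z at each step to select and modulate the skill being executed. The class and function names, dimensions, and random linear "networks" below are all assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

class LowLevelPolicy:
    """Stand-in for a skill policy trained in simulation, conditioned on a latent embedding."""
    def __init__(self, obs_dim, latent_dim, action_dim):
        # Random linear weights stand in for learned network parameters.
        self.W = rng.standard_normal((action_dim, obs_dim + latent_dim)) * 0.1

    def act(self, obs, z):
        # The skill embedding z is concatenated to the observation,
        # so one policy can express a family of related skills.
        x = np.concatenate([obs, z])
        return np.tanh(self.W @ x)  # bounded joint-space action

class HighLevelPolicy:
    """Stand-in for a high-level policy that outputs a skill embedding per step."""
    def __init__(self, obs_dim, latent_dim):
        self.W = rng.standard_normal((latent_dim, obs_dim)) * 0.1

    def select_skill(self, obs):
        return self.W @ obs

def rollout_step(high, low, obs):
    z = high.select_skill(obs)   # decide how/which low-level skill to use
    return low.act(obs, z)       # low-level skill produces the joint-space action

obs_dim, latent_dim, action_dim = 8, 2, 7
low = LowLevelPolicy(obs_dim, latent_dim, action_dim)
high = HighLevelPolicy(obs_dim, latent_dim)
action = rollout_step(high, low, np.full(obs_dim, 0.5))
print(action.shape)  # (7,)
```

The key design point this sketch captures is that the high-level policy never emits raw motor commands: it only steers the pre-learned, embedding-parameterized skills, which is what allows the same low-level policies to be reused across high-level tasks.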
R. Julian and E. Heiden contributed equally.
Acknowledgements
The authors would like to thank Angel Gonzalez Garcia, Jonathon Shen, and Chang Su for their work on the garage (https://github.com/rlworkgroup/garage) reinforcement learning for robotics framework, on which the software for this work was based. This research was supported in part by National Science Foundation grants IIS-1205249, IIS-1017134, EECS-0926052, the Office of Naval Research, the Okawa Foundation, and the Max-Planck-Society. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding organizations.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary material 1 (mp4 33362 KB)
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Julian, R. et al. (2020). Scaling Simulation-to-Real Transfer by Learning Composable Robot Skills. In: Xiao, J., Kröger, T., Khatib, O. (eds) Proceedings of the 2018 International Symposium on Experimental Robotics. ISER 2018. Springer Proceedings in Advanced Robotics, vol 11. Springer, Cham. https://doi.org/10.1007/978-3-030-33950-0_24
DOI: https://doi.org/10.1007/978-3-030-33950-0_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33949-4
Online ISBN: 978-3-030-33950-0
eBook Packages: Intelligent Technologies and Robotics (R0)