
Scaling Simulation-to-Real Transfer by Learning Composable Robot Skills

  • Conference paper
  • In: Proceedings of the 2018 International Symposium on Experimental Robotics (ISER 2018)
  • Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 11)


Abstract

We present a novel solution to the problem of simulation-to-real transfer, which builds on recent advances in robot skill decomposition. Rather than focusing on minimizing the simulation-reality gap, we learn a set of diverse policies that are parameterized in a way that makes them easily reusable. This diversity and parameterization of low-level skills allow us to find a transferable policy that can use combinations and variations of different skills to solve more complex, high-level tasks. In particular, we first use simulation to jointly learn a policy for a set of low-level skills and a “skill embedding” parameterization which can be used to compose them. Later, we learn high-level policies which actuate the low-level policies via this skill embedding parameterization. The high-level policies encode how and when to reuse the low-level skills together to achieve specific high-level tasks. Importantly, our method learns to control a real robot in joint space to achieve these high-level tasks with little or no on-robot time, despite the fact that the low-level policies may not be perfectly transferable from simulation to reality, and that the low-level skills were not trained on any examples of high-level tasks. We illustrate the principles of our method with informative simulation experiments. We then verify its usefulness for real robotics problems by learning, transferring, and composing free-space and contact motion skills on a Sawyer robot using only joint-space control. We experiment with several techniques for composing pre-learned skills, and find that our method supports both learning-based approaches and efficient search-based planning for achieving high-level tasks using only pre-learned skills.

R. Julian and E. Heiden contributed equally.
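
To make the two-level structure concrete, below is a minimal sketch of the interface the abstract describes: a low-level policy conditioned on a latent skill embedding z, and a search-based composer that selects among pre-learned embeddings to achieve a high-level goal. This is an illustration under our own assumptions, written in PyTorch; the class and function names (LatentSkillPolicy, compose_by_search, step_fn), the network architecture, and all dimensions are hypothetical and not taken from the paper.

    import torch
    import torch.nn as nn

    class LatentSkillPolicy(nn.Module):
        """Low-level policy pi(a | s, z): an MLP conditioned on a skill latent z.

        Hypothetical stand-in for an embedding-conditioned skill policy;
        the architecture and sizes are illustrative only.
        """

        def __init__(self, obs_dim, act_dim, latent_dim, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim + latent_dim, hidden), nn.Tanh(),
                nn.Linear(hidden, hidden), nn.Tanh(),
                nn.Linear(hidden, act_dim),
            )

        def forward(self, obs, z):
            # Condition on the skill embedding by concatenation; the output
            # is interpreted here as a joint-space action.
            return self.net(torch.cat([obs, z], dim=-1))

    def compose_by_search(policy, step_fn, obs0, candidate_zs, horizon, cost_fn):
        """Search-based composition: roll each candidate skill latent forward
        through a dynamics model (step_fn) and keep the one whose short
        rollout minimizes the high-level task cost."""
        best_z, best_cost = None, float("inf")
        for z in candidate_zs:
            obs = obs0
            with torch.no_grad():
                for _ in range(horizon):
                    obs = step_fn(obs, policy(obs, z))
            cost = cost_fn(obs)
            if cost < best_cost:
                best_z, best_cost = z, cost
        return best_z, best_cost

    # Hypothetical usage: pick among 8 candidate latents to drive a 7-DoF
    # arm's observation toward a goal, using a toy stand-in for a simulator.
    policy = LatentSkillPolicy(obs_dim=7, act_dim=7, latent_dim=4)
    goal = torch.ones(7)
    z_star, cost = compose_by_search(
        policy,
        step_fn=lambda obs, act: obs + 0.1 * act,  # toy linear dynamics
        obs0=torch.zeros(7),
        candidate_zs=[torch.randn(4) for _ in range(8)],
        horizon=20,
        cost_fn=lambda obs: torch.linalg.norm(obs - goal).item(),
    )

In the paper itself, the low-level skills and the embedding space are learned jointly in simulation, and high-level tasks are then solved either by a learned policy that outputs embeddings or by search-based planning over them; the sketch above only illustrates that interface, not the training procedure.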




Acknowledgements

The authors would like to thank Angel Gonzalez Garcia, Jonathon Shen, and Chang Su for their work on the garage reinforcement learning for robotics framework (https://github.com/rlworkgroup/garage), on which the software for this work was based. This research was supported in part by National Science Foundation grants IIS-1205249, IIS-1017134, and EECS-0926052, the Office of Naval Research, the Okawa Foundation, and the Max Planck Society. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding organizations.

Author information

Correspondence to Ryan Julian.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

Supplementary material 1 (MP4, 33,362 KB)


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Julian, R. et al. (2020). Scaling Simulation-to-Real Transfer by Learning Composable Robot Skills. In: Xiao, J., Kröger, T., Khatib, O. (eds) Proceedings of the 2018 International Symposium on Experimental Robotics. ISER 2018. Springer Proceedings in Advanced Robotics, vol 11. Springer, Cham. https://doi.org/10.1007/978-3-030-33950-0_24

