Abstract
We develop a hybrid control approach for robot learning that combines learned predictive models with experience-based state-action policy mappings to improve the learning capabilities of robotic systems. Predictive models provide an understanding of the task and the physics, which improves sample efficiency, while experience-based policy mappings act as “muscle memory”, encoding favorable actions as experiences that override planned actions. Tools from hybrid control are used to create an algorithmic approach that combines the two: hybrid learning, a method for efficiently learning motor skills by systematically combining and improving on the performance of predictive models and experience-based policies. A deterministic variation of hybrid learning is derived and then extended into a stochastic implementation that relaxes some of the key assumptions in the original derivation. Each variation is tested with experience-based learning methods (where the robot interacts with the environment to gain experience) as well as imitation learning methods (where experience is provided through demonstrations and tested in the environment). The results show that our method improves both the performance and the sample efficiency of learning motor skills in a variety of experimental domains.
T. D. Murphey: This material is based upon work supported by the National Science Foundation under Grant CNS 1837515. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the aforementioned institutions. For videos of results and code, please visit https://sites.google.com/view/hybrid-learning-theory.
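To make the idea in the abstract concrete, the following is a minimal, self-contained sketch of a hybrid control loop in the spirit described above: a sampling-based planner using a predictive model (here a hand-coded stand-in for a learned model) proposes an action, and a nearest-neighbor “muscle memory” of past experience overrides it when a remembered action looks better. All names, the toy dynamics, and the override rule are illustrative assumptions, not the algorithm derived in the paper.

```python
# Illustrative sketch only: the toy dynamics, costs, and override rule
# below are assumptions for exposition, not the paper's method.
import numpy as np

rng = np.random.default_rng(0)

def dynamics(x, u):
    # Toy double integrator standing in for a learned predictive model:
    # state x = [position, velocity], scalar control u.
    return x + 0.1 * np.array([x[1], u])

def cost(x, u):
    # Quadratic cost: drive the state to the origin with small effort.
    return float(x @ x + 0.1 * u ** 2)

def plan_action(x, horizon=10, n_samples=64):
    # Random-shooting MPC with the model: sample action sequences,
    # roll them out, and return the first action of the cheapest one.
    best_u, best_cost = 0.0, np.inf
    for _ in range(n_samples):
        us = rng.uniform(-1.0, 1.0, horizon)
        xi, c = x.copy(), 0.0
        for u in us:
            c += cost(xi, u)
            xi = dynamics(xi, u)
        if c < best_cost:
            best_u, best_cost = us[0], c
    return best_u, best_cost

class ExperiencePolicy:
    # Nearest-neighbor "muscle memory": store (state, action, cost)
    # triples and recall the stored action near previously seen states.
    def __init__(self, radius=0.2):
        self.radius = radius
        self.memory = []

    def query(self, x):
        if not self.memory:
            return None
        s, a, c = min(self.memory, key=lambda m: np.linalg.norm(m[0] - x))
        return (a, c) if np.linalg.norm(s - x) < self.radius else None

    def store(self, x, u, c):
        self.memory.append((x.copy(), u, c))

policy = ExperiencePolicy()
x = np.array([1.0, 0.0])
for t in range(100):
    u_plan, c_plan = plan_action(x)
    recalled = policy.query(x)
    # Hybrid rule (simplified): the experience-based action overrides
    # the planned action when remembered experience predicts lower cost.
    u = recalled[0] if recalled is not None and recalled[1] < c_plan else u_plan
    policy.store(x, u, min(c_plan, recalled[1]) if recalled else c_plan)
    x = dynamics(x, u)

print("final state:", x)
```

Where this sketch uses an ad hoc cost comparison to decide when the experience-based action takes over, the paper derives that switching behavior from hybrid control theory.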
Notes
1. We exclude the dependency on the action for clarity; one could always append the action to the state vector and recover the dependency.
2. We will add the uncertainty to the hybrid problem in the stochastic derivation of our approach to hybrid learning.
3. We avoid instability of the robotic system from switching control strategies because we later develop and use the best action for all \(\tau \in [0, t_H]\) rather than searching for a particular time at which to switch.
4. We refer to “uncontrolled” as the unaugmented control response of the robotic agent subject to a stochastic policy \(\pi \).
5. The motivation is to use the optimal density function to gauge how well the policy \(\pi \) performs (a sketch of this construction follows these notes).
6. The same default parameters for SAC are used for this experiment.
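As context for note 5, the following is a sketch of the standard construction from the path integral control literature; the symbols \(J\), \(\lambda\), and \(p\) are conventions of that literature, not definitions taken from this paper. An optimal trajectory density is obtained by exponentially reweighting the uncontrolled trajectory density with the trajectory cost, and a policy is gauged by how close its induced trajectory density is to the optimal one:

\[
q^{*}(\tau) \;\propto\; \exp\!\Big(-\tfrac{1}{\lambda}\, J(\tau)\Big)\, p(\tau),
\qquad
\mathrm{score}(\pi) = D_{\mathrm{KL}}\big(q^{*} \,\Vert\, p_{\pi}\big),
\]

where \(J(\tau)\) is the cost of trajectory \(\tau\), \(\lambda > 0\) is a temperature parameter, \(p\) is the uncontrolled trajectory density (see note 4), and \(p_{\pi}\) is the trajectory density induced by running the policy \(\pi\).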
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Abraham, I., Broad, A., Pinosky, A., Argall, B., Murphey, T.D. (2021). Hybrid Control for Learning Motor Skills. In: LaValle, S.M., Lin, M., Ojala, T., Shell, D., Yu, J. (eds) Algorithmic Foundations of Robotics XIV. WAFR 2020. Springer Proceedings in Advanced Robotics, vol 17. Springer, Cham. https://doi.org/10.1007/978-3-030-66723-8_27
DOI: https://doi.org/10.1007/978-3-030-66723-8_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66722-1
Online ISBN: 978-3-030-66723-8
eBook Packages: Intelligent Technologies and Robotics (R0)