Robots Learn Increasingly Complex Tasks with Intrinsic Motivation and Automatic Curriculum Learning | KI - Künstliche Intelligenz

Robots Learn Increasingly Complex Tasks with Intrinsic Motivation and Automatic Curriculum Learning

Domain Knowledge by Emergence of Affordances, Hierarchical Reinforcement and Active Imitation Learning

  • Project Reports

Abstract

Multi-task learning by robots poses the challenge of domain knowledge: the complexity of the tasks, the complexity of the actions required, and the relationships between tasks that enable transfer learning. We demonstrate that this domain knowledge can be learned to address the challenges of life-long learning. Specifically, the hierarchy between tasks of various complexities is key to inferring a curriculum from simple to composite tasks. We propose a framework for robots to learn sequences of actions of unbounded complexity in order to achieve multiple control tasks of various complexities. Our hierarchical reinforcement learning framework, named SGIM-SAHT, offers a new direction of research and aims to unify partial implementations on robot arms and mobile robots. We outline our contributions that enable robots to map multiple control tasks to sequences of actions: representations of task dependencies, intrinsically motivated exploration to learn task hierarchies, and active imitation learning. While learning the hierarchy of tasks, the framework infers its curriculum by deciding which tasks to explore first, how to transfer knowledge, and when, how and whom to imitate.
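
The idea of deciding which tasks to explore first can be illustrated with a small sketch. The abstract does not specify how SGIM-SAHT measures interest, so the code below assumes a simple learning-progress heuristic: the task whose recent error has dropped the most is practiced next. The names `ProgressTracker`, `choose_task`, the window size, and the `epsilon` exploration rate are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import random
from collections import deque


class ProgressTracker:
    """Keeps recent errors for one task and estimates learning progress.

    Assumption: progress is the drop in mean error between the older and
    the newer half of a sliding window. The article may use another measure.
    """

    def __init__(self, window=10):
        self.errors = deque(maxlen=window)

    def record(self, error):
        self.errors.append(error)

    def progress(self):
        n = len(self.errors)
        if n < 2:
            return 0.0  # not enough data to estimate progress
        half = n // 2
        errs = list(self.errors)
        older = sum(errs[:half]) / half
        newer = sum(errs[half:]) / (n - half)
        return older - newer


def choose_task(trackers, rng, epsilon=0.2):
    """Pick the task with the highest progress.

    With probability epsilon a random task is chosen instead, so that
    tasks that currently look unlearnable are still revisited.
    """
    names = sorted(trackers)
    if rng.random() < epsilon:
        return rng.choice(names)
    return max(names, key=lambda name: trackers[name].progress())


# Usage: a simple task that is improving and a composite task that is not yet.
reach = ProgressTracker(window=4)
for err in [1.0, 0.8, 0.4, 0.2]:
    reach.record(err)

stack = ProgressTracker(window=4)
for err in [0.5, 0.5, 0.5, 0.5]:
    stack.record(err)

rng = random.Random(0)
next_task = choose_task({"reach": reach, "stack": stack}, rng, epsilon=0.0)
print(next_task)
```

Under this heuristic, a composite task that cannot yet be improved shows no progress and is deferred until the simpler tasks it depends on have been learned, which is one way a simple-to-composite curriculum could emerge without being specified by hand.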


Acknowledgements

This work is partially supported by the European Regional Development Fund (ERDF) via the VITAAL CPER, by Institut Mines Telecom (IMT) and by the French Ministry of Research.

Author information

Corresponding author

Correspondence to Sao Mai Nguyen.

About this article

Cite this article

Nguyen, S.M., Duminy, N., Manoury, A. et al. Robots Learn Increasingly Complex Tasks with Intrinsic Motivation and Automatic Curriculum Learning. Künstl Intell 35, 81–90 (2021). https://doi.org/10.1007/s13218-021-00708-8

  • DOI: https://doi.org/10.1007/s13218-021-00708-8
