Error-Correction for AI Safety

  • Conference paper
Artificial General Intelligence (AGI 2020)

Abstract

The complex socio-technological debate underlying safety-critical and ethically relevant issues pertaining to AI development and deployment extends across heterogeneous research subfields and involves partly conflicting positions. In this context, it seems expedient to generate a minimalistic joint transdisciplinary basis that disambiguates references to specific subtypes of AI properties and risks, enabling error-correction in the transmission of ideas. In this paper, we introduce a high-level transdisciplinary system clustering of ethical distinction between antithetical clusters of Type I and Type II systems, which extends a cybersecurity-oriented AI safety taxonomy with considerations from psychology. Moreover, we review relevant Type I AI risks, reflect upon possible epistemological origins of hypothetical Type II AI from a cognitive sciences perspective and discuss the related human moral perception. Strikingly, our nuanced transdisciplinary analysis yields the figurative formulation of the so-called AI safety paradox, identifying AI control and value alignment as conjugate requirements in AI safety. Against this backdrop, we craft versatile multidisciplinary recommendations with ethical dimensions tailored to Type II AI safety. Overall, we suggest proactive and, importantly, corrective instead of prohibitive methods as a common basis for both Type I and Type II AI safety.

S. Ziesche—Independent Researcher.

Notes

  1. AI risks of Type Ib have already been recognized in the AI field. However, risk Ib is still understudied for intelligent systems (often referred to as “autonomous” systems) deployed in real-world environments offering a wider attack surface.

  2. It is not contested that inductive inferences are logically invalid, as shown by Popper. However, he also stated that “I hold that neither animals nor men use any procedure like induction, or any argument based on repetition of instances. The belief that we use induction is simply a mistake” [27] and that “induction simply does not exist” [27] (see [25] for an in-depth analysis of potentially related semantic misunderstandings). Arguments based on repetition of instances are existing but logically unfounded human habits, as assumed by Hume [25]; however, they additionally require a point of view that recognizes repetitions as such in the first place.

References

  1. Aliman, N.M., Kester, L.: Artificial creativity augmentation. In: Goertzel, B., Panov, A.I., Potapov, A., Yampolskiy, R. (eds.) AGI 2020. LNCS (LNAI), vol. 12177, pp. 23–33. Springer, Cham (2020)

  2. Aliman, N.M., Kester, L., Werkhoven, P., Ziesche, S.: Sustainable AI safety? Delphi Interdisc. Rev. Emerg. Technol. 2(4), 226–233 (2020)

  3. Atzil, S., Gao, W., Fradkin, I., Barrett, L.F.: Growing a social brain. Nat. Hum. Behav. 2(9), 624–636 (2018)

  4. Barrett, L.F.: The theory of constructed emotion: an active inference account of interoception and categorization. Soc. Cogn. Affect. Neurosci. 12(1), 1–23 (2017)

  5. Barrett, L.F., Simmons, W.K.: Interoceptive predictions in the brain. Nat. Rev. Neurosci. 16(7), 419 (2015)

  6. Baum, S.D.: Reconciliation between factions focused on near-term and long-term artificial intelligence. AI Soc. 33(4), 565–572 (2017). https://doi.org/10.1007/s00146-017-0734-3

  7. Benedek, M.: The neuroscience of creative idea generation. In: Kapoula, Z., Volle, E., Renoult, J., Andreatta, M. (eds.) Exploring Transdisciplinarity in Art and Sciences, pp. 31–48. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-76054-4_2

  8. Bieger, J., Thórisson, K.R., Wang, P.: Safe baby AGI. In: Bieger, J., Goertzel, B., Potapov, A. (eds.) AGI 2015. LNCS (LNAI), vol. 9205, pp. 46–49. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21365-1_5

  9. Bigman, Y.E., Waytz, A., Alterovitz, R., Gray, K.: Holding robots responsible: the elements of machine morality. Trends Cogn. Sci. 23(5), 365–368 (2019)

  10. Bostrom, N.: The superintelligent will: motivation and instrumental rationality in advanced artificial agents. Mind. Mach. 22(2), 71–85 (2012). https://doi.org/10.1007/s11023-012-9281-3

  11. Brockman, J.: Possible Minds: Twenty-Five Ways of Looking at AI. Penguin Press, London (2019)

  12. Bruineberg, J., Kiverstein, J., Rietveld, E.: The anticipating brain is not a scientist: the free-energy principle from an ecological-enactive perspective. Synthese 195(6), 2417–2444 (2016). https://doi.org/10.1007/s11229-016-1239-1

  13. Clark, A., Friston, K., Wilkinson, S.: Bayesing qualia: consciousness as inference, not raw datum. J. Conscious. Stud. 26(9–10), 19–33 (2019)

  14. De Rooij, A., Valtulina, J.: The predictive creative mind: a first look at spontaneous predictions and evaluations during idea generation. Front. Psychol. 10, 2465 (2019)

  15. Deutsch, D.: Creative blocks. https://aeon.co/essays/how-close-are-we-to-creating-artificial-intelligence. Accessed Nov 2019

  16. Deutsch, D.: The Beginning of Infinity: Explanations that Transform the World. Penguin, New York (2011)

  17. Deutsch, D.: Constructor theory. Synthese 190(18), 4331–4359 (2013). https://doi.org/10.1007/s11229-013-0279-z

  18. Dietrich, A.: How Creativity Happens in the Brain. Springer, London (2015). https://doi.org/10.1057/9781137501806

  19. Friston, K.: Am I self-conscious? (Or does self-organization entail self-consciousness?). Front. Psychol. 9, 579 (2018)

  20. Friston, K.: A free energy principle for a particular physics. arXiv preprint arXiv:1906.10184 (2019)

  21. Goertzel, B.: The real reasons we don’t have AGI yet. https://www.kurzweilai.net/the-real-reasons-we-dont-have-agi-yet. Accessed 21 Nov 2019

  22. Goertzel, B.: Infusing advanced AGIs with human-like value systems: two theses. J. Evol. Technol. 26(1), 50–72 (2016)

  23. Gray, K., Schein, C., Ward, A.F.: The myth of harmless wrongs in moral cognition: automatic dyadic completion from sin to suffering. J. Exp. Psychol. Gen. 143(4), 1600 (2014)

  24. Gray, K., Wegner, D.M.: Feeling robots and human zombies: mind perception and the uncanny valley. Cognition 125(1), 125–130 (2012)

  25. Greenland, S.: Induction versus Popper: substance versus semantics. Int. J. Epidemiol. 27(4), 543–548 (1998)

  26. Parr, T., Da Costa, L., Friston, K.: Markov blankets, information geometry and stochastic thermodynamics. Philos. Trans. R. Soc. A 378(2164), 20190159 (2019)

  27. Popper, K.: In: Schilpp, P.A. (ed.) The Philosophy of Karl Popper, vol. 2, p. 1015. Open Court Press, Chicago (1974)

  28. Popper, K.R.: The Poverty of Historicism. Routledge & Kegan Paul, Abingdon (1966)

  29. Russell, S.: How to Stop Superhuman A.I. Before It Stops Us. https://www.nytimes.com/2019/10/08/opinion/artificial-intelligence.html?module=inline. Accessed 21 Nov 2019

  30. Schein, C., Gray, K.: The theory of dyadic morality: reinventing moral judgment by redefining harm. Pers. Soc. Psychol. Rev. 22(1), 32–70 (2018)

  31. Schulkin, J., Sterling, P.: Allostasis: a brain-centered, predictive mode of physiological regulation. Trends Neurosci. 42(10), 740–752 (2019)

  32. Thórisson, K.R., Bieger, J., Li, X., Wang, P.: Cumulative learning. In: Hammer, P., Agrawal, P., Goertzel, B., Iklé, M. (eds.) AGI 2019. LNCS (LNAI), vol. 11654, pp. 198–208. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-27005-6_20

  33. Wang, P.: Motivation management in AGI systems. In: Bach, J., Goertzel, B., Iklé, M. (eds.) AGI 2012. LNCS (LNAI), vol. 7716, pp. 352–361. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35506-6_36

  34. Wiese, W.: Perceptual presence in the Kuhnian-Popperian Bayesian brain: a commentary on Anil K. Seth. Johannes Gutenberg-Universität Mainz (2016)

  35. Yampolskiy, R.V.: Taxonomy of pathways to dangerous artificial intelligence. In: Workshops at the Thirtieth AAAI Conference on Artificial Intelligence (2016)

Acknowledgement

Nadisha-Marie Aliman would like to thank David Deutsch for providing concise feedback on AI safety and Joscha Bach for a relevant exchange on AI ethics.

Corresponding author

Correspondence to Nadisha-Marie Aliman.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Aliman, NM. et al. (2020). Error-Correction for AI Safety. In: Goertzel, B., Panov, A., Potapov, A., Yampolskiy, R. (eds) Artificial General Intelligence. AGI 2020. Lecture Notes in Computer Science, vol 12177. Springer, Cham. https://doi.org/10.1007/978-3-030-52152-3_2

  • DOI: https://doi.org/10.1007/978-3-030-52152-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-52151-6

  • Online ISBN: 978-3-030-52152-3

  • eBook Packages: Computer Science (R0)