{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,10,16]],"date-time":"2024-10-16T04:26:51Z","timestamp":1729052811624},"publisher-location":"Cham","reference-count":18,"publisher":"Springer Nature Switzerland","isbn-type":[{"value":"9783031739026","type":"print"},{"value":"9783031739033","type":"electronic"}],"license":[{"start":{"date-parts":[[2024,10,16]],"date-time":"2024-10-16T00:00:00Z","timestamp":1729036800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"},{"start":{"date-parts":[[2024,10,16]],"date-time":"2024-10-16T00:00:00Z","timestamp":1729036800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025]]},"DOI":"10.1007\/978-3-031-73903-3_8","type":"book-chapter","created":{"date-parts":[[2024,10,15]],"date-time":"2024-10-15T21:01:53Z","timestamp":1729026113000},"page":"113-127","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Non-maximizing Policies that\u00a0Fulfill Multi-criterion Aspirations in\u00a0Expectation"],"prefix":"10.1007","author":[{"ORCID":"http:\/\/orcid.org\/0009-0003-7815-8238","authenticated-orcid":false,"given":"Simon","family":"Dima","sequence":"first","affiliation":[]},{"ORCID":"http:\/\/orcid.org\/0009-0000-7261-3031","authenticated-orcid":false,"given":"Simon","family":"Fischer","sequence":"additional","affiliation":[]},{"ORCID":"http:\/\/orcid.org\/0000-0002-0442-8077","authenticated-orcid":false,"given":"Jobst","family":"Heitzig","sequence":"additional","affiliation":[]},{"ORCID":"http:\/\/orcid.org\/0009-0008-6333-7598","authenticated-orcid":false,"given":"Joss","family":"Oliver","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,10,16]]},"reference":[{"key":"8_CR1","unstructured":"Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., Man\u00e9, D.: Concrete problems in AI safety. arXiv preprint arXiv:1606.06565 (2016)"},{"key":"8_CR2","unstructured":"Bonet, B., Geffner, H.: Solving POMDPs: RTDP-Bel vs. point-based algorithms. In: IJCAI, pp. 1641\u20131646. Pasadena CA (2009)"},{"key":"8_CR3","unstructured":"Chen, L., et al.: Decision transformer: reinforcement learning via sequence modeling (2021)"},{"key":"8_CR4","unstructured":"Clymer, J., et al.: Generalization analogies (GENIES): a testbed for generalizing AI oversight to hard-to-measure domains. arXiv preprint arXiv:2311.07723 (2023)"},{"key":"8_CR5","unstructured":"Conitzer, V., et al.: Social choice for AI alignment: dealing with diverse human feedback. arXiv preprint arXiv:2404.10271 (2024)"},{"key":"8_CR6","unstructured":"Dalrymple, D., et al.: Towards guaranteed safe AI: a framework for ensuring robust and reliable AI systems. arXiv preprint arXiv:2405.06624 (2024)"},{"key":"8_CR7","doi-asserted-by":"publisher","unstructured":"Feinberg, E.A., Sonin, I.: Notes on equivalent stationary policies in Markov decision processes with total rewards. Math. Meth. Oper. Res. 44(2), 205\u2013221 (1996). https:\/\/doi.org\/10.1007\/BF01194331","DOI":"10.1007\/BF01194331"},{"issue":"1","key":"8_CR8","first-page":"89","volume":"2631","author":"G Kern-Isberner","year":"2024","unstructured":"Kern-Isberner, G., Spohn, W.: Inductive reasoning, conditionals, and belief dynamics. J. Appl. Log. 2631(1), 89 (2024)","journal-title":"J. Appl. Log."},{"key":"8_CR9","unstructured":"Miryoosefi, S., Brantley, K., Daum\u00e9, H., Dud\u00edk, M., Schapire, R.E.: Reinforcement learning with convex constraints. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems (2019)"},{"issue":"2","key":"8_CR10","doi-asserted-by":"publisher","first-page":"129","DOI":"10.1037\/h0042769","volume":"63","author":"HA Simon","year":"1956","unstructured":"Simon, H.A.: Rational choice and the structure of the environment. Psychol. Rev. 63(2), 129 (1956)","journal-title":"Psychol. Rev."},{"key":"8_CR11","unstructured":"Skalse, J.M.V., Farrugia-Roberts, M., Russell, S., Abate, A., Gleave, A.: Invariance in policy optimisation and partial identifiability in reward learning. In: International Conference on Machine Learning, pp. 32033\u201332058. PMLR (2023)"},{"key":"8_CR12","unstructured":"Subramani, R., et al.: On the expressivity of objective-specification formalisms in reinforcement learning. arXiv preprint arXiv:2310.11840 (2023)"},{"key":"8_CR13","unstructured":"Taylor, J.: Quantilizers: a safer alternative to maximizers for limited optimization (2015). https:\/\/intelligence.org\/files\/QuantilizersSaferAlternative.pdf"},{"key":"8_CR14","doi-asserted-by":"crossref","unstructured":"Tschantz, A., et al.: Reinforcement learning through active inference (2020)","DOI":"10.1109\/IJCNN48605.2020.9207382"},{"key":"8_CR15","doi-asserted-by":"crossref","unstructured":"Vaidya, P.: Speeding-up linear programming using fast matrix multiplication. In: 30th Annual Symposium on Foundations of Computer Science, pp. 332\u2013337 (1989)","DOI":"10.1109\/SFCS.1989.63499"},{"key":"8_CR16","doi-asserted-by":"publisher","first-page":"104186","DOI":"10.1016\/j.engappai.2021.104186","volume":"100","author":"P Vamplew","year":"2021","unstructured":"Vamplew, P., Foale, C., Dazeley, R., Bignold, A.: Potential-based multiobjective reinforcement learning approaches to low-impact agents for AI safety. Eng. Appl. Artif. Intell. 100, 104186 (2021)","journal-title":"Eng. Appl. Artif. Intell."},{"issue":"1","key":"8_CR17","doi-asserted-by":"publisher","first-page":"109","DOI":"10.7146\/math.scand.a-10655","volume":"11","author":"JG Wendel","year":"1962","unstructured":"Wendel, J.G.: A problem in geometric probability. Math. Scand. 11(1), 109\u2013111 (1962)","journal-title":"Math. Scand."},{"key":"8_CR18","unstructured":"Yen, I.E.H., Zhong, K., Hsieh, C.J., Ravikumar, P.K., Dhillon, I.S.: Sparse linear programming via primal and dual augmented coordinate descent. In: Advances in Neural Information Processing Systems, vol. 28 (2015)"}],"container-title":["Lecture Notes in Computer Science","Algorithmic Decision Theory"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-73903-3_8","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,15]],"date-time":"2024-10-15T21:03:02Z","timestamp":1729026182000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-73903-3_8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,16]]},"ISBN":["9783031739026","9783031739033"],"references-count":18,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-73903-3_8","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"value":"0302-9743","type":"print"},{"value":"1611-3349","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10,16]]},"assertion":[{"value":"16 October 2024","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"The authors have no competing interests to declare.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Disclosure of Interests"}},{"value":"Authors are listed in alphabetical ordering and have contributed equally. Simon Dima: Formal analysis, Writing - Original Draft, Writing - Review & Editing. Simon Fischer: Formal analysis, Writing - Original Draft, Writing - Review & Editing. Jobst Heitzig: Conceptualization, Methodology, Software, Writing - Original Draft, Writing - Review & Editing, Supervision. Joss Oliver: Formal analysis, Writing - Original Draft, Writing - Review & Editing.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"CRediT Author Statement"}},{"value":"ADT","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"International Conference on Algorithmic Decision Theory","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"New Brunswick, NJ","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"USA","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2024","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"14 October 2024","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"16 October 2024","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"8","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"aldt2024","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"https:\/\/preflib.github.io\/adt2024\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}}]}}