{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,11,8]],"date-time":"2022-11-08T05:29:44Z","timestamp":1667885384469},"reference-count":0,"publisher":"Association for the Advancement of Artificial Intelligence (AAAI)","issue":"01","license":[{"start":{"date-parts":[[2019,7,17]],"date-time":"2019-07-17T00:00:00Z","timestamp":1563321600000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/www.aaai.org"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["AAAI"],"abstract":"Deep reinforcement learning (DRL) has gained great success by learning directly from high-dimensional sensory inputs, yet is notorious for the lack of interpretability. Interpretability of the subtasks is critical in hierarchical decision-making as it increases the transparency of black-box-style DRL approach and helps the RL practitioners to understand the high-level behavior of the system better. In this paper, we introduce symbolic planning into DRL and propose a framework of Symbolic Deep Reinforcement Learning (SDRL) that can handle both high-dimensional sensory inputs and symbolic planning. The task-level interpretability is enabled by relating symbolic actions to options. This framework features a planner \u2013 controller \u2013 meta-controller architecture, which takes charge of subtask scheduling, data-driven subtask learning, and subtask evaluation, respectively. The three components cross-fertilize each other and eventually converge to an optimal symbolic plan along with the learned subtasks, bringing together the advantages of long-term planning capability with symbolic knowledge and end-to-end reinforcement learning directly from a high-dimensional sensory input. 
Experimental results validate the interpretability of subtasks, along with improved data efficiency compared with state-of-the-art approaches.","DOI":"10.1609\/aaai.v33i01.33019995","type":"journal-article","created":{"date-parts":[[2019,8,15]],"date-time":"2019-08-15T07:33:56Z","timestamp":1565854436000},"page":"9995-9996","source":"Crossref","is-referenced-by-count":0,"title":["Logic-Based Sequential Decision-Making"],"prefix":"10.1609","volume":"33","author":[{"given":"Daoming","family":"Lyu","sequence":"first","affiliation":[]},{"given":"Fangkai","family":"Yang","sequence":"additional","affiliation":[]},{"given":"Bo","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Daesub","family":"Yoon","sequence":"additional","affiliation":[]}],"member":"9382","published-online":{"date-parts":[[2019,7,17]]},"container-title":["Proceedings of the AAAI Conference on Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/5134\/5007","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/5134\/5007","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,11,7]],"date-time":"2022-11-07T06:59:17Z","timestamp":1667804357000},"score":1,"resource":{"primary":{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/view\/5134"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,7,17]]},"references-count":0,"journal-issue":{"issue":"01","published-online":{"date-parts":[[2019,7,23]]}},"URL":"https:\/\/doi.org\/10.1609\/aaai.v33i01.33019995","relation":{},"ISSN":["2374-3468","2159-5399"],"issn-type":[{"value":"2374-3468","type":"electronic"},{"value":"2159-5399","type":"print"}],"subject":[],"published":{"date-parts":[[2019,7,17]]}}}