{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,7]],"date-time":"2024-08-07T01:11:49Z","timestamp":1722993109747},"reference-count":19,"publisher":"Fuji Technology Press Ltd.","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Robot. Mechatron.","JRM"],"published-print":{"date-parts":[[2015,2,20]]},"abstract":"<div class=\"\"abs_img\"\"><img src=\"\"[disp_template_path]\/JRM\/abst-image\/00270001\/07.jpg\"\" width=\"\"300\"\" \/>Moderate-based reward generator<\/div> In conventional reinforcement learning, a reward function influences the learning results, and therefore, the reward function is very important. To design this function considering a task, knowledge of reinforcement learning is required. In addition to this, a reward function must be designed for each task. These requirements make the design of a reward function unfeasible. We focus on this problem and aim at realizing a method to generate a reward without the design of a special reward function. In this paper, we propose a universal evaluation for sensor inputs, which is independent of a task and is modeled on the basis of the indicator of pleasure and pain in biological organisms. This evaluation estimates the trend of sensor inputs based on the ease of input prediction. Instead of the design of a reward function, our approach assists a human being in learning how to interact with an agent and teaching it his\/her demand. We recruited a research participant and attempted to solve the path planning problem. The results show that a participant can teach an agent his\/her demand by interacting with the agent and the agent can generate an adaptive route by interacting with the participant and the environment. <\/span><\/jats:p>","DOI":"10.20965\/jrm.2015.p0057","type":"journal-article","created":{"date-parts":[[2016,4,14]],"date-time":"2016-04-14T02:23:03Z","timestamp":1460600583000},"page":"57-63","source":"Crossref","is-referenced-by-count":4,"title":["Self-Generation of Reward by Moderate-Based Index for Senor Inputsvspace"],"prefix":"10.20965","volume":"27","author":[{"given":"Kentarou","family":"Kurashige","sequence":"first","affiliation":[]},{"name":"Department of Information and Electronic Engineering, Muroran Institute of Technology","sequence":"first","affiliation":[]},{"given":"Kaoru","family":"Nikaido","sequence":"additional","affiliation":[]}],"member":"8550","published-online":{"date-parts":[[2015,2,20]]},"reference":[{"unstructured":"R. S. Sutton and A. G. Barto, \u201cReinforcement Learning,\u201d The MIT Press, 1998.","key":"key-10.20965\/jrm.2015.p0057-1"},{"doi-asserted-by":"crossref","unstructured":"M. Riedmiller, T. Gabel, R. Hafner, and S. Lange, \u201cReinforcement learning for robot soccer,\u201d Autonomous Robots, Vol.27, No.1 pp. 57-73, 2009.","key":"key-10.20965\/jrm.2015.p0057-2","DOI":"10.1007\/s10514-009-9120-4"},{"doi-asserted-by":"crossref","unstructured":"R. Yamashina, M. Kuroda, and T. Yabuta, \u201cCaterpillar Robot Locomotion Based on Q-Learning using Objective\/Subjective Reward,\u201d Proc. of IEEE\/SICE Int. Symposium on System Integration (SII 2011), pp. 1311-1316, 2011.","key":"key-10.20965\/jrm.2015.p0057-3","DOI":"10.1109\/SII.2011.6147638"},{"doi-asserted-by":"crossref","unstructured":"M. Hara, N. Kawabe, J. Huang, and T. Yabuta, \u201cAcquisition of a Gymnast-Like Robotic Giant-Swing Motion by Q-Learning and Improvement of the Repeatability,\u201d J. 
of Robotics and Mechatronics, Vol.23, No.1, pp. 126-136, 2011.","key":"key-10.20965\/jrm.2015.p0057-4","DOI":"10.20965\/jrm.2011.p0126"},{"unstructured":"K. Inoue, T. Arai, and J. Ota, \u201cAcceleration of Reinforcement Learning by a Mobile Robot Using Generalized Inhibition Rules,\u201d J. of Robotics and Mechatronics, Vol.22, No.1, pp. 122-133, 2010.","key":"key-10.20965\/jrm.2015.p0057-5"},{"doi-asserted-by":"crossref","unstructured":"S. Aoyagi and K. Hiraoka, \u201cPath Searching of Robot Manipulator Using Reinforcement Learning -- Reduction of Searched Configuration Space Using SOM and Multistage Learning --,\u201d J. of Robotics and Mechatronics, Vol.22, No.4, pp. 532-541, 2010.","key":"key-10.20965\/jrm.2015.p0057-6","DOI":"10.20965\/jrm.2010.p0532"},{"doi-asserted-by":"crossref","unstructured":"K. Yamada, \u201cExpression of Continuous State and Action Spaces for Q-Learning Using Neural Networks and CMAC,\u201d J. of Robotics and Mechatronics, Vol.24, No.2, pp. 330-339, 2012.","key":"key-10.20965\/jrm.2015.p0057-7","DOI":"10.20965\/jrm.2012.p0330"},{"unstructured":"P. Weng, R. Busa-Fekete, and E. H\u00fcllermeier, \u201cInteractive Q-Learning with Ordinal Rewards and Unreliable Tutor,\u201d ECML\/PKDD Workshop Reinforcement Learning with Generalized Feedback, 2013.","key":"key-10.20965\/jrm.2015.p0057-8"},{"doi-asserted-by":"crossref","unstructured":"S. Whiteson, \u201cEvolutionary Computation for Reinforcement Learning,\u201d in M. Wiering and M. van Otterlo (Eds.), Reinforcement Learning: State of the Art, pp. 325-358, Springer, 2012.","key":"key-10.20965\/jrm.2015.p0057-9","DOI":"10.1007\/978-3-642-27645-3_10"},{"unstructured":"K. Kurashige and Y. Onoue, \u201cThe robot learning by using \u201csense of pain\u201d,\u201d Proc. of Int. Symposium on Humanized Systems 2007, pp. 1-4, 2007.","key":"key-10.20965\/jrm.2015.p0057-10"},{"doi-asserted-by":"crossref","unstructured":"J. A. Starzyk, \u201cMotivation in Embodied Intelligence,\u201d in Frontiers in Robotics, Automation and Control, I-Tech Education and Publishing, pp. 83-110, Oct. 2008.","key":"key-10.20965\/jrm.2015.p0057-11","DOI":"10.5772\/6332"},{"doi-asserted-by":"crossref","unstructured":"J. A. Starzyk, \u201cMotivated Learning for Computational Intelligence,\u201d in B. Igelnik (Ed.), Computational Modeling and Simulation of Intellect: Current State and Future Perspectives, IGI Publishing, ch.11, pp. 265-292, 2011.","key":"key-10.20965\/jrm.2015.p0057-12","DOI":"10.4018\/978-1-60960-551-3.ch011"},{"unstructured":"S. Sugimoto, \u201cThe Effect of Prolonged Lack of Sensory Stimulation upon Human Behavior,\u201d Philosophy, Vol.50, pp. 361-374, 1967.","key":"key-10.20965\/jrm.2015.p0057-13"},{"unstructured":"S. Sugimoto, \u201cHuman Mental Processes under Sensory Restriction Environment,\u201d The Japanese Society of Social Psychology, Vol.1, No.2, pp. 27-34, 1986.","key":"key-10.20965\/jrm.2015.p0057-14"},{"doi-asserted-by":"crossref","unstructured":"N. Matsunaga, A. T. Zengin, H. Okajima, and S. Kawaji, \u201cEmulation of Fast and Slow Pain Using Multi-Layered Sensor Modeled the Layered Structure of Human Skin,\u201d J. of Robotics and Mechatronics, Vol.23, No.1, pp. 173-179, 2011.","key":"key-10.20965\/jrm.2015.p0057-15","DOI":"10.20965\/jrm.2011.p0173"},{"unstructured":"J. Zhen, H. Aoki, E. Sato-Shimokawara, and T. 
Yamaguchi, \u201cObtaining Objects Information from a Human Robot Interaction using Gesture and Voice Recognition,\u201d IWACIII 2011 Proc., 101_GS1_1, 2011.","key":"key-10.20965\/jrm.2015.p0057-16"},{"doi-asserted-by":"crossref","unstructured":"S. Hashimoto, A. Ishida, M. Inami, and T. Igarashi, \u201cTouchMe: An Augmented Reality Interface for Remote Robot Control,\u201d J. of Robotics and Mechatronics, Vol.25, No.3, pp. 529-537, 2013.","key":"key-10.20965\/jrm.2015.p0057-17","DOI":"10.20965\/jrm.2013.p0529"},{"doi-asserted-by":"crossref","unstructured":"N. Kubota and Y. Urushizaki, \u201cCommunication Interface for Human-Robot Partnership,\u201d J. of Robotics and Mechatronics, Vol.16, No.5, pp. 526-534, 2004.","key":"key-10.20965\/jrm.2015.p0057-18","DOI":"10.20965\/jrm.2004.p0526"},{"unstructured":"M. Quigley, B. Gerkey, K. Conley, J. Faust, T. Foote, J. Leibs, E. Berger, R. Wheeler, and A. Ng, \u201cROS: An open-source Robot Operating System,\u201d ICRA Workshop on Open Source Software, 2009.","key":"key-10.20965\/jrm.2015.p0057-19"}],"container-title":["Journal of Robotics and Mechatronics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.fujipress.jp\/main\/wp-content\/themes\/Fujipress\/phyosetsu.php?ppno=ROBOT002700010007","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,9,6]],"date-time":"2019-09-06T12:57:04Z","timestamp":1567774624000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.fujipress.jp\/jrm\/rb\/robot002700010057"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,2,20]]},"references-count":19,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2015,2,20]]},"published-print":{"date-parts":[[2015,2,20]]}},"URL":"https:\/\/doi.org\/10.20965\/jrm.2015.p0057","relation":{},"ISSN":["1883-8049","0915-3942"],"issn-type":[{"type":"electronic","value":"1883-8049"},{"type":"print","value":"0915-3942"}],"subject":[],"published":{"date-parts":[[2015,2,20]]}}}
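The abstract describes a task-independent reward generated from the ease of predicting sensor inputs. The following is a minimal sketch of that idea, not the paper's published formulation: it assumes a linear-extrapolation predictor and an exponential mapping from prediction error to a pleasure/pain-like reward, and the names `ModerateRewardGenerator` and `sensitivity` are hypothetical.

```python
import numpy as np

class ModerateRewardGenerator:
    """Hypothetical sketch of a task-independent reward signal.

    Predicts the next sensor vector by linear extrapolation of the two
    most recent observations and converts the prediction error into a
    scalar reward: a small error (an easily predicted input trend) maps
    toward +1 ("pleasure"), a large error toward -1 ("pain"). Both the
    predictor and the error-to-reward mapping are illustrative choices,
    not the authors' published method.
    """

    def __init__(self, n_sensors: int, sensitivity: float = 1.0):
        self.sensitivity = sensitivity    # scales how fast reward decays with error
        self.prev = np.zeros(n_sensors)   # s(t-1)
        self.prev2 = np.zeros(n_sensors)  # s(t-2)

    def step(self, sensors: np.ndarray) -> float:
        predicted = 2.0 * self.prev - self.prev2       # linear extrapolation
        error = np.linalg.norm(sensors - predicted)    # prediction error
        # Map error in [0, inf) to reward in (-1, +1]: zero error -> +1.
        reward = 2.0 * np.exp(-self.sensitivity * error) - 1.0
        self.prev2, self.prev = self.prev, sensors.astype(float)
        return reward

# Usage: feed each new sensor vector and use the returned value where a
# hand-designed reward would normally enter a Q-learning update.
gen = ModerateRewardGenerator(n_sensors=3)
for t in range(5):
    r = gen.step(np.array([t, 2 * t, 0.0]))  # perfectly linear, predictable inputs
    print(f"t={t} reward={r:+.3f}")          # settles at +1 once the predictor warms up
```

In the paper's setting, this self-generated signal replaces a per-task reward function, and the human participant shapes the agent's behavior through interaction rather than by editing the reward.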