{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,4,1]],"date-time":"2022-04-01T21:12:09Z","timestamp":1648847529654},"reference-count":54,"publisher":"MIT Press - Journals","issue":"9","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Neural Computation"],"published-print":{"date-parts":[[2020,9]]},"abstract":"We study the problem of stochastic multiple-arm identification, where an agent sequentially explores a size-[Formula: see text] subset of arms (also known as a super arm) from given [Formula: see text] arms and tries to identify the best super arm. Most work so far has considered the semi-bandit setting, where the agent can observe the reward of each pulled arm or assumed each arm can be queried at each round. However, in real-world applications, it is costly or sometimes impossible to observe a reward of individual arms. In this study, we tackle the full-bandit setting, where only a noisy observation of the total sum of a super arm is given at each pull. Although our problem can be regarded as an instance of the best arm identification in linear bandits, a naive approach based on linear bandits is computationally infeasible since the number of super arms [Formula: see text] is exponential. To cope with this problem, we first design a polynomial-time approximation algorithm for a 0-1 quadratic programming problem arising in confidence ellipsoid maximization. Based on our approximation algorithm, we propose a bandit algorithm whose computation time is [Formula: see text](log [Formula: see text]), thereby achieving an exponential speedup over linear bandit algorithms. We provide a sample complexity upper bound that is still worst-case optimal. Finally, we conduct experiments on large-scale data sets with more than 10[Formula: see text] super arms, demonstrating the superiority of our algorithms in terms of both the computation time and the sample complexity.<\/jats:p>","DOI":"10.1162\/neco_a_01299","type":"journal-article","created":{"date-parts":[[2020,7,20]],"date-time":"2020-07-20T21:05:09Z","timestamp":1595279109000},"page":"1733-1773","source":"Crossref","is-referenced-by-count":0,"title":["Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback"],"prefix":"10.1162","volume":"32","author":[{"given":"Yuko","family":"Kuroki","sequence":"first","affiliation":[{"name":"University of Tokyo, Bunkyo-ku, Tokyo, 113-0333, Japan, and RIKEN Center for Advanced Intelligence Project, Chuo-ku, Tokyo 103-0027, Japan"}]},{"given":"Liyuan","family":"Xu","sequence":"additional","affiliation":[{"name":"University of Tokyo, Bunkyo-ku, Tokyo, 113-0333, Japan, and RIKEN Center for Advanced Intelligence Project, Chuo-ku, Tokyo 103-0027, Japan"}]},{"given":"Atsushi","family":"Miyauchi","sequence":"additional","affiliation":[{"name":"RIKEN Center for Advanced Intelligence Project, Chuo-ku, Tokyo 103-0027, Japan"}]},{"given":"Junya","family":"Honda","sequence":"additional","affiliation":[{"name":"University of Tokyo, Bunkyo-ku, Tokyo, 113-0333, Japan, and RIKEN Center for Advanced Intelligence Project, Chuo-ku, Tokyo 103-0027, Japan"}]},{"given":"Masashi","family":"Sugiyama","sequence":"additional","affiliation":[{"name":"RIKEN Center for Advanced Intelligence Project, Chuo-ku, Tokyo 103-0027, Japan, and University of Tokyo, Bunkyo-ku, Tokyo, 113-0333, Japan"}]}],"member":"281","reference":[{"key":"B1","first-page":"2312","volume-title":"Advances in neural information processing systems","author":"Abbasi-Yadkori Y.","year":"2011"},{"key":"B2","doi-asserted-by":"publisher","DOI":"10.1080\/17442509008833627"},{"key":"B3","doi-asserted-by":"publisher","DOI":"10.1109\/TAC.1987.1104491"},{"key":"B4","doi-asserted-by":"publisher","DOI":"10.1006\/jagm.1999.1062"},{"key":"B5","first-page":"41","volume-title":"Proceedings of the 23rd Annual Conference on Learning Theory","author":"Audibert J.-Y.","year":"2010"},{"key":"B6","first-page":"201","author":"Bhaskara A.","year":"2010","journal-title":"Proceedings of the 42nd ACM Symposium on Theory of Computing"},{"key":"B7","doi-asserted-by":"publisher","DOI":"10.1016\/j.endm.2010.05.086"},{"key":"B8","doi-asserted-by":"publisher","DOI":"10.1561\/2200000024"},{"key":"B9","first-page":"258","author":"Bubeck S.","year":"2013","journal-title":"Proceedings of the 30th International Conference on Machine Learning"},{"key":"B10","author":"Cao T.","year":"2017","journal-title":"Disagreement-based combinatorial pure exploration: Efficient algorithms and an analysis with localization"},{"key":"B11","first-page":"1036","volume-title":"Advances in neural information processing systems","author":"Cao W.","year":"2015"},{"key":"B12","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511546921"},{"key":"B13","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2012.01.001"},{"key":"B14","first-page":"1977","author":"Chaudhuri A. R.","year":"2017","journal-title":"Proceedings of the 31st AAAI Conference on Artificial Intelligence"},{"key":"B15","first-page":"991","author":"Chaudhuri A. R.","year":"2019","journal-title":"Proceedings of the 36th International Conference on Machine Learning"},{"key":"B16","first-page":"647","author":"Chen L.","year":"2016","journal-title":"Proceedings of the 29th Annual Conference on Learning Theory"},{"key":"B17","first-page":"482","author":"Chen L.","year":"2017","journal-title":"Proceedings of the 30th Annual Conference on Learning Theory"},{"key":"B18","author":"Chen L.","year":"2015","journal-title":"On the optimal sample complexity for best arm identification"},{"key":"B19","first-page":"379","volume-title":"Advances in neural information processing systems","volume":"27","author":"Chen S.","year":"2014"},{"key":"B20","first-page":"2116","volume-title":"Advances in neural information processing systems","author":"Combes R.","year":"2015"},{"key":"B21","first-page":"255","author":"Even-Dar E.","year":"2002","journal-title":"Proceedings of the 15th Annual Conference on Learning Theory"},{"key":"B22","first-page":"1079","volume":"7","author":"Even-Dar E.","year":"2006","journal-title":"Journal of Machine Learning Research"},{"key":"B23","doi-asserted-by":"publisher","DOI":"10.1007\/s004530010050"},{"key":"B24","first-page":"3212","volume-title":"Advances in neural information processing systems","volume":"25","author":"Gabillon V.","year":"2012"},{"key":"B25","first-page":"2222","volume-title":"Advances in neural information processing systems","volume":"25","author":"Gabillon V.","year":"2011"},{"key":"B26","first-page":"1004","author":"Gabillon V.","year":"2016","journal-title":"Proceedings of the 19th International Conference on Artificial Intelligence and Statistics"},{"key":"B27","first-page":"1427","author":"Huang S.","year":"2008","journal-title":"Proceedings of the 27th IEEE International Conference on Computer Communications"},{"key":"B28","first-page":"2291","author":"Huang W.","year":"2018","journal-title":"Proceedings of the 27th International Joint Conference on Artificial Intelligence"},{"key":"B29","first-page":"423","author":"Jamieson K.","year":"2014","journal-title":"Proceedings of the 27th Annual Conference on Learning Theory"},{"key":"B30","first-page":"99","volume-title":"Advances in neural information processing systems, 30","author":"Jun K.","year":"2017"},{"key":"B31","first-page":"511","author":"Kalyanakrishnan S.","year":"2010","journal-title":"Proceedings of the 27th International Conference on Machine Learning"},{"key":"B32","first-page":"655","author":"Kalyanakrishnan S.","year":"2012","journal-title":"Proceedings of the 29th International Conference on Machine Learning"},{"key":"B33","doi-asserted-by":"publisher","DOI":"10.1007\/BF01585865"},{"key":"B34","first-page":"1","volume":"17","author":"Kaufmann E.","year":"2016","journal-title":"Journal of Machine Learning Research"},{"key":"B35","first-page":"228","author":"Kaufmann E.","year":"2013","journal-title":"Proceedings of the 26th Annual Conference on Learning Theory"},{"key":"B36","doi-asserted-by":"publisher","DOI":"10.4153\/CJM-1960-030-4"},{"key":"B37","first-page":"1152","author":"Komiyama J.","year":"2015","journal-title":"Proceedings of the 32nd International Conference on Machine Learning"},{"key":"B38","first-page":"1597","volume-title":"Advances in neural information processing systems","volume":"29","author":"Lagr\u00e9e P.","year":"2016"},{"key":"B39","doi-asserted-by":"publisher","DOI":"10.1016\/0196-8858(85)90002-8"},{"key":"B40","first-page":"728","author":"Lattimore T.","year":"2017","journal-title":"Proceedings of the 20th International Conference on Artificial Intelligence and Statistics"},{"key":"B42","first-page":"1069","author":"Li J.","year":"2017","journal-title":"Proceedings of the 26th ACM International Conference on Information and Knowledge Management"},{"key":"B43","first-page":"5123","author":"Perrault P.","year":"2019","journal-title":"Proceedings of the 36th International Conference on Machine Learning"},{"key":"B44","doi-asserted-by":"publisher","DOI":"10.1137\/1.9780898719109"},{"key":"B45","first-page":"784","author":"Radlinski F.","year":"2008","journal-title":"Proceedings of the 25th International Conference on Machine Learning"},{"key":"B46","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1145\/2642918.2647409","author":"Retelny D.","year":"2014","journal-title":"Proceedings of the 27th Annual ACM Symposium on User Interface Software and Technology"},{"key":"B47","first-page":"260","author":"Rusmevichientong P.","year":"2006","journal-title":"An adaptive algorithm for selecting profitable keywords for search-based advertising services."},{"key":"B48","doi-asserted-by":"publisher","DOI":"10.1016\/j.dam.2012.07.016"},{"key":"B49","first-page":"828","volume-title":"Advances in neural information processing systems","volume":"27","author":"Soare M.","year":"2014"},{"key":"B50","first-page":"4877","author":"Tao C.","year":"2018","journal-title":"Proceedings of the 35th International Conference on Machine Learning"},{"key":"B51","doi-asserted-by":"publisher","DOI":"10.1016\/j.orl.2016.05.005"},{"key":"B52","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2014.04.005"},{"key":"B53","doi-asserted-by":"publisher","DOI":"10.2307\/2371182"},{"key":"B54","first-page":"843","author":"Xu L.","year":"2018","journal-title":"Proceedings of the 21st International Conference on Artificial Intelligence and Statistics"},{"key":"B55","first-page":"217","author":"Zhou Y.","year":"2014","journal-title":"Proceedings of the 31st International Conference on Machine Learning"}],"container-title":["Neural Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/neco_a_01299","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,26]],"date-time":"2021-03-26T02:58:04Z","timestamp":1616727484000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/neco\/article\/32\/9\/1733-1773\/95605"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9]]},"references-count":54,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2020,9]]}},"alternative-id":["10.1162\/neco_a_01299"],"URL":"https:\/\/doi.org\/10.1162\/neco_a_01299","relation":{},"ISSN":["0899-7667","1530-888X"],"issn-type":[{"value":"0899-7667","type":"print"},{"value":"1530-888X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,9]]}}}