{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,7]],"date-time":"2024-08-07T23:33:57Z","timestamp":1723073637662},"reference-count":41,"publisher":"Wiley","issue":"5","license":[{"start":{"date-parts":[[2010,8,25]],"date-time":"2010-08-25T00:00:00Z","timestamp":1282694400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Statistical Analysis"],"published-print":{"date-parts":[[2010,10]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The goal of frequent subgraph mining is to detect subgraphs that frequently occur in a dataset of graphs. In classification settings, one is often interested in discovering <jats:italic>discriminative<\/jats:italic> frequent subgraphs, whose presence or absence is indicative of the class membership of a graph. In this article, we propose an approach to feature selection on frequent subgraphs, called <jats:italic>CORK<\/jats:italic>, that combines two central advantages. First, it optimizes a submodular quality criterion, which means that we can yield a near\u2010optimal solution using greedy feature selection. Second, our submodular quality function criterion can be integrated into gSpan, the state\u2010of\u2010the\u2010art tool for frequent subgraph mining, and help to prune the search space for discriminative frequent subgraphs even <jats:italic>during<\/jats:italic> frequent subgraph mining. Copyright \u00a9 2010 Wiley Periodicals, Inc. 
Statistical Analysis and Data Mining 3: 302\u2010318, 2010<\/jats:p>","DOI":"10.1002\/sam.10084","type":"journal-article","created":{"date-parts":[[2010,8,29]],"date-time":"2010-08-29T23:24:54Z","timestamp":1283124294000},"page":"302-318","source":"Crossref","is-referenced-by-count":31,"title":["Discriminative frequent subgraph mining with optimality guarantees"],"prefix":"10.1002","volume":"3","author":[{"given":"Marisa","family":"Thoma","sequence":"first","affiliation":[]},{"given":"Hong","family":"Cheng","sequence":"additional","affiliation":[]},{"given":"Arthur","family":"Gretton","sequence":"additional","affiliation":[]},{"given":"Jiawei","family":"Han","sequence":"additional","affiliation":[]},{"given":"Hans\u2010Peter","family":"Kriegel","sequence":"additional","affiliation":[]},{"given":"Alex","family":"Smola","sequence":"additional","affiliation":[]},{"given":"Le","family":"Song","sequence":"additional","affiliation":[]},{"given":"Philip S.","family":"Yu","sequence":"additional","affiliation":[]},{"given":"Xifeng","family":"Yan","sequence":"additional","affiliation":[]},{"given":"Karsten M.","family":"Borgwardt","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2010,8,25]]},"reference":[{"key":"e_1_2_7_2_2","doi-asserted-by":"publisher","DOI":"10.1038\/nrd1156"},{"key":"e_1_2_7_3_2","doi-asserted-by":"crossref","unstructured":"S.Kramer L.Raedt andC.Helma Molecular feature mining in HIV data In Proceedings of KDD San Francisco CA 2001 136\u2013143.","DOI":"10.1145\/502512.502533"},{"key":"e_1_2_7_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2005.127"},{"key":"e_1_2_7_5_2","doi-asserted-by":"crossref","unstructured":"H.Cheng X.Yan J.Han andC.Hsu Discriminative frequent pattern analysis for effective classification In Proceedings of ICDE Istanbul Turkey 2007;716\u2013725.","DOI":"10.1109\/ICDE.2007.367917"},{"key":"e_1_2_7_6_2","unstructured":"H.Kashima K.Tsuda andA.Inokuchi Marginalized kernels between 
labeled graphs In Proceedings of ICML Washington DC 2003;321\u2013328."},{"key":"e_1_2_7_7_2","doi-asserted-by":"crossref","unstructured":"N.WaleandG.Karypis Comparison of descriptor spaces for chemical compound retrieval and classification In Proceedings of ICDM Hong Kong 2006;678\u2013689.","DOI":"10.21236\/ADA444816"},{"key":"e_1_2_7_8_2","unstructured":"N.ShervashidzeandK. M.Borgwardt Fast subtree kernels on graphs NIPS 2009;1660\u20131668."},{"key":"e_1_2_7_9_2","unstructured":"T.Kudo E.Maeda andY.Matsumoto An application of boosting to graph classification In Advances in Neural Information Processing Systems 17 (NIPS'04) Vancouver BC 2004;729\u2013736."},{"key":"e_1_2_7_10_2","doi-asserted-by":"crossref","unstructured":"K.Tsuda Entire regularization paths for graph data In Proceedings of ICML Oregon OR USA 2007;919\u2013926.","DOI":"10.1145\/1273496.1273612"},{"key":"e_1_2_7_11_2","doi-asserted-by":"crossref","unstructured":"M.Thoma H.Cheng A.Gretton J.Han H.\u2010P.Kriegel A.Smola L.Song P. S.Yu X.Yan andK.Borgwardt Near\u2010optimal supervised feature selection among frequent subgraphs In Proceedings of SDM Sparks NV USA 2009;1075\u20131087.","DOI":"10.1137\/1.9781611972795.92"},{"key":"e_1_2_7_12_2","unstructured":"X.YanandJ.Han gSpan: Graph\u2010based substructure pattern mining In Proceedings 2002 International Conference on Data Mining (ICDM'02) Maebashi City Japan 2002;721\u2013724."},{"key":"e_1_2_7_13_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF01588971"},{"key":"e_1_2_7_14_2","unstructured":"2008 ACM Las Vegas NV USA W. Fan K. Zhang H. Cheng J. Gao X. Yan J. Han P. S. Yu O. Verscheure Y. Li B. Liu S. 
Sarawagi Direct mining of discriminative and essential frequent patterns via model\u2010based search tree 230 238"},{"key":"e_1_2_7_15_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-008-5089-z"},{"key":"e_1_2_7_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/1401890.1401961"},{"key":"e_1_2_7_17_2","first-page":"433","volume-title":"SIGMOD Conference","author":"Yan X.","year":"2008"},{"key":"e_1_2_7_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/1645953.1646027"},{"key":"e_1_2_7_19_2","doi-asserted-by":"crossref","unstructured":"C.Guestrin A.Krause andA.Singh Near\u2010optimal sensor placements in gaussian processes In Proceedings of ICML Bonn Germany 2005;265\u2013272.","DOI":"10.1145\/1102351.1102385"},{"key":"e_1_2_7_20_2","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-45372-5_2"},{"key":"e_1_2_7_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2001.989534"},{"key":"e_1_2_7_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2002.1183988"},{"key":"e_1_2_7_23_2","first-page":"487","volume-title":"Proceedings of 1994 International Conference on Very Large Data Bases (VLDB'94)","author":"Agrawal R.","year":"1994"},{"key":"e_1_2_7_24_2","first-page":"211","volume-title":"Proceedings 2002 International Conference on Data Mining (ICDM'02)","author":"Borgelt C.","year":"2002"},{"key":"e_1_2_7_25_2","first-page":"549","volume-title":"Proceedings 2003 International Conference on Data Mining (ICDM'03)","author":"Huan J.","year":"2003"},{"key":"e_1_2_7_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/1014052.1014134"},{"key":"e_1_2_7_27_2","unstructured":"N.Shervashidze S.Vishwanathan T.Petri K.Mehlhorn andK.Borgwardt Efficient graphlet kernels for large graph comparison AISTATS 
2009."},{"key":"e_1_2_7_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/1133905.1133908"},{"key":"e_1_2_7_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2005.49"},{"key":"e_1_2_7_30_2","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-45571-X_12"},{"key":"e_1_2_7_31_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1024653703689"},{"key":"e_1_2_7_32_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0022-2836(03)00628-4"},{"key":"e_1_2_7_33_2","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/28.1.304"},{"key":"e_1_2_7_34_2","first-page":"165","volume-title":"WABI","author":"Wernicke S.","year":"2005"},{"key":"e_1_2_7_35_2","volume-title":"2006 European Conference on Computational Biology (ECCB)","author":"Przulj N.","year":"2006"},{"key":"e_1_2_7_36_2","unstructured":"C.ChangandC.Lin LIBSVM: a library for support vector machines 2001. Software available athttp:\/\/www.csie.ntu.edu.tw\/\u223ccjlin\/libsvm."},{"key":"e_1_2_7_37_2","doi-asserted-by":"publisher","DOI":"10.1038\/415530a"},{"key":"e_1_2_7_38_2","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0601231103"},{"key":"e_1_2_7_39_2","first-page":"412","volume-title":"Proceedings of ICML","author":"Yang Y.","year":"1997"},{"key":"e_1_2_7_40_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-30115-8_32"},{"key":"e_1_2_7_41_2","unstructured":"A.KrauseandC.Guestrin Near\u2010optimal nonmyopic value of information in graphical models In Uncertainty in Artificial Intelligence UAI'05 2005;324\u2013331."},{"key":"e_1_2_7_42_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-008-0136-4"}],"container-title":["Statistical Analysis and Data Mining: The ASA Data Science 
Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fsam.10084","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/sam.10084","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,7]],"date-time":"2023-10-07T07:01:13Z","timestamp":1696662073000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/sam.10084"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,8,25]]},"references-count":41,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2010,10]]}},"alternative-id":["10.1002\/sam.10084"],"URL":"https:\/\/doi.org\/10.1002\/sam.10084","archive":["Portico"],"relation":{},"ISSN":["1932-1864","1932-1872"],"issn-type":[{"value":"1932-1864","type":"print"},{"value":"1932-1872","type":"electronic"}],"subject":[],"published":{"date-parts":[[2010,8,25]]}}}