{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,4]],"date-time":"2025-04-04T09:26:06Z","timestamp":1743758766910},"reference-count":32,"publisher":"Association for Computing Machinery (ACM)","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2016,11]]},"abstract":"We present NG-DBSCAN, an approximate density-based clustering algorithm that operates on arbitrary data and any symmetric distance measure. The distributed design of our algorithm makes it scalable to very large datasets; its approximate nature makes it fast, yet capable of producing high quality clustering results. We provide a detailed overview of the steps of NG-DBSCAN, together with their analysis. Our results, obtained through an extensive experimental campaign with real and synthetic data, substantiate our claims about NG-DBSCAN's performance and scalability.","DOI":"10.14778\/3021924.3021932","type":"journal-article","created":{"date-parts":[[2017,1,24]],"date-time":"2017-01-24T15:29:41Z","timestamp":1485271781000},"page":"157-168","source":"Crossref","is-referenced-by-count":58,"title":["NG-DBSCAN"],"prefix":"10.14778","volume":"10","author":[{"given":"Alessandro","family":"Lulli","sequence":"first","affiliation":[{"name":"University of Pisa, Italy and ISTI, CNR, Pisa, Italy"}]},{"given":"Matteo","family":"Dell'Amico","sequence":"additional","affiliation":[{"name":"Symantec Research Labs, France"}]},{"given":"Pietro","family":"Michiardi","sequence":"additional","affiliation":[{"name":"EURECOM, Campus SophiaTech, France"}]},{"given":"Laura","family":"Ricci","sequence":"additional","affiliation":[{"name":"University of Pisa, Italy and ISTI, CNR, Pisa, Italy"}]}],"member":"320","published-online":{"date-parts":[[2016,11]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Apache Giraph. 
http:\/\/giraph.apache.org\/."},{"key":"e_1_2_1_2_1","unstructured":"Apache Spark. https:\/\/spark.apache.org."},{"key":"e_1_2_1_3_1","unstructured":"Apache Spark machine learning library. https:\/\/spark.apache.org\/mllib\/."},{"key":"e_1_2_1_4_1","unstructured":"Clustering the News with Spark and MLLib. http:\/\/bigdatasciencebootcamp.com\/posts\/Part_3\/clustering_news.html."},{"key":"e_1_2_1_5_1","unstructured":"Word2vector package. https:\/\/code.google.com\/p\/word2vec\/."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLOUD.2012.42"},{"key":"e_1_2_1_7_1","first-page":"34","volume-title":"Clustering indices","author":"Desgraupes B.","year":"2013","unstructured":"B. Desgraupes. Clustering indices. In University of Paris Ouest-Lab ModalX, volume 1, page 34, 2013."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1963405.1963487"},{"key":"e_1_2_1_9_1","first-page":"226","volume-title":"Kdd","volume":"96","author":"Ester M.","year":"1996","unstructured":"M. Ester et al. A density-based algorithm for discovering clusters in large spatial databases with noise. 
In Kdd, volume 96, pages 226--231, 1996."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/WI.2007.43"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijar.2008.08.006"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2737792"},{"key":"e_1_2_1_13_1","first-page":"599","volume-title":"11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14)","author":"Gonzalez J. E.","year":"2014","unstructured":"J. E. Gonzalez et al. Graphx: Graph processing in a distributed dataflow framework. In 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), pages 599--613, 2014."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPADS.2011.83"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1002\/sim.4780140510"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2013.11.002"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772751"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2010.35"},{"key":"e_1_2_1_19_1","first-page":"1","article-title":"Fast connected components computation in large graphs by vertex pruning","author":"Lulli A.","year":"2016","unstructured":"A. Lulli, E. Carlini, P. Dazzi, C. Lucchese, and L. Ricci. Fast connected components computation in large graphs by vertex pruning. 
IEEE Transactions on Parallel and Distributed Systems, page 1, 2016.","journal-title":"IEEE Transactions on Parallel and Distributed Systems"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData.2015.7363845"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCC.2015.7405576"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807167.1807184"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2818185"},{"key":"e_1_2_1_24_1","volume-title":"15th Workshop on Hot Topics in Operating Systems","author":"McSherry F.","year":"2015","unstructured":"F. McSherry et al. Scalability! but at what cost? In 15th Workshop on Hot Topics in Operating Systems, 2015."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2014.51"},{"key":"e_1_2_1_26_1","volume-title":"Scikit-learn: Machine learning in python. Journal of Machine Learning Research, 12(Oct):2825--2830","author":"Pedregosa F.","year":"2011","unstructured":"F. Pedregosa et al. Scikit-learn: Machine learning in python. 
Journal of Machine Learning Research, 12(Oct):2825--2830, 2011."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1002\/ima.v19:2"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.csda.2005.10.001"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/79173.79181"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2503210.2503262"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1009884809343"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/11430919_43"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3021924.3021932","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T09:29:42Z","timestamp":1672219782000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3021924.3021932"}},"subtitle":["scalable density-based clustering for arbitrary data"],"short-title":[],"issued":{"date-parts":[[2016,11]]},"references-count":32,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2016,11]]}},"alternative-id":["10.14778\/3021924.3021932"],"URL":"https:\/\/doi.org\/10.14778\/3021924.3021932","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2016,11]]}}}