{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,13]],"date-time":"2024-09-13T17:40:17Z","timestamp":1726249217273},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,10,18]],"date-time":"2022-10-18T00:00:00Z","timestamp":1666051200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,10,18]],"date-time":"2022-10-18T00:00:00Z","timestamp":1666051200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100010061","name":"University of Waikato","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100010061","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Data Min Knowl Disc"],"published-print":{"date-parts":[[2023,1]]},"abstract":"Abstract<\/jats:title>The k<\/jats:italic>-means algorithm is widely used for clustering, compressing, and summarizing vector data. We present a fast and memory-efficient GPU-based algorithm for exact k<\/jats:italic>-means, Asynchronous Selective Batched K<\/jats:italic>-means (ASB K<\/jats:italic>-means). Unlike most GPU-based k<\/jats:italic>-means algorithms that require loading the whole dataset onto the GPU for clustering, the amount of GPU memory required to run our algorithm can be chosen to be much smaller than the size of the whole dataset. Thus, our algorithm can cluster datasets whose size exceeds the available GPU memory. The algorithm works in a batched fashion and applies the triangle inequality in each k<\/jats:italic>-means iteration to omit a data point if its membership assignment, i.e., the cluster it belongs to, remains unchanged, thus significantly reducing the number of data points that need to be transferred between the CPU\u2019s RAM and the GPU\u2019s global memory and enabling the algorithm to very efficiently process large datasets. Our algorithm can be substantially faster than a GPU-based implementation of standard k<\/jats:italic>-means even in situations when application of the standard algorithm is feasible because the whole dataset fits into GPU memory. Experiments show that ASB K<\/jats:italic>-means can run up to 15x times faster than a standard GPU-based implementation of k<\/jats:italic>-means, and it also outperforms the GPU-based k<\/jats:italic>-means implementation in NVIDIA\u2019s open-source RAPIDS machine learning library on all the datasets used in our experiments.<\/jats:p>","DOI":"10.1007\/s10618-022-00869-6","type":"journal-article","created":{"date-parts":[[2022,10,18]],"date-time":"2022-10-18T18:02:52Z","timestamp":1666116172000},"page":"67-109","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Large scale K-means clustering using GPUs"],"prefix":"10.1007","volume":"37","author":[{"ORCID":"http:\/\/orcid.org\/0000-0001-7158-3395","authenticated-orcid":false,"given":"Mi","family":"Li","sequence":"first","affiliation":[]},{"given":"Eibe","family":"Frank","sequence":"additional","affiliation":[]},{"given":"Bernhard","family":"Pfahringer","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,10,18]]},"reference":[{"issue":"7","key":"869_CR1","doi-asserted-by":"publisher","first-page":"622","DOI":"10.14778\/2180912.2180915","volume":"5","author":"B Bahmani","year":"2012","unstructured":"Bahmani B, Moseley B, Vattani A et al (2012) Scalable k-means++. Proc VLDB Endow 5(7):622\u2013633","journal-title":"Proc VLDB Endow"},{"key":"869_CR2","doi-asserted-by":"crossref","unstructured":"Bejarano J, Koushiki B, Brannan T, et\u00a0al (2011) Sampling within k-means algorithm to cluster large datasets. Tech. Rep. HPCF-2011-12, UMBC High Performance Computing Facility, University of Maryland, Baltimore County, Maryland, USA","DOI":"10.2172\/1025410"},{"key":"869_CR3","doi-asserted-by":"crossref","unstructured":"Berkhin P (2006) A survey of clustering data mining techniques. In: Grouping Multidimensional Data. Springer, p 25\u201371","DOI":"10.1007\/3-540-28349-8_2"},{"issue":"1","key":"869_CR4","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1016\/j.jpdc.2012.04.003","volume":"73","author":"AR Brodtkorb","year":"2013","unstructured":"Brodtkorb AR, Hagen TR, S\u00e6tra ML (2013) Graphics processing unit (GPU) programming strategies and trends in GPU computing. J Parallel Distrib Comput 73(1):4\u201313","journal-title":"J Parallel Distrib Comput"},{"issue":"10","key":"869_CR5","doi-asserted-by":"publisher","first-page":"1370","DOI":"10.1016\/j.jpdc.2008.05.014","volume":"68","author":"S Che","year":"2008","unstructured":"Che S, Boyer M, Meng J et al (2008) A performance study of general-purpose applications on graphics processors using CUDA. J Parallel Distrib Comput 68(10):1370\u20131380","journal-title":"J Parallel Distrib Comput"},{"issue":"2","key":"869_CR6","doi-asserted-by":"publisher","first-page":"132","DOI":"10.1109\/TVCG.2010.55","volume":"17","author":"I Chiosa","year":"2011","unstructured":"Chiosa I, Kolb A (2011) GPU-based multilevel clustering. IEEE Trans Visual Comput Graphics 17(2):132\u2013145","journal-title":"IEEE Trans Visual Comput Graphics"},{"key":"869_CR7","doi-asserted-by":"crossref","unstructured":"Coates A, Ng AY (2012) Learning feature representations with k-means. In: Neural Networks: Tricks of the Trade. Springer, p 561\u2013580","DOI":"10.1007\/978-3-642-35289-8_30"},{"key":"869_CR8","unstructured":"Drake J, Hamerly G (2012) Accelerated k-means with adaptive distance bounds. In: NIPS Workshop on Optimization for Machine Learning, pp 42\u201353"},{"key":"869_CR9","unstructured":"Elkan C (2003) Using the triangle inequality to accelerate k-means. In: International Conference on Machine Learning. AAAI Press, pp 147\u2013153"},{"issue":"3","key":"869_CR10","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1109\/TETC.2014.2330519","volume":"2","author":"A Fahad","year":"2014","unstructured":"Fahad A, Alshatri N, Tari Z et al (2014) A survey of clustering algorithms for big data: taxonomy and empirical analysis. IEEE Trans Emerg Top Comput 2(3):267\u2013279","journal-title":"IEEE Trans Emerg Top Comput"},{"key":"869_CR11","unstructured":"Fang W, Lau KK, Lu M, et\u00a0al (2008) Parallel data mining on graphics processors. Tech. Rep. HKUST-CS08-07, Hong Kong Univ. Sci. and Technology, Hong Kong, China"},{"key":"869_CR12","unstructured":"Farivar R, Rebolledo D, Chan E, et\u00a0al (2008) A parallel implementation of k-means clustering on GPUs. In: International Conference on Parallel and Distributed Processing Techniques and Applications. CSREA Press, pp 340\u2013345"},{"key":"869_CR13","doi-asserted-by":"crossref","unstructured":"Hamerly G (2010) Making k-means even faster. In: SIAM International Conference on Data Mining. SIAM, pp 130\u2013140","DOI":"10.1137\/1.9781611972801.12"},{"key":"869_CR14","doi-asserted-by":"crossref","unstructured":"Hamerly G, Drake J (2015) Accelerating Lloyd\u2019s algorithm for k-means clustering. In: Partitional Clustering Algorithms. Springer, p 41\u201378","DOI":"10.1007\/978-3-319-09259-1_2"},{"issue":"14","key":"869_CR15","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.6621","volume":"34","author":"G He","year":"2022","unstructured":"He G, Vialle S, Baboulin M (2022) Parallel and accurate k-means algorithm on CPU-GPU architectures for spectral clustering. Concurr Comput: Pract Exp 34(14):e6621","journal-title":"Concurr Comput: Pract Exp"},{"key":"869_CR16","doi-asserted-by":"crossref","unstructured":"Hong-Tao B, Li-li H, Dan-tong O, et\u00a0al (2009) K-means on commodity GPUs with CUDA. In: WRI World Congress on Computer Science and Information Engineering. IEEE Computer Society, pp 651\u2013655","DOI":"10.1109\/CSIE.2009.491"},{"issue":"8","key":"869_CR17","doi-asserted-by":"publisher","first-page":"651","DOI":"10.1016\/j.patrec.2009.09.011","volume":"31","author":"AK Jain","year":"2010","unstructured":"Jain AK (2010) Data clustering: 50 years beyond k-means. Pattern Recogn Lett 31(8):651\u2013666","journal-title":"Pattern Recogn Lett"},{"issue":"3","key":"869_CR18","doi-asserted-by":"publisher","first-page":"942","DOI":"10.1007\/s11227-011-0672-7","volume":"64","author":"L Jian","year":"2013","unstructured":"Jian L, Wang C, Liu Y et al (2013) Parallel data mining techniques on graphics processing unit with compute unified device architecture (CUDA). J Supercomput 64(3):942\u2013967","journal-title":"J Supercomput"},{"key":"869_CR19","doi-asserted-by":"crossref","unstructured":"Kruli\u0161 M, Kratochv\u00edl M (2020) Detailed analysis and optimization of CUDA k-means algorithm. In: 49th International Conference on Parallel Processing-ICPP, pp 1\u201311","DOI":"10.1145\/3404397.3404426"},{"key":"869_CR20","doi-asserted-by":"crossref","unstructured":"Langdon WB (2013) Large-scale bioinformatics data mining with parallel genetic programming on graphics processing units. In: Massively Parallel Evolutionary Computation on GPGPUs. Springer, p 311\u2013347","DOI":"10.1007\/978-3-642-37959-8_15"},{"key":"869_CR21","unstructured":"Lee CC, Chu KY (2012) CUDA-accelerated hierarchical k-means, unpublished manuscript"},{"issue":"2","key":"869_CR22","doi-asserted-by":"publisher","first-page":"216","DOI":"10.1016\/j.jcss.2012.05.004","volume":"79","author":"Y Li","year":"2013","unstructured":"Li Y, Zhao K, Chu X et al (2013) Speeding up k-means algorithm by GPUs. J Comput Syst Sci 79(2):216\u2013229","journal-title":"J Comput Syst Sci"},{"issue":"2","key":"869_CR23","doi-asserted-by":"publisher","first-page":"129","DOI":"10.1109\/TIT.1982.1056489","volume":"28","author":"S Lloyd","year":"1982","unstructured":"Lloyd S (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28(2):129\u2013137","journal-title":"IEEE Trans Inf Theory"},{"issue":"3","key":"869_CR24","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1007\/s13222-018-0293-x","volume":"18","author":"C Lutz","year":"2018","unstructured":"Lutz C, Bre\u00df S, Rabl T et al (2018) Efficient and scalable k-means on GPUs. Datenbank-Spektrum 18(3):157\u2013169","journal-title":"Datenbank-Spektrum"},{"issue":"4","key":"869_CR25","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1145\/2788396","volume":"47","author":"S Mittal","year":"2015","unstructured":"Mittal S, Vetter JS (2015) A survey of CPU-GPU heterogeneous computing techniques. ACM Comput Surv (CSUR) 47(4):69","journal-title":"ACM Comput Surv (CSUR)"},{"key":"869_CR26","doi-asserted-by":"crossref","unstructured":"Mohebi A, Aghabozorgi S, Ying\u00a0Wah T et\u00a0al (2016) Iterative big data clustering algorithms: a review. Software: Practice and Experience 46(1):107\u2013129","DOI":"10.1002\/spe.2341"},{"key":"869_CR27","unstructured":"NVIDIA (2021) CUDA C programming guide. https:\/\/docs.nvidia.com\/cuda\/cuda-c-programming-guide\/"},{"key":"869_CR28","unstructured":"Owens JD, Luebke D, Govindaraju NK et\u00a0al (2005) A survey of general-purpose computation on graphics hardware. In: Conference of the European Association for Computer Graphics. Eurographics Association, pp 21\u201351"},{"issue":"5","key":"869_CR29","doi-asserted-by":"publisher","first-page":"879","DOI":"10.1109\/JPROC.2008.917757","volume":"96","author":"JD Owens","year":"2008","unstructured":"Owens JD, Houston M, Luebke D et al (2008) GPU computing. Proc IEEE 96(5):879\u2013899","journal-title":"Proc IEEE"},{"key":"869_CR30","unstructured":"Pennington J, Socher R, Manning CD (2021) Global vectors for word representation. https:\/\/nlp.stanford.edu\/projects\/glove\/"},{"key":"869_CR31","doi-asserted-by":"crossref","unstructured":"Phillips SJ (2002) Acceleration of k-means and related clustering algorithms. In: Workshop on Algorithm Engineering and Experiments. Springer, pp 166\u2013177","DOI":"10.1007\/3-540-45643-0_13"},{"issue":"3","key":"869_CR32","doi-asserted-by":"publisher","first-page":"1","DOI":"10.17485\/ijst\/2016\/v9i3\/75971","volume":"9","author":"T Sajana","year":"2016","unstructured":"Sajana T, Rani CS, Narayana K (2016) A survey on clustering techniques for big data mining. Indian J Sci Technol 9(3):1\u201312","journal-title":"Indian J Sci Technol"},{"key":"869_CR33","doi-asserted-by":"crossref","unstructured":"Shirkhorshidi AS, Aghabozorgi S, Wah TY et\u00a0al (2014) Big data clustering: a review. In: International Conference on Computational Science and Its Applications. Springer, pp 707\u2013720","DOI":"10.1007\/978-3-319-09156-3_49"},{"key":"869_CR34","doi-asserted-by":"crossref","unstructured":"Taylor C, Gowanlock M (2021) Accelerating the yinyang k-means algorithm using the GPU. In: 2021 IEEE 37th International Conference on Data Engineering (ICDE), IEEE, pp 1835\u20131840","DOI":"10.1109\/ICDE51399.2021.00163"},{"issue":"3","key":"869_CR35","doi-asserted-by":"publisher","first-page":"284","DOI":"10.1016\/j.jpdc.2012.11.001","volume":"73","author":"SR Upadhyaya","year":"2013","unstructured":"Upadhyaya SR (2013) Parallel approaches to machine learning - a comprehensive survey. J Parallel Distrib Comput 73(3):284\u2013292","journal-title":"J Parallel Distrib Comput"},{"key":"869_CR36","unstructured":"Vassilvitskii S, Arthur D (2006) k-means++: The advantages of careful seeding. In: Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM, pp 1027\u20131035"},{"key":"869_CR37","doi-asserted-by":"crossref","unstructured":"Wu R, Zhang B, Hsu M (2009) Clustering billions of data points using GPUs. In: Combined Workshops on Unconventional High Performance Computing Workshop Plus Memory Access Workshop, ACM, pp 1\u20136","DOI":"10.1145\/1531666.1531668"},{"issue":"3","key":"869_CR38","doi-asserted-by":"publisher","first-page":"645","DOI":"10.1109\/TNN.2005.845141","volume":"16","author":"R Xu","year":"2005","unstructured":"Xu R, Wunsch D (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645\u2013678","journal-title":"IEEE Trans Neural Netw"},{"key":"869_CR39","doi-asserted-by":"crossref","unstructured":"Yang C, Li Y, Cheng F (2020) Accelerating k-means on GPU with CUDA programming. In: IOP Conference Series: Materials Science and Engineering, IOP Publishing, p 012036","DOI":"10.1088\/1757-899X\/790\/1\/012036"},{"key":"869_CR40","doi-asserted-by":"crossref","unstructured":"Zechner M, Granitzer M (2009) Accelerating k-means on the graphics processor via CUDA. In: First International Conference on Intensive Applications and Services. IEEE Computer Society, pp 7\u201315","DOI":"10.1109\/INTENSIVE.2009.19"}],"container-title":["Data Mining and Knowledge Discovery"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10618-022-00869-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10618-022-00869-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10618-022-00869-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,4]],"date-time":"2023-01-04T17:41:53Z","timestamp":1672854113000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10618-022-00869-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,18]]},"references-count":40,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,1]]}},"alternative-id":["869"],"URL":"https:\/\/doi.org\/10.1007\/s10618-022-00869-6","relation":{},"ISSN":["1384-5810","1573-756X"],"issn-type":[{"value":"1384-5810","type":"print"},{"value":"1573-756X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,10,18]]},"assertion":[{"value":"28 February 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 September 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 October 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no relevant financial or non-financial interests to disclose. This research was supported by the University of Waikato.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}