Abstract
General Purpose Graphic Processing Unit (GPGPU) computing with CUDA has been effectively used in scientific applications, where huge accelerations have been achieved. However, while today’s traditional GPGPU can reduce the execution time of parallel code by many times, it comes at the expense of significant power and energy consumption. In this paper, we propose ubiquitous parallel computing approach for construction of decision tree on GPU. In our approach, we exploit parallelism of well-known ID3 algorithm for decision tree learning by two levels: at the outer level of building the tree node-by-node, and at the inner level of sorting data records within a single node. Thus, our approach not only accelerates the construction of decision tree via GPU computing, but also does so by taking care of the power and energy consumption of the GPU. Experiment results show that our approach outperforms purely GPU-based implementation and CPU-based sequential implementation by several times.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Bakum P, Skadron K (2010) Accelerating SQL database operations on a GPU with CUDA. In: GPGPU’10: proceedings of the third workshop on general-purpose computation on graphics processing units. pp 94–103
Kim J, Kim SG, Nam B (2013) Parallel multi-dimensional range query processing with R-trees on GPU. J Parallel Distrib Comput 73(8):2164–2179
Cayrel PL, Hoffmann G, Schneider M (2011) GPU implementation of the Keccak Hash function family. Int J Secur Appl 5(4):123–132
Che S, Boyer M, Meng J, Tarjan D, Sheaffer JW, Skadron K (2008) A performance study of general-purpose applications on graphics processors using CUDA. J Parallel Distrib Comput 68(10):1370–1380
Nguyen H (2007) GPU Gems 3. Addison-Wesley Professional, Reading
Huang S, Xiao S, Feng W (2009) On the energy efficiency of graphics processing units for scientific computing. In: Proceeding of the 2009 IEEE international symposium on parallel and distributed processing. pp 1–8
Yu CD, Wan W, Pierce D (2011) A CPU–GPU hybrid approach for the unsymmetric multifrontal method. Parallel Comput 37(12):759–770
Garcia V, Debreuve E, Barlaud M (2008) Fast k nearest neighbour search using GPU. In: Computer vision and pattern recognition workshops. pp 1–6
Kufrin R (1997) Decision trees on parallel processors. Mach Intell Pattern Recognit 20:279–306
Chui FCF, Bindoff I, Williams R (2009) Applying feature extraction for classification problems. J Signal Process Image Process Pattern Recognit 2(1):1–16
Cheng X, Xu J, Pei J, Liu J (2010) Hierarchical distributed data classification in wireless sensor networks. Comput Commun 33(12):1404–1413
Quinlan JR (1986) Induction of decision trees. Mach Learn 1(1):81–106
Teng Z, Du W (2007) A hybrid multi-group privacy-preserving approach for building decision trees. In: Proceedings of the 11th Pacific-Asia conference on advances in knowledge discovery and data mining. pp 296–307
Van Der Laan WJ, Jalba AC, Roerdink JBTM (2011) Accelerating wavelet lifting on graphics hardware using CUDA. IEEE Trans Parallel Distrib Syst 22(1):132–146
Shafer CJ, Agrawal R, Mehta M (1996) SPRINT: a scalable parallel classifier for data mining. In: Proceedings of the 22th international conference on very large data bases. pp 544–555
Mehta M, Agrawal R, Rissanen J (1996) SLIQ: a fast scalable classifier for data mining. In: Proceedings of the 5th international conference on extending database technology: advances in database technology. pp 18–32
Panda B, Herbach JS, Basu S, Bayardo RJ (2009) PLANET: massively parallel learning of tree ensembles with MapReduce. J VLDB Endow 2(2):1426–1437
Ben-Haim Y, Tom-Tov E (2010) A streaming parallel decision tree algorithm. J Mach Learn Res 11(3):849–872
Sharp T (2008) Implementing decision trees and forests on a GPU. In: Proceeding 10th European conference on computer vision. pp 595–608
Grahn H, Lavesson N, Lapajne HM, Slat D (2011) CudaRF: a CUDA-based implementation of random forests. In: Proceedings of the 2011 9th IEEE/ACS international conference on computer systems and applications. pp 95–101
Chiu CC, Luo GH, Yuan SM (2011) A decision tree using CUDA GPUs. In: Proceedings of the 13th international conference on information integration and web-based applications and services. pp 399–402
Park YH, Whang KY, Lee BS, Han WS (2006) Efficient evaluation of linear path expressions on large-scale heterogeneous XML documents using information retrieval techniques. J Syst Softw 79(2):180–190
Acknowledgments
This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2012003797). The work was also supported by the IT R & D program of MKE/KEIT. (10041854, Development of a smart home service platform wth real-time danger prediction and prevention for safety residential environments).
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Nasridinov, A., Lee, Y. & Park, YH. Decision tree construction on GPU: ubiquitous parallel computing approach. Computing 96, 403–413 (2014). https://doi.org/10.1007/s00607-013-0343-z
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00607-013-0343-z