PaD-DBSCAN: Enhancing Parallel DBSCAN Clustering with Density Peak Detection

  • Conference paper
Advanced Data Mining and Applications (ADMA 2024)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 15387)

Abstract

DBSCAN is a well-established density-based clustering algorithm that discovers clusters of arbitrary shape and has numerous practical applications. Despite the significant advances achieved by optimized DBSCAN variants, these methods still struggle with data whose density is unevenly distributed. They also fail to balance the computational load across parallel architectures and depend on fixed threshold parameters. These limitations are the key bottlenecks of existing DBSCAN variants. To address them, we propose a parallel density-peak-based DBSCAN clustering algorithm called PaD-DBSCAN. The approach dynamically detects changes in density peaks, which improves parallel processing and removes the drawbacks of fixed parameter settings. Extensive experiments on various datasets demonstrate the effectiveness and superiority of PaD-DBSCAN and justify our design choices.
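To make the fixed-threshold limitation concrete, the following is a minimal sketch of classic, sequential DBSCAN. It is not PaD-DBSCAN, whose details are not given in this preview. The function name `dbscan`, the parameter names `eps` and `min_pts`, and the toy data are assumptions chosen for illustration, and the neighbourhood search is brute force for brevity.

```python
from math import dist

NOISE = -1

def dbscan(points, eps, min_pts):
    """Classic DBSCAN sketch: returns one label per point (-1 means noise)."""
    n = len(points)
    labels = [None] * n  # None = not yet visited

    def region_query(i):
        # Brute-force eps-neighbourhood of point i (includes i itself).
        return [j for j in range(n) if dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbours = region_query(i)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster  # i is a core point: start a new cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reachable from a core point
            if labels[j] is not None:
                continue  # already assigned; do not expand again
            labels[j] = cluster
            j_neighbours = region_query(j)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is also core: keep expanding
        cluster += 1
    return labels

# Hypothetical toy data: a dense group (spacing 0.1) and a sparse group (spacing 1.0).
dense = [(0.1 * k, 0.0) for k in range(5)]
sparse = [(10.0 + k, 0.0) for k in range(4)]
points = dense + sparse

tight = dbscan(points, eps=0.25, min_pts=3)  # threshold tuned to the dense group
loose = dbscan(points, eps=1.5, min_pts=3)   # threshold tuned to the sparse group
print(tight)
print(loose)
```

With the threshold tuned to the dense group, every point of the sparse group is labelled noise; only the larger threshold recovers both groups. A single global `eps` therefore has to be chosen per dataset, which is the kind of fixed-parameter constraint the abstract refers to.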

Notes

  1. https://docs.rs/residua-zigzag/latest/zigzag/

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61802273), Jiangsu Higher Education Institutions of China (No. 23KJA520011), and the Science and Technology Plan Project of Suzhou, China (No. SYG202139).

Author information

Corresponding author

Correspondence to Junhua Fang.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wang, Y., Fang, J., Fu, R., Chao, P. (2025). PaD-DBSCAN: Enhancing Parallel DBSCAN Clustering with Density Peak Detection. In: Sheng, Q.Z., et al. Advanced Data Mining and Applications. ADMA 2024. Lecture Notes in Computer Science, vol 15387. Springer, Singapore. https://doi.org/10.1007/978-981-96-0811-9_21

  • DOI: https://doi.org/10.1007/978-981-96-0811-9_21

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-96-0810-2

  • Online ISBN: 978-981-96-0811-9

  • eBook Packages: Computer Science, Computer Science (R0)
