Dynamic Management of Distributed Machine Learning Projects

  • Conference paper
Intelligent Distributed Computing XV (IDC 2022)

Part of the book series: Studies in Computational Intelligence (SCI, volume 1089)

Abstract

Machine Learning problems have in recent years placed new demands on the volume, diversity and speed of data, and new approaches are needed to meet the associated challenges. In this paper we describe CEDEs, a distributed learning system that runs on top of a Hadoop cluster and takes advantage of its blocks, replication and balancing. CEDEs trains models in a distributed manner following the principle of data locality, and can change parts of a model through an optimization module, allowing the model to evolve over time as the data changes. The paper describes the generic architecture of CEDEs, details the implementation of its first modules, and provides a first validation.
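The idea summarized in the abstract, training over blocks of data and later changing parts of the model, can be illustrated with a small sketch. This is not the CEDEs implementation: the abstract does not give one, so every name below (`split_into_blocks`, `MeanModel`, `train_ensemble`, `replace_worst`) is hypothetical, the "base model" is a toy that predicts the mean target of its block, and plain Python lists stand in for cluster blocks. It assumes one base model per block and an optimization step that swaps out the base model with the highest validation error.

```python
from statistics import mean


def split_into_blocks(rows, block_size):
    """Split (x, y) rows into fixed-size blocks, standing in for cluster file blocks."""
    return [rows[i:i + block_size] for i in range(0, len(rows), block_size)]


class MeanModel:
    """Toy base model (assumption): predicts the mean target of its training block."""

    def __init__(self, block):
        self.prediction = mean(y for _, y in block)

    def predict(self, x):
        return self.prediction


def train_ensemble(blocks):
    """Train one base model per block; each model only ever reads its own block."""
    return [MeanModel(block) for block in blocks]


def ensemble_predict(models, x):
    """Aggregate the base models by averaging their predictions."""
    return mean(m.predict(x) for m in models)


def model_error(model, rows):
    """Mean absolute error of a single base model on a set of rows."""
    return mean(abs(model.predict(x) - y) for x, y in rows)


def replace_worst(models, new_block, validation):
    """Hypothetical optimization step: retrain only the worst base model on new data."""
    worst = max(range(len(models)), key=lambda i: model_error(models[i], validation))
    updated = list(models)
    updated[worst] = MeanModel(new_block)
    return updated, worst


# Two blocks with different target levels, as if the data had drifted.
rows = [(x, 2.0) for x in range(4)] + [(x, 10.0) for x in range(4)]
models = train_ensemble(split_into_blocks(rows, block_size=4))
before = ensemble_predict(models, 0)

# Recent data suggests the first block is stale, so only that part is replaced.
validation = [(0, 10.0), (1, 10.0)]
models, replaced = replace_worst(models, [(0, 12.0), (1, 12.0)], validation)
after = ensemble_predict(models, 0)
```

Only the stale base model is retrained here; the rest of the ensemble is left untouched, which is one simple way a model could evolve as the data changes.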

Acknowledgments

This work was supported by FCT - Fundação para a Ciência e Tecnologia within projects UIDB/04728/2020 and EXPL/CCI-COM/0706/2021.

Author information

Corresponding author

Correspondence to Davide Carneiro.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Oliveira, F. et al. (2023). Dynamic Management of Distributed Machine Learning Projects. In: Braubach, L., Jander, K., Bădică, C. (eds) Intelligent Distributed Computing XV. IDC 2022. Studies in Computational Intelligence, vol 1089. Springer, Cham. https://doi.org/10.1007/978-3-031-29104-3_3
