Quantifying Cloud Data Analytic Platform Scalability with Extended TPC-DS Benchmark | SpringerLink
Skip to main content

Quantifying Cloud Data Analytic Platform Scalability with Extended TPC-DS Benchmark

  • Conference paper
  • First Online:
Performance Evaluation and Benchmarking (TPCTC 2021)

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 13169))

Included in the following conference series:

  • 732 Accesses

Abstract

In the past decade, many data analytic systems have been moved to cloud platforms. One major reason for this trend is the elasticity that cloud platforms can provide. However, different vendors describe the scalability of their platform in different ways, so there is a need to measure and compare the scalability of different data analytic platforms in a consistent way. To achieve this goal, we extend the well-known TPC Benchmark™DS (TPC-DS). The primary metrics in TPC-DS are Performance Metric, Price-Performance Metric and System availability date. We propose an additional primary metric, Scalability Metric, to evaluate the scalability of System Under Test at a given scale factor across different resource levels. We use a set of experimental performance runs to demonstrate how Scalability Metric is derived and how it measures the scalability of a cloud data analytic platform.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
JPY 3498
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
JPY 6291
Price includes VAT (Japan)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
JPY 7864
Price includes VAT (Japan)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. T. P. P. C. (TPC): TPC BENCHMARK™ DS. http://tpc.org/tpc_documents_current_versions/pdf/tpc-ds_v3.1.0.pdf

  2. Sakr, S., Zomaya, A.Y. (eds.): Encyclopedia of Big Data Technologies. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-77525-8

    Book  Google Scholar 

  3. Trivedi, M., Chen, Z.: Lessons learned from the industry’s first TPC benchmark DS (TPC-DS). In: Nambiar, R., Poess, M. (eds.) TPCTC 2018. LNCS, vol. 11135, pp. 140–154. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11404-6_11

    Chapter  Google Scholar 

  4. Yeruva, S., Kumar, P.V., Padmanabham, P.: Distributed data warehouse - experimentation with TPC-DS. In: 2015 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC) (2015)

    Google Scholar 

  5. Amdahl, G.M.: Validity of the single processor approach to achieving large scale computing capabilities. In: Spring Joint Computer Conference (1967)

    Google Scholar 

  6. Wikipedia: Amdahl’s law

    Google Scholar 

  7. Gunther, N.J.: Guerrilla Capacity Planning. Springer-Verlag, Berlin Heidelberg (2007)

    Google Scholar 

  8. Gunther, N.J.: A New Interpretation of Amdahl’s Law and Geometric Scalability, 17 Oct 2002. https://arxiv.org/abs/cs/0210017

  9. Gunther, N.J.: How to Quantify Scalability. http://www.perfdynamics.com/Manifesto/USLscalability.html#tth_sEc1

  10. Microsoft: Azure Synapse Analytics pricing. https://azure.microsoft.com/en-us/pricing/details/synapse-analytics/

  11. Amazon Web Services, Inc., Amazon Redshift pricing. https://aws.amazon.com/redshift/pricing/

  12. Snowflake Inc., Warehouse Size. https://docs.snowflake.com/en/user-guide/warehouses-overview.html#warehouse-size

  13. Google: BigQuery pricing. https://cloud.google.com/bigquery/pricing

  14. Wikipedia: Simple linear regression. https://en.wikipedia.org/wiki/Simple_linear_regression

  15. Wikipedia: Linear regression. https://en.wikipedia.org/wiki/Linear_regression

  16. ScienceDirect: Multiple Regression Analysis, ScienceDirect. https://www.sciencedirect.com/topics/economics-econometrics-and-finance/multiple-regression-analysis

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Guoheng Chen .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Chen, G., Johnson, T., Cilimdzic, M. (2022). Quantifying Cloud Data Analytic Platform Scalability with Extended TPC-DS Benchmark. In: Nambiar, R., Poess, M. (eds) Performance Evaluation and Benchmarking. TPCTC 2021. Lecture Notes in Computer Science(), vol 13169. Springer, Cham. https://doi.org/10.1007/978-3-030-94437-7_9

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-94437-7_9

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-94436-0

  • Online ISBN: 978-3-030-94437-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics