Abstract
In the past decade, many data analytic systems have been moved to cloud platforms. One major reason for this trend is the elasticity that cloud platforms can provide. However, different vendors describe the scalability of their platform in different ways, so there is a need to measure and compare the scalability of different data analytic platforms in a consistent way. To achieve this goal, we extend the well-known TPC Benchmark™DS (TPC-DS). The primary metrics in TPC-DS are Performance Metric, Price-Performance Metric and System availability date. We propose an additional primary metric, Scalability Metric, to evaluate the scalability of System Under Test at a given scale factor across different resource levels. We use a set of experimental performance runs to demonstrate how Scalability Metric is derived and how it measures the scalability of a cloud data analytic platform.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
T. P. P. C. (TPC): TPC BENCHMARK™ DS. http://tpc.org/tpc_documents_current_versions/pdf/tpc-ds_v3.1.0.pdf
Sakr, S., Zomaya, A.Y. (eds.): Encyclopedia of Big Data Technologies. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-77525-8
Trivedi, M., Chen, Z.: Lessons learned from the industry’s first TPC benchmark DS (TPC-DS). In: Nambiar, R., Poess, M. (eds.) TPCTC 2018. LNCS, vol. 11135, pp. 140–154. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11404-6_11
Yeruva, S., Kumar, P.V., Padmanabham, P.: Distributed data warehouse - experimentation with TPC-DS. In: 2015 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC) (2015)
Amdahl, G.M.: Validity of the single processor approach to achieving large scale computing capabilities. In: Spring Joint Computer Conference (1967)
Wikipedia: Amdahl’s law
Gunther, N.J.: Guerrilla Capacity Planning. Springer-Verlag, Berlin Heidelberg (2007)
Gunther, N.J.: A New Interpretation of Amdahl’s Law and Geometric Scalability, 17 Oct 2002. https://arxiv.org/abs/cs/0210017
Gunther, N.J.: How to Quantify Scalability. http://www.perfdynamics.com/Manifesto/USLscalability.html#tth_sEc1
Microsoft: Azure Synapse Analytics pricing. https://azure.microsoft.com/en-us/pricing/details/synapse-analytics/
Amazon Web Services, Inc., Amazon Redshift pricing. https://aws.amazon.com/redshift/pricing/
Snowflake Inc., Warehouse Size. https://docs.snowflake.com/en/user-guide/warehouses-overview.html#warehouse-size
Google: BigQuery pricing. https://cloud.google.com/bigquery/pricing
Wikipedia: Simple linear regression. https://en.wikipedia.org/wiki/Simple_linear_regression
Wikipedia: Linear regression. https://en.wikipedia.org/wiki/Linear_regression
ScienceDirect: Multiple Regression Analysis, ScienceDirect. https://www.sciencedirect.com/topics/economics-econometrics-and-finance/multiple-regression-analysis
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, G., Johnson, T., Cilimdzic, M. (2022). Quantifying Cloud Data Analytic Platform Scalability with Extended TPC-DS Benchmark. In: Nambiar, R., Poess, M. (eds) Performance Evaluation and Benchmarking. TPCTC 2021. Lecture Notes in Computer Science(), vol 13169. Springer, Cham. https://doi.org/10.1007/978-3-030-94437-7_9
Download citation
DOI: https://doi.org/10.1007/978-3-030-94437-7_9
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-94436-0
Online ISBN: 978-3-030-94437-7
eBook Packages: Computer ScienceComputer Science (R0)