Abstract
Traditional OLAP (On-Line Analytical Processing) systems store data in relational databases. Unfortunately, such systems struggle to manage large data volumes. As an alternative, NoSQL (Not-only SQL) systems provide scalability and flexibility for an OLAP system. We define a set of rules to map star schemas and their optimization structure, a precomputed aggregate lattice, into two logical NoSQL models: column-oriented and document-oriented. Using these rules, we analyse and implement two decision support systems, one for each model (using HBase and MongoDB, respectively). We compare both systems across three phases: loading data (generated using the TPC-DS benchmark), lattice generation, and querying.
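To make the ideas in the abstract concrete, the following is a minimal Python sketch, not the chapter's actual mapping rules, which are not reproduced on this page. It assumes a toy fact with two dimensions and one measure, and shows one plausible way to shape the same fact as a nested document and as a flat row of "family:qualifier" cells, then precomputes one aggregate per node of the lattice. All names (FACTS, to_document, to_column_row, build_lattice) and the sample values are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy fact rows: two dimensions and one measure.
FACTS = [
    {"store": "S1", "date": "2015-01", "sales": 10.0},
    {"store": "S1", "date": "2015-02", "sales": 5.0},
    {"store": "S2", "date": "2015-01", "sales": 7.0},
]
DIMENSIONS = ("store", "date")
MEASURE = "sales"


def to_document(fact):
    """Document-style shape (assumed): nest dimensions and measures."""
    return {
        "dimensions": {d: fact[d] for d in DIMENSIONS},
        "measures": {MEASURE: fact[MEASURE]},
    }


def to_column_row(fact):
    """Column-style shape (assumed): flat 'family:qualifier' cells."""
    row = {f"dimensions:{d}": fact[d] for d in DIMENSIONS}
    row[f"measures:{MEASURE}"] = fact[MEASURE]
    return row


def lattice_nodes(dimensions):
    """Yield every subset of the dimensions, finest grouping first."""
    for size in range(len(dimensions), -1, -1):
        yield from combinations(dimensions, size)


def aggregate(facts, group_by):
    """Sum the measure for each combination of group-by values."""
    totals = defaultdict(float)
    for fact in facts:
        key = tuple(fact[d] for d in group_by)
        totals[key] += fact[MEASURE]
    return dict(totals)


def build_lattice(facts):
    """Precompute one aggregate table per lattice node."""
    return {node: aggregate(facts, node) for node in lattice_nodes(DIMENSIONS)}
```

With n dimensions the lattice has 2^n nodes, so a real system would typically materialize only some of them; the sketch computes all of them because the toy schema has just two dimensions.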
Acknowledgements
This work is supported by ANRT funding under a CIFRE-Capgemini partnership.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Chevalier, M., El Malki, M., Kopliku, A., Teste, O., Tournier, R. (2015). How Can We Implement a Multidimensional Data Warehouse Using NoSQL? In: Hammoudi, S., Maciaszek, L., Teniente, E., Camp, O., Cordeiro, J. (eds) Enterprise Information Systems. ICEIS 2015. Lecture Notes in Business Information Processing, vol 241. Springer, Cham. https://doi.org/10.1007/978-3-319-29133-8_6
DOI: https://doi.org/10.1007/978-3-319-29133-8_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-29132-1
Online ISBN: 978-3-319-29133-8
eBook Packages: Computer Science (R0)