Abstract
Data warehouse design methodologies require a novel approach in the Big Data context, because they must address the issues raised by the 5 Vs (Volume, Velocity, Variety, Veracity, and Value). The designer therefore needs support from automatic techniques that can quickly produce a multidimensional schema by using and integrating several data sources, some of which may be unstructured and thus require ontology-based reasoning. The methodologies must also adopt agile techniques, so that the multidimensional schema can change as the business requirements change without repeating the complete design process. Furthermore, hybrid approaches must replace the traditional data-driven or requirement-driven approaches, in order to stay faithful to user requirements while producing a valuable multidimensional schema that is compliant with the data sources. In this paper, we perform a metric-based comparison among different methodologies to demonstrate that those classified as hybrid, ontology-based, automatic, and agile are well suited to the Big Data context.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Di Tria, F., Lefons, E., Tangorra, F. (2017). Evaluation of Data Warehouse Design Methodologies in the Context of Big Data. In: Bellatreche, L., Chakravarthy, S. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2017. Lecture Notes in Computer Science, vol 10440. Springer, Cham. https://doi.org/10.1007/978-3-319-64283-3_1
DOI: https://doi.org/10.1007/978-3-319-64283-3_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64282-6
Online ISBN: 978-3-319-64283-3
eBook Packages: Computer Science (R0)