Abstract
Extract-Transform-Load (\(\mathcal {ETL}\)) is a crucial phase in the Data Warehouse (\(\mathcal {DW}\)) design life cycle. It must cope with many issues: data provenance, data heterogeneity, process automation, data refreshment, execution time, etc. Ontologies and Semantic Web technologies have been widely used in the \(\mathcal {ETL}\) phase. "Ontology" is a buzzword used by many research communities, such as Databases, Artificial Intelligence (AI), and Natural Language Processing (NLP), and each community has its own type of ontology: conceptual canonical ontologies (for databases), conceptual non-canonical ontologies (for AI), and linguistic ontologies (for NLP). Existing \(\mathcal {ETL}\) approaches use all three types, but they do not distinguish which type of ontology is being used, a choice that usually affects the quality of the managed data. In this paper, we propose a semantic \(\mathcal {ETL}\) approach that considers both the canonical and non-canonical layers. To evaluate the effectiveness of our approach, we conduct experiments using Oracle semantic databases that reference the LUBM benchmark ontology.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Khouri, S., Abdellaoui, S., Nader, F. (2015). Avoiding Ontology Confusion in ETL Processes. In: Morzy, T., Valduriez, P., Bellatreche, L. (eds) New Trends in Databases and Information Systems. ADBIS 2015. Communications in Computer and Information Science, vol 539. Springer, Cham. https://doi.org/10.1007/978-3-319-23201-0_14
DOI: https://doi.org/10.1007/978-3-319-23201-0_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23200-3
Online ISBN: 978-3-319-23201-0
eBook Packages: Computer Science (R0)