Abstract
Defining a business intelligence project for a transportation system with more than 10,000 users per day can be a challenging problem. When each user is monitored every minute, such a system generates more than 400 million records per month. For this reason, specific strategies must be applied to the ETL process to correctly handle the data generated by large transportation systems. This paper explores different operational database (OD) architectures and analyzes their impact on the processing time of the ETL stage in a business intelligence project. The database architectures reviewed are: one centralized OD, one logically centralized OD, and a distributed OD. The model is being tested with the transportation system defined in the city of Poza Rica, Mexico. This system contains more than three million simulated records per day, and the entire ETL process completes in under 136 s. The model runs on a computer with a quad-core Intel Xeon processor and 8 GB of RAM, running OS X Yosemite 10.10.5.
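The figures quoted in the abstract can be checked with a short back-of-the-envelope calculation. The sketch below is illustrative only and is not taken from the paper: it assumes round-the-clock monitoring (1,440 minutes per day) and a 30-day month, and the function names are hypothetical. The throughput it derives is a lower bound implied by the reported numbers, not a measurement reported by the authors.

```python
# Back-of-the-envelope check of the data volumes quoted in the abstract.
# Assumptions (not stated in the paper): users are monitored 24 hours a day
# and a month has 30 days. Function names are hypothetical.

USERS_PER_DAY = 10_000          # "more than 10,000 users per day"
MINUTES_PER_DAY = 24 * 60       # one record per user per minute (assumed 24 h)
DAYS_PER_MONTH = 30             # assumed month length


def records_per_month(users: int,
                      minutes_per_day: int = MINUTES_PER_DAY,
                      days: int = DAYS_PER_MONTH) -> int:
    """Records generated in a month when each user is sampled every minute."""
    return users * minutes_per_day * days


def etl_throughput(records: int, seconds: float) -> float:
    """Average records processed per second for a full ETL run."""
    return records / seconds


monthly = records_per_month(USERS_PER_DAY)
throughput = etl_throughput(3_000_000, 136)  # three million records, 136 s

print(f"Records per month: {monthly:,}")
print(f"Implied ETL throughput: {throughput:,.0f} records/s or more")
```

Under these assumptions the monthly volume comes to 432 million records, which is consistent with the "more than 400 million" stated above.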
Acknowledgements
This research is sponsored in part by the Mexican Agency for International Development Cooperation (AMEXCID) and the Uruguayan Agency for International Cooperation (AUCI) through the Joint Uruguay-Mexico Cooperation Fund. This research work reflects only the points of view of the authors and not those of the AMEXCID or the AUCI.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Cristobal-Salas, A. et al. (2019). ETL Processing in Business Intelligence Projects for Public Transportation Systems. In: Torres, M., Klapp, J. (eds) Supercomputing. ISUM 2019. Communications in Computer and Information Science, vol 1151. Springer, Cham. https://doi.org/10.1007/978-3-030-38043-4_4
DOI: https://doi.org/10.1007/978-3-030-38043-4_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-38042-7
Online ISBN: 978-3-030-38043-4
eBook Packages: Computer Science, Computer Science (R0)