Abstract
Many recent applications, such as sensor networks, generate continuous data streams. Efficient query processing over such streams faces additional constraints: the data are uncertain in nature and must be processed quickly and on time. Traditional query processing techniques for static data process the whole dataset without partitioning it, which is not feasible for data streams. Data clustering is therefore needed as a preprocessing step for data streams. In this paper, we propose the Incomplete High dimensional Data streams Query processing (IHDQ) algorithm for efficiently answering queries over data streams. The results demonstrate the efficiency of the proposed IHDQ in both clustering and query processing compared with alternative state-of-the-art approaches.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Najib, F.M., Ismail, R.M., Badr, N.L., Gharib, T.F. (2021). An Efficient Approach for Query Processing of Incomplete High Dimensional Data Streams. In: Hassanien, AE., Chang, KC., Mincong, T. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2021. Advances in Intelligent Systems and Computing, vol 1339. Springer, Cham. https://doi.org/10.1007/978-3-030-69717-4_57
DOI: https://doi.org/10.1007/978-3-030-69717-4_57
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69716-7
Online ISBN: 978-3-030-69717-4
eBook Packages: Intelligent Technologies and Robotics (R0)