Abstract
In today’s data age, big data processing and analysis frameworks play an important role in handling ever-growing volumes of data. “Sharing Data” has been proposed to improve data-processing performance through structured data scheduling. However, this approach incurs higher communication and buffering costs because of the extra data copies and buffering it requires. This paper therefore proposes a Dynamic Cluster Scheduling Algorithm (DCSA), based on the correlation of data, for the parallel optimization of big data tasks in a big data analysis environment. First, a dynamic data queue is generated from the server’s request database. The priority and the size of each data item are the factors considered when building the queue for data clustering association. Weights are then introduced and the dynamic data items are equalized, which provides the basis for multi-channel optimal scheduling. Second, according to the relevance of the data items, an optimized data placement mechanism aggregates related data in the same frame. After placement is complete, the dynamic data is scheduled uniformly to minimize the migration cost, with the local characteristics of the data items as constraints. The scheduling scheme is adjusted through target iteration until multi-channel optimal scheduling is achieved. Experiments show that the proposed method enables dynamic data to achieve optimal scheduling.
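The first stage described above (a queue ordered by priority and size, then weighted equalization across channels) can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything specific here is an assumption made for illustration: the `DataItem` fields, the linear scoring weights `w_priority` and `w_size`, and the greedy least-loaded assignment, which is a standard load-balancing heuristic swapped in for the paper's equalization step and not DCSA itself.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    """Hypothetical data item; field names are illustrative, not from the paper."""
    name: str
    priority: float  # assumed convention: larger means more urgent
    size: float      # assumed to stand in for transfer/buffer cost


def build_queue(items, w_priority=0.7, w_size=0.3):
    """Order items by an assumed weighted score of normalized priority and size."""
    if not items:
        return []
    max_p = max(i.priority for i in items) or 1.0
    max_s = max(i.size for i in items) or 1.0

    def score(item):
        return w_priority * item.priority / max_p + w_size * item.size / max_s

    return sorted(items, key=score, reverse=True)


def balance_channels(queue, n_channels):
    """Greedy equalization: each item goes to the currently least-loaded channel."""
    heap = [(0.0, c) for c in range(n_channels)]  # (total size so far, channel id)
    heapq.heapify(heap)
    channels = [[] for _ in range(n_channels)]
    for item in queue:
        load, c = heapq.heappop(heap)
        channels[c].append(item)
        heapq.heappush(heap, (load + item.size, c))
    return channels


if __name__ == "__main__":
    items = [
        DataItem("A", priority=3, size=40),
        DataItem("B", priority=1, size=10),
        DataItem("C", priority=2, size=30),
        DataItem("D", priority=3, size=20),
    ]
    queue = build_queue(items)
    for idx, channel in enumerate(balance_channels(queue, n_channels=2)):
        print(idx, [i.name for i in channel], sum(i.size for i in channel))
```

With the four sample items, the queue order is A, D, C, B and both channels end up carrying a total size of 50. The placement and iterative adjustment stages mentioned in the abstract are not modeled here, since the abstract does not specify them.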




Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant No. 61972293.
This article is part of the Topical Collection on: Big Data Security Track
Cite this article
Liu, F., He, Y., He, J. et al. Optimization of Big Data Parallel Scheduling Based on Dynamic Clustering Scheduling Algorithm. J Sign Process Syst 94, 1243–1251 (2022). https://doi.org/10.1007/s11265-022-01765-4
DOI: https://doi.org/10.1007/s11265-022-01765-4