Efficient Resource Managing and Job Scheduling in a Heterogeneous Kubernetes Cluster for Big Data

Int J Performability Eng, 2024, Vol. 20, Issue (3): 157-166. doi: 10.23940/ijpe.24.03.p4.157166

Jayanthi M (a,*) and K. Ram Mohan Rao (b)

  (a) Department of Computer Science and Informatics, Mahatma Gandhi University, Nalgonda, India
  (b) Department of Information Technology, Vasavi College of Engineering, Hyderabad, India
  • Contact: *E-mail address: jayanthimgu343@gmail.com

Abstract: Cloud computing is an on-demand computing model that uses virtualization technology to offer cloud resources such as CPU, memory, storage, and network capacity in virtual machines. As a result, much of the big data analytics in modern enterprise applications runs in the cloud. Because resources in private clouds are limited, the ultimate goal is to maximize resource utilization and guarantee user service by scheduling tasks and resources efficiently. However, schedulers in existing big data processing systems need to consider both application performance and resource utilization when performing allocations, which makes it difficult to design workflows that achieve low turnaround time and high resource utilization in big data systems. In this paper, we propose a resource management system for efficient job scheduling, called RMS, which dynamically schedules big data jobs for Spark applications on Kubernetes cluster nodes and autonomously adjusts scheduling policies in heterogeneous node clusters to improve application execution and resource utilization. The RMS mechanism is intended to ensure that adequate guidance and resources are available for its planning objectives, along with satisfactory resource utilization. RMS is evaluated experimentally against different methods and performance preferences, based on predicted completion time and on benchmark statistics from traces of several big data performance indicators. The results show that RMS reduces cost and scheduling overhead and improves job execution performance.

Key words: cloud computing, big data, job scheduling, resource management, Kubernetes, Spark
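
Illustrative sketch: the abstract does not spell out the RMS algorithm, so the following is not the authors' method. It is a minimal Python sketch of one generic idea the abstract mentions, placing jobs on heterogeneous nodes according to predicted completion time. The `Node` and `Job` classes, the `speed` and `work` fields, and the greedy longest-job-first rule are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    speed: float  # hypothetical relative speed: work units processed per second


@dataclass(frozen=True)
class Job:
    name: str
    work: float  # hypothetical estimate of the job's size in work units


def schedule(jobs, nodes):
    """Greedy placement by earliest predicted completion time.

    Jobs are considered longest first; each job goes to the node on which
    it is predicted to finish soonest, given the work already queued there.
    Returns {job name: (node name, predicted finish time)}.
    """
    free_at = {node.name: 0.0 for node in nodes}  # when each node becomes idle
    placement = {}
    for job in sorted(jobs, key=lambda j: j.work, reverse=True):
        # Predicted finish = time the node becomes idle + run time on that node.
        best = min(nodes, key=lambda n: free_at[n.name] + job.work / n.speed)
        free_at[best.name] += job.work / best.speed
        placement[job.name] = (best.name, free_at[best.name])
    return placement


if __name__ == "__main__":
    nodes = [Node("fast", 2.0), Node("slow", 1.0)]
    jobs = [Job("a", 8), Job("b", 4), Job("c", 2)]
    for job_name, (node_name, finish) in schedule(jobs, nodes).items():
        print(job_name, node_name, finish)
```

A real scheduler for Spark on Kubernetes would also have to account for memory, storage, and network capacity and for changing cluster state, none of which this sketch models.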