Abstract
Lightweight container technology has emerged as a fundamental component of cloud-native computing, with the deployment of containers and the balancing of loads on virtual machines representing significant challenges. This paper presents an optimization strategy for container deployment that consists of two stages: coarse-grained and fine-grained load balancing. In the initial stage, a greedy algorithm is employed for coarse-grained deployment, facilitating the distribution of container services across virtual machines in a balanced manner based on resource requests. The subsequent stage utilizes a genetic algorithm for fine-grained resource allocation, ensuring an equitable distribution of resources to each container service on a single virtual machine. This two-stage optimization enhances load balancing and resource utilization throughout the system. Empirical results indicate that this approach is more efficient and adaptable in comparison to the Grey Wolf Optimization (GWO) Algorithm, the Simulated Annealing (SA) Algorithm, and the GWO-SA Algorithm, significantly improving both resource utilization and load balancing performance on virtual machines.
Citation: Lu C, Zhou J, Zou Q (2025) An optimized approach for container deployment driven by a two-stage load balancing mechanism. PLoS ONE 20(1): e0317039. https://doi.org/10.1371/journal.pone.0317039
Editor: Jacopo Soldani, University of Pisa, ITALY
Received: July 12, 2024; Accepted: December 19, 2024; Published: January 10, 2025
Copyright: © 2025 Lu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are on https://github.com/buer33/Greedy-Genetic-Algorithm.
Funding: This research was supported by Zhejiang Provincial Natural Science Foundation of China under Grant No. LQ24F020023. This Project is Supported by Ningbo Natural Science Foundation (No.2023J180). A Project Supported by Scientific Research Fund of Zhejiang Provincial Education Department (No.Y202351645). Scientific Research Project Funded by Ningbo University of Technology (No.2040011540019). Scientific Research Incubation Program of Ningbo University of Technology (No. 2022TS23). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. There was no additional external funding received for this study.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Containerization technology [1, 2] is a lightweight form of operating system-level virtualization that packages and encapsulates applications along with their runtime dependencies into standardized, highly portable images. By isolating processes and enforcing resource constraints through container engines, this technology decouples applications from the underlying operating system and hardware, thereby allowing them to be packaged once and executed consistently across diverse environments. This technology presents several significant advantages, including decreased overhead, expedited startup times, enhanced portability, and strong support for contemporary development paradigms such as microservices and DevOps practices. These characteristics have established containerization as a fundamental technology for cloud-native applications and distributed architectures. Furthermore, containerization automates numerous facets of application management, including cluster monitoring and process-level data isolation, thereby improving scalability and performance. Cloud computing platforms have extensively embraced container technology to construct container-based clouds [3, 4], offering a comprehensive array of functionalities that streamline the development and operational processes. These platforms offer a range of functionalities, including container lifecycle management, resource scheduling, service orchestration, automated application deployment, monitoring and logging, configuration management, storage and network management, complemented by integrated security features. The adoption of container-based cloud solutions has markedly enhanced resource utilization, decreased operational costs, and accelerated business iteration cycles. By optimizing performance and providing a flexible, scalable environment, containerization is instrumental in improving user experience and facilitating enterprises in their digital transformation initiatives. 
This comprehensive ecosystem empowers organizations to efficiently develop, deploy, and manage containerized applications, thereby fostering innovation and operational agility.
In addition to being deployed in public clouds, private clouds, and various container service platforms, containers can also be deployed on physical or virtual hosts. This is the simplest and most prevalent method, particularly suitable for personal development, testing, and small-scale applications. Deploying containers on a single host entails installing a widely used container engine, such as Docker or Podman, on either a physical or virtual machine, and then running containers within that engine. This approach capitalizes on the isolation and security afforded by virtualization while also benefiting from the lightweight nature and flexibility inherent to containers. However, one common challenge associated with the containerized deployment model is uneven resource allocation: a disproportionate distribution of resources across containers or virtual machines that can result in suboptimal performance. Under uneven resource allocation, certain virtual machines may experience excessively high resource consumption while others remain underutilized. Such imbalances can adversely affect the overall performance and stability of applications, particularly in contexts where precise resource allocation is critical [5]. The root causes of this issue include inadequate virtual machine scheduling strategies, imbalanced workloads among containers, and the heterogeneous resource demands of containerized workloads. These factors produce varied resource utilization across containers, ultimately degrading overall performance and resource utilization [6, 7].
In order to address the aforementioned issue, it is essential to investigate and develop more equitable and efficient resource allocation strategies in containerized deployments. Our main contributions of this study are as follows:
- (1) The comprehensive deployment process for container services is categorized into two stages: the coarse-grained stage and the fine-grained stage. During the coarse-grained stage, the deployment of multiple container services across various virtual machines is optimized according to their resource requirements. During the fine-grained stage, resources on each virtual machine are distributed evenly among the container services.
- (2) In the initial phase of addressing the challenge of distributing multiple containers across various virtual machines, we employ a greedy algorithm. Subsequently, to tackle the issue of resource allocation for multiple containers on a single virtual machine, we utilize a genetic algorithm. This approach enables us to derive a set of optimal resource allocation strategies that facilitate a reasonable and balanced distribution of resources among the containers on the virtual machines.
- (3) Our approach addresses the limitations of traditional solutions, which primarily concentrate on the latter stages of the resource allocation problem while overlooking the overall balance in the initial stages. Furthermore, our approach underscores the significance of load balancing in the optimization of container service deployment.
This paper is organized as follows. “Background and related work” introduces related work. “Container deployment combinatorial optimization problem based on load balancing” introduces the two-stage partitioning situation and the mathematical modeling design of each stage. “Algorithm design for solving the two-stage container load balancing deployment model” introduces the research method and the specific algorithm design process of this paper. “Simulation experiment of container deployment optimization for two-stage load balancing” verifies and compares the deployment optimization method of this paper. “Summary and future work” summarizes the whole paper and introduces the future work.
Background and related work
The challenge of rational container placement, deployment, and balanced resource allocation has attracted considerable attention from both industry and academia. Despite this interest, research in this domain remains predominantly exploratory and developmental. Resource allocation, in particular, constitutes a critical component of container deployment strategies. Researchers are concentrating on the development of intelligent algorithms that effectively manage resources such as CPU, memory, and network bandwidth. These algorithms must be aligned with application requirements to enhance overall system performance. By improving the performance, reliability, and resource utilization of container clusters, these advancements seek to optimize the efficiency and effectiveness of containerized environments.
Currently, both domestic and international researchers have extensively explored task balancing and placement issues, yielding notable results. For instance, Hirofuchi et al. [8] addressed the placement problem under various resource constraints, such as CPU and memory, by considering both the configuration time and migration time of virtual machines to optimize cloud resource allocation. Aslam et al. [9] provided a comprehensive review of load balancing algorithms in cloud environments from 2004 to 2015, categorizing these algorithms into two main types: static and dynamic. They provided detailed descriptions and comparisons of each algorithm, and evaluated the performance of different load balancing algorithms through multiple parameters such as fairness, response time, and throughput. Nakada et al. [10] utilized a genetic algorithm to develop effective placement strategies by adjusting parameter weights, though determining the optimal parameters for multi-objective optimization remains challenging. Patel et al. [11] proposed a dynamic priority spillover technique and introduced the concept of short-life and long-life containers to optimize resource utilization and address fragmentation issues in virtual machine (VM) placement. Xu et al. [12] approached the placement problem as a multi-objective optimization challenge, aiming to minimize total resource waste, power consumption, and cooling costs. They proposed an enhanced multi-objective optimization genetic algorithm to explore the solution space and reconcile conflicting objectives. Van et al. [13] defined the placement problem as a constraint satisfaction issue and employed constraint planning methods to model it, achieving an automated mechanism for virtual resource management. Dasgupta et al. [14] introduced a novel load balancing strategy based on a genetic algorithm (GA). 
This algorithm efficiently balances the load across cloud infrastructure while minimizing the makespan for a given set of tasks, demonstrating its effectiveness in optimizing resource allocation. It is noteworthy that machine learning technologies have demonstrated significant potential in addressing load balancing challenges [15]. However, for small-scale development users, the adoption of such technologies could significantly escalate development costs. Consequently, after comprehensive consideration, this paper has opted to employ heuristic optimization algorithms as the solution.
Resource allocation, a critical issue in cloud computing, involves distributing resources based on specific usage rules within a fixed resource environment and addressing fixed user resource requests. Numerous studies have proposed algorithms and solutions to optimize this process. Wei et al. [16] achieved preliminary independent optimization results using binary integer programming, reaching an ideal and fair resource allocation through evolutionary mechanisms. Their findings demonstrated the existence of Nash equilibrium in resource allocation games when feasible solutions are available. Lee et al. [17] developed a mechanism for adaptive and stable deployment on virtual machines, utilizing evolutionary game theory to model key performance parameters. However, their study focused solely on CPU performance and did not consider other critical parameters such as memory and I/O. Pandey et al. [18] approached resource allocation by modeling it as a directed acyclic graph and employed a particle swarm-based heuristic cloud resource scheduling algorithm. Their approach aimed to minimize computation and transmission costs effectively. Semmoud et al. [19] proposed a load balancing algorithm based on a starvation threshold, which ensures load balancing by activating at least one idle virtual machine. This method helps reduce overall migration costs and mitigates additional overhead. Shi et al. [20] introduced the BMin algorithm, an enhancement of the Min-min algorithm, and demonstrated its effectiveness through experiments using the CloudSim simulation framework. Their results indicated that the BMin algorithm improves throughput and resource load balancing, outperforming traditional algorithms. Shahid et al. [21] evaluated the performance of several existing load balancing algorithms, including Particle Swarm Optimization (PSO) and Round Robin (RR). 
They also explored various modeling and simulation methods for mobile cloud computing environments, which are essential for assessing cost and reliability trade-offs in a Pay-As-You-Go (PAYG) context [22].
In summary, the process of container deployment allows for the consideration of various perspectives and the application of diverse data analysis methods to allocate resources among containers on a single host. However, existing methodologies exhibit certain limitations and fail to achieve a balanced deployment from a comprehensive architectural viewpoint, as they lack unified optimization criteria. This paper introduces a two-stage optimization method for load balancing in container deployment, which aims to minimize the processing time of all subtasks as the primary optimization criterion. The first stage employs a greedy algorithm to facilitate the balanced deployment of virtual machines by effectively placing multiple container tasks. The second stage utilizes a genetic algorithm to achieve load balancing of container tasks through the allocation of resources on virtual machines. This approach is designed to enhance system responsiveness and improve resource utilization.
Container deployment combinatorial optimization problem based on load balancing
Fundamentals of container deployment
Microservice architecture.
The concept of microservices was initially introduced by Martin Fowler and James Lewis in 2014. This architectural style entails decomposing an application into multiple small and independent service units, each capable of operating within its own process [23], and communicating via lightweight mechanisms such as HTTP APIs. Each service unit is developed autonomously and can be implemented using various programming languages, frameworks, and data storage technologies, provided that they conform to a standardized interface specification. These service units can be developed, tested, deployed, and scaled independently, without impacting the functionality of other service units, thereby collectively forming a complete application. The microservices architecture [24] offers developers an enhanced development paradigm for applications, characterized by:
- (1) the independence of each microservice from one another;
- (2) each microservice functioning as an atomic service that cannot be further subdivided;
- (3) the capacity of microservices to be rapidly combined and refactored into a cohesive system.
Virtual machine technology.
A virtual machine is a software technology that simulates hardware and operating systems. It abstracts, transforms, and partitions the hardware resources of a computer, providing multiple virtual execution environments. As a result, it enables running multiple different operating systems on a single physical machine. The concept of virtual machines was first introduced by IBM in the 1960s to implement multi-user and multitasking capabilities for large-scale computers. Later, with the development of personal computers and the internet, virtual machine technology has been widely applied and developed. Its fundamental principle involves creating one or more virtual hardware environments, known as virtual machines [25], on the hardware resources of a physical machine through a software layer called a virtualization layer or hypervisor. The virtualization layer manages and allocates the CPU, memory, disk, network, and other resources of the physical machine to each virtual machine. It also provides a set of standard interfaces and protocols for communication and interaction between virtual machines and the physical machine.
Virtual machine technology provides infrastructure-level support and advantages for microservices architecture, helping to achieve isolation, elastic scalability, management deployment, and a more flexible and diverse selection of runtime environments.
Containerization technology and deployment.
The concept of containers can be traced back to the Unix system call mechanism known as chroot (Change Root), which was proposed in 1979. Chroot facilitates the creation of an isolated environment that restricts access permissions for specific users or services, thereby preventing processes from accessing or modifying files and resources outside the designated directory tree. This methodology for process restriction and isolation served as a foundational element for the evolution of container technology [26]. Subsequently, the Linux kernel introduced namespaces and control groups (cgroups) to enhance process isolation and resource management. In 2008, the Linux Container Project launched the Linux container engine. This was followed by the introduction of the Docker container engine in 2013, which addressed issues related to standardization and portability within the realm of containerization. The advent and advancement of technologies such as the Kubernetes container orchestration platform in 2014, which is capable of managing distributed container clusters, signified the official emergence of containerization technology into a phase characterized by rapid technological growth.
Container technology represents a lightweight virtualization approach that consolidates all necessary files, libraries, dependencies, and other requirements essential for executing a software program. This technology can be deployed across various computing environments, including physical servers, virtual machines, and cloud servers. Container technology facilitates environment isolation and resource limitation, providing process-level isolation by creating an independent runtime environment for a collection of processes. Moreover, it enables applications to be deployed across diverse runtime environments, allowing them to operate independently in separate spaces without compromising the performance of other applications, even in the presence of errors. This characteristic enhances portability and fault tolerance for software developers. Additionally, containerization supports rapid updates and releases of software programs through the use of container modules, without disrupting or affecting the operating system or other applications. Consequently, it fosters agility and scalability while ensuring compatibility. Thus, container technology complements microservices architecture by promoting decoupling, high scalability, and flexibility in the implementation of microservices.
Due to the ability to run each microservice in an independent container, developers package the microservices and their dependencies into a container image, which can then be deployed on a single virtual machine. The general solution design consists of four main steps: application packaging, image building, image pushing, and container orchestration management [27], as shown in Fig 1. Application packaging can be configured within the configuration file located in the root directory of the project to facilitate the automatic packaging of the application. The process of image building can be accomplished by creating a Dockerfile and utilizing the build command. The Dockerfile encompasses various instructions and specifications essential for constructing the image, rendering it more efficient and concise in comparison to manual image building methods. Once the image has been constructed, it is subsequently pushed to a repository for storage, enabling it to be retrieved at any time for the establishment of a development environment. As the number of containers increases, it becomes imperative to manage and configure the operational order, communication, and status among the containers. Consequently, a container orchestration tool is required for the effective management and scheduling of containers. Prominent orchestration tools available in the market include Kubernetes, Docker Compose, and Docker Swarm.
Two-stage load balancing container deployment problem model
Based on the potential issues arising from the aforementioned deployment process, this paper investigates the problem of two-stage, coarse-to-fine load balancing container deployment optimization. To describe the problem more clearly and in detail, this paper divides the optimization problem into two parts. The first part is to place the container requests, each requiring certain resources, onto multiple virtual machines in a reasonable manner. The second part is to further allocate the inherent resources of the virtual machines to their respective container requests in a balanced way. The problem is further described as follows.
Container placement problem among virtual machines.
The objective of this study is to deploy a specified number of containers onto a cluster of virtual machines (VMs). This process, which involves the allocation and deployment of multiple containers across the VM cluster, is commonly referred to as the container placement problem. A naive allocation strategy is an even distribution, in which containers are deployed across VMs based solely on their quantity. However, each container possesses distinct resource requirements: some containers may demand a substantial portion of a VM’s resources, while others may require significantly less. Consequently, an even distribution based solely on quantity may leave certain VMs disproportionately burdened with tasks while others remain relatively underutilized. This phenomenon leads to a concentration of containers, as illustrated in Fig 2.
Therefore, to solve this problem, containers should be deployed judiciously so that the available virtual machines can handle the workload and the resources of multiple machines are fully utilized, allowing the containers to run more quickly and smoothly.
Definition 1 (Coarse-grained Way): Suppose there are n virtual machines, each with a distinct configuration, and m different container task requests that must be allocated to the n virtual machines. The total amount of resources required by container task request i is denoted CT_SUMi (1≤i≤m). Containers are deployed to the virtual machines sequentially, and the objective is to spread them evenly across the virtual machines so that each virtual machine's resource consumption falls within the range [Resmin, Resmax]. This guarantees that no virtual machine is either overloaded or left idle.
The parameters in the research problem model are now defined and explained, and the objective function and related constraints are shown in Table 1.
This problem aims to place each container on a suitable virtual machine without causing congestion. The objective function values are the endpoints of the range [Resmin, Resmax], which represent the resource consumption of each virtual machine after allocation. The method of determining the values of Resmin and Resmax is as follows:
(1)
(2)
The constraints are as follows:
(3)
(4)
Resmin and Resmax are the minimum and maximum values of the reasonable resource range for each virtual machine VMj (1≤j≤n). They depend on the current number of container requests awaiting deployment, and they capture load balancing by bounding the extreme differences in resource consumption that arise when the VMs allocate resources to the deployed containers. Eq 4 indicates that a container request can only be deployed to one virtual machine.
Internal resource allocation problem of virtual machines.
From the above problems, it can be concluded that m different container requests have been reasonably allocated to n virtual machines. However, this allocation does not guarantee that the container requests deployed on each virtual machine will be able to complete their tasks efficiently and concurrently. Given that each container is associated with varying task workloads, some containers necessitate substantial amounts of CPU, memory, and computational time, while others require only minimal CPU and memory resources to achieve rapid completion. The overall completion of a subprocess is contingent upon the successful execution of all container threads within that subprocess. Even if certain container threads have concluded their tasks, they must still await the completion of other threads that remain unfinished. These subprocesses typically arise from complex and large-scale services. Consequently, it is imperative to develop an optimization function that effectively balances the allocation of diverse resources, thereby facilitating the concurrent operation of containers on virtual machines while minimizing overall time consumption.
Definition 2 (Fine-grained Way): Suppose w containers are deployed on a virtual machine, and each container request i demands a different mix of resource types. The entire task is considered complete only when all containers on the virtual machine have finished their work.
Under this condition, the completion time of each container's process varies with the amount of resources it is allocated. Container a will finish sooner if it receives sufficient resources, while container b will take longer if its allocation is insufficient. However, even after container a finishes its process, it must still wait for container b to complete its task. Therefore, if all resources are distributed evenly among the containers based solely on the number of container requests, this unreasonable situation arises, as shown in Fig 3. A more reasonable approach is to balance the virtual machine's resource allocation so that each container thread works with a fair share of resources and finishes at a similar time. This minimizes the overall completion time of the subprocess task, as shown in Fig 4.
The parameters in the research problem model will now be defined and explained. The objective function and related constraints are shown in Table 2.
In this problem, it is assumed that only three types of resources are studied: CPU, memory, and I/O; hence r is equal to 3. To minimize the completion time of the entire task on the virtual machine, resources should be evenly distributed and all container processes should work in parallel. The formulas required for the objective function are as follows:
(5)
(6)
The formulated objective function is as follows:
(7)
The formulated constraints are as follows:
(8)
Ui is the utilization rate of container request i. It measures how well the virtual machine's allocation of the three resource types matches the requirements of container request i, and it is calculated by standardizing the difference between the resources requested by container i and the resources allocated to it. If the virtual machine provides more of a certain resource than container request i needs, container i can, in the ideal case, use that resource fully, and the utilization rate of that resource is 1. Ti represents the time required for container i to complete its processes after being allocated the various resources by the virtual machine. The closer the allocation comes to satisfying the resource requirements of container i, the more efficiently it completes its tasks, and the closer its completion time is to the minimum time required by the thread; conversely, an insufficient allocation lengthens the completion time. The objective function G(T) measures the dispersion of the completion times of the container threads: the allocation is more effective when the times are similar and the deviation is low, because faster threads do not have to wait for slower threads to finish. Eq 8 indicates that each container request must receive resources from the virtual machine; no container may fail to be allocated resources.
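The relationships among these quantities can be sketched in code. The concrete forms below are illustrative stand-ins consistent with the prose, not the paper's exact definitions in Eqs 5-8: over-provisioned resources are counted as fully utilized, completion time is modeled as a base time divided by utilization, and G(T) is taken as the standard deviation of completion times.

```python
# Illustrative sketch of the fine-grained objective. The formulas for
# U_i, T_i, and G(T) below are plausible stand-ins consistent with the
# surrounding text, not the exact definitions of Eqs 5-8.

def utilization(request, allocation):
    """U_i: how well the allocation covers the request for each of the
    r = 3 resource types (CPU, memory, I/O). Over-provisioned resources
    count as fully utilized (rate 1)."""
    rates = [min(alloc / req, 1.0) for req, alloc in zip(request, allocation)]
    return sum(rates) / len(rates)

def completion_time(base_time, u):
    """T_i: the closer U_i is to 1, the closer the container finishes
    to its minimum required time base_time."""
    return base_time / u

def dispersion(times):
    """G(T): standard deviation of the containers' completion times;
    lower means faster threads wait less for slower ones."""
    mean = sum(times) / len(times)
    return (sum((t - mean) ** 2 for t in times) / len(times)) ** 0.5

# Two containers with (CPU, mem, I/O) requests; an even split of the
# VM's resources under-serves the first and over-serves the second.
requests = [(4, 2, 1), (1, 1, 1)]
even     = [(2.5, 1.5, 1), (2.5, 1.5, 1)]
times = [completion_time(10, utilization(r, a))
         for r, a in zip(requests, even)]
print(dispersion(times))   # nonzero: the fast container waits for the slow one
```

A demand-proportional allocation would drive the two completion times together and push the dispersion toward zero, which is exactly the behavior the objective function rewards.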
Algorithm design for solving the two-stage container load balancing deployment model
This paper divides container deployment into two-stage problems. The first problem is how to place container service requests on virtual machines in a reasonable manner. The second problem is how to allocate resources to virtual machines with deployed container services, so that each container service can complete tasks relatively quickly and improve overall service efficiency. The following are the method strategies proposed in this paper to solve these two problems.
Greedy strategy for solving the container placement problem among virtual machines
The first problem to be addressed involves the strategic deployment of each container to an appropriate virtual machine. Given that different container tasks exhibit varying resource consumption on virtual machines, it is impractical to pre-allocate containers to specific virtual machines. Consequently, the container tasks are first sorted in descending order of their resource demand. This ordering lends itself to a greedy algorithm [28], which identifies a locally optimal choice at each selection step and thereby rapidly approximates an optimal solution. To achieve an effective deployment strategy, the most demanding container task is always assigned to the virtual machine currently exhibiting the lowest resource usage. The pseudocode of the algorithm is shown in Algorithm 1. This approach promotes a balanced distribution of resource consumption across all virtual machines executing the container services. The operational process is illustrated below.
- Step 1: For the container structure list, sort the container list in descending order according to the amount of requested resources;
- Step 2: Remove the first container (the one requesting the most resources) from the container list and allocate it to the first VM in the VM list, that is, the VM currently consuming the least resources.
- Step 3: Deploy the corresponding container task to the corresponding VM, and sort the VM list in ascending order based on the resource consumption.
- Step 4: Repeat Steps 2-3 until all containers in the container list are assigned to the virtual machine.
Algorithm 1 Greedy Algorithm for the Container Placement
Require: List of containers CT[ ], list of virtual machines VM[ ].
Ensure: An allocation deployment plan.
1: Input container list CT[ ]
2: Input virtual machine list VM[ ]
3: Sort CT[ ] in descending order by time
4: while CT[ ] is not empty do
5: Sort VM[ ] in ascending order by resource consumption
6: Assign the first container in CT[ ] to the first vm in VM[ ]
7: Remove the first container from CT[ ]
8: end while
9: while true do
10: Find the vm with the least resource consumption (minVm)
11: Find the vm with the most resource consumption (maxVm)
12: if the difference between maxVm and minVm resource consumption <= Q_max − Q_min then
13: break the loop
14: else
15: Remove the last container from maxVm’s task list
16: Assign this container to minVm
17: Update maxVm’s time and resource consumption
18: end if
19: end while
20: return allocation deployment plan
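The greedy placement of Algorithm 1 can be sketched in Python as follows. This is a minimal sketch: `greedy_place` and its arguments are illustrative names rather than the paper's code, and the rebalancing loop of lines 9-19 is omitted for brevity.

```python
def greedy_place(containers, n_vms):
    """Assign container resource demands to VMs so that total
    consumption per VM stays as balanced as possible."""
    # Sort container demands in descending order (largest first).
    ordered = sorted(containers, reverse=True)
    loads = [0.0] * n_vms                # current consumption per VM
    plan = [[] for _ in range(n_vms)]    # containers assigned to each VM
    for demand in ordered:
        # Greedy choice: the VM with the least consumption so far.
        target = loads.index(min(loads))
        plan[target].append(demand)
        loads[target] += demand
    return plan, loads

plan, loads = greedy_place([9, 7, 6, 5, 4, 3, 2, 1], n_vms=3)
```

With these demands the resulting per-VM loads differ by at most one unit, illustrating the balanced distribution the strategy aims for.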
Genetic algorithm for solving virtual machine internal resource allocation problem
Generic load balancing techniques.
Generic load balancing techniques [29], such as Round Robin, First-Come-First-Served (FCFS), Min-Min, and Max-Min, were once widely used in cloud computing environments. However, as resource demands grow and diversify, these traditional methods show their limitations in complex resource allocation scenarios. Round Robin simply assigns requests to each server in turn; FCFS, while simple and fair, ignores task execution times, which can lengthen average waiting and response times; Min-Min schedules the smallest tasks first, so larger tasks are delayed until the end and the overall completion time grows; Max-Min schedules the largest tasks first but causes load imbalance when task execution times are similar. None of these methods can adjust dynamically to changing environments and task types, so more advanced load balancing techniques are needed. For this reason, load balancing techniques inspired by natural phenomena have emerged, such as ant colony, bee, and genetic algorithms. This paper focuses on the genetic algorithm [30], which performs well on extensive search spaces and complex objective functions. Its main advantages are its ability to avoid becoming trapped in local optima and its strong adaptability to complex problems. The fundamentals of genetic algorithms and their application to load balancing are described next.
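As a concrete illustration of the Min-Min behaviour described above, the following minimal sketch (names are illustrative, not tied to any scheduler library) schedules the smallest remaining task first, showing how the largest task lands last and dominates the makespan:

```python
def min_min(tasks, n_workers):
    """Min-Min scheduling: repeatedly place the smallest remaining task
    on the worker that becomes available earliest."""
    finish = [0.0] * n_workers        # current finish time of each worker
    order = []
    for t in sorted(tasks):           # smallest task first
        w = finish.index(min(finish)) # earliest-available worker
        finish[w] += t
        order.append((t, w))
    return order, max(finish)         # schedule and resulting makespan

order, makespan = min_min([1, 2, 3, 10], n_workers=2)
```

With tasks of sizes 1, 2, 3, and 10 on two workers, the size-10 task is scheduled last, so the completion time is dominated by it, matching the delay effect the text attributes to Min-Min.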
The fundamental concept of genetic algorithm.
Load balancing is a key issue in distributed systems, aiming to evenly distribute workloads across multiple computing resources to improve the overall performance and reliability of the system. Load balancing is particularly important in cloud computing environments because it directly affects resource utilization and quality of service. However, the load balancing problem is highly complex and dynamic because the resource demand and system state may change at any time. Traditional load balancing methods often struggle to cope with these challenges, especially when global optimization is required.
Genetic algorithms [31] are adaptive global optimization methods that mimic natural selection and genetic mechanisms. Based on Darwinian evolution and Mendelian genetics, they treat every individual in a population as a search entity. Genetic algorithms use stochastic optimization techniques to efficiently explore the parameter space, dynamically learning about the problem space during the search and adaptively adjusting the search strategy to find the optimal solution. Meanwhile, genetic algorithms exhibit excellent global search capabilities in resource allocation and load balancing problems. They can optimize multiple load balancing metrics, which suits the multi-objective nature of the load balancing problem, and they are scalable, allowing new genetic operations or parameter tuning to be integrated for specific load balancing requirements and for the dynamic nature of the problem. In addition, genetic algorithms are remarkably flexible and robust to the dynamics and uncertainties of load balancing problems. They therefore have significant advantages and potential in solving complex load balancing problems.
Algorithm design for the optimization model.
(1) Encoding method. Common encoding methods used in genetic algorithms include binary encoding, gray code encoding, real number encoding, etc. Binary encoding is simple and straightforward to implement, and can swiftly accomplish encoding and decoding in genetic algorithm operations. However, it also has some limitations, such as requiring longer bit strings to enhance the accuracy of the solution, which leads to an expanded solution space and a decreased algorithm efficiency. At the same time, binary encoding also cannot reflect the relationship between variables. Therefore, this paper chooses the real number encoding method, which can improve the calculation accuracy, does not need to convert the base, and is suitable for numerical optimization problems. This paper takes the amount of resources allocated by each virtual machine to the container as the gene value.
Definition 3 (Chromosome): Assume that w container services are deployed on a virtual machine and that each container service requests resources c1,c2,…,ck respectively. The chromosome encoding is then X = x11,x12,…,x1k,x21,x22,…,x2k,…,xw1,xw2,…,xwk. Since this paper studies only three resource types (CPU, memory, and I/O), k = 3, and the composition of a chromosome is shown in Fig 5, where x11,x12,x13 represent the three resource requests of the first container on a virtual machine.
(2) Population setting. Before running the algorithm, the initial population and the population size must be set. According to the actual situation, the initial population can be set anywhere within the feasible range of the problem domain. The method of initializing individuals in this paper is to randomly generate n genes within the specified resource ranges to form one chromosome. The sums of x11,x21,…,xw1, of x12,x22,…,xw2, and of x13,x23,…,xw3 must not exceed the corresponding total resources of the virtual machine, and the initialization range of each gene value is set according to the amount of the corresponding resource. The population size is closely tied to both the execution efficiency and the final result of the genetic algorithm: when it is too small, the optimization performance suffers; when it is too large, the computational cost grows accordingly. This paper sets the population size NP to 600 and the number of iterations to 200, a moderate scale.
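A population initialized as described can be sketched as follows (a hedged Python sketch; function and variable names are illustrative). Each chromosome allocates the VM's CPU, Memory, and I/O among w containers, and per resource type the random draws are rescaled so their sum never exceeds the VM total:

```python
import random

def init_population(w, totals, np_size=600):
    """Generate np_size chromosomes; chromosome[i] = [cpu_i, mem_i, io_i]
    for container i, with each resource column summing to its VM total."""
    population = []
    for _ in range(np_size):
        columns = []
        for total in totals:                     # totals = [CPU, Mem, I/O]
            raw = [random.random() for _ in range(w)]
            scale = total / sum(raw)             # rescale so the column sum == total
            columns.append([x * scale for x in raw])
        # Transpose: one [cpu, mem, io] gene triple per container.
        population.append([list(gene) for gene in zip(*columns)])
    return population

pop = init_population(w=5, totals=[2.0, 4.0, 2000.0], np_size=600)
```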
(3) Fitness function. Fitness refers to the ability of individuals in a biological population to adapt to the living environment. In genetic algorithms, the mathematical function used to evaluate the quality of individuals is called the fitness function of individuals. The fitness function has a direct impact on the performance of the genetic algorithm. Therefore, this paper defines the fitness function as the objective function of the original problem, and computes the resource utilization rate Ui and running time Ti of each individual in each generation of the population based on Eqs 5 and 6, and then proceeds to calculate using Eq 7. The individual with the smallest variance of running time in each generation of the population is taken as the optimal solution, that is
min F = (1/w) Σ_{i=1}^{w} (T_i − T̄)², where T̄ = (1/w) Σ_{i=1}^{w} T_i  (7)
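The fitness evaluation can be sketched as follows. The completion-time model T_i = workload_i / cpu_i used here is an illustrative simplification standing in for the paper's Eqs 5 and 6:

```python
def fitness(chromosome, workloads):
    """Variance of container completion times on one VM; the GA
    minimizes this value so that completion times stay balanced."""
    # chromosome[i] = [cpu_i, mem_i, io_i]; only CPU drives time here.
    times = [load / gene[0] for gene, load in zip(chromosome, workloads)]
    mean = sum(times) / len(times)
    return sum((t - mean) ** 2 for t in times) / len(times)
```

Allocating CPU proportionally to workload equalizes the completion times, which drives this fitness to zero.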
(4) Genetic operator design. Selection operator: The function of the selection operator is to select good individuals from a certain generation of population and inherit them to the next generation of population. This paper adopts the roulette wheel selection algorithm [23]. The underlying principle of the roulette wheel selection algorithm is that individuals with higher fitness have a higher chance of being selected, but not necessarily. It also allows individuals with lower fitness to have some opportunity of selection, thus ensuring survival of the fittest. This paper calculates the fitness value of each individual for a certain generation of population with a size of NP, and adds up all the fitness values to obtain the total fitness value. The selection probability of an individual is calculated as individual fitness value/total fitness value. Then, a random number R between 0 and 1 is generated. Next, the selection probabilities of each individual are added up until the sum exceeds R. The individual corresponding to that sum is selected. In order to ensure the diversity of the population, repeat this step until NP individuals are selected to generate the next generation.
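Roulette-wheel selection as described can be sketched as follows (an illustrative sketch; since the paper minimizes time variance, a transformed score such as 1/(1+variance) would serve as the fitness value here so that better individuals get larger slices of the wheel):

```python
import random

def roulette_select(population, scores, np_size):
    """Roulette-wheel selection: each individual is chosen with
    probability proportional to its score; repeat np_size times."""
    total = sum(scores)
    chosen = []
    for _ in range(np_size):
        r = random.uniform(0, total)
        acc = 0.0
        for individual, score in zip(population, scores):
            acc += score
            if acc >= r:                 # the wheel stops on this slice
                chosen.append(individual)
                break
        else:  # guard against floating-point round-off at the wheel's end
            chosen.append(population[-1])
    return chosen
```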
Crossover operator: Genetic algorithm employs crossover operator to alter the structure of two chromosomes, swapping gene segments between them, in order to modify the gene structure and enhance the search capability. This paper randomly selects two chromosomes, and randomly picks a position between the chromosome lengths to exchange with the corresponding exchangeable position of another chromosome (the same resource type in the chromosome is the corresponding exchangeable position), for example, x11 in chromosome 1 can be exchanged with any value of x11,x21,…,xw1 in chromosome 2, as shown in Fig 6.
Mutation operator: The genetic algorithm uses the mutation operator to change the gene structure of a single individual so that the new structure moves closer to the optimal solution. For an individual in the population, one or more genes are changed to other allele values with a certain probability (the mutation probability). This paper applies real-number mutation: after the crossover operation, a gene value of an individual is replaced by a random value within its valid range, subject to the mutation probability. Mutation must occur only with this limited probability; otherwise the search would degenerate into a random search. Concretely, after the crossover operation a random number rand between 0 and 1 is generated; if rand is less than the mutation rate α defined in this paper (0.4), the mutation operation is executed, and x11,x21,…,xw1, x12,x22,…,xw2, and x13,x23,…,xw3 are mutated within their given ranges while ensuring that the total of each of the three resource types does not exceed the corresponding overall resources of the virtual machine.
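The crossover and mutation operators described above can be sketched as follows (a hedged illustration with illustrative names; chromosomes are lists of [cpu, mem, io] triples as in Definition 3):

```python
import random

def crossover(c1, c2, rate=0.8):
    """Swap one gene of the same resource type between two chromosomes,
    at any exchangeable position (same resource column)."""
    if random.random() < rate:
        r = random.randrange(3)           # resource type: CPU, Mem or I/O
        i = random.randrange(len(c1))     # container position in c1
        j = random.randrange(len(c2))     # any exchangeable position in c2
        c1[i][r], c2[j][r] = c2[j][r], c1[i][r]
    return c1, c2

def mutate(chrom, totals, rate=0.4):
    """Re-draw one gene within its valid range with probability `rate`,
    keeping each resource column at or below the VM total."""
    if random.random() < rate:
        r = random.randrange(3)
        i = random.randrange(len(chrom))
        others = sum(g[r] for k, g in enumerate(chrom) if k != i)
        chrom[i][r] = random.uniform(0, max(totals[r] - others, 0))
    return chrom
```

A swap preserves the combined amount of each resource across the two parents, and mutation keeps the per-resource sums within the virtual machine's totals, as the constraint in the text requires.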
Proposition: Let the decision vector X be utilized to construct the solution space for the problem within the context of the biological evolution process as implemented in genetic algorithms.
Proof: Through a series of genetic operations, including gene crossover and mutation among chromosomes, multiple iterations are conducted. The flowchart is shown in Fig 7. In accordance with the principle of survival of the fittest, individuals exhibiting higher fitness levels are selected for inheritance across generations. This process ultimately leads to the identification of a solution that either reaches or approximates the optimal solution to the problem at hand. The pseudocode of the algorithm is shown in Algorithm 2.
Algorithm 2 Genetic algorithm for virtual machine internal resource allocation
Require: Container resource requirements D[ ], Total virtual machine resources V[ ].
Ensure: An optimal set of resource allocation options.
1: Define genetic algorithm parameters(Population size NP, Maximum number of generations G, Chromosome length N)
2: Initialize population
3: Initialize generation counter
4: while generation counter < G do
5: Calculate fitness for each individual in population
6: Record best fitness for this generation
7: Individuals in the population are selected using the roulette selection method
8: crossover_idx1 = random integer between 1 and N
9: crossover_idx2 = random integer between 1 and N
10: Perform crossover based on idx1 and idx2
11: mutation_idx = random integer between 1 and N
12: Perform mutation based on mutation_idx
13: Increment generation counter
14: end while
15: return an optimal set of resource allocation options
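Algorithm 2 can be sketched end to end in Python as follows. This is a simplified single-resource illustration, not the paper's implementation: elitist truncation stands in for roulette selection, memory and I/O genes are omitted, and a renormalization step keeps allocations feasible after crossover and mutation.

```python
import random

def run_ga(workloads, cpu_total, np_size=60, generations=100, mut_rate=0.4):
    """Compact GA sketch: evolve per-container CPU shares so that
    completion times workloads[i] / cpu[i] have minimal variance."""
    w = len(workloads)

    def normalize(chrom):                 # enforce the resource constraint
        s = cpu_total / sum(chrom)
        return [g * s for g in chrom]

    def variance(chrom):                  # fitness: variance of times
        t = [l / g for l, g in zip(workloads, chrom)]
        m = sum(t) / w
        return sum((x - m) ** 2 for x in t) / w

    pop = [normalize([random.random() for _ in range(w)])
           for _ in range(np_size)]
    for _ in range(generations):
        pop.sort(key=variance)
        parents = pop[: np_size // 2]     # simplified elitist selection
        children = []
        while len(parents) + len(children) < np_size:
            a, b = random.sample(parents, 2)
            child = a[:]
            i = random.randrange(w)
            child[i] = b[i]               # one-gene crossover
            if random.random() < mut_rate:
                child[i] *= random.uniform(0.5, 1.5)   # mutation
            children.append(normalize(child))
        pop = parents + children
    return min(pop, key=variance)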
(5) Time complexity analysis. We analyze the time complexity of using the genetic algorithm to allocate resources for load balancing, which demonstrates the practicality of the algorithm. The main steps of the genetic algorithm are fitness evaluation, selection, crossover, and mutation. Fitness evaluation costs O(NP·w): the fitness of each of the NP individuals is computed over its w genes. Selection costs O(NP): individuals with larger fitness values are chosen from the NP individuals. Crossover costs O(NP·r): pairs of individuals are randomly selected from the population and genes of the r resource types are exchanged pairwise. Mutation costs O(NP·r): for an individual, one of its r resource genes is randomly selected for mutation. Here NP is the population size, w is the number of genes per individual (i.e., the number of containers), and r is the number of resource types per container. These operations are iterated until the maximum number of generations G is reached, so the total time complexity is shown below.
T = O(G · (NP·w + NP + NP·r + NP·r)) = O(G · NP · (w + r))  (9)
Simulation experiment of container deployment optimization for two-stage load balancing
Upon the completion of the modeling and design of the theoretical knowledge component, this paper conducts simulation experiments (https://github.com/buer33/Greedy-Genetic-Algorithm) focused on the optimal allocation of multiple container request tasks and the reduction of the time required for each subprocess task. The initial phase of the experiment validates that the greedy algorithm effectively assigns each container request task to an appropriate virtual machine based on the resource requirements of each task. The results indicate that this approach significantly outperforms the average resource consumption associated with randomly allocated virtual machines. Subsequently, the study demonstrates that the genetic algorithm can allocate the intrinsic resources of the virtual machine in accordance with the resource requirements and completion times of each container request task. This allocation strategy aims to minimize the completion time of each container thread, thereby expediting the completion of subprocess tasks, and the findings reveal that this method yields a more equitable distribution of resources compared to average allocation. Finally, the paper compares the running time differences between the greedy-genetic algorithm and other algorithms, including the GWO algorithm, the SA algorithm, and the GWO-SA algorithm [32], thereby establishing the superiority of the greedy-genetic algorithm.
Simulation experiment of container placement between virtual machines solved by greedy strategy
This experiment posits that a container task encompasses three distinct types of resource requests: CPU, Memory, and I/O bandwidth. The demand for each resource type is contingent upon the specific circumstances of the task. To enhance both the comprehensiveness and simplicity of the calculations, the requested amounts of CPU, Memory, and I/O are represented in units of C (units), M (GB), and I (Mbps), respectively. These amounts are aggregated into a weighted sum, referred to as CT_SUM, as delineated in Eq 10. This approach is designed to provide flexibility in resource allocation, allowing users to adjust the weights assigned to each resource type according to their particular requirements, thereby facilitating more rational and optimized resource distributions. Now specify that α1=0.4, α2=0.4, α3=0.2 in the formula.
CT_SUM = α1·C + α2·M + α3·I  (10)
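The weighted sum of Eq 10 can be computed directly; the sketch below uses the weights specified in the text (α1=0.4, α2=0.4, α3=0.2):

```python
def ct_sum(cpu, mem, io, a1=0.4, a2=0.4, a3=0.2):
    """Weighted total resource demand of one container task (Eq 10):
    CT_SUM = a1*C + a2*M + a3*I, with C in units, M in GB, I in Mbps."""
    return a1 * cpu + a2 * mem + a3 * io
```

For example, a container requesting 1 CPU unit, 2 GB, and 100 Mbps has CT_SUM = 0.4·1 + 0.4·2 + 0.2·100 = 21.2.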
The ideal running time of a container task can be estimated from performance data such as the CPU, Memory, and I/O it requires. We now give the CPU, Memory, and I/O amounts of 20 container task requests, which are to be deployed on 4 identically configured virtual machines. The specific requirements are shown in Table 3.
According to the greedy algorithm design, the balanced interval [Resmin, Resmax] of virtual machine resource consumption obtained from Eqs 1 and 2 is [367.00, 492.60]; when the total resources utilized by each virtual machine fall within this range, the system can be regarded as fairly balanced. Under the greedy scheme, the container numbers and total resource consumption deployed to the 4 virtual machines are shown in Table 4.
The results of the algorithm design indicate that the VM_SUM values representing the total resource consumption for each virtual machine are relatively uniform, with none exhibiting excessive or insufficient resource usage. All values fall within the specified numerical range, thereby meeting the established requirements. In contrast, the total resource consumption for load balancing is significantly less effective when multiple container tasks are assigned and deployed randomly across the virtual machines, as compared to the outcomes produced by the greedy algorithm. The following Fig 8 clearly illustrates that the resource allocation scheme implemented by the greedy algorithm achieves a more equitable distribution of resources across each virtual machine, effectively preventing scenarios in which certain virtual machines are burdened with a disproportionate number of containers while others remain underutilized.
Simulation experiment of resource allocation inside virtual machine solved by genetic algorithm
The above simulation results deploy a different number of container tasks on each of the four virtual machines. Each container task deployed to a virtual machine has its own CPU, Memory, and I/O requirements. The four virtual machines are assumed to be identically configured, with 2 units of CPU, 4 GB of Memory, and 2000 Mbps of I/O available for allocation. The resource requirements on the four virtual machines are shown in Tables 5–8.
The three resource requests of each container task of the above four virtual machines are taken as parameters and substituted into the genetic algorithm design of this paper. The genetic algorithm will distribute the resources optimally to each container based on the available resources of the virtual machine, so as to balance the completion time of each container task, and avoid large discrepancies in the completion time of each container on each virtual machine. The fitness evolution curve obtained by the genetic algorithm after 200 iterations is shown in Figs 9–12.
As shown in Figs 9–12, the variance of the container time on each virtual machine decreases gradually as the number of iterations increases, and the completion time of each container tends to be balanced. The completion time of each container on each virtual machine is given as follows.
The data from Figs 13–16 shows that after each virtual machine allocates three kinds of resources to each container reasonably by the genetic algorithm, the completion time of each container is relatively balanced, except for some containers such as container 4 on virtual machine 4, which takes too long to complete due to its large computational task compared to other containers. The time difference degree of other containers is small, which means that the completion time dispersion is low after the virtual machine balances the resources for each container, that is, the completion time does not differ much, thus making the subprocess task finish in the shortest time.
Comparison of the two-stage hybrid allocation algorithm with other heuristic allocation algorithms
The aforementioned algorithms illustrate that the deployment scheme utilizing a greedy algorithm is capable of appropriately allocating multiple container tasks with diverse resource request sizes to each virtual machine. This approach results in a more equitable distribution of resource consumption across the virtual machines. The genetic algorithm allocation scheme facilitates the distribution of resources to each container based on the inherent resources of the virtual machine, thereby balancing the completion times of each container and ultimately achieving the minimum completion time for the subprocess task on the virtual machine.
The advantage of the two-stage hybrid algorithm presented in this paper lies in its ability to effectively address the challenge of deploying multiple container tasks across various virtual machines. Initially, the algorithm balances the allocation of multiple containers to each virtual machine based on their resource consumption. Subsequently, it ensures the provision of sufficient resources for the execution of these containers within the virtual machine that accommodates multiple container tasks. This approach ultimately enhances the efficiency of completing the entire subprocess task.
The simulated annealing (SA) optimization algorithm exhibits significant generality and robustness in addressing resource allocation problems, effectively managing complex nonlinear optimization challenges. Nevertheless, it is characterized by a relatively slow convergence rate and a sensitivity to initial conditions and parameter settings, which can impede the rapid identification of the global optimal solution.
The Grey Wolf Optimization (GWO) algorithm is widely regarded as an effective approach for addressing resource allocation problems, owing to its straightforward structure and rapid convergence capabilities. These characteristics facilitate a balance between local search and global optimization. Nonetheless, the GWO algorithm is not without its limitations; it is susceptible to premature convergence, which can result in the algorithm settling on a local optimal solution during the search process, thereby potentially neglecting the identification of the global optimal solution.
The simulated annealing algorithm, which is based on grey wolf optimization (GWO-SA), seeks to integrate the strengths of both methodologies to enhance the efficiency and accuracy of resource allocation problem-solving. This algorithm has the potential to mitigate premature convergence through the simulated annealing process, while simultaneously capitalizing on the rapid convergence characteristics of the grey wolf optimization algorithm to expedite the search process. Nevertheless, the amalgamation of these two algorithms may lead to increased implementation complexity and computational costs, necessitating more meticulous parameter tuning to achieve an optimal balance in performance.
Now, based on the problem posed in “Simulation experiment of container placement between virtual machines solved by greedy strategy”, in which containers 1-20 must be deployed to virtual machines 1-4, the Greedy-Genetic algorithm is compared with the GWO, SA, and GWO-SA algorithms, as shown in Fig 17.
A comprehensive analysis of the completion times for all container tasks indicates that the allocation scheme utilized by the greedy-genetic algorithm demonstrates a notable degree of balance in the completion times of the majority of containers. In contrast, the other three heuristic algorithms exhibit considerable disparities in completion times among individual containers. This finding underscores the effectiveness of the greedy-genetic approach in achieving a nearly uniform distribution of workload across containers, thereby improving overall resource utilization and system efficiency. The balanced completion times not only facilitate a smoother workflow but also mitigate the risk of resource bottlenecks and delays, rendering it a preferable solution for resource allocation in complex containerized environments, as shown in Figs 18–21.
In this chapter, to ensure the rationality and reproducibility of the experimental design, we have not only meticulously detailed the process and outcomes of applying the Greedy-Genetic algorithm for resource allocation and deployment balancing among 20 containers but also expanded the experimental scope by conducting corresponding validations on 10 and 30 containers, respectively. The experimental results are presented in Figs 22–24. In these figures, the green-filled areas represent the lower and upper bounds of the runtime intervals for each container. Through observation, it is evident that the resource allocation strategy optimized by the Greedy-Genetic algorithm results in a uniform distribution of completion times for all containers within a relatively narrow range, thereby minimizing the overall task completion time.
Furthermore, we conducted a comparative analysis of the fastest completion times for each container before deployment and the coefficient of variation (CV, the ratio of the standard deviation to the mean, used to quantify the dispersion of data distributions) of the completion times after deployment. The CV serves as an indicator of data distribution concentration, with a lower CV indicating a more centralized and balanced distribution. Our comparative analysis revealed a significant reduction in the CV of completion times for all containers after deployment compared to before. This finding suggests that, through optimized deployment, the completion times of the containers exhibit a more balanced distribution, indicating a significant improvement in the balance of completion times. Not only does this validate the effectiveness of the Greedy-Genetic algorithm in addressing resource allocation problems, but it also provides robust data support and theoretical justification for subsequent related research.
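The coefficient of variation used in this comparison can be computed as follows (a minimal sketch with made-up example data; lower CV means completion times cluster more tightly around the mean):

```python
def coefficient_of_variation(times):
    """CV = standard deviation / mean; lower CV indicates a more
    concentrated, better-balanced distribution of completion times."""
    mean = sum(times) / len(times)
    var = sum((t - mean) ** 2 for t in times) / len(times)
    return (var ** 0.5) / mean

balanced = [10.0, 10.5, 9.5, 10.0]      # hypothetical post-deployment times
unbalanced = [4.0, 18.0, 7.0, 11.0]     # hypothetical pre-deployment times
assert coefficient_of_variation(balanced) < coefficient_of_variation(unbalanced)
```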
Summary and future work
This paper presents a two-stage optimization method for load balancing in container deployment resource allocation, aimed at addressing the issues of resource allocation imbalance and inefficiency prevalent in existing methodologies. The first stage of the proposed method involves the systematic allocation of container requests, which require specific resources, to multiple virtual machines. This allocation is conducted in a manner that is both rational and efficient. By modeling the resource demands of each container, the method calculates the intervals of relative balance in resource consumption for each virtual machine. Subsequently, a greedy algorithm is employed to partition these intervals into optimal allocation combinations, facilitating the deployment of multiple containers accordingly. Upon completion of the preliminary placement in the first stage, the second stage focuses on balancing the inherent resources of individual virtual machines among the container requests. This is achieved through a modeling calculation of the utilization rates of allocated resources for each container, alongside the corresponding budgeted time. The method conceptualizes the allocated resources as genes and the allocation schemes as chromosomes. It then utilizes genetic algorithm operations, including initialization, evolution, crossover, and mutation, to derive the optimal resource allocation scheme. Furthermore, this paper conducts a comparative evaluation of deploying 20 containers across 4 virtual machines, utilizing both the proposed two-stage load balancing optimization method and three alternative heuristic algorithms. The results, which reflect the relative balance of completion times for each algorithm, indicate that the optimization deployment method introduced in this study demonstrates superior efficiency and balance.
In the future, we will conduct and expand our research from the following perspectives:
- (1) Algorithm Optimization: This paper proposes to enhance the modeling and calculation of container resource utilization by incorporating a more realistic resource matching similarity rate in future analysis and computation.
- (2) Remaining Resource Allocation: After the various load balancing resource allocation schemes are applied, the genetic algorithm proposed in this study will inevitably leave some resources unallocated. In future research, we will explore a more systematic and equitable approach to distributing these remaining resources among the containers, thereby further improving overall resource utilization.
- (3) Real Case Study: In future research, we will implement a real-world, small-scale application-level container deployment problem, compare its effectiveness against traditional deployment schemes, and thoroughly evaluate and verify the results.
References
- 1. Shan C, Xia Y, Zhan Y, Zhang J. KubeAdaptor: a docking framework for workflow containerization on Kubernetes. Future Generation Computer Systems. 2023;148: 584–599.
- 2. Siddiqui T, Siddiqui SA, Khan NA. Comprehensive Analysis of Container Technology. 2019 4th International Conference on Information Systems and Computer Networks (ISCON). 2019; 218–223.
- 3. Zhang W, Chen L, Luo J, Liu J. A two-stage container management in the cloud for optimizing the load balancing and migration cost. Future Generation Computer Systems. 2022; 135:303–314.
- 4. Pahl C, Brogi A, Soldani J, Jamshidi P. Cloud Container Technologies: A State-of-the-Art Review. IEEE Transactions on Cloud Computing. 2019; 7(3):677–692.
- 5. Shahid MA, Islam N, Alam MM, Su’ud MM, Musa S. A Comprehensive Study of Load Balancing Approaches in the Cloud Computing Environment and a Novel Fault Tolerance Approach. IEEE Access. 2020; 8:130500–130526.
- 6. Shahid MA, Islam N, Alam MM, Mazliham MS, Musa S. Towards Resilient Method: An exhaustive survey of fault tolerance methods in the cloud computing environment. Computer Science Review. 2021; 40:1574–0137.
- 7. Shahid MA, Alam MM, Su’ud MM. Achieving Reliability in Cloud Computing by a Novel Hybrid Approach. Sensors. 2023; 23(4):1965. pmid:36850563
- 8. Hirofuchi T, Nakada H, Ogawa H, Itoh S, Sekiguchi S. Eliminating Datacenter Idle Power with Dynamic and Intelligent VM Relocation. Distributed Computing and Artificial Intelligence: 7th International Symposium. 2010; 645–648.
- 9. Aslam S, Shah MA. Load balancing algorithms in cloud computing: A survey of modern techniques. 2015 National Software Engineering Conference (NSEC). 2015; 30–35.
- 10. Nakada H, Hirofuchi T, Ogawa H, Itoh S. Toward virtual machine packing optimization based on genetic algorithm. Distributed Computing, Artificial Intelligence, Bioinformatics, Soft Computing, and Ambient Assisted Living: 10th International Work-Conference on Artificial Neural Networks. 2009; 651–654.
- 11. Patel KK, Desai MR, Soni DR. Dynamic priority based load balancing technique for VM placement in cloud computing. 2017 International Conference on Computing Methodologies and Communication (ICCMC). 2017; 78–83.
- 12. Xu J, Fortes JAB. Multi-Objective Virtual Machine Placement in Virtualized Data Center Environments. 2010 IEEE/ACM Int’l Conference on Green Computing and Communications & Int’l Conference on Cyber, Physical and Social Computing. 2010; 179–188.
- 13. Van HN, Tran FD, Menaud JM. Autonomic virtual resource management for service hosting platforms. 2009 ICSE Workshop on Software Engineering Challenges of Cloud Computing. 2009; 1–8.
- 14. Dasgupta K, Mandal B, Dutta P, Mandal JK, Dam S. A genetic algorithm (GA) based load balancing strategy for cloud computing. Procedia Technology. 2013; 10:340–347.
- 15. Shahid MA, Alam MM, Su’ud MM. Improved accuracy and less fault prediction errors via modified sequential minimal optimization algorithm. PLoS ONE. 2023; 18(4):e0284209.
- 16. Wei G, Vasilakos AV, Zheng Y, Xiong N. A game-theoretic method of fair resource allocation for cloud computing services. The Journal of Supercomputing. 2010; 54:252–269.
- 17. Lee C, Suzuki J, Vasilakos A, Yamamoto Y, Oba K. An evolutionary game theoretic approach to adaptive and stable application deployment in clouds. Proceedings of the 2nd workshop on Bio-inspired algorithms for distributed systems. 2010; 29–38.
- 18. Pandey S, Wu L, Guru SM, Buyya R. A particle swarm optimization-based heuristic for scheduling workflow applications in cloud computing environments. 2010 24th IEEE International Conference on Advanced Information Networking and Applications. 2010; 400–407.
- 19. Semmoud A, Hakem M, Benmammar B, Charr JC. Load balancing in cloud computing environments based on adaptive starvation threshold. Concurrency and Computation: Practice and Experience. 2020; 32(11):e5652.
- 20. Shi Y, Suo K, Kemp S, Hodge J. A task scheduling approach for cloud resource management. 2020 Fourth World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4). 2020; 131–136.
- 21. Shahid MA, Alam MM, Su’ud MM. Performance evaluation of load-balancing algorithms with different service broker policies for cloud computing. Applied Sciences. 2023; 13(3):1586.
- 22. Shahid MA. A Systematic Survey of Simulation Tools for Cloud and Mobile Cloud Computing Paradigm. Journal of Independent Studies and Research Computing. 2022; 20(1).
- 23. Dragoni N, Giallorenzo S, Lafuente AL, Mazzara M, Montesi F, Mustafin R, et al. Microservices: yesterday, today, and tomorrow. Present and Ulterior Software Engineering. 2017; 195–216.
- 24. Salah T, Zemerly MJ, Yeun CY, Al-Qutayri M, Al-Hammadi Y. The evolution of distributed systems towards microservices architecture. 2016 11th International Conference for Internet Technology and Secured Transactions (ICITST). 2016; 318–325.
- 25. Buzen JP, Gagliardi UO. The evolution of virtual machine architecture. National Computer Conference and Exposition (AFIPS ’73). 1973; 291–299.
- 26. Chen S, Zhou M. Evolving Container to Unikernel for Edge Computing and Applications in Process Industry. Processes. 2021; 9(2):351.
- 27. Linlin F, Suwen Z. Research on container deployment of microservices. Computing Technology and Automation. 2019; 38(4):151–155.
- 28. DeVore RA, Temlyakov VN. Some remarks on greedy algorithms. Advances in Computational Mathematics. 1996; 5:173–187.
- 29. Ghomi EJ, Rahmani AM, Qader NN. Load-balancing algorithms in cloud computing: A survey. Journal of Network and Computer Applications. 2017; 88:50–71.
- 30. Whitley D. A genetic algorithm tutorial. Statistics and Computing. 1994; 4:65–85.
- 31. Yadav SL, Sohal A. Comparative study of different selection techniques in genetic algorithm. International Journal of Engineering, Science and Mathematics. 2017; 6(3):174–180.
- 32. Patra MK, Misra S, Sahoo B, Turuk AK. GWO-based simulated annealing approach for load balancing in cloud for hosting container as a service. Applied Sciences. 2022; 12(21):11115.