An integrated model of threshold-based scaling and fractional admission controlling to improve resource utilization efficiency in 5G core networks

Ly Cuong Hoa; Thanh Chuong Dang; Viet Minh Nhat Vo

doi:10.1371/journal.pone.0330072

Abstract

User Plane Function (UPF) is considered a bridge between User Equipment (UE) and Data Networks (DN) in the 5G core network. A UPF instance can manage multiple Packet Data Unit (PDU) sessions, and several UPF instances are usually deployed to serve PDU session requests. One requirement is to utilize system resources effectively while ensuring stable system performance; in particular, the number of unused UPF instances needs to be optimized to reduce system costs. The paper proposes a fractional admission controlling (FAC) mechanism and integrates it with a Markov chain-based analytical model of threshold-based scaling for UPF instances (called TSUPF-FAC), in which two additional thresholds are added to control UPF instances globally in order to optimize resource utilization. A threshold-based scaling and fractional admission controlling (TS-FAC) algorithm is developed and implemented in Kubernetes-based Open5GS. The simulation results agree with the analytical results, and the analytical model helps to determine the admission thresholds for the best performance of TSUPF-FAC, as measured by metrics such as the number of idle UPF instances and system utilization.

Citation: Hoa LC, Dang TC, Vo VMN (2025) An integrated model of threshold-based scaling and fractional admission controlling to improve resource utilization efficiency in 5G core networks. PLoS One 20(8): e0330072. https://doi.org/10.1371/journal.pone.0330072

Editor: Dhanamjayulu C, Vellore Institute of Technology, INDIA

Received: September 19, 2024; Accepted: July 27, 2025; Published: August 18, 2025

Copyright: © 2025 Cuong Hoa et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript and its Supporting Information files.

Funding: This work was supported by the Ministry of Education and Training (Vietnam) for the development of Science and Technology under grant number B2023-DHH-17. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

The 5G network is a new generation of mobile networks capable of meeting the many requirements of vertical industries. In the 5G network with Service-Based Architecture, the control and user planes are separated, which provides flexibility in allocating UPF instances for new 5G applications and customer-specific edge services. UPF instances are responsible for connecting Radio Access Networks (RAN) to DNs via Access and Mobility Management Functions (AMF). A single UPF instance can manage multiple PDU sessions, while a PDU session can only be managed by one UPF instance. PDU sessions provide end-to-end user plane connections between UEs and DNs [1].

Establishing a PDU session on a UPF instance is illustrated in Fig 1. First, a UE must establish a UPF instance by registering to access the 5G infrastructure. The AMF requests a Session Management Function (SMF) to create a session management context to manage the UE’s PDU session. The SMF then selects a UPF instance to serve the UE. In order to create UE-specific QoS flows, a PDU session request initiated by the UE is sent to a UPF instance. The SMF schedules static and dynamic rules for sessions and then establishes session-related rules and policies for the UPF instance. Therefore, each PDU session is managed by a UPF instance, and the operation of UPF instances is performed by the Operation, Administration, and Management. Typically, several UPF instances are deployed to serve PDU session requests. However, since the number of UPF instances is limited, deploying more UPF instances than needed results in waste. Therefore, a mechanism to control resources while scaling UPF instances is needed.

Fig 1. The model of establishing PDU sessions in a UPF instance in the 5G core network.

Functions of gNB, AMF, SMF, and UPF in PDU session establishment between UE and the data network.

https://doi.org/10.1371/journal.pone.0330072.g001

The paper proposes a FAC mechanism and integrates it with TSUPF (called TSUPF-FAC), in which two different thresholds are added to control UPF instances globally. A Markov chain-based analytical model (called the Queueing model) has also been improved to evaluate the effectiveness of the new integration model. A TS-FAC algorithm has been developed and implemented in Kubernetes-based Open5GS.

The main contributions of the article include:

Proposing a FAC mechanism to efficiently respond to PDU session requests;
Improving the model of TSUPF proposed in [2] by integrating a FAC mechanism to scale UPF instances effectively. The improved model is called the Threshold-based Scaling and Fractional Admission Controlling for UPF instances (TSUPF-FAC);
Building a Queueing model for TSUPF-FAC (Q-TSUPF-FAC) to evaluate the impact of control thresholds on the performance of TSUPF-FAC; and
Developing a TS-FAC algorithm and implementing a simulation on Kubernetes with Open5GS to evaluate the performance of TSUPF-FAC experimentally.

The following sections of the article are organized as follows. Sect ‘Literature reviews’ introduces related works. The TSUPF-FAC model and queueing-based performance analysis are presented in Sect ‘Thresholds-based scaling for UPF instances with fractional admission controlling’. Experimental implementation and result analysis are described in Sect ‘Results and discussions’. The rest is the conclusion and future research.

Literature reviews

In 5G core networks, network functions can be dynamically scaled in/out to adjust the capacity of network components (e.g., UPF instances, Virtualized Network Functions (VNF), network slices). The process of scaling out instances is to increase available resource capacity, while the process of scaling in instances is to reduce operational costs. However, scaling-in/out issues in 5G networks differ from those in traditional cloud computing [3]. 5G network functions must deploy multiple instances simultaneously and more frequently than traditional cloud computing. Both the number and frequency of deployments impact cost-effectiveness significantly. One such system, called Telco-Cloud, was introduced by [4], which aims to deploy VNFs with the ability to handle an enormous number of requests.

There have been some studies on scaling with different resource types. Herrera and Moltó [5] studied the impact of a bio-based approach on automated container orchestration platforms. They found a relationship between bio-based approaches and the scalability of containers. The scaling method is also considered for improvement in some actual cases. Taherizadeh et al. [6] proposed an auto-scaling method based on a dynamic multi-level model, where the thresholds change automatically and the application scope is not only limited to the network infrastructure but can also be applied to application monitoring data. Guo et al. [7] developed a technique for packaging virtual machines to physical machines (VM-to-PM) and how to scale virtual machines to meet resource requirements. The Monitor-Analyze-Plan-Execute (MAPE) method was also introduced by Nguyen et al. [8] to manage the auto-scaling of UPF instances.

There have been several proposals for Network Functions Virtualization (NFV) and VNF. Specifically, the NFV auto-scaling algorithm proposed by Ren et al. [9,10] considered the trade-off between performance and operating costs. VNF is also enabled or disabled depending on the required capacity. Kumar et al. [11] proposed a new method for scaling in and out of VNFs and thereby discovered techniques for allocating and revoking UPF instances based on the Linux kernel. Ren et al. [12] introduced the VNF auto-scaling algorithm to balance high performance and low operating costs. Tang et al. [13] introduced dynamic VNF scaling and deployment in data center networks. A system called ScalFlux was introduced by Liu et al. [14] to reduce latency and achieve optimal performance through VNF traffic monitoring. Adamuz-Hinojosa et al. [15] proposed a scaling process according to European Telecommunications Standards Institute (ETSI) standards through interaction and information exchange between functional blocks in the NFV framework. The problem of optimizing Service Function Chain design and VNF placement to minimize resource costs considering VNF dependencies and traffic scale was addressed in the research of Zeng et al. [16]. Chen et al. [17] proposed HyScaler as a hybrid system for scaling VNFs deployed on an open-source NFV platform. Another tool called Tacker, based on the OpenStack open-source cloud computing platform, was also used by Sales et al. [18] to implement autoscaling functions in NFV. Leyva-Pupo et al. [19,20] raised the issue of dynamic UPF position reconfiguration due to user mobility. They proposed an Integer Linear Programming (ILP) model to reduce the cost and scheduling mechanism for re-computation time.

System performance enhancement can be achieved by Guard Channel (GC) and Fractional Guard Channel (FGC) mechanisms. Cruz-Perez and Ortigoza-Guerrero [21] overviewed call admission control mechanisms to ensure QoS in mobile networks. The authors [21] proposed the FGC mechanism to improve the original GC. Chuong et al. [22] consider the retrial queueing model with the FGC mechanism in cellular networks, thereby analyzing, evaluating, and comparing four mechanisms of Limited FGC, Uniform FGC, Limited Average FGC, and Quasi Uniform FGC.

Barrachina-Muñoz et al. [23] have developed and evaluated a 5G framework based on network functions encapsulated within Kubernetes clusters. Similarly, using the Kubernetes platform, Simone et al. [24] employed queueing models for AMF, SMF, and UPF nodes to assess system performance related to latency. Accordingly, our model presented in this paper will also be modeled based on Kubernetes.

The Queueing model used to analyze the resource scaling (e.g., UPF instances) has also attracted the attention of some researchers. Accordingly, Hsieh et al. [3] applied the queueing model to analyze the scaling of resource blocks corresponding to network slices. Rotter and Do [2] proposed the Queueing model of Threshold-based Scaling for UPF instances (Q-TSUPF) to analyze the scaling of UPF instances when the number of sessions being served reaches thresholds T₁ or T₂. However, this threshold pair is considered locally for each UPF to scale in or out when the number of PDU sessions reaches them. This approach clearly does not consider the entire system’s remaining UPF instances when scaling. In the case of a sudden increase in the number of PDU session requests, while the number of UPF instances is almost exhausted, it may be impossible to satisfy PDU session requests, and as a result, the service efficiency of the system is seriously degraded. Controlling the admission of PDU session requests is therefore necessary to improve the efficiency of resource use, and maintain stable system performance.

This paper proposes an improved model of TSUPF, in which a fractional admission controlling mechanism is integrated to control the admission of incoming PDU session requests, thereby managing the deployment or termination of UPF instances more effectively. A Queueing model for analyzing TSUPF-FAC is also developed, and TSUPF-FAC is implemented on Kubernetes with Open5GS. The following section describes our contributions in detail.

Thresholds-based scaling for UPF instances with fractional admission controlling

Problem description

The UPF plays a crucial role in 5G networks to realize the transformation of low latency and high throughput. The UPF is deployed as software and packaged in virtual machines or image containers. Service providers launch UPF instances in their cloud infrastructure to serve customers. In 5G networks, there is significant variation in PDU sessions generated by subscriber devices. In order to ensure quality of service (QoS), each UPF instance typically handles only a limited number of PDU sessions. New UPF instances can be deployed when there are more PDU session requests, and idle UPF instances are terminated when the number of PDU sessions decreases. In other words, improved scaling algorithms are required to manage UPF instances efficiently.

In [2], a Queueing model with the scaling-in (T₁) and scaling-out (T₂) thresholds is introduced to deploy and terminate a UPF instance when the number of in-serve PDU sessions is and , respectively. A two-dimensional continuous-time Markov chain {I(t), J(t)}, where I(t) is the number of in-serve PDU sessions and J(t) is the number of UPF instances deployed at time t, is analysed. PDU session requests are assumed to arrive according to a Poisson process with rate λ, and the service time of PDU sessions () is assumed to have an exponential distribution. The assumption of Poisson session arrivals in 5G networks has been adopted in [2,12,25,26]. The Poisson process is often used to model events such as incoming calls, network connection requests, or data sessions (e.g., PDU sessions in 5G networks). Non-Poisson arrivals also exist in 5G networks, but this paper focuses only on Poisson arrivals.
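The traffic assumptions above can be illustrated with a minimal Python sketch that samples session arrival and departure times. It is only an illustration under stated assumptions: the function name `sample_sessions`, the parameter name `service_rate` (the reciprocal of the mean service time), and the numeric values are hypothetical and are not taken from the paper.

```python
import random


def sample_sessions(arrival_rate, service_rate, horizon, seed=0):
    """Sample (arrival_time, departure_time) pairs for sessions arriving before `horizon`."""
    rng = random.Random(seed)
    sessions = []
    # Inter-arrival times of a Poisson process are exponential with the same rate.
    t = rng.expovariate(arrival_rate)
    while t < horizon:
        holding_time = rng.expovariate(service_rate)  # mean is 1 / service_rate
        sessions.append((t, t + holding_time))
        t += rng.expovariate(arrival_rate)
    return sessions


# Illustrative values only (assumption), not the parameters used in the paper.
sessions = sample_sessions(arrival_rate=5.0, service_rate=1.0, horizon=1000.0)
observed_rate = len(sessions) / 1000.0
mean_holding = sum(dep - arr for arr, dep in sessions) / len(sessions)
print(f"observed arrival rate ~ {observed_rate:.2f}, mean holding time ~ {mean_holding:.2f}")
```

With a long horizon, the observed arrival rate and mean holding time should approach the configured rate and the reciprocal of the service rate, which is a quick sanity check before feeding such a trace into a scaling simulation.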

A UPF instance (j) is deployed only, with the state transiting from j to j+1, , if the number of in-serve PDU sessions (i) increases from to , denoted by . Conversely, a UPF instance (j) is terminated, with the state transiting from j to j–1, , if the number of in-serve PDU sessions (i) decreases from to , denoted by . The threshold-based scaling problem is illustrated in Fig 2.

Fig 2. Operation of TS-FAC for UPF instances.

https://doi.org/10.1371/journal.pone.0330072.g002

The transition process of a state (i,j) is carried out as in [2]:

From (i,j) to (i+1,j): When a new PDU session request arrives and the system can serve it, a new UPF instance is not deployed if:
- either ;
- or ;
- or .
From (i,j) to (i+1,j+1): When a new PDU session request arrives and the system can serve it, a UPF instance is deployed if .
From (i,j) to (i–1,j): When a PDU session departs, a deployed UPF instance is not terminated if:
- either ;
- or ;
- or .
From (i,j) to (i–1,j–1): When a PDU session departs, a deployed UPF instance is terminated if .

Note that it is necessary to distinguish two cases: and . This paper only considers the case , that is or . Other cases have been demonstrated in [2].

FAC mechanism

To improve the effectiveness of Q-TSUPF [2], we add two thresholds, H₁ and H₂, to minimize the number of redundant deployed UPF instances while ensuring utilization remains unchanged. Two probabilities of and are also included to control the conditions under which the two thresholds H₁ and H₂ are applied. H₁ and H₂ are independent of the deployment threshold T₁ and the termination threshold T₂. The conditions for applying H₁ and H₂ are as in the FGC mechanism [21]. These two thresholds are intended to gradually reduce the number of deployed UPF instances if the number of in-serve PDU sessions reaches the strict H₁ threshold and then the more stringent H₂ threshold (Table 1). These thresholds help avoid over-deploying (which causes waste of resources) or under-deploying (which leads to degraded performance).

Table 1. Notations used in the FAC mechanism.

https://doi.org/10.1371/journal.pone.0330072.t001

The proposed FAC mechanism in this paper is inspired by the FGC mechanism [21], which limits new PDU session requests using two thresholds H₁ and H₂. Based on the number of in-serve PDU sessions, the FAC mechanism establishes the thresholds H₁ and H₂ with probabilities and , respectively. FAC can be considered as a general case of FGC when not limited by thresholds. However, certain systems may limit the thresholds to a specific range. The FAC mechanism is defined as follows.

Definition FAC. Fractional admission control is a mechanism that deploys UPF instances based on the number of in-serve PDU requests to segment by establishing two thresholds H₁ and H₂ corresponding to two probabilities and , respectively. These parameters are used to determine the probability of system state (i,j) in a two-dimensional Markov chain .

TS-FAC algorithm

Based on Definition FAC, we further refine the thresholds-based scaling model [2] by introducing two thresholds and the probabilities of applying them to limit deployed resources and ensure system performance, as presented in Algorithm TS-FAC. The purpose of admission controlling is to prevent uncontrolled deployment, system underutilization, and waste of resources.

Algorithm TS-FAC: Threshold-based Scaling and Fractional Admission Controlling

Input: .

Output:

1:

2:

3: while do

4:

5:

6: if then

7:

8: end if

9: if H₂<I(t) then

10:

11: end if

12: // are arrival rates at states

13: if a new UE requests a PDU session setup then

14:

15: if and then

16: return

17: end if

18: end if

19: if a UE in the system departs then

20:

21: if and then

22: return

23: end if

24: end if

25: end while

By applying the addition and multiplication rules, as well as the algorithmic complexity calculation methods as in [27], we deduce that the algorithm’s complexity is . From the algorithmic complexity, we can see that it depends linearly on C and L, and is independent of traffic load.

Algorithm TS-FAC manages UPF instances dynamically based on predefined thresholds, which helps maintain better performance without over-deploying or under-deploying UPF instances and without wasting resources.

One improvement of Algorithm TS-FAC is that it considers the probability control value for the states (i,j) as with initial default values of 1. The algorithm then compares the states (i,j), and if then is , and if i>H₂ then is .

Operating the Queueing model of TSUPF-FAC (Q-TSUPF-FAC)

TSUPF-FAC performs scaling of UPF instances based on data related to the PDU session at the AMF and UPF (Fig 1). The operation of TSUPF-FAC is depicted in Fig 2, where scaling and FAC are performed by Algorithm TS-FAC.

Accordingly, the system has at most L deployed UPF instances and each deployed UPF instance has at most C PDU sessions in service. Thus, there is a maximum of L × C PDU sessions operating in the system. As shown in Fig 2, the uncontrolled cells in green have the probability . The yellow cells are controlled at the threshold H₁ with the probability . The red cells are controlled at the threshold H₂ with the probability . Q-TSUPF-FAC is developed from Q-TSUPF in [2]. Specifically, when the number of PDU session requests in the system reaches , labeled as a_j in Fig 2, the system deploys a new UPF instance. Similarly, if the number of PDU session requests decreases to , labeled as in Fig 2, the system will terminate one UPF instance. According to Definition FAC, lines 5-12 of Algorithm TS-FAC can be described more clearly as follows:

In the case ,
- If and then ;
- If and then ;
- If and H₂<i then .
In the case ,
- If then ;
- If i>H₁ then .

For exceptional cases, if or then , and Q-TSUPF-FAC reduces to Q-TSUPF in [2].
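The case analysis above can be sketched in Python as a state-dependent admission probability. This is a minimal sketch rather than the authors' implementation: the names `beta1` and `beta2` for the two control probabilities are hypothetical, the ordering `h1 <= h2` is assumed, and the idea that a fractional admission thins the arrival rate multiplicatively is an assumption made for illustration.

```python
import random


def admission_probability(i, h1, h2, beta1, beta2):
    """Admission probability for a new request when i sessions are in service (assumes h1 <= h2)."""
    if i > h2:
        return beta2  # most stringent region, above the second threshold
    if i > h1:
        return beta1  # intermediate region, between the two thresholds
    return 1.0  # uncontrolled region, every request is admitted


def admit(i, h1, h2, beta1, beta2, rng=random):
    """Bernoulli admission decision; `rng` only needs a random() method."""
    return rng.random() < admission_probability(i, h1, h2, beta1, beta2)


def effective_arrival_rate(arrival_rate, i, h1, h2, beta1, beta2):
    """Thinned arrival rate in a state with i sessions (assumed multiplicative thinning)."""
    return arrival_rate * admission_probability(i, h1, h2, beta1, beta2)
```

Setting both control probabilities to 1 makes every request admissible in every state, which mirrors the exceptional case in which the controlled model falls back to the uncontrolled one.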

System state equilibrium equations

From Q-TSUPF-FAC in Fig 2, state equilibrium equations and state transition schemes are as the following equations, where equilibrium probabilities of the two-dimensional Continuous Time Markov Chain is denoted as

Where is the set of states of the system: . The cardinality of the set is determined as :

(1)

We have state transition equations:

(2)

(3)

(4)

(5)

(6)

(7)

The state transition diagrams of Q-TSUPF-FAC corresponding to Eqs (2)–(7) are shown in Figs 3 to 8.

Fig 3. State transition subdiagram of Q-TSUPF-FAC for the Eq (2).

https://doi.org/10.1371/journal.pone.0330072.g003

Fig 4. State transition subdiagram of Q-TSUPF-FAC for the Eq (3).

https://doi.org/10.1371/journal.pone.0330072.g004

Fig 5. State transition subdiagram of Q-TSUPF-FAC for the Eq (4).

https://doi.org/10.1371/journal.pone.0330072.g005

Fig 6. State transition subdiagram of Q-TSUPF-FAC for the Eq (5).

https://doi.org/10.1371/journal.pone.0330072.g006

Fig 7. State transition subdiagram of Q-TSUPF-FAC for the Eq (6).

https://doi.org/10.1371/journal.pone.0330072.g007

Fig 8. State transition subdiagram of Q-TSUPF-FAC for the Eq (7).

https://doi.org/10.1371/journal.pone.0330072.g008

Eqs (2)–(7) are solved as a system of linear equations with a normalization condition by setting . In this case, . At this point, the system of Eqs (2)–(7) can be solved by backtracking [28] since there is a defined value of . After determining the value of based on the normalization condition in Eq (8),

(8)

We can calculate:

(9)

We then derive the probabilities p_i,j from .
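The solution strategy described above (fix one unnormalized probability, derive the others, then normalize) can be illustrated on a deliberately simplified chain. The sketch below is not the two-dimensional model of Eqs (2)–(7): it assumes a one-dimensional birth-death chain over the number of sessions only, with hypothetical names (`stationary_distribution`, `fac`) and illustrative parameter values.

```python
def stationary_distribution(arrival_rate, service_rate, capacity, admit_prob):
    """Stationary probabilities of a birth-death chain with state-dependent admission."""
    weights = [1.0]  # unnormalized probability of the empty state, fixed to 1
    for i in range(1, capacity + 1):
        birth = arrival_rate * admit_prob(i - 1)  # rate of moving from i-1 to i
        death = i * service_rate  # rate of moving from i to i-1
        # Local balance: p[i-1] * birth = p[i] * death
        weights.append(weights[-1] * birth / death)
    total = sum(weights)  # normalization constant
    return [w / total for w in weights]


def fac(i, h1=6, h2=8, beta1=0.5, beta2=0.2):
    """Hypothetical two-threshold admission probability (illustrative values)."""
    if i > h2:
        return beta2
    if i > h1:
        return beta1
    return 1.0


plain = stationary_distribution(4.0, 1.0, 10, lambda i: 1.0)
controlled = stationary_distribution(4.0, 1.0, 10, fac)
mean_plain = sum(i * p for i, p in enumerate(plain))
mean_controlled = sum(i * p for i, p in enumerate(controlled))
print(f"mean sessions without control: {mean_plain:.3f}, with control: {mean_controlled:.3f}")
```

In this toy chain, admission control shifts probability mass away from the highly loaded states, which is the same qualitative effect that the control probabilities have on the equilibrium probabilities of the full model.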

Performance evaluation metrics

The performance metrics of Q-TSUPF-FAC are similar to those of Q-TSUPF [2]. However, Q-TSUPF-FAC adds the probability to achieve the minimal number of idle UPF instances.

The average number of deployed UPF instances includes the busy and idle UPF instances. (10)
The average number of busy UPF instances includes the deployed and used UPF instances. (11)
The average number of UPF instances deployed but idle () is for insurance purposes. However, sometimes, the number of idle UPF instances exceeds the required insurance, which is called redundant UPF instances. The redundancy leads to a waste of resources and increased management costs. Therefore, Q-TSUPF-FAC aims to minimize the number of redundant UPF instances. (12)
Utilization is the ratio of used resources to allocated resources. (13)

The probability in Q-TSUPF-FAC directly affects the equilibrium probability p_i,j, which improves the performance parameters of Q-TSUPF-FAC compared to Q-TSUPF in [2]. To evaluate the effectiveness of the TSUPF-FAC model, we assess both the and U metrics in comparison with the model presented in [2]. The subsequent analysis will show that the value in our model is smaller than the value in [2], while the U value in Q-TSUPF-FAC is larger than the U value in [2]. This indicates that the Q-TSUPF-FAC model has deployed a sufficient number of idle UPF instances, thereby decreasing redundant UPF instances and helping to reduce system costs. The next section will clarify the advantages of our proposed model compared to a specific scenario identified in [2].

Implementation of TS-FAC on Kubernetes with Open5GS

To evaluate the proposed Q-TSUPF-FAC model, we model the 5G core network with UPF instances, as depicted in Fig 9 [23,24].

Fig 9. Kubernetes-based Open5GS testbed implementing a complete 5G network infrastructure.

The red entities and lines represent the process of establishing the PDU session in the 5G core network.

https://doi.org/10.1371/journal.pone.0330072.g009

The simulation model in Fig 9 depicts the architecture of the 5G core network using the Open5GS simulator integrated within Kubernetes, encompassing various functional blocks [23,24]:

UERANSIM block: models the open-source simulation of UE and RAN.
Open5GS block: implements the core infrastructure of the 5G network in compliance with 3GPP Release 17 [29] and serves as the deployment area for the various versions of UPF.

In our experimental simulation model, Open5GS nodes have been deployed on Kubernetes, one of the most widely used container orchestration systems [30]. Accordingly, the functions of the 5G network, including UPF instances, are virtualized using container technology as illustrated in Fig 9. Core elements of the 5G core network, including UPF instances, are packaged into separate containers and organized into Pods within Kubernetes, with the resources and execution of each Pod managed by Kubernetes. This setup facilitates isolation of the execution environment and optimizes resource utilization. Kubernetes manages these Pods as the fundamental unit of deployment. In this paper, we consider a Pod running a single container that hosts a UPF image; thus, it can be referred to as a UPF Pod in Kubernetes [31]. We also use Open5GS [32] deployed on Kubernetes to simulate Algorithm TS-FAC and compare the results with the analytical outcomes [33]. A notable point is that in [2], the authors only evaluated performance numerically. Therefore, to verify the accuracy of the improved model, we conducted additional simulations with a duration of 10,000 seconds. Due to the Poisson arrival process and the exponential service process, the results of each run exhibit slight but not significant variations. Therefore, to confirm accuracy, we conducted 100 to 300 simulations and compared the averaged results, as shown in Figs 10 and 11. From the average results presented in Figs 10 and 11, it is evident that the simulation results are accurate to 99% and exhibit convergence. Furthermore, the results that validate our theoretical model against the practical simulations are presented in Figs 20 and 21.
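The idea of averaging many independent runs to smooth out run-to-run variation can be sketched as follows. This is a toy illustration, not the Kubernetes/Open5GS testbed: it assumes a simple infinite-server queue simulated event by event, and the function names and parameter values are hypothetical.

```python
import random
import statistics


def time_average_sessions(arrival_rate, service_rate, horizon, seed):
    """Time-averaged number of sessions in service over [0, horizon] for one run."""
    rng = random.Random(seed)
    t, n, area = 0.0, 0, 0.0
    while True:
        total_rate = arrival_rate + n * service_rate
        dt = rng.expovariate(total_rate)  # time until the next event
        if t + dt >= horizon:
            area += n * (horizon - t)
            break
        area += n * dt
        t += dt
        # The next event is an arrival with probability arrival_rate / total_rate.
        if rng.random() < arrival_rate / total_rate:
            n += 1
        else:
            n -= 1
    return area / horizon


def replicate(runs, **kwargs):
    """Mean and standard error of the metric over independent runs (one seed per run)."""
    samples = [time_average_sessions(seed=s, **kwargs) for s in range(runs)]
    mean = statistics.mean(samples)
    stderr = statistics.stdev(samples) / len(samples) ** 0.5
    return mean, stderr


mean, stderr = replicate(runs=30, arrival_rate=5.0, service_rate=1.0, horizon=200.0)
print(f"mean sessions ~ {mean:.3f} (standard error ~ {stderr:.3f})")
```

The standard error shrinks roughly with the square root of the number of runs, which is why averaging over a few hundred runs yields stable curves even though individual runs fluctuate.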

Fig 10. Comparison results by number of simulations for the average number of idle UPF instances.

https://doi.org/10.1371/journal.pone.0330072.g010

Fig 11. Comparison results by number of simulations for U.

https://doi.org/10.1371/journal.pone.0330072.g011

Fig 12. Comparison of the performance of Q-TSUPF-FAC and Q-TSUPF [2] based on the average number of idle UPF instances when varying H₁ and H₂.

The large red marker represents the special case of Q-TSUPF [2].

https://doi.org/10.1371/journal.pone.0330072.g012

Fig 13. Comparison of the performance of Q-TSUPF-FAC and Q-TSUPF [2] based on U when varying H₁ and H₂.

The large red marker represents the special case of Q-TSUPF [2].

https://doi.org/10.1371/journal.pone.0330072.g013

Fig 14. Comparison of the performance of Q-TSUPF-FAC and Q-TSUPF [2] based on the average number of idle UPF instances when varying the two control probabilities.

The large red marker represents the special case of Q-TSUPF [2].

https://doi.org/10.1371/journal.pone.0330072.g014

Fig 15. Comparison of the performance of Q-TSUPF-FAC and Q-TSUPF [2] based on U when varying the two control probabilities.

The large red marker represents the special case of Q-TSUPF [2].

https://doi.org/10.1371/journal.pone.0330072.g015

Fig 16. Comparison of the performance of Q-TSUPF-FAC and Q-TSUPF [2] based on the average number of idle UPF instances when varying T₁ and T₂.

https://doi.org/10.1371/journal.pone.0330072.g016

Fig 17. Comparison of the performance of Q-TSUPF-FAC and Q-TSUPF [2] based on U when varying T₁ and T₂.

https://doi.org/10.1371/journal.pone.0330072.g017

Fig 18. Comparison of the performance of Q-TSUPF-FAC and Q-TSUPF [2] based on the average number of idle UPF instances when varying the traffic.

https://doi.org/10.1371/journal.pone.0330072.g018

Fig 19. Comparison of the performance of Q-TSUPF-FAC and Q-TSUPF [2] based on U when varying the traffic.

https://doi.org/10.1371/journal.pone.0330072.g019

Fig 20. Comparison of the average number of idle UPF instances between theoretical analysis and simulation when varying .

https://doi.org/10.1371/journal.pone.0330072.g020

Fig 21. Comparison of U between theoretical analysis and simulation when varying .

https://doi.org/10.1371/journal.pone.0330072.g021

Results and discussions

Simulations are performed with the default parameters shown in Table 2 (similar to [2]) to compare the performance of TSUPF-FAC and TSUPF in [2]. In some cases, the parameters are reset to suit the simulation objectives. In our model, it is assumed that PDU session arrivals follow a Poisson process, while service times follow an exponential distribution, an approach consistent with the assumptions in [2,12], among others. Furthermore, simulations have been carried out to verify the practical feasibility of the proposed model.

Table 2. Simulation/analysis parameters and values.

https://doi.org/10.1371/journal.pone.0330072.t002

The performance evaluation metrics are as mentioned in the Sect ‘Performance evaluation metrics’:

The average number of idle UPF instances (noted ): results are computed using (12) according to the theory and are averaged from 100 to 300 simulations, and
The utilization (noted U): results are computed using (13) according to the theory and are averaged from 100 to 300 simulations.

According to the definition of the traffic load [34], where k is the number of servers. Based on the parameter values in the paper, we deduce . For the system to operate stably, we need the condition or . Therefore, the value is chosen appropriately for simulation and analysis. In addition, we assume that there are multiple UEs and that each UE can request the establishment of multiple PDU sessions.
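As a small worked illustration of the stability check, the sketch below assumes the standard multi-server definition of traffic load, namely the arrival rate divided by the product of the number of servers and the per-server service rate. Whether this matches the exact expression used in the paper is an assumption, and the function names and numeric values are hypothetical.

```python
def traffic_load(arrival_rate, service_rate, servers):
    """Traffic load under the assumed definition: arrival_rate / (servers * service_rate)."""
    return arrival_rate / (servers * service_rate)


def is_stable(arrival_rate, service_rate, servers):
    """The system is considered stable when the traffic load is strictly below 1."""
    return traffic_load(arrival_rate, service_rate, servers) < 1.0


# Illustrative values only (assumption), not the values of Table 2.
print(traffic_load(100.0, 1.0, 200), is_stable(100.0, 1.0, 200))
print(traffic_load(300.0, 1.0, 200), is_stable(300.0, 1.0, 200))
```

A check of this kind is useful before running a sweep over arrival rates, since arrival rates that push the load to 1 or above fall outside the stable operating region.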

Comparison of the average number of idle UPF instances and U when varying H₁ and H₂

The first goal of our simulation is to evaluate the effectiveness of integrating the FAC mechanism into Q-TSUPF. Figs 12 and 13 show that Q-TSUPF in [2] is represented by a large red circle marker corresponding to or . We adjust the thresholds H₁ and H₂ relative to the total number of system states (1).

The results show that when , the number of idle UPF instances remains almost unchanged at around 0.45 (the lowest value). However, when , the number of idle UPF instances gradually changes according to the value of H₂, with the exception of the case where , which corresponds to the Q-TSUPF model in [2], yielding the highest number of idle UPF instances. In terms of utilization, the results remain stable at a level above 0.9921, or 99.21%, which demonstrates the suitability of our proposed theoretical model. As shown in Fig 12, the number of idle UPF instances is lower, but the utilization level does not change. The result suggests that integrating FAC into Q-TSUPF makes the deployment of UPF instances more efficient while maintaining the same utilization level.

Comparison of the average number of idle UPF instances and U when varying the control probabilities

In the above section, we consider scaling a UPF instance once the number of served PDU sessions reaches the thresholds H₁ and H₂. However, to create a smooth and non-abrupt transition, two control probabilities and are added for the transition at H₁ and H₂, respectively. Assuming the threshold values are fixed as and , Fig 14 depicts as and vary, where of Q-TSUPF-FAC is lower than that of Q-TSUPF [2]. Q-TSUPF [2] is indicated by a large red circle marker corresponding to . Despite reducing , U of Q-TSUPF-FAC is not lower than that of Q-TSUPF [2]. As shown in Fig 15, Q-TSUPF-FAC always maintains a stable utilization level at 0.9921, similar to that of Q-TSUPF [2].

Comparison of the average number of idle UPF instances and U when varying T₁ and T₂

In [2], the system performance was analyzed with different thresholds T₁ and T₂. As shown in Figs 16 and 17, with the predefined values of as in Table 2 and different considered pairs of , of Q-TSUPF-FAC are always lower than Q-TSUPF in [2] (Fig 16). However, U of Q-TSUPF-FAC achieves a higher value than Q-TSUPF [2] (Fig 17), which again proves the effectiveness of the FAC mechanism. Although the results in Fig 17 show that the TS-FAC algorithm only increases the performance by about 1%, the improvement is noticeable as each UPF can serve more PDU sessions.

Comparison of the average number of idle UPF instances and U when varying traffic

The efficiency of scaling UPF instances is affected by traffic . Specifically, as the traffic intensity increases (from 100 to 500), Fig 18 shows an increase in . This increase is necessary to respond quickly to requests when the number of arriving PDU sessions increases and thus maintain the stable utilization of the system. Fig 19 shows how the resource utilization adapts to the increasing arrival rate of PDU sessions.

Comparison of the number of idle UPF instances and U between theoretical analysis and simulation

To demonstrate the correctness of the theoretical model and the simulation implementation, we compare the theoretical analysis results with the simulation results under the default parameters shown in Table 2. Figs 20 and 21 show that the performance curves (the number of idle UPF instances and U) of the theoretical analysis and the experimental simulation are similar. This agreement indicates that adding FAC to the Q-TSUPF model [2] is sound and brings better performance for scaling UPF instances in 5G core systems.
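
This kind of cross-check between analysis and simulation can be sketched on a much simpler stand-in model. The example below is not the Q-TSUPF-FAC chain: it assumes a single birth-death queue with Poisson arrivals of rate `lam`, per-session service rate `mu`, capacity `k`, and one admission threshold `h` above which arrivals are admitted with probability `p`. All names and values are hypothetical. The stationary distribution is computed from the balance equations and compared with a time-weighted estimate from an event-driven simulation.

```python
import random
from typing import List


def admission_prob(n: int, h: int, p: float, k: int) -> float:
    """Admit every arrival below h, a fraction p from h up to k-1, none at k."""
    if n >= k:
        return 0.0
    return 1.0 if n < h else p


def analytic_distribution(lam: float, mu: float, h: int, p: float, k: int) -> List[float]:
    """Stationary distribution from pi[n+1] = pi[n] * lam*beta(n) / ((n+1)*mu)."""
    weights = [1.0]
    for n in range(k):
        weights.append(weights[-1] * lam * admission_prob(n, h, p, k) / ((n + 1) * mu))
    total = sum(weights)
    return [w / total for w in weights]


def simulated_distribution(lam: float, mu: float, h: int, p: float, k: int,
                           horizon: float, seed: int) -> List[float]:
    """Fraction of time spent in each state over [0, horizon].

    Assumes lam > 0 and a positive admission probability in state 0,
    so the total event rate is never zero.
    """
    rng = random.Random(seed)
    t, n = 0.0, 0
    occupancy = [0.0] * (k + 1)
    while True:
        up = lam * admission_prob(n, h, p, k)   # effective arrival rate
        down = n * mu                           # total departure rate
        rate = up + down
        dwell = rng.expovariate(rate)           # exponential holding time
        if t + dwell >= horizon:
            occupancy[n] += horizon - t
            break
        occupancy[n] += dwell
        t += dwell
        n += 1 if rng.random() * rate < up else -1
    return [x / horizon for x in occupancy]


if __name__ == "__main__":
    params = dict(lam=8.0, mu=1.0, h=6, p=0.4, k=10)
    exact = analytic_distribution(**params)
    estimate = simulated_distribution(**params, horizon=20_000.0, seed=1)
    for n, (a, s) in enumerate(zip(exact, estimate)):
        print(f"n={n:2d}  analytic={a:.4f}  simulated={s:.4f}")
```

Close agreement between the two columns plays the same role here as the similarity of the curves in Figs 20 and 21, although the model in this sketch is far simpler than the one analyzed in the paper.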

Conclusion

In this paper, we have established a threshold-based scaling and controlling model by proposing the FAC mechanism and integrating it into the threshold-based scaling model (Q-TSUPF). Additionally, we have developed an effective algorithm for applying the FAC mechanism, which aids in the analysis, calculation, and real-time simulation of the queueing model. Simulation and analytical results indicate that our model outperforms existing models by incorporating the control thresholds H₁ and H₂ with their respective control probabilities. Under the same system traffic load, our model yields fewer idle UPF instances than the scenario without the FAC mechanism, allowing the system to conserve distributed resources while maintaining high performance relative to incoming request volumes. A notable advantage is the ease of implementing the FAC mechanism. The computational complexity of the algorithm is , but it uses resources more efficiently by keeping fewer UPF instances idle. Furthermore, it enables network operators to evaluate and ensure that a 5G network model can meet QoS requirements for PDU session requests. A crucial aspect is that our model can be applied in 6G networks. To fulfill the requirements of heterogeneous environments, we will consider arrival processes other than Poisson processes. Artificial intelligence and machine learning approaches will also be considered to improve computational results in heterogeneous networks.

References

1. UCC 5G UPF Configuration and Administration Guide; 2023.
2. Rotter C, Van Do T. A queueing model for threshold-based scaling of UPF instances in 5G core. IEEE Access. 2021;9:81443–53.
3. Hsieh C-Y, Phung-Duc T, Ren Y, Chen J-C. Design and analysis of dynamic block-setup reservation algorithm for 5G network slicing. IEEE Trans Mobile Comput. 2022:1–1.
4. Jaro G, Hilt A, Nagy L, Tundik MA, Varga J. Evolution towards Telco-Cloud: Reflections on dimensioning, availability and operability. In: 2019 42nd International Conference on Telecommunications and Signal Processing (TSP); 2019. p. 1–8.
5. Herrera J, Molto G. Toward bio-inspired auto-scaling algorithms: An elasticity approach for container orchestration platforms. IEEE Access. 2020;8:52139–50.
6. Taherizadeh S, Stankovski V. Dynamic multi-level auto-scaling rules for containerized applications. Comput J. 2018;62(2):174–97.
7. Guo Y, Stolyar A, Walid A. Online VM auto-scaling algorithms for application hosting in a cloud. IEEE Trans Cloud Comput. 2018:1–1.
8. Nguyen V-G, Grinnemo K-J, Taheri J, Forsman J, Le Duc T, Brunstrom A. On auto-scaling and load balancing for user-plane gateways in a softwarized 5G Network. In: 2021 17th international conference on network and service management (CNSM); 2021. p. 132–8. https://doi.org/10.23919/cnsm52442.2021.9615536
9. Ren Y, Phung-Duc T, Chen J-C, Li FY. Enabling dynamic autoscaling for NFV in a non-standalone virtual EPC: Design and analysis. IEEE Trans Veh Technol. 2023;72(6):7743–56.
10. Ren Y, Phung-Duc T, Chen JC, Yu ZW. Dynamic auto scaling algorithm (DASA) for 5G mobile networks. In: 2016 IEEE Global Communications Conference (GLOBECOM); 2016. p. 1–6.
11. Kumar D, Chakrabarti S, Rajan AS, Huang J. Scaling telecom core network functions in public cloud infrastructure. In: 2020 IEEE international conference on cloud computing technology and science (CloudCom); 2020. p. 9–16. https://doi.org/10.1109/cloudcom49646.2020.00006
12. Ren Y, Phung-Duc T, Liu Y-K, Chen J-C, Lin Y-H. ASA: Adaptive VNF scaling algorithm for 5G mobile networks. In: 2018 IEEE 7th international conference on cloud networking (CloudNet); 2018. p. 1–4. https://doi.org/10.1109/cloudnet.2018.8549542
13. Tang H, Zhou D, Chen D. Dynamic network function instance scaling based on traffic forecasting and VNF placement in operator data centers. IEEE Trans Parallel Distrib Syst. 2019;30(3):530–43.
14. Liu L, Xu H, Niu Z, Li J, Zhang W, Wang P, et al. ScaleFlux: Efficient stateful scaling in NFV. IEEE Trans Parallel Distrib Syst. 2022;33(12):4801–17.
15. Adamuz-Hinojosa O, Ordonez-Lucena J, Ameigeiras P, Ramos-Munoz JJ, Lopez D, Folgueira J. Automated network service scaling in NFV: Concepts, mechanisms and scaling workflow. IEEE Commun Mag. 2018;56(7):162–9.
16. Zeng Z, Xia Z, Zhang X, He Y. SFC design and VNF placement based on traffic volume scaling and VNF dependency in 5G networks. Comput Model Eng Sci. 2023;134(3):1791–814.
17. Chen Z, Li H, Ota K, Dong M. HyScaler: A dynamic, hybrid vnf scaling system for building elastic service function chains across multiple servers. IEEE Trans Netw Serv Manage. 2023;20(4):4803–14.
18. Sales W, Coutinho E, Souza J. Auto-scaling in NFV using tacker. In: Conference: 5th international workshop on ADVANCEs in ICT INfrastructures and Services (ADVANCE 2015). Evry – France; 2017.
19. Leyva-Pupo I, Cervelló-Pastor C, Anagnostopoulos C, Pezaros DP. Dynamic UPF placement and chaining reconfiguration in 5G networks. Comput Netw. 2022;215:109200.
20. Leyva-Pupo I, Cervelló-Pastor C, Anagnostopoulos C, Pezaros DP. Dynamic scheduling and optimal reconfiguration of UPF placement in 5G networks. In: Proceedings of the 23rd international ACM conference on modeling, analysis and simulation of wireless and mobile systems; 2020. p. 103–11. https://doi.org/10.1145/3416010.3423221
21. Cruz-Pérez FA, Ortigoza-Guerrero L. Fractional resource reservation in mobile cellular systems. Resource, mobility, and security management. New York, USA: Auerbach Publications; 2006. p. 335–62.
22. Chuong DT, Cuong HL, Trung Duc P, Duc Hung D. Performance analysis in cellular networks considering the QoS by retrial queueing model with the fractional guard channels policies. IJCNC. 2021;13(04):85–100.
23. Barrachina-Muñoz S, Nikbakht R, Baranda J, Payaró M, Mangues-Bafalluy J, Kokkinos P, et al. Deploying cloud-native experimental platforms for zero-touch management 5G and beyond networks. IET Netw. 2023;12(6):305–15.
24. De Simone L, Di Mauro M, Natella R, Postiglione F. Performance and availability challenges in designing resilient 5G architectures. IEEE Trans Netw Serv Manage. 2024;21(5):5291–303.
25. Almagrabi AO, Ali R, Alghazzawi D, AlBarakati A, Khurshaid T. A poisson process-based random access channel for 5G and beyond networks. Mathematics. 2021;9(5):508.
26. Anwar AM, Shehata M, Gasser SM, El Badawy H. Handoff scheme for 5G mobile networks based on markovian queuing model. ARASET. 2023;30(3):348–61.
27. El-kenawy E-SM, Khodadadi N, Mirjalili S, Abdelhamid AA, Eid MM, Ibrahim A. Greylag goose optimization: Nature-inspired optimization algorithm. Expert Syst Appl. 2024;238:122147.
28. Shah KR, Sinha B. Theory of optimal designs. Springer Science & Business Media; 2012.
29. 3GPP. 3GPP TR 21.917. https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3937
30. Kubernetes. https://kubernetes.io/docs/home/. Accessed 2023 October 1.
31. Nguyen HT, Van Do T, Rotter C. Scaling UPF instances in 5G/6G core with deep reinforcement learning. IEEE Access. 2021;9:165892–906.
32. Open5GS. https://open5gs.org/open5gs/. Accessed 2023 October 1.
33. Tran MN, Vu DD, Kim Y. A survey of autoscaling in Kubernetes. In: International conference on ubiquitous and future networks; 2022. p. 263–5.
34. Butterworth RW, Kleinrock L. Queueing systems volume 1: Theory. J Am Stat Assoc. 1976;71(355):773.
