A novel and effective method for solving the router nodes placement in wireless mesh networks using reinforcement learning

Le Huu Binh; Thuy-Van T. Duong

doi:10.1371/journal.pone.0301073

Abstract

Router nodes placement (RNP) is an important issue in the design and implementation of wireless mesh networks (WMN). This is known to be an NP-hard problem, which cannot be solved efficiently using conventional exact algorithms. Consequently, approximate optimization strategies are commonly used to solve it. With heavy node density and wide-area WMNs, solving the RNP problem using approximation algorithms often faces many difficulties; therefore, a more effective solution is necessary. This motivated us to conduct this work. We propose a new method for solving the RNP problem using reinforcement learning (RL). The RNP problem is modeled as an RL model in which the environment, agent, action, and reward correspond to the network system, routers, coordinate adjustment, and connectivity of the RNP problem, respectively. To the best of our knowledge, this is the first study that applies RL to solve the RNP problem. The experimental results showed that the proposed method increased the network connectivity by up to 22.73% compared to the most recent methods.

Citation: Binh LH, Duong T-VT (2024) A novel and effective method for solving the router nodes placement in wireless mesh networks using reinforcement learning. PLoS ONE 19(4): e0301073. https://doi.org/10.1371/journal.pone.0301073

Editor: Mohammed Balfaqih, University of Jeddah, SAUDI ARABIA

Received: August 9, 2023; Accepted: March 9, 2024; Published: April 10, 2024

Copyright: © 2024 Binh, Duong. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript and its Supporting information files.

Funding: The authors received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Wireless communication is growing and being widely applied in many fields. In the local area networks of agencies, businesses, schools, and so on, wireless mesh networks (WMN) [1, 2] are the best choice today because of their significant advantages compared to wireless networks using traditional access points. The most notable benefit of the WMN is that it reduces congestion owing to its ability to balance the loads. In addition, the installation of a WMN is very convenient because there is no need to construct wired connections from the gateway to all routers. Fig 1 illustrates an example of a WMN consisting of six mesh routers (represented by r₁ to r₆) and eleven mesh clients (represented by c₁ to c₁₁). In addition, at least one router of the Internet service provider serves as a gateway for clients to access the Internet. If two mesh routers are within range of each other, a wireless link is established between them. A mesh topology consists of all the mesh routers and wireless links. For a WMN to deliver Internet services, several mesh routers must be connected to the gateway router via wireless or cable links. As shown in Fig 1, the mesh routers r₁ and r₂ are connected to the gateway router (GPON or FTTH router) via wireless links. Mesh clients are the terminal devices that use network services. When a mesh client enters the network region, it can be covered by one or more mesh routers; the mesh client connects to the nearest mesh router to access network services.

Fig 1. An example of a wireless mesh network.

https://doi.org/10.1371/journal.pone.0301073.g001

With the rapid development of wireless and mobile communication technologies, network services are becoming more diverse and rich, especially those on fifth-generation (5G) and sixth-generation (6G) wireless network platforms. To effectively provide these services, WMNs must be designed and installed in the most efficient manner possible, allowing network resources to be fully utilized. This has motivated researchers to focus on WMNs. Some of the most prevalent subjects that have been studied include network topology control [3–7], router node placement (RNP) [8–24], optimum routing protocols [25–29], and access point allocation [30–33], with the RNP problem attracting particular interest. Because the RNP problem is known to be NP-hard, it cannot be solved efficiently using conventional exact algorithms. Recently, approximate optimization methods have become useful for solving this problem [8–12]. The authors of [8] used the coyote optimization algorithm (COA) to solve the RNP problem. Their proposed method optimizes both network connectivity and user coverage, which are two critical performance criteria. Using MATLAB simulations, the authors demonstrated that the COA algorithm outperformed other well-known optimization algorithms. In [10], the authors proposed an optimization method called the Chemical Reaction Optimization (CRO) algorithm to solve this problem. The CRO algorithm was inspired by how molecules interact to achieve a low, stable energy state in chemical reactions. In terms of client coverage and network connectivity, the simulation results show that their approach outperforms the Genetic Algorithm (GA) and Simulated Annealing (SA). Another study employed a genetic algorithm and simulated annealing to find a low-cost WMN configuration while satisfying constraints and identifying the number of gateways needed [34]. Experiments showed that the evolutionary algorithm and simulated annealing were successful in lowering WMN network costs while maintaining QoS. The new models significantly outperformed the conventional solutions. QoS was also considered in the RNP problem in [23]. The authors described a novel particle swarm optimization method for improving network connectivity and client coverage. The QoS constraints in this study are the delay, relay load, and Internet gateway capacity. In [35], the authors proposed an improved version of the Moth Flame Optimization (MFO) algorithm, namely, Enhanced Chaotic Lévy Opposition-based MFO (ECLO-MFO), for solving the RNP problem. To improve the optimization performance of MFO, the proposed method integrates three strategies: the chaotic map concept, the Lévy flight strategy, and the Opposition-Based Learning (OBL) technique. The simulation results showed that the proposed algorithm was more efficient than popular optimization algorithms.

Based on the results of published works, we find that methods using approximate optimization algorithms provide good solutions. However, because randomness is used in several steps of these algorithms, the results often differ between executions. For reliable results, each scenario must be executed multiple times, and the average over all executions is then taken. For example, the authors of [8, 11] executed each simulation scenario 50 times. Furthermore, with heavy node density and wide-area WMNs, solving the RNP problem with approximation algorithms often presents many difficulties, necessitating a more effective solution. In this paper, we propose a new and effective algorithm to solve this problem. The main contributions of this study are summarized as follows:

(i) We propose a novel and effective method for solving the RNP problem using RL. The RNP problem is modeled as an RL model, with the environment, agent, action, and reward representing the network system, routers, coordinate adjustment, and connectivity of the RNP problem, respectively. To the best of our knowledge, this is the first study to apply reinforcement learning to the RNP problem.
(ii) We compare and evaluate the performance of the proposed RL method against methods that solve the RNP problem using heuristic algorithms.

The remainder of this paper is organized as follows. The next section describes the formulation of the RNP problem in the WMN. The following sections present our proposed solution and experimental results. Finally, concluding remarks and promising future studies are presented in the last section.

RNP problem

In this section, we formulate the RNP problem in a WMN. First, graph theory is used to describe the WMN. We then define the metrics used in the objective function of the RNP problem, similar to [11]. Finally, the RNP problem is formulated as a nonlinear programming problem. For convenience, the mathematical symbols are defined in Table 1.

Table 1. The notations used in this paper.

https://doi.org/10.1371/journal.pone.0301073.t001

Mathematical model of a WMN using graph theory

Consider a WMN comprising m mesh routers, n mesh clients, and k gateway routers. Mathematically, this WMN can be represented as an undirected graph, denoted by G = (V, E), where V and E are the vertex and edge sets, respectively. V is equivalent to the set of all nodes in the WMN and is determined by V = R ∪ C ∪ W, where R, C and W are the sets of mesh routers, mesh clients, and gateway routers, respectively. E is equivalent to the set of all wireless links in the WMN and consists of three types: links between mesh routers, links between mesh client and mesh router, and links between gateway and mesh router.
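
The graph model above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `build_wmn_graph`, the node labels, and the link rules (two routers are linked when their distance is at most `d_r`, and a gateway is linked to a router when their distance is at most `d_w`) are assumptions made for the example.

```python
import math
from itertools import combinations


def build_wmn_graph(routers, clients, gateways, d_r, d_w):
    """Return adjacency sets for G = (V, E) with V = R ∪ C ∪ W.

    Nodes are labelled ("r", i), ("c", i) and ("w", i); every argument
    holding positions is a list of (x, y) tuples in metres.
    """
    nodes = ([("r", i) for i in range(len(routers))]
             + [("c", i) for i in range(len(clients))]
             + [("w", i) for i in range(len(gateways))])
    adj = {v: set() for v in nodes}

    def link(u, v):
        # Undirected edge: store it in both directions.
        adj[u].add(v)
        adj[v].add(u)

    # Links between mesh routers (assumed rule: distance <= d_r).
    for (i, p), (j, q) in combinations(enumerate(routers), 2):
        if math.dist(p, q) <= d_r:
            link(("r", i), ("r", j))
    # Links between mesh clients and mesh routers (client inside coverage).
    for i, c in enumerate(clients):
        for j, r in enumerate(routers):
            if math.dist(c, r) <= d_r:
                link(("c", i), ("r", j))
    # Links between gateways and mesh routers (assumed rule: distance <= d_w).
    for i, w in enumerate(gateways):
        for j, r in enumerate(routers):
            if math.dist(w, r) <= d_w:
                link(("w", i), ("r", j))
    return adj
```

The three loops correspond to the three link types in E; an isolated router simply ends up with an empty adjacency set.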

RNP problem formulation

In this section, we formulate the RNP problem using some concepts and metrics from [11], including the connected router, connected client, connected router ratio, and connected client ratio.

Connected router.

The mesh router r_i is a connected router if and only if at least one path exists between it and the gateway router. If we return to the WMN example in Fig 1, we can see that mesh routers r₁, r₂, r₃, r₄ and r₆ are the connected routers but r₅ is not because no path exists from this mesh router to the gateway router.

Connected router ratio (CRR).

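The CRR is defined as the percentage of connected routers in relation to the total number of routers in a WMN, calculated by [11]

$$CRR = \frac{1}{m}\sum_{i=1}^{m}\alpha(r_i) \times 100\% \tag{1}$$

where m is the number of routers in a WMN and α(r_i) is a function that indicates whether router r_i is a connected router or not, defined by

$$\alpha(r_i) = \begin{cases} 1, & \text{if } r_i \text{ is a connected router} \\ 0, & \text{otherwise} \end{cases} \tag{2}$$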
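
As a hedged illustration of the two definitions above, the following sketch marks connected routers with a breadth-first search that starts from the routers in direct range of a gateway, then reports the CRR as a percentage. The helper names and the link rules (router to router within `d_r`, router to gateway within `d_w`) are assumptions for the example, not taken from the paper.

```python
import math
from collections import deque


def connected_routers(routers, gateways, d_r, d_w):
    """Return the list of flags alpha(r_i): 1 if r_i has a path to a gateway."""
    m = len(routers)
    alpha = [0] * m
    queue = deque()
    # Seed the search with routers that reach a gateway directly.
    for i, r in enumerate(routers):
        if any(math.dist(r, w) <= d_w for w in gateways):
            alpha[i] = 1
            queue.append(i)
    # Breadth-first search over router-to-router links.
    while queue:
        i = queue.popleft()
        for j in range(m):
            if not alpha[j] and math.dist(routers[i], routers[j]) <= d_r:
                alpha[j] = 1
                queue.append(j)
    return alpha


def crr(alpha):
    """Connected router ratio in percent."""
    return 100.0 * sum(alpha) / len(alpha)
```

For instance, with a gateway at (0, 0) and routers at (100, 0), (250, 0) and (900, 900), using 200 m for both radii, the second router is connected only through the first one and the third is isolated.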

Connected client.

Mesh client c_i is a connected client if and only if it is covered by at least one connected router. Let β(c_i) be a function that indicates whether client c_i is a connected client, returning 1 if yes and 0 otherwise. Then, β(c_i) is calculated as

$$\beta(c_i) = \begin{cases} 1, & \text{if } \exists\, r_j \in R : \alpha(r_j) = 1 \text{ and } d(c_i, r_j) \le d_r \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

where d_r is the coverage radius of the routers and d(c_i, r_j) is the distance between client c_i and router r_j, given by

$$d(c_i, r_j) = \sqrt{(x_{c_i} - x_{r_j})^2 + (y_{c_i} - y_{r_j})^2} \tag{4}$$

where (x_{c_i}, y_{c_i}) and (x_{r_j}, y_{r_j}) are the coordinates of the client c_i and router r_j, respectively. Considering the example of the WMN in Fig 1, we can observe that the set of connected clients is {c₁, c₃, c₅, c₆, c₇, c₈, c₁₀, c₁₁}. Client c₉ is not a connected client because it is not covered by any mesh router. Clients c₂ and c₄ are covered by router r₅, but they are not connected clients because r₅ is not a connected router.

Connected client ratio (CCR).

The CCR is defined as the percentage of the connected clients in relation to the total number of clients in a WMN, calculated by [11]

$$CCR = \frac{1}{n}\sum_{i=1}^{n}\beta(c_i) \times 100\% \tag{5}$$

where n is the number of mesh clients and β(c_i) is determined according to (3).
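
The connected-client test and the CCR can be sketched as follows. The function names are hypothetical, and the list `alpha` of connected-router flags is assumed to have been computed beforehand (for example, by a breadth-first search from the gateways).

```python
import math


def connected_clients(clients, routers, alpha, d_r):
    """Return beta(c_i) for every client: 1 if a connected router covers it."""
    return [
        1 if any(a and math.dist(c, r) <= d_r for r, a in zip(routers, alpha)) else 0
        for c in clients
    ]


def ccr(beta):
    """Connected client ratio in percent."""
    return 100.0 * sum(beta) / len(beta)
```

A client that lies inside the coverage of an unconnected router only, like c₂ and c₄ in Fig 1, receives the flag 0 because the `a and ...` condition fails for that router.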

Formulate the RNP into a nonlinear programming problem.

The RNP problem in the WMN is stated as follows: Consider a case where it is necessary to design and install a WMN with the following assumptions:

The network system is located in an area of W×H meters.
The number of clients is n, and they are located at a given set of coordinates P_c.
The number of gateway routers is k, they are located at a given set of coordinates P_w, and the coverage radius of each gateway router is d_w.
The number of mesh routers is m, and the coverage radius of each mesh router is d_r.

Find the set of coordinates P_r to place m routers such that CRR and CCR are at their maximum. Thus, the RNP problem can be described as the following nonlinear programming problem:

$$\max_{P_r} \; \{CRR,\; CCR\} \tag{6}$$

subject to the following constraints:

$$0 \le x_{r_i} \le W, \quad \forall r_i \in R \tag{7}$$

$$0 \le y_{r_i} \le H, \quad \forall r_i \in R \tag{8}$$

where (x_{r_i}, y_{r_i}) are the coordinates of mesh router r_i, and W and H are the width and height of the network area, respectively. By solving the nonlinear programming problem with the objective function (6) and the constraints (7) and (8), we find the coordinate set to place m mesh routers in the network area W×H. This nonlinear programming problem can be solved in various ways. Recently, the method of applying approximate optimization algorithms to solve this problem has become popular [8–12]. In this work, we propose a new and effective method to solve this problem using reinforcement learning. The following sections describe this new method in detail.
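
To make the role of the area constraints concrete, the sketch below draws a random candidate placement inside the W×H area and checks that every router respects the bounds. The function names and the seed argument are assumptions introduced for this example.

```python
import random


def random_placement(m, W, H, seed=None):
    """Draw m router coordinates uniformly inside the W x H area."""
    rng = random.Random(seed)
    return [(rng.uniform(0, W), rng.uniform(0, H)) for _ in range(m)]


def is_feasible(P_r, W, H):
    """Check that every router lies within 0 <= x <= W and 0 <= y <= H."""
    return all(0 <= x <= W and 0 <= y <= H for x, y in P_r)
```

Any solver for the problem, whether a metaheuristic or the RL method proposed here, only needs to keep its candidate placements feasible in this sense while it searches for high CRR and CCR.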

RL-based mesh router nodes placement

Fundamentals of RL

RL is a type of machine learning in which the system learns from its past actions to choose better ones in the future. Fig 2 depicts the fundamental principles of RL, in which an agent operates as a learner and interacts with the environment to gain a reward and change the state of the environment. At time t, the agent interacts with the environment through action a_t. As a result, the environment changes from state s_t to state s_{t+1}, and the agent is rewarded with r_t. Based on the rewards acquired in prior learning steps, the agent selects the action that provides the best reward in subsequent steps. The total reward for taking action a_t in state s_t is Q(s_t, a_t), which is typically determined by the Q-learning algorithm as follows [36]:

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_t + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right] \tag{9}$$

where α and γ ∈ [0, 1] are the learning rate and the discount factor, respectively.
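
A minimal tabular sketch of the standard Q-learning update may help here. The dictionary-based Q-table, the function name `q_update`, and the default values of the learning rate and discount factor are assumptions for illustration; they are not the settings used in the experiments.

```python
from collections import defaultdict


def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update to Q[(s, a)] and return the new value."""
    # Best value reachable from the next state; unseen pairs default to 0.
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


# Usage: a Q-table where every unseen (state, action) pair starts at 0.
Q = defaultdict(float)
first = q_update(Q, "s0", "a0", r=1.0, s_next="s1", actions=["a0", "a1"])
```

With an empty table, the first update moves Q("s0", "a0") halfway toward the observed reward, because the learning rate is 0.5 and the estimated future value is still zero.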

Fig 2. Demonstrate the fundamental principles of RL.

https://doi.org/10.1371/journal.pone.0301073.g002

RL has been successfully applied to control protocols in wireless networks, typically routing in WMN [25, 27, 29], topology control in wireless sensor networks [37], and improving the performance of energy-harvesting wireless body area networks [38, 39]. In this paper, we apply RL to solve the RNP in WMN. Details of this new proposal are presented in the following sections.

Solving the RNP in WMN using RL

RL has recently been employed successfully to solve technical challenges in wireless communication such as routing [27, 36], topology management [37], and resource allocation. In this study, we use RL to solve the RNP problem. To the best of our knowledge, this is the first study to use RL to address the RNP problem. To do this, the RNP problem must be modeled as a reinforcement learning model with five characteristic factors: agent, environment, state, action, and reward.

Agent.

An agent is a mesh router that regularly adjusts its coordinates to obtain an optimal topology.

Environment.

In an RL model, the environment is everything that exists around the agent, and it is where the agent acts and interacts. The environment for the RNP problem using RL is the network system, which includes the set of mesh routers, clients, gateway routers, and the network area.

State.

Each state is determined by a triple {P_c, P_r, P_w}, where P_c, P_r and P_w are the sets of coordinates for the mesh clients, mesh routers, and gateway routers, respectively. The sets are listed in Table 1.

Action.

An action is the way in which the agent interacts with the environment to change its state. For the RNP problem using reinforcement learning, the agents are the mesh routers. Each action consists of a mesh router adjusting its coordinates. The set of actions at a specific state s_t for each mesh router r_i is defined as A_t = {mn1s, me1s, ms1s, mw1s, mn2s, me2s, ms2s, mw2s}, where the actions are described in Table 2 and a step is a given distance.
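
The action set can be sketched as a lookup table of coordinate offsets. Table 2 is not reproduced in this text, so the reading used here is an assumption: each name is taken to mean "move north, east, south, or west by one or two steps", with north as the positive y direction. Clamping the result to the network area is a further assumption made to keep the router inside W×H.

```python
# Assumed meaning of the action names: direction (n/e/s/w) and 1 or 2 steps.
ACTIONS = {
    "mn1s": (0, 1), "me1s": (1, 0), "ms1s": (0, -1), "mw1s": (-1, 0),
    "mn2s": (0, 2), "me2s": (2, 0), "ms2s": (0, -2), "mw2s": (-2, 0),
}


def apply_action(pos, action, step, W, H):
    """Return the router position after an action, clamped to the area."""
    dx, dy = ACTIONS[action]
    x = min(max(pos[0] + dx * step, 0), W)
    y = min(max(pos[1] + dy * step, 0), H)
    return (x, y)
```

For example, with a step of 50 m, `apply_action((100, 100), "mn2s", 50, 2000, 2000)` moves the router from (100, 100) to (100, 200).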

Table 2. Actions taken by the mesh routers.

https://doi.org/10.1371/journal.pone.0301073.t002

Reward.

The agent receives a reward for each action with which it interacts with the environment. The agent chooses the next action based on the reward value of past actions, with the goal of eventually achieving the best reward. For the RNP problem using reinforcement learning, we used the objective function defined in (6) as the reward for the learning process. This objective function consists of two metrics: CRR and CCR. To maximize both these metrics, we define the reward function as follows:

$$RW(r_i, s_t, a_t) = \lambda \cdot CRR_t + (1 - \lambda) \cdot CCR_t \tag{10}$$

where RW(r_i, s_t, a_t) is the reward obtained when the mesh router r_i performs the action a_t ∈ A_t at state s_t, and CRR_t and CCR_t are the connected router ratio and the connected client ratio at state s_t, calculated according to (1) and (5), respectively. λ is a coefficient in the range [0, 1] that controls the relative weight of the two metrics. In this study, the Q-learning algorithm is used to update the total reward each time a mesh router performs an action. Let Q(r_i, s_t, a_t) be the total reward received after the mesh router r_i performs the action a_t ∈ A_t at state s_t; then Q(r_i, s_t, a_t) is given by

$$Q(r_i, s_t, a_t) \leftarrow Q(r_i, s_t, a_t) + \alpha \left[ RW(r_i, s_t, a_t) + \gamma \max_{a \in A_{t+1}} Q(r_i, s_{t+1}, a) - Q(r_i, s_t, a_t) \right] \tag{11}$$

where α and γ ∈ [0, 1] are the learning rate and the discount factor, respectively.
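
The effect of the coefficient λ can be illustrated with a small sketch. It assumes that the reward is a convex combination of the two ratios, which is one natural reading of a coefficient in [0, 1] that balances CRR and CCR; the function name is hypothetical.

```python
def reward(crr_t, ccr_t, lam):
    """Weighted reward (assumed form): lam * CRR_t + (1 - lam) * CCR_t."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * crr_t + (1.0 - lam) * ccr_t
```

Under this assumption, λ = 1 rewards router connectivity only, λ = 0 rewards client coverage only, and λ = 0.5 weights both equally, so a state with CRR = 80% and CCR = 90% yields a reward of 85.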

RL algorithm for solving RNP problem.

Algorithm 1 is the pseudo-code of the RL algorithm for solving the RNP problem in the WMN. First, m routers are placed at random coordinates in a network area of W × H [m] (step 1). In each learning iteration, a mesh router r_i is randomly selected from set R to perform an action in set A_t. The policy for selecting an action a_t,k in set A_t is ε-greedy, as in [37]. Under this policy, the mesh router r_i chooses the action a_t at state s_t with a high probability of 1 − ε if its Q(r_i, s_t, a_t) value is the maximum. The remaining actions in set A_t are chosen with an equally low probability ε (step 7), where ε is set to 0.1, as in [37]. Let π(r_i, s_t, a_t,k) be the probability that the mesh router r_i chooses action a_t,k at state s_t. According to the ε-greedy policy, this probability is given by [37] (12) where |A_t| denotes the size of the set A_t, that is, the number of actions that the mesh router r_i can select.
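
One possible reading of the ε-greedy rule described above is sketched below: the action with the largest Q value is taken with probability 1 − ε, and otherwise one of the remaining actions is drawn uniformly. This is an assumption about the exact probabilities, since other ε-greedy variants also let the exploratory draw include the greedy action; the function name and arguments are hypothetical.

```python
import random


def epsilon_greedy(q_values, epsilon=0.1, rng=random):
    """Pick an action from a dict {action: Q value} with an epsilon-greedy rule."""
    best = max(q_values, key=q_values.get)
    others = [a for a in q_values if a != best]
    # Explore: with probability epsilon, pick uniformly among the non-greedy actions.
    if others and rng.random() < epsilon:
        return rng.choice(others)
    # Exploit: otherwise take the action with the largest Q value.
    return best
```

With ε = 0.1, as used in this study, roughly nine out of ten selections follow the current best estimate, while the rest keep exploring alternative moves.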

Algorithm 1 The pseudo-code of the reinforcement learning algorithm for solving RNP problem

Input:

Network area (W × H);
The set of mesh clients (C = {c_i|i = 1..n}), and the set of their coordinates (P_c);
The set of gateway routers (W = {w_i|i = 1..k}), and the set of their coordinates (P_w);
The set of mesh routers (R = {r_i|i = 1..m}), and the coverage radius of each mesh router (d_r);

Output: The set of the best coordinates of m mesh routers: P_r

Method:

1: Place m mesh routers at the coordinates (x_{r_i}, y_{r_i}), where x_{r_i} and y_{r_i} are random values in the area W × H;

2: while (learn ≤ numLearn) do

3: Randomly choose mesh router r_i ∈ R;

4: for (each action a_t,j ∈ A_t) do

5: Update Q(r_i, s_t, a_t,j) using (11);

6: end for

7: Choose the action a_t,k ∈ A_t using policy derived from Q-values (e.g., ε -greedy) according to (12);

8: Take action a_t,k, observe reward RW(r_i, s_t, a_t,k) and next state s_t+1;

9: Update next state (s_t+1) to current state (s_t);

10: learn ← learn + 1;

11: end while

12: P_r ← P_r in state s_t;
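
The steps of Algorithm 1 can be turned into a compact, runnable sketch. It is not the authors' code and rests on several assumptions that are labeled here and in the comments: the reward is taken as λ·CRR + (1 − λ)·CCR with both ratios as fractions, two routers are linked when their distance is at most d_r, a router reaches a gateway within d_w, the eight actions are read as moves of one or two steps in the four compass directions, positions are clamped to the area, and the Q-table is keyed by (router index, router position, action) instead of the full network state to keep the table small. All default parameter values are placeholders.

```python
import math
import random
from collections import defaultdict, deque

MOVES = {"mn1s": (0, 1), "me1s": (1, 0), "ms1s": (0, -1), "mw1s": (-1, 0),
         "mn2s": (0, 2), "me2s": (2, 0), "ms2s": (0, -2), "mw2s": (-2, 0)}


def reward(routers, clients, gateways, d_r, d_w, lam):
    """Assumed reward: lam * CRR + (1 - lam) * CCR, as fractions in [0, 1]."""
    m = len(routers)
    conn = [any(math.dist(r, w) <= d_w for w in gateways) for r in routers]
    queue = deque(i for i in range(m) if conn[i])
    while queue:  # breadth-first search over router-to-router links
        i = queue.popleft()
        for j in range(m):
            if not conn[j] and math.dist(routers[i], routers[j]) <= d_r:
                conn[j] = True
                queue.append(j)
    covered = sum(any(a and math.dist(c, r) <= d_r for r, a in zip(routers, conn))
                  for c in clients)
    return lam * sum(conn) / m + (1 - lam) * covered / len(clients)


def move(pos, action, step, W, H):
    """Shift a router by the action offset and clamp it to the area."""
    dx, dy = MOVES[action]
    return (min(max(pos[0] + dx * step, 0), W), min(max(pos[1] + dy * step, 0), H))


def place_routers(clients, gateways, m, W, H, d_r, d_w, num_learn=2000,
                  step=25, lam=0.5, lr=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    routers = [(rng.uniform(0, W), rng.uniform(0, H)) for _ in range(m)]  # step 1
    Q = defaultdict(float)  # simplified key: (router index, position, action)
    for _ in range(num_learn):                       # step 2
        i = rng.randrange(m)                         # step 3
        pos = routers[i]
        for a in MOVES:                              # steps 4-6: update Q per action
            nxt = move(pos, a, step, W, H)
            trial = routers[:i] + [nxt] + routers[i + 1:]
            rw = reward(trial, clients, gateways, d_r, d_w, lam)
            best_next = max(Q[(i, nxt, b)] for b in MOVES)
            Q[(i, pos, a)] += lr * (rw + gamma * best_next - Q[(i, pos, a)])
        best = max(MOVES, key=lambda a: Q[(i, pos, a)])  # step 7: epsilon-greedy
        others = [a for a in MOVES if a != best]
        chosen = rng.choice(others) if rng.random() < eps else best
        routers[i] = move(pos, chosen, step, W, H)   # steps 8-9: act, advance state
    return routers                                   # step 12: final coordinates
```

A call such as `place_routers(clients, [(250, 250)], m=5, W=500, H=500, d_r=150, d_w=150)` returns five router coordinates inside the 500 × 500 area. Like the pseudo-code, the sketch returns the placement of the last state and does not track the best placement seen during learning.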

Analyze the computational complexity.

The computational complexity of Algorithm 1 depends mainly on the iteration in Step (2), the number of possible actions in Step (4), and the algorithm for updating the Q value in Step (5). Q(r_i, s_t, a_t,j) is updated using Eq. (11), where the greatest complexity is the calculation of RW(r_i, s_t, a_t) according to (10). RW(r_i, s_t, a_t) contains two metrics, CRR and CCR, which are defined by (1) and (5), respectively. To determine CRR, we employed a breadth-first search algorithm on a network of m vertices, which is the number of mesh routers. Therefore, the computational complexity was O(m²). The CCR is calculated using two nested loops of sizes m and n, where n is the number of mesh clients. Therefore, the complexity was O(m × n). Because n is always greater than m in a WMN, the computational complexity of RW(r_i, s_t, a_t) is O(m × n). Consequently, the computational complexity of Algorithm 1 is O(I × |A| × m × n), where I is the number of iterations and |A| is the number of possible actions.
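
As a rough, hedged illustration of this bound, the sketch below counts the pairwise distance checks implied by the analysis: m² for the breadth-first search plus m × n for the client coverage test, repeated for every action in every iteration. The iteration count used in the example is hypothetical and is not a setting from the experiments.

```python
def reward_cost(m, n):
    """Pairwise checks for one reward evaluation: BFS (m*m) plus coverage (m*n)."""
    return m * m + m * n


def total_cost(iterations, num_actions, m, n):
    """Checks over a whole run: iterations x actions x reward cost."""
    return iterations * num_actions * reward_cost(m, n)


# Example: 30 routers, 350 clients, 8 actions, and a hypothetical 1000 iterations.
example = total_cost(1000, 8, 30, 350)
```

For 30 routers and 350 clients, the m × n term (10,500 checks) dominates the m² term (900 checks), which is why the overall bound is stated in terms of m × n.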

The computational complexity of Algorithm 1 is greater than that of the algorithms solving the RNP problem using GA [40], PSO [24], and WOA [41], which we compare in the following section. However, because its computational complexity is polynomial, it can be implemented in practice. Furthermore, because the algorithms for solving the RNP problem are run offline, the polynomial complexity is acceptable.

Simulation results and discussion

Simulation scenarios

The performance of the proposed method was evaluated through a simulation using Python. Our proposed method is compared with the most recent methods that use approximate optimization algorithms to address the RNP problem, including GA [40], PSO [24], WOA [41], and MVO [11]. All experiments were run on a 3.6 GHz Core i7 CPU computer. The surveyed network instances (NI) are presented in Table 3. NI-1 and NI-2 were used to investigate the effect of the number of mesh routers on the network performance, with the number of mesh routers ranging from 20 to 45 covering 150 mesh clients (NI-1) and 350 mesh clients (NI-2). NI-3 and NI-4 were used to study the effect of client density, varying from 100 to 400. In NI-5 and NI-6, the effect of the coverage radius of each mesh router was examined. The final two NIs were used to investigate the influence of the network area. The parameters of the simulation scenarios and algorithms are presented in Table 4, where the parameters of the GA, PSO, WOA, and MVO are set as in [11].

Table 3. Network instances used for validating our proposed method.

https://doi.org/10.1371/journal.pone.0301073.t003

Table 4. The parameters of algorithms and scenarios.

https://doi.org/10.1371/journal.pone.0301073.t004

Simulation results

Topology evaluation.

First, we evaluate the topology obtained when solving the RNP problem using the GA, PSO, WOA, MVO, and our proposed method, which employs reinforcement learning. The results in Fig 3 clearly show topological differences between the methods. These findings were obtained using NI-2 with 30 mesh routers covering 350 mesh clients in an area of 2000 × 2000 [m²] and a coverage radius of 200 [m] for each mesh router. We can observe that the method using RL provides the best topology compared with the methods using the approximate optimization algorithms GA, WOA, PSO, and MVO. Specifically, for the method using reinforcement learning, 334 mesh clients are covered by at least one mesh router, corresponding to a rate of 95.43%. These values were 292 (83.43%), 309 (88.29%), 313 (89.43%), and 311 (88.86%) for the WOA, GA, PSO, and MVO algorithms, respectively. In addition, the topology of the reinforcement learning method has a wider coverage area than the other methods, which can increase the percentage of clients covered in the case of denser clients.

Fig 3. Compare WMN topologies using different router node placement algorithms.

(a) WOA, (b) GA, (c) PSO, (d) MVO, and (e) reinforcement learning.

https://doi.org/10.1371/journal.pone.0301073.g003

Impact of mesh router density.

In this section, the impact of the mesh router density on network performance is investigated using various simulation scenarios. We use the most important metric often used to evaluate the performance of RNP problem solving methods, that is, network connectivity (NC). In our context, the NC is calculated as (13) where α(r_i) and β(c_j) are determined according to (2) and (3), respectively, m and n represent the number of mesh routers and mesh clients, respectively.

The results in Fig 4 clearly show the difference in network connectivity between the proposed method and the methods using approximate optimization algorithms. These findings were obtained using NI-1, in which the number of mesh routers varied from 20 to 45, covering 150 mesh clients in an area of 2000 × 2000 [m²] with a coverage radius of 200 [m] for each mesh router. We can observe that the NC increases with the number of mesh routers for all methods. This is expected because, as the number of mesh routers increases, the coverage area expands, increasing the probability of mesh clients being covered. Comparing the methods of solving the RNP problem, the method using RL (labeled RL-based RNP in the legend) gives the highest NC. For example, in the case of 35 mesh routers, the NC values of the methods using the WOA, PSO, GA, MVO, and RL are 85.64%, 87.42%, 90.67%, 93.42%, and 95.68%, respectively. Thus, compared with the methods using the WOA, PSO, GA, and MVO algorithms, the proposed method improved NC by 10.03%, 8.25%, 5.01%, and 2.25%, respectively. This is a significant result in improving WMN performance.

Fig 4. Evaluate the network connectivity versus the number of mesh routers using NI-1.

https://doi.org/10.1371/journal.pone.0301073.g004

The results obtained were quite similar for the implementation on NI-2, as shown in Fig 5. The assumptions of this simulation scenario are the same as those in NI-1, except that the number of mesh clients increases to 350. We can observe that the proposed method provides high efficiency in terms of NC for most values of the number of mesh routers. The NC of the method using RL is higher by an average of 4 to 20% compared with the cases where approximate optimization algorithms are used. For example, with 35 mesh routers, the NC of the RL method is 98.71%, whereas the values for the WOA, PSO, GA, and MVO algorithms are 81.66%, 86.89%, 88.59%, and 94.44%, respectively. Thus, the method using RL improved the NC by 4.26% to 17.04%.

Fig 5. Evaluate the network connectivity versus the number of mesh routers using NI-2.

https://doi.org/10.1371/journal.pone.0301073.g005

Based on the findings in Figs 4 and 5, we can conclude that changing the number of mesh routers affects network performance in terms of NC. The larger the number of routers, the higher the NC for all investigated RNP problem solving methods. In particular, the method based on RL is the most efficient.

Impact of mesh client density.

In this section, we investigate the effect of client density on network performance. In a WMN, the denser the clients, the greater the number of connection requests to the routers, which affects network performance. This is evident in Fig 6, where we plot NC as a function of the number of mesh clients. These results are obtained by executing NI-3, where the number of mesh routers is 30, covering 150 to 300 mesh clients. We can observe that the method using RL always yields the highest NC regardless of whether the client density is sparse or dense. The NC value of this method ranges from 90.43% to 95.79%. Meanwhile, the NC values for the WOA, PSO, GA, and MVO algorithms range from 74.59% to 84.08%, from 77.00% to 85.63%, from 82.32% to 91.46%, and from 88.27% to 90.83%, respectively. When 45 mesh routers were used (NI-4), the NC value increased for all methods. This is clearly shown in Fig 7, where we plot NC versus the number of mesh clients. Comparing the methods, we find that the method using RL outperforms the methods using approximate optimization algorithms in terms of NC.

Fig 6. Evaluate the network connectivity versus the number of mesh clients using NI-3.

https://doi.org/10.1371/journal.pone.0301073.g006

Fig 7. Evaluate the network connectivity versus the number of mesh clients using NI-4.

https://doi.org/10.1371/journal.pone.0301073.g007

Impact of the coverage radius of mesh routers.

The coverage radius of the mesh routers is another technological parameter that has a considerable impact on the WMN performance. In this section, we investigate the effect of this parameter on the NC metric. The results in Fig 8 clearly show the change in NC with respect to the coverage radius of the mesh routers. These results were obtained using NI-5, which has 30 mesh routers and 150 mesh clients. The coverage radius of each router ranged from 150 to 300 [m]. The plots in Fig 8 indicate that the NC increases with the coverage radius of the mesh routers. This is because expanding the coverage radius increases the likelihood that clients will be covered. In particular, the method using RL yielded the highest NC, reaching close to 100% when the coverage radius was 250 [m] or more. The results are similar for NI-6, as shown in Fig 9. The NC values of this NI are greater than those of NI-5 because it uses more mesh routers. As in the previous scenarios, the method using RL always yields the highest NC.

Fig 8. Evaluate the network connectivity versus the coverage radius of mesh routers using NI-5.

https://doi.org/10.1371/journal.pone.0301073.g008

Fig 9. Evaluate the network connectivity versus the coverage radius of mesh routers using NI-6.

https://doi.org/10.1371/journal.pone.0301073.g009

Impact of network area.

Finally, we investigate the effect of the network area on the efficiency of RNP problem solving methods. Figs 10 and 11 show the results obtained by executing NI-7 and NI-8, respectively. In these NIs, the network area varies from 2000 × 2000 [m²] to 3000 × 3000 [m²]. The NC value decreases as the network area grows for all the algorithms. This is because, for a given number of mesh routers, the larger the network area, the lower the percentage of area covered, leading to a decrease in the NC value. However, the NC value of the method using RL is always the largest.
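
The geometric argument can be checked with a short calculation. The sketch below computes an upper bound on the fraction of the area that m routers can cover, assuming circular coverage regions with no overlap and no loss at the borders. The parameter values in the example (30 routers with a 200 m radius) are hypothetical and are not necessarily those of NI-7 or NI-8.

```python
import math


def coverage_upper_bound(m, d_r, W, H):
    """Upper bound on the covered fraction: total disc area over W x H, capped at 1."""
    return min(1.0, m * math.pi * d_r ** 2 / (W * H))


# Hypothetical example: 30 routers with a 200 m radius in two different areas.
small_area = coverage_upper_bound(30, 200, 2000, 2000)
large_area = coverage_upper_bound(30, 200, 3000, 3000)
```

Under these assumptions, the bound falls from about 0.94 for the 2000 × 2000 area to about 0.42 for the 3000 × 3000 area, so even an ideal placement cannot keep the same share of the area covered when the area grows.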

Fig 10. Evaluate the network connectivity versus network area using NI-7.

https://doi.org/10.1371/journal.pone.0301073.g010

Fig 11. Evaluate the network connectivity versus network area using NI-8.

https://doi.org/10.1371/journal.pone.0301073.g011

Based on the above findings, we can conclude that the proposed method, which uses reinforcement learning to solve the RNP problem, is more efficient than the methods that use approximate optimization algorithms. This is a crucial result in the design and implementation of a WMN, which helps find an optimal network topology to exploit network resources more efficiently.

Conclusion

The placement of router nodes in wireless mesh networks is a significant problem that has recently attracted the interest of several research groups. This problem is recognized as NP-hard and cannot be solved efficiently using conventional exact algorithms. In this study, we proposed a new and effective method for solving this problem using RL. The process of finding the optimal coordinates for placing mesh routers is modeled as an RL model whose main components, namely the environment, agent, action, and reward, correspond to the network system, routers, coordinate adjustment, and network connectivity of the RNP problem, respectively. Simulation results show that our proposed method outperforms the most recent methods in terms of coverage and network connectivity.

In future work, we will continue to develop this method by considering additional constraints on the quality of transmission and load balancing to improve network performance. In addition, the deep reinforcement learning method can also be applied to static and dynamic RNP problems to further improve the performance of the WMN.

Supporting information

S1 Dataset.

https://doi.org/10.1371/journal.pone.0301073.s001

(ZIP)

References

1. Zhang Y, Luo J, Hu H. Wireless Mesh Networking—Architectures, Protocols and Standards. Taylor & Francis Group, LLC; 2007.
2. Akyildiz IF, Wang X. Wireless Mesh Networks. John Wiley & Sons Ltd; 2009.
3. Pragasen Mudali MOA. Context-Based Topology Control for Wireless Mesh Networks. Mobile Information Systems;2016:16 pages.
4. Aron FO, Olwal TO, Kurien A, Odhiambo MO. Energy Efficient Topology Control Algorithm for Wireless Mesh Networks. In: 2008 International Wireless Communications and Mobile Computing Conference; 2008. p. 135–140.
5. Vázquez-Rodas A, de la Cruz Llopis LJ. A centrality-based topology control protocol for wireless mesh networks. Ad Hoc Networks. 2015;24:34–54.
6. Yang L, Quan L. A Topology Control Algorithm Using Power Control for Wireless Mesh Network. In: 2011 Third International Conference on Multimedia Information Networking and Security; 2011. p. 141–145.
7. Binh LH, Duong TVT, Ngo VM. TFACR: A Novel Topology Control Algorithm for Improving 5G-based MANET Performance by Flexibly Adjusting the Coverage Radius. IEEE Access. 2023;11:105734–105748.
8. Taleb SM, Meraihi Y, Gabis AB, Mirjalili S, Zaguia A, Ramdane-Cherif A. Solving the mesh router nodes placement in wireless mesh networks using coyote optimization algorithm. IEEE Access. 2022; p. 1–1.
9. Nouri N, Aliouat Z, Naouri A, Hassak S. Accelerated PSO algorithm applied to clients coverage and routers connectivity in wireless mesh networks. Journal of Ambient Intelligence and Humanized Computing. 2021.
10. Sayad L, Bouallouche-Medjkoune L, Aissani D. A Chemical Reaction Algorithm to Solve the Router Node Placement in Wireless Mesh Networks. Mob Netw Appl. 2020;25(5):1915–1928.
11. Binh LH, Truong TK. An Efficient Method for Solving Router Placement Problem in Wireless Mesh Networks Using Multi-Verse Optimizer Algorithm. Sensors. 2022;22(15). pmid:35897998
12. Mekhmoukh Taleb S, Meraihi Y, Benmessaoud Gabis A, Mirjalili S, Ramdane-Cherif A. Nodes placement in wireless mesh networks using optimization approaches: a survey. Neural Computing and Applications. 2022;34.
13. Amaldi E, Capone A, Cesana M, Filippini I, Malucelli F. Optimization Models and Methods for Planning Wireless Mesh Networks. Computer Networks. 2008;52:2159–2171.
14. Xhafa F, Sánchez C, Barolli A, Takizawa M. Solving mesh router nodes placement problem in Wireless Mesh Networks by Tabu Search algorithm. Journal of Computer and System Sciences. 2015;81:1417–1428.
15. Bello OM, Taiwe KD. Mesh Node Placement in Wireless Mesh Network Based on Multiobjective Evolutionary Metaheuristic. In: Proceedings of the International Conference on Internet of Things and Cloud Computing. ICC’16. New York, NY, USA: Association for Computing Machinery; 2016. Available from: https://doi.org/10.1145/2896387.2896444.
16. Sayad L, Bouallouche-Medjkoune L, Aïssani D. A Simulated Annealing Algorithm for the placement of Dynamic Mesh Routers in a Wireless Mesh Network with Mobile Clients. Internet Technology Letters. 2018;1:e35.
17. Xhafa F, Barolli A, Sánchez C, Barolli L. A simulated annealing algorithm for router nodes placement problem in Wireless Mesh Networks. Simulation Modelling Practice and Theory. 2011;19(10):2276–2284.
18. Hamdi M, Mhiri S. Dynamic mesh router placement for connectivity maximization in wireless mesh networks; 2015. p. 1–6.
19. Lin CC. Dynamic router node placement in wireless mesh networks: A PSO approach with constriction coefficient and its convergence analysis. Information Sciences. 2013;232:294–308.
20. Sayad L. Optimal placement of mesh routers in a wireless mesh network with mobile mesh clients using simulated annealing. In: 2017 5th International Symposium on Computational and Business Intelligence (ISCBI); 2017. p. 45–49.
21. Rezaei M, Sarram M, Derhami V, Sarvestani H. Novel Placement Mesh Router Approach for Wireless Mesh Network. 2012.
22. Seetha S, Anand John Francis S, Grace Mary Kanaga E. Optimal Placement Techniques of Mesh Router Nodes in Wireless Mesh Networks. In: Haldorai A, Ramu A, Mohanram S, Chen MY, editors. 2nd EAI International Conference on Big Data Innovation for Sustainable Cognitive Computing. Cham: Springer International Publishing; 2021. p. 217–226.
23. Lin CC, Chen TH, Jhong SY. Wireless mesh router placement with constraints of gateway positions and QoS. In: 2015 11th International Conference on Heterogeneous Networking for Quality, Reliability, Security and Robustness (QSHINE); 2015. p. 72–74.
24. Lin CC, Tseng PT, Wu TY, Deng DJ. Social-aware dynamic router node placement in wireless mesh networks. Wireless Networks. 2015;22.
25. Binh LH, Duong TVT. Load balancing routing under constraints of quality of transmission in mesh wireless network based on software defined networking. Journal of Communications and Networks. 2021;23(1):12–22.
26. Lahsen-Cherif I, Zitoune L, Veque V. Energy Efficient Routing for Wireless Mesh Networks with Directional Antennas: When Q-learning meets Ant systems. Ad Hoc Networks. 2021;121:102589.
27. Binh LH, Duong TVT. An improved method of AODV routing protocol using reinforcement learning for ensuring QoS in 5G-based mobile ad-hoc networks. ICT Express. 2023. https://doi.org/10.1016/j.icte.2023.07.002
28. Dai L, Xue Y, Chang B, Cao Y, Cui Y. Optimal Routing for Wireless Mesh Networks With Dynamic Traffic Demand. Mobile Networks and Applications. 2008;13:97–116.
29. Duong TVT, Binh LH, Ngo VM. Reinforcement learning for QoS-guaranteed intelligent routing in Wireless Mesh Networks with heavy traffic load. ICT Express. 2022;8(1):18–24.
30. Ding R, Xu Y, Gao F, Shen X, Wu W. Deep Reinforcement Learning for Router Selection in Network With Heavy Traffic. IEEE Access. 2019;7:37109–37120.
31. Raschellà A, Bouhafs F, Mackay M, Shi Q, Ortin J, Gallego JR, et al. A Dynamic Access Point Allocation Algorithm for Dense Wireless LANs Using Potential Game. Computer Networks. 2019;167:106991.
32. Kumar G, Chigarapalle S. A Study on Access Point Selection Algorithms in Wireless Mesh Networks. International Journal of Advanced Networking and Applications. 2014;6:2158–2167.
33. Kim MS, Kim Y, Lee SS, Lee S, Golmie N. A user application-based access point selection algorithm for dense WLANs. PLoS ONE. 2019;14. pmid:30650150
34. Mahmoud T, Girgis M, Abdullatif B, Sayed A. Solving the Wireless Mesh Network Design Problem using Genetic Algorithm and Simulated Annealing Optimization Methods. International Journal of Computer Applications. 2014;96:1–10.
35. Mekhmoukh Taleb S, Meraihi Y, Mirjalili S, Acheli D, Ramdane-Cherif A, Benmessaoud Gabis A. Mesh Router Nodes Placement for Wireless Mesh Networks Based on an Enhanced Moth–Flame Optimization Algorithm. Mobile Networks and Applications. 2023. https://doi.org/10.1007/s11036-022-02059-6
36. Duong TVT, Binh LH. IRSML: An intelligent routing algorithm based on machine learning in software defined wireless networking. ETRI Journal. 2022;44:733–745.
37. Le T, Moh S. An Energy-Efficient Topology Control Algorithm Based on Reinforcement Learning for Wireless Sensor Networks. International Journal of Control and Automation. 2017;10:233–244.
38. Mohammadi R, Shirmohammadi Z. RLS2: An energy efficient reinforcement learning-based sleep scheduling for energy harvesting WBANs. Computer Networks. 2023;229:109781.
39. Mohammadi R, Shirmohammadi Z. DRDC: Deep reinforcement learning based duty cycle for energy harvesting body sensor node. Energy Reports. 2023;9:1707–1719.
40. Oda T, Elmazi D, Barolli A, Sakamoto S, Barolli L, Xhafa F. A genetic algorithm-based system for wireless mesh networks: analysis of system data considering different routing protocols and architectures. Soft Computing. 2015;20.
41. Mirjalili S, Lewis A. The whale optimization algorithm. Advances in engineering software. 2016;95:51–67.

