A distributed differential game approach to trajectory planning for offshore wind farm inspection

Yunqi Liao; Shuyuan You; Houmin Wang; Siming Yu; Wenyan Xue

doi:10.1371/journal.pone.0344989

Abstract

To address the complex challenges associated with cooperative inspection of offshore wind farms by multiple unmanned aerial vehicles (multi-UAV), including limited sensing and communication ranges, constrained battery capacity, and round-trip mission requirements, this paper introduces an optimal coordinated trajectory planning method for multi-UAV systems based on a distributed differential game (DDG) framework. The approach explicitly accounts for energy consumption, incorporating round-trip requirements into a game-theoretic objective function to facilitate energy-aware trajectory planning. Each UAV operates based solely on local information from neighboring UAVs, enabling distributed decision-making that ensures collision-free coordination while optimizing global inspection time and overall energy efficiency. Theoretical analysis confirms that the proposed strategy converges to a global Nash equilibrium (G-NE), which ensures system-level coordination optimality subject to round-trip and energy constraints. Simulation results demonstrate that the method significantly enhances inspection efficiency and reduces task completion time compared to conventional approaches, while guaranteeing the safe return of all UAVs.

Citation: Liao Y, You S, Wang H, Yu S, Xue W (2026) A distributed differential game approach to trajectory planning for offshore wind farm inspection. PLoS One 21(3): e0344989. https://doi.org/10.1371/journal.pone.0344989

Editor: Tri-Hai Nguyen, Van Lang University: Truong Dai hoc Van Lang, VIET NAM

Received: October 8, 2025; Accepted: March 1, 2026; Published: March 26, 2026

Copyright: © 2026 Liao et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data for this study are within the paper and publicly available from the GitHub repository: (https://github.com/xuewenyan6-debug/Wenyan-Xue).

Funding: This work is supported by the funding from the Efficient space-time coordination of swarm aircraft (No. 360302022401, Funded Author: Wenyan Xue); The Zhanjiang Non-funded Science and Technology Research Project (2025B01076, Funded Author: Wenyan Xue). The Research on Motion Control Mechanism and Regulation Strategy for Brain Computer Interface (No. 360302042406, Funded Author: Houmin Wang). The 2025 University Student Innovation Training Program (CXXL2025259, Funded Author: Yunqi Liao).

Competing interests: The authors have declared that no competing interests exist.

Introduction

The field of offshore wind turbine inspection has witnessed a paradigm shift, moving from conventional methods toward intelligent solutions powered by unmanned aerial vehicles (UAVs) [1,2]. Early trajectory planning methods were largely guided by a distance-based nearest-first principle, which disregarded essential environmental factors such as direction and consequently led to poor energy efficiency and suboptimal inspection outcomes [3]. To address this limitation, a value evaluation function incorporating parameters such as positional altitude, average wind speed, and wind direction, along with an improved consensus-based bundle algorithm, was introduced, markedly improving the rationality of trajectory planning [4]. However, this model does not account for the energy constraint associated with the UAVs’ return flight [5]. To reduce energy consumption in multi-UAV systems, Ref. [6] introduced a genetic algorithm-based dynamic zoning strategy (GA-DZ), which optimizes UAV trajectories by minimizing the total flight distance, thereby implicitly enhancing energy efficiency. However, this approach suffers from limited adaptability and does not account for the impact of battery power on the return journey [7]. To address these limitations, multi-agent reinforcement learning has emerged as a promising alternative. For example, Ref. [8] applied such methods to enable agents to learn cooperative policies through extensive environmental interactions. Similarly, Ref. [9] combined convolutional neural networks with deep reinforcement learning (NN-DRL), creating an interactive mechanism between environmental perception and policy learning that considerably increased mission completion rates. Nevertheless, the challenge of aligning local optimization with global efficiency remains only partially resolved [10].

Distributed optimization methods, such as those based on consensus algorithms [11], have been proposed to alleviate reliance on a central coordinator, yet they often require iterative communication and may not explicitly account for dynamic collision avoidance or energy constraints in time-critical missions. Furthermore, centralized optimization techniques, including mixed-integer linear programming, can generate optimal trajectories by solving a global optimization problem [12]. Nonetheless, their inherent dependency on a central coordinator and perfect global information makes them vulnerable to single points of failure and communication bottlenecks, which are common challenges in offshore settings. Recent efforts have explored hybrid centralized-distributed architectures [13] to balance optimality and robustness, but the fundamental issue of guaranteeing global performance with strictly local interactions persists. While effective in mitigating coordination conflicts, these approaches are often validated only empirically and lack rigorous theoretical guarantees of optimality [14]. Alternatively, distributed model predictive control (DMPC) has been applied to handle unexpected changes in dynamic environments [15]. It synthesizes a state‑feedback control strategy using a receding‑horizon scheme. While DMPC can iteratively seek locally optimal solutions at each sampling instant, its focus on algorithmic optimization often prioritizes certain global objectives at the expense of individual UAV performance, which may lead to extended overall mission durations.

To bridge this gap, differential game theory has emerged as a promising framework for modeling multi-agent strategic interactions under dynamics constraints. It has been applied to various multi-robot coordination problems, such as formation control [16] and airborne conflict resolution [17]. This framework reconciles individual and collective objectives from a game equilibrium perspective, providing a theoretical foundation for analyzing system-level outcomes. Specific to distributed settings, recent work has investigated graphical differential games for networked systems with limited communication [18,19], establishing convergence to local Nash equilibria (L-NE) under certain connectivity conditions. Building upon this, Refs. [20,21] advanced multi-player differential game solutions via a framework merging distributed optimal control with game theory, specifically addressing collision avoidance.

However, despite these advances, the direct application of existing differential game formulations to optimal trajectory planning for offshore wind turbine inspection poses distinct challenges, particularly under the stringent constraints of limited sensing, limited communication, and, most critically, finite energy for round-trip missions. The main obstacles are as follows: 1) The NEs derived in many generic multi-agent games, or even in existing offshore wind inspection scenarios, do not effectively enhance operational efficiency for battery-constrained UAVs, as round-trip energy constraints are seldom incorporated into the cost function design [22,23]. 2) Many theoretical differential game solutions assume perfect or periodic global information exchange [24], an assumption often invalid in practical offshore wind scenarios, where limited and unreliable communication links restrict information exchange to a local neighborhood. The issue of scalability and performance under imperfect communication becomes more pronounced as the number of UAVs increases and operational conditions grow more complex [25].

In summary, GA-DZ [6] optimizes flight distance (and thus implicitly reduces energy consumption) but lacks explicit safety and return-trip constraints; NN-DRL [9] learns adaptive policies but offers no guarantee of global optimality and is prone to local optima; and DMPC [15] can handle dynamic disruptions but often sacrifices inspection efficiency because of its local optimization nature. Furthermore, existing centralized differential game approaches [19,24] are not directly applicable to offshore wind farm inspection scenarios under communication constraints.

To overcome the above limitations, this paper proposes a novel DDG method, which provides a theoretically guaranteed globally optimal solution. The key contributions are summarized as follows:

1) Compared to the GA-DZ method [6], which requires re-iteration and thus suffers from reduced real-time performance in dynamic maritime environments, and which optimizes for distance while lacking return-trip constraints, the proposed DDG framework explicitly models round-trip energy constraints and local communication limitations, thereby significantly improving task completion efficiency while ensuring safety and global convergence.
2) Unlike the NN-DRL approach [9], which learns adaptive strategies yet cannot ensure global optimality and is prone to local optima, the proposed DDG framework provides theoretically guaranteed convergence from an L-NE to a G-NE for all UAVs. This overcomes the key limitations of learning‑based strategies, particularly their lack of theoretical interpretability and convergence assurance.
3) In contrast to DMPC [15], which relies on iterative algorithmic optimization to obtain locally optimal solutions at each sampling instant, the proposed DDG method is grounded in game theory and explicitly models strategic interactions among UAVs, driving the system toward an NE. Simulations further confirm the superior inspection efficiency of the proposed DDG method over prevailing trajectory planning methods in offshore wind farm applications [6,9,15].

Preliminaries

A comprehensive list of variables and parameters is provided in Table 1.

Table 1. Nomenclature.

https://doi.org/10.1371/journal.pone.0344989.t001

The problem description for offshore wind power inspection

The description of inspection

The workflow for inspecting offshore wind turbines using a multi-UAV system is illustrated below [26]:

Task allocation: Inspection tasks are formulated by the control center based on a comprehensive assessment of turbine conditions and meteorological information, distributing them via a cloud platform to specify detection targets and priorities.
Coordinated control and data collection: During autonomous flight and data collection, UAVs operate within a coordinated control framework that harmonizes global objectives (thorough inspection of wind turbine components) with local goals (collision avoidance and inter-UAV safety). Under constraints including limited communication range and energy capacity, trajectory planning is optimized to minimize task completion time while ensuring complete and accurate data acquisition.
Data processing and alerting: Collected data are transmitted in real time for AI-based analysis to identify anomalies. Alerts are generated and pushed to maintenance terminals for rapid decision-making.
Return and recharging: After task completion, UAVs autonomously return to base, execute precise landing, automatically recharge, and backup data for subsequent missions.

Remark 1. This study focuses on collaborative control of UAV clusters, excluding subsequent maintenance processes.

The modelling of UAV

This paper employs a quadrotor model to address the inspection of offshore wind turbines [27,28].

(1)

where , and are the positional components along the x, y, and z axes on the 3-dimensional Euclidean space, respectively; , and are the roll, pitch, and yaw angles, respectively; is the mass of the UAV i; , and are the moments of inertia along the x, y, and z axes, respectively; is the distance between the motor axis and the center of the body; is the acceleration due to gravity; , , , are the control strategies of the UAV i, defined as follows:

(2)

where is the lift coefficient of the UAV i; is the drag coefficient; , , and are the rotation angular velocity of rotor 1, 2, 3, and 4 for the UAV i, respectively; is the total vertical thrust; is the differential lift affecting the pitch motion of the UAV i; is the differential lift affecting the roll motion of the UAV i; is the torque affecting the yaw motion of the UAV i.

To streamline the coordinated control design, the following assumption is introduced.

Assumption 1. Each UAV operates with slow dynamics and small attitude angles near its equilibrium point, implying that the terms and are negligible and can be approximated as zero.

Under Assumption 1, the model for UAV is defined with the control input acting on the position and yaw . Consequently, the system model reduces to a second-order integrator dynamics.

For each UAV i in the set , the model is

(3)

where is the pose of UAV i; is the velocity of UAV i; is the control strategy of UAV i; is the all allowable control input; ; ;

Then, the dynamics of each UAV are expressed by the following model:

(4)

where is the state of UAV i; ; ; .
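The reduced second-order integrator model can be sketched numerically as follows. This is a minimal illustration, assuming the pose stacks the position (x, y, z) and the yaw angle, so that the state has dimension 8 (consistent with the state dimension quoted in Remark 6). The function names `double_integrator` and `euler_step`, the explicit-Euler discretization, and the numerical values are hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np

def double_integrator(pose_dim=4):
    """Return (A, B) of x_dot = A x + B u for the state x = [pose; velocity]."""
    zero = np.zeros((pose_dim, pose_dim))
    eye = np.eye(pose_dim)
    A = np.block([[zero, eye], [zero, zero]])  # pose_dot = velocity
    B = np.vstack([zero, eye])                 # velocity_dot = control input
    return A, B

def euler_step(x, u, A, B, dt):
    """One explicit-Euler step (assumed discretization, for illustration only)."""
    return x + dt * (A @ x + B @ u)

A, B = double_integrator()
x = np.zeros(8)                       # start at rest at the origin
u = np.array([1.0, 0.0, 0.0, 0.0])    # constant acceleration along the x axis
for _ in range(10):
    x = euler_step(x, u, A, B, dt=0.1)
print(x[0], x[4])                     # position and velocity along x
```

The same pair (A, B) is reused for every UAV in this sketch, which presumes identical vehicles.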

Define the collective state of the multi-UAV system as . The resulting system dynamics are given by:

(5)

where ; ; .
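If the individual UAV models are identical and dynamically decoupled (an assumption made here for illustration), the collective matrices are block-diagonal and can be assembled with a Kronecker product. The helper name `collective_matrices` and the fleet size of three are hypothetical.

```python
import numpy as np

def collective_matrices(A_i, B_i, num_uavs):
    """Stack N identical agent models block-diagonally: I_N kron A_i, I_N kron B_i."""
    identity = np.eye(num_uavs)
    return np.kron(identity, A_i), np.kron(identity, B_i)

n = 4  # assumed pose dimension: x, y, z, yaw
A_i = np.block([[np.zeros((n, n)), np.eye(n)],
                [np.zeros((n, n)), np.zeros((n, n))]])
B_i = np.vstack([np.zeros((n, n)), np.eye(n)])

A, B = collective_matrices(A_i, B_i, num_uavs=3)
print(A.shape, B.shape)
```

Coupling between UAVs then enters only through the cost functions and the communication graph, not through the dynamics.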

To quantify the inspection deviation, we define the state error of UAV i as:

(6)

where ; is the state of the target wind turbine for inspection.

Let denote the collective state error vector of the multi-UAV system. The control objective is therefore formulated as driving this error to zero asymptotically, ensuring each UAV converges to its target wind turbine:

(7)

To characterize limited sensing and communication in the multi-UAV inspection system, the communication relationships are modeled using graph theory. Specifically, a directed graph characterizes the topology among N UAVs: denotes the set of vertices, while denotes the set of edges representing communication links [29]. The presence of an edge indicates that UAV i obtains information from UAV j. Accordingly, the neighbors of UAV i are defined as . This paper assumes that the communication topology of the multi-UAV inspection system is directed and strongly connected.
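The neighbor sets and the strong-connectivity assumption can be checked with a few lines of code. The sketch below assumes edges are stored as hypothetical `(receiver, sender)` pairs, meaning the first UAV obtains information from the second; the function names are illustrative and not taken from the paper.

```python
from collections import deque

def in_neighbors(edges, i):
    """UAVs from which UAV i obtains information; edges are (receiver, sender)."""
    return {sender for (receiver, sender) in edges if receiver == i}

def reachable(adjacency, start):
    """Breadth-first search: all nodes reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def is_strongly_connected(nodes, edges):
    """True if every node reaches every other node along directed edges."""
    forward, backward = {}, {}
    for a, b in edges:
        forward.setdefault(a, set()).add(b)
        backward.setdefault(b, set()).add(a)
    root = next(iter(nodes))
    everyone = set(nodes)
    # Strongly connected iff the root reaches all nodes in the graph and its reverse.
    return reachable(forward, root) == everyone and reachable(backward, root) == everyone

nodes = [1, 2, 3]
ring = [(1, 2), (2, 3), (3, 1)]   # directed cycle
chain = [(1, 2), (2, 3)]          # no path back to UAV 1
print(in_neighbors(ring, 1), is_strongly_connected(nodes, ring),
      is_strongly_connected(nodes, chain))
```

A directed ring is the sparsest strongly connected topology, which makes it a convenient test case for distributed schemes.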

Problem statement for optimal multi-UAV coordination

During the execution of offshore wind turbine inspection tasks, multi-UAVs inherently encounter challenges including communication constraints, structural obstacles from the turbines, and potential inter-UAV conflicts. Consequently, the trajectory planning problem can be effectively transformed into a coordinated control framework, which is essential for ensuring that all UAVs complete their inspection missions safely and efficiently.

To formulate the trajectory planning problem as a coordination control framework, we project the operational environment, including UAVs, turbines, and obstacles, into the configuration space. In line with the conventions established in Refs. [20,22], we define collision regions , sensing regions and free regions , representing non-navigable areas, collision avoidance areas and safe flight spaces, respectively (see [20,22] for details). Consistent with most UAV control studies, we make the following assumption:

Assumption 2. For every UAV , neither its initial position nor its target position lies within the collision region .

Problem 1. (Distributed differential game) Consider an inspection system composed of N UAVs whose dynamics are governed by Eqs. (4)-(6) and which operate in an environment containing collision areas (which include static obstacles such as wind turbine towers and blades, as well as other UAVs). The system is also subject to a communication topology , and round-trip requirements. The objective is to minimize the total task completion time while ensuring operational safety. This problem can be formulated within a DDG framework, where each UAV is treated as an intelligent player.

In this DDG framework, each player designs its coordinated control strategy to minimize an individual cost function , the specific form of which is given later. Strategic interactions between player i and its neighbors lead to an L-NE, which defines a collectively optimal control strategy and thus yields the optimal trajectory plan. Then,

(8)

where the pair denotes the optimal strategy and its associated cost for UAV i, while represents the collective optimal strategy for its neighbors.

The model of DDG

In the context of offshore wind turbine inspection, the proposed DDG framework is shown in Fig 1. Each UAV i operates as an autonomous player. It receives local sensor data (own state ) and communicated information from neighbors (states , strategies ). These inputs feed into its local Game Solver (green block), which solves the TPBVP (Eqs. (20)-(21)) numerically (i.e., by distributed gradient optimization for L-NE) to compute its optimal strategy . This strategy is applied to its dynamics and also broadcast to its neighbors, closing the distributed feedback loop. The cyan block indicates information flow limited by the communication graph .

Fig 1. The schematic diagram of the proposed DDG framework.

https://doi.org/10.1371/journal.pone.0344989.g001

Specifically, the cost function for each UAV is defined as:

(9)

s.t.

where , , , are symmetric positive definite matrices; is the terminal return cost; is the terminal state of the UAV i; is the base position of UAV i (the takeoff position); denotes the running cost (to be defined explicitly later); and corresponds to the control strategy of neighboring UAV .

The running cost for UAV i is given by:

(10)

where , are the weighting coefficients. The collision avoidance penalty function, is defined as:

(11)

where the visual radius of UAV i is denoted by , and represents the set of all known static obstacles along with unknown obstacles—including other UAVs and any unknown static obstacles within UAV i’s sensing range. For each obstacle , its radius and centroid are given by and , respectively. ; if obstacle is static, and if it is a UAV. The relative velocity is defined as the difference between the velocity of UAV i projected onto the plane, denoted as , and the obstacle avoidance velocity , which will be specified subsequently.

Let and represent the world and body-fixed coordinate frames, respectively. The flight trajectory of UAV i is projected onto the two-dimensional plane. The velocity of UAV i expressed in the body-fixed frame is defined as , representing its components along the and axes. This velocity vector can be obtained by

(12)

where denotes the obstacle avoidance angle. The terms and correspond to the velocity projections of UAV i onto the and axes of its body-fixed frame , respectively. Similarly, and are its velocity projections onto the x and y axes of the world coordinate system .
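The projection of the planar velocity from the world frame onto the body-fixed frame amounts to a two-dimensional rotation. The sketch below assumes a standard rotation with a counterclockwise-positive angle; the sign convention of Eq. (12) may differ, and the function name `world_to_body` is hypothetical.

```python
import math

def world_to_body(v_x, v_y, angle):
    """Rotate a planar world-frame velocity into a frame turned by `angle` (rad)."""
    c, s = math.cos(angle), math.sin(angle)
    # Assumed convention: the body frame is the world frame rotated counterclockwise.
    return (c * v_x + s * v_y, -s * v_x + c * v_y)

v_body = world_to_body(1.0, 0.0, math.pi / 2)
speed_world = math.hypot(1.0, 0.0)
speed_body = math.hypot(*v_body)
print(v_body, speed_world, speed_body)
```

A rotation preserves the speed, so only the direction of the velocity vector changes between the two frames.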

Accordingly, the obstacle avoidance velocity is denoted by and is defined as:

(13)

Fig 2 illustrates the obstacle avoidance angle, denoted as

(14)

Fig 2. Comparison of inspection trajectories for 3 UAVs and 28 turbines.

(a) Proposed DDG method: All UAVs (UAV1-3) complete inspection tasks. (b) GA-DZ method [6]: UAV1 collides with Wind Turbine 4 (marked by ‘X’), and UAV3 exhausts its battery before returning. (c) NN-DRL method [9]: UAV1 exhibits an inefficient, elongated path requiring mid-mission returns to the OBS.

https://doi.org/10.1371/journal.pone.0344989.g002

where ; and are the sets of known static obstacles and unknown obstacles, respectively; and are the positions decomposed along the axis in the coordinate.

Remark 2. In the paper, when UAV i is avoiding an unknown obstacle, the obstacle centroid used in the proposed obstacle penalty function refers to the centroid based on the boundary points of the obstacle detected within the sensing range of the UAV i. The radius of the obstacle used is determined by the UAV’s safety radius .

is the trajectory optimization function, which is designed as:

(15)

where is the deviation angle of UAV i. , . For the relevant principle, see our prior work [20].

Remark 3. Bidirectional range constraint (i.e., terminal return cost) is primarily considered through the return-to-base requirement in the terminal cost. This ensures that the UAV automatically meets the safe return requirement while optimizing inspection efficiency.
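One simple way to picture a terminal return cost is a quadratic penalty on the distance between the terminal state and the base state. The form below is an assumption made for illustration (the exact terminal term of Eq. (9) is not reproduced here), and the names `terminal_return_cost` and `Q_f` and the weights are hypothetical.

```python
import numpy as np

def terminal_return_cost(x_final, x_base, Q_f):
    """Assumed quadratic penalty 0.5 * d^T Q_f d with d = x_final - x_base."""
    d = np.asarray(x_final, dtype=float) - np.asarray(x_base, dtype=float)
    return 0.5 * float(d @ Q_f @ d)

# State = [x, y, z, yaw, and their rates]; the weights are illustrative only.
Q_f = np.diag([10.0, 10.0, 10.0, 1.0, 1.0, 1.0, 1.0, 0.1])
x_base = np.zeros(8)
x_away = np.array([50.0, 20.0, 30.0, 0.0, 1.0, 0.0, 0.0, 0.0])

cost_at_base = terminal_return_cost(x_base, x_base, Q_f)
cost_away = terminal_return_cost(x_away, x_base, Q_f)
print(cost_at_base, cost_away)
```

With a positive definite weight, the penalty vanishes only when the UAV ends exactly at its base state, which is how a terminal term can encode the return requirement.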

Remark 4. Compared to standard optimization algorithms [6,9], this approach better captures dynamic strategic interactions and ensures fairness. The outcome is a Pareto improvement where system-wide optimization is achieved without sacrificing individual utilities. The theoretical guarantees and interpretability of the NE further underscore its reliability for real-world applications such as UAV-based inspection of offshore wind turbines. As an extension of prior work [20], this study incorporates the return-trip energy constraint during UAV inspections on the basis of existing literature, making it more aligned with the practical scenario of offshore wind turbine inspection.

The optimal coordination control strategy

Following the problem formulation in the preceding section, this section is devoted to a detailed description of the solution approach for the game-theoretic models. The L-NE for the DDG defined in Problem 1 corresponds to a set of control strategies where each UAV i’s strategy is the optimal response to its neighbors’ optimal strategies , minimizing its individual cost (Eq. (9)). This constitutes a coupled optimal control problem for each agent.

Necessary conditions via Pontryagin’s minimum principle

Define an auxiliary state variable as:

(16)

where , .

Then, we give the following form:

(17)

where .

Therefore, determining the DDG for UAV i in (9) is equivalent to formulating and solving an optimal control problem, i.e.,

(18)

s.t.

where .

Furthermore, we determine the optimal coordinated control strategy through Pontryagin’s Minimum Principle (PMP), with the corresponding Hamiltonian given by:

(19)

where , corresponds to the multi-UAVs’ optimal state trajectory, and is the associated costate function.

Accordingly, inspired by the work in Ref. [22], Eq. (19) provides a key necessary condition for the optimal control in the DDG (9), ensuring convergence of the strategy set to an L-NE. The PMP states that for the optimal trajectory and control , there exists a costate such that:

(20)

with the boundary condition

(21)

where corresponds to the system state at time . These conditions define a two-point boundary value problem (TPBVP) whose solution characterizes the L-NE.

Distributed gradient optimization for L-NE

Solving the coupled TPBVP (20) directly in a distributed manner is challenging. Instead, inspired by the approach of Ref. [16] to solving TPBVPs, we adopt a direct optimization approach: each UAV i iteratively improves its control trajectory to minimize directly, using only local information. This approach is numerically robust and naturally parallelizable.

We employ a distributed gradient-based optimization algorithm, summarized in the following Algorithm 1. The control trajectory is parameterized over . Each iteration involves: 1) Forward simulation: Integrate the dynamics (Eq. (4)) with the current control to obtain the state trajectory. 2) Gradient computation: Compute the gradient efficiently using the adjoint method. This requires a backward integration of an adjoint equation, which is computationally inexpensive and avoids explicit solution of the costate equation in (20). 3) Gradient update and communication: Update along the negative gradient direction, then exchange the updated control with neighbors.
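The forward-simulation and backward-adjoint structure of one iteration can be sketched on a small discretized problem. The example below is a simplification under stated assumptions: a one-dimensional double integrator, explicit-Euler discretization, and quadratic running and terminal costs with hypothetical weights `Q`, `R`, and `Qf`. It is not the cost of Eq. (9), and it ignores neighbor coupling; it only illustrates how an adjoint recursion yields the gradient with respect to the control sequence.

```python
import numpy as np

dt = 0.1
Ad = np.array([[1.0, dt], [0.0, 1.0]])   # Euler-discretized 1-D double integrator
Bd = np.array([[0.0], [dt]])
Q = np.diag([1.0, 0.1])                  # assumed running state weight
R = np.array([[0.5]])                    # assumed control weight
Qf = np.diag([10.0, 1.0])                # assumed terminal weight
x0 = np.array([1.0, 0.0])
x_base = np.zeros(2)                     # assumed return target

def rollout(u):
    """Forward simulation: states x_0 .. x_N for the control sequence u."""
    xs = [x0]
    for u_k in u:
        xs.append(Ad @ xs[-1] + Bd @ u_k)
    return np.array(xs)

def cost(u):
    """Discretized running cost plus quadratic terminal cost."""
    xs = rollout(u)
    running = sum(0.5 * dt * (x_k @ Q @ x_k + u_k @ R @ u_k)
                  for x_k, u_k in zip(xs[:-1], u))
    d = xs[-1] - x_base
    return running + 0.5 * d @ Qf @ d

def adjoint_gradient(u):
    """Gradient of cost(u) from one forward pass and one backward adjoint pass."""
    xs = rollout(u)
    lam = Qf @ (xs[-1] - x_base)                 # terminal condition of the adjoint
    grad = np.zeros_like(u)
    for k in range(len(u) - 1, -1, -1):
        grad[k] = dt * (R @ u[k]) + Bd.T @ lam   # dJ/du_k
        lam = dt * (Q @ xs[k]) + Ad.T @ lam      # backward adjoint recursion
    return grad

u = np.zeros((20, 1))
initial_cost = cost(u)
for _ in range(200):
    u = u - 0.2 * adjoint_gradient(u)            # fixed-step gradient descent
final_cost = cost(u)
print(initial_cost, final_cost)
```

The backward pass costs about as much as one forward simulation, regardless of the number of control variables, which is why the adjoint method is preferred over finite differences here.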

Remark 5. Upon convergence, the solution obtained by the gradient‑based optimization algorithm satisfies the first‑order necessary conditions for a minimum of the cost functional J_i. These conditions are mathematically equivalent to the set of Pontryagin’s Minimum Principle (PMP) conditions given in Eq. (20). Specifically, the adjoint variable introduced in the gradient computation obeys the same linear differential equation and terminal condition as the PMP costate . Because both variables satisfy an identical linear boundary‑value problem, the uniqueness theorem for such problems guarantees that at convergence. Consequently, the trajectories generated by our distributed gradient algorithm fulfill all PMP necessary conditions and therefore constitute an L‑NE.

Even when the underlying system dynamics are nonlinear, the costate (or adjoint) equation remains a linear differential equation in (or ). This linearity follows from the fact that the equation is derived either by linearizing the original Hamiltonian system around the optimal trajectory or directly from the variational principle. Hence, the uniqueness argument holds in the general nonlinear setting, ensuring the equivalence between the numerical solution of the gradient algorithm and the analytical PMP formulation.

The pseudo-code of the distributed gradient optimization for L-NE is as follows.

Algorithm 1 Distributed gradient optimization for L-NE

Input: Initial state , neighbor strategies , cost weights, horizon , step size , tolerance ε.

Output: Optimal control .

1: Repeat

2: Forward simulation Integrate the dynamics (Eq. 4) forward from to , obtaining the state trajectory .

3: Gradient computation Compute the cost gradients and along .

4: Integrate the adjoint equation backward in time: with terminal condition .

5: Compute the gradient: .

6: Distributed communication Broadcast the updated control to all neighbors .

7: Receive neighbors’ controls for .

8: Projected gradient update

9: , where projects onto the feasible control set.

10: Update step size via backtracking line search.

11: .

12: Until for all i or .
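The loop structure of Algorithm 1 (compute local gradient, exchange strategies with neighbors, projected update, stop on a tolerance or an iteration cap) can be illustrated on a toy game. The sketch below replaces the trajectory cost of Eq. (9) with a static quadratic cost per agent, uses a fixed step size instead of a backtracking line search, and takes a directed ring of three agents; all names, weights, and the cost form are assumptions made for illustration only.

```python
import numpy as np

def local_gradient(u_i, target_i, neighbor_controls, coupling):
    """Gradient of 0.5*|u_i - target_i|^2 + 0.5*coupling*sum_j |u_i - u_j|^2 in u_i."""
    grad = u_i - target_i
    for u_j in neighbor_controls:
        grad = grad + coupling * (u_i - u_j)
    return grad

def distributed_projected_gradient(targets, neighbors, coupling=0.5, step=0.2,
                                   u_max=5.0, tol=1e-10, max_iter=1000):
    """Each agent descends its own cost using only its neighbors' last broadcast."""
    controls = {i: np.zeros_like(t) for i, t in targets.items()}
    for iteration in range(max_iter):
        # Communication: every agent receives the current strategies of its neighbors.
        received = {i: [controls[j].copy() for j in neighbors[i]] for i in controls}
        largest_change = 0.0
        for i in controls:
            grad = local_gradient(controls[i], targets[i], received[i], coupling)
            # Projected update: clipping projects onto the box [-u_max, u_max].
            updated = np.clip(controls[i] - step * grad, -u_max, u_max)
            largest_change = max(largest_change,
                                 float(np.linalg.norm(updated - controls[i])))
            controls[i] = updated
        if largest_change < tol:
            break
    return controls, iteration + 1

targets = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 1.0]), 2: np.array([-1.0, -1.0])}
neighbors = {0: [1], 1: [2], 2: [0]}   # directed ring: strongly connected
controls, iterations = distributed_projected_gradient(targets, neighbors)
print(iterations, controls)
```

At the fixed point every agent's local gradient vanishes given its neighbors' strategies, which is the zero-gradient characterization of a local Nash equilibrium used in this section.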

The L-NE strategy is obtained by numerically minimizing the cost function in (9) subject to the dynamics constraint (4). This is achieved using a distributed gradient descent algorithm, which directly optimizes the control trajectory without explicitly solving the two-point boundary value problem for the costate .

Pontryagin’s Minimum Principle (PMP), applied to our DDG formulation, yields the set of necessary conditions for optimality (Eqs. 20–21). These conditions, which include a two-point boundary value problem, define what constitutes a local Nash equilibrium (L-NE). The theoretical contribution of our work (Proposition 1) is to prove that, under a strongly connected graph, the unique solution satisfying these local conditions for all agents converges to a global Nash equilibrium (G-NE).

Remark 6. To illustrate the scalability of the proposed DDG method, an analysis is conducted from two aspects: computational burden and real-time feasibility.

1) Computational burden: Solving the local optimal control problem (Eqs. 18–20) for each UAV involves a two-point boundary value problem with state dimension 8. We employ an efficient iterative solver (i.e., a gradient-based method) whose convergence per agent typically requires iterations in our simulations, with each iteration involving low-dimensional matrix operations [16]. The distributed architecture allows these computations to be parallelized across UAVs.
2) Real-time feasibility: For the inspection scenarios considered (mission duration 400 s), the trajectory planning is computed offline or re-planned at low frequency (every 30 s) based on updated neighbor states. The per-agent computation time (0.5 s on a standard desktop CPU) is negligible compared to the re-planning interval, demonstrating the method’s potential for near real-time operation.

We also compare our proposed approach with a centralized game-theoretic solver, which employs the same PMP principle and cost structure as our DDG method but solves a single, high-dimensional optimization problem using global information (i.e., centralized DG). This centralized solver must handle the concatenated state vector of all N UAVs, resulting in a total state dimension of 8N. Consequently, its computational complexity scales approximately as , where represents the state dimension of a single UAV.

The G-NE

The primary objective of the multi-UAV inspection mission for offshore wind farms is to minimize the total inspection time. This goal necessitates globally optimal coordination of the entire fleet, surpassing what individual UAVs can achieve locally. Consequently, the control strategy must ensure that the system converges to a G-NE, which guarantees the time-optimal performance for the entire mission, as supported by Ref. [22]. The following definition of G-NE is formalized to this end.

Definition 1. (G-NE) An N-tuple of coordination strategies for the N-UAV inspection game constitutes a G-NE if, for every UAV i, the following conditions are met:

1) Optimality condition: The control strategy is the optimal response to the optimal strategies of all other UAVs:

(22)

2) Non-triviality condition: There exists an alternative strategy such that a unilateral deviation from results in a different system cost:

(23)
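The optimality condition of a Nash equilibrium can be probed numerically by testing unilateral deviations: no single player should be able to lower its own cost by changing only its own strategy. The sketch below uses a hypothetical two-player scalar game with quadratic costs and a finite grid of candidate deviations; a sampled check of this kind can refute an equilibrium but cannot prove one.

```python
def is_nash_equilibrium(cost_fns, profile, deviations, tol=1e-9):
    """Return False if any sampled unilateral deviation lowers a player's own cost."""
    for i, cost_i in enumerate(cost_fns):
        baseline = cost_i(profile)
        for candidate in deviations:
            trial = list(profile)
            trial[i] = candidate           # only player i deviates
            if cost_i(trial) < baseline - tol:
                return False
    return True

# Hypothetical coupled quadratic costs for two players with scalar strategies.
def cost_0(u):
    return (u[0] - 1.0) ** 2 + 0.5 * (u[0] - u[1]) ** 2

def cost_1(u):
    return (u[1] + 1.0) ** 2 + 0.5 * (u[1] - u[0]) ** 2

grid = [k / 10 for k in range(-20, 21)]    # candidate deviations in [-2, 2]
at_equilibrium = is_nash_equilibrium([cost_0, cost_1], [0.5, -0.5], grid)
off_equilibrium = is_nash_equilibrium([cost_0, cost_1], [0.0, 0.0], grid)
print(at_equilibrium, off_equilibrium)
```

For this toy game the first-order conditions 3u0 - u1 = 2 and 3u1 - u0 = -2 give the profile (0.5, -0.5), which is why that point passes the check while the origin does not.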

Next, the following proposition concerning the convergence of an L-NE to a G-NE is presented.

Proposition 1. (Convergence of L-NE to G-NE) Under a strongly connected communication topology , let denote the optimal coordinated control strategy of UAV i, derived from its interactions with neighbors . If the distributed gradient algorithm (Algorithm 1) converges, then the L-NE generated by the algorithm converges to a G-NE, i.e.,

(24)

Proof. The proof is divided into three steps: (1) the algorithm converges to an L-NE (satisfying the PMP); (2) strong connectivity enforces global consistency through gradient exchange; (3) convexity ensures that the local solution is unique, and thus globally unique.

Step 1 (Algorithm convergence and attainment of the L-NE): For each UAV i, integrate the dynamics (Eq. (4)) over using the current control strategy .

(25)

To efficiently compute the gradient , an adjoint variable is introduced, governed by the adjoint equation (which provides an efficient computation of the costate equation in (20)):

(26)

After backward integration, the gradient is obtained from the partial derivative of the Hamiltonian:

(27)

Each agent updates its strategy along the negative gradient direction:

(28)

where is the step size. Updated strategies are then broadcast to all neighbors . Upon convergence, for all ,

(29)

According to PMP, this condition, together with the state equation, adjoint equation, and transversality condition, constitutes the first-order necessary condition for optimality. In the distributed setting, this implies that for each agent i, given the optimal strategies of its neighbors , the strategy is a local minimizer of its individual cost (9), thereby satisfying the definition of an L-NE (8).

Step 2 (Global consistency enforced by strong connectivity): Assume the communication graph is strongly connected. Suppose, for contradiction, that the L‑NE strategies are not globally consistent, i.e., there exist two disjoint subsets of UAVs whose locally optimal solutions are mutually incompatible given the global mission objectives. Such inconsistency would manifest as a mismatch in the coupled cost terms via the terms in (9). For any adjacent UAVs i and , a strategy discrepancy would produce a non‑zero gradient component:

(30)

During the iterative process, this gradient information is exchanged among neighbors (the communication step, lines 6–7 of Algorithm 1). Strong connectivity guarantees that there exists a directed path from any agent i to any other agent l. Consequently, any local inconsistency (nonzero gradient) propagates through the entire network via successive neighbor‑to‑neighbor exchanges.

Define the global gradient norm as . The gradient‑descent update ensures that is non‑increasing with k. At convergence,

(31)

Then,

(32)

This global zero‑gradient condition implies that not only each agent’s own gradient vanishes, but also all coupled interaction terms (via ) are balanced, thereby eliminating any pairwise strategic contradictions. Hence, the locally optimal strategies are globally consistent.
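
The strong-connectivity assumption used in this step can be checked directly on a given communication graph. The sketch below is illustrative only; the function names and the edge-list representation are assumptions, not part of the paper. It tests whether every node is reachable from one start node in both the graph and its reverse, which is equivalent to a directed path existing between every ordered pair of UAVs.

```python
from collections import deque


def reachable(adjacency, start):
    """Breadth-first search: set of nodes reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_strongly_connected(nodes, edges):
    """True if a directed path exists between every ordered pair of nodes."""
    forward = {n: [] for n in nodes}
    backward = {n: [] for n in nodes}
    for src, dst in edges:
        forward[src].append(dst)
        backward[dst].append(src)
    start = nodes[0]
    return (len(reachable(forward, start)) == len(nodes)
            and len(reachable(backward, start)) == len(nodes))


uavs = [1, 2, 3]
ring = [(1, 2), (2, 3), (3, 1)]   # directed cycle: information reaches every UAV
chain = [(1, 2), (2, 3)]          # UAV3 cannot send anything back to UAV1
print(is_strongly_connected(uavs, ring), is_strongly_connected(uavs, chain))
```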

Step 3 (Uniqueness and attainment of the G-NE): To establish the uniqueness of the G-NE, we begin by analyzing the adjoint system derived from a quadratic approximation of the problem around the equilibrium trajectory. Consider the Nash equilibrium trajectory . Defining the deviation as , we have the collective state vector and the co-state vector . The associated adjoint system with two-point boundary values is given by (the derivation is detailed in S1 Appendix):

(33)

where the matrix is a block-diagonal matrix, . Each block is constructed from the Hessian of the running cost function for UAV i evaluated at the equilibrium:

(34)

where is the state of UAV i at the Nash equilibrium, is the Hessian matrix of at , is a positive weighting coefficient. The aggregate matrix thus represents a weighted sum of the individual Hessians.

The positive definiteness of is crucial and follows from the construction of the running cost . This cost combines a collision avoidance penalty and a trajectory optimization term . The penalty term is designed to be convex and increasing outside a safe distance, with a positive definite Hessian at the collision-free equilibrium point . The trajectory term is also convex (e.g., based on squared angular deviation), yielding a positive semi-definite Hessian. With positive coefficients , the Hessian of each running cost is the sum of a positive definite matrix and a positive semi-definite matrix, so every block is positive definite. Because a block-diagonal matrix is positive definite exactly when each of its diagonal blocks is, the block-diagonal matrix is positive definite.
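
The matrix argument above can be checked numerically. The following sketch uses made-up 2x2 Hessians (hypothetical values, not taken from the paper) and a plain Cholesky test for symmetric matrices: the sum of a positive definite and a positive semi-definite Hessian is positive definite, and a block-diagonal matrix is positive definite only if every block is.

```python
def is_positive_definite(a, tol=1e-12):
    """Cholesky test for a symmetric matrix given as a list of rows."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= tol:          # non-positive pivot: not positive definite
                    return False
                low[i][j] = s ** 0.5
            else:
                low[i][j] = s / low[j][j]
    return True


def weighted_sum(h1, w1, h2, w2):
    """Entry-wise w1 * h1 + w2 * h2."""
    return [[w1 * x + w2 * y for x, y in zip(r1, r2)] for r1, r2 in zip(h1, h2)]


def block_diag(blocks):
    """Assemble square blocks into one block-diagonal matrix."""
    size = sum(len(b) for b in blocks)
    out = [[0.0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, val in enumerate(row):
                out[offset + i][offset + j] = val
        offset += len(b)
    return out


h_penalty = [[2.0, 0.0], [0.0, 1.0]]      # positive definite (hypothetical)
h_trajectory = [[1.0, 1.0], [1.0, 1.0]]   # positive semi-definite, singular
block = weighted_sum(h_penalty, 0.5, h_trajectory, 1.0)

print(is_positive_definite(block))                              # PD + PSD
print(is_positive_definite(block_diag([block, block])))         # all blocks PD
print(is_positive_definite(block_diag([block, h_trajectory])))  # one block only PSD
```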

The positive definiteness of together with the positive definiteness of the weighting matrices , , , in the cost function (9), guarantees that the integrated cost term and the terminal cost are jointly convex. Therefore, given the strategies of its neighbors , each UAV's optimization subproblem is strictly convex. For a strictly convex problem, any point satisfying the first-order necessary optimality conditions (i.e., the zero-gradient condition derived from Pontryagin's Minimum Principle) is the unique global minimizer. Hence, the strategy obtained in Step 1 upon convergence of the distributed gradient algorithm is the unique optimal response of UAV i to its neighbors' strategies .

As established in Step 2, the strong connectivity of the communication graph ensures that the locally optimal strategies are globally consistent. The collection of these unique local optimal responses, , therefore forms a strategy profile that satisfies the definition of a G-NE (Definition 1). Formally, for every UAV i, if all other UAVs adhere to (the strategies of all agents except i), then is its optimal response:

(35)

The distinction between the L-NE and the G-NE is that the G-NE considers the strategies of all other UAVs, not just immediate neighbors. The strong connectivity of the network, which enables the propagation of local consistency, guarantees the equivalence between these two notions in our framework. Consequently, the uniqueness of the solution to the adjoint system (ensured by ), combined with the strong connectivity of , secures the convergence of the algorithm to a G-NE:

(36)

In summary, under a strongly connected communication topology, the distributed gradient algorithm converges to a strategy profile that is both a local and a global Nash Equilibrium. □

Remark 7. The strong connectivity of the communication graph ensures a bidirectional information path between any two UAVs, allowing local strategy information to propagate across the entire network in finite time. This leads to global alignment of the L-NE strategies and consequently drives convergence toward a G-NE. Furthermore, the positive definiteness of guarantees the existence and uniqueness of the L-NE, which in turn ensures that the resulting G-NE is also uniquely defined.

Simulations

Here we conduct a comprehensive assessment of the proposed DDG method's core capabilities: coordinated optimality and operational safety. A systematic comparison against two benchmark methods demonstrates how coordinated behavior enhances mission efficiency through reduced completion times without compromising safety in maritime multi-UAV inspection scenarios. The first benchmark is the GA-DZ method [6], which optimizes UAV trajectories by minimizing the total flight distance but exhibits limited adaptability and fails to account for bidirectional range constraints. The second is the NN-DRL method [9], which is prone to convergence to local optima.

Remark 8. This study focuses specifically on the operational control of multi-UAV systems during maritime inspection, and thus the simulations are confined to the operational space in which UAVs execute inspection tasks over offshore wind turbines. Given that target wind turbines are pre-assigned to each UAV, the core objective is to ensure the safe and efficient completion of these missions, rather than addressing the task allocation problem. To guarantee a fair comparison, all evaluated methods—including the proposed approach and the benchmarks—utilize the same initial task assignments and are implemented with fully disclosed parameters.

For the GA-DZ baseline, we adopted the genetic algorithm with dynamic zoning as outlined in Ref. [6], utilizing the authors' publicly available source code. The algorithm encodes solutions as sequences of waypoint assignments and corresponding flight trajectories. Its optimization is driven by a fitness function defined as the inverse of the total path distance for the UAV fleet. To ensure a meaningful comparison that tests the algorithm's inherent ability to satisfy constraints, substantial penalties ( per violation) are applied in the fitness evaluation for any path that violates the safety-distance limits. The evolutionary process uses a population size of 100 and runs for 500 generations per simulation, with a crossover rate of 0.85, a mutation rate of 0.1, and tournament selection (size = 3). Unlike learning-based methods, GA-DZ does not involve a separate training phase; it is executed directly on each test scenario to produce a scenario-specific solution, enabling a direct and fair performance comparison under the same conditions as the proposed DDG method.
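
The fitness evaluation and selection step described above can be sketched as follows. This is not the code of Ref. [6]: the penalty magnitude is not recoverable from the text, so `penalty=1000.0` is a hypothetical value, and adding the penalty to the distance before inversion is one plausible reading of "penalties applied in the fitness evaluation". Function names are illustrative.

```python
import math
import random


def path_length(waypoints):
    """Total Euclidean length of a polyline through the waypoints."""
    return sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))


def fitness(paths, violations, penalty=1000.0):
    """Inverse of fleet path distance, degraded by a per-violation penalty.

    penalty is a hypothetical value; the original magnitude is not given here.
    """
    total = sum(path_length(p) for p in paths)
    return 1.0 / (total + penalty * violations)


def tournament_select(population, fitnesses, rng, size=3):
    """Pick `size` random candidates and return the fittest of them."""
    contenders = rng.sample(range(len(population)), size)
    best = max(contenders, key=lambda i: fitnesses[i])
    return population[best]


rng = random.Random(0)
fleet = [[(0.0, 0.0), (3.0, 4.0)]]   # one UAV, one 5 m leg
print(fitness(fleet, violations=0), fitness(fleet, violations=1))
```

With this form a single safety violation dominates the distance term, so violating candidates rarely survive tournament selection.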

For the NN-DRL baseline, we implemented a Dueling Deep Q-Network (DQN) following the architecture in Ref. [9]. The network takes the UAV’s state as input, processes it through two fully-connected layers, and outputs Q-values for each discrete action. The state space includes the UAV’s pose , velocity , remaining battery, relative positions to its target and the nearest wind turbine, as well as positions of neighboring UAVs within communication range. The action space is defined as nine discrete actions: hovering and moving in eight fixed-speed directions. The reward function is designed to balance multiple objectives and is formulated as:

(37)

where , , , and denote distance to target, energy consumption, inspection completion reward, collision penalty, and return violation penalty, respectively, with all coefficients tuned accordingly. During training, an ε-greedy exploration strategy is adopted, with ε linearly decaying from to over the first episodes. Each agent was trained for episodes on a randomized set of training scenarios to ensure strategy generalization. The trained strategy was then evaluated on a separate, held-out test set that is identical to the scenarios used for evaluating the proposed DDG method, thereby ensuring a fair comparison. The parameters of the NN-DRL method are summarized in Table 2.
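
The exploration schedule described above can be illustrated with a short sketch. The start value, end value, and decay length of ε are not recoverable from the text, so `eps_start=1.0`, `eps_end=0.05`, and `decay_episodes=1000` below are hypothetical placeholders; the nine-action space matches the description (hovering plus eight directions).

```python
import random


def epsilon(episode, eps_start=1.0, eps_end=0.05, decay_episodes=1000):
    """Linear decay from eps_start to eps_end, then constant (values hypothetical)."""
    frac = min(max(episode / decay_episodes, 0.0), 1.0)
    return eps_start + frac * (eps_end - eps_start)


def select_action(q_values, eps, rng):
    """Epsilon-greedy: random action with probability eps, else the greedy one."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


rng = random.Random(0)
q_values = [0.0, 0.2, 0.9, 0.1, 0.0, 0.3, 0.4, 0.1, 0.2]  # nine discrete actions
for ep in (0, 500, 1000, 2000):
    print(ep, round(epsilon(ep), 3), select_action(q_values, epsilon(ep), rng))
```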

Table 2. The parameters of the NN-DRL method [9].

https://doi.org/10.1371/journal.pone.0344989.t002

We consider a multi-UAV inspection system comprising three UAVs (i.e., N = 3). The environment contains 28 offshore wind turbines and one offshore booster station (OBS), which serves as both the charging base and the common starting point for all UAVs. The simulated obstacles include known wind turbine towers, each with a radius of 5 m, as well as other UAVs, which are treated as unknown dynamic obstacles. Each UAV is assigned a specific subset of wind turbines for inspection and plans its trajectory accordingly. To ensure a fair comparison, the proposed DDG method and the two benchmark approaches [6,9] are evaluated under identical conditions: the same initial positions and dynamics model (Eq. 3), the same environmental layout (28 turbines and 1 OBS), and the same success criteria (complete inspection, collision-free operation, and safe return to the OBS). This setup isolates the performance differences to the algorithmic level. All other relevant simulation parameters of the proposed DDG are summarized in Table 3.

Table 3. The other related simulation parameters of the proposed DDG.

https://doi.org/10.1371/journal.pone.0344989.t003

Fig 3 shows the planned inspection trajectories under the three methods. UAV1 is responsible for inspecting wind turbine set ; UAV2 for set ; and UAV3 for set . For clarity, the inspection sequences obtained with the three methods are summarized in Table 4. The results indicate that the proposed method achieves the shortest total path length of 845 m. This is because the proposed method transforms the estimated planning of the UAVs into an optimally coordinated DDG model, obtaining a G-NE trajectory that balances maximum flight range and reduced inspection time. The GA-DZ method [6], which uses a genetic algorithm to optimize trajectories and aims to reduce inspection time, neglects the safety constraints of the UAVs. This makes it unsuitable for densely distributed wind turbine scenarios. During inspection, the wind turbines remain operational. When UAV1 using the GA-DZ method flies from turbine 1 to turbine 4, it collides with turbine 4 due to unaccounted dynamic obstacles, preventing completion of subsequent tasks (Fig 3(b), red solid line). Additionally, UAV3 fails to complete its inspection because it runs out of battery while flying from turbine 9 to turbine 12, as the return energy constraint is not considered (Fig 3(b), blue dashed line). The NN-DRL method [9] yields the longest total path length of 10679 m. This is attributed to its tendency to fall into local optima during online trajectory planning. Although it considers the return energy constraint, frequent returns for recharging reduce inspection efficiency. For instance, UAV1 returns to charge three times, significantly increasing the total path length (Fig 3(c), red solid line).

Table 4. The inspection sequences and path length with the three methods.

https://doi.org/10.1371/journal.pone.0344989.t004

Fig 3. 3 UAVs and 28 turbines.

(a) The minimum distance between UAVs and wind turbines with the proposed method; (b) The minimum distance between UAVs and wind turbines with the GA-DZ method; (c) The minimum distance between UAVs and wind turbines with the NN-DRL method.

https://doi.org/10.1371/journal.pone.0344989.g003

To evaluate safety during inspection, Fig 4 shows the minimum distances between each UAV and obstacles (wind turbines and other UAVs within the field of view) for the three methods. Fig 4(a) corresponds to the proposed method, where all UAVs maintain distances greater than the safe threshold. Fig 4(b) illustrates the results for the GA-DZ method [6]. While UAV2 maintains safe distances, the minimum distance between UAV1 and wind turbine 4 falls below the safe threshold, indicating a collision. Moreover, UAV3 stops operating at 250 s due to battery depletion. Fig 4(c) presents the results for the NN-DRL method [9], where all UAVs maintain distances above the minimum safe level.
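
A safety check of the kind plotted here can be sketched as below. The 5 m tower radius and the 4.0 m safety distance are the values quoted elsewhere in this section; measuring clearance from the tower surface rather than its center, and sampling the trajectory at discrete points, are assumptions made for illustration. Function names are hypothetical.

```python
import math


def min_clearance(trajectory, towers, tower_radius=5.0):
    """Smallest distance from any sampled UAV position to any tower surface."""
    return min(math.dist(point, center) - tower_radius
               for point in trajectory
               for center in towers)


def is_safe(trajectory, towers, safe_distance=4.0, tower_radius=5.0):
    """True if the whole sampled trajectory keeps at least safe_distance clearance."""
    return min_clearance(trajectory, towers, tower_radius) >= safe_distance


trajectory = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]   # sampled UAV positions in metres
print(min_clearance(trajectory, [(10.0, 10.0)]), is_safe(trajectory, [(10.0, 10.0)]))
print(min_clearance(trajectory, [(10.0, 8.0)]), is_safe(trajectory, [(10.0, 8.0)]))
```

The same check extends to other UAVs by treating their sampled positions as moving obstacle centers with their own radius.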

Fig 4. The state errors for 3 UAVs and 28 turbines.

(a) under proposed approach; (b) under the GA-DZ approach; (c) under the NN-DRL approach.

https://doi.org/10.1371/journal.pone.0344989.g004

Fig 5 further compares the state error convergence of the three methods. Fig 5(a) shows the evolution of state errors for each UAV using the proposed approach. All errors converge to zero, confirming successful completion of the inspection tasks. Fig 5(b) displays the state errors for the GA-DZ method [6], where only UAV2 completes its task. Fig 5(c) presents the state errors for the NN-DRL method [9], where all UAVs’ errors converge to zero, indicating task completion.

Fig 5. Comparison of inspection trajectories for 6 UAVs and 40 turbines.

(a) Proposed DDG method: UAV1's elongated path requires it to return to the OBS mid-mission. (b) GA-DZ method [6]: UAV1 collides with Wind Turbine 16 (marked by 'X'), and UAV3 exhausts its battery before returning. (c) NN-DRL method [9]: UAV1 and UAV4 exhibit inefficient, elongated paths requiring mid-mission returns to the OBS.

https://doi.org/10.1371/journal.pone.0344989.g005

To demonstrate the scalability and effectiveness of the proposed method, a scenario with six UAVs inspecting 40 wind turbines is designed. UAV1 is assigned turbines ; UAV2: ; UAV3: ; UAV4: ; UAV5: ; and UAV6: . Fig 6 shows the inspection trajectories for the three methods, with sequences summarized in Table 5. The proposed method again achieves the shortest total path length (272 + 226 + 206 + 312 + 240 + 308 = 1564 m). UAV1 returns to recharge when its battery drops below a threshold and then resumes inspection. With the GA-DZ method [6], UAV1 collides with turbine 16 while flying from turbine 7 to the next target, preventing further inspection (Fig 6(b), red solid line). UAV3 fails to complete its task due to insufficient battery (Fig 6(b), blue dashed line). The NN-DRL method [9] produces the longest total path length, with UAV1 and UAV4 each returning to charge twice, significantly increasing the path length (Fig 6(c), red and yellow solid lines).

Table 5. The inspection sequences and path length with the three methods.

https://doi.org/10.1371/journal.pone.0344989.t005

Fig 6. 6 UAVs and 40 turbines.

(a) The minimum distance between UAVs and wind turbines with the proposed method; (b) The minimum distance between UAVs and wind turbines with the GA-DZ method; (c) The minimum distance between UAVs and wind turbines with the NN-DRL method.

https://doi.org/10.1371/journal.pone.0344989.g006

Similarly, Fig 7 plots the minimum distances between each UAV and obstacles. Fig 7(a) shows that all UAVs maintain safe distances with the proposed method. Fig 7(b) illustrates that for the GA-DZ method [6], UAV1 violates the safe distance, and UAV3 stops at 180 s due to battery depletion. Fig 7(c) shows that all UAVs maintain safe distances with the NN-DRL method [9].

Fig 7. The state errors for 6 UAVs and 40 turbines.

(a) under proposed approach; (b) under the GA-DZ approach; (c) under the NN-DRL approach.

https://doi.org/10.1371/journal.pone.0344989.g007

Fig 8 compares the state error convergence for the six-UAV scenario. Fig 8(a) shows that all UAVs’ state errors converge to zero with the proposed method, indicating successful task completion. Fig 8(b) reveals that for the GA-DZ method [6], the state errors of UAV1 and UAV3 do not converge to zero, meaning these UAVs fail to complete their inspections. Fig 8(c) shows that all UAVs’ state errors converge to zero with the NN-DRL method [9], confirming task completion.

Fig 8. Comparison of the online average computation time for the three methods under two scales.

https://doi.org/10.1371/journal.pone.0344989.g008

In contrast to GA-DZ’s failure in safety and return energy, and NN-DRL’s suboptimal time efficiency due to lack of global coordination guarantees, our method ensures both safety and time-optimality by construction through the proven G-NE.

To further illustrate the computational cost of the proposed method, Fig 9 compares the online average computation time of the three methods at the two problem scales. In the scenario of 3 UAVs inspecting 28 wind turbines, the online average computation time of the proposed DDG method is 0.15 seconds, lower than that of GA-DZ [6] at 0.50 seconds and NN-DRL [9] at 0.30 seconds. The GA-DZ method suffers from a significant computational burden due to its reliance on online genetic algorithm optimization. The NN-DRL method, while faster than GA-DZ, still requires periodic online learning updates, resulting in higher computation times. When scaling up to 6 UAVs inspecting 40 wind turbines, the computation time of the DDG method increases only to 0.17 seconds, a growth of 13.3%, significantly lower than the 70.0% of GA-DZ and 50.0% of NN-DRL.

This advantage stems from the core design of the distributed differential game framework: each UAV only needs to solve a local optimal control problem (Eqs. 18–20), whose computational complexity depends solely on its own state and the number of neighbors , independent of the total system scale N, thereby ensuring the scalability of individual computations. At the same time, the distributed nature of the algorithm leads to a near-linear increase in the total system-wide computational load with respect to N, avoiding the combinatorial explosion or polynomial complexity often encountered in centralized global optimizers. The communication overhead scales with the network density (number of edges in graph ), and under the assumption of a strongly connected graph, the system is guaranteed to converge to a G-NE regardless of how N increases. By incorporating a local information interaction mechanism, the proposed method achieves a gradual increase in computational burden with scale, making it more suitable for real-world offshore wind farm inspection scenarios characterized by limited communication and variable scales.
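
The growth figures quoted above follow from simple arithmetic, sketched below. The 3-UAV times and the growth rates are taken from the text; the 6-UAV times for GA-DZ and NN-DRL are not stated explicitly and are derived here from the quoted rates, so treat them as implied values rather than reported measurements.

```python
def growth_rate(t_small, t_large):
    """Relative increase in computation time between the two problem scales."""
    return (t_large - t_small) / t_small


def scaled_time(t_small, rate):
    """Computation time at the larger scale implied by a growth rate."""
    return t_small * (1.0 + rate)


ddg_growth = growth_rate(0.15, 0.17)      # DDG: 0.15 s -> 0.17 s
ga_dz_6uav = scaled_time(0.50, 0.70)      # implied by the quoted 70.0 % growth
nn_drl_6uav = scaled_time(0.30, 0.50)     # implied by the quoted 50.0 % growth

print(f"DDG growth: {100 * ddg_growth:.1f} %")
print(f"implied 6-UAV times: GA-DZ {ga_dz_6uav:.2f} s, NN-DRL {nn_drl_6uav:.2f} s")
```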

Fig 9. Performance distribution comparison of three methods.

(a) The minimum distance between UAVs and wind turbines with the three methods; (b) The task completion time with the three methods.

https://doi.org/10.1371/journal.pone.0344989.g009

To further illustrate the efficacy of the proposed DDG method in minimizing inspection time without compromising safety, we conducted a comparative validation of the three methods in 30 randomly generated simulation scenarios. In each trial, the initial positions of the UAVs and the assignment sequence of wind turbines were varied within a defined operational range. Each scenario involves 3 UAVs and 28 wind turbines, with the inspection tasks per UAV and algorithm parameters remaining unchanged (see Tables 2 and 3). Fig 10 visually presents the data distribution of the three methods in terms of minimum distance and task completion time. The figure includes the median (line inside the box), interquartile range (box range), whiskers (normal data range), and outliers (individual points), providing complete statistical distribution information.
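
The box-plot quantities listed above can be computed as in the following sketch. The paper does not state which quantile or whisker convention was used; linear interpolation between order statistics and the common 1.5 × IQR fence rule are assumed here, and the sample data are made up for illustration.

```python
def percentile(sorted_values, q):
    """Quantile q in [0, 1] by linear interpolation between order statistics."""
    pos = (len(sorted_values) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def box_stats(values, fence=1.5):
    """Median, quartiles, whisker ends, and outliers under the fence * IQR rule."""
    s = sorted(values)
    q1, median, q3 = (percentile(s, q) for q in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    low_fence, high_fence = q1 - fence * iqr, q3 + fence * iqr
    inside = [v for v in s if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "whiskers": (inside[0], inside[-1]),
        "outliers": [v for v in s if v < low_fence or v > high_fence],
    }


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]   # made-up data with one extreme value
print(box_stats(sample))
```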

Fig 10. Comparison of computation time.

(a) The comparison of computation time for 3 UAVs inspecting 28 wind turbines; (b) The comparison of computation time for 6 UAVs inspecting 40 wind turbines.

https://doi.org/10.1371/journal.pone.0344989.g010

Fig 10(a) shows that the minimum distance distributions of the proposed DDG method (median: 5.25 m, interquartile range: 4.64–5.89 m) and the NN-DRL method (median: 5.13 m, interquartile range: 4.58–5.69 m) overlap heavily, and both lie significantly above the 4.0 m safety distance (indicated by the dashed line). Both methods maintain a 100% success rate across the 30 scenarios. The box height of the DDG method is slightly narrower than that of the NN-DRL method, indicating better inter-scenario consistency in maintaining safe distance. All data points are above 4.5 m, with no outliers. The GA-DZ method succeeded in only 6 scenarios (20% success rate), and although its minimum safe distance distribution (median: 4.13 m) is above the threshold, it contains multiple data points close to the lower limit. More importantly, 80% of the scenarios failed due to collisions or energy depletion, confirming the high risk of this method in practical applications.

Fig 10(b) shows that the task completion time distribution of the proposed DDG method (median: 420.0 s, interquartile range: 357.9–561.0 s) is entirely lower than that of the NN-DRL method (median: 503.2 s, interquartile range: 398.6–620.4 s). The box of the DDG method is completely below that of the NN-DRL method, visually demonstrating its efficiency advantage. The average task completion time of the DDG method (381.2 ± 19.8 s) is 87.7 seconds shorter than that of the NN-DRL method (468.9 ± 23.5 s), representing a relative improvement of 18.7%. This improvement is reflected in the box plot as a clear vertical offset. The interquartile range of the DDG method (36.6 s) is narrower than that of the NN-DRL method (41.6 s), indicating lower sensitivity to different scenario configurations and better predictability in time. The GA-DZ method has valid data only in 6 successful scenarios (median: 492.4 s), but considering its 80% failure rate, its average task completion time is effectively infinite.

The relevant performance metrics from the 30 randomly generated simulation scenarios are summarized in Table 6. Based on the comprehensive statistical analysis across 30 randomized scenarios, the proposed DDG method demonstrates significant and consistent advantages over the benchmark methods [6,9]. While maintaining statistically equivalent safety performance (5.36 ± 0.29 m) and perfect reliability (100% success rate) compared to the NN-DRL method, it achieves a remarkable 18.7% reduction in task completion time (381.2 vs. 468.9 seconds). This combination of enhanced efficiency and unwavering reliability—standing in sharp contrast to the GA‑DZ method, which fails in 80% of scenarios due to collisions or energy exhaustion—establishes the DDG framework as a highly efficient and robust solution for practical offshore wind farm inspection, particularly well‑suited for resource‑constrained UAV platforms.

Table 6. The comprehensive statistical results using the three methods.

https://doi.org/10.1371/journal.pone.0344989.t006

In summary, by incorporating collision avoidance constraints, trajectory optimization, and maximum range constraints, the proposed method enables the multi-UAV system to ensure operational safety while reducing the overall inspection time during offshore wind turbine inspections.

Scalability analysis

To concretely address scalability and computational burden, we performed a head-to-head comparison between our distributed DDG and an equivalent centralized game-theoretic solver. The centralized solver uses the same PMP principle and the proposed cost structure but optimizes the trajectories of all UAVs simultaneously using global information. The results are presented in Fig 11.

Fig 11. Performance Comparison.

(a) The minimum distance for 3 UAVs inspecting 28 wind turbines with the two methods; (b) The task completion time for 3 UAVs inspecting 28 wind turbines with the two methods; (c) The computation time for 3 UAVs inspecting 28 wind turbines with the two methods; (d) The minimum distance for 6 UAVs inspecting 40 wind turbines with the two methods; (e) The task completion time for 6 UAVs inspecting 40 wind turbines with the two methods; (f) The computation time for 6 UAVs inspecting 40 wind turbines with the two methods.

https://doi.org/10.1371/journal.pone.0344989.g011

1) For the 3-UAV / 28-turbine scenario:
The proposed DDG: Per-UAV computation times are 0.12 s, 0.15 s, and 0.13 s (average: approximately 0.13 s).
The centralized solver: The joint optimization requires 0.85 s to compute a solution for the entire system. Therefore, the distributed system is roughly six times faster per planning cycle when considering parallel execution, as shown in Fig 11(a).
2) For the 6-UAV / 40-turbine scenario:
The proposed DDG: Per-UAV times range from 0.155s to 0.18s (average: ).
The centralized solver: Computation time surges to .

The performance gap then widens significantly. The distributed system is now 17 times faster on average, and critically, the centralized time far exceeds the real-time threshold, as shown in Fig 11(b). This comparative analysis provides direct, empirical evidence that our proposed DDG framework successfully avoids the combinatorial explosion typical of centralized optimal control.
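
The speedup for the 3-UAV scenario can be recomputed from the timings listed above. The sketch reports two readings, since the text does not say which one was used: dividing the centralized time by the mean per-UAV time, or by the slowest UAV, which bounds the wall-clock time of one parallel planning cycle. The function names are illustrative.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def speedup(central_time, per_uav_times, parallel=False):
    """Centralized time divided by the mean (or, if parallel, the max) per-UAV time."""
    reference = max(per_uav_times) if parallel else mean(per_uav_times)
    return central_time / reference


ddg_3uav = [0.12, 0.15, 0.13]   # per-UAV DDG times in seconds (from the text)
central_3uav = 0.85             # centralized solver time in seconds (from the text)

print(f"mean per-UAV time: {mean(ddg_3uav):.3f} s")
print(f"speedup vs mean: {speedup(central_3uav, ddg_3uav):.1f}x")
print(f"speedup vs slowest UAV: {speedup(central_3uav, ddg_3uav, parallel=True):.1f}x")
```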

Notably, when the fleet size was increased by 100% (from 3 to 6 UAVs), the average per-agent computation time of our DDG method increased by only about 24%. This sub-linear growth is a direct outcome of the distributed architecture, where each UAV solves a local problem whose complexity is bounded by the size of its neighborhood , rather than by the global fleet size N.
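
Why the local problem size need not grow with the fleet can be illustrated by building each UAV's neighbor set from a communication range. The line formation, the 100 m spacing, and the 150 m range below are hypothetical values chosen for illustration; the paper's neighborhoods depend on its own sensing and communication model.

```python
import math


def neighbor_sets(positions, comm_range):
    """For each UAV index, the indices of other UAVs within communication range."""
    n = len(positions)
    return {
        i: [j for j in range(n)
            if j != i and math.dist(positions[i], positions[j]) <= comm_range]
        for i in range(n)
    }


def max_neighbors(fleet_size, spacing=100.0, comm_range=150.0):
    """Largest neighborhood in a line formation (hypothetical layout)."""
    positions = [(k * spacing, 0.0) for k in range(fleet_size)]
    sets = neighbor_sets(positions, comm_range)
    return max(len(v) for v in sets.values())


for n in (3, 6, 12):
    print(n, max_neighbors(n))
```

In this layout the largest neighborhood stays at two UAVs while the fleet grows, so each local optimal control problem keeps the same size.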

To further demonstrate the advantages of the proposed DDG method, a comparative analysis was conducted against the widely-used DMPC approach [15] under two operational paradigms: 3 UAVs inspecting 28 wind turbines and 6 UAVs inspecting 40 wind turbines. For each paradigm, 10 different random simulation scenarios were generated. The following performance metrics were statistically evaluated: the minimum safe distance between UAVs and obstacles (including turbines and other UAVs within sensing range), the total task completion time, and the on‑board computation time. In the DMPC [15] implementation, the prediction horizon is set to 10 steps. To ensure a fair comparison between the algorithms, the cost function is kept consistent with that of the DDG approach, and the fmincon solver is employed for optimization. The statistical results are presented in Fig 12.

Fig 12(a) and Fig 12(d) illustrate that, in terms of safety, both methods successfully avoid collisions in all simulations, consistently maintaining obstacle-avoidance distances above the 4.0 m safety threshold. This confirms that the DDG method preserves a safety level comparable to that of DMPC.

In terms of task efficiency, as observed in Fig 12(b) and Fig 12(e), the DDG method reduces the average task completion time by approximately 12%. For instance, in the 6-UAV scenario, DDG achieves , whereas DMPC [15] requires . This improvement stems from the game-theoretic foundation of DDG, which explicitly models the strategic interactions among agents and drives the system toward an NE. Under the assumption of strong connectivity, the L-NE attained by DDG coincides with the G-NE, ensuring that each UAV's trajectory is globally balanced and coordinated. In contrast, DMPC [15] relies on algorithmic optimization to seek locally optimal solutions at each sampling instant. While such solutions may optimize certain global objectives, they often do so at the expense of individual agent performance, leading to longer overall mission times.

Regarding computational efficiency, Fig 12(c) and Fig 12(f) demonstrate that the on-board computation time of DDG exceeds that of DMPC by to . In the 6-UAV scenario, DDG requires , whereas DMPC only demands . This difference arises because DDG must obtain the L-NE strategy through an online game-theoretic iteration at each sampling instant, whereas DMPC solves a single online optimization problem per sampling interval over a prediction horizon that is shorter than the game-theoretic cycle of DDG. Even so, the computation time of DDG increases only mildly with fleet size, which supports its scalability and real-time capability.

In summary, the proposed DDG method not only matches DMPC in safety assurance but also significantly outperforms it in task efficiency, even though its online computation time is slightly higher. Its game‑theoretic formulation ensures globally balanced trajectories with low communication and computation demands, surpassing both the GA-DZ [6] and NN-DRL methods [9]. This makes it particularly suitable for distributed cooperative inspection in offshore wind farms, where communication is limited.

Conclusion

This paper introduces an optimal coordinated control strategy designed to minimize task completion time for multi-UAV inspection systems in offshore wind farms, subject to limited sensing capabilities and round-trip mission constraints. The coordination challenge is formulated using a novel DDG framework, which avoids the need for global system information. The proposed model explicitly integrates round-trip requirements into a game-theoretic objective function to facilitate energy-aware trajectory planning. With a strongly connected communication graph, the L-NE from decentralized solving of the DDG provably converges to the G-NE, thereby ensuring system-wide coordination optimality under energy and operational constraints. Simulation results validate the framework's efficacy, confirming its ability to enhance inspection efficiency through a marked reduction in task completion time.

Limitations and future work

Current Limitations: The present model assumes ideal, delay-free communication within the sensing range and does not explicitly account for dynamic environmental disturbances such as wind gusts.

To enhance the realism and robustness of the proposed framework, future work will focus on incorporating time-varying communication topologies, developing communication delay compensation mechanisms, modeling dynamic wind fields, and formulating strategies to handle unexpected obstacles. Stochastic wind models will also be integrated into the dynamics and cost function for more resilient trajectory planning. Additionally, large-scale simulations (e.g., ) will be conducted to empirically quantify the relationship between system scale and performance, further validating the scalability and practical applicability of the method.

Supporting information

S1 Appendix. This appendix contains the detailed derivation of the adjoint system (33).

https://doi.org/10.1371/journal.pone.0344989.s001

(PDF)

References

1. Yang C, Zhou H, Liu X, Ke Y, Gao B, Grzegorzek M, et al. BladeView: Toward Automatic Wind Turbine Inspection With Unmanned Aerial Vehicle. IEEE Trans Automat Sci Eng. 2025;22:7530–45.
2. Yan M, Yuan H, Xu J, Yu Y, Jin L. Task allocation and route planning of multiple UAVs in a marine environment based on an improved particle swarm optimization algorithm. EURASIP J Adv Signal Process. 2021;2021(1).
3. Li Z, Wu J, Xiong J, Liu B. Research on automatic path planning of wind turbines inspection based on combined UAV. In: 2024 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB). IEEE; 2024. p. 1–6.
4. Chung H-M, Maharjan S, Zhang Y, Eliassen F, Strunz K. Placement and Routing Optimization for Automated Inspection With Unmanned Aerial Vehicles: A Study in Offshore Wind Farm. IEEE Trans Ind Inf. 2021;17(5):3032–43.
5. Huang X, Wang G. Optimization for total energy consumption of drone inspection based on distance-constrained capacitated vehicle routing problem: A study in wind farm. Expert Syst Appl. 2024;255:124880.
6. Peng Z, Sun S, Tong L, Fan Q, Wang L, Liu D. Optimization of offshore wind farm inspection paths based on K-means-GA. PLoS One. 2024;19(5):e0303533. pmid:38781135
7. Miao F, Li H, Mei X. Three-Dimensional Path Planning of UAVs for Offshore Rescue Based on a Modified Coati Optimization Algorithm. JMSE. 2024;12(9):1676.
- View Article
- Google Scholar
8. Zhang X-Y, Yu H, Zheng X, Wang H, Mu C, Guo P. Hybrid Deep Reinforcement Learning for UAV Inspection in Large-Scale Wind Farms: Deployment and Routing Optimization. IEEE Trans Ind Inf. 2025;21(11):9011–21.
- View Article
- Google Scholar
9. Fan T, Fu L, Guo C, Zhang Y, Sun L. Multi-UAV Inspection Optimization for Offshore Wind Farms Considering Battery Exchange Process. IEEE Trans Intell Veh. 2025;10(2):972–82.
- View Article
- Google Scholar
10. Wang Z, Yuan Q, Zhang F, Wang Y, Kou L, Wen J, et al. A Chaotic Simulated Annealing Genetic Algorithm with Asymmetric Time for Offshore Wind Farm Inspection Path Planning. IJBIC. 2024;1(1).
- View Article
- Google Scholar
11. Heinze J, Schopferer S, de Haag MU. Trajectory planning for offshore wind farm logistics with unmanned aircraft. In: 2024 AIAA DATC/IEEE 43rd Digital Avionics Systems Conference (DASC). IEEE; 2024. p. 1–10.
12. Huang X, Wang G, Lu Y, Jia Z. Study on a Boat-Assisted Drone Inspection Scheme for the Modern Large-Scale Offshore Wind Farm. IEEE Syst J. 2023;17(3):4509–20.
- View Article
- Google Scholar
13. Xuan L, Yang P, Lin Z, Yan P, Dong S, Dong J, et al. Research on design and control method of lightning protection detection robot for wind turbine. In: 2023 5th International Conference on Robotics and Computer Vision (ICRCV). IEEE; 2023. p. 41–5.
14. Wang L, Kou L, Wang Z, Zhang F, Yuan Q. Optimisation of single starting point path for offshore wind farm inspection based on MTSP. IJWMC. 2025;29(3):300–9.
- View Article
- Google Scholar
15. Kong S, Liu Y. Model predictive control for risk-based inspection and maintenance planning of offshore wind turbines. Ships Offshore Struct. 2025;:1–12.
- View Article
- Google Scholar
16. Li Y, Hu X. A differential game approach to intrinsic formation control. Automatica. 2022;136:110077.
- View Article
- Google Scholar
17. Rattanakul S, Chansiri K. Analysis of strategic and tactical conflict management for urban air mobility and UAS in high-density airspace. J Theoret Exp Sci Adv. 2025;15(10):1–15.
- View Article
- Google Scholar
18. Yu D, Ge SS, Li D, Wang P. Finite-horizon robust formation-containment control of multi-agent networks with unknown dynamics. Neurocomputing. 2021;458:403–15.
- View Article
- Google Scholar
19. Cappello D, Garcin S, Mao Z, Sassano M, Paranjape A, Mylvaganam T. A Hybrid Controller for Multi-Agent Collision Avoidance via a Differential Game Formulation. IEEE Trans Contr Syst Technol. 2021;29(4):1750–7.
- View Article
- Google Scholar
20. Xue W, Zhan S, Wu Z, Chen Y, Huang J. Distributed multi-agent collision avoidance using robust differential game. ISA Trans. 2023;134:95–107. pmid:36182609
- View Article
- PubMed/NCBI
- Google Scholar
21. Xu Y, Yang H, Jiang B, Polycarpou MM. A Games-in-Games Framework for Task Allocation, Path Planning, and Formation Control. IEEE Trans Control Netw Syst. 2025;12(1):620–33.
- View Article
- Google Scholar
22. Xue W, Zhan S, Chen N, Huang J. Optimal coordination in logistics warehousing with sensing and communication limits: A distributed differential game approach. Asian J Control. 2023;26(1):419–35.
- View Article
- Google Scholar
23. Singh SK, Reddy PV, Vundurthy B. Study of Multiple Target Defense Differential Games Using Receding Horizon-Based Switching Strategies. IEEE Trans Contr Syst Technol. 2022;30(4):1403–19.
- View Article
- Google Scholar
24. Wang X, Xiao Z, Ren Z, Dong C, Tian XD. Formation cooperative trajectory tracking control for unmanned aerial vehicles via differential game and reinforcement learning. Trans Instit Measur Control. 2024;47(9):1762–70.
- View Article
- Google Scholar
25. Huang X, Wang G. Saving Energy and High‐Efficient Inspection to Offshore Wind Farm by the Comprehensive‐Assisted Drone. Int J Energy Res. 2024;2024(1).
- View Article
- Google Scholar
26. Castelar Wembers C, Pflughaupt J, Moshagen L, Kurenkov M, Lewejohann T, Schildbach G. LiDAR‐based automated UAV inspection of wind turbine rotor blades. J Field Robot. 2024;41(4):1116–32.
- View Article
- Google Scholar
27. Zhou P, Chen BM. Semi-global leader-following consensus-based formation flight of unmanned aerial vehicles. Chin J Aeronaut. 2022;35(1):31–43.
- View Article
- Google Scholar
28. Zhou S, Hua H, Dong X, Li Q, Ren Z. Air-ground time varying formation tracking control for heterogeneous UAV-UGV swarm system. Aero Weaponry. 2019;26(4):54–9.
- View Article
- Google Scholar
29. Jond HB. Distributed Differential Graphical Game for Control of Double-Integrator Multi-Agent Systems With Input Delay. IEEE Trans Control Netw Syst. 2024;11(4):1949–61.
- View Article
- Google Scholar
