Distributed optimal power flow

  • HyungSeon Oh

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    hoh@usna.edu

    Affiliation Department of Electrical and Computer Engineering, United States Naval Academy, Annapolis, Maryland, United States of America

Correction

15 Jul 2021: The PLOS ONE Staff (2021) Correction: Distributed optimal power flow. PLOS ONE 16(7): e0255014. https://doi.org/10.1371/journal.pone.0255014

Abstract

Objective

The objectives of this paper are to 1) construct a new network model compatible with distributed computation, 2) construct the full optimal power flow (OPF) in a distributed fashion so that an effective, non-inferior solution can be found, and 3) develop a scalable algorithm that guarantees the convergence to a local minimum.

Existing challenges

Due to the nonconvexity of the problem, the search for a solution to OPF problems is not scalable, which severely limits the use of OPF in operating large-scale, real-world power grids (“the curse of dimensionality”). Recent attempts at distributed computation aim for a scalable and efficient algorithm by reducing the computational cost per iteration in exchange for increased communication costs.

Motivation

A new network model allows for efficient computation without increasing communication costs. With the network model, recent advancements in distributed computation make it possible to develop an efficient and scalable algorithm suitable for large-scale OPF optimizations.

Methods

We propose a new network model in which all nodes are directly connected to the center node to keep the communication costs manageable. Based on the network model, we suggest a nodal distributed algorithm with direct communication to all nodes through the center node. We demonstrate that the suggested algorithm converges to a local minimum, rather than merely to a point satisfying the first-order optimality condition.

Results

The proposed algorithm identifies solutions to OPF problems in various IEEE model systems. The solutions are identical to those obtained using a centrally optimized, heuristic approach. The computation time at each node does not depend on the system size, and the number of iterations (Niter) does not increase significantly with the system size.

Conclusion

Our proposed network model is a star network that maintains the shortest node-to-node distances and allows a linear information exchange. The proposed algorithm guarantees convergence to a local minimum rather than a maximum or a saddle point, and it remains computationally efficient and scalable for large-scale OPF problems.

Introduction

In modern societies, demand for electricity is expected to be satisfied continuously via controllable generation technologies. An event is a situation in which the demand is not fulfilled. Ten-in-one, a widely used reliability criterion for events, means that an event should occur just once in a 10-year span. To meet this standard, system operators schedule the generation portfolio and the grid systems in advance. For example, a day-ahead unit commitment determines the 24 hourly dispatches, along with unit commitment decisions, to meet varying hourly demands. For each hour, the demand profiles are assumed to be constant, which defines the process’s steady-state operation. In the absence of an unexpected disturbance, stochastic hourly demand is the only source of uncertainty in traditional power system operation. Over the last decade or so, renewable energy resources and smart grid technologies have been integrated into systems to improve energy efficiency and reduce greenhouse gas emissions. This integration has introduced additional uncertainty into the operation of power systems, presenting a new challenge. If high-precision forecasting could be introduced to estimate future energy resources and control demand, existing operational tools would remain useful, assuming the forecasts could be integrated into the expected effective demand (≡ demand − expected demand reduction − expected renewable energy resources). Unfortunately, even though the precision of forecasting tools has improved, the errors in longer-horizon forecasts (a day ahead, for example) are not yet sufficiently small for reliable operation. Making decisions more frequently is a potential way to accommodate the uncertainty. For example, a day-ahead 24-hour unit commitment (UC) decision is made once in a daily cycle. If the errors in 2-hour-ahead forecasts are small enough, then a UC decision updated with a new forecast every 2 hours would still be a reliable tool for the power system’s operation. The computational capability to support such decisions plays a key role in this process.

Optimal power flow (OPF) is a backbone of the steady-state operation of power systems. OPF is highly nonlinear and nonconvex. The computational complexity associated with these characteristics makes power flow (PF) analysis a non-deterministic polynomial-time (NP)-hard problem [1]. In most operational practices, a linear approximation of OPF, namely, direct current (DC) OPF, is pursued. Although easy to solve, DC OPF does not address voltage problems, losses, or the dispatch of reactive power generation. Due to these issues, a DC OPF solution may not be feasible. To address the problem correctly, it is ideal to aim for the nonlinear and nonconvex OPF. In addition to the nonconvex nature of the full OPF, uncertainties increase the number of variables in traditional, central decision-making processes. Therefore, frequent but short-term decisions concerning large-scale power systems can be challenging. With the recent advancements in multi-core hardware, distributed computation becomes an attractive approach for enhancing computational efficiency. An exemplary area in power system analysis concerns the use of distributed computation for OPF. Interested readers can find surveys of distributed approaches to solving OPF problems in [2, 3]. Within distributed computation, the alternating direction method of multipliers (ADMM) has gained popularity due to its straightforward implementation and its provable convergence (if the original problem is convex). For a linear DC OPF, the ADMM approach can result in successful convergence to a global solution [4–6]. A full OPF problem on a radial network can be relaxed to a convex semidefinite programming (SDP) problem, and the relaxed problem is exact if optimal power injections lie in a region where the voltage upper bounds do not bind [7, 8]. Several studies have used the ADMM approach to solve OPF problems in a radial network [9–11]. OPF problems usually involve the operation of mesh transmission systems, but an SDP solution for a mesh network may not be physically meaningful [2].

A nodal OPF would be the most intuitive approach to extending a central OPF in a distributed fashion. In the nodal OPF [12], the information exchange among the nearest neighbors leads to high communication costs. The maximum node-to-node distance (also termed the path length (PL) [13]) plays a key role in the communication costs. The convergence tends to be very slow because the information received during communications is contaminated by local decisions. To the best of our knowledge, there has been no report of the successful convergence of this approach for any mesh network.

The PL can be reduced when a clustering approach is undertaken for a mesh network via the partitioning of a system into multiple subsystems [14–19]. Two adjacent subsystems share some nodes and the branches between those nodes; thus, the PL is small, keeping the communication costs manageable. The primal and the dual variables at the shared nodes and branches are constrained to be equal. This approach can be efficient if the shared nodes adequately represent the other nodes in the same subsystem. Several studies have proposed an efficient algorithm for partitioning a system so that the ADMM converges to a solution [16, 17, 19, 20]. In contrast to a study by Sun, Phan, and Ghosh [12], the flow constraints can be integrated for intra-subsystem lines. In two studies, Erseghe [14, 15] integrated the flow limits of inter-subsystem lines by redefining the subsystems to overlap the lines. Guo, Hug, and Tonguz [16] report that the inclusion of the limit still yields a solution. However, the approach has several shortcomings that contribute to computational inefficiency: 1) the low quality of the solution, 2) the need for a warm starting point for convergence, and 3) the communication costs. In addition, the convergence behavior is not reported, so it is not possible to discuss the computational efficiency. The solutions presented in several studies [14–17] are low-voltage, inferior solutions due to increased losses. Engelmann et al. [18] added a significantly large term regarding reactive power injections that affects the optimality conditions and, as a result, the distributed problem is different from the original one. The necessity of a solution for the nonconvex PF as a starting point increases the computational costs. Even though the PL decreases in comparison to the nodal OPF in the study by Sun, Phan, and Ghosh [12], the communication costs increase the computational costs due to the tradeoff discussed by Guo, Hug, and Tonguz [16]. 
These shortcomings make the benefit of distributed computation questionable. The distributed SDP approach by Madani, Kalbat, and Lavaei [19] yields the global (and therefore identical) solution to the central SDP approximation to the OPF problem because the problem is convex, but the solution may not be physically feasible. In addition to the computational inefficiency, there is no approach that can theoretically yield an optimal balance between the computational cost for one subsystem and the communication costs of the subsystems. Another study by Guo, Hug, and Tonguz [20] proposes a heuristic approach for selecting subsystems, but it does not yield a unique choice because its initialization is based on the local solution for the nonconvex OPF. In several studies [14, 17, 18, 20], a positive correlation was observed between the system size and the number of nodes in the largest subsystems (see Fig 1). The PL depends linearly on the number of subsystems n, which depends inversely on the size of the largest subsystem (Nsub), i.e., n ≥ Nb/Nsub, where the equality holds when the sizes of the subsystems are uniform. Fig 1 indicates a positive correlation between Nsub and Nb. The dotted line indicates that the best linear fit for the relationship is Nsub ∝ Nb^0.88. If the optimization problem is solved by SDP, the computation time reported in [21] is proportional to ϑ(Nb^2.64). In addition, the approaches require significant communication costs as well. Therefore, the overall computation cost is much higher than that of the central optimization. In addition to the computational efficiency, it is not guaranteed that the voltages at the boundary of each subsystem represent the voltages of other nodes inside the same subsystem correctly. If they do not, convergence may not be observed because the information exchange is limited to the boundary buses. 
The slow and/or non-monotonic convergence reported in previous studies [14–17] indicates the insufficient representativeness of the boundary buses. A relatively fast convergence is reported by Engelmann et al. [18] in exchange for higher per-step communication costs, because the sensitivities are shared in addition to the local primal variables. The increase in the communication cost is found in where ni is the dimension of the local gradient at the ith group. Although the progress at each iteration is faster than those of other ADMM approaches, the communication cost itself is much higher than the total computation costs of the nonconvex heuristic solvers or of the SDP solvers. As a result, although these studies are worth exploring, we conclude that the aggregation approach is not practical for a scalable algorithm, in terms of both computational efficiency and solution quality, due to the tradeoff issue, nonconvexity, and the modeling problem. A new network model for OPF is necessary for the distributed computation.

Fig 1. The largest number of nodes in the partitioned subsystems of various power system test cases reported in the literature [14, 17–19].

https://doi.org/10.1371/journal.pone.0251948.g001

The contributions of this paper are 1) a new network model that yields direct communication among nodes regardless of the system size, 2) a distributed, fast, and efficient algorithm to solve highly nonlinear and nonconvex OPF problems, and 3) a scalable algorithm that guarantees convergence to a local minimizer. The paper is organized as follows: the theory section proposes a new network model designed for distributed computation and presents an algorithm to solve a full OPF; the next section describes the details of the implementation of the proposed algorithm; the results and discussion present the results for the OPF and comparisons with those from other studies; the following section provides conclusions and research directions for further improving the computational efficiency; and the appendices sketch the proofs of the ranks of the matrices associated with OPF problems and of the convergence.

Theory

Proposed network model and algorithm

We propose a new network model for a nodal OPF to keep communication costs manageable regardless of the system size. For this purpose, the desired properties of the model are as follows:

  1. The model must be compatible with PF studies for which Kirchhoff’s laws and voltage magnitudes are well defined.
  2. Each node is a short distance away from the rest of the nodes to minimize the communication costs.
  3. The voltages at a node and those at the rest of the nodes are linearly related.

In the power flow studies, such as PF, OPF, state estimation, and probabilistic PF, voltages are the variables. The constraints in the studies consist of the power flows and injections, as well as the voltage magnitudes in terms of voltages. In the Cartesian coordinate system, the power flows and injections are quadratic in voltage. For example, the power flow over the line connecting Nodes i and j at i is , and the power injection at Node i is where the quantities sandwiched between voltages are in 2Nb-by-2Nb; where ei-j is a vector with the cardinality of 2Nl of which the element corresponding to the flow over i-j is 1, and all other elements are zeros, ; and the superscript H is the conjugate transpose. The matrices and Si have two nonzero rows at i and Nb+i rows due to J ei.

  1. Claim 1: The matrices associated with power flows and power injections are all of rank 4.
  2. Claim 2: The matrices associated with the squares of the voltage magnitudes have rank 2.

See S1 Appendix for the proof of the claims.
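The two rank claims can be checked numerically on a small network. The sketch below is an illustration rather than the construction used in S1 Appendix: it assumes the real-valued voltage vector is ordered as v = [Re V; Im V], builds the symmetric matrices whose quadratic forms give the nodal real and reactive power injections and the squared voltage magnitude, and uses the 4-bus topology described for Fig 2 with hypothetical branch impedances. The function names are introduced here for illustration only.

```python
import numpy as np

def ybus_from_branches(n_bus, branches):
    """Bus admittance matrix from (from_bus, to_bus, series_impedance), 0-based."""
    Y = np.zeros((n_bus, n_bus), dtype=complex)
    for i, j, z in branches:
        y = 1.0 / z
        Y[i, i] += y
        Y[j, j] += y
        Y[i, j] -= y
        Y[j, i] -= y
    return Y

def injection_matrices(Y, i):
    """Symmetric P, Q such that p_i = v^T P v and q_i = v^T Q v, v = [Re V; Im V]."""
    n = Y.shape[0]
    g, b = Y[i].real, Y[i].imag
    A_p = np.zeros((2 * n, 2 * n))
    A_q = np.zeros((2 * n, 2 * n))
    # Only rows i and n+i are nonzero before symmetrization.
    A_p[i, :n], A_p[i, n:] = g, -b
    A_p[n + i, :n], A_p[n + i, n:] = b, g
    A_q[i, :n], A_q[i, n:] = -b, -g
    A_q[n + i, :n], A_q[n + i, n:] = g, -b
    return 0.5 * (A_p + A_p.T), 0.5 * (A_q + A_q.T)

def magnitude_matrix(n_bus, i):
    """Symmetric E such that |V_i|^2 = v^T E v."""
    E = np.zeros((2 * n_bus, 2 * n_bus))
    E[i, i] = 1.0
    E[n_bus + i, n_bus + i] = 1.0
    return E

# Branches 1-2, 1-3, 1-4, 2-4, 3-4 (0-based); impedances are hypothetical.
BRANCHES = [
    (0, 1, 0.01 + 0.05j),
    (0, 2, 0.02 + 0.06j),
    (0, 3, 0.01 + 0.04j),
    (1, 3, 0.03 + 0.08j),
    (2, 3, 0.02 + 0.05j),
]
Y = ybus_from_branches(4, BRANCHES)

for node in range(4):
    P, Q = injection_matrices(Y, node)
    E = magnitude_matrix(4, node)
    ranks = [int(np.linalg.matrix_rank(M)) for M in (P, Q, E)]
    print(f"node {node}: rank(P), rank(Q), rank(E) = {ranks}")
```

Each unsymmetrized matrix has two nonzero rows, so symmetrizing it gives rank at most 4; the magnitude matrix has exactly two nonzero diagonal entries.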

For a real-valued symmetric rank-4 matrix Mj, a real-valued eigen pair λ and u exist such that . Because Mj is a rank 4 matrix, the number of columns in Фj and the number of nonzero diagonal elements in IIj are both 4. Here, the terms and : refer to the matrices associated with the real power injection at Node j ; is associated with reactive power injection at Node j ; is associated with real power flow over the line j-l at the side of Node j ; and is associated with the reactive power flow over the line j-l at the side of Node j . Note that the eigenpairs are dependent solely on the system parameters (not on voltages). With the eigenpairs, the quantities of interest in the PF are the real power injection at Node j pj, the reactive power injection at Node j qj, the real power flow over a line j-l at Node j , the reactive power flow over a line j-l at Node j , and the squared voltage’s magnitude at Node j Ej. There are quadratic relationships among them: (1)

Eq (1) expresses Kirchhoff’s laws and the voltage magnitudes. The nodal variables αj, βj, γj-l,j, δj-l,j, and ωj are defined as follows: the αj are the voltages projected onto the eigenspaces spanned by the real power injection, and the βj cover the reactive power injection, the γj-l,j cover the real power flow over j-l at Node j, the δj-l,j cover the reactive power flow over j-l at Node j, and the ωj cover the voltage magnitudes at the jth node in the same manner. Although there is one each of αj, βj, and ωj at each node, the nlj of γj-l,j and δj-l,j are defined such that nlj represents the number of branches that are directly connected to Node j (i.e., l = 1, 2, …, nlj). It is worth noting that the nodal variables are linearly associated with the voltages (the definitions of the nodal variables in Eq (1)), which indicates that PL = 1. Therefore, the nodal variables satisfy all the desired properties of the new network model.

Next, we let nlj be the number of lines connected to the jth node. Then, the dimensions of α, β, γ, δ, and ω are 4, 4, 4nlj, 4nlj, and 2, respectively. Note that their dimensions do not depend on the system size. Let μj be a nodal variable that integrates local generation variables as follows: where μend = 1. The generalized PF formulation is listed in previous work [22]. Note that the cardinality of the variable is fixed—independent of the system size. Therefore, the new network model yields: 1) a fixed number of variables, and 2) a PL of 1.

Proposed network model: A star grid with two channels

In the definition of the local variable xj, it is noticeable that there are three variable types: 1) power injection and flow variables α, β, γ, and δ, 2) the voltage magnitude variable ω, and 3) quadratic variables, Q xj comprising power flows and power generations (gj). The first two types of variables are linear, and the third type is quadratic, in terms of the voltages. Once xj (α, β, γ, and δ) is determined, it is straightforward to find the power generations at Node j with Eq (1) (i.e., xj defines the feasible region of μj). The voltage vp can be reconstructed from α, β, γ, and δ. The communication path used to collect their values is termed the power channel. Here, ω is directly associated with the voltage magnitudes, and the values are reported through another communication path termed the voltage channel to update voltages vm. Although the voltages vp and vm should be identical, conditions are relaxed so that they can be different.

Fig 2 illustrates an example of the traditional network model (left) and the proposed network model (right) for a modified 4-bus system that has branches connecting 1–2, 1–3, 1–4, 2–4, and 3–4. In the proposed model, all the nodes are connected to the center node (black dot at the center) through the power channel (red lines) and the voltage channel (green lines), and all the nodes are of distance 1 from the center (PL = 1). The network is a star network, i.e., a complete bipartite graph whose diameter is 2. All communications between the variables in the local nodes (α, β, γ, δ, and ω) and the center node are linear in terms of voltages.

Fig 2. A star and linear network for a modified IEEE-4 bus system.

The traditional network model is shown on the left (A) and the proposed network model is presented on the right (B). Red lines indicate the power channel, and green lines are the voltage channel.

https://doi.org/10.1371/journal.pone.0251948.g002
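The difference in hop counts between the two models can be illustrated with a breadth-first search. This is a minimal sketch: the mesh edges follow the modified 4-bus system described above, while the label "c" for the center node and the helper names are assumptions introduced for illustration.

```python
from collections import deque

def adjacency_from_edges(edges):
    """Undirected adjacency lists from a list of (u, v) edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    return adjacency

def hop_distances(adjacency, source):
    """Breadth-first search: number of hops from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def max_pairwise_hops(adjacency):
    """Largest hop count between any two nodes (graph diameter)."""
    return max(max(hop_distances(adjacency, s).values()) for s in adjacency)

# Traditional model: buses exchange information along physical branches.
mesh = adjacency_from_edges([(1, 2), (1, 3), (1, 4), (2, 4), (3, 4)])
# Proposed model: every bus talks only to the center node "c" (hypothetical label).
star = adjacency_from_edges([("c", bus) for bus in (1, 2, 3, 4)])

print("mesh, largest bus-to-bus hop count:", max_pairwise_hops(mesh))
print("star, largest center-to-bus hop count:", max(hop_distances(star, "c").values()))
```

In the star model the center-to-bus hop count stays at 1 however many buses are added, whereas the bus-to-bus hop count of a mesh depends on its topology.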

Nodal OPF with nodal variables and its convex relaxation

Even though we describe the OPF problem in this paper, the same framework applies to other problems, including PF, state estimation, and probabilistic PF problems, because they have unified formulas [22]. The central OPF formulation as a nonconvex, quadratically constrained quadratic problem is given in Eq (2), its distributed optimization problem (nodal OPF) at each node i is given in Eq (3A), and the nodal OPF is formulated in terms of an ADMM algorithm in Eq (3B). (2) (3A) where and where A is a block matrix to collect an element ; ; ; ; ; ; ;; ; ; ; ; ; and ; , and . (3B) where , ,, or equivalently ; and Xj is the column space of Фj. at Node j defined by the constraints listed in 1–5 in (3A).

Note that all A’s are full column rank matrices that collect relevant parts from μi, and that Фi includes the linear relationships between the local variable xi and the central variable v and y. Using the definition of the function fj, the following observations are made: 1) fj is a nonconvex, but smooth, function that is C1 on an open set containing Xj (defined by the column space of Фj), and ∇fj is Lipschitz continuous on Xj; 2) Xj is nonempty, closed, and convex; and 3) wj is coercive.

Even though Eqs (3A) and (3B) have low cardinalities in the decision variables, the uniqueness and existence of the solution are not guaranteed due to the nonconvex nature of the problem. To address the complexity issue, a surrogate function hj approximating Eq (3B) is defined as follows: (4)

Because uj is the SDP relaxation of fj, it is convex, Lipschitz continuous, and continuously differentiable on Xj. Note that uj relaxes the rank constraint only; therefore, the first derivatives of uj and of fj with respect to xj are identical (i.e., ). Because the SDP is convex, the solution at the kth iteration is uniquely determined by at given yk and zjk. Note that the relaxed nodal OPF is convex and has a small number of variables; therefore, its computational complexity remains manageable.

Properties of uj, fj, gj, hj, wj. and

The nodal problem fj is C smooth, but nonconvex, and prox-regular at xj relative to if xj ∈ dom(fj), . Further, whenever and , there exists : ; and . Statement (f2) holds by the Descent Lemma [23] with the properties of gj and of hj described below. Its SDP relaxation uj is strongly convex, Lipschitz continuous, and prox-regular, and its first derivative equals that of fj (i.e.: ; and Whereas the variables considered in fj and uj are all local primal variables only, ADMM constraint gj depends on the central variable y and the multipliers z. With the properties of uj, fj, and gj, wj is C smooth, but nonconvex, and prox-regular, whereas hj is strongly convex and Lipschitz continuous. Because all Xj are bounded sets, wj and hj are coercive and lower-bounded over Xj. Because is the optimizer of convex hj (i.e., ), the following holds:

and .

Proposed algorithm with provable convergence

Overview of proposed algorithm.

Even though the number of variables in the nodal OPF is much smaller than that in the original problem, the nodal problem remains difficult to solve due to its nonconvex nature. To manage this, the nonconvex components are relaxed. Once the relaxed solution is identified, it is mapped onto the closest point in the feasible region. There are three cases: 1) the convexified problem is not feasible, 2) the convexified problem is feasible, but the solution to the problem is too far from the feasible region, and 3) the solution to the convexified problem is sufficiently close. Only if the mapped point is close enough are the nodal variables updated and fed to the center node to update voltages.

Proposed algorithm.

We propose the following algorithm to solve wj in conjunction with hj in Eq (4):

Distributed, Regulated, Optimally-Homogeneous, and Scalable (DROHS) Algorithm

1. Set k = 0, initialize all parameters, such as Δk ∈(0,1), yk, ρj, .

2. If satisfies the termination criteria, terminate the algorithm.

3. Solve the local problems, .

4. Depending on the solution , determine (Fig 3)

  • Case 1: is in the real space of convex Xj, i.e., in the column space of Фj, .
  • Otherwise : Find a feasible projection of onto the feasible region Xj, near ;
  • Case 2: the projection is close enough to (i.e., where ); accept the solution , ;
  • Case 3: is not sufficiently close; reject the solution , ;
  • update , ;
  • set the error bound asymptotically to vanish as iteration proceeds (i.e., ).

5. Compute , and update accordingly; and where a ∈ (0.5, 1).

6. Update so that all x-variables are consistent with global variable y (i.e., ).

7. k ← k + 1, and go to 2.

Fig 3. x-update from the result of optimization with 1) a feasible solution , 2) an infeasible solution, but the distance between the solution and the projection onto the feasible region is close enough , and 3) an infeasible solution in that the projection onto the feasible region is too far .

https://doi.org/10.1371/journal.pone.0251948.g003

Fig 3 illustrates how the x-optimization process determines the described in Rule 4 of the algorithm. If the solution is feasible in Xj, is the solution (Case 1). If the solution is not in Xj but the point projected onto the convex region Xj is sufficiently close, is the projected point (Case 2). For Case 3, where either no solution is identified or the identified solution is not in Xj, and the distance to the projected point is too far away, the solution is rejected, and is the point determined in the previous step. Upon the determination of , the iteration proceeds to a point between the current and the projected point depending on parameter Δk.
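The three cases can be summarized in a short sketch. Because the update formulas of Rule 4 are not reproduced in this text, the code below rests on stated assumptions and should not be read as the DROHS implementation: Xj is taken to be the column space of Φj, the projection is an orthogonal (least-squares) projection, closeness is measured with the Euclidean norm against a tolerance eps_k, and the step is x_prev + Δk (target − x_prev). All function and argument names are hypothetical.

```python
import numpy as np

def project_onto_column_space(Phi, x):
    """Orthogonal projection of x onto the column space of Phi (least squares)."""
    coeffs, *_ = np.linalg.lstsq(Phi, x, rcond=None)
    return Phi @ coeffs

def regulated_x_update(Phi, x_solver, x_prev, eps_k, delta_k, tol=1e-9):
    """Classify the local solution and return (case, x_next).

    x_solver is the local solver output, or None if no solution was identified.
    Assumed step rule: move from x_prev toward the accepted target by delta_k.
    """
    if x_solver is None:
        return 3, x_prev.copy()              # nothing to accept: keep previous point
    x_proj = project_onto_column_space(Phi, x_solver)
    gap = np.linalg.norm(x_solver - x_proj)
    if gap <= tol:
        case = 1                             # already in the feasible set
    elif gap <= eps_k:
        case = 2                             # infeasible, but projection is close
    else:
        return 3, x_prev.copy()              # projection too far: reject
    return case, x_prev + delta_k * (x_proj - x_prev)

# Toy usage: the feasible set is the plane spanned by the first two axes.
Phi = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
x_prev = np.zeros(3)
for candidate in ([1.0, 1.0, 0.0], [1.0, 1.0, 0.05], [1.0, 1.0, 1.0]):
    case, x_next = regulated_x_update(Phi, np.array(candidate), x_prev,
                                      eps_k=0.1, delta_k=0.3)
    print(case, x_next)
```

A rejected solution leaves the nodal variable unchanged, so a node in Case 3 simply waits for the next iteration.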

After the x-optimization is performed, y-optimization follows. The y-optimization at given and zk is a least-squares problem, which is convex. It should be noted that the solution to the convex problem is always feasible. Rule 6 is a striking feature of the proposed algorithm that is critically different from the ADMM approaches. Instead of the locally independent update of the primal variables in the ADMM approaches, the nodal variables are updated 1) linearly for the computational efficiency, and 2) with respect to the central variable so that the update is reasonably consistent among nodal variables. The proof of the convergence of the proposed algorithm is presented in this paper’s S2 Appendix.
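To make the least-squares character of the y-step concrete, the sketch below solves a generic consensus update. It assumes the standard augmented-Lagrangian form with coupling xj = Φj y, so that y minimizes the sum over nodes of (ρj/2)‖xj + zj/ρj − Φj y‖²; the exact objective used in the paper is not reproduced in this text, and the function name is hypothetical.

```python
import numpy as np

def y_update(Phis, xs, zs, rhos):
    """Weighted least-squares update of the central variable y.

    Assumed objective: sum_j (rho_j / 2) * || x_j + z_j / rho_j - Phi_j y ||^2.
    Stacking sqrt(rho_j)-scaled blocks turns it into one ordinary least-squares solve.
    """
    rows = [np.sqrt(rho) * Phi for Phi, rho in zip(Phis, rhos)]
    rhs = [np.sqrt(rho) * (x + z / rho) for x, z, rho in zip(xs, zs, rhos)]
    y, *_ = np.linalg.lstsq(np.vstack(rows), np.concatenate(rhs), rcond=None)
    return y

# Toy usage with two nodes and consistent local variables.
rng = np.random.default_rng(1)
y_true = np.array([1.0, -2.0])
Phis = [rng.standard_normal((3, 2)) for _ in range(2)]
xs = [Phi @ y_true for Phi in Phis]
zs = [np.zeros(3) for _ in range(2)]
print(y_update(Phis, xs, zs, rhos=[20.0, 200.0]))
```

Because the problem is an unconstrained linear least-squares fit, it always has a solution, which is consistent with the remark that the y-step is always feasible.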

Implementation details

Implementation of Rule 4

The local SDP relaxation yields a matrix (known as ) instead of a vector ; therefore, a new criterion should be established to compute , as follows: Because is from SDP, all the eigenvalues are non-negative (i.e., ) where is the mth column vector, and r is the rank of : (5)

To construct an equivalent criterion to with , we impose to determine whether or not the feasible projection is acceptable. The grouping criterion is simplified to by invoking . Let be the mth eigenvalue of . Then, the criterion becomes .

The definition of is also modified as using .
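Since the local SDP returns a positive semidefinite matrix rather than a vector, an eigendecomposition is the natural tool for recovering a vector and judging how far the matrix is from rank one. The sketch below shows only that generic step: it extracts the leading rank-one factor and reports the sum of the remaining eigenvalues as a residual. It is not the acceptance criterion of Eq (5), whose formulas are not reproduced in this text, and the function name is hypothetical.

```python
import numpy as np

def leading_rank_one_factor(X_sdp):
    """Return (x, residual) for a symmetric positive semidefinite matrix.

    x is sqrt(lambda_max) times the leading eigenvector, so x x^T is the best
    rank-one approximation of X_sdp; residual is the sum of the other
    eigenvalues (zero exactly when X_sdp has rank at most one).
    """
    X = 0.5 * (X_sdp + X_sdp.T)                  # remove numerical asymmetry
    eigvals, eigvecs = np.linalg.eigh(X)         # eigenvalues in ascending order
    lam_max = max(eigvals[-1], 0.0)
    x = np.sqrt(lam_max) * eigvecs[:, -1]
    residual = float(np.clip(eigvals[:-1], 0.0, None).sum())
    return x, residual

# Toy usage: an exactly rank-one matrix and a rank-two matrix.
x0 = np.array([1.0, -0.5, 0.25])
x_hat, res = leading_rank_one_factor(np.outer(x0, x0))
print(np.allclose(np.outer(x_hat, x_hat), np.outer(x0, x0)), res)
print(leading_rank_one_factor(np.diag([4.0, 1.0])))
```

The recovered vector is defined only up to sign, which is harmless here because the quadratic quantities depend on x xᵀ.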

Choice of parameters

In contrast to the ADMM approaches, only a few parameters are assigned in the proposed algorithm. Because , and the specific choice of zj is irrelevant, all zj's are random vectors in the null space of known Φj. It is important to hold to guarantee the convergence of the proposed algorithm [24]. Due to the fact that , the maximum τ over all nodes at the kth iteration is set by The parameters used were as follows: ρj = 20 for vL and 200 for vM; Δ0 = 0.3; and the maximum number of iterations = 100; .

The global variable y is randomly assigned (cold start) to test the algorithm’s robustness. For comparison, a flat start and a warm start are also attempted. Here, a flat start refers to a point at which real voltages are unity and imaginary voltages are within [-0.1, 0.1], and a warm start [20] is a solution to the PF. It is recognized that a faster convergence is obtained when the initial point is close to the feasible region, which is commonly observed in numerical iterative methods.
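The cold and flat starting points can be generated as follows. This is a sketch under assumptions: the voltage vector is ordered as [real parts; imaginary parts], the imaginary parts of the flat start are drawn uniformly from [-0.1, 0.1], and the cold start draws every entry uniformly from [-1, 1] because the text does not state the distribution. A warm start would require a PF solution and is not shown.

```python
import numpy as np

def flat_start(n_bus, rng):
    """Real voltages equal to 1, imaginary voltages drawn from [-0.1, 0.1]."""
    real = np.ones(n_bus)
    imag = rng.uniform(-0.1, 0.1, size=n_bus)
    return np.concatenate([real, imag])

def cold_start(n_bus, rng, scale=1.0):
    """Fully random real and imaginary voltages (the range is an assumption)."""
    return rng.uniform(-scale, scale, size=2 * n_bus)

rng = np.random.default_rng(0)
print(flat_start(4, rng))
print(cold_start(4, rng))
```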

Termination criterion

The algorithm is terminated when no further progress is made. The progress is measured in terms of the global variable y, and the local variables x and z. After the solution is identified, the objective function W* is determined according to (3B) where , i.e., . The interim value of W at the kth iteration is , and the progress measure ηk at the kth iteration is defined: (6)

The criterion used in Rule 2 terminates the algorithm if the progress measure is less than the tolerance (10^−7) (i.e., ηk ≤ 10^−7).

Results and discussion

Simulation environments

The model systems used in the simulations are available from MATPOWER [25] for 3-, 4-, 9-, 14-, 24-, 30-, 39-, 57-, 85-, 118-, 300-, and 2,000-bus systems. All simulations were performed using a Mac Pro with 2 × 2.93 GHz 6-core Intel Xeon processors and 6 GB 1333 MHz DDR3 memory. The local SDP problems were solved using the CVX solver [26]. To compare the results from various systems, the lines in the figures are normalized so that all start at zero. The initial points for all the cases are cold starting points, in which all the real and imaginary voltages are set to random numbers (not even a flat start). The solutions are numerically identical to those identified using MATPOWER for the model systems tested: every solution v* satisfies ∥v* − vMATPOWER∥/∥v*∥ ≤ 10^−4, where vMATPOWER is the MATPOWER solution. For comparison, we also attempted a flat start and observed that Niter decreases by at least 30%, while the flat start finds the same solutions.
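The agreement measure quoted above is a relative Euclidean distance between two voltage vectors. A minimal sketch of that check follows; the numbers are made up for illustration and are not results from the paper.

```python
import numpy as np

def relative_voltage_gap(v_star, v_reference):
    """|| v* - v_ref || / || v* || for two voltage vectors of equal length."""
    v_star = np.asarray(v_star, dtype=float)
    v_reference = np.asarray(v_reference, dtype=float)
    return np.linalg.norm(v_star - v_reference) / np.linalg.norm(v_star)

# Illustrative vectors only (not data from the paper).
v_star = np.array([1.00, 0.98, 0.02, -0.01])
v_reference = v_star + 1e-6
gap = relative_voltage_gap(v_star, v_reference)
print(gap, gap <= 1e-4)
```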

Subsystems of the test cases

In the ADMM approaches referred to in the literature [14–20], grid partitioning is performed, but the details of the partitioning are not described. To account for the computational coupling, the partitioning is based on spectral clustering [20, 27], with each subsystem containing at least one generator [14, 20]. The inclusion of a generator in a subsystem seems a natural choice to fulfill the power balance equality constraints (1 in (3)) within each subsystem. However, the inclusion of a generator does not serve the purpose well, because the generator may not be dispatched if its generation costs are excessively high. The spectral clustering is performed using two different algorithms: 1) unnormalized [27] and 2) normalized [28]. The algorithms yield different grid clusterings. For example, the unnormalized algorithm results in two clusters for the IEEE 3-bus system, while the normalized algorithm finds a single cluster for the same system [28]. Fig 4 illustrates the path length, the maximum number of nodes in a subsystem, and the number of subsystems.

Fig 4. The path length, the maximum number of nodes in a subsystem, and the number of subsystems for the IEEE 3-, 4-, 9-, 14-, 24-, 30-, 39-, 57-, 85-, 118-, 300-, and 2,000-bus systems.

The lines indicate the positive correlations on the size of the system. The red-dotted line, the blue broken line, and the green solid line represent the best-fit curves for the path length, the maximum number of nodes in a subsystem, and the number of subsystems, respectively. Two algorithms are applied for clustering nodes: (A) unnormalized algorithm, and (B) normalized based on the algorithm in [28].

https://doi.org/10.1371/journal.pone.0251948.g004

The lines are best fits showing the positive correlations with the system size for the path length (red dotted line, ∝ Nb^0.30), the maximum number of nodes in a subsystem (green solid line, ∝ Nb^0.43), and the number of subsystems (blue broken line, ∝ Nb^0.76) based on the unnormalized algorithm (left plot). The results using the algorithm with a different normalization [28] yield path length ∝ Nb^0.34, the maximum number of nodes in a subsystem ∝ Nb^0.45, and the number of subsystems ∝ Nb^0.66 (right plot). Even though the results from the two plots are not exactly the same, the positive correlations with the system size are clear. They affect the computational efficiency of ADMM [14, 20]: the path length affects the communication costs and Niter because the decision at each subsystem should be delivered to the rest of the system; the number of subsystems affects the number of computation cores and the communication costs unless PL equals one; and the maximum number of nodes affects the computation costs at each subproblem. Therefore, the decrease in the computation costs at each subsystem is achieved in exchange for the increased costs of communication among subsystems. Because the path length and number of subsystems increase rapidly with system size, the improvement in computational efficiency of the ADMM approach is questionable.
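Power-law exponents of this kind are commonly estimated by a straight-line fit in log-log coordinates. The sketch below shows that procedure on synthetic data only: the bus counts are those of the test systems listed above, but the values being fitted are generated from an exact power law for illustration and are not the clustering results of Fig 4. Whether the authors used this fitting procedure is an assumption.

```python
import numpy as np

def fit_power_law(sizes, values):
    """Least-squares fit of values ~ c * sizes**p in log-log space; returns (p, c)."""
    slope, intercept = np.polyfit(np.log(sizes), np.log(values), 1)
    return slope, np.exp(intercept)

# System sizes of the test cases; the y-values are SYNTHETIC, not measured.
bus_counts = np.array([3, 4, 9, 14, 24, 30, 39, 57, 85, 118, 300, 2000], dtype=float)
synthetic_path_length = 2.0 * bus_counts ** 0.30

exponent, prefactor = fit_power_law(bus_counts, synthetic_path_length)
print(f"fitted exponent = {exponent:.2f}, prefactor = {prefactor:.2f}")
```

With real clustering results the points would scatter around the line, and the fitted exponent would carry an uncertainty that a single best-fit value does not show.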

Maximum cardinality of nodal variables in the proposed algorithm

The nodal variables in the x-optimization μj are . Because μend is unity, the cardinality of μj is 10nl + 2ng + 10, where nl is the number of lines connected to the jth node and ng is the number of generators located at the node. In determining the voltages through the power channel, the node with a high value of nl plays a key role. Fig 5 illustrates the maximum cardinality of μj as a function of system size. The dashed line indicates . Even though a positive relationship between the maximum cardinality of μj and Nb is visible among the model systems, the relationship is not necessarily positive. The number of variables in the central OPF is typically 2Nb + 2Ng, where Nb and Ng are the numbers of buses and generators in the system, respectively. For small systems, such as the 3-, 4-, 9-, 14-, and 24-bus systems, the maximum cardinalities of the nodal variables are higher than the number of variables in the central OPF; therefore, it would be challenging to keep the computational costs of the distributed OPF for the small systems lower than those of the central OPF. It is noteworthy that the communication costs remain manageable, because each node communicates directly with the central node (PL equals unity).
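The two counts in this paragraph can be compared directly. The sketch below evaluates the nodal cardinality 10nl + 2ng + 10 and the central count 2Nb + 2Ng. The function names and the line and generator counts are hypothetical illustrations and are not taken from the IEEE test cases.

```python
def nodal_cardinality(n_lines: int, n_gens: int) -> int:
    """Cardinality of the nodal variable mu_j: 10*nl + 2*ng + 10."""
    return 10 * n_lines + 2 * n_gens + 10


def central_opf_variables(n_buses: int, n_gens_total: int) -> int:
    """Typical number of variables in the central OPF: 2*Nb + 2*Ng."""
    return 2 * n_buses + 2 * n_gens_total


def max_nodal_cardinality(nodes):
    """Largest nodal cardinality over (n_lines, n_gens) pairs, one pair per node."""
    return max(nodal_cardinality(nl, ng) for nl, ng in nodes)


# Hypothetical 3-bus ring: every node has 2 lines; two nodes host one generator.
small_nodes = [(2, 1), (2, 1), (2, 0)]
small_nodal = max_nodal_cardinality(small_nodes)    # 10*2 + 2*1 + 10 = 32
small_central = central_opf_variables(3, 2)         # 2*3 + 2*2 = 10

# Hypothetical large system: 2,000 buses, 500 generators, busiest node has 12 lines.
large_nodal = nodal_cardinality(12, 2)              # 10*12 + 2*2 + 10 = 134
large_central = central_opf_variables(2000, 500)    # 2*2000 + 2*500 = 5000

print(small_nodal, small_central, large_nodal, large_central)
```

In this illustration the nodal problem is larger than the central one for the small system and much smaller for the large one, which is the qualitative pattern described above.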

Fig 5. Cardinalities of nodal variables in x-optimization in terms of system size.

https://doi.org/10.1371/journal.pone.0251948.g005

Distributed OPF for 3-, 4-, 24-, 39-, and 85-bus systems

To examine whether the choice of parameters affects the convergence, optimizations were executed with many randomly assigned initial points, and uniform convergence was consistently observed (Fig 6). In general, Niter depends on the initial guess of the voltages. As the initial point moves closer to the solution, Niter decreases, which is commonly observed in iterative numerical methods. All the solutions identified are numerically the same as the solutions obtained using MATPOWER.

Fig 6. Convergence of proposed algorithm for selected systems (i.e., IEEE 3-, 4-, 9-, 14-, 24-, 39-, and 85-bus systems).

https://doi.org/10.1371/journal.pone.0251948.g006

A worst-case convergence was observed for the IEEE 39-bus system. The nodal OPF for the system involves large negative eigenvalues (i.e., the nodal OPFs are highly nonconvex). According to Rule 4, a relatively large number of solutions are rejected because they are not close to the feasible regions and, therefore, Niter becomes large (approximately 40 iterations). However, the solution identified is a local minimizer, and Niter is still much smaller than the number of iterations for the ADMM approaches. For the visual presentations of the results obtained for various systems, the progress is normalized to the 1st iteration (i.e., the curves begin at 0 for the 1st iteration).
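The text does not spell out the normalization used for the plots. One plausible reading, shown here purely as an assumption, is a base-10 logarithm of each convergence measure relative to its value at the 1st iteration, which makes every curve start at 0. The function name `normalize_progress` and the sample values are hypothetical.

```python
import math


def normalize_progress(measures):
    """Return log10(measure_k / measure_1) for each iteration k.

    Assumption: all measures are positive, and the first entry is iteration 1.
    """
    first = measures[0]
    if first <= 0 or any(m <= 0 for m in measures):
        raise ValueError("convergence measures must be positive")
    return [math.log10(m / first) for m in measures]


# Hypothetical convergence measures for two systems of different scale.
system_a = [10.0, 1.0, 0.1]
system_b = [2.0e-2, 2.0e-4, 2.0e-6]

curve_a = normalize_progress(system_a)
curve_b = normalize_progress(system_b)
print(curve_a, curve_b)
```

Both curves begin at 0, so systems with very different raw magnitudes can share one axis.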

Comparison of convergence to ADMM approaches

The convergence measures were compared to those from the ADMM approaches reported in multiple studies [14, 17–19]. It is worth noting that Erseghe [14] and Zhang et al. [17] do not take the flow limits into consideration; the approach of Engelmann et al. [18] involves a high communication cost; and the approach of Madani, Kalbat, and Lavaei [19] may yield a physically infeasible solution. We attempted the ADMM approaches with the flow constraints and feasibility, but our implemented solver failed to converge from any starting point. Instead, we compared the convergence behaviors of the proposed algorithm to those reported in the previous studies [14, 17–19]. For the visual presentation, the convergence behaviors are normalized so that all the curves begin at the same point (see Fig 7 for the comparison of the convergence).

Fig 7. Convergence of proposed algorithm for selected small systems (i.e., IEEE 9-, 14-, 30-, 57-, 118-, and 300-bus systems).

The convergences of the ADMM approaches [14, 17–19] are compared.

https://doi.org/10.1371/journal.pone.0251948.g007

In the previous research [14, 17–19], the size of the subsystems increased with that of the system. Therefore, the tradeoff between the communication costs among the subsystems and the computation costs for each subsystem makes it difficult to develop a scalable algorithm. As the system size increases, Niter increases significantly for the results in two of the previous studies [14, 19]. The performance measure in another study [17] fluctuates consistently across the various systems, indicating that the convergence may not be guaranteed. Because the final study [18] requires the exchange of information regarding sensitivities as well as the primal variables, the computation and communication costs per iteration increase much more rapidly than do the computation costs of a central OPF solver.

Unlike the ADMM approach, in which each cluster contains multiple nodes, each subsystem in the proposed algorithm contains only 1 node regardless of the system size; each subsystem directly communicates with the rest of the entire system; and the communication involves only the primal variables. With this difference in mind, the fast convergences observed in the proposed algorithm are similar to those in two of the previous studies [17, 18], which are consistently much faster than those in the other two previous studies [14, 19]. Whereas the values of Niter are similar to those reported by Engelmann et al. [18], the convergences of the proposed algorithm occur more uniformly and consistently.

Many approaches using ADMM do not consider the flow limits [14, 17] and/or feasibility [19]; therefore, it is not appropriate to compare the quality of the solutions. Instead, we compared our solutions with those from a central OPF solver, MATPOWER. For all the cases described above, the two yield numerically identical solutions. It is not clear why the proposed algorithm finds the same solution as the central OPF solver. We tested the central SDP relaxation on small-scale systems to estimate the “global” solution. Because of memory limitations, our SDP tests are limited to systems of up to 118 buses. When the relaxation returns a rank-1 solution, the solution is global and physically feasible. In several test cases, the relaxation returns such a global solution, which is also identified by MATPOWER and by the proposed algorithm. However, there are cases where the SDP relaxation returns a physically infeasible solution. For these cases, both MATPOWER and the proposed algorithm identify numerically the same local minimizers, which may not be global solutions. MATPOWER finds a point that meets the first-order necessary conditions for optimality [25]. Although it finds a minimizer in most cases, there is no guarantee that the identified solution is a minimizer; it can be either a maximizer or a saddle point. On the other hand, the proposed algorithm identifies a solution that meets the second-order optimality conditions, which guarantees that the solution is a minimizer [29]. From a practical point of view, the proposed algorithm has two advantages over MATPOWER: 1) it is numerically stable and, therefore, robust because all the subproblems are convex, so no issues arise from a rank-deficient Hessian matrix, and 2) it uses distributed computation and, therefore, the computation in each subproblem remains manageable.
However, the proposed algorithm also has disadvantages: 1) it requires multiple cores to perform the distributed optimization, and 2) the number of nodal variables can be larger than the number of variables in the central OPF, for example, in the 3-, 4-, 9-, 14-, and 24-bus systems (see Fig 5).
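The distinction between first-order and second-order optimality conditions can be illustrated on a toy problem. The sketch below is not the OPF and not the proposed algorithm. It assumes an unconstrained two-variable function whose gradient already vanishes at the point of interest, and it classifies that point from the eigenvalues of the 2×2 Hessian. For a constrained problem such as the OPF, the analogous test would involve the Hessian of the Lagrangian restricted to the feasible directions. All names are hypothetical.

```python
import math


def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], in ascending order."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius


def classify_stationary_point(hessian, tol=1e-12):
    """Classify a point with zero gradient from its 2x2 symmetric Hessian."""
    (a, b), (_, d) = hessian
    lowest, highest = symmetric_2x2_eigenvalues(a, b, d)
    if lowest > tol:
        return "minimizer"          # second-order sufficient condition holds
    if highest < -tol:
        return "maximizer"
    if lowest < -tol and highest > tol:
        return "saddle point"
    return "inconclusive"           # a zero eigenvalue: higher-order terms decide


# All three toy functions have zero gradient at the origin (first-order condition).
bowl = classify_stationary_point([[2.0, 0.0], [0.0, 2.0]])      # f = x^2 + y^2
saddle = classify_stationary_point([[2.0, 0.0], [0.0, -2.0]])   # f = x^2 - y^2
dome = classify_stationary_point([[-2.0, 0.0], [0.0, -2.0]])    # f = -x^2 - y^2
print(bowl, saddle, dome)
```

All three points satisfy the first-order condition, yet only the first is a minimizer, which is why a first-order test alone cannot rule out a maximizer or a saddle point.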

Large-scale OPF: 2,000-bus system

We tested the algorithm on a large-scale system, the 2,000-bus system, which is a synthetic grid on a footprint of Texas. Fig 8 presents a convergence pattern. The solution identified is numerically identical to the one found using MATPOWER. Niter remains small, as is the case for the small systems (see Figs 6 and 7). Note that the nodal OPF includes only 1 bus, and the communication costs remain small because PL equals unity. Provided that a sufficient number of computation cores is available, the proposed algorithm is scalable as long as the computation cost for solving a subproblem does not increase significantly with the system size.

Fig 8. Convergence of proposed algorithm for synthetic 2,000-bus system on a footprint of Texas.

https://doi.org/10.1371/journal.pone.0251948.g008

Unlike other algorithms reported in the literature [14–18], the proposed algorithm converges regardless of the initial point, although the number of iterations decreases by at least a factor of 2 if it uses a flat start. Another shortcoming of the algorithms reported in previous research [14–18] is the quality of their solutions. Whereas the existing OPFs find low-voltage [17] or suboptimal [16] solutions, the proposed algorithm finds the same solution as a central OPF solver. One previous study [16] claimed that there might be an optimal number of partitioned subsystems due to the tradeoff between the communication and computation costs incurred to obtain a local solution. The proposed method keeps PL = 1, and the central computation is a simple addition of xj through both the power and voltage channels. From the simulations with various parameter values such as ρj, Δk, , and a, we obtained the same solutions, indicating that the proposed algorithm is robust as well as efficient. In addition, the proposed model does not require any partitioning.

Comparison to a heuristic central OPF solver, MATPOWER

In comparing the convergence of the proposed algorithm with that of the central OPF, the number of variables and Niter were examined. The maximum cardinality of μj increases with the system size for the tested systems (see Fig 5). Fig 9 illustrates the number of variables in the central OPF as well as the maximum cardinality of the nodal variables. The blue line is the best-fit line for the central OPF (), whereas the red line is the best-fit line for the distributed algorithm. The number of variables in the central OPF clearly increases with Nb much faster than the maximum cardinality of the nodal variables does. The black line is the boundary at which the two quantities are equal. If enough computation cores are available, the proposed algorithm involves reduced computation costs per iteration for systems at least as large as the IEEE 24-bus system. Whereas the number of variables in the central OPF increases almost linearly with Nb because NgNb in most systems, the maximum cardinality of the nodal variables does not necessarily increase in theory. A key observation is that the proposed approach can be a scalable algorithm if Niter does not rapidly increase with the system size.

Fig 9. Cardinalities of central OPF and proposed algorithm for the test systems.

https://doi.org/10.1371/journal.pone.0251948.g009

Fig 10 presents Niter for the central OPF (Nitcent) and for the proposed algorithm (Nitdist) on the tested systems. A visible increase in Nitcent is observed with Nb, but the dependence of Nitdist on Nb is not evident: the solid lines are best-fit curves (log-log plot) that indicate and Nitdist = Nb^−0.007, respectively. The dotted lines are average Niter, and the values are 15 and 23, respectively.

Fig 10. Niter for central OPF and of the proposed algorithm on the tested systems.

Dotted lines are averages, and solid lines are best-fit curves.

https://doi.org/10.1371/journal.pone.0251948.g010

From the comparisons (the number of variables and Niter), we conclude that the computation cost of the distributed algorithm increases with Nb at a much slower rate than that of the central OPF if sufficient computational resources are available.

Scalability of the algorithm

The computation efficiency of a distributed computation depends on the number of iterations and the computational cost per iteration. The cost is determined by the local computation with the largest number of variables. Fig 10 shows that the number of iterations does not increase with system size.

Fig 11 illustrates the maximum nodal computation time as a function of the maximum cardinality of the nodal variables. This dependence of performance on the number of nodal variables may shed light on the claim by Loukarakis, Bialek, and Dent [30] that a larger system does not necessarily imply an inferior convergence performance. The dashed line indicates that the maximum nodal computation time is proportional to the maximum cardinality of the nodal variables raised to the power of 2.64. This observed exponent is close to the theoretical estimate of 3 for the SDP solver. Note that the maximum cardinality of the nodal variables need not be positively correlated with the system size; rather, it depends on the local grid topology of the system. For the tested systems, the maximum cardinality grows approximately as Nb^(1/4) and CT grows as the maximum cardinality to the power of 2.64, which results in CT ∝ Nb^(2.64/4) = Nb^0.66, where CT represents the maximum nodal computation time. The total computation cost is bounded by the product of Niter and CT. The total computation cost is bounded by CT ∈ ϑ(⌈Nb/Ncore⌉ Nb^0.66), where Ncore is the number of cores. If the computational resources are sufficient, CT ∈ ϑ(Nb^0.66). From this observation, the computation cost increases sub-linearly for a large-scale network.
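The scaling argument in this paragraph can be checked numerically. The sketch below evaluates the quoted exponent 2.64/4 and the relative per-iteration bound ⌈Nb/Ncore⌉·Nb^0.66. The function names are hypothetical, the returned values are unit-less scaling factors rather than seconds, and the core counts are arbitrary examples.

```python
import math


def nodal_time_exponent(solver_exponent=2.64, cardinality_exponent=0.25):
    """Exponent of Nb in the nodal computation time: 2.64 * (1/4) = 0.66."""
    return solver_exponent * cardinality_exponent


def cost_bound(n_buses, n_cores, exponent=0.66):
    """Relative bound ceil(Nb / Ncore) * Nb**exponent on the cost per iteration."""
    if n_buses < 1 or n_cores < 1:
        raise ValueError("n_buses and n_cores must be positive")
    rounds = math.ceil(n_buses / n_cores)   # sequential batches of nodal problems
    return rounds * n_buses ** exponent


nb = 2000
one_core = cost_bound(nb, 1)          # every nodal problem solved in sequence
half_cores = cost_bound(nb, 1000)     # two batches of nodal problems
full_cores = cost_bound(nb, nb)       # one nodal problem per core

print(nodal_time_exponent(), one_core, half_cores, full_cores)
```

With one core per node the bound grows as Nb^0.66, which is sub-linear. With a single core the same expression grows as Nb^1.66, so the benefit depends on the number of available cores.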

Fig 11. Computation times in seconds for nodal OPF of the maximum cardinality of the nodal variables.

Dotted line is the best-fit line with the slope of 2.64.

https://doi.org/10.1371/journal.pone.0251948.g011

If a single core is available for computation, the computation cost is in where pSDP is the computational complexity for a nodal SDP. For the state-of-the-art central heuristic OPF solvers, the computation cost is in ϑ(Nb^1.5). Therefore, when the computation resources are highly limited, the proposed algorithm would still be efficient with a convex problem solver that yields pSDP ≤ 2. A potential improvement in pSDP of the SDP solver is to exploit the sparse structure of the matrices [31] or to utilize a commercial solver such as MOSEK.

Nomenclature

eigenvectors of Meq, Min scaled by and

Set and the complement of the set of Case m

αj, βj nodal variable associated with real and with reactive power injection, and

γj-l,j, δj-l,j nodal variable with real and with reactive power flow over the line j-l at the side of the jth node, and

ωj nodal variable with Ej,

B, Br, G sets of nodes, branches, and generators

Br{j}, G{j} sets of branches and of generators at node j

Ej voltage magnitude square at Node j,

diagonal matrix of ±1 associated with equality and inequality constraints

I identity matrix

J matrix [I^T jI^T]^T

Meq, Min symmetric matrices with equality and inequality constraints

Ybus, Ybr nodal and branch admittance matrices

Nb, Nl, Ng number of nodes, lines, and generators at the system of interest

Niter number of iterations until convergence occurs

capl thermal limit of flow over l

dj real and reactive power demand at j,

ej jth column vector in the identity matrix

power flow over j-l at the jth node,

gj generation at the jth node,

j

nlj, ngj number of lines and generators at Node j

nlmax maximum number among nlj in the system

vL, vM voltages at the power and voltage channels

v, v, vx, vy complex voltage vector, voltage , real and imaginary part of voltage,

complex variable, real, and imaginary parts of x, and

Nodal variables and their cardinality

nodal variable associated with real power injection at Node j

nodal variable associated with reactive power injection at Node j

nodal variable associated with real power flow over a line j-k at the side of Node j

nodal variable associated with real power flow over the lines connected to Node j

nodal variable associated with reactive power flow over a line j-k at the side of Node j

nodal variable associated with reactive power flow over the lines connected to Node j

nodal variable associated with the voltage magnitude at Node j

scalar representing real power flow over a line j-k at the side of Node j

scalar representing reactive power over a line j-k at the side of Node j

nodal real power generation vector at Node j

nodal reactive power generation vector at Node j

Conclusions and future research directions

From the tensor analysis of the power flow, we developed a star and linear model to achieve a scalable distributed computation. The new network model allows direct communication between the nodal variables and the central voltages. In the model, the PL remains at unity regardless of the system size. On the other hand, Kirchhoff’s laws and the voltage magnitudes are expressed in terms of nodal variables that are linear in the voltages. Therefore, the model makes it possible to keep the size of a nodal OPF small regardless of the size of the system, while the communication costs remain manageable. This new aspect of the model allows us to construct a scalable algorithm that converges to the same solution as the nonconvex OPF. We proposed the DROHS algorithm to find a local minimum using a convex surrogate function. Among the nodal OPF solutions, only near-feasible solutions (Rule 4) are selected for updating. In addition to the high quality of the solution, the algorithm also achieves computational efficiency and robustness. We tested the DROHS algorithm on the 3-, 4-, 9-, 14-, 24-, 30-, 39-, 57-, 85-, 118-, 300-, and 2,000-bus systems. The proposed algorithm achieves 1) fast and uniform convergence, 2) provable convergence, 3) the same problem formulation as the central OPF problem without ignoring any constraints, 4) guaranteed convergence to a local minimum, rather than a maximum or saddle point, that meets the first-order necessary conditions for optimality, and 5) a completely distributed algorithm (i.e., a scalable algorithm, which has never been achieved before in the literature). The challenges that the proposed algorithm faces are 1) an increased number of nodal variables that may be higher than the number of variables in the central OPF for a small system, 2) an increased number of iterations when highly connected nodes involve solutions far from the feasible regions, and 3) a prolonged wait time for nodes with low cardinalities.
Therefore, the proposed algorithm is an efficient alternative to the central OPF for a large-scale network. Future research directions include the development of 1) a way to accommodate the impact of the rejected solutions in updating the x-variables if the corresponding nodes are highly connected, 2) an efficient computation to solve the nodal SDP, particularly a way to explore the sparse structure of the nodal OPF, and 3) asynchronous distributed optimization for improving the computational efficiency where the scheduling of the distributed computation is identified in terms of a knapsack problem. We also present the proof showing that the surrogate function improves at every iteration and that the iteration converges to a fixed point of the nonconvex OPF. The numerical results exhibit rapid convergence, and the convergence behavior is discussed.

Acknowledgments

We thank Dr. Charles Van Loan who provided insight on tensor computation. We would like to express our gratitude to Dr. Robert J. Thomas and Mr. Gilbert Bindewald for their expertise on the computation for the power system analysis during this research, and we also thank 5 anonymous reviewers whose comments and suggestions helped improve and clarify this manuscript.

References

  1. Bienstock D, Verma A. Strong NP-hardness of AC power flows feasibility. arXiv:1512.07315.
  2. Molzahn DK, Dörfler F, Sandberg H, Low SH, Chakrabarti S, Baldick R, et al. A Survey of Distributed Optimization and Control Algorithms for Electric Power Systems. IEEE Transactions on Smart Grid. 2017.
  3. Wang Y, Wang S, Wu L. Distributed optimization approaches for emerging power systems operation: A review. Electric Power Systems Research. 2017.
  4. Abboud A, Couillet R, Debbah M, Siguerdidjane H. Asynchronous alternating direction method of multipliers applied to the direct-current optimal power flow problem. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings. 2014.
  5. Kraning M. Dynamic Network Energy Management via Proximal Message Passing. Found Trends® Optim. 2014.
  6. Chakrabarti S, Kraning M, Chu E, Baldick R, Boyd S. Security Constrained Optimal Power Flow via proximal message passing. 2014 Clemson University Power Systems Conference, PSC 2014. 2014.
  7. Gan L, Li N, Topcu U, Low SH. Exact Convex Relaxation of Optimal Power Flow in Radial Networks. IEEE Trans Automat Contr. 2015.
  8. Peng Q, Low SH. Distributed optimal power flow algorithm for radial networks, I: Balanced single phase case. IEEE Trans Smart Grid. 2018.
  9. Peng Q, Low SH. Distributed algorithm for optimal power flow on a radial network. Proceedings of the IEEE Conference on Decision and Control. 2014.
  10. Zheng W, Wu W, Zhang B, Sun H, Liu Y. A Fully Distributed Reactive Power Optimization and Control Method for Active Distribution Networks. IEEE Trans Smart Grid. 2016.
  11. Dall’Anese E, Zhu H, Giannakis GB. Distributed optimal power flow for smart microgrids. IEEE Trans Smart Grid. 2013.
  12. Sun AX, Phan DT, Ghosh S. Fully decentralized AC optimal power flow algorithms. IEEE Power and Energy Society General Meeting. 2013.
  13. Watts DJ, Strogatz SH. Collective dynamics of ‘small-world’ networks. Nature. 1998. pmid:9623998
  14. Erseghe T. Distributed optimal power flow using ADMM. IEEE Trans Power Syst. 2014.
  15. Erseghe T. Distributed processing. http://dgt.dei.unipd.it/pages/read/93/
  16. Guo J, Hug G, Tonguz OK. A Case for Nonconvex Distributed Optimization in Large-Scale Power Systems. IEEE Trans Power Syst. 2017. pmid:28824226
  17. Zhang M, Kar RS, Miao Z, Fan L. New auxiliary variable-based ADMM for nonconvex AC OPF. Electr Power Syst Res. 2019.
  18. Engelmann A, Jiang Y, Mühlpfordt T, Houska B, Faulwasser T. Toward distributed OPF using ALADIN. IEEE Trans Power Syst. 2019.
  19. Madani R, Kalbat A, Lavaei J. ADMM for sparse semidefinite programming with applications to optimal power flow problem. Proceedings of the IEEE Conference on Decision and Control. 2015.
  20. Guo J, Hug G, Tonguz OK. Intelligent Partitioning in Distributed Optimization of Electric Power Systems. IEEE Trans Smart Grid. 2016.
  21. Vandenberghe L, Balakrishnan VR, Wallin R, Hansson A, Roh T. Interior-point algorithms for semidefinite programming problems derived from the KYP lemma. Lect Notes Control Inf Sci. 2005.
  22. Oh H. A Unified and Efficient Approach to Power Flow Analysis. Energies. 2019;12: 2425.
  23. Bauschke HH, Bolte J, Teboulle M. A descent lemma beyond Lipschitz gradient continuity: First-order methods revisited and applications. Math Oper Res. 2017.
  24. Scutari G, Sun Y. Parallel and distributed successive convex approximation methods for big-data optimization. Lecture Notes in Mathematics. 2018.
  25. Zimmerman RD, Murillo-Sánchez CE, Thomas RJ. MATPOWER: Steady-state operations, planning, and analysis tools for power systems research and education. IEEE Trans Power Syst. 2011.
  26. Grant M, Boyd S. CVX: Matlab software for disciplined convex programming. http://cvxr.com/cvx
  27. Von Luxburg U. A tutorial on spectral clustering. Stat Comput. 2007. pmid:18836571
  28. Ng AY, Jordan MI, Weiss Y. On spectral clustering: Analysis and an algorithm. Advances in Neural Information Processing Systems. 2002.
  29. Nocedal J, Wright S. Numerical optimization, series in operations research and financial engineering. Springer. 2006.
  30. Loukarakis E, Bialek JW, Dent CJ. Investigation of Maximum Possible OPF Problem Decomposition Degree for Decentralized Energy Markets. IEEE Trans Power Syst. 2015.
  31. Molzahn DK, Holzer JT, Lesieutre BC, DeMarco CL. Implementation of a large-scale optimal power flow solver based on semidefinite programming. IEEE Trans Power Syst. 2013.