Abstract
This paper introduces a resilient distributed model predictive control (RDMPC) framework for coordinating energy management across networked microgrids with demand response integration. The coordination mechanism employs an alternating direction method of multipliers (ADMM)-based distributed MPC formulation that maintains tie-line reciprocity via a shared consensus schedule; standard ADMM convergence results apply under reliable communication for the convex quadratic-program relaxation. Safe operation under communication impairments and early termination is achieved by executing the reciprocal consensus tie-line setpoints and performing a local feasibility-repair step with physically interpretable slack variables (load shedding and spillage), providing anytime feasibility while (under reliable communication) optimality improves with additional ADMM iterations. Communication failure resilience is achieved by treating tie-line mismatch as bounded disturbances and applying two-sided reserve margins (upward and downward) through constraint tightening, ensuring feasibility for any mismatch within the assumed bounds when sufficient reserve headroom exists. Demand response is incorporated using distinct models for shiftable loads (energy-by-deadline) and curtailable loads (penalized reduction). Evaluation on a five-microgrid benchmark under packet loss, burst outages, and topology changes confirms feasible reciprocal execution under loss; relative performance versus naive distributed MPC (B2) is seed-dependent, and DR ablation shows large degradations (about 492% higher energy not served (ENS) and 484% higher cost) when flexibility is removed.
Citation: Alghamdi B (2026) Resilient distributed model predictive control for cooperative microgrids under communication loss with demand response integration. PLoS One 21(4): e0345857. https://doi.org/10.1371/journal.pone.0345857
Editor: Zhengmao Li, Aalto University, FINLAND
Received: January 29, 2026; Accepted: March 11, 2026; Published: April 8, 2026
Copyright: © 2026 Baheej Alghamdi. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting Information files.
Funding: The author received no specific funding for this work.
Competing interests: The author has declared that no competing interests exist.
Introduction
Microgrids have emerged as fundamental building blocks for modern power distribution systems, offering improved reliability, renewable energy integration, and local energy management capabilities [1]. The evolution toward interconnected multi-microgrid networks creates opportunities for resource sharing and economic optimization, but introduces coordination challenges [2].
Centralized Model Predictive Control (MPC) approaches, such as the seminal work by Ouammi et al. [3], achieve global optimality but suffer from critical vulnerabilities: single point of failure where communication loss with the central controller disables coordination entirely, scalability limitations as computational complexity grows with network size, and privacy concerns since all microgrids must share detailed operational data.
Distributed MPC (DMPC) addresses these limitations by decomposing the global problem into local subproblems [4]. However, existing DMPC methods typically assume reliable communication, whereas in practice communication networks experience packet loss, delays, and topology changes that can destabilize coordination.
The literature on MPC for microgrids spans centralized and distributed approaches. Ouammi et al. [3] developed a centralized MPC framework for five cooperative microgrids, reporting performance gains of about 4–9% (e.g., 4.4% and 8.5%) compared to single-microgrid operation, while Parisio et al. [5] extended MPC with explicit constraint handling. These works established MPC as the dominant control paradigm but retained centralized architectures. On the distributed side, Hans et al. [6] developed hierarchical DMPC for interconnected microgrids, and Bersani et al. [7] developed distributed robust control of power flows in cooperating microgrids. ADMM-based DMPC has gained particular attention for its convergence guarantees and decomposability [8]. Asynchrony and delays in ADMM-based coordination have also been studied, motivated by heterogeneous computation and communication times in distributed networks. Asynchronous ADMM variants provide convergence guarantees under partial asynchrony and delayed updates [9,10], establishing that stale information can be handled with appropriate parameter choices. In power-system optimization, ADMM-based distributed optimal power flow has been analyzed under stochastic communication delays [11] and asynchronous updates [12], supporting our modeling of packet loss during coordination iterations and our focus on feasible execution when coordination is incomplete.
Distributed demand response control strategies have been developed using Lyapunov optimization [13], though comprehensive integration of DR with resilient DMPC remains limited. Communication impairments have been studied in distributed optimization/control, including asynchronous ADMM schemes that tolerate delayed or stale updates [9,10]. Two consensus-based approaches explicitly handle packet losses in microgrid coordination [14,15], but neither specifies execution semantics under communication failure nor enforces tie-line reciprocity; explicit resilience mechanisms with feasibility guarantees and well-defined execution semantics for cooperative microgrids remain limited.
Several recent works address related aspects of distributed microgrid coordination. Ananduta et al. [16] developed ADMM-based DMPC for adversarial microgrid scenarios with detection-based resilience, though without explicit stochastic packet-loss modeling, explicit tie-line execution semantics under asymmetric loss, or two-sided reserve tightening for bounded mismatch. Pham and Ahn [17] compared ADMM and dual decomposition methods but did not address communication impairments. Xie et al. [18] applied tube MPC for forecast uncertainty but targeted renewable variability rather than tie-line mismatch from coordination failures.
Jia et al. [19] developed ADMM-based economic dispatch for frequency deviation but without communication loss modeling.
To the best of our knowledge, none of these works combines all three elements required for resilient execution: (i) communication loss modeling at both the optimization and execution stages, (ii) explicit tie-line reciprocity enforcement via handshake contracts, and (iii) two-sided reserves for bounded mismatch. Our RDMPC framework integrates these elements with demand response as an additional resilience resource.
Despite significant progress, critical gaps remain in the literature. Most DMPC methods assume reliable communication without explicit resilience mechanisms. Coordination algorithms often lack convergence guarantees or anytime feasibility. Integration of demand response with distributed, resilient control is limited. Most importantly, feasibility under communication failures is rarely proven formally.
This paper makes three concrete technical contributions. First (C1), we propose an ADMM-based anytime DMPC coordination scheme with convergence guarantees under reliable communication for the convex relaxation; under packet loss we use a lossy-update variant and an execution policy that implements the consensus tie-line setpoint with local feasibility repair, ensuring network-feasible schedules at any iteration. Second (C2), we develop a resilience mechanism with bounded mismatch wherein communication failures induce neighbor exchange uncertainty modeled explicitly as bounded disturbances, with constraint satisfaction guaranteed via two-sided constraint tightening (upward and downward reserves) using tube MPC principles when sufficient reserve headroom exists. Third (C3), we integrate demand response with distinct flexibility types by including shiftable and curtailable loads in the distributed optimization, quantifying their value under failures, and demonstrating that DR compensates for lost coordination capability.
Across all scenarios, the intended design goal is safe and feasible execution under communication loss (including maintaining service by limiting ENS via the load-shedding slack penalty), even when consistent cost improvements relative to standard DMPC are not observed.
Materials and methods
Problem statement
Decision variables (for each microgrid i over the prediction horizon) are:
- ESS charging and discharging power [kW]
- \(z_i(k)\): binary charge/discharge mode selector
- Grid import and export power [kW]
- \(P_{\mathrm{shift},i}(k)\): shiftable load served [kW]
- Curtailable load reduction [kW]
- \(P_{\mathrm{spill},i}(k)\): renewable spillage/curtailment [kW]
- \(P_{ij}(k)\): power flow to neighbor j [kW] (positive = export from i)
Coupling across microgrids is enforced by tie-line reciprocity: \(P_{ij}(k) + P_{ji}(k) = 0\). This constraint couples neighboring microgrids' decisions.
Information structure distinguishes local and shared information:
- Local information: Own ESS state, local loads, local renewable forecasts, local prices
- Shared information: Tie-line flow decisions with neighbors (via ADMM)
The objective is to minimize total network operating cost while satisfying all constraints, with coordination achieved through distributed optimization.
Network topology
We consider a network of N microgrids represented as an undirected graph \(\mathcal{G} = (\mathcal{V}, \mathcal{E})\):
- \(\mathcal{V}\): set of microgrids
- \(\mathcal{E}\): set of power exchange links
- \(\mathcal{N}_i\): neighbors of microgrid i
The benchmark five-microgrid topology is shown in Fig 1.
Fig 1. Each microgrid has PV/WT generation, ESS, and DR-enabled loads. Solid lines show tie-line flows \(P_{ij}\); dashed lines show communication links.
Microgrid components
Renewable generation.
Renewable power Pres,i(k) is treated as a forecasted input to the MPC:
Forecasts are generated using probabilistic models (Weibull for wind [20], Beta for solar [21]) with scenario reduction [22,23] to obtain expected values. Renewable spillage (curtailment beyond planned):
The usable renewable power is:
Energy storage system
State dynamics:
where \(\Delta t\) [h] is the control interval.
Capacity constraints:
Power constraints with binary mode selection (mixed-integer quadratic program (MIQP) formulation):
This formulation prevents simultaneous charging and discharging without nonconvex complementarity constraints. In the reported experiments, we relax \(z_i(k) \in [0,1]\), yielding a convex QP. Relaxing \(z_i(k)\) permits simultaneous charge/discharge in principle; in our experiments it was negligible (see Limitations).
Net ESS power:
Demand response model
Total load consists of fixed, shiftable, and curtailable components:
Shiftable loads (e.g., electric vehicle (EV) charging, water heating) must receive their required energy by a deadline [24]:
We set Pshift,i,max(k) = 0 for k > kdeadline, so service does not occur after the deadline. The equality constraint ensures load shifting, not curtailment.
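The energy-by-deadline rule can be illustrated with a small checker. This is a minimal sketch, assuming a single shiftable load, a uniform control interval `dt`, and hypothetical names (`shiftable_schedule_ok`, `e_required`) that do not come from the paper; it only tests a given schedule and is not the optimization constraint itself.

```python
from typing import Sequence


def shiftable_schedule_ok(p_shift: Sequence[float], p_max: Sequence[float],
                          k_deadline: int, e_required: float,
                          dt: float = 1.0, tol: float = 1e-9) -> bool:
    """Check an energy-by-deadline schedule for one shiftable load (illustrative)."""
    for k, (p, cap) in enumerate(zip(p_shift, p_max)):
        # After the deadline the power cap is zero, so no service is allowed.
        limit = cap if k <= k_deadline else 0.0
        if p < -tol or p > limit + tol:
            return False
    # Equality on delivered energy: the load is shifted, not curtailed.
    delivered = sum(p * dt for p in p_shift[: k_deadline + 1])
    return abs(delivered - e_required) <= tol


# Two hypothetical 4-step schedules delivering 10 kWh with a deadline at step 2.
on_time = shiftable_schedule_ok([0.0, 5.0, 5.0, 0.0], [5.0] * 4, 2, 10.0)
late = shiftable_schedule_ok([0.0, 5.0, 0.0, 5.0], [5.0] * 4, 2, 10.0)
```

The second schedule is rejected because it serves power after the deadline, even though its total energy matches.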
Curtailable load (can be reduced with penalty) satisfies:
We model the DR discomfort cost (linearized) as:
where:
Grid exchange
Power exchange with the distribution network operator (DNO) using variable splitting:
In practice, simultaneous import/export is suboptimal because of the import/export price spread, so the relaxation is tight.
Tie-line exchange
Sign convention: Pij(k) > 0 means microgrid i exports to microgrid j.
Local bounds:
Coupling constraint (enforced via ADMM), for every tie-line (i,j) and every step k:
\[ P_{ij}(k) + P_{ji}(k) = 0 \tag{20} \]
Power balance
For each microgrid i at each time step k:
In implementation, we enforce the soft balance with slacks (Eq. (33)) to guarantee feasibility; Eq. (21) is recovered when both slacks are zero. Here \(P_{\mathrm{load},i}(k)\) includes fixed, shiftable, and net curtailable components.
Economic model
Time-varying electricity prices (assumed identical across microgrids for simplicity):
- Import price from DNO [$/kWh]
- Export price to DNO [$/kWh]
Instantaneous cost rate for microgrid i [$/h]:
where the spillage coefficient is a small penalty for renewable spillage (e.g., 0.01 $/kWh).
Operating cost over interval k [$]:
ADMM-based distributed MPC
Global optimization problem.
The centralized problem minimizes total network cost:
subject to all local constraints and coupling constraints from Eq (20).
ADMM formulation.
We reformulate using edge-based consensus variables. For each edge (i,j), we introduce a consensus variable for the tie-line flow. Although the graph is undirected, we associate directed decision variables \(P_{ij}\) and \(P_{ji}\) with each undirected edge (i,j).
Augmented Lagrangian:
where each directed edge carries a dual variable [$/kW] and the quadratic term is weighted by the ADMM penalty parameter [p.u.].
ADMM iteration.
At each ADMM iteration m, each microgrid i solves the local subproblem:
subject to local constraints (ESS dynamics, DR constraints, power balance, bounds). Denote the resulting tie-line decision values at iteration m by .
Next, for each edge (i,j), the consensus update is:
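Under packet loss, the consensus and dual updates are applied only when the bidirectional handshake succeeds; otherwise the consensus variable is held and the dual variable is frozen (Algorithm 1).
Note: the consensus values on the two directions of an edge are negatives of each other by construction, ensuring network-feasible tie-line setpoints.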
Finally, the dual update is:
Convergence analysis
Primal residual:
Dual residual:
Stopping criterion:
For the convex relaxation where binary variables zi(k) are relaxed to [0,1], ADMM converges to the global optimum under standard assumptions [8,25]. With binary variables, the problem is nonconvex and standard guarantees do not apply; we treat ADMM as a practical coordination heuristic with warm-starting and iteration limits. Under packet loss (handshake gating and stale updates), we likewise use a practical lossy-update variant and emphasize anytime-feasible execution rather than convergence guarantees.
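A minimal sketch of the stopping test, assuming the common ADMM residual definitions (primal residual as the gap between local tie-line decisions and the consensus values, dual residual as the penalty-scaled change in consensus between iterations). The function names, the Euclidean norm, the scalar per-edge values, and the tolerances `eps_pri` and `eps_dual` are illustrative assumptions; the paper's exact norms may differ.

```python
import math
from typing import Dict, Tuple

Edge = Tuple[int, int]


def admm_residuals(p_local: Dict[Edge, float], p_bar: Dict[Edge, float],
                   p_bar_prev: Dict[Edge, float], rho: float) -> Tuple[float, float]:
    """Return (primal, dual) residual norms over directed edges (assumed form)."""
    primal = math.sqrt(sum((p_local[e] - p_bar[e]) ** 2 for e in p_bar))
    dual = rho * math.sqrt(sum((p_bar[e] - p_bar_prev[e]) ** 2 for e in p_bar))
    return primal, dual


def converged(primal: float, dual: float, eps_pri: float, eps_dual: float) -> bool:
    """Stop only when both residuals fall below their tolerances."""
    return primal <= eps_pri and dual <= eps_dual


# Hypothetical single-step values on one undirected edge (two directed entries).
p_local = {(1, 2): 10.0, (2, 1): -8.0}
p_bar = {(1, 2): 9.0, (2, 1): -9.0}
r, s = admm_residuals(p_local, p_bar, p_bar, rho=0.5)
done = converged(r, s, eps_pri=1e-3, eps_dual=1e-3)
```

Here the consensus did not move between iterations, so the dual residual is zero, but the local proposals still disagree with the consensus and the test does not stop.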
Execution policy and anytime feasibility
The physically implemented tie-line value under truncated ADMM and non-reciprocal updates must be specified.
We execute the consensus tie-line schedule using the following policy:
- At iteration m (or termination), the consensus variable is the implemented tie-line setpoint.
- By construction, the consensus value for edge (i,j) is the negative of the value for (j,i), so the network is always physically consistent.
- Each microgrid performs a local feasibility repair with the consensus value fixed as the tie-line schedule.
After ADMM termination at iteration M, each microgrid i performs a local feasibility-repair solve. Specifically, it solves:
subject to all local constraints with each tie-line fixed to its consensus value, and power balance with slack variables. The two slacks are the nonnegative load-shedding and spillage variables (defined below). Feasibility here refers to solvability of this soft-constraint repair (slacks may be nonzero); Theorem 1 addresses hard-constraint feasibility when reserve headroom exists and mismatch remains within bounds.
This yields the following anytime properties:
- Network feasibility: Always satisfied (consensus variable is reciprocal by construction)
- Local feasibility: Guaranteed via repair solve with slack variables (soft constraints) [26]
- Optimality: Improves with more iterations under reliable communication for the convex relaxation
Slack variables with physical interpretation
We introduce separate slack variables with distinct physical meanings:
Load shedding (energy not served) uses a nonnegative slack [kW] with a per-energy penalty [$/kWh].
Spillage / energy dump uses a second nonnegative slack [kW] with its own penalty [$/kWh]. Note that \(P_{\mathrm{spill},i}(k)\) denotes planned renewable curtailment, whereas the dump slack is an emergency power-balance slack representing surplus absorption (e.g., dump load) used only when strict balance would otherwise be infeasible.
Modified power balance:
Metrics: Energy Not Served (ENS) [kWh] and Energy Spilled [kWh].
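To make the role of the two slacks concrete, the sketch below computes the smallest shedding and dump slack for a single step. It is a simplified illustration, not the repair QP: it assumes the local resources (ESS, grid, renewables, DR) have already been collapsed into a hypothetical adjustable supply range `[supply_min_kw, supply_max_kw]`, and all names are illustrative.

```python
from typing import Sequence, Tuple


def repair_slacks(load_kw: float, tie_export_kw: float,
                  supply_min_kw: float, supply_max_kw: float) -> Tuple[float, float]:
    """Minimal (shed, dump) slack that restores balance for one step."""
    demand = load_kw + tie_export_kw          # power that must be supplied locally
    shed = max(0.0, demand - supply_max_kw)   # deficit: load shedding
    dump = max(0.0, supply_min_kw - demand)   # surplus: emergency dump
    return shed, dump


def energy_not_served(shed_kw: Sequence[float], dt_h: float) -> float:
    """ENS in kWh: shedding slack summed over steps times the interval length."""
    return sum(shed_kw) * dt_h


deficit_case = repair_slacks(120.0, 30.0, 20.0, 130.0)   # demand 150 kW > 130 kW
surplus_case = repair_slacks(40.0, -30.0, 20.0, 130.0)   # demand 10 kW < 20 kW
balanced_case = repair_slacks(100.0, 0.0, 20.0, 130.0)
ens_kwh = energy_not_served([20.0, 0.0, 5.0], dt_h=0.5)
```

At most one of the two slacks is nonzero in any step, which matches their interpretation as opposite emergency actions.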
Resilience mechanism
Communication failure model.
We consider three types of communication failures:
- Packet loss: Message from neighbor j not received with probability ploss [27,28]
- Burst failure: Consecutive losses following the Gilbert–Elliott model [29,30]
- Communication link outage: Communication unavailable on an edge for duration Tfail while the physical tie-line remains intact
Physical tie-line outages (topology changes) are modeled separately by removing the edge (setting Pij = 0 and updating neighbor sets), and are evaluated in Scenario S6. Scenario S3 uses communication burst outages (tie-lines remain physical but communication is unavailable).
Communication state:
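We define a bidirectional handshake indicator that equals 1 only when both directed messages on an edge are received in the same ADMM iteration, and 0 otherwise.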
Unless otherwise noted (e.g., the Path B step-level outage model), packet losses are modeled as independent and identically distributed (i.i.d.) Bernoulli events across directed edges and ADMM iterations.
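Because a handshake needs both directed messages, its failure rate is higher than the directional loss rate. The sketch below works this out under the i.i.d. Bernoulli assumption only; the function names are illustrative, and the correlated step-level outage model used in Path B is not reproduced here.

```python
import random


def handshake_loss_prob(p_dir: float) -> float:
    """Probability that at least one of two independent directed messages is lost."""
    return 1.0 - (1.0 - p_dir) ** 2


def simulate_handshake_loss(p_dir: float, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the handshake failure rate under i.i.d. loss."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        ij_ok = rng.random() >= p_dir   # message i -> j delivered
        ji_ok = rng.random() >= p_dir   # message j -> i delivered
        if not (ij_ok and ji_ok):
            failures += 1
    return failures / trials


analytic = handshake_loss_prob(0.3)                      # 1 - 0.7^2
estimate = simulate_handshake_loss(0.3, trials=20000, seed=1)
implied = handshake_loss_prob(0.1575)                    # roughly 0.29
```

For comparison, a 15.75% directional rate implies roughly 29% handshake loss under independence, close to the 29.5% loss-phase handshake rate reported for Path B.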
ADMM behavior under packet loss
When a message from neighbor j is not received at iteration m, we apply a stale-update rule:
- Use the last received value (or the last available one)
- Track iteration staleness
- Freeze the dual update when the handshake fails
We distinguish (i) iteration staleness within an ADMM solve (used only for stale-update bookkeeping) and (ii) contract staleness across MPC steps, defined as the number of MPC steps since the last successful handshake on a directed edge. The mismatch bound used for reserve sizing is a function of the contract staleness.
As contract staleness increases, we inflate and saturate the mismatch bound:
where the growth coefficient reflects uncertainty growth with staleness.
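The inflate-and-saturate idea can be sketched as follows. The linear growth law, the parameter names (`base_kw`, `growth`, `cap_kw`), and the counter function are assumptions made for illustration; only the qualitative behavior (the bound grows with contract staleness and then saturates) is taken from the text.

```python
def update_contract_staleness(staleness: int, handshake_ok: bool) -> int:
    """MPC steps since the last successful handshake on a directed edge."""
    return 0 if handshake_ok else staleness + 1


def inflated_mismatch_bound(base_kw: float, staleness: int,
                            growth: float, cap_kw: float) -> float:
    """Assumed linear inflation of the mismatch bound, saturated at cap_kw."""
    return min(cap_kw, base_kw * (1.0 + growth * staleness))


# Hypothetical edge: 10 kW base bound, 50% growth per stale step, 40 kW cap.
s = 0
history = []
for ok in [True, False, False, False, False, False, False, True]:
    s = update_contract_staleness(s, ok)
    history.append(inflated_mismatch_bound(10.0, s, growth=0.5, cap_kw=40.0))
```

The bound returns to its base value as soon as a handshake succeeds, since the staleness counter resets.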
Bounded mismatch model
Under communication failure, microgrid i cannot coordinate with neighbor j. The planned tie-line exchange may not match the neighbor’s actual behavior.
Realized tie-line power:
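where \(w_{ij}(k)\) is the mismatch disturbance, bounded in magnitude by the staleness-dependent mismatch bound. We set the planned exchange to the consensus value during execution (the executed contract setpoint).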
Two-sided constraint tightening
Mismatch can cause either a deficit (needs upward reserve) or a surplus (needs downward reserve). Both directions must be protected. We implement tightening following robust/tube MPC principles [31].
Upward reserve (can increase net injection) must satisfy:
It is provided by ESS discharge headroom, grid import headroom, and additional load curtailment.
Downward reserve (can decrease net injection) must satisfy:
It is provided by ESS charging headroom, grid export headroom, and renewable spillage.
These reserves impose the following constraints. For upward:
For downward:
Operating modes
We describe two coordination regimes based on the set of disconnected neighbors. Mode 1 (Full Coordination, no disconnected neighbors) uses standard ADMM-based DMPC without reserve tightening. Mode 2 (Partial Coordination, at least one disconnected neighbor) continues ADMM with connected neighbors while applying two-sided reserve tightening for disconnected edges; stale values are held from the last successful exchange. If all neighbors are disconnected, coordination reduces to local MPC with a fixed reciprocal tie-line contract (held under packet loss and updated only upon bidirectional handshake), together with tightened reserves.
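The mode logic depends only on which neighbors are currently disconnected. A minimal sketch, with hypothetical mode labels (`"full"`, `"partial"`, `"local"`) and function names that are not taken from the paper:

```python
from typing import Set


def coordination_mode(neighbors: Set[int], disconnected: Set[int]) -> str:
    """Classify the regime from the set of disconnected neighbors."""
    lost = neighbors & disconnected
    if not lost:
        return "full"      # Mode 1: standard ADMM, no reserve tightening
    if lost == neighbors:
        return "local"     # all neighbors lost: local MPC with held contracts
    return "partial"       # Mode 2: ADMM with the rest, tightened reserves


def needs_reserve_tightening(mode: str) -> bool:
    """Reserves are tightened whenever at least one neighbor is disconnected."""
    return mode != "full"


mode_a = coordination_mode({2, 3}, set())
mode_b = coordination_mode({2, 3}, {3})
mode_c = coordination_mode({2, 3}, {2, 3, 5})
```

Intersecting with the neighbor set first means outages elsewhere in the network (node 5 in the last call) do not affect a microgrid's own mode.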
Feasibility guarantee
The following sufficient feasibility condition follows from standard robust/tube MPC reasoning for bounded additive disturbances [31]. Theorem 1 (Feasibility under bounded mismatch). If the two-sided reserve margins satisfy:
and sufficient local flexibility exists such that the net adjustable range within ESS/grid/DR/spillage constraints contains the worst-case aggregate mismatch in either direction, then the power balance constraint is satisfiable for any realization of mismatch within the assumed bounds.
Proof: Consider the worst cases: (1) Maximum deficit (mismatch at its upper bound for all disconnected j): The realized tie-line export is larger than planned, creating a local power deficit equal to the sum of the mismatch bounds over the disconnected neighbors. The upward reserve can cover this. (2) Maximum surplus (mismatch at its lower bound for all disconnected j): The realized tie-line export is smaller than planned, creating a local power surplus. The downward reserve can absorb this. For intermediate cases, the available reserves interpolate linearly.◻
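The worst-case argument can be checked numerically by enumerating the corners of the mismatch box. This sketch assumes symmetric per-edge bounds and reserve requirements equal to the sum of those bounds over disconnected neighbors; the names and numbers are hypothetical. Because the net mismatch is linear in the per-edge disturbances, checking the corners is enough.

```python
from itertools import product
from typing import Dict, Set


def required_reserve_kw(bounds_kw: Dict[int, float], disconnected: Set[int]) -> float:
    """Sum of mismatch bounds over disconnected neighbors (same up and down)."""
    return sum(bounds_kw[j] for j in disconnected)


def covers_all_extremes(bounds_kw: Dict[int, float], disconnected: Set[int],
                        up_kw: float, down_kw: float) -> bool:
    """Check every corner of the mismatch box against the two reserves."""
    nodes = sorted(disconnected)
    for signs in product((-1.0, 1.0), repeat=len(nodes)):
        net = sum(s * bounds_kw[j] for s, j in zip(signs, nodes))
        # net > 0: extra export, a deficit that needs upward reserve
        # net < 0: reduced export, a surplus that needs downward reserve
        if net > up_kw or -net > down_kw:
            return False
    return True


bounds = {2: 15.0, 3: 10.0, 4: 20.0}
lost = {2, 3}
need = required_reserve_kw(bounds, lost)
ok = covers_all_extremes(bounds, lost, up_kw=30.0, down_kw=25.0)
short = covers_all_extremes(bounds, lost, up_kw=20.0, down_kw=25.0)
```

With 25 kW required in each direction, 30 kW up and 25 kW down suffice, while 20 kW of upward reserve fails at the maximum-deficit corner.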
Complete algorithm
The complete Resilient ADMM-DMPC procedure is presented below.
Algorithm 1: Resilient ADMM-DMPC
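Input: Current states {Ei(t)}, forecasts {P̂res,i, Pfixed,i, prices}, communication state indicators
Output: Control actions
Parameters: ADMM penalty parameter, iteration cap Mmax, residual tolerances, mismatch-bound parameters
1. Initialization: For each microgrid i:
• Determine mode based on connected/disconnected neighbor sets
• Initialize the consensus and dual variables for connected neighbors
• For disconnected neighbors: hold stale contract estimate
• Compute upward and downward reserve requirements based on mode
2. ADMM Iterations: For up to Mmax iterations m:
(2a) Local Optimization (parallel): Solve QP with ADMM penalty terms
(2b) Communication: Send the tie-line proposal to each neighbor j; receive the neighbor's proposal. If received: use the received value; else: reuse the last received value and increment the iteration staleness. Define the handshake indicator as 1 only if both directions succeed
(2c) Consensus Update: For each edge (i,j): if the handshake succeeds, update the consensus variable; else keep the previous value (hold last)
(2d) Dual Update: If the handshake succeeds: update the dual variable; else keep it unchanged (freeze)
(2e) Convergence Check: Compute r(m), s(m). If both residuals are below their tolerances: break
3. Execution: For each microgrid i:
• Set tie-line setpoint to the consensus value for all j
• Solve local repair problem (fix tie-lines, minimize slack)
• Implement first-step actions
• Store the executed consensus schedule as the contract for the next interval
4. Return: control actions, convergence info, slack usage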
Definition (Handshake). A handshake occurs when both microgrids i and j successfully exchange their proposed tie-line setpoints \(P_{ij}\) and \(P_{ji}\) within the same ADMM iteration (i.e., both directed messages are received). The bidirectional exchange is required for contract execution, and upon a successful handshake the consensus variable is updated and used for execution; if either direction fails (asymmetric packet loss), the stale contract from the previous successful handshake is retained (hold-last policy).
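The hold-last and freeze rules can be sketched for a single edge and a single time step. The averaging form of the consensus update below is the standard projection onto the reciprocity constraint when both duals start equal; it is an assumption here, as are the names `EdgeState` and `gated_update`, and it is not claimed to be the paper's exact update.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeState:
    p_bar_ij: float = 0.0   # consensus flow i -> j; the j -> i value is its negative
    lam_ij: float = 0.0     # dual variable held by microgrid i
    lam_ji: float = 0.0     # dual variable held by microgrid j


def gated_update(state: EdgeState, p_ij: float, p_ji: float,
                 recv_at_i: bool, recv_at_j: bool, rho: float) -> EdgeState:
    """Apply consensus and dual updates only on a successful handshake."""
    if not (recv_at_i and recv_at_j):
        return state                      # hold consensus, freeze duals
    p_bar = 0.5 * (p_ij - p_ji)           # reciprocal by construction
    return EdgeState(
        p_bar_ij=p_bar,
        lam_ij=state.lam_ij + rho * (p_ij - p_bar),
        lam_ji=state.lam_ji + rho * (p_ji + p_bar),
    )


s0 = EdgeState()
s1 = gated_update(s0, p_ij=10.0, p_ji=-6.0, recv_at_i=True, recv_at_j=True, rho=0.5)
s2 = gated_update(s1, p_ij=12.0, p_ji=-12.0, recv_at_i=True, recv_at_j=False, rho=0.5)
```

In the second call one direction is lost, so the contract from the first call is retained even though the new proposals happen to agree.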
Fig 2 provides a visual overview of the procedure and its execution semantics under communication loss.
Fig 2. At each MPC step, microgrids (i) perform mode-dependent resilience setup with bounded tie-line mismatch and two-sided reserve tightening, (ii) coordinate via lossy-communication ADMM, and (iii) execute the reciprocal consensus tie-line schedule with a local feasibility-repair solve, ensuring anytime feasibility even when ADMM is truncated.
Complexity analysis
Per ADMM iteration per microgrid (planning step): one QP in continuous variables (tie-lines plus local energy variables, including spillage). Solved with CLARABEL in our experiments [32] (OSQP [33] and SCS [34] are compatible alternatives).
Per MPC step per microgrid (execution step): one additional local feasibility-repair QP with the tie-line schedule fixed to the consensus contract, so it has fewer free variables (tie-line trajectories are parameters rather than decision variables). The repair is solved once per MPC step (not once per ADMM iteration), so its cost is typically small relative to Mmax planning solves. As communication impairments grow, the number of MPC steps with nonzero slack may increase, but the repair problem size does not depend on the number of loss events; only the realized fixed tie-line schedule changes.
Communication per iteration: Each microgrid sends its proposed tie-line trajectory to each neighbor, so total network communication per iteration grows with the number of directed edges times the horizon length.
Total computation per MPC step scales as \(O(N \cdot M_{\max})\) local QP solves for planning plus O(N) repair solves, parallelizable across microgrids. Total communication per MPC step scales with the per-iteration traffic times the number of ADMM iterations. For sparse microgrid graphs where node degree is bounded, the number of edges grows as O(N), so communication scales approximately linearly in N; for dense topologies, communication can grow as \(O(N^2)\). ADMM iteration counts can increase with network size and under packet loss due to staleness, motivating an iteration cap; the proposed execution policy maintains feasibility at any iteration, trading optimality for runtime when coordination is slow.
AI use disclosure
GPT 5.2 via the OpenAI API was used for author-guided drafting and refinement of initial paragraphs in the literature review, Materials and methods, and Results sections; all AI-assisted text was then revised and checked by the author. GPT 5.2 was also used to assist with author-guided edits to the simulation code. All output was author-verified against primary sources.
Results
Simulation design
All reported experiments use the convex QP relaxation (binary ESS mode relaxed) and are solved with CLARABEL [32]. OSQP [33] and SCS [34] are compatible alternatives.
We compare against the following baselines in all experiments:
- B1 (Oracle planning benchmark; loss-unaware DMPC): ADMM-based DMPC planning assuming reliable communication; execution uses the same communication loss realizations and contract/repair policy as B2 and B3 for comparability (not deployable under loss).
- B2 (Naive DMPC): ADMM-based DMPC with stale consensus/dual updates when packets drop during planning, but no resilience mechanisms (reserves or constraint tightening).
- B3 (Proposed RDMPC): Full method with two-sided reserves, staleness-aware constraint tightening, and contract execution policy.
All methods share the same contract execution and local feasibility-repair policy; B3 uniquely applies reserve tightening during planning.
Execution semantics by method are summarized in Table 1.
Reported experimental scenarios are summarized in Table 2.
Scenario numbering follows the codebase convention; only scenarios evaluated in Results are listed here.
We report the following performance metrics:
- Economic: Executed operating cost [$]
- Feasibility: Energy Not Served (ENS) [kWh], curtailment [kWh], spillage (renewable spillage) [kWh]
- Coordination/communication: contract staleness (max/mean, in MPC steps), contract update rate, directed and handshake loss rates
- Resilience: Relative performance under failures and DR ablation effects
Table 3 lists nominal (Path A) simulation parameters.
Note (Path B overrides to Table 3): Path B uses asymmetric donor/receiver limits (donors: kW, Pess,max = 250 kW; receivers:
kW,
kW, Pess,max = 70 kW; neutral MGs:
kW, Pess,max = 150 kW), a higher tie-line limit (150 kW), and an iteration cap of Mmax = 15. The S6 topology-change scenario uses a tighter receiver configuration (
kW, Pess,max = 50 kW).
Results overview
We report results in two operating regimes: (i) a coordination-critical loss-phase regime (Path B) and (ii) a grid-connected economic regime (Path A). Path B evaluates a setting in which communication failures can produce infeasible local schedules unless resilience and anytime-feasible execution are enforced; this regime directly tests the proposed bounded-mismatch tightening and execution policy. Path A evaluates a grid-connected regime with negligible ENS, serving as a no-harm check that resilience mechanisms do not introduce systematic cost penalties under packet loss. We present Path B first because it is the primary stress test of resilience, then the S7 ablation (demand response contribution), and finally Path A as the economic no-harm check.
Path B: Loss-phase execution under contract loss
We evaluate a coordination-critical regime using a load pattern that tightens receiver-side constraints, with a 3-step initialization phase (no packet loss) followed by 5 steps under communication impairment (total 8 steps). Packet loss is generated by the step-level outage model with ploss = 0.3 (directional loss). Results report the loss phase only (steps 3–7). We distinguish directional loss (a one-way packet drop on an edge) from handshake loss (failure of the bidirectional exchange required to establish a reciprocal tie-line contract for execution). Over 20 seeds, the loss-phase mean rates are 15.75% directional and 29.5% handshake. Here ploss corresponds to the bad-state loss probability pb in the Gilbert–Elliott step-level model (with pg = 0.05, pb = 0.3,
), so the observed rates are step-level directed-edge averages rather than i.i.d. per-iteration packet loss. These parameters are representative engineering settings chosen to yield moderate step-level directed-edge and handshake loss rates (reported above), rather than calibrated from a specific field communication trace. Staleness metrics reported below are contract staleness across MPC steps (number of MPC steps since the last successful handshake on a directed edge), summarized as the maximum across edges per step and then averaged over the loss phase.
B1 (oracle planning) uses the same execution policy under the same communication loss realizations but plans as if all packets succeed; it lacks resilience mechanisms (no two-sided reserves or staleness-based tightening), serving as an optimistic planning baseline. It is included as an upper-bound reference and is not deployable under lossy communication.
Table 4 shows loss-phase results.
Relative to B2, B3 does not consistently reduce executed ENS or cost in this handshake-gated setting. Paired per-seed deltas are mixed: B3 improves ENS in 5/20 seeds (5 ties, 10 worse) and improves cost in 10/20 seeds (10 worse). Mean deltas are small, indicating seed-dependent outcomes rather than a uniform improvement. Formal paired tests on the 20 per-seed loss-phase deltas corroborate the absence of consistent dominance: two-sided Wilcoxon signed-rank tests yield p = 0.216 (ENS) and p = 0.622 (cost), and paired t-tests yield p = 0.320 (ENS) and p = 0.320 (cost). The unique contribution of B3 in Path B is the hard-constraint feasibility guarantee under bounded mismatch when reserve headroom exists (Theorem 1), while contract execution and local repair are shared across methods. Across loss-phase steps, the max contract staleness exceeded zero in 73% of steps under random loss and 90% under burst outage; the corresponding mean max staleness values are reported in Tables 4–5 (2.20 and 3.60 MPC steps, respectively), with a maximum of 5. In a 5-seed pilot diagnostic (25 loss steps; 125 MG-steps), upward reserve requirements were nonzero in 20% (random loss) and 58% (burst) of MG-steps; binding (headroom ≤ requirement, i.e., positive shortfall slack) occurred in 0.8% and 4.8% of MG-steps, respectively, while downward reserve binding was not observed. Executed spillage is negligible (rounded to 0.00 in Tables 4–6) and denotes renewable spillage \(P_{\mathrm{spill},i}(k)\). Because the load-shedding penalty is 1000 $/kWh, Path B executed cost is dominated by ENS (e.g., 86.6 kWh corresponds to ∼ $86.6k of the $88.1k mean cost for B3).
Burst outage on critical edges
We repeat the experiment with a correlated failure pattern: after burn-in (steps 0–2), we enforce a contiguous burst outage on critical donor-receiver tie-lines for steps 3–5 (Tfail = 3), disabling both ADMM coordination and handshake execution on those edges. Steps 6–7 revert to the nominal loss model with ploss = 0.3.
Table 5 shows burst outage results.
Burst outages produce heterogeneous outcomes across seeds; mean differences are modest and not uniform. Formal paired tests on the 20 per-seed deltas again indicate no consistent dominance: two-sided Wilcoxon signed-rank tests yield p = 0.898 (ENS) and p = 0.674 (cost), and paired t-tests yield p = 0.339 (ENS) and p = 0.339 (cost). Fig 3 visualizes these loss-phase mean metrics. Given this heterogeneity, we interpret the proposed framework primarily as an execution-safe coordination approach (contract execution + repair), while reserve tightening (B3) is aimed at providing a hard-constraint feasibility guarantee under bounded mismatch when reserve headroom exists (Theorem 1) rather than guaranteeing ENS/cost improvements.
Fig 3. Grouped bar chart comparing B1 (oracle planning), B2 (naive DMPC), and B3 (RDMPC) for executed ENS, cost (shown in $k), and curtailment. Left: random step-level losses (Gilbert–Elliott; directional mean 15.75%, handshake mean 29.5%). Right: burst outage (Tfail = 3 on critical edges). B3 outcomes are mixed relative to B2; improvements are seed-dependent rather than uniform.
Fig 4 shows the per-seed deltas; outcomes are heterogeneous across seeds, so gains are not uniform across realizations. This supports a cautious interpretation: the framework ensures feasible reciprocal execution under loss via contract execution and repair, while reserve tightening does not guarantee lower executed ENS/cost for every seed.
Fig 4. Delta values sorted ascending; each curve sorted independently (x-axis is sorted rank position, not the random seed index). Negative values indicate B3 outperforms B2. Under random step-level loss (blue), deltas are mixed (5/20 better, 5 ties, 10 worse), indicating no uniform improvement. Under burst outage (red), outcomes remain heterogeneous (7/20 better, 5 ties, 8 worse), with no consistent dominance.
Scenario S7: Demand response ablation under step-level random loss (Path B)
To isolate the contribution of demand response (DR) as a resilience resource (C3), we repeat the Path B random-loss experiment using the proposed controller but with DR disabled (denoted B3-noDR). In B3-noDR, shiftable and curtailable load flexibility is disabled (both fractions set to zero), removing all DR flexibility. All other elements are unchanged (same network, forecasts, reserve tightening, ADMM settings, execution policy, and local feasibility repair with slack penalties). The experiment uses the same 20 seeds, with a burn-in of 3 loss-free steps followed by a 5-step loss phase (steps 3–7) under the step-level outage model with ploss = 0.3. Metrics below report loss-phase totals per seed, averaged across seeds.
Loss-phase metrics for the DR ablation are summarized in Table 6.
Disabling DR causes a substantial degradation in loss-phase execution performance. Relative to B3, B3-noDR increases executed energy-not-served (ENS) from 86.6 kWh to 512.7 kWh (a 492% increase) and increases executed cost from $88,101 to $514,112 (a 484% increase). The paired per-seed deltas (B3 − B3-noDR) are negative for all seeds (20/20): the mean ENS reduction is about 426.1 kWh (86.6 − 512.7 = −426.1) with a 95% CI [−508.87, −350.79], and the mean cost reduction is $426,011 with a 95% CI [−$508,595, −$350,771]. Two-sided paired tests (Wilcoxon signed-rank and paired t-test, for both ENS and cost) confirm that the DR benefit is statistically significant; the p-values are included in S1 Data. As expected, B3-noDR exhibits near-zero executed curtailment by construction, confirming that DR flexibility is removed. The large cost increase is primarily driven by the increased ENS (and the associated load-shedding penalty), indicating that DR provides critical fast flexibility to maintain service under the same communication impairments and reserve-tightened operation (Fig 5).
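The reported percentage increases follow directly from the loss-phase means quoted above. The short check below uses only those reported means; the helper name `percent_increase` is introduced for illustration.

```python
def percent_increase(baseline, value):
    """Relative increase of `value` over `baseline`, in percent."""
    return 100.0 * (value - baseline) / baseline

ens_increase = percent_increase(86.6, 512.7)        # ENS, kWh
cost_increase = percent_increase(88_101, 514_112)   # executed cost, $
ens_delta = 86.6 - 512.7                            # mean paired delta (B3 - B3-noDR), kWh
cost_delta = 88_101 - 514_112                       # mean paired delta (B3 - B3-noDR), $
```

Rounded to the nearest percent, these give 492% for ENS and 484% for cost, and both mean deltas fall inside their reported confidence intervals.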
Two-panel bar chart comparing B1 (oracle planning), B2 (naive DMPC), B3 (RDMPC with DR), and B3-noDR (RDMPC without DR). Left: Executed ENS (kWh). Right: Executed cost ($k). Disabling DR increases ENS by 492% (513 vs 86.6 kWh) and cost by 484% ($514k vs $88k).
Scenario S6: Topology change under step-level random loss (Path B)
To evaluate RDMPC resilience under network reconfiguration, we simulate a topology change where the middle tie-line (between MG3 and MG4 in the 5-MG line topology) is permanently removed at step 3. This models an intentional islanding event or line failure during operation. The topology change triggers immediate neighbor list updates for affected microgrids and clears stale ADMM variables. The experiment uses the same 20 seeds, with a 3-step burn-in (full topology) followed by a 5-step loss phase (steps 3–7) under reduced topology and step-level outage (ploss = 0.3). Metrics report loss-phase totals per seed, averaged.
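The neighbor-list update and the clearing of stale ADMM variables can be pictured with plain dictionaries. This is a hypothetical sketch, not the study's data structures: the function `remove_tie_line`, the per-directed-edge state layout, and the field names `z` and `lam` are assumptions.

```python
def remove_tie_line(neighbors, admm_state, edge):
    """Drop `edge` from both neighbor lists and clear the ADMM variables tied to it."""
    i, j = edge
    neighbors[i].discard(j)
    neighbors[j].discard(i)
    # Stale consensus/dual variables for the removed edge must not be reused.
    admm_state.pop((i, j), None)
    admm_state.pop((j, i), None)
    return neighbors, admm_state

# 5-MG line topology: MG1 - MG2 - MG3 - MG4 - MG5
neighbors = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
admm_state = {(i, j): {"z": 0.0, "lam": 0.0} for i in neighbors for j in neighbors[i]}
neighbors, admm_state = remove_tie_line(neighbors, admm_state, edge=(3, 4))
```

After the removal, the network splits into two groups, MG1–MG3 and MG4–MG5, which coordinate independently.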
Loss-phase metrics for the topology-change scenario are summarized in Table 7.
Paired deltas (B3 − B2) for the topology-change scenario are effectively zero and not statistically significant (two-sided Wilcoxon: p = 0.363 for ENS and p = 0.936 for cost; paired t-test: p = 0.871 for ENS and p = 0.925 for cost). The 95% CIs for the mean deltas are [−0.01, +0.01] kWh for ENS, [−$7.07, +$5.89] for cost, and [−0.01, +0.17] kWh for curtailment. In contrast, B3 vs B1 shows significantly higher ENS and cost (CIs exclude zero): the ENS delta has CI [+43.68, +109.28] kWh and the cost delta is +$74,230 with CI [+$43,678, +$109,258], while curtailment is modestly higher but not significant (CI [−5.19, +21.74] kWh).
These results indicate that, under topology change, handshake-gated RDMPC maintains feasible reciprocal execution but does not yield a consistent performance advantage over B2 in loss-phase ENS/cost (Fig 6).
Loss-phase totals (steps 3–7) under topology change where the MG3–MG4 tie-line is removed at step 3. Error bars show standard error (SE = std/√n, n = 20 seeds). B3 (RDMPC) is comparable to B2 in ENS and cost, while B1 remains lowest on average.
Path A: Economic cost parity under packet loss
We evaluate a grid-connected economic regime (ENS ≈ 0) with packet-loss rates ploss in {0.1, 0.2, 0.3, 0.4, 0.5} and 50 seeds each. For each controller we compute the cost excess, and we report paired differences per seed (B3 minus B2), so a negative difference indicates B3 is better. All costs reported are executed costs, accumulated from the implemented (k = 0) actions each step.
For context, a centralized oracle benchmark at ploss = 0 provides a lower-bound reference; the oracle planning benchmark (B1) remains a strong but not globally optimal reference.
Table 8 shows the paired summary results.
Mean cost excess versus packet loss is summarized in Fig 7.
Mean cost excess for B2 (naive DMPC) and B3 (RDMPC) across ploss in {0.1, 0.2, 0.3, 0.4, 0.5}. Shaded regions show the standard deviation across seeds (not confidence intervals). Lines overlap within variability bands at all loss rates, confirming cost parity.
The iteration-cap tradeoff is shown in Fig 8.
Mean cost excess versus ADMM iteration budget Mmax (n = 50 seeds, ploss = 0.3). Under packet loss, larger Mmax increases cost excess and then saturates, reflecting a tradeoff between coordination effort and cumulative loss exposure; feasibility remains anytime via the execution/repair policy.
Fig 7 shows these results; paired bootstrap CIs for the mean are effectively zero at all ploss (rounded to $0.00 in Table 8). Fig 8 shows that, in this lossy-update setting, larger Mmax does not yield monotonic improvement in cost excess and can increase it by increasing communication exposure. The anytime property here refers to feasibility (network-consistent reciprocal execution with local repair at any iteration), while optimality improvements with additional iterations hold under reliable communication for the convex relaxation. In this grid-connected Path A regime, reserve tightening was inactive during planning in our simulations, so B3 reduces to B2 and the resulting executed costs coincide.
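A paired bootstrap confidence interval for the mean per-seed difference can be sketched as follows. The percentile method, the number of resamples, and the function name `paired_bootstrap_ci` are assumptions for illustration; the exact bootstrap variant used for Table 8 is not specified in this section.

```python
import random
import statistics

def paired_bootstrap_ci(deltas, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-seed paired deltas."""
    rng = random.Random(seed)
    n = len(deltas)
    # Resample seeds with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(deltas, k=n)) for _ in range(n_boot)
    )
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# When B3 reduces to B2, every paired delta is zero and the interval collapses.
identical = paired_bootstrap_ci([0.0] * 50)
```

The degenerate case mirrors the Path A outcome: when the two controllers execute identical actions, every resampled mean is zero, so the interval has zero width.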
Discussion
Theoretical contributions
The proposed RDMPC framework provides a formal feasibility guarantee for hard constraints under bounded communication failures through two-sided reserves, assuming sufficient reserve headroom; the execution repair guarantees feasibility with slack variables when mismatch exceeds bounds. The characterization of the performance-resilience tradeoff enables principled sizing of reserve margins. The execution policy ensures network-consistent tie-line schedules at any ADMM iteration, providing anytime feasibility regardless of convergence status.
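The role of the slack variables in the execution repair can be illustrated with a single-step power balance. This is a deliberately simplified sketch: the paper's repair step is an optimization with slack penalties over the full local model, whereas the hypothetical function below ignores storage, demand response, and grid exchange, and only shows why load shedding and spillage always restore balance for a fixed tie-line setpoint.

```python
def repair_power_balance(generation, load, tie_import):
    """Return (shed, spill) in kW that restore local balance.

    `tie_import` is the fixed consensus tie-line setpoint (positive = import).
    Assumption: no storage or other flexibility is available in this toy model.
    """
    imbalance = generation + tie_import - load   # > 0 surplus, < 0 deficit
    shed = max(0.0, -imbalance)                  # load shedding covers a deficit
    spill = max(0.0, imbalance)                  # spillage absorbs a surplus
    return shed, spill

deficit_case = repair_power_balance(generation=50.0, load=80.0, tie_import=20.0)
surplus_case = repair_power_balance(generation=90.0, load=80.0, tie_import=20.0)
```

Because a nonnegative slack exists for either sign of the imbalance, the repaired problem is feasible regardless of how many ADMM iterations were completed.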
Practical contributions
The algorithm is implementable with standard optimization solvers (CLARABEL used here; OSQP and SCS are alternatives). The quantified value of demand response under coordination failures shows that DR compensates for lost coordination capability. Guidelines for reserve margin sizing based on staleness and mismatch bounds enable practical deployment.
In practice, the two-sided tie-line mismatch bounds should reflect a maximum credible reciprocity error on each tie-line during a communication impairment. A system operator can choose them based on tie-line ratings and historical/engineering limits on how far the realized intertie flow could deviate from the last agreed contract during a loss event, and tune the growth factor to match how quickly uncertainty grows with contract staleness. The proposed implementation uses a staleness-dependent bound with a deadband and a cap (Table 3), so tightening activates only when a handshake fails and the contract becomes stale. Very conservative caps can reduce economic performance under prolonged loss by shrinking the feasible region via larger reserve margins. However, they do not penalize normal operation: reserve tightening sums only over disconnected neighbors, so when all handshakes succeed the sum is empty and no reserve requirement is imposed. In the grid-connected economic regime (Path A), reserve tightening did not activate and B3 remained at cost parity with B2.
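The sizing logic above can be made concrete with a small sketch. The functional form is an assumption: the bound below is zero inside the deadband, grows linearly with staleness, and is clipped at the cap, which may differ from the parameterization in Table 3. The names `mismatch_bound` and `reserve_margin`, the symmetric (single-sided) treatment, and all numerical values are hypothetical.

```python
def mismatch_bound(staleness, base, growth, deadband, cap):
    """Assumed form: zero within the deadband, then linear growth, clipped at `cap` (kW)."""
    if staleness <= deadband:
        return 0.0
    return min(cap, base + growth * (staleness - deadband))

def reserve_margin(staleness_by_neighbor, **params):
    """Sum bounds over neighbors; connected neighbors (fresh contracts) contribute zero."""
    return sum(mismatch_bound(s, **params) for s in staleness_by_neighbor.values())

params = dict(base=10.0, growth=5.0, deadband=0, cap=30.0)
all_connected = reserve_margin({2: 0, 4: 0}, **params)   # every handshake succeeded
one_stale = reserve_margin({2: 1, 4: 0}, **params)       # neighbor 2 missed one handshake
long_outage = mismatch_bound(10, **params)               # prolonged loss hits the cap
```

In this sketch a larger cap only matters during long outages, which is consistent with the observation that conservative caps do not affect operation when all handshakes succeed.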
Experimental findings
Path B highlights resilience under coordination-critical conditions. Under random step-level loss, RDMPC maintains feasible reciprocal execution but exhibits seed-dependent performance relative to B2 (better in some seeds, worse in others). Under burst outages, outcomes remain heterogeneous; RDMPC can mitigate severe execution-level degradation in a subset of realizations without uniformly dominating B2.
Whether B3 (RDMPC) performs better than B2 in a given realization is primarily driven by whether reserve tightening becomes active and binds. Tightening is more likely under longer burst outages (higher contract staleness) and when local headroom is limited (e.g., low ESS state of charge and tight grid limits at the time of failure, or high net-load/renewable ramps), which increases the chance that bounded mismatch would otherwise violate hard constraints. In contrast, when local flexibility is ample or loss episodes are short/mild, tightening rarely binds and B3 is effectively neutral relative to B2.
The DR ablation study isolates the contribution of demand response: disabling DR increases ENS by 492% and executed cost by 484% under the same communication failure conditions. This confirms that DR acts as a local flexibility resource that compensates for tie-line mismatch when coordination degrades.
Path A results support a no-harm claim: RDMPC is cost-parity with naive DMPC under packet loss in grid-connected economic regimes, with differences indistinguishable at n = 50.
Limitations and future work
The current evaluation uses a specific test system (5 microgrids with sparse topology); scalability to larger networks requires further study, particularly regarding ADMM iteration counts and communication overhead as the number of microgrids grows. Communication-loss parameters for the step-level Gilbert–Elliott model were selected as representative engineering settings (to yield moderate loss rates) rather than calibrated from field communication traces.
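For readers unfamiliar with the Gilbert–Elliott model, the sketch below simulates its simplest form: a two-state Markov chain in which every step spent in the bad state is lost and every step in the good state is delivered. The transition probabilities used here are illustrative assumptions and are not the settings used in the experiments.

```python
import random

def gilbert_elliott(n_steps, p_gb, p_bg, seed=0):
    """Simulate a two-state (good/bad) Markov link; True means the step is lost."""
    rng = random.Random(seed)
    bad = False
    lost = []
    for _ in range(n_steps):
        if bad:
            bad = rng.random() >= p_bg   # stay in the burst unless recovery occurs
        else:
            bad = rng.random() < p_gb    # enter a burst with probability p_gb
        lost.append(bad)
    return lost

def stationary_loss(p_gb, p_bg):
    """Long-run fraction of steps spent in the bad (lossy) state."""
    return p_gb / (p_gb + p_bg)

trace = gilbert_elliott(50_000, p_gb=0.1, p_bg=0.4)
empirical = sum(trace) / len(trace)
expected = stationary_loss(0.1, 0.4)
```

Calibrating `p_gb` and `p_bg` from field traces would amount to matching the observed mean loss rate and mean burst length, which is the step the text identifies as future work.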
All reported experiments use the convex QP relaxation (binary ESS mode relaxed), so convergence issues associated with mixed-integer switching were not observed. In a spot-check of the CLARABEL QP solutions under Path B full settings (20 seeds; burn-in = 3, steps = 8, max iterations = 15, ploss = 0.3), simultaneous ESS charge/discharge above 10^-4 kW did not occur, suggesting the relaxation does not materially activate simultaneous charge/discharge in practice for the studied regimes. However, under more stressed operating conditions the relaxation could in principle admit fractional modes; evaluating full MIQP formulations (and their convergence/anytime behavior under loss) remains future work. Future work should also explore adaptive ADMM penalty selection and dynamic reserve sizing based on observed communication quality.
Conclusion
This paper has presented a resilient distributed model predictive control framework designed for networked microgrids with demand response integration. Three key contributions emerge from this work: (1) an ADMM-based coordination scheme for which standard convergence results apply under reliable communication in the convex relaxation, together with an execution policy providing anytime feasibility; (2) a resilience mechanism that models tie-line mismatch as bounded disturbances and applies two-sided constraint tightening, guaranteeing feasibility when mismatch remains within assumed bounds and sufficient reserve headroom exists; and (3) the inclusion of both shiftable and curtailable demand response as additional flexibility resources.
The simulation results show that RDMPC maintains feasible reciprocal execution under random step-level loss and burst outages in coordination-critical conditions (Path B), while performance relative to B2 is seed-dependent rather than uniformly better. The demand response ablation study confirms the substantial resilience value of DR: removing DR flexibility causes large increases in ENS and executed cost under identical failure conditions. Under grid-connected economic regimes (Path A), RDMPC maintains cost parity with naive DMPC (B2), with differences indistinguishable at n = 50.
The proposed framework provides a principled approach to resilient microgrid coordination that maintains feasibility under communication failures, conditioned on bounded mismatch and sufficient reserve headroom (Theorem 1), without sacrificing economic performance under normal operation.
Supporting information
S1 Table. Notation summary.
Complete list of symbols used in the paper with descriptions and units.
https://doi.org/10.1371/journal.pone.0345857.s001
(S1_Table.PDF)
S1 Data. Supporting data package.
ZIP archive (S1_Data.zip) containing the processed result summaries (JSON) used to generate the manuscript figures and tables (Path A/B, S6, S7), including paired deltas, confidence intervals, and p-values.
https://doi.org/10.1371/journal.pone.0345857.s002
(S1_Data.ZIP)
S2 Code. Simulation code package.
ZIP archive (S2_Code.zip) containing the simulation code and experiment scripts used to generate the raw result files and processed summaries reported in the manuscript.
https://doi.org/10.1371/journal.pone.0345857.s003
(S2_Code.ZIP)
References
- 1. Lasseter R. Microgrids. In: Proc. IEEE power engineering society winter meeting. 2002. 305–8.
- 2. Olivares DE, Mehrizi-Sani A, Etemadi AH, Cañizares CA, Iravani R, Kazerani M, et al. Trends in microgrid control. IEEE Trans Smart Grid. 2014;5(4):1905–19.
- 3. Ouammi A, Dagdougui H, Dessaint L, Sacile R. Coordinated model predictive-based power flows control in a cooperative network of smart microgrids. IEEE Trans Smart Grid. 2015;6(5):2233–44.
- 4. Scattolini R. Architectures for distributed and hierarchical model predictive control – a review. J Process Control. 2009;19(5):723–31.
- 5. Parisio A, Rikos E, Glielmo L. A model predictive control approach to microgrid operation optimization. IEEE Trans Contr Syst Technol. 2014;22(5):1813–27.
- 6. Hans CA, Braun P, Raisch J, Grune L, Reincke-Collon C. Hierarchical distributed model predictive control of interconnected microgrids. IEEE Trans Sustain Energy. 2019;10(1):407–16.
- 7. Bersani C, Dagdougui H, Ouammi A, Sacile R. Distributed robust control of the power flows in a team of cooperating microgrids. IEEE Trans Contr Syst Technol. 2017;25(4):1473–9.
- 8. Boyd S, Parikh N, Chu E, Peleato B, Eckstein J. Distributed optimization and statistical learning via the alternating direction method of multipliers. FNT in Machine Learning. 2010;3(1):1–122.
- 9. Chang T-H, Hong M, Liao W-C, Wang X. Asynchronous distributed ADMM for large-scale optimization—part I: algorithm and convergence analysis. IEEE Trans Signal Process. 2016;64(12):3118–30.
- 10. Chang TH, Liao WC, Hong M, Wang X. Asynchronous distributed ADMM for large-scale optimization—part II: linear convergence analysis and numerical performance. IEEE Trans Signal Process. 2016;64(12):3131–44.
- 11. Xu J, Sun H, Dent CJ. ADMM-based distributed opf problem meets stochastic communication delay. IEEE Trans Smart Grid. 2019;10(5):5046–56.
- 12. Guo J, Hug G, Tonguz O. Impact of communication delay on asynchronous distributed optimal power flow using ADMM. In: 2017 IEEE International Conference on Smart Grid Communications (SmartGridComm), 2017. 177–82. https://doi.org/10.1109/smartgridcomm.2017.8340718
- 13. Zheng L, Cai L. A distributed demand response control strategy using Lyapunov optimization. IEEE Trans Smart Grid. 2014;5(4):2075–83.
- 14. Duan J, Chow M-Y. Robust consensus-based distributed energy management for microgrids with packet losses tolerance. IEEE Trans Smart Grid. 2020;11(1):281–90.
- 15. Li H, Hui H, Zhang H. Consensus-based energy management of microgrid with random packet drops. IEEE Trans Smart Grid. 2023;14(5):3600–13.
- 16. Ananduta W, Maestre JM, Ocampo‐Martinez C, Ishii H. Resilient distributed model predictive control for energy management of interconnected microgrids. Optim Control Appl Methods. 2020;41(1):146–69.
- 17. Pham VH, Ahn HS. Distributed solution methods for MPC-based energy management of interconnected microgrids: dual ascent vs ADMM. 2024.
- 18. Xie P, Jia Y, Chen H, Wu J, Cai Z. Mixed-stage energy management for decentralized microgrid cluster based on enhanced tube model predictive control. IEEE Trans Smart Grid. 2021;12(5):3780–92.
- 19. Jia Y, Yong P, Li C, Meng K, Dong ZY, Sun C. An ADMM-based resilient distributed Economic MPC algorithm for frequency restoration and economic dispatch in networked microgrids. IEEE Trans on Ind Applicat. 2026;62(2):3275–85.
- 20. Seguro JV, Lambert TW. Modern estimation of the parameters of the Weibull wind speed distribution for wind energy analysis. J Wind Eng Indust Aerodynam. 2000;85(1):75–84.
- 21. Wahbah M, EL-Fouly THM, Zahawi B, Feng S. Hybrid Beta-KDE model for solar irradiance probability density estimation. IEEE Trans Sustain Energy. 2020;11(2):1110–3.
- 22. Dupačová J, Gröwe-Kuska N, Römisch W. Scenario reduction in stochastic programming: an approach using probability metrics. Mathemat Program. 2003;95:493–511.
- 23. Heitsch H, Römisch W. Scenario reduction algorithms in stochastic programming. Comput Optimizat Applicat. 2003;24(2–3):187–206.
- 24. Gan L, Wierman A, Topcu U, Chen N, Low SH. Real-time deferrable load control: handling the uncertainties of renewable generation. In: Proc. 4th Int. Conf. Future Energy Systems (ACM e-Energy). 2013. https://doi.org/10.1145/2487166.2487179
- 25. Eckstein J, Bertsekas DP. On the Douglas—Rachford splitting method and the proximal point algorithm for maximal monotone operators. Mathemat Program. 1992;55(1–3):293–318.
- 26. Zeilinger MN, Morari M, Jones CN. Soft Constrained Model Predictive Control With Robust Stability Guarantees. IEEE Trans Automat Contr. 2014;59(5):1190–202.
- 27. Sinopoli B, Schenato L, Franceschetti M, Poolla K, Jordan MI, Sastry SS. Kalman filtering with intermittent observations. IEEE Trans Automat Contr. 2004;49(9):1453–64.
- 28. Hespanha JP, Naghshtabrizi P, Xu Y. A survey of recent results in networked control systems. Proc IEEE. 2007;95(1):138–62.
- 29. Gilbert EN. Capacity of a burst-noise channel. Bell System Technical J. 1960;39(5):1253–65.
- 30. Elliott EO. Estimates of error rates for codes on burst-noise channels. Bell System Technical J. 1963;42(5):1977–97.
- 31. Mayne DQ, Seron MM, Raković SV. Robust model predictive control of constrained linear systems with bounded disturbances. Automatica. 2005;41(2):219–24.
- 32. Goulart PJ, Chen Y. Clarabel: an interior-point solver for conic programs with quadratic objectives. arXiv:2405.12762. 2024.
- 33. Stellato B, Banjac G, Goulart P, Bemporad A, Boyd S. OSQP: an operator splitting solver for quadratic programs. Math Prog Comp. 2020;12(4):637–72.
- 34. O’Donoghue B, Chu E, Parikh N, Boyd S. Conic optimization via operator splitting and homogeneous self-dual embedding. J Optim Theory Appl. 2016;169(3):1042–68.