Unlocking ensemble ecosystem modelling for large and complex networks

The potential effects of conservation actions on threatened species can be predicted using ensemble ecosystem models by forecasting populations with and without intervention. These model ensembles commonly assume stable coexistence of species in the absence of available data. However, existing ensemble-generation methods become computationally inefficient as the size of the ecosystem network increases, preventing larger networks from being studied. We present a novel sequential Monte Carlo sampling approach for ensemble generation that is orders of magnitude faster than existing approaches. We demonstrate that the methods produce equivalent parameter inferences, model predictions, and tightly constrained parameter combinations using a novel sensitivity analysis method. For one case study, we demonstrate a speed-up from 108 days to 6 hours, while maintaining equivalent ensembles. Additionally, we demonstrate how to identify the parameter combinations that strongly drive feasibility and stability, drawing ecological insight from the ensembles. Now, for the first time, larger and more realistic networks can be practically simulated and analysed.


Introduction
Conservation actions aim to help preserve the populations of threatened species, and more generally maintain the health of an ecosystem.However, it can be challenging to foresee the effects of an intervention across the whole ecosystem, leaving the potential for unintended consequences [1,2,3,4], such as a shift in predation to increased consumption of a species of interest (e.g., Roemer et al., 2002 [5]).Quantitative models can provide critical insights for ecosystem management by forecasting species populations into the future, or in response to both anthropogenic and natural perturbations [6,7,8].However, parameterising these models is challenging.
There is typically limited information about the model parameters prior to any analysis [9] due to the difficulty, speed, cost, and uncertainty of expert elicitation and field experiments [10,11,12].Consequently, estimates of model parameters necessary to simulate the ecosystem are often poorly constrained and subsequently yield inconclusive forecasts [13].
Since time-series abundance data is often lacking for model calibration [14,15], parameters can be constrained based on desired features of the ecosystem; two common expected features are feasibility (also referred to as coexistence or persistence) and stability [16].Ensemble ecosystem modelling (EEM) -an extension of qualitative modelling methods [3,10,17] -is a method used to generate an ensemble of plausible ecosystem models by randomly sampling parameter values and retaining those that yield feasible and stable ecosystems [18].Many studies have used similar methods to simulate ecosystem properties such as these and investigate relationships between network structures, interaction strengths, and ecosystem properties [16,19,20,21,22].While studies investigating ecological theory could benefit from new parameterisation regimes, we focus on EEM because of its suitability in conservation planning under limited information.In practice, EEM has been used to assess the indirect consequences of species reintroductions [18,23,24], invasive species management [25], habitat restoration [26], population controls such as baiting [26], and assisted migration [27].
Predictions from EEM can inform conservation decisions in the all-too-common situation of limited data availability; however, the process of parameterising the ensemble becomes increasingly computationally intensive as the size of the ecosystem network increases.There can be a very low probability of randomly sampling feasible and stable systems [28]; for example, Peterson and Bode [27] reported fewer than 1 in 1, 000, 000 parameter sets were both feasible and stable for an ecosystem of 15 species.These constraints are even less likely to be satisfied for larger and more complex networks [19,29].
Due to the low probability of generating ecosystem models in which all species stably coexist, much theoretical literature, starting with the classic work of May [16,19,20,29], suggests it is unlikely for complex ecosystems to exist in nature, whereas others have recently proposed explanations for why they do exist -such as natural selection [30,31].In order to explore these ecological theories and to build decision-making tools, it is beneficial to model feasible and stable ecosystems -especially in the absence of time-series data.Yet in practice, this becomes computationally impractical via random sampling as the food web increases in size [27].
In this paper, we exploit established efficient parameterisation methods within Bayesian statistics to present and demonstrate a new method for efficiently generating an ensemble 1 of parameter sets that define feasible and stable ecosystem models, inspired by sequential Monte Carlo approximate Bayesian computation (SMC-ABC) [32,1].Promisingly, when this new method is compared to the original method proposed by Baker et al. [18] -hereby referred to as SMC-EEM and standard-EEM, respectively -the computational efficiency is increased by several orders of magnitude for larger systems, whilst retaining similar predictions.We demonstrate that SMC-EEM, yields consistent ensembles of ecosystem networks to the standard-EEM method using two common comparisons (parameter inferences and model forecasts) as well as via analysis of model sloppiness [34] -a novel model analysis tool [2] that has only recently been applied for comparison of model ensembles [36].Additionally, we demonstrate how this analysis of sloppiness could identify the key parameter combinations driving feasibility and stability, drawing ecological insight from the obtained ensembles.Therefore, the methods presented here unlock the capabilities of ensemble ecosystem models for representing in, and forecasting for, the complex ecosystem networks that exist in nature.

Ecosystem network modelling
An ecological community of interacting organisms and their physical environment can be represented as an ecosystem network or food web [37].Ecosystem networks represent the interactions between individual species or groups of species (often referred to as nodes), characterising relationships such as predator-prey, host-parasite, competitive or mutualist [37,38].An interaction matrix is used to characterise positive and negative interactions between species that represent a beneficial or detrimental effect on the abundance of the affected species [9].By characterising the direct effects of one population on another, the indirect effects that propagate through an ecosystem can be understood and modelled [39].These interaction networks have been analysed both qualitatively [10,40,41,42] and quantitatively [6,9,13,18,43] in order to forecast ecosystem population trajectories and predict responses to disturbances.
Ecosystems can be quantitatively modelled in many ways -such as non-parametric methods [44], empirical dynamic modelling [45,46] or stochastic autoregressive models [43] (see [12] for an overview).Here, we focus on the common quantitative approach of using the generalised Lotka-Volterra equations for forecasting change in ecosystem node abundances over time [6,9,47], where n i (t) is the abundance of the ith ecosystem node at time t, r i is the growth rate of the ith ecosystem node, N is the number of ecosystem nodes being modelled, and α i,j is the per-capita interaction strength characterising the effect of node j on node i.
If there is no known effect of species j on species i, the parameter α i,j = 0.However, relationships between species can be prescribed via the sign of the interaction strength parameters.For example, a mutualist relationship would require that both α i,j and α j,i are positive.Hence, connecting these Lotka Volterra equations to an ecosystem network informs ecosystem-specific information about the interaction strength parameters α i,j in the model.
In this work, we limit consideration to identifying suitable parameter values for a known model structure, rather than identifying appropriate model structures or networks.
The system represented in Equation ( 1) can be equivalently expressed in a vector form as where n = {n i : i = 1, ..., N } is the vector of species abundances, r = {r i : i = 1, ..., N } is the vector of species growth rates, A = {α i,j : i, j = 1, ..., N } is the N × N interaction matrix of per-capita interaction strengths between ecosystem nodes, and • is the Hadamard or element-wise product.

Feasibility and stability constraints
The EEM method generates an ensemble of plausible parameter sets for the generalised Lotka-Volterra model where there is limited data.To do this, it uses two constraints on the behaviour of the whole ecosystem: feasibility and stability [18].
Since there cannot be negative populations, a feasible ecosystem is one in which equilibrium populations of all species are positive [21].This feasibility condition is met if n * i > 0 for all i, where n * i is the equilibrium population abundance for node i, which is the solution to Following Equation ( 2), this condition can be rewritten conveniently as where n * is the vector of equilibrium population abundances n * i for all species.
A stable ecosystem is one which can recover after small perturbations of species abundances away from equilibrium [22].Specifically, local asymptotic stability (Lyapunov stability) requires that the dynamic system returns to the vicinity of the equilibrium point following a perturbation [21].To determine if the stability constraint is met the Jacobian matrix J must be evaluated at equilibrium n * , such that is the (i, j)th element of the Jacobian matrix J, and f i is the change in abundance for the ith node represented by Equation (1).Equation (5) indicates that the elements of this Jacobian matrix approximate the effect of species j on species i when the system is close to equilibrium [22].The dynamic system is considered locally asymptotically stable if the real part of all eigenvalues (λ i ) of the Jacobian matrix J are negative, i.e.R{λ i } < 0, ∀i = 1, ..., N .For the generalised Lotka-Volterra equations, the elements of the Jacobian matrix evaluated at equilibrium can be calculated as

Ensemble ecosystem modelling
Ensemble ecosystem modelling (EEM) aims to produce an ensemble of parameter sets that yield feasible and stable ecosystems for a given ecosystem network structure.The standard approach to EEM, introduced by Baker et al. [18], is to randomly search a pre-defined parameter space for possible intrinsic growth rate parameters r i and interaction strengths α i,j that together yield a feasible and stable ecosystem.Specifically, the model parameters θ ≡ {α ij , r i } i,j=1,...,N are first sampled from a pre-specified probability distribution which characterises any prior beliefs about the parameter values; this is the prior distribution π(θ).
Next, any sampled parameter sets θ which lead to feasible and stable ecosystems are added to the ensemble of plausible models, creating an ensemble of parameter sets from the target distribution π(θ|s) that have the desired system features s.Throughout this manuscript, we refer to this random sampling process for generating an ensemble of feasible and stable ecosystems -described in Algorithm 1 -as the standard -EEM method.After solving each system of Lotka-Volterra equations, the forecasts are combined to produce an ensemble that can simulate the multitude of potential effects of conservation actions on each of the species within the ecosystem [18,23,24,25,26,27].A summary of the EEM process is depicted in Fig 1.
while the ensemble is not sufficiently large do Propose parameter values using any prior beliefs θ * ∼ π(θ) if the model using θ * meets the feasibility and stability constraints then Save parameter values θ * to the ensemble Forecast using the ensemble of ecosystem models Figure 1: Overview of the ensemble ecosystem modelling (EEM) process.In the present work, we present the SMC-EEM method and compare it to the standard-EEM method [18].The inputs and outputs of the EEM process are the same regardless of the parameterisation method (SMC-EEM or standard-EEM) used.
While the standard-EEM method can produce a representative ensemble of feasible and stable ecosystems, in practice it is too computationally intensive to be practical for large or dense ecosystem networks.We show here that the efficiency of EEM can be greatly improved by exploiting efficient sampling methods developed for Bayesian statistics, such as sequential Monte Carlo-approximate Bayesian computation (SMC-ABC).To explain this, we first demonstrate the connection between EEM and approximate Bayesian computation (ABC).

Approximate Bayesian computation
ABC is a statistical inference technique used to estimate the parameters of complex models by comparing simulated data to observed data [48,49,50,51].The technique involves simulating data from the model using prior information about the model parameters θ as specified by the prior distribution π(θ).The simulated data ŷ (from the model specified by θ) is then compared to the observed data y via a summarisation function S that reduces the full dataset to a set of summary statistics.A discrepancy function ρ(S(y), S( ŷ)) is used to measure the similarity between the simulated and observed datasets [50], and if the simulated data closely matches the observed data, the parameter values are accepted as plausible.The target (posterior) distribution, which is a distribution of the parameters conditional on the available data π(θ|y), can then be approximately sampled using ABC accept-reject [52], or more efficient methods [48] such as Markov chain Monte Carlo ABC (MCMC-ABC) [53,54] or sequential Monte Carlo ABC (SMC-ABC) [55,56].For the interested reader, helpful reviews on approximate Bayesian methods can be found in Beaumont et al., [51], Drovandi [57], or Sisson et al., [49].

Connections between approximate Bayesian computation and ensemble ecosystem modelling
While similarities have been drawn between ABC and EEM [18,26], this connection has not been exploited in the literature to our knowledge.Where ABC uses summary statistics to capture key information in the observed data, EEM applications have no abundance data and instead assume that ecosystems are observably feasible and stable.While we suggest that EEM is not an ABC approach in the statistical sense, we propose to frame these system features as summary statistics and adopt ABC-based sampling methods.In this way, the output of EEM should instead be considered a constraint-informed prior, rather than a posterior distribution -as feasibility and stability are not directly observed.However, by placing EEM within an ABC framework, the vast literature on efficient sampling methods developed for ABC can be used to efficiently generate an ensemble of plausible ecosystem networks.
There are many different ABC methods available [49], and the simplest is accept-reject ABC.Table 1 reveals that the steps of the standard-EEM method (Algorithm 1) are exactly analogous to the ABC accept-reject method [58].Through both methods, the model parameters θ = {α ij , r i : i, j = 1, ..., N } are calibrated using prior information about the parameters and summaries of the data (feasibility and stability).
In the ABC accept-reject method depicted in Table 1, the aim is to minimise the discrepancy (ρ) between the modelled and observed data so that they match as much as possible, such that ρ < ϵ where the target discrepancy ϵ is small.Equivalently, in the standard-EEM method, the aim is for the features of modelled ecosystems to match what is assumed to be true for a real ecosystem of coexisting species -feasibility and stability.
Hence, ABC can be mathematically matched to EEM by introducing a discrepancy function ρ that becomes equal to a target discrepancy of zero (ϵ = 0) when the modelled ecosystem is feasible and stable.To this end, we define a discrepancy function ρ(θ) for an ecosystem represented by parameters θ = {r i , α i,j : i, j = 1, . . ., N }, as Step Standard-EEM method [18] ABC accept-reject method [ Reject the parameter values if the discrepancy between modelled and observed data is too large, such that θ * is rejected if ρ(S(ŷ), S(y)) > ϵ for some tolerance ϵ.

4
Repeat steps 1-3 until the ensemble is sufficiently large.
Repeat steps 1-3 until a sufficiently large ensemble is obtained.
Table 1: Placing EEM within an ABC framework.A comparison of the steps in the standard-EEM method and the ABC accept-reject method shows that the two methods are analogous.
where v f (θ) is a measure of infeasibility of all ecosystem nodes (the negativity of equilibrium populations n * i ), and v s (θ) is a measure of instablility of all ecosystem nodes (the positivity of the real parts of the Jacobian eigenvalues λ i ).Using the discrepancy function ρ(θ) defined in Equations ( 7)-( 9), a feasible and stable ecosystem possesses ρ(θ) = 0; however, any infeasibility or instability will result in ρ(θ) > 0.

Sequential Monte Carlo-approximate Bayesian computation
By placing EEM within an ABC framework we can take advantage of advanced ABC sampling methods beyond ABC accept-reject sampling.Within the ABC framework, there is a large suite of methods for sampling from the approximate posterior -such as ABC accept-reject, MCMC-ABC and SMC-ABC [51] -which each present different advantages and disadvantages.In the present work, we used SMC-ABC for sampling because it can be more efficient for applications with a low probability of randomly sampling acceptable parameter values [59] and this is the key computational bottleneck in ecosystem generation for large and complex networks.Hence, in the remainder of this section, we provide a brief overview of SMC-ABC as it pertains to ecosystem generation.SMC-ABC works by moving an ensemble of parameter sets through a sequence of distributions, ending at the target distribution [55].Typically starting with an ensemble drawn from the prior distribution p 0 , these parameter sets are manipulated to become representative of the next distribution in the sequence p 1 and this process is repeated until the ensemble is representative of the target distribution p T .In SMC-ABC, the sequence of distributions t = 0, ..., T is a sequence of decreasing maximum discrepancies ϵ, such that the tth distribution is p t (θ|ρ(θ) ≤ ϵ t ), where ϵ t ≤ ϵ t−1 .This sequence, whether prespecified or adaptively selected within the algorithm, commonly progresses the ensemble from the prior (maximum discrepancy ϵ 0 = ∞) to some target discrepancy (maximum discrepancy ϵ T ).In this way, SMC-ABC breaks up the sampling problem into a series of simpler problems [60].Provided that the sequence of distributions is chosen sensibly so that the effective sample size throughout the algorithm is maintained at a reasonable level, the sequence itself does not affect the target distribution, merely the speed that the target distribution is obtained.
In SMC, a distribution in the sequence is characterised by many independent and weighted parameter sets referred to as 'particles'.The weight attributed to each particle is determined by both the prior density and the discrepancy of the parameter set.As such, each particle θ i contains a proposed value for all model parameters and a weighting, and subsequently an ensemble of M particles make up an empirical approximation of the distribution p t .
Each distribution in the sequence, p t , can be approximated by manipulating the ensemble characterising the previous distribution p t−1 , using importance sampling and MCMC-ABC techniques [1].To progress the particles from one distribution to the next, three steps are iteratively applied: reweighting, resampling and moving [48,56].
1. Reweighting: The prior density and discrepancy for all particles is calculated and used to weight the particles.This ensures parameter sets that create outputs similar to the observations are more highly weighted.
2. Resampling: Particles are resampled according to their weight, such that highweighted particles are duplicated and low-weighted particles are eliminated.This focuses the particles into areas of the parameter space that can yield low discrepancies.
3. Moving: MCMC-ABC [54] is used to move the particles according to the current distribution in the sequence p t (θ|ρ(θ) ≤ ϵ t ).This diversifies the ensemble (avoiding duplicates) by jittering each parameter set relative to its current values.
By iterating through these three steps, the cluster of weighted particles can progress through the sequence of distributions to the target distribution.Algorithm 2 shows a summary of an adaptive SMC-ABC method [55], adapted to the EEM context by building on Drovandi and Pettitt's implementation [1].Further details of this algorithm are provided in File S1.We can think of the ABC accept-reject method (standard-EEM) as "uninformed": we reject models that do not fit the constraints, without learning from them.Instead, a more informed sampling method, such as SMC-ABC, utilises information from rejected models.SMC-ABC methods use a sequence of decreasing tolerances, so that parameter values are proposed from an iteratively more "informed" distribution, rather than the prior [60].As a result, SMC-ABC can perform more efficiently than ABC accept-reject for simulating rare events (when the prior and target distributions are very different) [51].

Analysis of model sloppiness
To compare the ensembles produced by standard-EEM and SMC-EEM, an analysis of model sloppiness can be used.Analysis of model sloppiness is a data-informed sensitivity analysis [61,62,63] that has recently been shown to provide useful insights for biological and ecological models parameterised using Bayesian inference [34,2,36].In the context of ecosystem generation, analysis of model sloppiness can be used to provide a comparison of the model ensembles generated via different Bayesian methods.
Whilst ensembles can (and should) also be compared based on the estimated marginal parameter distributions, this method can be misleading when individual parameter values are unconstrained.Complementarily, analysis of model sloppiness can be used to compare tightly constrained parameter combinations (e.g. products and ratios of parameters) between different ensembles, to indicate their similarity even when individual parameter values are relatively unconstrained [36].
The analysis of model sloppiness uses an eigendecomposition of a parameter-data sensitivity matrix to identify the directions in parameter space, with associated magnitudes, that are most informed by the data [34,61].Here the "data" refers to the feasibility and stability constraints.We use the posterior covariance sensitivity matrix -the inverse of the empirical covariance matrix of the logarithmically transformed ensemble [34,2] -to capture how tightly constrained parameters are after parameterisation.Hence, using this analysis on an ensemble generated via standard-EEM yields the directions in parameter space that are important for obtaining feasible and stable systems.
These important directions can be expressed as parameter combinations [34], known as eigenparameters θj : where is the jth eigenvector of the sensitivity matrix, n p is the number of model parameters, and θ i is the ith parameter in the model [2,62].Using a logarithmically transformed ensemble allows this eigenparameter to be expressed as a product (as in Eq (10)) rather than a sum, which is common in the literature [34,2,61].Each eigenparameter has a corresponding eigenvalue λ j that indicates how tightly constrained the parameter combination is, such that the largest eigenvalue (λ 1 ) corresponds to the most sensitive eigenparameter θ1 .These parameter combinations (expressed as in Equation ( 10)) can be directly analysed to identify important mechanisms [34], or visually represented to identify parametric trends [2] that drive the model to match the data (in this case feasibility and stability).For further information about the analysis of model sloppiness method or interpreting eigenparameters, see Monsalve-Bravo et al., 2022 [34] or Vollert et al., 2023 [2].
Additionally, we can substitute a set of parameter values into each parameter combination.
Repeating this process for all parameter sets in an ensemble yields a distribution for each eigenvalue which can be compared to assess ensemble similarity across the important directions in parameter space [36].Comparing the distributions of the standard-EEM and SMC-EEM ensembles will reveal whether the important parameter combinations for feasibility and stability are similar between the two methods, indicating ensemble similarity even if individual parameters are unconstrained.Hence, the analysis of model sloppiness here provides a critical assessment of the similarity of the ensembles produced by the two different methods of ecosystem network generation (standard-EEM and SMC-EEM).

Case studies
The standard-EEM and SMC-EEM methods were compared in two ways.Firstly, the two methods were compared generally across many randomly generated ecosystem network structures (referred to as the "simulation study").Secondly, the methods were compared via three case studies representing natural ecosystems.An ecosystem network representing semiarid Australia -originally used by Baker et al. [18] to introduce EEM -was investigated as an example network where standard-EEM is practical for ecosystem generation within a reasonable computation time.A network of Phillip Island, Australia [25] was used to showcase an example where SMC-EEM is much faster than standard-EEM for ensemble generation.Finally, a coral reef food web network proposed for the Great Barrier Reef [64] was investigated as an example of interest where the standard-EEM method is computationally impractical.For the simulation study and the three case studies, the computation times and the resulting ensembles produced by each method were compared.

Simulation study
To generally test the two methods, many ecosystem networks were simulated.Following the practice of May [19] (later replicated by many other studies, e.g., Allesina and Tang [28]), a random matrix theory approach was used, whereby the sign structure of an interaction network was randomly assigned, as follows.
A network of S species requires a S × S interaction matrix.The diagonal elements of the matrix (the effect of a species on itself) are negative so that the species populations are selfregulating.Each off-diagonal element of the matrix was treated independently via a two-step process.Firstly, the interaction was made non-zero with a probability c -this connectance parameter specifies the probability of direct interaction between two species [28].We focused our results on ecosystems generated with a connectance probability of c = 0.5; however, we also explored varying this probability to c = 0.25 and c = 0.75.Secondly, each non-zero element was allocated either a positive or negative interaction with probability p = 0.5 (such that there was an equal probability of positive or negative interactions).Network structures consisting of between 3 and 15 species (inclusive) were generated with this approach.
For each randomly generated network structure of 3-15 species, 1000 feasible and stable parameterisations were found using the ensemble generation methods discussed previously (standard-EEM and SMC-EEM).We aimed to generate and simulate 1000 ecosystems of each size.However, due to the computational burden of the experiment, we were unable to simulate this many large networks.Instead, there are a minimum of 100 ecosystems simulated for each network size.For each ecosystem network considered in this work (both simulated and natural case studies) the parameterisation used prior distributions of |α i,j | ∼ U(0, 1), and r i ∼ U(0, 5) following Baker et al., 2017 [18].

Case study 1: semiarid Australia network
The two ensemble generation methods (standard-EEM and SMC-EEM) were then applied to an eight-node ecosystem network representing semiarid Australia (see Figure 1b of [18]).This ecosystem network was previously used to introduce the standard-EEM method and to evaluate the plausible consequences of dingo reintroduction to a national park in Australia [18].
Since standard-EEM has been previously applied to this case study it serves as a useful test case where both methods are expected to generate an ensemble within a practical time frame.In this network, interaction matrix elements that do not represent direct effects of species on each other are set to zero and thus do not require sampling; if this were not the case then ecosystem generation for this (eight-node) network would require sampling of 72 parameters (total 64 interaction matrix elements α i,j and 8 growth rates r i ).Instead, this eight-node network has 33 parameters when represented as a generalised Lotka-Volterra model, which is small compared to other ecosystem networks observed in nature that have been quantitatively investigated (e.g.Booderee National Park represented as 20 nodes and 163 parameters [9]).

Case study 2: Phillip Island network
Next, we generated an ensemble of ecosystem models using both standard-EEM and SMC-EEM for a 22 node network which represents Phillip Island, Australia (see Figure 2 of [25]).This network is considerably larger and more complex than the semiarid Australian network -there are 110 parameters to be estimated when represented as a Lotka-Volterra system -such that the SMC-EEM method is expected to generate an ensemble faster than the standard-EEM method.
Case study 3: Great Barrier Reef network Lastly, we demonstrate the benefits of the SMC-EEM method using a case study where it is impractical to use standard-EEM.Rogers et al. [64] produced a conceptual 16-node coral reef food web from the literature which depicts a Great Barrier reef ecosystem (see Figure 1 of [64]).In addition to being a large, this ecosystem network is also densely connected, resulting in an extremely low probability of sampling a feasible and stable model.

Simulation study
Our new SMC-EEM method is orders of magnitude faster than the standard-EEM method for larger ecosystems when compared generally across many randomly generated ecosystem network structures (Fig 2).We observe that for smaller ecosystems the standard-EEM method may be more computationally efficient due to the additional computational processes required by the SMC-EEM method.This key result also holds for different connectance probabilities c (Fig S1 ).
More generally, the computation time of the standard-EEM method scales linearly with the probability of randomly selecting parameter values that are feasible and stable (Fig 3).This probability -known as the acceptance rate -is an emergent property of the model, prior and constraints, and can be estimated as the proportion of tested parameter sets that were accepted using standard-EEM.In our simulation study, the SMC-EEM method was computationally more efficient for ecosystems with an estimated acceptance rate smaller than 0.005 (vertical dashed line in Fig 3), such that less than 1 in 200 proposed systems are feasible and stable.Here, the SMC-EEM method is faster than the standard-EEM method because fewer parameter values need to be trialled (Fig S2), making the SMC-EEM method more statistically efficient.Though standard-EEM can outperform SMC-EEM at high acceptance rates, both methods were computationally inexpensive in these scenarios.
In our simulation study, ensembles of 1000 feasible and stable ecosystems could be generated in less than 12 seconds via either method in networks with an acceptance rate greater than 0.005.
Additionally, we find that the ensembles of ecosystem models produced by the standard-EEM and SMC-EEM methods are consistent with each other in their estimated parameter distributions, eigenparameter distributions, and time-series predictions (Fig 4 ).For example, for

Case study 1: semiarid Australia network
For both SMC-EEM and standard-EEM methods, it took less than a minute to generate a 10,000 model ensemble for the semiarid Australia network, though the standard-EEM method was faster (Table 2).These computation times are consistent with our previously observed relationship between acceptance rate and computation time (Fig 3), as the estimated acceptance rate for this network is 0.11, which is much larger than 0.005.As the acceptance rate for the semiarid Australia network is high, only two SMC-ABC iterations were required to generate the ensemble, making the SMC-EEM method statistically inefficient.For this eight-species ecosystem network, with connectance c = 0.39, the standard-EEM method would be the best choice of method, as it is faster and easier to implement.

Case study 2: Phillip Island network
The standard-EEM method required 108 days to generate 100,000 ensemble members for the Phillip Island network; however, SMC-EEM completed this task in under 6 hours (Table 3).(It should be noted that these computational exercises were performed in parallel on 12 cores.)The SMC-EEM method produced the ensemble in 0.22% of the time required by standard-EEM because it required 0.13% of the simulations.This massive computational saving is consistent with the results presented in Fig 3, as the acceptance rate for the Phillip Island network was 1.7 × 10 −6 .The SMC-EEM method is thus the only practical option, out of the two methods, for this 22-species network.

Standard-EEM SMC-EEM
Computation time (sec) 9.3×10Here the blue and red densities match almost exactly, demonstrating that the outputs of the standard-EEM and SMC-EEM methods are consistent.
indicating that the information gained about the parameters is consistent between methods.

Case study 3: Great Barrier Reef network
Parameterising the Great Barrier Reef network [64] for 100,000 ensemble members took 21 hours for the SMC-EEM method (Table 4) and could not be practically computed using the standard-EEM method.Based on a preliminary analysis of 20 ensemble members, it took approximately 40 hours to generate a single ensemble member using standard-EEM with an acceptance rate of O(10 −9 ), hence an ensemble of this size would take years to produce (estimated 450 years).The SMC-EEM method is thus the only practical option, out of the two methods, for this 16-species network.Computation time and the number of simulations required to generate an ensemble of 100,000 models using SMC-EEM for the Great Barrier Reef ecosystem network.The standard-EEM method could not be used to generate an ensemble of this size within a practical time-frame, so the results in Table 4 are estimated using an ensemble of 20 parameter sets.
Since we cannot produce a standard-EEM ensemble, instead we compared the outputs of two independently obtained SMC-EEM ensembles to assess their reproducibility.This indicates if SMC-EEM can adequately sample the parameter space to produce a representative ensemble.The two independent SMC-EEM ensembles of 100,000 yield consistent results when comparing the predicted equilibrium abundances ( , indicating that the information gained about the parameters is consistent across independent runs.Such a result is very encouraging given that, for this case study, we have yielded a representative approximation of 118-dimensional space with 100,000 parameter sets each.Once an ensemble is obtained, a data-informed sensitivity analysis -such as the analysis of model sloppiness -can be used to identify the important parameter combinations for achieving feasibility and stability.For this Great Barrier Reef ecosystem network, each of the five most tightly constrained parameter combinations focuses on balancing the positive growth rates of basal species, or the self-regulation of top predators (see Fig S12 and Fig S13).
For example, using the information in Fig S12, the first eigenparameter can be expressed as where r i is the positively constrained intrinsic growth rate for species i, α i,j is the interaction parameter for the effect of species j on species i, and the relevant species for this equation are represented as T A for turf algae, M A for macroalgae, C for coral, D for detritus and U for urchins.This eigenparameter describes the balance between the proliferation of turf algae and the negative impacts on its abundance: mainly competition with macroalgae (including the proliferation rate of macroalgae), but also other lower trophic species including detritus, coral and urchins.Similar relationships can be seen for the five most influential parameter combinations (Fig S12).
This could indicate that given growth rate parameters are constrained to be only positive, and self-interactions between species are constrained to be only negative (self-regulating), the most important features for parameterising feasible and stable ecosystems are a high abundance of basal species and limited populations of top-predators.This well-observed result, while not a surprising insight, indicates how this analysis could be used to identify key drivers for developing feasible and stable ecosystems.

Discussion
In this work, we have presented and demonstrated a method that, for the first time, can rapidly generate ensemble ecosystem models for higher dimensional ecosystem networks.This new method, which we call the SMC-EEM method, can generate consistent ensembles to the current gold-standard method -standard-EEM -whilst being orders of magnitude faster for large and densely connected networks.On a Phillip Island case study [25] SMC-EEM reduced the computation time from 108 days to 6 hours, with indistinguishable time-series predictions, estimated distributions of model parameters and model parameter combinations.For a Great Barrier Reef network, we showed that standard-EEM was not capable of producing a large ensemble, such that SMC-EEM was the only practical option.This new method permits large and complex ecosystems -as observed in nature -to be practically simulated and analysed.
The best ecosystem generation method depends on the properties of the ecosystem network obtained for a connectance probability c = 0.5 as in Fig  3), the number of species is a much more intuitive measure and does not require prior calculations to estimate, unlike the acceptance rate.
When considering the eight-species semiarid Australian ecosystem network (with c = 0.39), based on the number of species it would be unclear beforehand whether SMC-EEM or standard-EEM would be faster (Fig 2).However, by estimating the acceptance rate as 0.11 (roughly 1 in 9 parameter sets tested were feasible and stable), Fig 3 clearly shows standard-EEM is expected to outperform SMC-EEM for this network.Practically, both standard-EEM and SMC-EEM are acceptable choices for this case study as they both generated the model ensemble within a minute; however, we must acknowledge that standard-EEM is a simpler process (making it more straightforward to implement in computer code) and generated the ensemble faster (Table 2).
In contrast, for the 22-species Phillip Island case study (with c = 0.18) and an acceptance rate of 1.7 × 10 −6 (roughly 1 in 600,000 parameter sets tested were feasible and stable), it is clear from both Figs 2 and 3 that SMC-EEM will be significantly faster.When applying standard-EEM to this system, we found it would take 108 days to generate the ensemble (Table 3), making SMC-EEM the only practical option of the two methods.
Lastly, the 16-species Great Barrier Reef network (with c = 0.4) and an acceptance rate of O(10 −9 ) (roughly 1 in a billion parameter sets tested were feasible and stable) is expected to be orders of magnitude faster according to the trends shown in Figs 2 and 3, and the observed computation times (Table 4) were within the credible ranges indicated by these trends.Here we note that the acceptance rate for this network is considerably smaller than for the Phillip Island network, and this could be due to being more densely connected, or the structure of the network itself [65,66].

Comparing the ensembles generated by the two methods
In this work, we used the estimated parameter distributions and time-series predictions to compare ensembles produced using the two methods.Additionally, the distributions of the stiff eigenparameters, obtained using an analysis of model sloppiness, provided an additional diagnostic comparing the similarity of the ensembles.The analysis of model sloppiness can indicate how similar the ensembles are, whilst accounting for parameter interdependencies [34,36] -a perspective not easily observed via the estimated marginal distributions, quantities of interest, or via time-series predictions.We, therefore, encourage the comparison of Bayesian inference method-generated ensembles via comparison of eigenparameter distributions alongside a comparison of marginal parameter distributions, as this provides a more comprehensive comparison.For the ensembles tested in this work, the eigenparameter distributions did not indicate any substantive differences (Figs 4, S4 and S10).
To our best knowledge, the SMC-EEM method outputs match those produced by the standard-EEM method (Figs 4-7, and S3-S10).However, users should be cautious when selecting the ensemble size for SMC-EEM.While standard-EEM always randomly samples from the parameter space to propose new values, SMC-EEM proposes new values relative to current values in the ensemble (via the multivariate Gaussian proposal distribution centered on the current parameter value within the MCMC algorithm).Hence, if there are not enough particles to cover a high-dimensional parameter space, the SMC-EEM method may not sufficiently explore the parameter space, thereby creating an ensemble that is not representative and is different to the distribution of ensembles produced by standard-EEM.This difference in ensembles occurred when using only 10,000 ensemble members for both the Phillip Island and Great Barrier Reef case studies; however, the ensembles were found to be consistent for 100,000 ensemble members.
For ecosystem networks that are not overly complex, it is possible to assess whether there are enough parameter sets by comparing results of SMC-EEM and standard-EEM.But for high-dimensional ecosystem networks, it will not be practical to compare outcomes since the latter will have impractically high computational costs (as for the Great Barrier Reef case study).We therefore recommend multiple independent runs of the SMC-EEM method and a visual assessment of whether the ensemble is reproducible (through the estimated parameter distributions, stiff eigenparameter distributions, and time-series predictions), especially if the ecosystem network is as large as the Great Barrier Reef network explored here (see Fig 7).Hence, while the foundational analysis presented here demonstrates that the SMC-EEM method finally unlocks analysis of higher-dimensional networks, its accuracy will be limited primarily by the size of the ensemble.

Implications for ecosystem network generation in nature
While the main motivation behind SMC-EEM was to maximise the capabilities of the conservation tool, this parameterisation regime could also be of use for drawing theoretical insights.There is substantial debate in the literature regarding which features of natural ecosystems make them more likely to be stable and feasible (e.g., [67,28,68]).Some literature suggests that larger and more connected networks are less likely to be feasible and stable [19,16,28] because there is a lower probability of randomly sampling parameter values to satisfy these two constraints.However, treating the probability of generating a feasible and stable system through random sampling as a proxy for the likelihood of these systems developing in nature creates a disparity: complex food webs are actually observed in nature, yet are perceived theoretically as highly unlikely.
Interactions in ecosystems have been shaped by processes such as co-evolution, niche partitioning, and resource competition [69], making it unlikely that interactions in ecological networks are random.Additionally, the "community assembly" hypothesis [70] suggests that the development and persistence of large food webs may be the result of natural selection of species survival (from an even larger pool of initial species) whose interaction strengths possess particular statistical properties [30,31].These theories imply that the probability of randomly sampling parameter values to satisfy feasibility and stability does not indicate the probability of the ecosystem existing in nature.
Thus, instead of being limited by the conceptual argument that the inability to efficiently generate plausible ecosystems via random sampling suggests these ecosystems cannot exist in practice, a key implication of the community assembly hypothesis is that we can instead take advantage of the full suite of Bayesian approaches (as performed here) to identify an ensemble of parameters that can plausibly generate large ecosystems in a computationally efficient manner.The SMC-EEM method also has the potential (beyond specific case studies) to broadly explore the consequences of community assembly on the general properties of ecosystem networks that form in nature [30].
Now that we can quickly produce large ensembles of parameter values that match ecological theory, insights can be drawn from the results.This method could be used to compare the relative difficulties in obtaining models that meet different constraints; for example, is there a lower probability of obtaining feasible ecosystem models, or stable ecosystem models?Alternatively, practitioners could compare the estimated parameter values, or values of interest -such as abundance correlations between species -across ensembles parameterised using different ecological theories.
In our implementation, we assumed parameters were independent in the prior distribution; however, SMC-EEM can accommodate other prior choices (e.g., prior parameter dependencies such as a trophic transfer efficiency constraint [18] or intraspecific density dependencies [71] can be implemented using conditional distributions).However, assuming prior parameter independence does not prevent dependencies from being inferred when fit to the constraints.By analysing the covariance of the parameters once incorporating the constraints (using a method such as the analysis of model sloppiness), the parameter combinations that are important for feasibility and stability could be assessed, as we have shown in our analysis.
When we applied this analysis to the Great Barrier Reef case study, it suggested that high populations of basal species and low populations of top predators were the most important factors for achieving the constraints.While this result is unsurprising, it is also somewhat uninsightful.This is likely due to the relatively uninformed prior distributions used in the analysis (following those of Baker et al., [18]) that forced intrinsic growth rate parameters to be positive and had equal magnitude across all species.Growth rate prior distributions with negative values, or other prior distributions, could easily be used within SMC-EEM instead.However, any effect of these prior distributions on the ensemble would in turn affect this analysis, such that we recommend testing various prior specifications to assess its impact.

Computational efficiency unlocks new opportunities for improving ecosystem model realism
In the present analysis, we considered ecosystem networks generated by generalised Lotka-Volterra equations -as this is the mathematical model that EEM has been thus far applied to [18] -however, alternative models have been proposed to offer more complex representations of ecosystem interactions in nature, such as different functional responses [72], or more recently, higher-order (i.e.beyond pairwise) interactions [73].The generalised Lotka-Volterra model is computationally convenient for EEM because the equilibrium feasibility and stability conditions are readily computable via algebraic formulae (Equations ( 4) and ( 5)).A different choice of model or constraints could be much more computationally expensive to simulate and include many more parameters for calibration -e.g., models with predator learning or prey saturation [72], or constraints on ecosystem dynamics outside of the system equilibrium [74].The statistical efficiency of the SMC-ABC-based approach underlying our SMC-EEM method therefore offers a significant advantage over standard-EEM if other (potentially more realistic) model types and constraints are used.We surmise that the computational gains shown in the present work are expected to extend beyond the generalised Lotka-Volterra models, and feasibility and stability constraints considered here.
Within our SMC-EEM method, the choice of discrepancy function drastically reduced the computation time in comparison to the standard-EEM method for larger networks (Fig 3).We used a simple discrepancy function to indicate a measure of how infeasible and unstable an ecosystem parameterisation is (Equation ( 7)); however, there may be better choices for the discrepancy function which further improve the efficiency of the method -such as replacement of the sums and absolute values in Equation ( 7) with other distance measures like the Euclidean norm, or weighting the infeasibility and instability sums differently.We leave these investigations for future work, especially as the results regarding the "best" discrepancy function may be highly model and constraint-specific.
When additional constraints are imposed on the ensemble -which further reduces the acceptance rate -maintaining computational efficiency carries even greater importance than seen here.Case studies in the literature have considered constraints in addition to feasibility and stability, including feasibility and stability for subsets of the ecosystem [18,24], randomly assigned species interactions [25,27] and additional constraints on combinations of parameters (e.g.trophic energy transfer constraints) [18,24].While the inclusion of such additional constraints in the SMC-EEM method is possible, it can require more careful algorithmic programming than the standard-EEM method.
Additional data on population estimates, where available, should be used to inform the model parameters further.Since the constraints we used to parameterise SMC-EEM are not directly observable, we can consider the resulting ensemble as a constraint-informed prior distribution [75] which can then be updated to incorporate any available time-series data in a subsequent Bayesian analysis.Furthermore, it would be interesting to analyse the effects on population forecasts of the constraint-informed prior compared to the relatively uninformed prior.Alternatively, the constraints within the discrepancy function could be redefined where additional information about species abundance estimates is available (see e.g., Neutel et al [76]).Parameter sets with equilibrium abundances near the estimates could be given a lower discrepancy according to a Gaussian distribution, or equilibrium abundance limits could be defined -as in the feasibility constraint (see Equation ( 8))to avoid unreasonable population sizes.Though connecting these data with feasibility and stability constraints, we hope that ensemble ecosystem modelling can be more accurate for conservation decision-making.

Conclusion
Through SMC-EEM we have unlocked ensemble ecosystem modelling for large and complex networks.Increasing the computational efficiency means that users only need to wait hours, rather than months, to analyse the risks and potential consequences of conservation actions in remote and understudied ecosystems with limited data.Through drastically improved computational efficiency, SMC-EEM brings new opportunities to explore more realistic ecosystem models and constraints to study the large and complex ecosystem networks that exist in nature.
Supporting information S1 Code.Software for using the methods and reproducing the results presented in the manuscript.The code used for this analysis was implemented in MATLAB (R2022b) and is freely available for download on Figshare at https://doi.org/10.6084/m9.figshare.23707119.v2.The code (495MB) associated with this manuscript contains an 'EEM Methods' folder for using the methods and a separate 'Results Replication' folder for reproducing the results and figures.This figure shows the medians (dots) and 7.5-92.5% quantiles (error bars) of computation times for producing the results.Note, the computation time for any one ecosystem network was capped at 10 4 seconds due to the computational burden of the simulation study.More densely connected ecosystems (higher value of c) increase the computation time of both methods and decrease the network size at which the SMC-EEM method becomes more computationally efficient than the standard-EEM method.The columns of this table can be interpreted using Equation (10).Notice, that the most important parameters are all growth rates for lower trophic species, and self-regulation for top predators.Practitioners should note that within this algorithm there are three tuning parameters to be selected, whose values can have substantial impact on computation time and posterior samples if poorly chosen.These tuning parameters, their potential effects, and our suggested values are outlined in Table S1.

Figure 2 :
Figure 2: Ensemble generation times for different network sizes.The computation time required to parameterise an ensemble of 1000 feasible and stable ecosystem models using both the standard-EEM and SMC-EEM methods.This figure shows the medians (dots) and 7.5-92.5% quantiles (error bars) of computation times.Note, the computation time for any one ecosystem network was capped at 10 4 seconds due to the computational burden of the simulation study.

Figure 3 :
Figure 3: Ensemble generation times for different acceptance rates.The parameterisation computation times of Fig 2 with respect to the acceptance rate of the standard-EEMmethod -an estimation of the probability of randomly sampling a feasible and stable system given a network with a pre-specified structure.Acceptance rates are logarithmically displayed from 100% acceptance (left) to very small percentages (right).Note that the computation time for any one ecosystem network was capped at 10 4 seconds to maintain practical computations in the simulation study.

Figure 4 :
Figure 4: Outputs of a simulated network.Example outputs from a randomly chosen ecosystem simulated in Fig 3 using ensembles obtained from the prior distribution (grey),standard-EEM method (red) and SMC-EEM method (blue).In each case, notice that the standard-EEM method and SMC-EEM method produce consistent results that are significantly different to the prior: (a) A six-species ecosystem network generated using c = 0.5.This example ecosystem has 27 parameters and a 0.037 probability of randomly selecting feasible and stable parameter values.(b) Estimated marginal parameter distributions estimated via both methods and compared to the prior distribution.(c) Marginal distributions of the nine stiffest eigenparameters for each ensemble obtained from an analysis of model sloppiness.(d) The distribution of equilibrium population abundances predicted for each ensemble.Note that the range of equilibrium populations for the prior distribution was O(10 4 ), so is very diffuse (and hence barely visible in these plots) compared to the ensemble-predicted distribution abundances.(e) Time-series predictions of population abundances for each ensemble of ecosystem models using randomly chosen initial conditions (median population prediction and 95% credible intervals shown).14

For
this network (Fig 5a), we found that the SMC-EEM method produced consistent estimated distributions of equilibrium abundances to the standard-EEM method (Fig 5b).We also observe similar estimated parameters (Fig S3), stiff eigenparameters (Fig S4), and time-series predictions (Fig S5) for the standard-EEM and SMC-EEM produced ensembles.

Figure 5 :
Figure 5: Equilibrium abundances for the semiarid Australia network.Ensemble ecosystem modelling for an ecosystem network representing semiarid Australia parameterised using standard-EEM and SMC-EEM methods.(a)The semiarid Australian ecosystem network[18] consisting of eight nodes and 33 parameters when represented as a Lotka-Volterra system.(b) Distributions of equilibrium abundances from the prior distribution (grey), standard-EEM (red) and SMC-EEM (blue) ensembles of ecosystem models.Note that the range of equilibrium populations for the prior distribution is very diffuse (and hence barely visible in these plots) compared to the ensemble-predicted distribution abundances.Here the blue and red densities match almost exactly, demonstrating that the outputs of the standard-EEM and SMC-EEM methods are consistent.

Figure 6 :
Figure 6: Equilibrium abundances for the Phillip Island ecosystem network.Ensemble ecosystem modelling for an ecosystem network representing Phillip Island parameterised using standard-EEM and SMC-EEM.(a) The Phillip Island ecosystem network [25] consists of 22 nodes, with connectance c = 0.18, and 110 parameters when represented as a Lotka-Volterra system.(b) Distributions of equilibrium abundances from the prior distribution (grey), standard-EEM (red) and SMC-EEM (blue) ensembles of ecosystem models.Note that the range of equilibrium populations for the prior distribution is very diffuse (and hence barely visible in these plots) compared to the ensemble-predicted distribution abundances.Here the blue and red densities match almost exactly, demonstrating that the outputs of the standard-EEM and SMC-EEM methods are consistent.

Figure 7 :
Figure 7: Equilibrium abundances for the Great Barrier Reef network.Ensemble ecosystem modelling for an ecosystem network representing the Great Barrier Reef parameterised using standard-EEM and SMC-EEM.(a) The Great Barrier Reef ecosystem network [64] consists of 16 nodes, with connectance c = 0.4, and 118 parameters when represented as a Lotka-Volterra system.(b) Distributions of equilibrium abundances from the prior distribution (grey), and two independent SMC-EEM ensembles (light blue and dark blue) of ecosystem models.Note that the range of equilibrium populations for the prior distribution is very diffuse (and hence barely visible in these plots) compared to the ensemble-predicted distribution abundances.Here the independent SMC-EEM ensembles are consistent, demonstrating reproducibility.

Figure S1 :
Figure S1: Computation time required to generate an ensemble for various network connectances.The computation time needed to generate an ensemble of 1000 feasible and stable ecosystem models using a connectance probability of c = 0.25 (left), c = 0.5 (middle) and c = 0.75 (right), for both the standard-EEM and SMC-EEM methods.This figure shows the medians (dots) and 7.5-92.5% quantiles (error bars) of computation times for producing the results.Note, the computation time for any one ecosystem network was capped at 10 4 seconds due to the computational burden of the simulation study.More densely connected ecosystems (higher value of c) increase the computation time of both methods and decrease the network size at which the SMC-EEM method becomes more computationally efficient than the standard-EEM method.

Figure S2 :
Figure S2: The number of simulations required to generate an ensemble for various network sizes.The number of parameter sets trialled to generate an ensemble of 1000 feasible and stable ecosystem models using both the standard-EEM and SMC-EEM parameterisation methods.This figure shows the medians (dots) and 7.5-92.5% (error bars) of simulation numbers for the models parameterised in Fig 2 of the manuscript.Note, the computation time for any one ecosystem network was capped at 10 4 seconds due to the computational burden of the simulation study.

Figure S3 :
FigureS3: Parameter distributions for the semiarid Australia ecosystem network comparing standard-EEM to SMC-EEM.Marginal parameter distributions estimated using both the standard-EEM method (red) and the SMC-EEM method (blue).Species labels represent dingoes (D), mesopredators (M), large herbivores (H), small vertebrates (V), grasses (G), invertebrates (I), fires (F) and soil quality (S).Notice that the blue and red densities match almost exactly, demonstrating that the outputs of the standard-EEM and SMC-EEM methods are consistent.

Figure S4 :
Figure S4: Eigenparameter distributions for the semiarid Australia ecosystem network comparing standard-EEM to SMC-EEM.Marginal distributions of the nine stiffest eigenparameters estimated via the prior (grey), standard-EEM (red) and SMC-EEM (blue) ensembles.Notice that the blue and red densities match almost exactly, demonstrating that the outputs of the standard-EEM and SMC-EEM methods are consistent.

Figure S5 :
FigureS5: Time-series predictions for the semiarid Australia ecosystem network comparing the prior, standard-EEM, and SMC-EEM.Time-series forecasts for the prior (grey), standard-EEM (red) and SMC-EEM (blue) ensembles simulated from a random initial condition.Depicted are the median (think lines) and 95% credible intervals (thin dotted lines) for each ensemble.Notice that the blue and red predictions are similar, demonstrating that the outputs of the standard-EEM and SMC-EEM methods are consistent.

Figure S7 :
Figure S7: Eigenparameter distributions for the Phillip Island ecosystem network comparing standard-EEM to SMC-EEM.Distributions of the nine most constrained parameter combinations (stiffest eigenparameters) determined by an analysis of model sloppiness of the standard-EEM ensemble.Here we compare the values of the eigenparameters for the prior (grey), standard-EEM (red) and SMC-EEM (blue) ensemble.

Figure S8 :
FigureS8: Time-series predictions for the Phillip Island ecosystem network comparing the prior, standard-EEM, and SMC-EEM.Time-series forecasts for the prior (grey), standard-EEM (red) and SMC-EEM (blue) ensembles simulated from a random initial condition.Depicted are the median (think lines) and 95% credible intervals (thin dotted lines) for each ensemble.Notice that the blue and red predictions are similar, demonstrating that the outputs of the standard-EEM and SMC-EEM methods are consistent.

Figure S10 :
Figure S10: Eigenparameter distributions for the Great Barrier Reef ecosystem network comparing two independent SMC-EEM ensembles.Distributions of the nine most constrained parameter combinations (stiffest eigenparameters) determined by an analysis of model sloppiness of a SMC-EEM ensemble.Here we compare the values of the eigenparameters for the prior distribution (grey), and two independent ensembles generated via the SMC-EEM algorithm (black and blue).

Figure S11 :
FigureS11: Time-series predictions for the Great Barrier Reef ecosystem network comparing the prior and two independently generated SMC-EEM ensembles.Time-series forecasts for the prior (grey), and two independently generated SMC-EEM (light and dark blue) ensembles simulated from a random initial condition.Depicted are the median (think lines) and 95% credible intervals (thin dotted lines) for each ensemble.Notice that the two blue predictions are similar, demonstrating that the SMC-EEM ensembles are consistent.

Figure S12 :
Figure S12: Five most tightly constrained parameter combinations for the Great Barrier Reef ecosystem network.The eigenvector values for the first five eigenparameters, rescaled to be between -1 and 1.These values are shaded such that the darker colours indicates a greater contribution of the parameter to the important parameter combinations.The columns of this table can be interpreted using Equation(10).Notice, that the most important parameters are all growth rates for lower trophic species, and self-regulation for top predators.

Figure S13 :
Figure S13: Eighty most tightly constrained parameter combinations for the Great Barrier Reef ecosystem network.The eigenvector values for the first 80 eigenparameters, shaded such that darker colours indicate a greater contribution of the parameter to the eigenparameter.Each row represents an eigenparameter (ordered from most sensitive to least) and each column represents a model parameter (grouped by type).Note that beyond the first five eigenparameters, there are no clearly interpretable trends.

Table 2 :
Computational requirements for the semiarid Australia network.Computation time and the number of simulations required to generate an ensemble of 10,000 models using both the standard-EEM and SMC-EEM methods for the semiarid Australian ecosystem network.

Table 3 :
Computational requirements for the Phillip Island network.Computation time and the number of simulations required to generate an ensemble of 100,000 models using standard-EEM and SMC-EEM for the Phillip Island ecosystem network.

Table 4 :
Computational requirements for the Great Barrier Reef network.
2; see Fig S1 for results with other values of c), or if less than 1 in 200 parameter values are feasible and stable when sampled using standard-EEM (acceptance rate of 0.005; Fig 3).While the acceptance rate of an ecosystem network, which encapsulates both the number of species and connectance, is a better predictor of computation time than the number of species in the system (see Figs 2 and