Building clone-consistent ecosystem models

Many ecological studies employ general models that can feature an arbitrary number of populations. A critical requirement imposed on such models is clone consistency: If the individuals from two populations are indistinguishable, joining these populations into one shall not affect the outcome of the model. Otherwise a model produces different outcomes for the same scenario. Using functional analysis, we comprehensively characterize all clone-consistent models: We prove that they are necessarily composed from basic building blocks, namely linear combinations of parameters and abundances. These strong constraints enable a straightforward validation of model consistency. Although clone consistency can always be achieved with sufficient assumptions, we argue that it is important to explicitly name and consider the assumptions made: They may not be justified or limit the applicability of models and the generality of the results obtained with them. Moreover, our insights facilitate building new clone-consistent models, which we illustrate for a data-driven model of microbial communities. Finally, our insights point to new relevant forms of general models for theoretical ecology. Our framework thus provides a systematic way of comprehending ecological models, which can guide a wide range of studies.


Response to Reviewer 2
A basic requirement of any mathematical model is that it should provide consistent results if applied to the same situation. While this would have been intuitive to the founders of theoretical ecology, it seems to be increasingly forgotten in more recent models that can accommodate a flexible number of species. In those models one could arbitrarily split a species into two model species and would expect the same result as if they were represented by one model species. But not all current models meet this essential requirement. The authors present a novel framework by which this requirement can be met in practice.
Many readers will feel that they know and use these concepts already, but a large enough fraction of modellers seem to be unaware of it that this publication is valuable.
The manuscript leaves very little to criticise in terms of content or presentation and would be publishable in the present form.
We thank the reviewer for his positive and encouraging assessment.
Of course this model violates the clone consistency conditions. The authors would argue that I have made a mistake and should have used loss terms involving a sum over populations (∑) instead of square terms. This is in principle a valid point and many readers will profit from seeing it. However, for the sake of argument I want to defend the square terms. I stated above that I want to model stably coexisting populations. We know that asymptotically stable coexistence is only possible if the different species occupy distinct niches (modeled in this case by the square terms). In this case the clone consistency argument formally does not apply. I cannot argue that I can arbitrarily split a population in two and represent it by two model variables, because the resulting populations would occupy the same niche and are hence outside the scope of my model. The takeaway is that one cannot decide whether a model is good or bad by looking at the equations alone, because whether clone consistency is even a criterion depends on the detailed premises of the model. The authors are right when they point out it would be bad to violate clone consistency accidentally without being aware of the additional assumptions that this implies. But, on the other hand, it would be equally bad to enforce clone consistency when it is at odds with the explicitly stated assumptions. Doing so would prevent us from exploring some very valuable mathematical models.
The wording in some places suggests that the authors are aware of this point, but I fear that the casual or less informed reader will probably miss it. It would be good to mention it prominently (e.g. when the authors present their introductory example) and perhaps hint in the abstract and/or author summary that sometimes clone inconsistency is intentional and very much desired.
We agree with the reviewer that it is important to prevent an overzealous application of our rules. We would like to emphasize that we already devote an entire subsection of Implications (ll. 362 f) to discussing the limitations of our approach and how it can point to assumptions such as the distinct niches. Indeed, the reviewer's first example is equivalent to the one discussed in Eq. 27 (formerly Eq. 26). As for the second example (May's random matrix model), we concur with the reviewer that applying our work in the suggested way would be profoundly wrong and constitute a severe misinterpretation of our work. However, we think that addressing such an example in the manuscript would require extensive elaboration and could easily become more confusing than helpful, in particular since the model does conform with our criteria and the presumed clone inconsistency arises in a different way.
Following the reviewer's suggestion, we now reference our discussion of this issue more clearly early in the manuscript. Specifically, we now write in the abstract: "Although clone consistency can always be achieved with sufficient assumptions, we argue that it is important to explicitly name and consider the assumptions made: They may not be justified or limit the applicability of models and the generality of the results obtained with them." And in the author summary: "We further discuss that clone inconsistency, which occurs in several prominent models, reflects strong, often implicit, assumptions and it is important to check whether these are justified."
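For illustration only, the contrast between the two kinds of loss terms discussed above can be sketched numerically. The two toy models below are assumptions made for this sketch (they are neither the reviewer's exact model nor an equation from the manuscript), as are the function names and the parameter r. The sketch compares the total growth rate of one population with that of the same individuals split into two clones.

```python
import math

def rates_square_loss(x, r=1.0):
    """Toy model (assumed): dx_i/dt = x_i * (r - x_i), each population only limits itself."""
    return [xi * (r - xi) for xi in x]

def rates_summed_loss(x, r=1.0):
    """Toy model (assumed): dx_i/dt = x_i * (r - sum_j x_j), all populations limit each other."""
    total = sum(x)
    return [xi * (r - total) for xi in x]

merged = [0.8]       # one population
split = [0.5, 0.3]   # the same individuals, arbitrarily split into two clones

# Total growth rate of all individuals under each representation.
square_merged = sum(rates_square_loss(merged))
square_split = sum(rates_square_loss(split))
summed_merged = sum(rates_summed_loss(merged))
summed_split = sum(rates_summed_loss(split))

print("square loss, merged vs. split:", square_merged, square_split)
print("summed loss, merged vs. split:", summed_merged, summed_split)
```

With the summed loss term, both representations yield the same total growth rate up to rounding; with the square term they differ. Whether that difference is a flaw or a deliberate encoding of distinct niches depends on the premises of the model, as discussed above.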

Response to Reviewer 3
I consider the constructive framework developed by the authors and its translation into instructions on how to build consistent models a valuable contribution to the field of ecological population models. However, I have some major concerns regarding the presentation of the work, its lack of context with respect to previous results in this area and the clarity and completeness of the interpretation of the results in the context of nonlinear dynamical systems. These major issues would need to be addressed for the article to proceed. They are detailed in the following together with several minor points. Numbers in parentheses refer to the line number in the manuscript.
We thank the reviewer for their assessment and the detailed constructive criticism. We have addressed all concerns, as described in detail below.

Issue 1:
While the constructive framework appears to be plausible, the soundness of its justification suffers from unclear language, making it difficult to follow. I advise the authors to get feedback from colleagues with little familiarity with the work or even to work with a writing coach to improve the flow and readability of the text. The language should also be revised in terms of scientific writing standards. For example, I do not believe 'inevitable' to be the proper vocabulary to describe properties of mathematical objects. Furthermore, the authors should decide on a target audience and reflect this in their presentation. The presentation is uneven in terms of the required background knowledge in order to comprehend the statements.
We agree that this work needs to be easily accessible. The manuscript was already reviewed by several colleagues and others. Following the reviewer's suggestion, we obtained another round of feedback from an uninitiated colleague and made numerous changes to make the text easier to follow.
It is intentional that Methods is more mathematically challenging than the main manuscript. Our intent is that everybody who may want to apply our results can understand the main manuscript (without Methods). In contrast, Methods contains the mathematical details required to arrive at these results, which require more mathematical expertise. At the end of the introduction (ll. 112 f), we now explicitly state that Methods is intended for the mathematically inclined reader.
We removed two instances of the word inevitable (ll. 86 and 432) since they did not add much. However, we kept it in: "Thus, the model violates our consistency criteria and the observed clone inconsistency (Fig. 1) is inevitable." (ll. 228 f). Here, inevitable does not describe a mathematical property, but concisely stresses the strong connection between a modelling "mistake" and its consequences: A model not complying with our criteria will certainly exhibit inconsistencies like in Fig. 1. We could replace inevitable with synonyms like inescapable or unavoidable, but we do not consider these any better.

Issue 2:
While the authors provide accessible guidelines on how to construct a consistent model and how to check existent models for consistency, the manuscript fails to state and clarify the mathematical properties that allow models of the proposed form to be consistent. In the authors summary the authors claim to have 'investigated the mathematical properties of clone-consistent models', but I do not see where in the manuscript these mathematical properties are described. The authors do provide criteria that impact functions should fulfil, but never explain why they do so or actually show that they do.
As we understand the reviewer's concerns, we devoted entire sections to them, which the reviewer is clearly aware of, going by their accurate summary of our results. Here is our direct answer to the reviewer's concerns as we understand them: The mathematical properties of clone-consistent models are:
• Impact functions (i.e., elementary ingredients of clone-consistent models) form a functional algebra generated by linear combinations and constant functions. If and only if a function is in that algebra, it fulfils our criteria.
• Models of population growth additionally must be of the form described by Eqs. 10 and 11.
Consequences and more accessible formulations of these properties are described throughout the section: What models are clone-consistent? Being built from linear combinations and constant functions (via addition, multiplication, and function application or taking a limit, respectively) is what allows impact functions (or the models using them) to be consistent. We made changes throughout the section The functional algebra of impact functions to make these connections clearer.
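As a hedged illustration of this construction, the following sketch builds a function from linear combinations and a constant (via multiplication, addition, and function application) and checks numerically that it does not change when one population is split into two clones that inherit the same parameters. All names, weights, and the particular nonlinear functions are hypothetical choices for this sketch and do not appear in the manuscript. The second function, which applies a logarithm to each abundance before summing, serves as a counterexample.

```python
import math

def linear_combination(weights, abundances):
    """Basic building block: sum_j w_j * x_j."""
    return sum(w * x for w, x in zip(weights, abundances))

def composed_impact(weights_a, weights_b, abundances):
    """Hypothetical impact function: constant + saturating(u) * exp(-v),
    where u and v are linear combinations of the abundances."""
    u = linear_combination(weights_a, abundances)
    v = linear_combination(weights_b, abundances)
    return 0.5 + (u / (1.0 + u)) * math.exp(-v)

def log_impact(weights, abundances):
    """Counterexample: the nonlinearity acts on each abundance separately."""
    return sum(w * math.log(x) for w, x in zip(weights, abundances))

def split_weights(weights, index):
    """Both clones inherit the parameter of the population they came from."""
    return weights[:index] + [weights[index]] * 2 + weights[index + 1:]

def split_abundances(abundances, index, fraction):
    """Split population `index` into two clones with the given share."""
    x = abundances[index]
    return abundances[:index] + [fraction * x, (1 - fraction) * x] + abundances[index + 1:]

a, b, x = [0.3, 0.7], [0.1, 0.2], [2.0, 1.0]
a2, b2 = split_weights(a, 0), split_weights(b, 0)
x2 = split_abundances(x, 0, 0.25)

print("composed:", composed_impact(a, b, x), composed_impact(a2, b2, x2))
print("log-based:", log_impact(a, x), log_impact(a2, x2))
```

The composed function agrees before and after the split because each linear combination is unchanged by it; the log-based function does not.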
In addition, how the properties of the impact functions translate to the properties of the whole model and especially the solutions of the associated differential equations is left unclear. Furthermore, the authors state not to be able to make statements about the cause of inconsistency (line 392). I am wondering how the cause of inconsistency is connected to the fact that a nonlinear differential equation is unlikely to show additivity in the system's state variables (in the sense that f(x) + f(y) is not equal to f(x + y)) unless it is built in a way that it has this property? Just as the actual cause of the inconsistency in the motivating example on lions is not that logarithms are not allowed in impact functions but that these system equations are not additive in
We do not expect that clone consistency or the use of impact functions allows for drawing general conclusions about the respective models, including about their dynamics. For example, conclusions such as "Clone-consistent models are less likely to exhibit oscillations" are unlikely to follow from properties of impact functions alone. Even if such conclusions were possible, they would not be straightforward to draw and go beyond the goals of our work. We now elaborate on this point in more detail in the first paragraph of General implications (ll. 430 f). In particular, we explicitly discuss that clone-consistent models can still be nonlinear: "Not only can a non-linear function still be applied to the linear combination (e.g., in Eq. 6), but any existing non-linear model can be made clone-consistent by sufficiently strong assumptions. The central question is whether these assumptions are biologically justified."

Issue 3:
The manuscript fails to address how the findings relate to previous research in the area of nonlinear differential equation model reduction based on the aggregation of several state variables into a single one. If the goal is to preserve the system dynamics exactly this is for example termed 'perfect aggregation' in the field of ecological modelling, or 'exact lumping' in the field of biochemical reaction network modelling (just to mention two out of the many fields). The aggregation of variables via sums, as also considered by the authors, is a special case of what is called aggregation by a linear map or linear lumping in the context of the aforementioned fields, respectively. The following references provide an overview on this topic but are by no means exhaustive: Tóth, J., Li, G., Rabitz, H., & Tomlin, A. S. (1997). The effect of lumping and expanding on kinetic differential equations. SIAM Journal on Applied Mathematics, 57 (6), 1531-1556 and references therein, especially Iwasa, Y., Andreasen, V., & Levin, S. (1987). Aggregation in model ecosystems. I. Perfect aggregation. Ecological Modelling, 287-302. [URL] and references therein.
I am aware that the framework proposed by the authors goes beyond the aggregation of identical species to also include consistency regarding the aggregation of identical modes of interaction only, but I still believe that the manuscript would profit from a discussion of the connection to the above-mentioned work.
We thank the reviewer for pointing out this prior work. These works consider the conditions under which models can be simplified when some dynamical variables are redundant (or near-redundant). A typical case for this is indeed that a model is clone-consistent and there are actual clones in the model. In contrast, our work considers the properties a model must have for this situation to arise in the first place. Therefore, these works and ours are related but not special cases of each other. An illustrative difference is that parameters that describe how populations affect each other play a crucial role in our work but not in the references mentioned.
Nevertheless, we agree that it is helpful to mention this previous work and explain the differences to our approach. In the revised manuscript, we thus describe the relation between our and the suggested works in the introduction, writing: "Finally, if a model is clone-consistent and clones actually exist, it can be simplified; this is called aggregating or lumping [citation of Tóth et al. and Iwasa et al.]. "
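The relation between clone consistency and lumping can be sketched with a toy simulation. The logistic-type model, the explicit Euler scheme, and all parameter values below are assumptions made for this illustration and are not taken from the manuscript or the cited works. Because the toy model is clone-consistent and the two populations are actual clones, their summed abundance follows the same trajectory as a single lumped variable.

```python
import math

def euler_step(x, r, h):
    """One explicit Euler step of the assumed toy model dx_i/dt = x_i * (r - sum_j x_j)."""
    total = sum(x)
    return [xi + h * xi * (r - total) for xi in x]

def simulate(x0, r=1.0, h=0.01, steps=1000):
    """Integrate from initial abundances x0 and return the final abundances."""
    x = list(x0)
    for _ in range(steps):
        x = euler_step(x, r, h)
    return x

lumped = simulate([0.2])          # one aggregated variable
clones = simulate([0.15, 0.05])   # the same individuals as two clone populations

print("lumped abundance:      ", lumped[0])
print("sum of clone abundances:", sum(clones))
print("ratio between clones:   ", clones[0] / clones[1])
```

In this sketch the two-variable system carries no information beyond the lumped one: the sum of the clones matches the lumped variable up to rounding, and the ratio between the clones stays at its initial value.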

Examples Lack of reader guidance:
In general, the manuscript in several places lacks paragraphs to guide the reader. With that, I refer to paragraphs especially in the beginning of a section that prepare the reader for what results to expect in this section and which plot out the strategy on how the authors will get there. A general 'the paper is structured as follows …' paragraph mentioning what to find in which section would also be helpful. For example, the section 'What models are clone consistent' right away starts out with impact functions without any information on how these functions relate to the models that the authors set out to find. Other examples are 'The Functional Algebra of Impact Functions' (490) and 'Deriving a New Model for UTI strains -the Legwork' (598).
We agree that more guidance for the reader is helpful. Following the reviewer's suggestion, we rewrote the last paragraph of the introduction (ll. 103 f) to guide the reader along the manuscript's structure and added short introductions to many sections (ll. 116 ff, 273 ff, 512 ff, 635 f, 652 f) including those mentioned by the reviewer.

Figure 3 cannot be understood since needed information is only given later in the manuscript. Upper part of the figure: The reader cannot understand why question 5 is answered the way it is, because the checked model is only introduced and discussed later in the manuscript. This is especially a problem for question 5, since the answers to this question cannot be interpreted without the information given later in the manuscript. The answers to question 5 are confusing as well since it is not clear whether the answer is yes (no) or depending on the application.
We thank the reviewer for pointing this out. The purpose of Figure 3 is to explain the general procedure for validating clone consistency. To this end, we show a specific example with arbitrarily chosen answers to the binary questions (3, 5, and 6). We did not intend that the reader understands the specific example answers we used for Question 5 at this point and do not consider this necessary to understand the general scheme. We now label the answers to Question 5 as "arbitrary example" to make this clearer.

Finally, when the question of ecological plausibility is dealt with in the manuscript, the connection to question 5 (or Figure 3) is not indicated (line 294 ff).
Thanks for pointing this out. We now refer back to Figure 3 at this point (l. 325 f).

The authors also might want to reconsider the relevance of question 4, since later in the manuscript (line 340) it is stated that rewriting an equation in this way is always possible.
Good point. We changed Points 4 and 5 of the recipe to: "If not, cast them into this form by choosing parameters to be zero." and "Are these choices biologically justified? (Depends on application.)"
Lower part of Figure 3: The meaning of the footnote within the figure is unclear.
We tried to clarify this footnote, which now reads: "This part only applies to models of population growth as in Eq. 10".

In question 2 omega which was previously defined as a basic impact function is used as symbol for a function of a basic impact function.
We fixed this and now use in analogy to Eqs. 6 and 13.

The section 'Implications' contains contemplation on various aspects, most of which content-wise fit in neither of the subsections titled 'Checking Clone Consistency to Reveal Implicit Assumptions' and 'General Implications for Model Design'.
We restructured this section and improved the subsection titles.

Ambiguities making it hard for the reader to grasp the essential concept:
The models described by equations (2) and (3) cannot be the 'same model' (line 68 and caption of Figure 1) since one of them has two state variables while the other one has three.
We now use general model instead of model to make clear that this refers to Eq. 1 (in which the number of populations is a parameter) and not a specific realisation of this general model (as described by Eqs. 2 and 3). We refrained from removing this formulation altogether since the fact that the simulations are based on the same general model is crucial here and this wording clearly communicates this.

Incomprehensive paragraphs:
The paragraph starting from line 217 on building models with aggregated phenomenological observables:
We rewrote and restructured this entire paragraph, addressing all the criticisms of the reviewer, in detail:

• 223: 'each of the experimentally determined interaction parameter'. Not clear what is meant with interaction parameters here. Each single measurement? Interaction parameter seems to have a different meaning here than in line 42.
We now clarify that is "the number of experimental interaction observables, i.e., the number of measurements per (ordered) pair of populations" (ll. 243 f).
• What is the reasoning behind the general Ansatz in equation (13)?
We now start off with an even more general ansatz, and the former Eq. 13 (now Eq. 14) becomes a specific case. We elaborate that this ansatz arises from "[c]ombining Eqs. 10, 11, 6, and 5" (l. 240).

The section on how to determine parameters and functions (224ff) is unclear.
We added examples to better explain this process (ll. 254 f) and also expect that the way we restructured the entire paragraph will help the reader to better connect this part to the context.
• 231: What are building blocks in this context?
We replaced building blocks with basic impact functions (ll. 243 f) to avoid this confusion. We also expect that the restructuring of the paragraph helps to clarify this.
• 234: 'Finally, for some applications, a sum or more complex way to combine the basic impact functions may be appropriate (as opposed to the product used in Eq. 13). In New Model, we provide an example for this approach. ' What approach? The product version or the sum?
This refers to the entire approach outlined in the paragraph (though a product is used in this particular case). We now explicitly refer to the "approach" (l. 260). Further, the restructuring of the paragraph prevents this reference from being misread.

Non-causal relations between sentences that imply such relations:
Line 47: 'These new experimental scenarios often call for new ecological models that can incorporate the respective data. One reason for this is that there is no single answer as to how multi-parameter or higher-order interactions should be measured [3,6,16,20,23].'
We have now clarified this sentence, writing in lines 49 ff: "These new experimental scenarios call for new ecological models that can incorporate the respective data. Existing models are often not suitable here since there is no uniform answer as to how multi-parameter or higher-order interactions should be measured [citations]."
Overall meaningless sentences like: 'Moreover, clone-inconsistent models are diverse for the same reason that non-linear functions are.' (394)
We expanded the respective paragraph and instead of this sentence, we now write: "For illustration, the diversity of clone-inconsistent models may be compared to that of all numbers not divisible by seven." We do not consider this sentence meaningless, as it can help readers to appreciate the diversity of clone-inconsistent models and why it is difficult (if not impossible) to make general statements about them.

It is not clear to me why the authors come up with a new label (clone consistency) for something that has been described before. The authors state mere similarity to earlier descriptions of the problem (line 59). If there is indeed a difference between earlier descriptions of the problem and 'clone consistency', I would like to request a discussion of those differences by the authors to justify new wording.
From the existing labels we know (all cited in the introduction), we consider invariance under relabeling to be the best candidate. However, this label is not established in the field (it is only used by Murrel et al (2004) as far as we know) and we considered it too wordy and grammatically inflexible for the heavy usage required in our manuscript. For example, instead of clone inconsistency we would have to write something like lack of invariance under relabeling or lack of relabeling invariance.
Regarding the other alternatives:
• Kuang (2002) and Arditi and Michalski (1996) use invariant under identification of identical species, which we consider too cumbersome. This is corroborated by both papers referring to it as Criterion 1 instead.
• Drossel et al (2001) uses invariance under aggregation of identical species. Like the above, we consider this too cumbersome.
• Rossberg (2013), van Leeuwen et al. (2013), and Vallina (2014) use "common-sense" criterion, which is cumbersome, not descriptive, and bears the risk of coming off as arrogant in the context of our manuscript.
Hence, we prefer to keep the label clone consistency.
2. It is not clear if the proposed framework can also be used for models that model the mechanism of interaction, e.g. for consumer resource models. If so, how does the proposed framework integrate with commonly used impact functions?
While most of our examples focus on models where all the dynamical variables are populations, we never restrict ourselves to this case. In Implications, we discuss the application of our framework to consumer-resource models and similar (ll. 412 f) and how impact functions relate to mechanisms (ll. 384 ff, 424 ff, and 445 ff.). The extension of our framework in this direction is straightforward: one simply uses impact functions to describe the impact of an ecosystem on a resource instead of a population. We now included references to consumer-resource models and similar at appropriate points in the manuscript (ll. 128 f and 210). Also see the next point.
The authors should unambiguously state, in the introduction at the latest, for which types of models this framework can be used, so readers can decide early on whether the paper is relevant for their research.
To clarify this point, we now write in the introduction (ll. 112 f): "we discuss how our framework applies to all models involving multiple populations". We understand that this is rather vague. However, the more precise answer "every model that contains impact functions" is not understandable to the reader at this point. Also, while there are models to which our framework applies in general, but for which there is no danger of clone inconsistency (e.g., mechanistic models featuring interaction mechanisms that are exclusive to a pair of species/resources), we again cannot reasonably discern these cases without explaining central parts of our framework.
We agree with the reviewer and that was not the intended meaning. To avoid this misunderstanding, we now write: "One sanity check for such models is to virtually split a population […]" (ll. 21 f).

The authors should clarify how their results for the system translate to properties of fixed points
We presume that this refers to the case study. As we elaborate at the end of the Supplement S2, the fixed points of the two models are equal, if we assume that the growth term does not become zero and < 1, which is the predominant case. If > 1, things become complicated due to case distinctions in the existing model (Eq. 21) though a certain similarity of fixed points can be expected. We now briefly reference this in the main manuscript (ll. 358 f).
We do not expect a more detailed analysis of this point to be worthwhile, in particular since it is not central to our manuscript. Also see our response on the general possibility of deducing dynamical properties from clone (in)consistency above.

As indicated in the comment on reader guidance, the methods part would also benefit from more supportive text to help the reader understand what to expect from the section.
We now explicitly refer to the subsections of Methods in the main manuscript to point the reader directly to the right subsection and briefly describe the contents of the Methods section at the end of the introduction (ll. 113 f). With the exception of The functional algebra of impact functions and Proof: generates (which are clearly linked), the subsections of Methods have no overarching narrative. Thus, an introductory paragraph would mostly contain subsection titles. Instead, we reworked the first paragraph of each subsection in Methods to better guide the reader.
The subsection about notation indicates a comprehensive symbolic language that is used rigorously. This is not the case e.g.: 1. the use of mathfrak symbol sequences in Definition 1.
We now clarify that the notation overview does not fully extend to Proof: generates (ll. 509 f), as it requires a considerable amount of special notation that is only relevant to that subsection and would only confuse readers who do not engage with it.