### Correlation doesn’t imply causation

#### Posted by kelman on 22 Sep 2015 at 00:23 GMT

This paper starts by showing that there is a positive correlation between inflow and storage. This is obvious to anyone familiar with the equation of continuity: inflow causes storage. But the converse, storage causing inflow, as proposed by the authors, isn’t necessarily true. One of the first things one learns in a statistics class is that correlation doesn’t imply causation.
As observed by the authors, “the wetter the catchment the more of the rainfall is converted in runoff water”. Using the authors’ notation, this means that the inflow r into the reservoir is not only a function of the rainfall R, but also of the water stored in the catchment, mostly in the soil, C. That is, the inflow can be represented as r(C,R).
The authors “postulate that the volume of water stored in the reservoir is a surrogate for the drainage basin condition”. That is, they use r(V,R) rather than r(C,R) in the continuity equation of the reservoir. In their words: “inflow depends not only on rainfall, but also on volume... we propose that reservoir water-volume is subject to drastic regime changes due to the underlying bistability of the system. Bistability is the condition in which there are at least two possible stationary states of a system for the same set of parameters. In our case, the parameter is rainfall and the variable that may assume two possible alternative stable states is the volume of water stored in the reservoir. If the reservoir is at high levels the catchment has more stored water and thus more of the rainfall will flow into the reservoir. On the other hand, for low water levels the most obvious change is that much of the water is absorbed by dry soil and thus a lower proportion of rainfall becomes stored water in the reservoir”. This is the core of the authors’ claim of a nonlinear feedback.
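
To make the quoted definition of bistability concrete, here is a minimal sketch of a reservoir balance dV/dt = R·e(V) − k·V, in which the fraction of rainfall that becomes inflow, e(V), rises with stored volume. The sigmoid shape of `efficiency`, the outflow taken proportional to volume, and every number below are illustrative assumptions; this is not the authors' fitted model.

```python
import math

def efficiency(v, e_min=0.1, e_max=0.6, v_half=0.5, steepness=12.0):
    """Hypothetical share of rainfall converted to inflow at normalized volume v in [0, 1]."""
    return e_min + (e_max - e_min) / (1.0 + math.exp(-steepness * (v - v_half)))

def dv_dt(v, rainfall, k_out=0.65):
    """Continuity equation: inflow r(V, R) minus an assumed outflow k_out * v."""
    return rainfall * efficiency(v) - k_out * v

def equilibria(rainfall, k_out=0.65, n=200):
    """Locate volumes where dV/dt changes sign on a grid, refined by bisection.

    Returns (volume, is_stable) pairs; an equilibrium is stable when dV/dt
    goes from positive to negative as v increases.
    """
    found = []
    for i in range(n):
        a, b = i / n, (i + 1) / n
        fa, fb = dv_dt(a, rainfall, k_out), dv_dt(b, rainfall, k_out)
        if fa * fb < 0:
            stable = fa > 0
            for _ in range(60):
                mid = 0.5 * (a + b)
                fm = dv_dt(mid, rainfall, k_out)
                if fa * fm <= 0:
                    b = mid
                else:
                    a, fa = mid, fm
            found.append((0.5 * (a + b), stable))
    return found

for rainfall in (1.0, 0.7):
    print(rainfall, equilibria(rainfall))
```

With these assumed numbers, the higher rainfall value yields two stable volumes separated by an unstable one, while the lower rainfall value leaves only the low-volume state. Whether the real catchment behaves this way is exactly what is under dispute here.
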
In fact, if the reservoir is at a high level, the catchment may or may not have more stored water, depending on previous decisions about outflow, which in turn depend on the water demand and on the status of the other water-producing systems of the Metropolitan Region. If the catchment does indeed have more stored water, the mean infiltration rate will probably be smaller and more rainfall will be converted into runoff, as suggested by the authors. On the other hand, a high reservoir level decreases the gradient of the water table, and therefore the groundwater flow from the marginal land into the reservoir (Darcy equation).
Conceptually, the emptying of the reservoir could cause a nonlinear sink effect in the surrounding groundwater, as is the case, for example, when the pumping rate in one or more wells surpasses recharge, resulting in a lowering of the water table of the unconfined aquifer and, eventually, in a new equilibrium regime.
In order to evaluate whether an analogous phenomenon exists in the Cantareira case, it would be necessary to model the reservoir-groundwater relationship. This was done neither by the authors nor by this commentator. However, it is fair to assume that in this particular case the feedback mechanism is negligible, because the maximum surface area of the reservoirs is only 4% of the total catchment area. Therefore, the occurrence of the two equilibrium regimes in the Cantareira case doesn’t seem to be grounded in physical reality.
The authors attempted to compare two modeling options: one with and the other without the feedback mechanism. They tried to fit the models to the observed daily series in the time interval from May 31, 2014 to May 31, 2015. However, they failed to do so “because simultaneous fitting of all five parameters did not converge, or provided unreasonable estimates”. Nevertheless, they went forward with the comparison using a cumbersome estimation procedure and adding white noise to the equation. That is, they used a stochastic model to produce Monte Carlo simulations. Based on this difficult-to-reproduce approach, they got “5000 numerical simulations of each model, with parameters sampled from the posterior distribution of the Bayesian particle filter estimation”.
The results were displayed in Figure 8.
According to the authors, “the model that does not take into account the effect of volume on inflow underestimated the stored volume in most of the period, and did not predict the increase and further stabilization of stored volume since February 2015. The better fit supports the hypothesis that the ratio of inflow to rainfall depends on the volume. This feedback is caused by the interaction between rainfall and stored volume, a surrogate of the hydrological state of the catchment. This, in turn, substantiates our statement about the existence of alternative states due to a feedback process”.
In reality, the stabilization of stored volume since February 2015 was due to a radical reduction in the reservoir outflow, decided by Sabesp (the reservoir operator). This could only be achieved because Sabesp had just finished the construction of several interconnections in the water distribution system of the Metropolitan Region, aimed at increasing operational flexibility. It was a manmade phenomenon that certainly could not be modeled based on natural processes.
In addition, the differences between the black, orange and blue lines don’t seem to be relevant. What really strikes the reader is the widening of the blue shaded area, starting in February 2015. This means that, according to the model proposed by the authors, there is a huge uncertainty regarding storage. Therefore, even if the model were based on a physically sound assumption, it would be of little help for actual decision making. This is what the authors suggest in some parts of the paper, for example: “…in this work we are interested in characterizing the regime shift rather than anticipating it”.
Nevertheless, in the concluding remarks, the authors say that “our results and our study case show that the management of reservoirs should take alternative regimes into account and avoid a transition to low-volume regimes. Failing to do so represents a prolongated burden, extending well beyond the period of anomalous rainfall, because outflow has to be kept as low as possible until a backwards transition occurs. Therefore managers should act as another feedback mechanism in the socio-environmental system that keeps it in the desired regime notwithstanding external forces like climate anomalies. In not doing so, managers of the Cantareira system acted like one more external force that pushed the reservoir to a catastrophic shift”.
That is, based on a debatable model, to say the least, the authors proclaim the occurrence of a catastrophic shift in the Cantareira. Furthermore, they suggest that managers of the Cantareira system contributed to this alleged catastrophe.
In reality, what is happening in the Cantareira isn’t the consequence of the system being trapped in a low stable level. It is just the result of a rare hydrological event. Indeed, if one chooses a normal distribution to model the annual inflow to the Cantareira reservoirs (a reasonable assumption in view of the Central Limit Theorem) and uses data from 1930 to 2013 to estimate the parameters, the estimated probability of an annual inflow equal to or smaller than the one that actually occurred in 2014 would be 0.004 (recurrence interval of 250 years).
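
The arithmetic linking the quoted probability to the recurrence interval can be sketched as follows. The function names are hypothetical, and the fitted mean and standard deviation of the 1930-2013 inflow series are not given in this comment, so `non_exceedance_probability` takes them as arguments rather than assuming values.

```python
from statistics import NormalDist

def return_period_years(annual_probability):
    """Recurrence interval (years) of an event with the given annual probability."""
    return 1.0 / annual_probability

def non_exceedance_probability(inflow, mean, sd):
    """P(annual inflow <= inflow) under a normal model with the given parameters."""
    return NormalDist(mean, sd).cdf(inflow)

p = 0.004  # probability quoted in the comment
print("recurrence interval (years):", return_period_years(p))

# How many standard deviations below the mean such an inflow sits
# under the normal assumption.
z = NormalDist().inv_cdf(p)
print("standardized inflow:", round(z, 2))
```

Under the normal assumption, a probability of 0.004 corresponds to an annual inflow roughly 2.65 standard deviations below the long-term mean.
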
The fact of the matter is that, under uncertainty, it is extremely difficult for any Government to divert much-needed financial resources from education, security and health of the population in order to build infrastructure capable of enduring very unlikely events. Monday morning quarterbacks don’t have this problem.

Competing interests declared: I am a professor of Hydrology at the Federal University of Rio de Janeiro and, since January 2015, the CEO of Sabesp, the water company of Sao Paulo State and operator of the Cantareira reservoirs.

### Cantareira operators must consider critical transitions - response to Jerson Kelman

#### renatomc replied to kelman on 25 Sep 2015 at 13:01 GMT

Jointly written by the paper authors: Renato M. Coutinho, Roberto A. Kraenkel and Paulo I. Prado.

We welcome comments about our work and are glad to publicly discuss them in an academic forum. The debate on scientific research relevant to policy decisions is a much-needed step to reduce the huge research-implementation gap that plagues environmental policies in Brazil (Pardini et al. 2013).

Jerson Kelman disagrees with our central thesis that the Cantareira system went through a catastrophic transition to a low-efficiency regime. In what follows we discuss the criticisms that were raised and, besides clarifying several technical points, show that they are either unsound or simply reflect a lack of familiarity with the state of the art in statistical modelling. We conclude by upholding our point that critical transitions must be taken into account in managing water reservoirs. For the benefit of the readers we also provide references to standard texts in the area.

Our paper provides three lines of evidence for the occurrence of a critical transition in the Cantareira system: (1) a correlation between stored volume and the inflow/rainfall ratio, (2) statistical indicators of transitions in the time series of stored water volume, and (3) a fit of a phenomenological model to data. In this comment we first answer the concerns raised by Kelman on evidence lines 1 and 3. We then elaborate on the evidence not evaluated by Kelman, as we think that the three lines together provide a strong argument for the existence of a transition.

On the role of correlation in our results

Kelman incorrectly claims that we used a correlation between inflow and storage. Actually, we proposed that storage is correlated with catchment efficiency, that is, the inflow/rainfall ratio, not the inflow rate. Figure 2 clearly shows that this was the case in the Cantareira system. That "correlation does not imply causation" is an unnecessary truism in a debate among experienced users of statistics. Nevertheless, we state clearly the purpose of seeking the correlation (emphasis added): "[...] in order to proceed in a simpler way, we postulate that the volume of water stored in the reservoir is a surrogate for the drainage basin condition. To verify that this is a sound assumption, we show in Fig 2 the ratio of water inflow to the rainfall in the Cantareira reservoir as a function of the volume". Once the assumption was verified, we proceeded to build a phenomenological model based on the observed correlation. Hence, the model does not depend on causal links between volume and efficiency, only on the validity of the volume as a surrogate of catchment condition for the case study, which is supported by the correlation between them.

On model fitting and model selection

The concerns raised by Kelman about our statistical models suggest that our methods may be novel to some readers, and thus that some technical details need further clarification. In this section we give additional information on the underlying logic and robustness of our statistical approaches.

The model-based inference we used is a well-grounded way to link theory to data through statistical modelling (Edwards 1972, Burnham & Anderson 2002). Once each hypothesis is translated into a statistical model, log-likelihood ratios are used to express the support that the data provide to one model vis-à-vis the competing ones. So the criterion for comparing models is not simply looking at the curves, which is obviously subjective, but the log-likelihood ratio, reported in the text and obtained from Table 1. In this case, the ratio was 12.11, which means that the model that includes the rainfall × volume interaction was exp(12.11) ≈ 181679 times more plausible than the competing one.
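
The conversion from a log-likelihood ratio to the "times more plausible" figure is a single exponentiation. The helper name below is hypothetical; the value 12.11 is the one quoted above.

```python
import math

def likelihood_ratio(log_likelihood_ratio):
    """Convert a log-likelihood ratio into a plain likelihood ratio."""
    return math.exp(log_likelihood_ratio)

ratio = likelihood_ratio(12.11)  # value reported in the reply
print(f"relative support: about {ratio:,.0f} to 1")
```

A log-likelihood ratio of 0 would give a ratio of 1, meaning the data support both models equally.
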

Our fitting procedure is not a "cumbersome" or "difficult-to-reproduce approach"; it is cutting-edge statistics. Furthermore, our analyses were run in R, a powerful statistical programming environment that is widely used in applied scientific research. A large academic community worldwide uses R to make the latest statistical methods accessible in the form of open source code. The R packages we used were developed by leading researchers in statistical indicators of critical transitions (Dakos et al. 2012) and stochastic dynamic models (King et al. 2009). This code-sharing culture and many specific tools make R fully compliant with reproducible research. Accordingly, our results can be checked and reproduced by running the R code we made available at https://github.com/cantar... , as mentioned in the "Data sets" section of the paper.

To fit the competing models, we estimated the three parameters of greatest interest after fixing the remaining two at reasonable values, a rather simple approach that is well justified when fitting many-parameter models. We included white noise in the models to represent stochastic fluctuations of the system itself. Hence, the noise parameter expresses process uncertainty, in contrast to measurement errors, which are expressed by a separate parameter. To do so, we benefited from recent advances in fitting and evaluating partially observed Markov processes (POMP) (http://kingaa.github.io/p...). An interesting side effect of using POMP models was noted by Kelman, who points out that the selected model predicts "a huge uncertainty regarding storage". In our model, most of this uncertainty is due to the process and not to the measurement error, suggesting that environmental stochasticity should be taken into account in decision making, instead of being blamed or ignored. Finally, the comment that "[the reduction of outflow] was a manmade phenomenon that certainly could not be modeled based on natural processes" suggests that Kelman did not understand that the outflow term in our models is simply the empirical data publicly provided by SABESP. Any intervention by SABESP was already present in our model and is not, by itself, sufficient to explain the data.
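
The distinction between process noise and measurement error can be illustrated with a minimal state-space sketch. This is plain Python, not the pomp package's API, and the function name, the constant inflow and outflow, and all numbers are assumptions made only for illustration.

```python
import random

def simulate(n_steps, v0, inflow, outflow, process_sd, obs_sd, seed=0):
    """Simulate a stored volume with process noise and noisy observations.

    Process noise perturbs the true state and carries forward in time;
    measurement noise only affects each individual reading.
    """
    rng = random.Random(seed)
    v = v0
    states, observations = [], []
    for _ in range(n_steps):
        # True state update: water balance plus a random process shock.
        v = max(0.0, v + inflow - outflow + rng.gauss(0.0, process_sd))
        states.append(v)
        # Observation: the true state seen through measurement error.
        observations.append(v + rng.gauss(0.0, obs_sd))
    return states, observations

states, obs = simulate(n_steps=30, v0=1.0, inflow=0.10, outflow=0.05,
                       process_sd=0.05, obs_sd=0.01)
print(states[-1], obs[-1])
```

Because process shocks accumulate in the state, the spread of simulated trajectories grows with the forecast horizon, whereas measurement error alone would leave the underlying trajectory unchanged.
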

On indicators of critical transitions

Another strong piece of evidence we present is the peak in the conditional variance shown in Fig 4 and explained in detail in the section "Model independent results". Kelman asserts that no critical transition has taken place, by attempting to erode the rationale behind the feedback assumed by our deterministic and stochastic models. However, he did not address this result, which does not rely on a particular postulated feedback, mechanistic or otherwise. This kind of peak is typical of phase transitions and cannot simply be ignored or attributed to some kind of dry spell. An easy entry point to the vast literature on this topic is the book by Scheffer (2009), while the references for the more recent developments and techniques are cited in the paper.
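
As a rough illustration of a variance-based indicator, the sketch below computes a plain rolling-window variance. This is a simpler stand-in, not the conditional variance estimated in the paper with the R packages cited above, and the series is hand-made to show the mechanics only.

```python
from statistics import pvariance

def rolling_variance(series, window):
    """Population variance over each consecutive window of the series."""
    return [pvariance(series[i - window:i])
            for i in range(window, len(series) + 1)]

# Hand-made series: a high plateau, a decline, then a low plateau.
series = [1.0] * 30 + [1.0 - 0.08 * i for i in range(1, 11)] + [0.2] * 30

rv = rolling_variance(series, window=10)
peak = max(range(len(rv)), key=rv.__getitem__)
print("window start index with the largest variance:", peak)
```

In this toy series the variance is zero on both plateaus and largest in the windows that span the decline. Real data require detrending and a careful choice of window, which this sketch leaves out.
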

Concluding remarks

According to Kelman, what happened in the Cantareira system "is just the result of a rare hydrological event", namely an unusually low inflow in 2014. This amounts to blaming bad fortune exclusively for the water supply crisis. As rare as this event might be (and exact probabilities are still a matter for further debate), our point is that it did happen, and therefore requires an explanation. Rare events are not always unpredictable, and the best way to avoid them is to understand them. Science's role is to learn how and why such events occur, and thus the use of expressions like "Monday morning quarterbacks don't have this problem" hides a misconception and diverts attention both from scientific discussion and from the assessment of responsibilities. It goes without saying that PLoS One is not a venue for expressions of that kind.

We did provide an explanation for the observed phenomenon, a critical transition to a low-efficiency state, along with evidence that such a transition took place. In our paper we pointed out that operators of the Cantareira system acted as one of the forces in the socio-environmental system that pushed it to the transition. We do not know if they were aware of the risk of a catastrophic shift at that time, but from now on they can be. So we hope that our paper and this comment will encourage managers of water reservoirs to consider critical transitions when taking decisions and to implement strategies to avoid them.

References

Burnham, K. P., & Anderson, D. R. (2002). Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd ed. New York, Springer-Verlag.

Dakos, V., Carpenter, S. R., Brock, W. A., Ellison, A. M., Guttal, V., Ives, A. R., ... & Scheffer, M. (2012). Methods for detecting early warnings of critical transitions in time series illustrated using simulated ecological data. PLoS ONE, 7(7), e41010.

Edwards, A. W. F. (1972). Likelihood – An Account of the Statistical Concept of Likelihood and its Application to Scientific Inference. New York, Cambridge University Press.

King, A. A., Ionides, E. L., Bretó, C., Ellner, S., Kendall, B., Wearing, H., et al. (2009). pomp: Statistical inference for partially observed Markov processes. http://pomp.r-forge.r-rpr....

Pardini, R., Rocha, P.L.B., El-Hani, C. & Pardini, F. (2013) Challenges and opportunities for bridging the research-implementation gap in ecological science and management in Brazil. Conservation Biology – Voices from the Tropics (eds N.S. Sodhi, L. Gibson & P.H. Raven), Wiley-Blackwell, Oxford, UK.

Scheffer, M. (2009) Critical Transitions in Nature and Society. Princeton, Princeton University Press.

Acknowledgements

Our thanks to Renata Pardini and Diogo Melo for important suggestions to this comment.

No competing interests declared.

### RE: Cantareira operators must consider critical transitions - response to Jerson Kelman

#### tausk replied to renatomc on 27 Sep 2015 at 01:15 GMT

I'm not an expert on hydrology, but I have a lot of familiarity with this particular set of data and I can follow differential equations. This article contains the best discussion I have seen about the crisis in the Cantareira reservoir. The case made by the authors for the regime shift announced in the title is very convincing, and their analysis gives a more robust justification to what I can more or less directly see by looking at the data: since (approximately) March 2014, the water inflow/rainfall ratio in the Cantareira system has been abnormally low. The system looks "broken".

While most of the criticism expressed in Kelman's comment above seems to me unfair, I do have some concerns of my own that are more or less related to part of what Kelman wrote. Let me explain. Let r denote the inflow rate, V the volume of water stored in the reservoir, C the volume of water stored in the catchment area and R the rainfall. Of course, I expect r to be a function of C and R (plus other variables, which can be modeled as random noise). The authors think of r as a function of V and R instead, which is justified by the somewhat reasonable assumption of a correlation between V and C; the data give support to this assumption. One can then model the relation between V and R using a differential equation, draw some conclusions about bistability and regime shifts, and make some predictions: so far so good. However, I expect the dynamics that undergoes a regime shift to be the dynamics involving r, R and C, not V; V is only being used as a surrogate for C, and this works well as long as the variables are correlated. But the variable that humans can directly influence (by reducing the amount of water taken from the reservoir) is V, and doing this should only work to avoid the regime shift as long as what happens to V actually influences what happens to C. So the question is: does V influence C? It probably does, to some extent, but my impression is that this should be a very small effect for the Cantareira system, given that the area containing V is less than 4% of the area containing C. If I am right then, though the whole analysis is interesting, it means that one cannot avoid the shift to the inefficient regime by reducing the amount of water taken from the reservoir. Reducing the amount of water taken from the reservoir would then only be relevant for the obvious reason of saving water for later.

No competing interests declared.

### RE: RE: Cantareira operators must consider critical transitions - response to Jerson Kelman

#### renatomc replied to tausk on 30 Sep 2015 at 14:16 GMT

Jointly written by the paper authors: Renato M. Coutinho, Roberto A. Kraenkel and Paulo I. Prado.

Firstly, we welcome the fact that Daniel Tausk acknowledges the evidence for the existence of bistable behaviour in the system and thus of feedbacks. We agree that even a model for C should have feedbacks (r(R, C)), which lie at the origin of the alternative stable states.

The comment brings up some interesting aspects of the problem, which are relevant for applied and theoretical issues. We have been working on understanding those, and we take here the opportunity to expand on some of them.

On the possibility of avoiding the transition by reducing water withdrawal

We verified that the catchment went through a transition to a lower-efficiency state. This state is related to the amount of water in the catchment (C), which is necessarily the result of a balance of water flows. Therefore, to know whether reductions in the water withdrawal from the reservoir system could be enough to change the state of the catchment, we have to compare the scale of the outflow rates with the other flows in the system, irrespective of the reservoir's area being just 4% of the total. Using average values from 2004-2013 (see the "Data sets" section in the paper), the outflow rate was about 34 m³/s. The average inflow rate from rainfall was about 112 m³/s, while potential (maximum) evapotranspiration is of the order of 70 m³/s (Martins 2011). So the reservoir outflow is of the same order of magnitude as the other two main natural flows in the Cantareira catchment. It is therefore reasonable that the reservoir outflows play an important part in the water balance of this system, and all the more so the lower the rainfall. Hence, not reducing water withdrawal before the transition contributed to its unfolding.
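
The scale comparison above amounts to two ratios, worked through in the short sketch below. The three flow values are those quoted in the paragraph; the variable names are illustrative.

```python
# Average flows quoted above, in cubic metres per second.
outflow = 34.0
rainfall_inflow = 112.0
potential_evapotranspiration = 70.0

outflow_vs_inflow = outflow / rainfall_inflow
outflow_vs_et = outflow / potential_evapotranspiration

print(f"outflow / rainfall inflow:              {outflow_vs_inflow:.2f}")
print(f"outflow / potential evapotranspiration: {outflow_vs_et:.2f}")
```

Both ratios lie between roughly 0.3 and 0.5, which is the sense in which the withdrawal is comparable in magnitude to the natural flows.
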

Another relevant point is that near tipping points small effects may be important. Transitions are threshold phenomena: they either occur or they do not. Staying below the threshold, even by a small margin, is decisive, as the system can then go back to the basin of attraction of the favorable equilibrium on a time scale related to relaxation processes.

On phenomenological models

So far, we have been following the same rationale as Tausk, and referring to a variable C, the volume of water contained in the catchment, as the looked-for causal variable. However, it is very unlikely that the state of the whole catchment, with its many intricacies, can be captured by a single variable. We took the phenomenological approach of depicting the state of the catchment through its closest observed surrogate, the reservoir volume V, which leads to a workable equation summarizing the complexities of the system while retaining its key dynamical features.

The alternative to simple phenomenological models, namely full-fledged hydrological models, can easily become so complex that features like bistability are obscured by the profusion of state variables. This approach has been used to forecast the state of the Cantareira in weekly reports since January 2015 (Cemaden, http://www.cemaden.gov.br ). These reports suggest that Cemaden has a comprehensive and up-to-date set of meteorological and hydrological data for the catchment. However, they did not explore the dynamical consequences of their models, which is not surprising in view of the previous discussion. Unfortunately, no one else can make progress in this direction, since neither the methods nor the data or code are public.

Reference

Martins, C. A. (2011). Estimativa da evapotranspiração no estado de São Paulo com o modelo da biosfera SiB2. Master's Dissertation, Instituto de Astronomia, Geofísica e Ciências Atmosféricas, University of São Paulo, São Paulo. Retrieved 2015-09-29 from http://www.teses.usp.br/t...

No competing interests declared.

### Sabesp should focus on improving Cantareira models not dismantling them

#### gleisonstorto replied to kelman on 26 Sep 2015 at 00:16 GMT

It strikes me that your comment, Mr. Kelman, does not try to extract anything interesting from the study. Although it contains some inconsistencies, which are debatable, it presents important models and data analysis that could be useful for managing the reservoir. Unfortunately, your stance seems to be a clear example of destructive criticism.
Science does not proceed by destroying the credibility of a specific competing study; it tries to learn from all of them and improve its comprehension of the phenomenon. The same should be valid for the management of a basic public resource such as water.
I am sure Sabesp has relevant models, data and studies regarding the Cantareira reservoir. So I urge you, as a scientist, to let the scientific community contribute to them, to prevent such collapses from occurring in the future.
Even if it is the consequence of a rare event, as you maintain, its occurrence should not be disregarded from now on.

Competing interests declared: Electrical Engineer from École nationale supérieure des télécommunications