Reader Comments

Posted by pbio on 07 May 2009 at 22:18 GMT

Author: richard alexander
Position: statistician
Submitted Date: May 30, 2007
Published Date: June 6, 2007
This comment was originally posted as a “Reader Response” on the publication date indicated above. All Reader Responses are now available as comments.

I congratulate the authors on a well-written and interesting paper. I wish to point out, though, that because of limitations in the sampling design that yielded the disease data, and problems in the authors' treatment of this design in their analysis, the conclusions of this paper should be treated as tentative until a more careful analysis of their data is carried out.

It is not unusual in ecology for data collected for one project to be used for another purpose altogether, one for which the original sampling design is less than ideal. The authors cite the Australian Institute of Marine Science (AIMS) Long-term Monitoring Program as the source of their disease data. As is clearly documented in the AIMS annual reports (cited by the authors), the 48 reefs included in the monitoring program were chosen for logistical and historical reasons and are not a random sample. Particularly relevant here is the fact that the 48 reefs were not independently sampled but were visited in groups each year in five or six cruises. During a cruise, the reefs in a particular geographic region were all visited in rapid succession, within a few days of each other. A hiatus of a few weeks or months then followed, after which another cruise commenced and the reefs in a different location were surveyed. Thus reefs that are spatially close were all sampled together and at roughly the same time of year. Clearly space and time are totally confounded in these data. Correctly accounting for this fact can have profound effects on parameter estimates and their standard errors, because observations clustered in this way carry a substantially smaller effective sample size than the raw count of reefs suggests.
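The loss of effective sample size described above can be illustrated with the standard design-effect approximation for equal-sized clusters, deff = 1 + (m - 1) * rho. This is only a sketch: the function names are invented for this example, the choice of six cruises takes the upper end of "five or six", and the intra-cluster correlation values are hypothetical, not estimates from the AIMS data.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1] for this sketch")
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_obs: float, cluster_size: float, icc: float) -> float:
    """Number of independent observations the clustered sample is worth."""
    return n_obs / design_effect(cluster_size, icc)


if __name__ == "__main__":
    n_reefs = 48      # reefs in the monitoring program, as stated above
    n_cruises = 6     # assumption: upper end of "five or six" cruises per year
    reefs_per_cruise = n_reefs / n_cruises

    # Hypothetical within-cruise correlations, for illustration only.
    for rho in (0.0, 0.2, 0.5, 0.8):
        n_eff = effective_sample_size(n_reefs, reefs_per_cruise, rho)
        print(f"rho = {rho:.1f}: effective sample size per year = {n_eff:.1f}")
```

Even a moderate within-cruise correlation shrinks the information in a year's surveys well below 48 independent reefs, which is why standard errors from a model that ignores the grouping would be too small.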

The authors do state that they attempted to account for the repeated yearly measurements on individual reefs by fitting a GEE model but found the autocorrelation to be close to zero. This is not surprising, because it is day-to-day correlation, not annual correlation, that is important in these data. They also claim to have accounted for the nested nature of their data by including sector "as a stratification variable". It's not clear what they mean by this, as the only model results they report (their Table 2) give no evidence that sector was incorporated in any way at all. In any case, neither of these approaches is appropriate here.

The defining structure of their data is a spatiotemporal unit corresponding to a group of reefs measured at a specific time (a week or two of a particular year). Observations from the same spatiotemporal unit will be far less variable than observations coming from different units in the same year or from different years. One way to account for this in their regression would be with a mixed effects model in which a separate random intercept is included for each spatiotemporal unit.
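One way to write down the random-intercept model suggested above is sketched here. The notation is an assumption of this illustration and does not come from the paper: i indexes reef surveys, j indexes spatiotemporal units (a cruise in a given year), x_ij holds the covariates, and k is the negative binomial dispersion parameter.

```latex
Y_{ij} \mid u_j \sim \mathrm{NegBin}(\mu_{ij}, k), \qquad
\log \mu_{ij} = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j, \qquad
u_j \sim \mathcal{N}(0, \sigma_u^2)
```

Surveys from the same unit share the same u_j, which induces the positive within-unit correlation described above. Setting \sigma_u^2 = 0 recovers an ordinary negative binomial regression that treats all surveys as independent.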

Drawing valid inference from observational data requires great care. It will be interesting to see if the authors' conclusions hold up when their data are subjected to the more rigorous analysis they require.

Additional comments.
The coefficients shown in Table 2 must be for a multiplicative version of the model; an additive interpretation yields nonsensical predictions. Thus contrary to the authors' explanation, a positive coefficient in this model does not necessarily predict a frequency increase. Instead their model predicts that increased WSSTA will decrease disease prevalence in reefs with 45% coral cover or less. Only when coral cover exceeds 45% does increased WSSTA increase prevalence. This obviates the circuitous discussion of Fig. 2.
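The sign change described above can be made concrete with a small sketch. It assumes the model has a log link with a WSSTA main effect and a WSSTA-by-coral-cover interaction, so that the effect of WSSTA on the log expected count is b_wssta + b_interaction * cover. The coefficient values below are hypothetical, chosen only so that the sign flips at 45% cover; they are not the values in Table 2, and the function names are invented for this example.

```python
import math


def wssta_log_effect(b_wssta: float, b_interaction: float, cover_pct: float) -> float:
    """Change in log expected count per unit increase in WSSTA at a given cover."""
    return b_wssta + b_interaction * cover_pct


def sign_change_cover(b_wssta: float, b_interaction: float) -> float:
    """Coral cover (%) at which the WSSTA effect changes sign."""
    if b_interaction == 0:
        raise ValueError("no interaction term, so the effect never changes sign")
    return -b_wssta / b_interaction


if __name__ == "__main__":
    # Hypothetical coefficients for illustration only (not from Table 2).
    B_WSSTA = -0.45
    B_INTERACTION = 0.01

    print(f"sign change at {sign_change_cover(B_WSSTA, B_INTERACTION):.1f}% cover")
    for cover in (20, 45, 70):
        effect = wssta_log_effect(B_WSSTA, B_INTERACTION, cover)
        # exp(effect) is the multiplicative change in expected count per unit WSSTA.
        print(f"cover = {cover}%: rate ratio per unit WSSTA = {math.exp(effect):.2f}")
```

Under these assumptions a positive interaction coefficient coexists with a negative net WSSTA effect at low cover, which is the point made above: the sign of a single coefficient does not determine the direction of the predicted change.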

The authors' use of deviance as a goodness-of-fit measure for their negative binomial regression model is not correct. The guideline they cite derives from the deviance's asymptotic chi-squared distribution, but this holds only in Poisson and grouped binary regression models, and then only if there are no zero counts and few small counts. Because the authors carried out negative binomial regression on data with many small counts, the guideline is not applicable.

No competing interests declared.