Reader Comments

A methodological flaw and two errors

Posted by LeNormand on 10 Sep 2012 at 12:14 GMT

Your paper has a serious flaw that results in two significant sources of error.

While a linear regression is a fine way to show a linear correlation, it rests on a few basic assumptions about the residuals, and if those assumptions are not met, the whole analysis is invalidated (which is unfortunately what happens here).
Those assumptions are:
• Normality of the errors.
• Independence of the error distribution from the prediction (homoscedasticity).
• Linearity of the relationships.

While your paper doesn’t give enough data to judge the third, the graph you attached (Figure 1) is enough to reject your linear regression on the basis of the first two:
• The errors are very obviously not normal: they show much less variation on the “right” than on the “left”, and the worst left-hand deviation is only -0.8 sigma. (If the errors were normal, the likelihood of having 67 points and none below -1 sigma would be less than 0.001%.)
• The errors are not independent of the prediction: between -10% and +10% of your indicator (belief… - belief …), the standard deviation is approximately 0.3 sigma (that is, 0.3 times the deviation of the whole set), which I would judge to be about one fifth of the variation of the subset with an indicator of more than +10%. (This issue is named heteroscedasticity, and it is a classic reason for invalidating the result of any sort of regression, linear or not, with assumed Gaussian errors or not.)
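
The "less than 0.001%" figure in the first bullet can be checked with a few lines of arithmetic. The sketch below assumes the 67 residuals are independent draws from a standard normal distribution (the null assumption being tested); the function names are illustrative and do not come from the paper.

```python
import math

def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_none_below(threshold_sigma: float, n_points: int) -> float:
    """Probability that n independent N(0, 1) draws all lie above the threshold."""
    return (1.0 - normal_cdf(threshold_sigma)) ** n_points

# Assumption: 67 independent, normally distributed residuals.
p = prob_none_below(-1.0, 67)
print(f"P(no residual below -1 sigma among 67 points) = {p:.2e}")
```

Under these assumptions the probability comes out just under 1e-5, that is, just under 0.001%, which is consistent with the figure quoted in the comment.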

Each of those two errors is enough, both in quality and in quantity, to completely invalidate the results: they are so large that they can be spotted by eye on a single graph of a very small subset of the data in your analysis (and it may well be that the part of the linear regression that involves the Z-spread is just as flawed).

From a qualitative point of view, you seem to be comparing apples and pears, namely two subsets: one comprising mostly those with a score of less than 10%, and one comprising most of the rest (and presumably a few of the smaller scores).
The first sample has a very low variation in your predicted Z-score while the other one has a huge one.
It might well be that this constitutes a result (admittedly a very small one): that you can discriminate ### types of countries (it is your job to work out what ### stands for, not mine) by a criterion such as the one you have shown here.
It might even be that said ### is a good predictor of your indicator, and that the result you think you have found (although the statistical analysis is invalid and you are jumping very fast from correlation to causality) is a cause of it.
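
The two-subset comparison described above can be made concrete by measuring the residual spread separately on each side of the 10% cutoff. The following is only a sketch on synthetic, made-up data (not the paper's data); the function name, the cutoff argument, and the scatter values 0.3 and 1.5 are assumptions chosen to mimic the roughly one-to-five ratio mentioned in the comment.

```python
import random
import statistics

def spread_by_subset(indicator, residuals, cutoff=0.10):
    """Return (sd_low, sd_high): residual standard deviations for points with
    |indicator| <= cutoff and for points with indicator > cutoff."""
    low = [r for x, r in zip(indicator, residuals) if abs(x) <= cutoff]
    high = [r for x, r in zip(indicator, residuals) if x > cutoff]
    return statistics.stdev(low), statistics.stdev(high)

# Synthetic data (hypothetical): 30 points near zero with small scatter,
# 37 points above +10% with five times the scatter.
rng = random.Random(0)
indicator = [rng.uniform(-0.10, 0.10) for _ in range(30)]
indicator += [rng.uniform(0.11, 0.60) for _ in range(37)]
residuals = [rng.gauss(0.0, 0.3) for _ in range(30)]
residuals += [rng.gauss(0.0, 1.5) for _ in range(37)]

sd_low, sd_high = spread_by_subset(indicator, residuals)
print(f"sd (|indicator| <= 10%): {sd_low:.2f}")
print(f"sd (indicator > 10%):    {sd_high:.2f}")
print(f"ratio high/low:          {sd_high / sd_low:.1f}")
```

A ratio far from 1 between the two standard deviations is the numerical counterpart of the unequal scatter visible by eye; a formal test (for example Breusch-Pagan) would be the next step on real data.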

Regards,

(hope the tone of this message is more academic)

No competing interests declared.

RE: A methodological flaw and two errors

azimshariff replied to LeNormand on 10 Sep 2012 at 18:41 GMT

The following is a response to a previous comment by the above commenter that had been removed by PLOS ONE staff. We will update with a revised comment in due time.


Non-normality of residuals can, indeed, affect the outcome of a regression analysis. In particular, when the assumption of normally distributed residuals is not met, the estimated standard errors may be too small, leading to a higher-than-nominal rate of Type-I error (that is, a p-value less than .05 may occur in more than 5% of samples taken from the null hypothesis population, where no effect exists and residuals are non-normally distributed).



To examine whether non-normality affected our results, we re-ran the analyses using “robust” standard errors. (Specifically, we used the “MLR” estimator in Mplus, which uses the Satorra-Bentler formula to correct standard errors based on the degree of non-normality in the data; Satorra & Bentler, 1994.) The resulting pattern of significant findings was exactly the same as the one we reported in the original paper. Thus, we can safely conclude that non-normality, which can cause errors of interpretation in regression, did not do so here.
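
To illustrate what a "robust" standard error does, here is a minimal sketch for a one-predictor regression. It is not the Mplus MLR / Satorra-Bentler correction described above; it swaps in a simpler, related technique, White's heteroscedasticity-consistent (HC0) "sandwich" estimator, and compares it with the classical standard error. The function name and the synthetic data are assumptions made purely for illustration.

```python
import math
import random

def slope_with_standard_errors(x, y):
    """Fit y = a + b*x by ordinary least squares and return
    (slope, classical_se, robust_se), where robust_se is White's HC0 estimate."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Classical SE: assumes one common residual variance for every point.
    classical_se = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    # HC0 SE: weights each squared residual by its own leverage term.
    robust_se = math.sqrt(
        sum((xi - x_bar) ** 2 * e * e for xi, e in zip(x, resid))
    ) / sxx
    return slope, classical_se, robust_se

# Synthetic, hypothetical data whose scatter grows with x (heteroscedastic).
rng = random.Random(1)
xs = [rng.uniform(0.0, 1.0) for _ in range(67)]
ys = [0.5 * xi + rng.gauss(0.0, 0.1 + xi) for xi in xs]

b, se_classical, se_robust = slope_with_standard_errors(xs, ys)
print(f"slope = {b:.3f}, classical SE = {se_classical:.3f}, robust SE = {se_robust:.3f}")
```

When the two standard errors differ noticeably, the classical one should not be trusted for significance tests; when they agree, as the authors report for their own robust re-analysis, the conclusions are unchanged.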



Contrary to LeNormand’s comment “In a word, your work is completely invalid (and the results are most likely an artifact)” (which is actually 16 words, not one), the reported results are not artifacts of non-normality.



-Azim Shariff and Mijke Rhemtulla



Reference:

Satorra, A., & Bentler, P. M. (1994). Corrections to test statistics and standard errors in covariance structure analysis. In A. von Eye & C. C. Clogg (Eds.), Latent variables analysis: Applications for developmental research (pp. 399–419). Thousand Oaks, CA: Sage.

No competing interests declared.

RE: RE: A methodological flaw and two errors

LeNormand replied to azimshariff on 11 Sep 2012 at 08:35 GMT

So that you can inform us in your answer: after using MLR, have you checked that the residuals follow a Chi-squared law?
It is no better to assume a Chi-squared law than a normal law for the purpose of regression if the residuals don’t follow said law (and at first look they don’t quite look Chi-squared either, at least when regressing on the assumption of normality).

I guess it doesn’t make a difference to the heteroscedasticity?

Regards

No competing interests declared.