Your citation 5 propagates a perception of the quality of Wikipedia (WP) content that is based on a flawed study design, one with too little statistical power to support the conclusion of near-equal quality between WP and the Encyclopedia Britannica (EB). This matter has been a staple of online discussions of the Nature "study" (search the topic in relation to the search term "sample size"), and so current scholarly discussion should not ignore this study's shortcomings. A simple thought experiment addresses the root issue. Picture a Venn diagram reflecting the intersection of the formal article coverage of the two encyclopedias. There are about 4.5M articles in WP, and perhaps 230,000 in the latest hardcopy EB (May 2014 estimates). For the sake of argument, assume approximately 200K articles in the intersection, i.e., in the set of all articles in common between the two. Now, posit that within this intersecting set there exist at least some articles whose content is not of comparable quality. See for instance the content on the Total Synthesis of Steroids, at each site (i.e., the non-illustrated 50 words at vs. the richly illustrated 7500 words at; all URLs in this comment accessed, 7 May 2014). The challenge then becomes: what sample size must be taken of the 200K intersecting article set to arrive, with an acceptable level of confidence, at the relative count or proportion of articles in each collection that are superior or inferior to articles in the other collection? That this initial sampling across the intersecting set would have to be random and otherwise unbiased is clear.
The answer to this question of the minimum sample size giving adequate statistical power depends on the details of the quality metrics and on how the study is otherwise designed; nevertheless, it is clear that the human selection of articles, including on the basis of comparable article word length, and the selection of just 50 articles (0.025% of our estimated 200K intersecting set size), together ensure that the sampling requirements for a study able to identify the steroid and other articles clearly different in quality were not met by the "study" performed by Nature. (See the link to supplemental information at; the sample size and selection method of the Nature study are reviewed on p. 1 therein.) I suggest that the fairest rigorous evaluation of the information at present is that there is clear evidence of gross differences in quality between particular articles of the two encyclopedias, and that a properly designed and statistically valid comparative quality study remains to be performed.
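
To make the sample-size question above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a full power analysis: it uses the textbook normal-approximation formula for estimating a proportion with a finite-population correction, a 95% confidence level (z = 1.96), the worst-case proportion p = 0.5, and the 200K intersecting-set estimate from the thought experiment. The function names, the margins of error, and the candidate fractions of "clearly different" articles are all hypothetical choices made for illustration. The second function treats the 50 articles as a simple random sample (which the Nature selection was not) and uses a binomial approximation.

```python
import math


def sample_size_for_proportion(population, margin, z=1.96, p=0.5):
    """Sample size needed to estimate a proportion to within +/- margin.

    Normal approximation with finite-population correction.
    z = 1.96 corresponds to 95% confidence; p = 0.5 is the worst case.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2   # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)        # finite-population correction
    return math.ceil(n)


def prob_sample_misses_all(fraction_different, sample_size):
    """Chance that a random sample contains none of the 'different' articles.

    Binomial approximation, reasonable when the sample is a tiny
    fraction of the population (50 out of an assumed 200,000 here).
    """
    return (1 - fraction_different) ** sample_size


INTERSECTION = 200_000  # assumed size of the WP/EB intersecting set

# Illustrative margins of error (assumptions, not from any study).
for margin in (0.10, 0.05, 0.02):
    print(margin, sample_size_for_proportion(INTERSECTION, margin))

# Illustrative fractions of articles that differ grossly in quality.
for fraction in (0.001, 0.01, 0.05):
    print(fraction, round(prob_sample_misses_all(fraction, 50), 3))
```

Under these assumptions, a margin of plus or minus 5 percentage points already calls for a random sample of about 384 articles, and if 1% of the intersecting articles differed grossly in quality, a random sample of 50 would miss all of them roughly 60% of the time.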

Links to Encyclopaedia Britannica's objections to the 'Nature' study, and to the journal's response to those objections, are also available at the 'Nature' link supplied above for accessing the supplemental information.

As the last author of the paper you commented on, I would like to join Taha in his reply and add only one remark. The Nature comparison (however doubtful it was) was NOT addressed to the QUALITY but to the RELIABILITY of the science-related articles. Unfortunately, there are very few science articles that could compete with EB in quality (there are perhaps some). However, in a freely edited encyclopedia, reliability is certainly a central issue.
A further illustrative example (in addition to the steroids example given above) was communicated to us: see the "extraction (chemistry)" article at Wikipedia, and the "extraction" sections in the "chemical analysis" and "separation and purification" articles at EB.

Thank you very much for your comment. I really appreciate your effort to make the point about the study in reference 5. While your point is well taken, I must say that our article does not really rely on the results of that study, and our conclusions are well supported by our own data analysis. In other words, the validity or otherwise of that reference has no effect on the merit of our work.

