Reader Comments

Correction, clarification and link to reproducible script

Posted by deevybee on 30 Dec 2019 at 06:42 GMT

I was prompted to re-run the bibliometric analysis reported in this paper after Dr Karla McGregor from Boys Town National Research Hospital contacted me about a discrepancy she found when attempting to replicate and extend the analysis. This concerned an error that arose from misplaced quotes in a search term. In checking the analysis, I also became aware of other issues in the reporting of the analysis that require clarification.

1. The erroneous datapoint is in Table 3, which gives the number of records in the period 1985-2009 retrieved on bibliometric search for different neurodevelopmental conditions. The number of records retrieved for the condition 'Developmental dyscalculia' was misreported as 229. The correct number is 142. The error arose because of misplaced quotes; the phrase "math* disorder*" had been entered without quotes, so the search retrieved records containing math* and disorder* in the title, rather than only the compound "math* disorder*". Developmental dyscalculia was already notable for the very low rate of research papers and funding it attracted: the corrected figure shows that its neglect is even more extreme than originally stated.
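The effect of the missing quotes can be illustrated with a toy matcher. This is only a sketch of the general idea, not the Web of Science search engine: the three titles are invented for illustration, and the helper names (term_to_regex, matches_all_terms, matches_phrase) are hypothetical.

```python
import re

def term_to_regex(term):
    """Convert a search term with a * wildcard into a regex for a single word."""
    return re.escape(term).replace(r"\*", r"\w*")

def matches_all_terms(title, terms):
    """Unquoted search: every term must appear somewhere in the title."""
    return all(
        re.search(r"\b" + term_to_regex(t) + r"\b", title, re.IGNORECASE)
        for t in terms
    )

def matches_phrase(title, phrase):
    """Quoted search: the terms must appear as adjacent words, in order."""
    words = [term_to_regex(t) for t in phrase.split()]
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return re.search(pattern, title, re.IGNORECASE) is not None

# Invented titles, for illustration only.
titles = [
    "Mathematics disorder in school-age children",
    "Mathematical models of bipolar disorder",
    "Reading disorders and arithmetic",
]

unquoted_hits = [t for t in titles if matches_all_terms(t, ["math*", "disorder*"])]
quoted_hits = [t for t in titles if matches_phrase(t, "math* disorder*")]

print(len(unquoted_hits), "titles match math* AND disorder*")
print(len(quoted_hits), "titles match the phrase \"math* disorder*\"")
```

In this toy set the unquoted search also picks up the title about mathematical models of an unrelated disorder, which is the kind of false hit that inflated the original count.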

2. The way in which I had reported search terms was not clear. In the interests of saving space, I had used a forward slash to separate alternative terms, so that for developmental dyscalculia, Table 1 showed the search terms (in titles) as:
"Developmental dyscalcul*" OR "specific arithmetic*/math* disab*/retard*/disorder*" NOT dyscalcul*
In practice, some combinations yielded null results, so the actual combinations that were used were:
TI = ("developmental dyscalcul*"
OR ("specific arithmetic* disabil*" NOT dyscalcul*)
OR ("specific arithmetic* retard*" NOT dyscalcul*)
OR ("math* disorder*" NOT dyscalcul*)
OR ("math* disab*" NOT dyscalcul*))
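Writing each phrase out in full, with its own quotes, avoids the ambiguity of the shorthand. As a minimal sketch, the query above could be assembled programmatically so that every phrase is always quoted; the function name build_title_query is hypothetical and not part of any Web of Science tool.

```python
def build_title_query(primary, alternatives, exclude):
    """Assemble a title (TI) query in which every phrase is explicitly quoted.

    primary      -- main phrase, searched without an exclusion
    alternatives -- other phrases, each combined with NOT <exclude>
    exclude      -- term removed from the alternative matches to avoid double counting
    """
    clauses = ['"' + primary + '"']
    clauses += ['("' + phrase + '" NOT ' + exclude + ")" for phrase in alternatives]
    return "TI = (" + " OR ".join(clauses) + ")"

# The four combinations listed above as actually used.
query = build_title_query(
    "developmental dyscalcul*",
    [
        "specific arithmetic* disabil*",
        "specific arithmetic* retard*",
        "math* disorder*",
        "math* disab*",
    ],
    "dyscalcul*",
)
print(query)
```

Generating the string from a list of phrases makes it harder to drop a pair of quotes by accident, and the list itself documents exactly which combinations were searched.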

3. In the original analysis, there was a problem with the Web of Knowledge search, in that it did not allow for 'Down syndrome' as a search term. Consequently, the numbers for this condition were estimated, as explained in the paper. This anomaly has been corrected in the current Web of Science, so it is now possible to get a precise record count. This shows that the reported figure of 15,522 records in the 25-year period was substantially overestimated: the correct number is 11,210.
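To put the two corrected counts in proportion, the size of each overestimate can be worked out directly from the figures given in points 1 and 3. The helper name overestimate is hypothetical; the numbers are those stated above.

```python
def overestimate(reported, corrected):
    """Return the excess record count and the excess as a percentage of the corrected count."""
    excess = reported - corrected
    return excess, 100 * excess / corrected

# (reported, corrected) record counts for 1985-2009, as given in points 1 and 3.
corrections = {
    "Developmental dyscalculia": (229, 142),
    "Down syndrome": (15522, 11210),
}

for condition, (reported, corrected) in corrections.items():
    excess, percent = overestimate(reported, corrected)
    print(f"{condition}: {excess} excess records ({percent:.1f}% above the corrected count)")
```

The reported dyscalculia figure exceeded the corrected one by 87 records (about 61%), and the Down syndrome estimate exceeded the precise count by 4,312 records (about 38%).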

4. In comparing my results with those of Dr McGregor, it became clear that the specific databases used in the search also affected results. This is a point that has been made by Dallas et al. (2018), who noted that, to optimise reproducibility of findings, authors should list the databases that are included in a search. No record was made of this at the time of doing the original study, but the results reported here are in reasonable agreement with those obtained using the following databases: WOS, BCI, CCC, DRCI, DIIDW, KJD, MEDLINE, RSCI, SCIELO, ZOOREC. These are the current defaults at my institution when using the All Databases option, but will, as Dallas et al. noted, differ across institutions.
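One simple way to follow the recommendation of Dallas et al. is to save the database list alongside each query at the time of searching. The sketch below is one possible layout, not a standard format: the SearchRecord class and its field names are hypothetical, and the query shown is only the first clause of the dyscalculia search.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class SearchRecord:
    """Minimal provenance for one bibliometric search (hypothetical layout)."""
    query: str
    year_from: int
    year_to: int
    databases: list

record = SearchRecord(
    query='TI = ("developmental dyscalcul*")',
    year_from=1985,
    year_to=2009,
    # Database codes as listed in point 4 (institutional All Databases defaults).
    databases=["WOS", "BCI", "CCC", "DRCI", "DIIDW", "KJD",
               "MEDLINE", "RSCI", "SCIELO", "ZOOREC"],
)

# Serialise to JSON so the record can be archived with the results.
saved = json.dumps(asdict(record), indent=2)
print(saved)
```

Archiving a record like this with the counts would let a later reader see at once whether a discrepancy could be due to a different set of databases.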

5. I had hoped to produce a script that would allow for a fully reproducible version of the bibliographic data from this paper. Using the wosr package (Baker, 2018) in the R programming language (R Core Team, 2019), I recreated the record counts for all the conditions reported in the paper. However, with this package, it is possible to use only the Core Web of Science databases, and so the script retrieves a smaller number of records. Nevertheless, the overall pattern of findings can be replicated. The script and analysis are available on Open Science Framework. The databases used with wosr were: SCI-EXPANDED, SSCI, A&HCI, CPCI-S, CPCI-SSH, BKCI-S, BKCI-SSH, ESCI, CCR-EXPANDED, IC.

My thanks to Dr McGregor for drawing the initial error to my attention and sharing her analyses, to Christopher Baker for information on wosr, and to Dr Bianca Kramer, Scholarly Communication Librarian at Utrecht University, who, via informal advice on Twitter, greatly facilitated my understanding of these issues.

Baker, C. (2018). wosr: Clients to the 'Web of Science' and 'InCites' APIs. R package version 0.3.0. https://CRAN.R-project.or...
Dallas, T., Gehman, A., & Farrell, M. J. (2018). Variable bibliographic database access could limit reproducibility. BioScience, 68(8), 552-553. doi:10.1093/biosci/biy074
R Core Team (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

No competing interests declared.