More GenBank problems, possibly invalidating purported results

Posted by DasGrimm on 27 Nov 2014 at 13:58 GMT

Dear colleagues,

Please excuse the third comment in a row on the same topic; I was a little overwhelmed by the inconsistencies in your GenBank submission. It is not uncommon for authors to submit a wrong sequence to GenBank (particularly in the "old" days), but usually it is a single one. It was already highly unusual to find an ITS sequence shared by members of two different orders, hence my decision to comment.

However, the four wrongly labelled sequences appear to be only the tip of the iceberg.

Here is the BLAST report for another of your Rauvolfioideae accessions: Plumeria_alba JX856584, a fragment comprising an ITS2 sequence flanked by the end of the 5.8S rDNA and the start of the 25S rDNA, portions that are highly conserved within and between genera. I blasted the sequence because it deviated strongly from other accessions of the same genus (including seven of yours).


You will note the high similarity to others of your sequences, which cover members of about half a dozen angiosperm orders and two other families of the Gentianales. Note also that not a single accession from other researchers is captured, although data are available for at least some of the genera.

The coding regions are highly unusual for an angiosperm rDNA. A megablast search with the 25S portion alone finds only the accessions included in the above report. This is just as odd: the end of the 5.8S and the start of the 25S are highly conserved at the genus to family level in angiosperms, so megablast should have retrieved many hits.
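For readers who want to repeat this kind of check on other accessions, the reasoning can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the `Hit` record, its fields and the `screen_hits` function are hypothetical names of mine, the hit list would have to be filled in by hand from an actual BLAST report, and the accessions below are placeholders, not real records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One subject sequence from a BLAST report (hypothetical structure)."""
    accession: str    # subject accession returned by the search
    order: str        # taxonomic order the subject is labelled with
    same_study: bool  # True if submitted by the same study as the query


def screen_hits(hits):
    """Summarise how taxonomically scattered and how independent the hits are."""
    orders = {h.order for h in hits}
    independent = [h for h in hits if not h.same_study]
    return {
        "n_hits": len(hits),
        "n_orders": len(orders),
        "n_independent": len(independent),
        # Suspicious pattern: highly similar hits labelled with several
        # orders, none of them confirmed by another research group.
        "suspicious": len(orders) > 1 and not independent,
    }


# Placeholder records, NOT real accessions or real search results.
example_hits = [
    Hit("HYPO_0001", "Gentianales", True),
    Hit("HYPO_0002", "Order_B", True),
    Hit("HYPO_0003", "Order_C", True),
]
summary = screen_hits(example_hits)
print(summary)
```

A single independent hit from the expected order would already clear the flag, which is the point: a genuine ITS2 should find close matches deposited by others.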

Whatever this ITS2 sequence type is, it has probably not been amplified from a (functional) 35S rDNA cistron of an angiosperm genome.

I strongly recommend retracting all data you submitted to GenBank that are captured by a megablast search with JX856584.

Your data appear to be riddled with problematic taxonomic assignments, which would be a simple explanation for the low ability of ITS to recognise species that you report. Particularly odd is your statement that ITS2 performs as poorly as rbcL; the expert reviewers and the editor of the paper should have recognised this as a hint that problematic data were used in your study. The latter is a plastid gene that shows very little to no variation between species (and sometimes even between genera). The ITS2 can be less divergent than the ITS1 and does not necessarily allow species to be identified, contrary to what some recent bulk-barcoding studies purport (see e.g. Grimm and Denk, Taxon, 2010, for a comprehensive study of ITS variants in oak species). But even in those cases, divergence in the ITS2 substantially exceeds that found in the plastid rbcL gene.
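The comparison of divergence between the two markers is easy to make explicit. Below is a minimal sketch that computes the uncorrected pairwise distance (p-distance, the proportion of differing sites) between two aligned sequences. The function name `p_distance` is my own choice, and the short fragments are invented toy strings for illustration only, not real ITS2 or rbcL data.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') or an ambiguous base ('N') in either
    sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Invented toy fragments, for illustration only (not real sequence data).
marker1_a = "ACGTTGCAAGCT"
marker1_b = "ACGATGCTAGCA"
marker2_a = "ATGTCACCACAA"
marker2_b = "ATGTCACCACAA"

d_marker1 = p_distance(marker1_a, marker1_b)
d_marker2 = p_distance(marker2_a, marker2_b)
print(d_marker1, d_marker2)
```

Applied to real alignments of the same species pairs, one would expect the ITS2 distances to lie clearly above the rbcL distances; if they do not, the sequences deserve a second look.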

Best regards, Guido

No competing interests declared.