A Benchmark of Parametric Methods for Horizontal Transfers Detection

doi:10.1371/journal.pone.0009989

Table 1.

Classification of the 6 gamma proteobacteria and the artificial genomes used in this study according to their distance to artificial E. coli.

Table 2.

The sixteen horizontal transfer detection methods analyzed in this paper.

Figure 1.

ROC-like curves of the 16 methods.

Each point on a curve gives the type I error (100 − sensitivity) and type II error (100 − specificity) obtained for one value of r (see Materials and Methods). The best methods are those with the fewest errors, i.e. those whose curves lie closest to the origin.
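As a rough illustration of how such a curve can be read, the sketch below converts sensitivity and specificity (in percent) into the two error rates and ranks threshold values by their distance to the origin. The use of Euclidean distance, the function names and the example values are assumptions made for illustration; the caption only states that the best methods lie closest to the origin.

```python
from math import hypot


def error_point(sensitivity, specificity):
    """Return (type I error, type II error) in percent, as defined in the caption."""
    type_i = 100.0 - sensitivity
    type_ii = 100.0 - specificity
    return type_i, type_ii


def distance_to_origin(sensitivity, specificity):
    """Euclidean distance of the error point to (0, 0); the metric is an assumption."""
    type_i, type_ii = error_point(sensitivity, specificity)
    return hypot(type_i, type_ii)


def best_threshold(curve):
    """Pick the r whose error point is closest to the origin.

    `curve` maps each threshold r to a (sensitivity, specificity) pair.
    """
    return min(curve, key=lambda r: distance_to_origin(*curve[r]))


# Hypothetical curve for one method: r -> (sensitivity %, specificity %)
example_curve = {1.0: (60.0, 99.0), 1.5: (90.0, 90.0), 2.0: (99.0, 50.0)}
chosen_r = best_threshold(example_curve)
```

A method whose whole curve stays near the origin performs well for many values of r, whereas a method that only approaches it at a single point depends heavily on the choice of threshold.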

Table 3.

Mean performances of all the 16 methods with “standard” model genomes.

Figure 2.

Mean errors of 7 methods according to (A) origin, (B) overall quantity, (C) size and (D) recipient genome.

The mean error is the mean of the type I (100 − sensitivity) and type II (100 − specificity) errors. It is shown here for the 7 efficient HT detection methods, grouped by criterion (codon usage: CU.KL; dinucleotide frequencies: dint5; GC content: GCtotal and GC1-GC3; tetranucleotide frequencies: oli.chi2, oli.KL and signature), according to four parameters. A: the origin. The donor genomes, each being the sole source of the HTs in a given test, are ordered by their distance to the host genome (E. coli) in terms of tetranucleotide frequencies, with the closest on the left and the farthest on the right. B: the overall quantity of HTs, as a percentage of the genome. C: the size of the HTs. Small, Medium, Large and Very Large mean 1 to 5 genes, 5 to 10 genes, 10 to 20 genes and 20 to 30 genes, respectively. D: the host genome, i.e. the genome receiving the HTs.
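The mean error defined above can be sketched as follows, together with an average per parameter value of the kind plotted in each panel. The record layout (dictionaries with "sensitivity", "specificity" and a grouping key) and all numbers are hypothetical, and the error definitions follow the Figure 1 caption (type I = 100 − sensitivity, type II = 100 − specificity).

```python
from collections import defaultdict


def mean_error(sensitivity, specificity):
    """Mean of the type I and type II errors, all in percent."""
    type_i = 100.0 - sensitivity
    type_ii = 100.0 - specificity
    return (type_i + type_ii) / 2.0


def mean_error_by(results, key):
    """Average the mean error over all results sharing the same value of `key`.

    `results` is a list of dicts; this layout is an assumption for illustration.
    """
    groups = defaultdict(list)
    for row in results:
        groups[row[key]].append(mean_error(row["sensitivity"], row["specificity"]))
    return {value: sum(errors) / len(errors) for value, errors in groups.items()}


# Hypothetical results of one method on three test genomes
results = [
    {"size": "Small", "sensitivity": 80.0, "specificity": 90.0},
    {"size": "Small", "sensitivity": 60.0, "specificity": 70.0},
    {"size": "Large", "sensitivity": 95.0, "specificity": 95.0},
]
errors_by_size = mean_error_by(results, "size")
```

The same helper could be called with a different key (for example a donor or host genome label) to obtain the groupings of the other panels.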

Table 4.

Sensitivity, specificity and mean performance of the methods with HTs originating from real gamma-proteobacteria.

Table 5.

Mean performance of the combination of 2 methods over the “standard” model genomes and over the “real” E. coli genomes.
