Table 1.
Annotation modules and outputs of CTT.
Fig 1.
A schematic illustration of the CTT algorithm.
The “x” indicates a genomic region that encodes a target family domain. The coordinates of “x” are defined by a tBLASTn search using seed sequences of the target family as the query. “Hit1, 2,…n” indicate the top hits from a BLASTx search that uses a 10,000 nt genomic DNA sequence containing “x” as the query against protein sequences of previously annotated family members. The process is iterated up to six times, shortening the genomic sequence each time, until a new locus containing “x” is discovered. The iteration stops if the GeneWise score is lower than 50 or if no BLASTx hit is obtained.
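The iteration described in this caption can be sketched in code. The following is a minimal illustration, not the CTT implementation: the window size (10,000 nt), the iteration cap (six), and the GeneWise cutoff (50) come from the caption, while everything else is an assumption. In particular, `blastx` and `genewise` are hypothetical callables standing in for the external tools, centring the window on “x” is assumed, and `trim_away` is a hypothetical shortening rule, since the caption does not state how the genomic sequence is shortened.

```python
from typing import Callable, List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end), 0-based, end-exclusive

WINDOW_NT = 10_000          # genomic window size given in the caption
MAX_ITERATIONS = 6          # iteration cap given in the caption
MIN_GENEWISE_SCORE = 50.0   # GeneWise cutoff given in the caption


def initial_window(x: Interval, chrom_len: int, size: int = WINDOW_NT) -> Interval:
    """Return a window of at most `size` nt containing x.

    Centring the window on x is an assumption; the caption only says
    that the window contains x.
    """
    mid = (x[0] + x[1]) // 2
    start = max(0, mid - size // 2)
    end = min(chrom_len, start + size)
    start = max(0, end - size)  # re-extend upstream if clipped at the end
    return start, end


def trim_away(window: Interval, model: Interval, x: Interval) -> Interval:
    """Hypothetical shortening rule: cut off a gene model that misses x."""
    start, end = window
    if model[1] <= x[0]:   # model lies entirely upstream of x
        return model[1], end
    if model[0] >= x[1]:   # model lies entirely downstream of x
        return start, model[0]
    return start, end      # partial overlap: nothing safe to trim


def refine_locus(
    x: Interval,
    window: Interval,
    blastx: Callable[[Interval], List[str]],
    genewise: Callable[[Interval, str], Tuple[float, Interval]],
) -> Optional[Interval]:
    """Iterate until a gene model containing x is found, or give up."""
    for _ in range(MAX_ITERATIONS):
        hits = blastx(window)
        if not hits:                      # stop: no BLASTx hit
            return None
        score, model = genewise(window, hits[0])
        if score < MIN_GENEWISE_SCORE:    # stop: GeneWise score below 50
            return None
        if model[0] <= x[0] and x[1] <= model[1]:
            return model                  # locus containing x
        shorter = trim_away(window, model, x)
        if shorter == window:             # window cannot be shortened further
            return None
        window = shorter
    return None
```

Passing the two tools in as callables keeps the control flow (the two stop conditions and the six-iteration cap) separate from how BLASTx and GeneWise are actually invoked, which the caption does not specify.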
Table 2.
The sensitivity and specificity of CTT in finding plant F-box gene members.
Fig 2.
Performance test of the CTT program.
(A) Comparison of F-box gene superfamily annotations in 18 plant genomes between a previous work [5] and the automatic CTT annotation in this work. (B) Correlation between the numbers of new F-box and new BTB loci discovered in 18 plant genomes. The blue line marks where the x-axis and y-axis values are equal. The full names of the species and their abbreviations are listed in S1 Table.
Table 3.
CTT annotation of F-box genes in 18 plant genomes.
Table 4.
CTT annotation of BTB genes in 18 plant genomes.
Table 5.
CTT annotation of F-box genes in 5 non-plant genomes.
Table 6.
CTT annotation of BTB genes in 5 non-plant genomes.
Table 7.
Performance benchmark of CTT in annotating Pkinase genes in 40 genomes.
Fig 3.
JBrowse images showing 34 truly new Pkinase loci identified by CTT in the Amborella genome.
The transcript of each CTT-annotated Pkinase locus was used as a BLASTn query against the Amborella genome. The best hit was visualized in JBrowse, available on Phytozome. Species abbreviations: Ath, Arabidopsis thaliana; Osa, Oryza sativa; Cre, Chlamydomonas reinhardtii.
Table 8.
CTT annotation results of Pkinase genes in 40 genomes.