Fig 1.

Phylogenetic tree and estimated divergence times of 10 morphologically similar Tetrahymena species.

(a) Maximum likelihood species tree, using 198 one-to-one orthologs, 104,434 amino acid sites, and 1,000 bootstraps. Green boxes, estimated divergence times for each node; orange hexagons, number of shared ortholog groups for each node; gray bar, geologic timescale. (b) Overall cell morphology and (c) oral apparatus, as revealed by silver-staining. C, Cenozoic; M, Mesozoic; Mya, million years ago; N, Neoproterozoic; P, Paleozoic.


Table 1.

Comparison of statistical data on the genomes of 10 Tetrahymena species.


Fig 2.

Top gene domains that contribute to the high MAC genome divergence in the 10 species.

(a) Percentage of species-specific genes in each species. (b) Heat map of the top 7 categories of domains found in species-specific genes: the bottom row (gray background) shows the total number of genes containing each domain category in all 10 species. CNBD, cyclic nucleotide-binding; GFR, growth factor receptor cysteine-rich; LRR, leucine-rich repeat; MAC, macronucleus; P-loop NTPase, P-loop-containing nucleoside triphosphate hydrolase; PK, protein kinase; TPR, tetratricopeptide repeat; WD40, WD40 repeat.


Fig 3.

Pericentromeric and subtelomeric regions of MIC chromosomes are gene innovation centers.

Circos (http://circos.ca/) diagram mapping the frequency of various properties associated with rapidly evolving genes to the 5 chromosomes of the T. thermophila MIC genome. Chromosomes (after omitting IESs) were divided into approximately 1 Mb bins. Values were normalized for the total number of genes and plotted for each bin. SSG indicates the density distribution of all species-specific genes; y-axis is the number of genes. CSG indicates the density distribution of the most highly conserved genes (i.e., ortholog category X); y-axis is the number of genes. LRR indicates the density distribution of species-specific LRR genes; y-axis is the number of genes. Ka/Ks indicates the distribution of Ka/Ks ratios, plotted as the median value of each bin. TD indicates the distribution of tandem gene duplication frequencies; y-axis is the percentage of tandem duplicated genes in each bin. GE indicates the gene expression level during vegetative growth (SPP medium), plotted as the median FPKM value. Note that the current chromosome-level assembly was generated from short reads (e.g., Illumina), and the centromeric and some of the subtelomeric regions are still incompletely assembled, which is the likely reason for the weak patterns seen at some chromosome termini (for example, both termini of chr2). chr2, Chromosome 2; CSG, conserved genes; FPKM, fragments per kilobase of exon per million reads mapped; GE, gene expression; IES, internal eliminated sequence; Ka/Ks, ratio of nonsynonymous to synonymous substitutions; LRR, leucine-rich repeat; MIC, micronucleus; SPP, Super Proteose Peptone; SSG, species-specific gene.
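
The binning described in this caption (approximately 1 Mb bins, with a gene count or a per-bin median plotted for each track) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the tuple layout, and the toy records are hypothetical, and the normalization step mentioned in the caption is deliberately left out because its exact form is not given here.

```python
from collections import defaultdict
from statistics import median

BIN_SIZE = 1_000_000  # approximately 1 Mb, as stated in the caption


def bin_index(position, bin_size=BIN_SIZE):
    """Return the zero-based bin index for a chromosome coordinate."""
    return position // bin_size


def summarize_bins(genes, bin_size=BIN_SIZE):
    """Group genes by (chromosome, bin) and report gene count and median value.

    `genes` is an iterable of (chromosome, start, value) tuples. `value` stands
    in for any per-gene statistic, such as a Ka/Ks ratio or an FPKM value.
    """
    grouped = defaultdict(list)
    for chrom, start, value in genes:
        grouped[(chrom, bin_index(start, bin_size))].append(value)
    return {
        key: {"n_genes": len(values), "median": median(values)}
        for key, values in grouped.items()
    }


# Hypothetical toy records for illustration only; not data from the study.
toy_genes = [
    ("chr1", 120_000, 0.10),
    ("chr1", 850_000, 0.30),
    ("chr1", 1_400_000, 0.50),
    ("chr2", 2_050_000, 0.20),
]
summary = summarize_bins(toy_genes)
```

Count tracks such as SSG, CSG, and LRR would correspond to `n_genes` for the relevant gene subset, while the Ka/Ks and GE tracks would correspond to `median`.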


Fig 4.

Unique features of LRR genes in Tetrahymena.

(a) LRR gene exon length distributions for all 10 Tetrahymena species. Every species shows an exon peak at 90 bp, representing arrays of exons that are exactly 90 bp long. An inset shows the exon length distribution in detail for the range from 85 to 95 bp. From right to left (inset), species are T. thermophila, T. malaccensis, T. elliotti, T. pyriformis, T. vorax, T. borealis, T. canadensis, T. empidokyrea (mosquito parasite), T. shanghaiensis, and T. paravorax. (b) LRR gene TTHERM_000586765, an example of a 90-bp exon array gene masked by at least 1 of the 8 de novo–identified MAC LRR gene CRSs. (c) Extreme phase 2 bias of introns among 90-bp exon–containing LRR genes in 10 Tetrahymena species. The 10 concentric circles represent the 10 species, from inside to outside: T. thermophila, T. malaccensis, T. elliotti, T. pyriformis, T. vorax, T. borealis, T. canadensis, T. empidokyrea (mosquito parasite), T. shanghaiensis, and T. paravorax. (d) Highly variable numbers of 90-bp exons in different LRR genes in all 10 species. The numbers of 90-bp exons in different genes were used to make the dot plot. From left to right, species are T. thermophila, T. malaccensis, T. elliotti, T. pyriformis, T. vorax, T. borealis, T. canadensis, T. empidokyrea (mosquito parasite), T. shanghaiensis, and T. paravorax. The color scheme is the same as in panel (a). Numerical data underlying this panel are listed in S2 Data. CRS, consensus repeat sequence; LRR, leucine-rich repeat; MAC, macronucleus; RNA-Seq, RNA sequencing.
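
Because 90 is a multiple of 3, every intron inside a run of 90-bp exons has the same phase as the intron that precedes the run, which is one way to see why a strong phase bias and 90-bp exon arrays go together. The sketch below computes intron phases from coding exon lengths under the standard definition (cumulative coding length modulo 3). The function names and the toy gene are hypothetical, and the exon lengths are assumed to cover coding sequence only, starting at the first base of the start codon.

```python
def intron_phases(exon_lengths):
    """Return the phase of each intron, given coding exon lengths in order.

    The phase of an intron is the number of bases of a split codon that lie
    upstream of it, i.e. the cumulative coding length modulo 3.
    """
    phases = []
    coding_total = 0
    for length in exon_lengths[:-1]:  # there is no intron after the last exon
        coding_total += length
        phases.append(coding_total % 3)
    return phases


def count_90bp_exons(exon_lengths):
    """Count exons that are exactly 90 bp long."""
    return sum(1 for length in exon_lengths if length == 90)


# Hypothetical gene for illustration only: a 47-bp first coding exon,
# five 90-bp exons, and a 61-bp terminal exon.
toy_exons = [47, 90, 90, 90, 90, 90, 61]
phases = intron_phases(toy_exons)
n_90 = count_90bp_exons(toy_exons)
```

In this toy gene the first intron is phase 2 (47 mod 3 = 2), and each following 90-bp exon leaves the phase unchanged.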


Fig 5.

T. thermophila LRR genes can be sorted into 3 groups with different properties.

(a) LRR genes are sorted into 3 groups based on the presence or absence of 90-bp exons and on MAC CRS masking. (b) Ratios of species-specific to conserved genes among the 3 groups of LRR genes. Two asterisks indicate a significant difference between 2 groups (chi-squared test, p < 1 × 10⁻⁵). (c) Exon length distributions of the 3 groups of LRR genes. (d) Intron length distributions for the 3 groups of LRR genes. Colors in (c) and (d) are as in (b). CRS, consensus repeat sequence; LRR, leucine-rich repeat; MAC, macronucleus.
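
As an illustration of the kind of comparison reported in panel (b), the sketch below runs a Pearson chi-squared test on a 2 × 2 table of species-specific versus conserved gene counts for 2 groups. It assumes a 2 × 2 layout without continuity correction, which the caption does not specify. The function name and the counts are hypothetical and are not values from the study.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test for the 2 x 2 table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom and no continuity
    correction. For 1 degree of freedom the upper tail probability is
    erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column must have a nonzero total")
    statistic = n * (a * d - b * c) ** 2 / denominator
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts for illustration only.
# Rows: two LRR gene groups; columns: species-specific, conserved.
statistic, p_value = chi_squared_2x2(30, 10, 10, 30)
significant = p_value < 1e-5  # threshold quoted in the caption
```

With three groups, the same function would be applied to each pair of groups in turn.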


Fig 6.

Differential distributions of the 3 groups of LRR genes and the 3′ terminal segment of Tt.REPs along the 5 T. thermophila MIC chromosomes.

Central ring: the 5 chromosomes in the T. thermophila MIC genome; group III: group III LRR genes; y-axis, number of genes. Group II: group II LRR genes; group I: group I LRR genes; REP: 54-bp conserved sequences at the 3′ end of Tt.REPs; y-axis, number of Tt.REPs (represented by 54-bp conserved sequences). chr1, Chromosome 1; LRR, leucine-rich repeat; MIC, micronucleus; Tt.REP, T. thermophila REP retrotransposon.


Fig 7.

Phylogeny and MIC chromosome distribution of 12 nearly identical 90-bp exons in different T. thermophila LRR genes give evidence of extensive ectopic recombination.

Right: intron/exon diagram of the 10 LRR genes containing the twelve 90-bp exons (shown in red) that share between 88 and 90 identical nucleotides. Listed above each gene: MIC chromosome location. L or R indicates the left or right arm of the chromosome. Left: a maximum likelihood phylogenetic tree based exclusively on these twelve 90-bp exons. Note that (a) this is the largest group of nearly identical 90-bp exons and (b) TTHERM_001443819 and TTHERM_00001659049 each have 2 exons that belong to this group. Identical 90-bp exons share the same symbol: yellow asterisk (*) or number sign (#). chr3, Chromosome 3; LRR, leucine-rich repeat; MIC, micronucleus; mid-arm, near the middle of chromosome arms; NA, not available (the gene is located in a still-unassembled region of the MIC genome); pCen, pericentromeric region.


Fig 8.

Group III LRR gene repeats are the most common element flanking non-LTR REP retrotransposons in the T. thermophila MIC genome.

(a) Tt.REP sequence distribution across 5 MIC chromosomes in T. thermophila, normalized to the same length. (b) Example of a Tt.REP (non-LTR REP retrotransposon in T. thermophila) fragment flanking tLRR-MIC-CRS–masked segments of a functional group III LRR gene (TTHERM_01344670), containing a 90-bp exon array and supported by RNA-Seq gene expression data. (c) MIC sequences masked by tLRR-MIC-CRS are most frequently flanked on one or both sides by Tt.REP. Green bar and arrow, tLRR-MIC-CRS–masked loci; black bars, loci masked by the next most frequent repeat families or low-complexity sequences. Numerical data underlying this panel are listed in S2 Data. chr1, Chromosome 1; CRS, consensus repeat sequence; IES, internal eliminated sequence; LRR, leucine-rich repeat; MAC, macronucleus; MDS, MAC-destined sequence; MIC, micronucleus; non-LTR, Non-long terminal repeat; RNA-Seq, RNA sequencing; Tt.REP, T. thermophila REP retrotransposon.


Fig 9.

Evolutionary model of the observed innovation in LRR genes with tandem 90-bp exons.

(a) Diagram of a Tetrahymena cell. (b) Key to LRR gene-related symbols. (c) Typical MIC chromosome showing the biased distribution of key genetic elements. Central green circle: centromere (not yet fully assembled and characterized). Red and blue shading: biased chromosomal distribution of youngest and most conserved genes, respectively. Darkest color: highest concentration. Pink dashed line above the chromosome: biased chromosomal distribution of TEs, REP included, and other repeated sequences. (d) Multiple exon-shuffling mechanisms proposed to explain how pericentromeric and subtelomeric regions of the MIC genome function as LRR gene innovation centers. (1) Unequal crossing over between 2 different exons of the same LRR gene leads to alleles with more and fewer tandem repeats. (2) Unequal crossing over between exons in 2 different LRR genes leads to exon duplications and deletions. (3) REP retrotransposition into a preexisting LRR gene (step 1), followed by possible (not yet demonstrated) REP-mediated retrotransduction of LRR gene repeats into another LRR gene (step 2), would lead to a net increase in the number of LRR gene repeats. Pink line: transcript resulting from cotranscription of REP and 1 LRR gene repeat. Note that the right branch represents a co-retrotransposition of REP and an LRR repeat, which leads to dispersal of the LRR repeats and could potentially mediate further ectopic recombination. (4) REP copies, also being repeated sequences, can undergo unequal crossing over, with consequences similar to those of mechanism 2. (e) Representative product of the above mechanisms: LRR gene with long tandem arrays of 90-bp exons. LRR, leucine-rich repeat; MAC, macronucleus; MIC, micronucleus; non-LTR, Non-long terminal repeat; REP, REP-type retrotransposon; TE, transposable element.
