Table 1.
General features of Scherffelia, Tetraselmis and other core chlorophyte chloroplast genomes.
Fig 1.
Gene maps of the Scherffelia and Tetraselmis chloroplast genomes.
Filled boxes represent genes, with colors denoting gene categories as indicated in the legend. Genes on the outside of each map are transcribed counterclockwise; those on the inside are transcribed clockwise. The middle ring (second from the outside) indicates the positions of the IR, LSC and SSC regions. Thick lines in the innermost ring represent the gene clusters conserved between the two chlorodendrophycean cpDNAs.
Fig 2.
Gene repertoires of the chloroplast genomes compared in this study.
Only the conserved genes that are missing in one or more genomes are indicated. The presence of a gene is denoted by a blue box. A total of 85 genes are shared by all compared genomes: atpA, B, E, F, H, I, cemA, clpP, ftsH, petB, D, G, L, psaA, B, C, J, psbA, B, C, D, E, F, H, I, J, K, L, N, T, Z, rbcL, rpl2, 5, 14, 16, 20, 23, 36, rpoA, B, C1, C2, rps2, 3, 7, 8, 9, 11, 12, 18, 19, rrf, rrl, rrs, tufA, ycf1, 3, 4, 12, trnA(ugc), C(gca), D(guc), E(uuc), F(gaa), G(gcc), G(ucc), H(gug), I(gau), K(uuu), L(uaa), L(uag), Me(cau), Mf(cau), N(guu), P(ugg), Q(uug), R(acg), R(ucu), S(gcu), S(uga), T(ugu), V(uac), W(cca), Y(gua).
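The compact notation used in this list (a full gene name followed by bare suffixes that share its three-letter prefix) can be expanded mechanically, which gives a quick check of the stated total. The following is a minimal sketch; the function name `expand_gene_list` and the expansion rule are illustrative assumptions, not part of the original analysis.

```python
import re

# Gene list copied from the legend; suffix-only tokens reuse the last prefix.
CORE_GENES = (
    "atpA, B, E, F, H, I, cemA, clpP, ftsH, petB, D, G, L, psaA, B, C, J, "
    "psbA, B, C, D, E, F, H, I, J, K, L, N, T, Z, rbcL, rpl2, 5, 14, 16, 20, "
    "23, 36, rpoA, B, C1, C2, rps2, 3, 7, 8, 9, 11, 12, 18, 19, rrf, rrl, "
    "rrs, tufA, ycf1, 3, 4, 12, trnA(ugc), C(gca), D(guc), E(uuc), F(gaa), "
    "G(gcc), G(ucc), H(gug), I(gau), K(uuu), L(uaa), L(uag), Me(cau), "
    "Mf(cau), N(guu), P(ugg), Q(uug), R(acg), R(ucu), S(gcu), S(uga), "
    "T(ugu), V(uac), W(cca), Y(gua)"
)

def expand_gene_list(text):
    """Expand 'atpA, B, E' style shorthand into ['atpA', 'atpB', 'atpE'].

    Assumed rule: a token starting with three lowercase letters is a full
    gene name and sets the current prefix; any other token is a suffix.
    """
    genes, prefix = [], ""
    for token in (t.strip() for t in text.split(",")):
        if re.match(r"[a-z]{3}", token):
            prefix = token[:3]
            genes.append(token)
        else:
            genes.append(prefix + token)
    return genes

genes = expand_gene_list(CORE_GENES)
print(len(genes))  # 85, in agreement with the total given in the legend
```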
Table 2.
Introns in the Scherffelia chloroplast genome.
Fig 3.
Gene partitioning patterns of the Scherffelia, Tetraselmis and other chlorophyte chloroplast genomes.
For each genome, one copy of the IR (thick vertical lines) and the entire SSC region are represented, but only the portion of the LSC region in the vicinity of the IR is displayed. The five genes composing the rDNA operon are highlighted in light green. The color assigned to each of the remaining genes is dependent upon the position of the corresponding gene relative to the rDNA operon in the cpDNA of the streptophyte alga Mesostigma viride, a genome displaying an ancestral gene partitioning pattern [56]. The genes highlighted in blue are found within or near the SSC region in this streptophyte genome (downstream of the rDNA operon), whereas those highlighted in light orange are found within or near the LSC region (upstream of the rDNA operon). The dark orange boxes denote the genes of LSC origin that have been acquired by the IRs of core chlorophytes (pedinophyceans, chlorodendrophyceans and core trebouxiophyceans). Note that, to simplify the comparison of gene order, some genomes are represented in their alternative isomeric form as compared to that used for the genome sequence deposited in GenBank.
Fig 4.
Extent of rearrangements between the Scherffelia and Tetraselmis chloroplast genomes.
These genomes were aligned using Mauve 2.3.1. Only one copy of the IR (pink boxes) is shown for each genome. The blocks of colinear sequences containing two or more genes are numbered as in Fig 1. Gene clusters 5 and 6 were retrieved as a single locally colinear block because their very small sizes did not allow them to be resolved in Mauve. Conversely, the gene cluster spanning the LSC/IR junction (cluster 1) was fragmented into three colinear blocks in Mauve because only one copy of the IR was included in this analysis and also because the two genomes were treated as linear instead of circular molecules (the genomes were linearized at the LSC/IR junction).
Fig 5.
ML phylogeny of chlorophytes inferred using the amino acid and nucleotide data sets assembled from 79 protein-coding genes.
The best-scoring RAxML tree inferred from the amino acid (PCG-AA) data set under the GTR+Γ4 model is presented. Bootstrap support (BS) values are reported on the nodes: from top to bottom or left to right, the values shown are for the analyses of the PCG-AA data set and the nucleotide PCG123degen and PCG12 data sets. A black dot indicates that the corresponding branch received a BS value of 100% in all three analyses; a dash represents a BS value < 50%. The scale bar denotes the estimated number of amino acid substitutions per site.
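As an illustration of how a codon-position data set of this kind can be derived, the sketch below keeps only the first and second positions of each codon in an in-frame alignment. It assumes that PCG12 denotes the first two codon positions of the protein-coding genes; the function name and the toy sequences are hypothetical, and the degenerate recoding used for PCG123degen is not reproduced here.

```python
def first_two_codon_positions(cds_alignment):
    """Keep codon positions 1 and 2 of every sequence.

    cds_alignment maps taxon names to in-frame, equal-length codon
    alignments (assumption: the length is a multiple of three).
    """
    trimmed = {}
    for taxon, seq in cds_alignment.items():
        if len(seq) % 3 != 0:
            raise ValueError(f"{taxon}: length {len(seq)} is not a multiple of 3")
        # Take the first two bases of each codon and drop the third.
        trimmed[taxon] = "".join(seq[i:i + 2] for i in range(0, len(seq), 3))
    return trimmed

# Toy alignment with hypothetical taxon labels.
toy = {"taxonA": "ATGGCCAAA", "taxonB": "ATGGCTAAG"}
pcg12 = first_two_codon_positions(toy)
print(pcg12)  # both sequences reduce to 'ATGCAA'
```

In this toy case the two sequences differ only at third codon positions, so they become identical once those positions are removed.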
Fig 6.
Bayesian phylogeny of chlorophytes inferred using the PCG-AA data set assembled from 79 cpDNA-encoded proteins.
The majority-rule posterior consensus tree inferred with PhyloBayes under the CAT+Γ4 model is presented. Posterior probability values are reported on the nodes: a black dot indicates that the corresponding branch received a value of 1.00 whereas a dash indicates a value < 0.95. The scale bar denotes the estimated number of amino acid substitutions per site.
Fig 7.
ML phylogeny of chlorophytes inferred using the nucleotide PCG12RNA and PCG123degenRNA data sets assembled from 79 protein-coding and 29 RNA-coding genes.
The best-scoring RAxML tree inferred from the PCG12RNA data set under the GTR+Γ4 model is presented. BS values are reported on the nodes: from top to bottom or left to right, the values shown are for the analyses of the PCG12RNA and PCG123degenRNA data sets. A black dot indicates that the corresponding branch received a BS value of 100% in both analyses; a dash represents a BS value < 50%. The scale bar denotes the estimated number of nucleotide substitutions per site.
Fig 8.
Shared gene pairs in chlorophyte chloroplast genomes.
The gene pairs that are shared by at least three taxa were identified among all possible signed gene pairs in the compared genomes. The presence of a gene pair is denoted by a blue box; a gray box refers to a gene pair in which at least one gene is missing due to gene loss. (A) Retention of prasinophyte gene pairs among core chlorophytes. The tree topology shown in Fig 7 was used to map losses of prasinophyte gene pairs. The characters indicated on the branches are restricted to those involving no gene losses; the characters denoted by triangles and rectangles represent homoplasic and synapomorphic losses, respectively. The full names of the gene pairs corresponding to the character numbers are given above the distribution matrix. The three chlorodendrophycean gene pairs highlighted in green and the pedinophycean gene pair highlighted in cyan are shared exclusively with prasinophyte genomes. (B) Gain of derived gene pairs among core chlorophytes. The six gene pairs highlighted in magenta denote synapomorphic characters uniting the Chlorellales and core trebouxiophyceans. Note that seven gene pairs (3'psaM-5'trnQ(uug), 3'trnQ(uug)-3'ycf47, 5'chlB-5'psbK, 3'chlB-5'psaA, 3'ftsH-3'trnL(caa), 3'rps4-5'trnS(gga) and 3'minD-5'trnN(guu)) could not be unambiguously included in this list of synapomorphies because at least one gene in each pair is missing in some taxa. Also note that the synapomorphic signatures of all highlighted gene pairs were confirmed using a larger data set including the gene pairs of all currently available chlorophyte chloroplast genomes.
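The identification of signed gene pairs shared by at least three taxa can be sketched as follows. This is a minimal illustration and not the software used in the study: the gene orders are hypothetical toy data, the function names are invented for the example, and each genome is assumed to be circular and given as a list of (gene, strand) tuples. Each pair is written with the gene ends that face each other, as in 3'psaM-5'trnQ(uug), and the two labels are sorted so that a genome read on either strand yields the same pairs.

```python
from collections import defaultdict

def signed_pairs(gene_order, circular=True):
    """Return the set of signed gene pairs of one genome.

    gene_order is a list of (gene, strand) tuples with strand +1 or -1.
    """
    pairs = set()
    n = len(gene_order)
    last = n if circular else n - 1
    for i in range(last):
        (g1, s1), (g2, s2) = gene_order[i], gene_order[(i + 1) % n]
        left = ("3'" if s1 > 0 else "5'") + g1   # end of g1 facing g2
        right = ("5'" if s2 > 0 else "3'") + g2  # end of g2 facing g1
        # Sorting makes the pair independent of the reading direction.
        pairs.add(tuple(sorted((left, right))))
    return pairs

def shared_pairs(genomes, min_taxa=3):
    """Map each signed pair to the taxa carrying it, keeping frequent ones."""
    taxa_by_pair = defaultdict(set)
    for taxon, order in genomes.items():
        for pair in signed_pairs(order):
            taxa_by_pair[pair].add(taxon)
    return {p: t for p, t in taxa_by_pair.items() if len(t) >= min_taxa}

# Hypothetical gene orders; taxonB is taxonA read on the opposite strand.
genomes = {
    "taxonA": [("psaM", 1), ("trnQ(uug)", 1), ("ycf47", -1), ("rbcL", 1)],
    "taxonB": [("rbcL", -1), ("ycf47", 1), ("trnQ(uug)", -1), ("psaM", -1)],
    "taxonC": [("psaM", 1), ("trnQ(uug)", 1), ("rbcL", 1), ("ycf47", -1)],
}

for pair, taxa in shared_pairs(genomes).items():
    print("-".join(pair), sorted(taxa))
```

With these toy data, only the pair 3'psaM-5'trnQ(uug) is present in all three taxa. Handling of pairs in which a gene is absent from a genome (the gray boxes) is deliberately left out of this sketch.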