Distinct Gene Number-Genome Size Relationships for Eukaryotes and Non-Eukaryotes: Gene Content Estimation for Dinoflagellate Genomes

Yubo Hou; Senjie Lin

doi:10.1371/journal.pone.0006978

Abstract

The ability to predict gene content is highly desirable for characterization of not-yet sequenced genomes like those of dinoflagellates. Using data from completely sequenced and annotated genomes from phylogenetically diverse lineages, we investigated the relationship between gene content and genome size using regression analyses. Distinct relationships between log₁₀-transformed protein-coding gene number (Y′) and log₁₀-transformed genome size (X′, genome size in kbp) were found for eukaryotes and non-eukaryotes. Eukaryotes best fit a logarithmic model, Y′ = ln(-46.200+22.678X′), whereas non-eukaryotes best fit a linear model, Y′ = 0.045+0.977X′, both with high significance (p<0.001, R²>0.91). Total gene number shows similar trends in both groups to their respective protein-coding regressions. The distinct correlations reflect lower and decreasing gene-coding percentages as genome size increases in eukaryotes (82%–1%) compared to higher and relatively stable percentages in prokaryotes and viruses (97%–47%). The eukaryotic regression models project that the smallest dinoflagellate genome (3×10⁶ kbp) contains 38,188 protein-coding (40,086 total) genes and the largest (245×10⁶ kbp) 87,688 protein-coding (92,013 total) genes, corresponding to 1.8% and 0.05% gene-coding percentages. These estimates likely do not represent extraordinarily high functional diversity of the encoded proteome but rather highly redundant genomes, as evidenced by the high gene copy numbers documented for various dinoflagellate species.

Citation: Hou Y, Lin S (2009) Distinct Gene Number-Genome Size Relationships for Eukaryotes and Non-Eukaryotes: Gene Content Estimation for Dinoflagellate Genomes. PLoS ONE 4(9): e6978. https://doi.org/10.1371/journal.pone.0006978

Editor: Rosemary Jeanne Redfield, University of British Columbia, Canada

Received: May 19, 2009; Accepted: August 14, 2009; Published: September 14, 2009

Copyright: © 2009 Hou, Lin. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This work was supported by the NSF grants OCE-0452780 and EF-0626678. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

An increasing amount of evidence supports a general positive correlation between gene content and genome size in prokaryotes and small eukaryotes, but whether this trend applies to all eukaryotes has been questioned and remains to be investigated [1]–[3]. As genome size can be measured easily, a robust correlation between gene content and genome size would provide a simple tool for predicting gene contents of not-yet sequenced genomes such as those of dinoflagellates. Dinoflagellates are one of the largest algal groups in the ocean, contributing significantly to oceanic primary production and coral reef building. Dinoflagellates are also ecologically and economically important because many of them form harmful algal blooms and some produce toxins. Among many unique characteristics, dinoflagellates possess unusually large genomes [4]. Although smaller genomes may occur in some yet unrecognized dinoflagellates [5], typical dinoflagellate genomes are larger than those of most eukaryotes examined to date. The smallest documented dinoflagellate genomes are found in the coral reef symbiont Symbiodinium spp., ranging from 1.5 to 4.8 (average ∼3) pg DNA per haploid genome [6], while the largest (250 pg DNA per haploid genome) is found in Prorocentrum micans [7]. Equivalent to 3–245×10⁶ kbp per haploid genome, dinoflagellate genomes are about 1–77 fold the size of the human haploid genome, and greater than those of any other algal group (∼13–200×10³ kbp) by a factor of hundreds to thousands [6]–[10]. It has been suggested that a large fraction of the dinoflagellate genome consists of nonfunctional repeated DNA sequences [9], [11]–[15]. How many genes are encoded in the genomes of these unicellular and seemingly simple organisms remains an open question, one that potentially bears on eukaryotic genome evolution. 
Information on the gene contents of dinoflagellate genomes will help researchers understand how the large genomes favor or disfavor these organisms in their wide range of habitats.
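The conversion from picograms of DNA to kbp behind the 3–245×10⁶ kbp range quoted above can be sketched as follows. The conversion factor (1 pg of DNA ≈ 0.978×10⁹ bp) is a commonly used value that is not stated in the text, so it is an assumption here, and the names `PG_TO_KBP` and `pg_to_kbp` are illustrative.

```python
# Assumed conversion factor: 1 pg of double-stranded DNA ~ 0.978e9 bp = 0.978e6 kbp.
# This value is not given in the text; it is a widely used approximation.
PG_TO_KBP = 0.978e6


def pg_to_kbp(picograms):
    """Convert a DNA mass in picograms to an approximate genome size in kbp."""
    return picograms * PG_TO_KBP


# Haploid DNA contents quoted in the text [6], [7].
for label, pg in [("Symbiodinium spp. (average)", 3.0),
                  ("Prorocentrum micans", 250.0)]:
    print(f"{label}: {pg} pg ~ {pg_to_kbp(pg) / 1e6:.1f} x 10^6 kbp")
```

Under this assumption, the ∼3 pg and 250 pg genomes come out close to the rounded 3×10⁶ and 245×10⁶ kbp values used throughout the text.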

Unfortunately, the infeasibility of sequencing these gigantic genomes with current technology has hindered progress in understanding dinoflagellate gene content. Next-generation technologies such as 454, Solexa, or SOLiD™ promise to reduce the enormous cost of sequencing a dinoflagellate genome. However, assembling the relatively short fragments remains an insurmountable challenge, especially because many dinoflagellate genes occur in numerous highly similar copies [16], [17]. It will therefore likely be some time before a dinoflagellate genome can be completely sequenced and accurately assembled to give a correct gene count. Any indirect approach that provides a gene content estimate is desirable at present.

Taking advantage of the rapidly growing genome sequence dataset, we analyzed the relationship between gene content and genome size in all sequenced life forms. We then used the resultant eukaryotic regression equations to estimate gene content for dinoflagellate genomes. In light of the high gene copy numbers reported for various dinoflagellates, implications of the high gene numbers and possible evolutionary mechanisms giving rise to the enormous genomes in this phylum are discussed.

Methods

Data collection

Data current as of February 2009 were retrieved from the Reference Sequence (RefSeq) collection at the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov), the Integrated Microbial Genomes (IMG) system at the DOE Joint Genome Institute (JGI; http://img.jgi.doe.gov), and peer-reviewed publications (Supplemental Table S1). The dataset included total number of nucleotide base pairs (i.e. genome size), number of protein-coding genes, total number of genes (including protein-coding, rRNA, and tRNA), and gene-coding percentage (percentage of DNA bases that code for genes in a genome) for 55 completely sequenced eukaryotic genomes and 1055 non-eukaryotic genomes including prokaryotes (478 from bacteria and 60 from archaea), viruses (260), and organelles (231 from mitochondria and 26 from chloroplasts). For gene-coding percentage, only data published in peer-reviewed articles were used in the analysis, because data from JGI included introns and other untranslated regions and significantly overestimated gene-coding percentage in large eukaryotic genomes (Supplemental Table S1). Incomplete or draft genome sequence data were excluded from this study to avoid potential errors.

Regression analyses and dinoflagellate gene content prediction

The genome size and gene number datasets were subjected to Shapiro-Wilk and Kolmogorov-Smirnov normality tests using SPSS 15. When normality was violated, data were log-transformed. Regression analyses for log-transformed protein-coding (or total) gene number (dependent variable) versus log-transformed genome size (independent variable) were conducted using linear, logarithmic, and power regression models in SPSS 15. The intention was to seek an overall correlation for all genomes and, if that failed, to seek separate correlations for separate groups of genomes (e.g. eukaryotes and others). The different regression models were compared based on significance level and R², and the best-fit model was selected. The established regression models were then used to predict dinoflagellate gene number based on documented genome size data (3–245×10⁶ kbp). Dinoflagellate gene-coding percentages were estimated using this formula: (total gene number × average gene length/genome size)×100%, where average gene length was approximated as 1.346 kbp, a value previously found to be highly conserved in eukaryotes [18].
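As a minimal sketch of the gene-coding percentage formula above, the calculation can be written in Python. The function and constant names are illustrative and are not part of the original analysis, which was carried out in SPSS. The example inputs are the total gene numbers and genome sizes reported in this paper for the smallest and largest dinoflagellate genomes.

```python
# Average eukaryotic gene length (kbp) used in the text, from reference [18].
AVG_GENE_LENGTH_KBP = 1.346


def gene_coding_percentage(total_genes, genome_size_kbp,
                           avg_gene_length_kbp=AVG_GENE_LENGTH_KBP):
    """Return (total gene number x average gene length / genome size) x 100%."""
    if genome_size_kbp <= 0:
        raise ValueError("genome size must be positive")
    return total_genes * avg_gene_length_kbp / genome_size_kbp * 100.0


# Total gene numbers and genome sizes (kbp) reported in this paper.
smallest = gene_coding_percentage(40_086, 3e6)
largest = gene_coding_percentage(92_013, 245e6)
print(f"{smallest:.2f}%  {largest:.2f}%")  # prints "1.80%  0.05%"
```

The two values reproduce the 1.80% and 0.05% gene-coding percentages quoted for the smallest and largest dinoflagellate genomes.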

Results

Distinct correlations between genome size and gene content for eukaryotes and non-eukaryotes

In the dataset we collected, the sequenced eukaryotic genomes ranged from 373 to 3,175,581 thousand base pairs (kbp) in size, while the genomes of non-eukaryotes (including bacteria, archaea, viruses, mitochondria, and chloroplasts) were substantially smaller, i.e., 2.4–9949.9 kbp (or kilobases in the case of single-stranded viral DNA or RNA) (Figure 1A). Correspondingly, total gene numbers were higher in eukaryotes than in non-eukaryotes (Figure 1A). The Shapiro-Wilk and Kolmogorov-Smirnov normality tests showed that the eukaryotic and non-eukaryotic genome sizes and total gene numbers were not normally distributed. Thus, log-transformed data were used in further analysis.

Figure 1. Genome sizes, protein-coding gene numbers, and gene-coding percentages of eukaryotic, bacterial, archaeal, viral, and organellar genomes.

(A) Genome size (shaded boxes) and number of protein-coding genes (open boxes). Total gene number is very close to protein-coding gene number and is not shown here. (B) Genome gene-coding percentage (fraction of DNA that constitutes genes). The lower and upper boundaries of the box indicate the first and third quartiles (or 25th and 75th percentiles) of each dataset, and the middle line in the box indicates the median value. The whiskers above and below the box indicate the 90th and 10th percentiles.

https://doi.org/10.1371/journal.pone.0006978.g001

When the log₁₀-transformed gene numbers were plotted against log₁₀ genome size, two distinct relationships appeared, one for eukaryotes and the other for non-eukaryotes, with markedly different slopes emerging from initial linear regressions (Figure 2A). Therefore, further multi-model analyses were performed separately for these two groups. For non-eukaryotes, the linear regression model gave the best fit (p<0.001, highest R²) among all the models examined (Table 1). For eukaryotes, the log₁₀-transformed data best fit a natural logarithmic (ln) regression model (Table 1, Figure 3). As the protein-coding gene number was generally very close to the total gene number in each genome, similar significant positive correlations were found for total gene numbers in both eukaryotic and non-eukaryotic genomes (Table 1), although only the protein-coding gene number is shown in the figures (Figure 2A, 3).

Figure 2. Distinct relationships between genome features in sequenced eukaryotes and non-eukaryotes.

All correlations were highly significant (p<0.001). (A) Protein-coding gene number vs. genome size regression lines on log scale. Separate regression lines were obtained for eukaryotes (blue circles) and the non-eukaryotes (prokaryotes, viruses, and organelles; other symbols). (B) Gene-coding percentage vs. genome size on log scale. Note the negative trend for the eukaryotic genomes. The projected gene-coding percentages for the smallest (Symbiodinium sp., 1.80%) and largest (Prorocentrum micans, 0.05%) dinoflagellate genomes, calculated based on the reported average eukaryotic gene length (1.346 kbp), are shown for comparison. The trend for the non-eukaryotes is almost horizontal except for the outliers from some organelles.

https://doi.org/10.1371/journal.pone.0006978.g002

Figure 3. Logarithmic regression model for log₁₀-transformed eukaryotic gene number (y′) versus log₁₀-transformed genome size (x′).

Range of dinoflagellate genome size (3×10⁶–245×10⁶ kbp) is indicated by the shaded areas. The predicted gene numbers for the recognized smallest (38,188) and largest (87,688) dinoflagellate genomes correspond to their gene-coding percentages shown in Fig. 2B.

https://doi.org/10.1371/journal.pone.0006978.g003

Table 1. Summary of regression models with best fit models for each group italicized.

https://doi.org/10.1371/journal.pone.0006978.t001

In contrast, the gene-coding fraction of the genome, i.e., gene-coding percentage, showed a different trend against genome size than gene number did (Figure 1B, 2B). In eukaryotes, the gene-coding percentage declined from 81.6% to 1.2% as the genome size increased (Figure 2B, Supplemental Table S1). The gene-coding percentage in non-eukaryotes was generally higher (97%–47%) and varied markedly less with genome size (Figure 1B, 2B) than in eukaryotes. The only exceptions were the organellar genomes, which exhibited a substantially lower gene-coding percentage than prokaryotes and viruses, indicating disproportionate loss of coding sequences during organellar genome reduction.

Dinoflagellate gene content estimation

The high R² and low p values (<0.001) in the log₁₀ gene number versus log₁₀ genome size regression models (Table 1) suggested that the empirically derived correlations were highly significant and could be used to make valid predictions of gene numbers. As the smallest recognized dinoflagellate genome (3×10⁶ kbp, in Symbiodinium spp.) falls within the range of genome sizes used to derive the eukaryotic correlation, the regression equation can be applied directly, which gave 38,188 protein-coding (40,086 total) genes per genome. For the largest documented dinoflagellate genome (245×10⁶ kbp, in P. micans), the empirical regression equation needed to be extrapolated under the assumption that the same correlation holds for larger genomes. The resulting estimate was 87,688 protein-coding (92,013 total) genes (Figure 3). Based on the previously reported average eukaryotic gene length of 1.346 kbp [18], these gene number estimates corresponded to gene-coding percentages of 1.80% and 0.05% for the smallest and the largest dinoflagellate genomes, respectively (Figure 2B).
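The way the two regression equations are applied can be sketched in Python using the rounded coefficients quoted in the abstract (Y′ = ln(-46.200+22.678X′) for eukaryotes and Y′ = 0.045+0.977X′ for non-eukaryotes). The function names are illustrative, and the full-precision coefficients from Table 1 are not reproduced here, so this is an approximation of the published calculation rather than a reproduction of it.

```python
import math


def predict_eukaryote_genes(genome_size_kbp, a=-46.200, b=22.678):
    """Y' = ln(a + b*X'), with X' = log10(genome size in kbp), Y' = log10(genes).

    The default coefficients are the rounded values quoted in the abstract.
    """
    x = math.log10(genome_size_kbp)
    inner = a + b * x
    if inner <= 0:
        raise ValueError("genome size is below the range where the model is defined")
    return 10 ** math.log(inner)  # math.log is the natural logarithm


def predict_non_eukaryote_genes(genome_size_kbp, a=0.045, b=0.977):
    """Y' = a + b*X' on the same log10-log10 scale."""
    return 10 ** (a + b * math.log10(genome_size_kbp))


# Smallest and largest documented dinoflagellate genome sizes (kbp).
for size_kbp in (3e6, 245e6):
    genes = predict_eukaryote_genes(size_kbp)
    print(f"{size_kbp:.3g} kbp -> ~{genes:,.0f} protein-coding genes")
```

Because the coefficients are rounded and the prediction passes through two exponentiations, the output lands within roughly 10% of the published estimates (38,188 and 87,688) rather than exactly on them.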

Discussion

Distinction and robustness of regression models

Statistical analyses of up-to-date sequenced genome data show the lack of a universal correlation covering all life forms, in agreement with previous studies [1]–[3]. Our results further present evidence, for the first time, of an overall correlation in eukaryotic genomes between log₁₀ gene number and log₁₀ genome size. The best-fit regression model for log₁₀-transformed eukaryote data is a natural logarithmic (ln) function and that for log₁₀-transformed non-eukaryote data is a linear function, two distinct relationships. This indicates that as genome size increases the number of genes increases at a disproportionately slower rate in eukaryotes than in non-eukaryotes. In other words, the proportion of non-coding DNA increases with genome size faster in eukaryotes than in non-eukaryotes. This is consistent with previous findings that the vast majority of nuclear DNA in eukaryotes consists of non-gene-coding elements including introns, pseudogenes, and transposable elements, whereas prokaryotic, viral, and organellar genomes are mostly composed of gene-coding sequences [1], [3].

The smallest eukaryotic genomes collected in this study are from the nucleomorphs of Bigelowiella natans (373 kbp), Guillardia theta (551 kbp), and Hemiselmis andersenii (572 kbp), followed by the parasitic fungus Encephalitozoon cuniculi (2,500 kbp). Their gene numbers and genome sizes are comparable to those of some bacteria (Figure 2). The nucleomorph is a remnant nucleus of the secondary endosymbiont that has evolved into a chloroplast in the host cryptophyte and chlorarachniophyte algae [19]. While the counterparts in other lineages of algae have been completely lost, nucleomorphs in these two lineages remain, but their genomes have been remarkably reduced in size. For E. cuniculi, the small genome may be a result of selection for a minimal genome size during the evolution of parasitism. Gene numbers of these small eukaryotic genomes also appear to fit on the non-eukaryotic regression lines (Figure 2A), suggesting that nuclear genome reduction during chloroplast and parasitism evolution has resulted in elevated gene density. This is the reverse of genome expansion that results from a disproportionate increase of non-gene-coding DNA [1], [3]. The two largest eukaryotic genomes analyzed were about 3,175,581 kbp in the primate Pan troglodytes and 3,080,436 kbp in humans, the former 8,514 times larger than the smallest (B. natans nucleomorph). Genome sequencing has probably been biased toward relatively small genomes, as indicated by the limited number of sequenced genomes larger than that of humans; however, the current dataset covers wide genome size, phylogenetic, and ecological ranges. The high statistical significance and R² value of the log₁₀ gene number-log₁₀ genome size correlation derived from this dataset suggest that the resultant regression equation should provide reliable predictions of gene numbers for many species.

Predicting power of the eukaryotic regression model for dinoflagellate genomes

A question about applying the eukaryotic regression model to dinoflagellate genomes stems from potential effects of the distinct dinoflagellate genome organization on the log₁₀ gene number-log₁₀ genome size correlation. Unique among eukaryotes, dinoflagellate genomes have a few to over 200 chromosomes, which are permanently condensed and not organized by nucleosomes [20]. The condensed chromosomes show a striated banding pattern under the electron microscope that results from a cholesteric liquid-crystalline organization of the DNA, formed by stacked disks of parallel bundles of DNA filaments that make a continuous left-handed twist along the chromosome's longitudinal axis [21]. Histone-like basic DNA-binding proteins are probably involved in stabilizing this structure by neutralizing local electronegative charges that would result from tightly compacted DNA filaments [22]. While most of this DNA is believed to be transcriptionally inactive, at the periphery of these disks are loops of DNA that are less tightly compacted and actively transcribed [23], [24]. As mentioned earlier, most of the dinoflagellate genes studied so far are organized in tandem repeats, an arrangement not commonly seen in other eukaryotes. Dinoflagellate genomes also host complex molecular machinery for mRNA editing [25] and spliced leader (SL) trans-splicing [26 and references therein].

While no information is available to show whether these genomic features alter the log₁₀ gene number-log₁₀ genome size relationship, an examination of organisms sharing similar genomic features may provide some clues. Genomes of the kinetoplastids, which are phylogenetically distinct from dinoflagellates, share with dinoflagellates many of the unique genomic features, such as permanently condensed chromosomes, tandem-repeat gene organization, mRNA editing, and SL trans-splicing of transcripts [27]. Genomes of two kinetoplastid species, Leishmania major (32,800 kbp) and Trypanosoma brucei (26,000 kbp), have been sequenced, but the data were not used in the regression analyses because the sequence annotation had not been finished at the time of our data collection. The total gene numbers based on the draft genome sequences are 9,183 for L. major and 9,068 for T. brucei [28], [29], which are similar to what our eukaryotic regression model predicts (10,301 and 9,346, respectively). This comparison indicates that the unique genome structures in this lineage do not cause significant deviation from the eukaryotic log₁₀ gene number-log₁₀ genome size relationship we have derived. It suggests that the relationship very likely holds for dinoflagellate genomes, particularly those of Symbiodinium spp. (∼3×10⁶ kbp), which are within the genome size range sampled in this study. The genomes of Symbiodinium spp. and some other modern dinoflagellates have been shown to be haploid [30]–[35]. If polyploidy occurs in some dinoflagellates and accounts for their large nuclear genomes (see next section), gene contents in these species can in practice also be estimated from their factored-down “haploid” genome sizes (if ≤3×10⁶ kbp) using the regression equation developed here, and the gene number estimate can then be factored up to the actual genome size. 
The equation can also be used to estimate the gene numbers for dinoflagellates that have smaller genomes than Symbiodinium spp. but are yet to be identified [5].
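To make the kinetoplastid comparison above concrete, the gap between the model-predicted and draft-annotation gene numbers can be expressed as a percent deviation. The sketch below uses only the four values quoted in the text; the function name and data layout are illustrative.

```python
# Observed total gene numbers from the draft genomes [28], [29] and the
# model predictions quoted in the text.
kinetoplastids = {
    "Leishmania major": {"observed": 9_183, "predicted": 10_301},
    "Trypanosoma brucei": {"observed": 9_068, "predicted": 9_346},
}


def percent_deviation(predicted, observed):
    """Signed deviation of the prediction from the observation, in percent."""
    return (predicted - observed) / observed * 100.0


for species, counts in kinetoplastids.items():
    deviation = percent_deviation(counts["predicted"], counts["observed"])
    print(f"{species}: {deviation:+.1f}%")
```

On these numbers the model overestimates by about 12% for L. major and about 3% for T. brucei.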

Extrapolation of the regression model to genomes larger than those sampled carries a risk of overestimating or underestimating gene numbers, because the trend of the regression may shift for large genomes like those of dinoflagellates. However, compared to a linear regression, the logarithmic regression we derived for eukaryotes inherently predicts a slower increase of gene number, and hence a progressively lower gene-coding percentage, as genome size increases. In fact, the predicted gene-coding percentages for the smallest and the largest dinoflagellate genomes, 1.80% and 0.05% respectively, are remarkably lower than those for most other eukaryotes (1%–82%). Therefore, further leveling off of the regression line is not very likely. A recent small-scale survey of the Heterocapsa triquetra nuclear genome [36] is worth noting. Out of a 230 kbp sequence analyzed, 89.5% was non-repeated sequence with no similarity to any known genes, but a 546-bp gene was identified. Applying this gene density of one per 230 kbp of DNA to the entire genome would yield about 91,500 genes for the 18.6–23.6×10⁶ (21.1×10⁶ on average) kbp H. triquetra nuclear genome. Alternatively, if we assume that the gene-coding percentage of this 230-kbp DNA (0.2%) and the previously reported eukaryotic average gene length (1.346 kbp) apply to this genome, the gene number would be 31,352. Our model-predicted gene number of 60,128 for this species lies in the middle of the two extremes. Therefore, it seems unlikely that the eukaryote regression model we derived will seriously, if at all, overestimate gene numbers for large dinoflagellate genomes.
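The two bounding estimates for H. triquetra can be reproduced with a few lines of arithmetic. The sketch below uses only values stated in the text (average genome size, surveyed sequence length, rounded gene-coding percentage, and average gene length); the variable names are illustrative. The density-based figure comes out on the order of 9×10⁴ genes, in line with the "about 91,500" quoted above.

```python
GENOME_KBP = 21.1e6          # average H. triquetra nuclear genome size from the text
SURVEYED_KBP = 230.0         # length of sequence surveyed in [36]
AVG_GENE_LENGTH_KBP = 1.346  # average eukaryotic gene length [18]
CODING_FRACTION = 0.002      # 0.2% gene-coding percentage, as rounded in the text
MODEL_PREDICTION = 60_128    # model-predicted gene number quoted in the text

# Upper bound: one gene per 230 kbp applied to the whole genome.
upper = GENOME_KBP / SURVEYED_KBP

# Lower bound: coding DNA (0.2% of the genome) divided by average gene length.
lower = GENOME_KBP * CODING_FRACTION / AVG_GENE_LENGTH_KBP

print(f"lower ~{lower:,.0f}, model {MODEL_PREDICTION:,}, upper ~{upper:,.0f}")
print("model within bounds:", lower < MODEL_PREDICTION < upper)
```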

Dinoflagellate gene contents and their implications in genome evolution

While all the available information points to reasonable accuracy, or at least no overestimation, the model-predicted gene numbers for dinoflagellates (38,188–87,688, or about 1–3 fold as many as in the human genome) are exceedingly high for these unicellular and therefore relatively “simple” organisms. However, these gene number estimates may not really represent an extraordinarily high functional diversity of the encoded proteome. A survey of the literature reveals that previously examined dinoflagellate genes occur in 30–5,000 copies per genome (Table 2), indicating that high gene copy number is a widespread phenomenon in dinoflagellate genomes. The sequences of these gene copies may be identical in some cases, such as the rRNA locus, but differ slightly from each other in most cases. Regardless, the widespread gene duplicates may offset the high total protein-coding gene numbers, giving a reasonable number of unique genes compared to what is expected of a typical unicellular eukaryote.

Table 2. Dinoflagellate gene copy numbers documented to date.

https://doi.org/10.1371/journal.pone.0006978.t002

While few genomic data are available to support this proposition, some insights can be obtained from EST data that have been generated for several dinoflagellate species. Typically in these studies EST sequences in each species were clustered at an identity cutoff around 95%, which is expected to group cDNA copies into unique (or semi-unique) transcripts. In Alexandrium tamarense (genome size 200×10⁶ kbp), 6,723 unique transcripts were identified out of an 11,171-EST dataset [37]; in Heterocapsa triquetra (about 20×10⁶ kbp), 2,022 unique clusters were assembled out of 6,765 sequenced ESTs [38]; in Karenia brevis (about 100×10⁶ kbp), 11,937 unique out of 25,000 ESTs [39]; in Karlodinium veneficum (formerly K. micrum; 5×10⁶ kbp), 11,903 unique out of 16,544 [40]; in Oxyrrhis marina (genome size unknown), 9,876 unique out of 18,012 [41]. True unique-gene numbers of these species are likely higher than these unique-transcript numbers because an EST dataset does not include genes not expressed at the time of sampling. Furthermore, as the sequencing scales in these projects were relatively small, the data likely account for only a fraction of the expressed gene pool, missing genes expressed at lower levels. Nevertheless, these incomplete EST data reveal a minimum of nearly 12,000 unique genes even for the relatively small dinoflagellate genome of Karlodinium veneficum (∼5×10⁶ kbp). In this case, if the average gene copy number is 3, the 42,770 protein-coding genes predicted by our regression model would represent a collection of 14,257 unique genes, a number close to the EST-based unique gene estimate (>12,000).
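The copy-number argument above amounts to dividing the predicted gene number by an assumed average copy number. A short sketch, using only the values quoted in the text (the function name is illustrative, and the average copy number of 3 is the text's own hypothetical):

```python
PREDICTED_GENES = 42_770  # model-predicted protein-coding genes quoted in the text
EST_UNIQUE = 11_903       # unique EST clusters reported in [40]


def unique_gene_estimate(predicted_genes, avg_copy_number):
    """Number of unique genes implied by a given average gene copy number."""
    if avg_copy_number <= 0:
        raise ValueError("average copy number must be positive")
    return predicted_genes / avg_copy_number


# Unique genes implied by an average copy number of 3.
print(round(unique_gene_estimate(PREDICTED_GENES, 3)))  # prints 14257

# Conversely, the average copy number that would make the prediction
# match the EST-based minimum exactly.
implied_copy_number = PREDICTED_GENES / EST_UNIQUE
print(f"{implied_copy_number:.1f}")  # prints 3.6
```

Since the EST-based count is a lower bound on unique genes, the implied copy number of about 3.6 is an upper bound under these assumptions.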

Many questions remain regarding dinoflagellate genome composition and its evolution. As the gene-coding percentage is very low, the large and widely ranging dinoflagellate genome sizes are clearly not due to the high gene numbers we predicted here. Non-coding DNA (e.g. repetitive sequences, introns, transposons) dominates the genomes, as in any large eukaryotic genome, as attested by the abundant transposable elements found in a small fraction of H. triquetra genomic DNA [36]. Rather, the high gene numbers, especially the high gene copy numbers, are likely the result of genome expansion. It is believed that dinoflagellate genomes have been subject to duplications of individual genes, segmental to whole genome duplication [5], [39], or combinations of these mechanisms. Tandem-repeated genes, like those that have been studied in dinoflagellates (Table 2), are more likely to have resulted from successive gene duplications through unequal crossing-over of chromosomes [16]. In addition, it is possible that dinoflagellate genomes can take up and incorporate cDNAs, resulting in multiplication of genes such as that coding for SL [42]. However, location of gene copies on separate chromosomes is evident at least in the case of Rubisco in Prorocentrum minimum, suggesting possible duplication at the chromosomal level or higher [16]. Whole genome duplications by autopolyploidy or allopolyploidy events are the most efficient mechanism to introduce extra genetic material and significantly expand genomes [43], and have been well documented for animals, plants, and microbial eukaryotes such as the budding yeast Saccharomyces cerevisiae and the ciliate Paramecium tetraurelia [44]–[47]. Given the widespread gene repetition in dinoflagellates, genome duplication is very possible. In fact, ancient polyploidy has been suggested as a mechanism of speciation in the dinoflagellate Heterocapsa pygmaea [48]. 
Because most gene duplicates are usually eventually lost or diverge into different genes after genome duplication, the retention of numerous gene copies in dinoflagellates may indicate an evolutionary driving force associated with functional requirements imposed on dinoflagellates for adaptation to a wide range of habitats. In support of this, highly expressed genes tend to occur in tandem-repeated copies [16], [49]. The predicted high gene numbers can be a result of gene and genome duplication followed by differential gene loss and diversification. Ultimate verification of actual gene number, and of genome duplication as a potential causative mechanism, would require sequencing of one or more dinoflagellate genomes, which will also further validate the eukaryotic log gene number-log genome size correlation empirically derived in this study.

Supporting Information

Table S1.

Genome size, protein-coding gene number, total gene number, and gene-coding percentage for the sequenced genomes of eukaryotes, bacteria, archaea, viruses, mitochondria, and chloroplasts estimated based on genome sequences.

https://doi.org/10.1371/journal.pone.0006978.s001

(1.97 MB DOC)

Acknowledgments

We thank Xue Feng Liu for assistance in compiling an initial dataset. Comments from the two reviewers helped to improve the manuscript significantly.

Author Contributions

Conceived and designed the experiments: SL. Performed the experiments: YH. Analyzed the data: YH SL. Wrote the paper: YH SL.

References

1. Lynch M, Conery JS (2003) The origins of genome complexity. Science 302: 1401–1404.
2. Konstantinidis KT, Tiedje JM (2004) Trends between gene content and genome size in prokaryotic species with larger genomes. Proc Natl Acad Sci U S A 101: 3160–3165.
3. Gregory TR (2005) Synergy between sequence and size in large-scale genomics. Nature Rev Genet 6: 699–708.
4. Hackett JD, Anderson DM, Erdner DL, Bhattacharya D (2004) Dinoflagellates: a remarkable evolutionary experiment. Am J Bot 91: 1523–1534.
5. Lin S (2006) The smallest dinoflagellate genome is yet to be found: a comment on LaJeunesse, et al. “Symbiodinium (Pyrrhophyta) genome sizes (DNA content) are smallest among dinoflagellates”. J Phycol 42: 746–748.
6. LaJeunesse TC, Lambert G, Andersen RA, Coffroth MA, Galbraith DW (2005) Symbiodinium (Pyrrhophyta) genome sizes (DNA content) are smallest among dinoflagellates. J Phycol 41: 880–886.
7. Veldhuis MJW, Cucci TL, Sieracki ME (1997) Cellular DNA content of marine phytoplankton using two new fluorochromes: taxonomic and ecological implications. J Phycol 33: 527–541.
8. Holm-Hansen O (1969) Algae: amounts of DNA and organic carbon in single cells. Science 163: 87–88.
9. Rizzo PJ (1987) Biochemistry of the dinoflagellate nucleus. In: Taylor FJR, editor. The biology of dinoflagellates. Oxford: Blackwell Science Inc. pp. 143–173.
10. Gregory TR, Nicol JA, Tamm H, Kullman B, Kullman K, et al. (2007) Eukaryotic genome size databases. Nucleic Acids Res 35: D332–D338.
11. Allen JR, Roberts TM, Loeblich I, Alfred R, Klotz LC (1975) Characterization of the DNA from the dinoflagellate Crypthecodinium cohnii and implications for nuclear organization. Cell 6: 161–169.
12. Hinnebusch AG, Klotz LC, Immergut E, Loeblich ARI (1980) Deoxyribonucleic acid sequence organization in the genome of the dinoflagellate Crypthecodinium cohnii. Biochem 19: 1744–1755.
13. Steel RE (1980) Aspects of the composition and organization of dinoflagellate DNA [Ph.D. thesis]. New Haven: Yale University.
14. Anderson DM, Grabher A, Herzog M (1992) Separation of coding sequences from structural DNA in the dinoflagellate Crypthecodinium cohnii. Mol Mar Biol Biotechnol 2: 89–96.
15. Moreau H, Geraud ML, Bhaud Y, Soyer-Gobillard MO (1998) Cloning, characterization and chromosomal localization of a repeated sequence in Crypthecodinium cohnii, a marine dinoflagellate. Int Microbiol 1: 35–43.
16. Zhang H, Lin S (2003) Complex gene structure of the form II RUBISCO in the dinoflagellate Prorocentrum minimum (Dinophyceae). J Phycol 39: 1160–1171.
17. Zhang H, Hou Y, Lin S (2006) Isolation and characterization of proliferating cell nuclear antigen from the dinoflagellate Pfiesteria piscicida. J Euk Microbiol 53: 142–150.
18. Xu L, Chen H, Hu X, Zhang R, Zhang Z, et al. (2006) Average gene length is highly conserved in prokaryotes and eukaryotes and diverges only between the two kingdoms. Mol Biol Evol 23: 1107–1108.
19. Archibald J (2007) Nucleomorph genomes: structure, function, origin and evolution. BioEssays 29: 392–402.
20. Spector DL (1984) Dinoflagellate nuclei. In: Spector DL, editor. Dinoflagellates. New York: Academic Press. pp. 107–147.
21. Bouligand Y, Norris V (2001) Chromosome separation and segregation in dinoflagellates and bacteria may depend on liquid crystalline states. Biochimie 83: 187–192.
22. Chan YH, Wong JTY (2007) Concentration-dependent organization of DNA by the dinoflagellate histone-like protein HCc3. Nucleic Acids Res 35: 2573–2583.
23. Sigee DC (1984) Structural DNA and genetically active DNA in dinoflagellate chromosomes. Biosystems 16: 203–210.
24. Bhaud Y, Guillebault D, Lennon JF, Defacque H, Soyer-Gobillard MO, et al. (2000) Morphology and behaviour of dinoflagellate chromosomes during the cell cycle and mitosis. J Cell Sci 113: 1231–1239.
25. Lin S, Zhang H, Gray MW (2008) RNA editing in dinoflagellates and its implications for the evolutionary history of the editing machinery. In: Smith H, editor. RNA and DNA editing: molecular mechanisms and their integration into biological systems. Hoboken, NJ: John Wiley & Sons, Inc. pp. 280–309.
26. Zhang H, Campbell DA, Sturm NR, Lin S (2009) Dinoflagellate spliced leader RNA genes display a variety of sequences and genomic arrangements. Mol Biol Evol 26: 1757–1771.
27. Lukes J, Leander BS, Keeling PJ (2009) Cascades of convergent evolution: the corresponding evolutionary histories of euglenozoans and dinoflagellates. Proc Natl Acad Sci U S A 106: 9963–9970.
28. Ivens AC, Peacock CS, Worthey EA, Murphy L, Aggarwal G, et al. (2005) The genome of the kinetoplastid parasite, Leishmania major. Science 309: 436–442.
29. Berriman M, Ghedin E, Hertz-Fowler C, Blandin G, Renauld H, et al. (2005) The genome of the African trypanosome Trypanosoma brucei. Science 309: 416–422.
30. Rizzo PJ, Nooden LD (1973) Isolation and chemical composition of dinoflagellate nuclei. J Euk Microbiol 20: 666–672.
31. Roberts TM, Tuttle RC, Allen JR, Loeblich AR, Klotz LC (1974) New genetic and physicochemical data on structure of dinoflagellate chromosomes. Nature 248: 446–447.
32. Blank RJ (1987) Cell architecture of the dinoflagellate Symbiodinium sp. inhabiting the Hawaiian stony coral Montipora verrucosa. Mar Biol 94: 143–155.
33. Pfiester L, Anderson DM (1987) Dinoflagellate reproduction. In: Taylor FJR, editor. The Biology of dinoflagellates. Oxford: Blackwell Scientific Inc. pp. 611–648.
34. Coats DW (2002) Dinoflagellate life-cycle complexities. J Phycol 38: 417–419.
35. Santos SR, Coffroth MA (2003) Molecular genetic evidence that dinoflagellates belonging to the genus Symbiodinium Freudenthal are haploid. Biol Bull 204: 10–20.
36. McEwan M, Humayun R, Slamovits CH, Keeling PJ (2008) Nuclear genome sequence survey of the dinoflagellate Heterocapsa triquetra. J Euk Microbiol 55: 530–535.
37. Hackett JD, Scheetz TE, Yoon HS, Soares MB, Bonaldo MF, et al. (2005) Insights into a dinoflagellate genome through expressed sequence tag analysis. BMC Genomics 6:
38. Patron NJ, Waller RF, Archibald JM, Keeling PJ (2005) Complex protein targeting to dinoflagellate plastids. J Mol Biol 348: 1015–1024.
39. Van Dolah FM, Lidie KB, Monroe EA, Bhattacharya D, Campbell L, et al. (2009) The Florida red tide dinoflagellate Karenia brevis: new insights into cellular and molecular processes underlying bloom dynamics. Harmful Algae 8: 562–572.
40. Patron NJ, Waller RF, Keeling PJ (2006) A tertiary plastid uses genes from two endosymbionts. J Mol Biol 357: 1373–1382.
41. Slamovits CH, Keeling PJ (2008) Plastid-derived genes in the nonphotosynthetic alveolate Oxyrrhis marina. Mol Biol Evol 25: 1297–1306.
42. Slamovits CH, Keeling PJ (2008) Widespread recycling of processed cDNAs in dinoflagellates. Curr Biol 18: R550–R552.
43. Lynch M (2007) Genomic expansion by gene duplication. In: Lynch M, editor. The origins of genome architecture. Sunderland, MA: Sinauer Associates. pp. 193–235.
44. Wolfe KH, Shields DC (1997) Molecular evidence for an ancient duplication of the entire yeast genome. Nature 387: 708–713.
45. Ramsey J, Schemske DW (1998) Pathways, mechanisms, and rates of polyploid formation in flowering plants. Annu Rev Ecol Syst 29: 467–501.
46. Aury JM, Jaillon O, Duret L, Noel B, Jubin C, et al. (2006) Global trends of whole-genome duplications revealed by the ciliate Paramecium tetraurelia. Nature 444: 171–178.
47. Gregory TR, Mable BK (2005) Polyploidy in animals. In: Gregory TR, editor. The evolution of the genome. San Diego, CA: Elsevier. pp. 427–517.
48. Loeblich AR III, Schmidt RJ, Sherley JL (1981) Scanning electron microscopy of Heterocapsa pygmaea sp. nov., and evidence for polyploidy as a speciation mechanism in dinoflagellates. J Plankton Res 3: 67–79.
49. Bachvaroff TR, Place AR (2008) From stop to start: tandem gene arrangement, copy number and trans-splicing sites in the dinoflagellate Amphidinium carterae. PLoS ONE 3: e2929.
50. Salois P, Morse D (1997) Characterization and molecular phylogeny of a protein kinase cDNA from the dinoflagellate Gonyaulax (Dinophyceae). J Phycol 33: 1063–1072.
51. Liu LY, Hastings JW (2006) Novel and rapidly diverging intergenic sequences between tandem repeats of the luciferase genes in seven dinoflagellate species. J Phycol 42: 96–103.
52. Lee D, Mittag M, Sczekan S, Morse D, Hastings JW (1993) Molecular cloning and genomic organization of a gene for luciferin-binding protein from the dinoflagellate Gonyaulax polyedra. J Biol Chem 268: 8842–8850.
53. Bertomeu T, Morse D (2004) Isolation of a dinoflagellate mitotic cyclin by functional complementation in yeast. Biochem Biophys Res Commun 323: 1172–1183.
54. Le QH, Markovic P, Hastings JW, Jovine RVM, Morse D (1997) Structure and organization of the peridinin chlorophyll a binding protein gene in Gonyaulax polyedra. Mol General Genet 255: 595–604.
55. Reichman J, Wilcox T, Vize P (2003) PCP gene family in Symbiodinium from Hippopus hippopus: low level of concerted evolution, isoform diversity and spectral tuning of chromophores. Mol Biol Evol 20: 2143–2154.


