Figure 1.
BAC sequences (103) and fosmid sequences (90,954) were analyzed for tandem repeats (TRF), interspersed repeats by homology (CENSOR against CPRD), and interspersed repeats via de novo methods (REPET). Details of the annotation process are also shown.
Table 1.
BAC and Fosmid Sequence Set Summary.
Table 2.
Summary of tandem repeats from BAC and fosmid sequences.
Table 3.
Most frequent periods for three categories of tandem repeats in conifer genomic sequence.
Table 4.
Summary of full-length repetitive content.
Figure 2.
Microsatellite density across multiple species.
Cross-species comparison of microsatellites ranging from dinucleotide to octanucleotide, as calculated by TRF (microsatellites/Mbp). Analysis included two gymnosperm BAC sets (Picea glauca, Taxus mairei) and four angiosperm genomes (Cucumis sativus, Arabidopsis thaliana, Vitis vinifera, Populus trichocarpa).
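The density unit used in Figure 2 (microsatellites per Mbp) can be illustrated with a minimal sketch. It assumes the tandem repeat calls have already been tallied by period; the function name and the input dictionary are hypothetical and are not part of TRF or of the original pipeline.

```python
def microsatellite_density(period_counts, total_bp):
    """Return microsatellites per Mbp for periods 2 through 8.

    period_counts: hypothetical dict mapping repeat period (bp) to the
                   number of microsatellite loci found with that period.
    total_bp:      total length of the analyzed sequence set in base pairs.
    """
    if total_bp <= 0:
        raise ValueError("total_bp must be positive")
    mbp = total_bp / 1_000_000
    # Dinucleotide (2) through octanucleotide (8); other periods are ignored.
    return {p: period_counts.get(p, 0) / mbp for p in range(2, 9)}


# Illustrative numbers only, not values from the study.
density = microsatellite_density({2: 50, 3: 10, 9: 5}, 2_000_000)
```

Normalizing by sequence length in this way is what makes the BAC sets and the full angiosperm genomes comparable despite their very different sizes.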
Figure 3.
Distribution of homology-based repeat annotations by species.
Interspersed repeats were analyzed via a redundant similarity search (CENSOR against CPRD). The percentage in each sector represents base pair coverage over the redundant annotations. (A) Displays species coverage for full-length and partial elements. Species contributing less than 3% were categorized as ‘Other’. (B) Displays species coverage for full-length elements only.
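The sector percentages in Figure 3 can be sketched as follows, assuming annotated base pairs have already been summed per source species. The function name and input structure are hypothetical; only the 3% cutoff comes from the caption.

```python
def species_coverage_shares(bp_by_species, threshold_pct=3.0):
    """Convert per-species base pair coverage into percentage shares.

    Species whose share falls below threshold_pct are pooled into 'Other'.
    bp_by_species is a hypothetical dict of species name -> annotated bp.
    """
    total = sum(bp_by_species.values())
    if total <= 0:
        raise ValueError("total coverage must be positive")

    shares = {}
    other = 0.0
    for species, bp in bp_by_species.items():
        pct = 100.0 * bp / total
        if pct < threshold_pct:
            other += pct          # pool minor contributors
        else:
            shares[species] = pct
    if other > 0:
        shares["Other"] = other
    return shares


# Illustrative numbers only, not values from the study.
shares = species_coverage_shares({"species_a": 970, "species_b": 20, "species_c": 10})
```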
Figure 4.
Distribution of transposable elements from similarity search.
A combination of the non-redundant CENSOR results from the BAC sequences (103) and fosmid sequences (90,954) was used to ascertain the major contributing classes of TEs. (A) Compares partial and full-length TE content by homology against other species. (B) Examines the full-length TE content in loblolly pine annotated in homology-based and de novo searches.
Table 5.
Filtered (full-length) vs. Unfiltered (partial and full-length) repetitive content estimates.
Figure 5.
Genomic sequence represented by the highest coverage elements.
Base pair coverage attributed to copies of the high coverage LTR TEs.
Table 6.
High coverage LTR families identified with the de novo methodology.
Figure 6.
Annotated high copy LTR repeat families.
Multiple alignments of the top ten high-coverage and novel elements were performed using MUSCLE and visualized in Jalview. The final consensus sequence was exported with substitutions resolved, annotated (LTRdigest), and visualized (AnnotationSketch). (A) Multiple sequence alignment of the 24 sequences in the representative cluster of the PtOuachita family. (B) Multiple sequence alignment of the 67 sequences in the representative cluster of the PtAppalachian family. (C) Multiple sequence alignment of the 68 sequences in the representative cluster of the PtPineywoods family.