Figure 1.
Genomic contig sizes based on various assembly strategies.
Frequency histograms of contig sizes for (A) the Sage-Grouse de novo assembly, (B) the Clark's Nutcracker de novo assembly, (C) the Sage-Grouse reference-guided assembly (1x read coverage) split at (N)500 motifs, and (D) the Clark's Nutcracker reference-guided assembly (1x read coverage) split at (N)500 motifs.
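As an illustration of what splitting at (N)500 motifs could look like, the following is a minimal sketch and not the authors' pipeline. It assumes that an (N)500 motif means a run of at least 500 consecutive N characters, and the function names `split_at_n_runs` and `contig_lengths` are hypothetical.

```python
import re


def split_at_n_runs(sequence, min_run=500):
    """Split one scaffold into contigs at runs of >= min_run consecutive Ns.

    Assumption: an "(N)500 motif" is a gap of at least 500 N characters.
    Shorter N runs are left inside the contig.
    """
    pattern = re.compile(r"[Nn]{%d,}" % min_run)
    # Drop empty pieces that arise when a scaffold starts or ends with a gap.
    return [piece for piece in pattern.split(sequence) if piece]


def contig_lengths(scaffolds, min_run=500):
    """Collect the lengths of all contigs produced from a list of scaffolds."""
    lengths = []
    for scaffold in scaffolds:
        lengths.extend(len(c) for c in split_at_n_runs(scaffold, min_run))
    return lengths
```

The resulting list of lengths is the kind of input a contig size frequency histogram would be built from.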
Figure 2.
Comparison of N50 scaffold length and total assembly length for various assemblies.
Histograms of the N50 scaffold length for new bird genomes with (N)500 motifs removed and total genome sizes for guide genomes. (A) N50 contig length for the Chicken reference genome, the de novo Sage-Grouse genome, and each of the guided assembly genomes. (B) N50 scaffold length for the Zebra Finch reference genome, the de novo Clark's Nutcracker genome, and each of the Clark's Nutcracker guided assembly genomes. Note that the y-axis scales differ between panels A and B. (C) Total genome sizes for the Chicken reference genome, de novo Sage-Grouse, and three guided Sage-Grouse genomes at different read coverage levels. (D) Total genome sizes for the Zebra Finch reference, de novo Clark's Nutcracker, and three guided Clark's Nutcracker genomes at different read coverage levels.
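For readers unfamiliar with the N50 statistic, it can be sketched as follows. This uses the common definition (the length L such that sequences of length L or longer contain at least half of the total assembled bases). The source does not state which software computed the values, so the function name `n50` is hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Common definition: sort lengths from longest to shortest and return the
    length at which the running total first reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
```

Applied to the contig lengths of an assembly, this gives one value per assembly, matching the one-bar-per-assembly layout described for panels A and B.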
Table 1.
Summary of raw genome sequence data used.
Table 2.
Summary of genome assembly statistics from various assembly approaches.
Figure 3.
Comparison of Core Eukaryotic Genes identified in various new and reference genome assemblies.
Histograms of the number of complete and partial ultraconserved CEGs obtained from the CEGMA pipeline. The maximum number of CEGs is 248. (A) The de novo assembly and three guided genome assemblies for the Sage-Grouse at different read depth thresholds, plus the guide genome (Chicken). (B) The de novo assembly and three guided genome assemblies for the Clark's Nutcracker at different read depth thresholds, plus the guide genome (Zebra Finch).
Figure 4.
Percent of the genome identified as repetitive elements by RepeatMasker.
Histograms of percent repetitive content for all assemblies and the reference genomes of both species. Repetitive content was estimated using RepeatMasker.
Figure 5.
Genomic simple sequence repeat (SSR) density in raw reads and various genome assemblies.
Histograms of simple sequence repeat (SSR) density for raw sequence reads, each of the assembled genomes, and the reference genomes for (A) the Sage-Grouse and (B) the Clark's Nutcracker. Density for each motif length is the number of motif loci per Mb.
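The density measure described here (loci per Mb for each motif length) amounts to a simple normalization, sketched below. The function name `ssr_density_per_mb` and the input layout are assumptions for illustration. The sketch does not detect SSR loci and takes the per-motif-length counts as given.

```python
def ssr_density_per_mb(loci_by_motif_length, total_bp):
    """Convert SSR locus counts into loci per megabase of sequence.

    loci_by_motif_length: dict mapping motif length (e.g. 2 for dinucleotide)
        to the number of SSR loci found with that motif length.
    total_bp: total number of bases searched (reads or assembly).
    """
    if total_bp <= 0:
        raise ValueError("total_bp must be positive")
    megabases = total_bp / 1_000_000
    return {
        motif_length: count / megabases
        for motif_length, count in loci_by_motif_length.items()
    }
```

Normalizing by total bases searched is what allows raw reads, draft assemblies, and reference genomes of different sizes to be placed on the same axis.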
Figure 6.
Genomic simple sequence repeat (SSR) density across select amniote vertebrate genomes.
Histograms of SSR density for each de novo assembly and its respective reference genome, and for the Turkey (Meleagris gallopavo) and the Anolis Lizard (Anolis carolinensis) genome assemblies. Density for each motif length is the number of motif loci per Mb.
Figure 7.
Genomic GC isochore structure among amniote vertebrates, and in draft genomes.
(A) GC isochore structure plot of the 1x guided assemblies for both bird species, their reference genomes, and other select amniote vertebrate genomes using a 3 kb window size. (B) GC isochore structure plot comparing the 1x Clark's Nutcracker guided assembly with the Chicken reference genome. For the Clark's Nutcracker, all contigs were used at both the 3,000 bp and 5,000 bp window sizes (n = 73,158 and n = 30,090 contigs, respectively). In the comparison, either all contigs (referred to as “all” in the figure) or a random selection equal to the number of contigs in the Clark's Nutcracker assembly (“limited”) were used at both the 3,000 and 5,000 bp window sizes.
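The windowed GC measurement underlying a plot like this can be sketched as follows. This is an illustration under stated assumptions, not the authors' method. It assumes non-overlapping fixed-size windows, drops contigs (and trailing fragments) shorter than the window, and counts ambiguous bases such as N as non-GC. The function name `window_gc_fractions` is hypothetical.

```python
def window_gc_fractions(sequence, window=3000):
    """Return the GC fraction of each full, non-overlapping window.

    Assumptions: windows do not overlap, a trailing fragment shorter than
    `window` is discarded, and any base other than G or C (including N)
    counts as non-GC.
    """
    seq = sequence.upper()
    fractions = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        gc_count = chunk.count("G") + chunk.count("C")
        fractions.append(gc_count / window)
    return fractions
```

Under these assumptions, a contig shorter than the window contributes no values, which would be consistent with fewer contigs being usable at a 5,000 bp window than at a 3,000 bp window.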
Figure 8.
Heterozygous variant composition for the Sage-Grouse and the Clark's Nutcracker.
The pie chart includes Single Nucleotide Variants (SNV), Multiple Nucleotide Variants (MNV), Insertions, Deletions, and Replacements. SNVs are further annotated in bar graph form according to all possible transitions. The key provides color-coding for each variant.
Figure 9.
Estimated divergence times among birds, including focal and reference genome species.
Bayesian relaxed clock estimate of divergence times among several bird lineages based on 12 mitochondrial protein-coding genes, with 95% credibility intervals shown as shaded bars at nodes. Dark arrows represent calibration points used in the analysis.