Figure 1.
Long-read sequences map accurately to the chicken genome.
This UCSC Genome Browser image shows one example of a long-read alignment alongside the corresponding short-read data for the same region, together with the existing RefSeq and Ensembl annotations.
Figure 2.
Broad coverage of existing chicken genes using long-read sequencing.
The x-axis represents transcript length. The grey histogram shows the number of RefSeq transcripts within each range of lengths. Similar histograms are shown for the long-read transcripts that overlap a RefSeq annotation by any amount (red) or by more than 90% of the gene length (dark red).
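The two overlap categories in this figure can be illustrated with a minimal sketch. It assumes each transcript is reduced to a single half-open genomic interval on the same chromosome and strand, which ignores exon structure. The function names and the way the 90% threshold is applied are illustrative assumptions, not the procedure used to produce the figure.

```python
def overlap_fraction(annotation, read):
    """Fraction of the annotated interval covered by one long-read interval.

    Both arguments are (start, end) tuples, half-open, and are assumed
    to lie on the same chromosome and strand.
    """
    a_start, a_end = annotation
    r_start, r_end = read
    overlap = max(0, min(a_end, r_end) - max(a_start, r_start))
    return overlap / (a_end - a_start)


def overlap_categories(annotation, reads, threshold=0.9):
    """Return (overlaps_by_any_amount, overlaps_by_more_than_threshold).

    Uses the best-covering single read; reads are not merged (assumption).
    """
    best = max((overlap_fraction(annotation, r) for r in reads), default=0.0)
    return best > 0, best > threshold


# Hypothetical 1,000 bp annotation and three hypothetical long reads.
annotation = (100, 1100)
print(overlap_categories(annotation, [(50, 1050)]))    # covers 95% of the annotation
print(overlap_categories(annotation, [(900, 2000)]))   # covers 20% of the annotation
print(overlap_categories(annotation, [(2000, 3000)]))  # no overlap
```

Counting annotations in each category per length bin would then give histograms of the kind described, under the stated simplifications.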
Table 1.
Bulk gene-level analysis.
Figure 3.
Validation of mapped long-read splice sites.
TopHat2 was used to identify observed splice junctions from the short-read data. The light red and light blue lines show the distribution of distances from Ensembl-annotated splice sites to the experimentally observed splice sites. Both lines peak sharply at 0, indicating the degree of agreement between these orthogonal datasets. Splice sites annotated from long-read sequencing (blue and red) also show overall agreement, with a small peak of misidentified splice donor sites within 10 bp of the correct site, possibly due to alignment errors near positions with multiple candidate splice donor sites.
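The distance distributions in this figure can be sketched as follows: for each annotated splice site, find the nearest experimentally observed site and record the signed offset in bp. This is a minimal illustration that assumes sites are integer coordinates on one chromosome and strand, and that donor and acceptor sites are processed separately. The function names and the sign convention (observed minus annotated) are assumptions, not details taken from the analysis.

```python
from bisect import bisect_left
from collections import Counter


def nearest_offset(site, observed_sorted):
    """Signed distance (observed - annotated) to the closest observed site.

    `observed_sorted` must be a non-empty, ascending list of coordinates.
    """
    i = bisect_left(observed_sorted, site)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = observed_sorted[max(0, i - 1): i + 1]
    return min((obs - site for obs in candidates), key=abs)


def offset_distribution(annotated, observed):
    """Count how many annotated sites fall at each offset from an observed site."""
    observed_sorted = sorted(observed)
    return Counter(nearest_offset(site, observed_sorted) for site in annotated)


# Hypothetical coordinates, for illustration only.
observed = [100, 250, 400]
annotated = [100, 245, 410, 400]
distribution = offset_distribution(annotated, observed)
for offset in sorted(distribution):
    print(offset, distribution[offset])
```

A sharp peak at offset 0 corresponds to exact agreement between annotated and observed sites, while counts at small non-zero offsets would correspond to the secondary peak within 10 bp described above.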
Figure 4.
Identification of new isoforms and genes.
This UCSC Genome Browser image shows examples of the new genes and isoforms identified from the short- and long-read annotations. A. This region carries two distinct annotations: one for an alternate transcription start site (TSS) and another for a completely new gene that is currently unannotated. B. A heart-specific isoform of the FKBP7 gene. C. Both long-read and short-read data support the existence of transcripts running in opposite directions in this region of chromosome 9.
Table 2.
New isoforms and genes.
Figure 5.
New isoforms and genes with tissue-specific expression.
The relative expression of (a) the 2,018 differentially expressed isoforms and (b) the 120 new differentially expressed gene annotations is shown across the adult chicken brain, cerebellum, heart, kidney, liver and testes datasets, according to the scale provided.