Figure 1.
Effect of library concentration on CapFlank.
Four dilutions each of a Rattus exulans and a Mus fragilicauda Illumina genomic DNA library were prepared. The dilution series was enriched for the control region using a 1,040 bp mitochondrial DNA control region PCR product bait, sheared to an average size of 250 bp, generated from Mus caroli, Mus cervicolor, Mus cookii, Mus fragilicauda, Rattus norvegicus and Rattus exulans, and the reads were assembled to the Rattus exulans or Mus musculus mitochondrial genome reference (GenBank NC_012389.1, EF108336.1) for Rattus exulans and Mus fragilicauda, respectively. The standard dilution results (1.5 µg input library) are emphasized with a box around both the Rattus exulans and Mus fragilicauda results. The region covered by the bait is indicated as a magenta bar. Mapping results at depths over 10,000, 1,000, 500 and 50 per base are shown in orange, yellow, green and blue, respectively. The data demonstrate that the more dilute the input library, the lower the enrichment of sequences flanking the target.
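As an illustration only, the colour coding described above can be sketched as a lookup from per-base depth to depth class. The thresholds and colours are taken from the legend; the function name, the strict "greater than" comparison, and the choice to return no colour at or below 50 per base are assumptions, not the plotting code used for the figure.

```python
# Depth thresholds and colours as stated in the legend, highest first.
# DEPTH_CLASSES and depth_colour are hypothetical names used for illustration.
DEPTH_CLASSES = [
    (10_000, "orange"),
    (1_000, "yellow"),
    (500, "green"),
    (50, "blue"),
]


def depth_colour(depth):
    """Return the colour of the highest threshold that `depth` exceeds.

    Assumes "over N per base" means strictly greater than N, and returns
    None when the depth does not exceed the lowest threshold.
    """
    for threshold, colour in DEPTH_CLASSES:
        if depth > threshold:
            return colour
    return None
```

For example, `depth_colour(600)` falls in the green class under these assumptions.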
Table 1.
Hybridization Capture Results.
Figure 2.
Mitochondrial DNA coverage per base beyond bait sequences for six historical koala samples.
Reads mapping to the koala mitochondrial DNA genome (AB241053) are presented as per base coverage starting from the first base beyond the bait. The 5′ end of the bait is shown to the left of the x axis and the 3′ end to the right. Age and sample information for the six historical koala samples are given in Table S1. The average library insert size for all six libraries was 93 bp. No correlation between year of sample collection and extension beyond the bait end was observed. The results demonstrate that only limited flanking sequence was captured for the six historical samples tested.
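The per base coverage beyond the bait described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used for the figure: reads are given as 1-based inclusive (start, end) positions on the reference, only the side 3′ of the bait is counted, the reference is treated as linear (the circularity of the mitochondrial genome is ignored), and the function and parameter names are hypothetical.

```python
def coverage_beyond_bait(reads, bait_end, max_distance):
    """Per-base read depth for the bases immediately beyond the bait.

    reads        -- iterable of (start, end) mapped positions, 1-based inclusive
    bait_end     -- reference position of the last bait base (assumed 3' side)
    max_distance -- number of bases beyond the bait to report

    Returns a list in which element i is the depth at position
    bait_end + 1 + i, so element 0 is the first base beyond the bait.
    """
    coverage = [0] * max_distance
    first = bait_end + 1             # first base beyond the bait
    last = bait_end + max_distance   # last base reported
    for start, end in reads:
        # Clip each read to the reported window; reads that do not
        # reach past the bait end produce an empty range.
        lo = max(start, first)
        hi = min(end, last)
        for pos in range(lo, hi + 1):
            coverage[pos - first] += 1
    return coverage


# Toy example with three reads around a bait ending at position 100.
example = coverage_beyond_bait([(90, 105), (101, 103), (50, 99)], 100, 6)
```

With short inserts, such as the 93 bp average reported here, few reads can extend far past the bait end, which is consistent with coverage dropping off quickly in such a profile.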
Figure 3.
Testing the effects of homology on CapFlank.
Non-homologous amplicon libraries with or without M13 adaptors added (Figure S1), representing potato blight PiRXLRc, giant squid mtDNA ND4, pigeon mtDNA COI, and grape chloroplast rbcL sequences, were captured with blight (panel A) or pigeon (panel B) 200 bp baits. The captured libraries were then analysed by qPCR for the 250 bp amplicon product of each of the four amplicons. Captured libraries without M13 adaptors are shown in blue and those with M13 adaptors added in red; the results demonstrate an increase in enrichment with M13 adaptor addition.
Figure 4.
CapFlank can capture sequences thousands to millions of bases away from the bait region.
Five ca. 1 kb baits spaced approximately 1 Mb apart were used to capture E. coli strain 536 sequences from DNA extracted from human urine. Bait regions for three of the five baits employed, with 1 kb of sequence 5′ and 3′ of the bait, are shown on the left, and non-targeted bacterial regions are shown on the right. Positions of the baits and genes are indicated above the covered positions by name, with a black line covering the length of the bait and the ends marked with vertical black lines. The y axis represents per base coverage and the x axis represents position. As per base coverage varied, the scales in each panel are not identical. Relative positions of the targeted genes are shown in order along the bacterial genome, and the positions are shown within the triangles (green triangles, targeted regions; purple triangles, non-targeted regions). Within each graph, mapped reads for bacterial targeted hybridization capture and bacterial capture with KoRV are compared (blue and red lines, respectively).
Figure 5.
The bead-bound biotinylated baits are shown on the left of the figure as a grey circle (magnetic bead) bound to the biotinylated bait (black line attached to a red circle). Library molecules are displayed as insert (black) with library adaptors (orange). The end of the homology between bait and target is shown to represent the last nucleotides of a targeted region. Library molecules with homology to the unbound portion of bait-immobilized library molecules can hybridize to that unbound portion. The process is iterative: newly bound molecules hybridize to further library molecules whose unbound portions extend further still, so that flanking sequence becomes enriched by the growing contigs bound to the baits. Historic DNA differs in that the insert sizes are much shorter, so much less unbound homologous sequence is available for extending beyond the target region. The effects are shown for mitochondrial genomes, in which the targeted region (baits shown in red, with red lines delimiting the ends of the bait sequence) is highly enriched for both modern and historic DNA, and reads are enriched across the full mitochondrial genome, decreasing in number with distance from the bait, with a far faster decrease for historic DNA.