Low-cost cross-taxon enrichment of mitochondrial DNA using in-house synthesised RNA probes

Hybridization capture with in-solution oligonucleotide probes has quickly become the preferred method for enriching specific DNA loci from degraded or ancient samples prior to high-throughput sequencing (HTS). Several companies synthesize sets of probes for in-solution hybridization capture, but these commercial reagents are usually expensive. Methods for economical in-house probe synthesis have been described, but they do not directly address one of the major advantages of commercially synthesised probes: that probe sequences matching many species can be synthesised in parallel and pooled. The ability to make “phylogenetically diverse” probes increases the cost-effectiveness of commercial probe sets, as they can be used across multiple projects (or for projects involving multiple species). However, it is labour-intensive to replicate this with in-house methods, as template molecules must first be generated for each species of interest. While it has been observed that probes can be used to enrich for phylogenetically distant targets, the ability of this effect to compensate for the lack of phylogenetically diverse probes in in-house synthesised probe sets has not been tested. In this study, we present a refined protocol for in-house RNA probe synthesis and evaluate the ability of probes generated from a single species using this method to enrich for the target locus in phylogenetically distant species. We demonstrate that probes synthesized using long-range PCR products from a placental mammal mitochondrion (Bison spp.) can be used to enrich for mitochondrial DNA in birds and marsupials (but not plants). Importantly, our results were obtained for approximately a third of the cost of similar commercially available reagents.


Introduction
Hybridization capture and high-throughput sequencing (HTS) have become a powerful combination of tools for sequencing genomic targets from degraded samples such as ancient DNA (aDNA) [1,2]. Hybridization capture enriches loci of interest, which lowers the sequencing effort needed and thus reduces the costs of a study. Hybridization capture uses complementary oligonucleotide probes to bind and immobilize target DNA, allowing unwanted sequences to be washed away, and can be performed with the probes in solution or attached to a solid support. The combination of hybridization capture and HTS has allowed large genomic targets to be recovered from sub-fossil specimens, including a whole chromosome [3], millions of single nucleotide polymorphisms [2], and nuclear genomes [4]. However, commercial probe sets for enriching large genomic targets require considerable expenditure to purchase, and may be excessive for many projects (such as those targeting only mitochondrial genomes). While a few companies (e.g. Arbor Biosciences) will synthesize probes for smaller targets, these reagents can be expensive, especially for small-scale experiments.
To minimise the costs of hybridisation enrichment experiments for studies of small targets, attempts have been made to develop in-house probe production methods [5]. While this partially reduces the necessary expenditure, one of the advantages of commercially synthesised probe sets is that probe sequences matching many different species can be included, creating a "phylogenetically diverse" set of probes. For example, a single phylogenetically diverse probe set synthesised by Arbor Biosciences has been used to enrich mtDNA from a range of extinct bird species, including elephant birds [6], parrots [7], owls [8], ducks [9], and ravens [10]. Replicating a similarly diverse set of probes in-house would be extremely labour intensive, potentially limiting the cost-effectiveness of in-house approaches versus commercial options. However, previous studies have reported successful hybridization capture of targets from taxa phylogenetically distant from the probe sequences [11-13], though these studies used commercially synthesised probes. If in-house synthesised probes can also be used to enrich for phylogenetically distant targets, then phylogenetically diverse probe sets may be unnecessary, making in-house probe sets much more appealing and economically viable.
There is clearly a need for low-cost methods to produce probes for modestly sized targets (e.g. mitogenomes ~16,000 bp and chloroplast genomes ~150,000 bp) that will work on a diverse array of taxa. To meet this need, in this study we present and evaluate a protocol to generate RNA probes for the enrichment of moderately sized targets (Fig 1), including an assessment of the impacts of wash stringency on capture efficiency. Previously, probes synthesized with a similar protocol have been used to enrich whole mitochondrial genomes (mitogenomes) from Pleistocene bison [14] and pre-Columbian Native Americans [15]. However, the primary goal of the current study was to determine if probes synthesized from one species could successfully enrich target DNA from phylogenetically distant taxa. First, we verified the protocol by enriching for northern hairy-nosed wombat mitochondrial DNA (mtDNA) from museum specimens using probes produced from modern northern hairy-nosed wombat tissue. Then, to test the limits of cross-taxon enrichment, probes synthesized from modern bison mtDNA were used to enrich mtDNA from museum and sub-fossil specimens of divergent taxa, which included eutherian steppe bison (extinct) and bighorn sheep, marsupial thylacine (extinct), avian emu, and even broomcorn millet. Finally, we performed a cost-analysis of our in-house probes and equivalent commercial reagents.

Modern DNA for probe synthesis
Modern DNA for long-range PCR of mitogenomes came from the following sources: a northern hairy-nosed wombat ear punch [16], and blood samples from American bison (Bison bison) and European bison (Bison bonasus). DNA was isolated using a Qiagen DNeasy Blood & Tissue Kit following the manufacturer's instructions. In-house synthesis of RNA probes for mtDNA hybridization capture

Long-range amplicons for molecular probe synthesis
Long-range PCR primers (Table 1) were designed to amplify overlapping amplicons of the entire mitochondrial genome of the American bison (GenBank accession number NC_012346.1; total length = 16319 bp) or nucleotides 1 to 15420 of the northern hairy-nosed wombat mitochondrial genome (GenBank accession number KJ868118.1; total length = 17028 bp), which excluded the D-loop because of the difficulty in designing primers to amplify over this region. The T7 RNA promoter sequence was attached to the 5' end of one primer in each primer pair to allow in vitro transcription of the long-range amplicon. Amplifications were performed using the Expand Long Range dNTPack Version 7 (Roche). Each PCR contained: 1x Expand Long Range Buffer 2, 0.5 mM dNTPs, 0.3 μM each of forward and reverse primers, 20 ng DNA template, 3.5 U Expand Long Range Polymerase, and H2O to 50 μL. To increase the diversity of the sequences in the bison probe pool we attempted to generate long-range amplicons from both modern European and American bison. For the bison primers, parallel PCRs were performed with one set of amplifications containing DNA from American bison and the other set containing DNA from European bison.
All long-range PCRs were performed in a heated-lid thermal cycler programmed as follows: initial denaturation at 94˚C for 2:00 min; 10 cycles of 94˚C for 10 sec, 55˚C for 15 sec and 68˚C for 7:00 min; 25 cycles of 94˚C for 10 sec, 55˚C for 15 sec, and 68˚C for 7:00 min + 20 sec/cycle; and a final elongation at 68˚C for 5:00 min. All wombat long-range amplicons were successfully amplified and pooled together in equimolar concentrations. For the bison PCRs, the bovid-mt1 and bovid-mt2 amplicons could not be amplified from the modern European bison DNA. Consequently, the bison long-range amplicons were combined in the following ratios: 1 mole bovid-mt1 from American bison, 1 mole bovid-mt2 from American bison, 0.5 mole bovid-mt3 from American bison, 0.5 mole bovid-mt3 from European bison. Pools of long-range amplicons were purified with 1x volume SpeedBeads (GE Healthcare) [17], quantified with a NanoDrop spectrophotometer (ThermoFisher), and visualized with GelRed (Biotium) staining after electrophoresis on a 2% agarose gel.
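The molar pooling ratios above translate into masses once amplicon lengths are known. The following is a minimal sketch of that conversion, not part of the published protocol. It assumes an average mass of about 660 g/mol per base pair of double-stranded DNA (a common rule of thumb), and the amplicon lengths and the `pmol_per_unit` scale are hypothetical placeholders, since Table 1 is not reproduced here.

```python
# Rule-of-thumb average molar mass of one dsDNA base pair (g/mol).
# This value is an assumption, not a figure taken from the paper.
AVG_BP_MASS = 660.0

def ng_for_pmol(pmol, length_bp):
    """Mass in ng of a dsDNA amplicon of `length_bp` that contains `pmol` picomoles."""
    # pmol x (g/mol) gives picograms; divide by 1000 to convert to nanograms.
    return pmol * length_bp * AVG_BP_MASS / 1000.0

# (name, molar ratio from the text, amplicon length in bp).
# The lengths are HYPOTHETICAL placeholders for illustration only.
bison_pool = [
    ("bovid-mt1, American bison", 1.0, 6000),
    ("bovid-mt2, American bison", 1.0, 6000),
    ("bovid-mt3, American bison", 0.5, 6000),
    ("bovid-mt3, European bison", 0.5, 6000),
]

def pool_masses(pool, pmol_per_unit=0.1):
    """Return {name: ng to add} so that amplicons are pooled at the given molar ratios.

    `pmol_per_unit` (hypothetical) sets how many picomoles a ratio of 1 represents.
    """
    return {name: ng_for_pmol(ratio * pmol_per_unit, length)
            for name, ratio, length in pool}

if __name__ == "__main__":
    for name, ng in pool_masses(bison_pool).items():
        print(f"{name}: {ng:.1f} ng")
```

With equal hypothetical lengths, each bovid-mt3 amplicon contributes half the mass of bovid-mt1, so the three regions are represented at equal molarity in the final pool.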

In vitro transcription of long-range amplicons
Each amplicon pool was transcribed in several in vitro transcription reactions using a HiScribe T7 High Yield RNA Synthesis Kit (New England Biolabs) following the provided protocol except that the incubation was carried out overnight. All transcription reactions were further incubated with 1x DNase Buffer and 4 U RNase free DNase I at 37˚C for 15 minutes to remove the amplicon template. The DNase I was inactivated by adding 6 μL 0.5 M EDTA and heating at 75˚C for 10 min. The RNA was then precipitated with ethanol, sodium acetate, and glycogen [18].

Biotinylation of fragmented RNA
Fragmented RNA was photolabelled with biotin using the EZ-Link Psoralen-PEG3-Biotin reagent (ThermoFisher) following the manufacturer's instructions. The labelled RNA was purified with an RNeasy MinElute Cleanup kit, quantified with a Qubit RNA BR assay (ThermoFisher), and separated into 125 ng aliquots for use in single hybridization capture reactions.

Samples for hybridization capture
Libraries from previous studies were selected for capture with our in-house probes. Ten historic northern hairy-nosed wombat libraries made from bone/teeth samples (Queensland Museum and Museum Victoria) were selected for enrichment of partial mitochondrial genomes (Table 2). To explore the taxonomic range of hybridization capture by bison probes, samples of ancient bison, bighorn sheep, and a museum specimen thylacine (Tasmanian Museum and Art Gallery) were selected for study. A historic broomcorn millet library was also chosen for enrichment with bison probes (Table 2). The ages of the wombat and thylacine samples were estimated from collection dates, while the age of the bighorn sheep sample was estimated from the stratigraphic position in which the bone was found. All other samples were carbon dated at the Oxford Radiocarbon Accelerator Unit. No permits were required for the animals native to Australia. Import permits for the samples from outside Australia were obtained in accordance with Australian Government Quarantine Act 1908 Section 13(2AA). A permit to perform research on the bighorn sheep tooth was obtained from the Bureau of Land Management (USA) and permits were not required for the scientific investigation of the remaining samples. All samples are publicly available except the broomcorn millet grain, which was destroyed in this study during the extraction of aDNA. The holding institutions and additional details on the samples are given in S1 Table. The precautions taken for working with aDNA and extraction methods are given in S1 Supplemental Methods.

Library construction and amplification
All library construction protocols were based on previously established methods for aDNA that included blunt end ligation of truncated Illumina adapters. The truncated adapters included internal barcodes and the protocols included an enzymatic treatment to partially remove uracils [15,20-22]. All taxa were taken through the Bst adapter fill-in stage of library construction and then amplified with different PCR protocols depending on the study from which the sample originated. Detailed descriptions of library construction and initial amplification are given in S1 Supplemental Methods.

Shotgun library amplification (addition of full-length Illumina adapters)
A portion of each truncated library was converted into a sequencing library by amplifying with primers that completed the Illumina adapter structure [20,21]. The protocols for shotgun amplification are given in the Methods section of the Supplemental Information.

Mitochondrial DNA enrichment by hybridization capture
Each of the wombat DNA libraries was taken through a single round of hybridization capture using wombat probes and a wash procedure that followed the MyBaits User Manual Version 3.02 (http://www.arborbiosci.com/wp-content/uploads/2017/10/MYbaits-manual-v3.pdf). For the bison probes, two parallel hybridization capture reactions were performed for each non-wombat sample to evaluate the effect of high stringency washes (low salt, high temperature) and low stringency washes (high salt, low temperature) on enrichment efficiencies across different taxa. For high stringency washes we used the procedure outlined in the MyBaits User Manual and for the low stringency washes a protocol was used from a previous study involving our in-house probes [14]. Prior to hybridization capture, the following two solutions were prepared:
• Probe Solution: 9 μL 20x SSPE, 0.5 μL 0.5 M EDTA pH 8.0, 3.5 μL 50X Denhardt's Solution, 0.5 μL 10% SDS, 1 μL SUPERase-In (20 U/μL, ThermoFisher), 125 ng in-house baits, 0.5 μL 50 μM RNA oligos to block truncated Illumina adapters [22], and nuclease-free H2O to 20 μL.
Hybridization capture was performed in a heated-lid thermal cycler. Tubes containing Target Solution were placed in the thermal cycler and incubated for 5 min at 95˚C. Tubes containing the Probe Solution were placed in the thermal cycler and incubated for 5 min at 65˚C. Probe Solutions (18 μL) were added to the Target Solutions and then mixed well by pipetting. The hybridization capture reactions were incubated for 5 hours at 65˚C, 5 hours at 60˚C, and ~38 hours at 55˚C (~48 hours in total).
Just prior to the end of the incubations, MyOne Streptavidin C1 beads (ThermoFisher) were washed and non-specific binding sites blocked with yeast tRNA in preparation for recovering the annealed probe/target complexes. For each hybridization capture, 30 μL of beads were briefly washed with 500 μL binding buffer (1 M NaCl, 10 mM Tris-HCl, pH 7.5, 1 mM EDTA) and then blocked by incubating the beads in 500 μL binding buffer + 10 μg yeast tRNA (ThermoFisher) for 30 min on a rotator at room temperature. After blocking, the beads were washed twice more with 500 μL binding buffer and then suspended in 70 μL binding buffer and placed in a 1.5 mL tube.

High stringency washes (MyBaits protocol)
Blocked beads were warmed for 5 min in a heating block set to 55˚C. Each hybridization capture reaction was added to an aliquot of blocked beads and then mixed by pipetting. These mixtures were incubated at 55˚C for 30 min with the tubes flicked with fingers every 5 min to keep the beads in suspension. Afterwards, the beads were pelleted with a magnetic rack and the supernate discarded. Beads were then suspended in wash buffer (0.02X SSC + 0.1% SDS) warmed to 55˚C and incubated at 55˚C for 10 min with occasional flicking to resuspend the beads. These steps were repeated an additional two times for a total of three washes. Once the last wash supernate was discarded, the beads were suspended in 40 μL H2O and heated at 95˚C for 5 min to release the DNA. The beads were pelleted with a magnetic rack and the released DNA was transferred to a new 1.5 mL tube for storage at -20˚C.

Low stringency washes
Each hybridization capture reaction was mixed with an aliquot of blocked beads at room temperature and then placed on a rotator for 30 min at room temperature. Subsequently, the beads were pelleted with a magnetic rack, the supernate discarded, and the beads were taken through the following 4 washes: 0.5 mL 2x SSC + 0.05% Tween-20 at room temperature on a rotator for 10 min, 0.5 mL 0.75x SSC + 0.05% Tween-20 in a heating block at 50˚C for 10 min, repeat previous wash, and 0.5 mL 0.20x SSC + 0.05% Tween-20 in a heating block at 50˚C for 10 min. After the last wash the DNA was released from the beads as with the high stringency washes.
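To make the difference in salt between the two wash regimes concrete, the SSC strengths quoted in the two protocols can be converted to salt concentrations. The sketch below is only illustrative and assumes the standard SSC recipe (1x SSC = 150 mM NaCl and 15 mM trisodium citrate), which is general laboratory knowledge and is not stated in this paper. The function name `ssc_composition` is hypothetical.

```python
# Standard 1x SSC recipe (an assumption from general lab practice,
# not stated in the paper): 150 mM NaCl and 15 mM trisodium citrate.
NACL_MM_AT_1X = 150.0
CITRATE_MM_AT_1X = 15.0

def ssc_composition(strength):
    """Return (NaCl in mM, citrate in mM) for an SSC buffer of the given 'x' strength."""
    return strength * NACL_MM_AT_1X, strength * CITRATE_MM_AT_1X

# Wash buffers named in the text: (label, SSC strength, temperature in deg C).
# The room-temperature wash has no stated temperature, so it is recorded as None.
washes = [
    ("high stringency wash (x3)", 0.02, 55),
    ("low stringency wash 1", 2.0, None),
    ("low stringency washes 2 and 3", 0.75, 50),
    ("low stringency wash 4", 0.20, 50),
]

if __name__ == "__main__":
    for label, strength, temp_c in washes:
        nacl, citrate = ssc_composition(strength)
        where = "room temperature" if temp_c is None else f"{temp_c} C"
        print(f"{label}: {strength}x SSC = {nacl:.1f} mM NaCl, "
              f"{citrate:.2f} mM citrate at {where}")
```

Under this assumption, the final low stringency wash (0.20x SSC) still contains ten times more salt than the high stringency wash buffer (0.02x SSC), and is performed 5˚C cooler.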
The DNA recovered from hybridization capture with bison probes was quantified by qPCR [23] as before (see S1 Supplemental Methods). Recovered DNA was divided among 4 x 25 μL PCRs containing 2.5 μL 10x High Fidelity PCR Buffer, 1 μL 50 mM MgSO4, 0.5 μL 10 mM dNTPs, 0.5 μL each of 10 μM IS4 and indexing primers, 0.1 μL Platinum Taq DNA Polymerase High Fidelity (5 U/μL), and molecular biology grade H2O to 25 μL. The PCRs were amplified in a heated-lid thermal cycler programmed as follows: initial denaturation 94˚C for 2 min, (see Table 3) cycles at 94˚C for 15 sec, 58˚C for 30 sec, 68˚C for 45 sec, and a final extension at 68˚C for 2 min. For the bison probe enrichments, the libraries processed with low and high stringency washes were amplified with primers containing different Illumina i7 indexes to allow separation of these samples during downstream bioinformatics analysis. PCRs from the same library were pooled and purified using Sera-Mag SpeedBeads.

High-throughput DNA sequencing
Indexed libraries were quantified on a TapeStation 4200 using a D1000 ScreenTape (Agilent) and sent to the Australian Genome Research Facility for sequencing on an Illumina NextSeq 500 platform. Wombat samples were sequenced with 2 x 75 bp paired-end (150 cycle) chemistry and all other libraries with 2 x 150 bp paired-end (300 cycle) chemistry.

Data processing
The fastq files from the sequencer were demultiplexed by the internal barcodes using Sabre 1.0 (https://github.com/najoshi/sabre). AdapterRemoval v2.2.1 [24] was then used to trim adapters, collapse reads, discard reads < 25 bp, and remove reads of low quality (< 4). To eliminate the effect of sequencing depth and allow direct comparison of the washing conditions, all libraries enriched with bison probes were randomly subsampled to 2.5 x 10^6 reads using the reformat command of BBTools (36.62-intel-2017.01: https://jgi.doe.gov/data-and-tools/bbtools/) and Java (1.8.0_121). The level of subsampling was determined by the total number of reads obtained from the library with the fewest reads. As hybridization capture conditions were not being compared for the wombat samples, we did not subsample the reads from those libraries. Collapsed reads were mapped to mitochondrial reference genomes using BWA aln (v0.5.11-foss-2016b) [25,26]. Duplicate reads were removed from the mapped data using the SortSam and MarkDuplicates packages of Picard Tools (v 2.2.4: https://broadinstitute.github.io/picard/index.html). Read depths for the mtDNA enriched libraries were also generated using SAMtools (v1.3.1-foss-2016a). For the bison probe capture reactions, sequencing data from each species were mapped to a reference of the same species (i.e., emu reads were mapped to emu reference) except for the millet sample, which was mapped to sorghum because currently there is no mitochondrial reference for broomcorn millet. The GenBank reference sequences used for mapping were: KJ868118.
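The subsampling step above was carried out with BBTools. As an illustration of the underlying idea only, the sketch below draws a fixed number of reads uniformly at random from a stream using reservoir sampling. This is a swapped-in generic technique, not the BBTools implementation, and the function names `fastq_records` and `reservoir_sample` are hypothetical.

```python
import random
from itertools import islice

def fastq_records(lines):
    """Group an iterable of FASTQ lines into 4-line records (tuples).

    A trailing incomplete record (fewer than 4 lines) is dropped.
    """
    it = iter(lines)
    while True:
        record = tuple(islice(it, 4))
        if len(record) < 4:
            return
        yield record

def reservoir_sample(records, k, seed=None):
    """Return k records chosen uniformly at random from an iterable of unknown length.

    If the iterable holds fewer than k records, all of them are returned in order.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, rec in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            reservoir.append(rec)
        else:
            # Record i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = rec
    return reservoir

if __name__ == "__main__":
    # Toy input: 100 synthetic reads, subsampled to 10.
    lines = []
    for n in range(100):
        lines += [f"@read{n}\n", "ACGT\n", "+\n", "IIII\n"]
    for rec in reservoir_sample(fastq_records(lines), 10, seed=1):
        print(rec[0].strip())
```

Sampling every library down to the size of the smallest one, as described above, means that differences in mapped read counts between wash conditions are not simply a product of sequencing effort.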

Validation of protocol using wombat DNA
We validated our protocol by applying it to sequence mtDNA from museum specimens of the northern hairy-nosed wombat. Of the 10 hairy-nosed wombat historic specimens, one (16036A) contained very few mitochondrial sequences in the pre- and post-enrichment libraries and was likely too poorly preserved for study. For the remaining 9 wombat samples, hybridization capture with our in-house probes produced a 28 to 264 fold enrichment (average = 139; standard deviation = 89) of mtDNA sequences and generated mean unique read depths that were on average >111 times greater than shotgun libraries (Table 4). For comparison, a study that used commercial enrichment probes on camel aDNA reported fold enrichments of mtDNA that ranged from 0.2 to 1,153 with an average of 187 [28]. The endogenous DNA proportions of the wombat samples, as judged by the fraction of unique mtDNA reads in shotgun libraries, was strongly correlated with unique mitochondrial sequences in the final enriched libraries (Fig 3).
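The exact formula behind the fold enrichment values is not given in this section. One common way to compute such a figure, shown here purely as an assumption, is the ratio of the fraction of unique mtDNA reads in the enriched library to the same fraction in the shotgun library. The function name and the read counts in the example are hypothetical and are not data from this study.

```python
def fold_enrichment(unique_mt_enriched, total_enriched,
                    unique_mt_shotgun, total_shotgun):
    """Ratio of the unique-mtDNA read fraction after capture to that before capture.

    This definition is an assumption made for illustration; the paper's own
    calculation may differ.
    """
    if total_enriched <= 0 or total_shotgun <= 0:
        raise ValueError("total read counts must be positive")
    if unique_mt_shotgun <= 0:
        raise ValueError("no mtDNA reads in the shotgun library; ratio is undefined")
    frac_enriched = unique_mt_enriched / total_enriched
    frac_shotgun = unique_mt_shotgun / total_shotgun
    return frac_enriched / frac_shotgun

if __name__ == "__main__":
    # Hypothetical counts, not taken from Table 4.
    value = fold_enrichment(5000, 1_000_000, 50, 1_000_000)
    print(f"fold enrichment: {value:.1f}")
```

A ratio defined this way is undefined when the shotgun library contains no mitochondrial reads, which is one reason very poorly preserved samples are difficult to evaluate.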

Hybridization capture of targets from phylogenetically distant taxa
The primary goal of this study was to determine if probes produced from modern bison DNA could be used to successfully enrich for mtDNA from phylogenetically divergent taxa. The ability to perform cross-taxon enrichment would demonstrate that our system is flexible and that one set of in-house probes could be used to capture target DNA from multiple taxa. While successful hybridization capture of taxa divergent from probe sequences has been reported previously [11-13], these studies used commercial probes of a standard length in contrast to our randomly fragmented probes of variable length [14]. Further, we also sought to examine the impact of low and high stringency washes on the enrichment of targets from divergent taxa. We transcribed probes from modern bison (American and European) and then enriched mtDNA from species that ranged from close relatives such as steppe bison, to very distant species such as broomcorn millet (with an estimated divergence time from modern bison of ~1.5 billion years ago [19]). As expected, probes synthesized from modern bison were readily able to enrich mtDNA from steppe bison libraries (Table 5). Compared to the other taxa examined, the steppe bison sample underwent the greatest enrichment of unique reads (> 55 fold increase) and greatest coverage of the reference genome (> 99.8% of the reference covered by at least 1 unique read) with the bison probes. The bison probes were able to enrich sequences from divergent animal taxa, producing 2.8 to 61.55-fold enrichment of unique reads for the bighorn sheep, thylacine, and emu libraries. Enrichment with bison probes increased the percentage of the mitochondrial reference being covered by unique reads for the bighorn sheep (17.71% to 97.25%) and emu (10.25% to 70.7%) samples, but in contrast produced only a slight (<1%) change in the thylacine (Table 5).
The bison probes were ineffective at enriching millet mtDNA, as expected given the evolutionary distance and sequence divergence involved. The inability of our in-house bison probes to enrich highly divergent millet mtDNA provides a negative control for the methodology and demonstrates that the mtDNA enrichment is dependent on the interaction of our probe with target sequences and is not the result of non-specific activity that favours mitochondrial over nuclear molecules. Shotgun data indicate that mitochondrial sequences are present in the millet library and that the lack of enrichment is not due to the sample being too degraded for recovery of mtDNA.

Fig 2. MapDamage plots. Levels of cytosine deamination in mitogenome mapped data were assayed with mapDamage 2 [27]. Only plots from hybridization capture enrichments that used high stringency washes are shown. Additionally, only a single northern hairy-nosed wombat sample is shown as an example. All samples exhibit increased levels of C → T and G → A substitutions that are typical of aDNA. Hybridization capture enrichments with low stringency washes and the other wombat samples gave similar results to those shown above.
https://doi.org/10.1371/journal.pone.0209499.g002

Wash stringency
High stringency washes encourage the denaturation of annealed nucleic acids so that, in contrast to low stringency washes, only probe/library complexes with high levels of sequence homology remain bound together. The impact of wash stringency on the recovery of target DNA is demonstrated by the additional 4 to 6 cycles needed (determined by qPCR) to amplify the libraries that underwent high stringency washes (Table 3). As expected, the high stringency washes proved more effective at enriching steppe bison sequences than the low stringency washes (107.13 versus 55.18 fold enrichment of unique reads, Table 5). In contrast, the low stringency washes proved to be more effective with all the other animal samples. The sequence divergence between these species and modern bison would make probe/library complexes less stable because of sequence mismatches, so the less disruptive nature of the low stringency washes allowed the recovery of more target sequences.
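The 4 to 6 additional amplification cycles can be turned into a rough estimate of how much less DNA was recovered after high stringency washes. The sketch below assumes idealised exponential PCR, in which template is multiplied by (1 + efficiency) each cycle; the perfect-doubling default and the function name `fold_difference` are assumptions made for illustration.

```python
def fold_difference(extra_cycles, efficiency=1.0):
    """Approximate fold difference in starting template implied by extra PCR cycles.

    Assumes idealised exponential amplification: each cycle multiplies the
    template by (1 + efficiency), where efficiency = 1.0 means perfect doubling.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return (1.0 + efficiency) ** extra_cycles

if __name__ == "__main__":
    for cycles in (4, 6):
        print(f"{cycles} extra cycles: about {fold_difference(cycles):.0f}-fold less template")
```

Under perfect doubling, 4 to 6 extra cycles correspond to roughly 16- to 64-fold less recovered DNA; real amplification efficiencies are usually lower, which would reduce these estimates.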

GC content
The nucleotide content of enriched mitogenomes did not appear to have a consistent impact on the recovery of mtDNA in most of the animal samples (Fig 4). Some drops in GC content of the wombat mitogenome appear to be associated with reduced read depth, but this pattern was not universal. In a segment of the thylacine mitogenome (bp 16,151 to 16,743) the GC content is only 13% and this locus appears to be completely devoid of mapped reads. The reduced data in these low GC content areas likely stems from several factors: PCR amplification is biased against low GC content sequences [29] and hybridization capture methods generally recover sequences with extreme GC content (low and high) less efficiently [30]. Similar to other hybridization capture methods, our in-house probes do not enrich loci with extreme GC content efficiently and this limitation must be considered when designing a study.

Fig 3. [...] Table 4). The high correlation between the fraction of unique reads in shotgun and enriched libraries suggests that initial screening of a sample with shallow sequencing and adjusting sequencing effort accordingly may reduce the cost of studies involving hybridization capture enrichment.
https://doi.org/10.1371/journal.pone.0209499.g003
The thylacine mtDNA enrichments did not perform in the same manner as the other animals and produced only ~3-fold enrichment compared to the >13-fold increase observed for the other animals (Table 5). The combination of GC content and phylogenetic distance may have contributed to lower performance of the bison probes with the thylacine sample. The GC content of the thylacine mitogenome spans 35 percentage points in comparison to the maximum of 23 observed in the other animal mitogenomes (Fig 4). This broad range, which includes regions of very low GC content, coupled with the sequence divergence between thylacine and bison mitogenomes, may have reduced the efficiency of the in-house probes with the marsupial sample.

Table 5. In-house hybridization capture probes made from modern bison DNA were used to enrich mtDNA from ancient specimens of divergent taxa. To eliminate the effect of sequencing depth, analysis was performed on 2.5 x 10^6 reads randomly subsampled from each library. The bison probes were able to enrich divergent animal mtDNA but not plant. The effect of low stringency washes (low temperature and high salt concentration) and high stringency washes (high temperature and low salt concentration) on enrichment efficiency with in-house bison probes indicates that high stringency washes are more effective with closely related taxa while low stringency washes are more effective with distantly related species. Mapping statistics for the entire sequencing data set are given in S2 Table.
https://doi.org/10.1371/journal.pone.0209499.t005

Probe cost
The estimated cost of our in-house mitochondrial probes for a single hybridization was approximately $7.00 (USD). Similar mitochondrial probes from a commercial source would cost approximately $8.00 to $18.00 (USD) depending on the number of reactions purchased. Although the cost of commercial reagents can approach the price of our in-house probes, this would require the purchase of a kit that is sufficient to enrich several hundred mitogenomes, which may not be suitable for all studies. Our methodology did require an initial expenditure for various reagents and kits, but after this purchase probes were produced inexpensively.

Advantages over comparable protocols
Previously, Maricic et al. (2010) described a probe synthesis procedure somewhat similar to the methodology described in the current paper [5]. In Maricic et al., probes were synthesized by attaching biotin directly to fragmented long-range amplicons instead of transcribing RNA from these PCR products. The procedure described in the current study has several advantages over the Maricic et al. (2010) protocol. First, the in vitro transcription of the long-range amplicons results in an amplification of RNA molecules and increased probe yield [31]. Second, only single stranded RNA is transcribed in our methodology [31], so there will be minimal formation of probe/probe complexes, which is a concern for probes synthesized from dsDNA using methods such as the Maricic et al. (2010) protocol.

Conclusion
In this study, we have described a method to synthesize hybridization capture probes from overlapping long-range PCR amplicons, which has substantial advantages over comparable in-house protocols. Our procedure does not require any specialized instruments and the probes can be synthesized in most standardly equipped molecular biology laboratories. The methodology is flexible as researchers can design primers with the T7 promoter to enrich targets as needed. Target size should also be flexible in our system, as primer sets can be designed to enrich small to modest sized targets including mitogenomes, chloroplast genomes, exons, or any genomic region spanning a few tens of kilobases. However, enriching a large target such as a complete exome is not feasible with our system, as the effort needed to produce the long-range templates would make commercial probes more cost effective. Our probe synthesis method offers a less expensive and more flexible alternative to commercial reagents. In the current study, we demonstrate that our probes behave in a consistent manner and will even enrich mitochondrial genomes from ancient samples with damaged DNA. Importantly, we also demonstrate that our in-house system can be used to enrich target DNA from taxa that are phylogenetically distant from the species used to transcribe the probe molecules, especially when the stringency of the wash steps is lowered (by increasing salt concentration and lowering temperature). This should remove the need for phylogenetically diverse probe sets, and make in-house probe synthesis viable and cost-effective even for small projects on varied taxa.

Fig 4. Read depths of animal sequencing libraries enriched for mtDNA and mapped to a mitogenome reference. The small black graph above the read depth plot represents the fraction of GC nucleotides in the reference mitogenome in a 200 bp sliding window. Sequence composition did not appear to have a consistent impact on the recovery of mtDNA with the RNA probes. Some regions of references with low GC content did appear to have reduced levels of read depth, which include nucleotides 16,151 to 16,743 of the thylacine reference where the GC content drops to ~13%. The large variability in GC content of the thylacine mitogenome may have contributed to the reduced enrichment efficiency observed with the marsupial sample.
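The GC track described in the figure caption (fraction of GC nucleotides in a 200 bp sliding window) can be computed along the following lines. This is a minimal sketch, not the code used to produce the figure; it treats the reference as a linear sequence, which is a simplification for a circular mitogenome, and the function name `gc_fraction_windows` is hypothetical.

```python
def gc_fraction_windows(seq, window=200):
    """Return the GC fraction of every full window of length `window`, sliding by 1 bp.

    The sequence is treated as linear (a simplification for circular genomes).
    Any base other than G or C, including N, counts as non-GC.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_gc = [1 if base in "GC" else 0 for base in seq]
    # GC count of the first window, then update incrementally as the window slides.
    count = sum(is_gc[:window])
    fractions = [count / window]
    for start in range(1, len(seq) - window + 1):
        count += is_gc[start + window - 1] - is_gc[start - 1]
        fractions.append(count / window)
    return fractions

if __name__ == "__main__":
    # Toy sequence with a GC-rich start and an AT-rich end.
    toy = "GGCCGGCC" + "AATTAATT"
    print(gc_fraction_windows(toy, window=4))
```

Plotting such a track above per-base read depth makes it easy to spot whether coverage gaps coincide with AT-rich stretches like the thylacine region noted above.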
Supporting information S1