An Improved Chloroplast DNA Extraction Procedure for Whole Plastid Genome Sequencing

Background
Chloroplast genomes supply valuable genetic information for evolutionary and functional studies in plants. The past five years have witnessed a dramatic increase in the number of completely sequenced chloroplast genomes, driven by the application of second-generation sequencing technology to plastid genome sequencing projects. However, cost-effective, high-throughput chloroplast DNA (cpDNA) extraction has become a major bottleneck, because conventional methods struggle to balance the quality and yield of cpDNA.
Methodology/Principal Findings
We first tested two traditional methods to isolate cpDNA from three species, Oryza brachyantha, Leersia japonica and Prinsepia utihis. Both failed to yield properly defined cpDNA bands. We then developed a simple but efficient method based on sucrose gradients and found that the modified protocol isolated cpDNA efficiently from the same three species. We sequenced the isolated DNA samples with Illumina (Solexa) sequencing technology and assessed cpDNA purity by aligning the sequence reads to reference chloroplast genomes; the reference genomes were properly covered. Our method achieves 40–50% cpDNA purity.
Conclusion
Here we provide an improved method for isolating cpDNA from angiosperms. The Illumina sequencing results suggest that the isolated cpDNA has sufficient yield and purity for subsequent genome assembly. The cpDNA isolation protocol should therefore be widely applicable to plant chloroplast genome sequencing projects.


Introduction
Chloroplasts (plastids) are plant organelles that carry a circular DNA genome of 72 to 217 kb encoding ~130 genes [1,2]. The cpDNAs of green plants are exceptionally conserved in gene content and organization, providing sufficient information for genome-wide evolutionary studies. Recent work using whole chloroplast genome sequences has demonstrated their potential for resolving phylogenetic relationships at different taxonomic levels and for understanding structural and functional evolution [3,4,5].
Owing to their small size, plant cpDNAs were among the targets of the very earliest genome sequencing projects [6]. To date, at least 200 complete plant cpDNAs have been sequenced (http://www.ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi?taxid=2759&opt=plastid), and in recent years the number has been increasing rapidly owing to the extensive application of second-generation sequencing technologies to whole chloroplast genome sequencing. Despite their short reads, the large volumes of sequence data produced by second-generation technologies are well suited to chloroplast genome assembly, because the chloroplast genome is much smaller and structurally simpler than nuclear genomes [7]. For example, a single 600 Gbp run on the Illumina HiSeq-2000 (http://www.illumina.com) could conceivably sequence ~40,000 average-sized chloroplast genomes to a depth of 120×. Next-generation sequencing technologies have undoubtedly made it possible to sequence entire plant genomes more efficiently and economically than traditional approaches [6]. With this rapid progress in sequencing technology, the acquisition of high-quality cpDNA from plant tissues for whole genome sequencing is urgently needed.
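The throughput figure quoted above can be checked with simple arithmetic. The sketch below assumes an average chloroplast genome size of 125 kb; this value is not stated in the text and is only the size implied by the quoted numbers. The function name is illustrative.

```python
def genomes_per_run(run_bp: float, genome_bp: float, depth: float) -> float:
    """Number of genomes that one sequencing run can cover at a given mean depth."""
    if genome_bp <= 0 or depth <= 0:
        raise ValueError("genome_bp and depth must be positive")
    return run_bp / (genome_bp * depth)


RUN_BP = 600e9               # 600 Gbp per run, as quoted in the text
ASSUMED_GENOME_BP = 125_000  # assumption: average genome size implied by the text
DEPTH = 120                  # target mean depth (120x)

n_genomes = genomes_per_run(RUN_BP, ASSUMED_GENOME_BP, DEPTH)
print(f"{n_genomes:,.0f} genomes per run")
```

Under this assumption the estimate reproduces the ~40,000 genomes mentioned above; a larger assumed genome size lowers the count proportionally.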
Two experimental approaches are commonly used to obtain cpDNA from plants. The first amplifies the whole chloroplast genome from total DNA using long polymerase chain reaction (PCR); the second isolates cpDNA directly from fresh plant material using a sucrose gradient. The PCR-based approach is typically used when substantial leaf material (e.g., ~20 to 100 g of fresh leaves) is unavailable: total DNA is extracted from the limited material, and cpDNA fragments are then amplified with conserved primer pairs [8]. The second approach isolates chloroplasts from fresh leaves by sucrose gradient centrifugation, followed by direct extraction of cpDNA from the intact chloroplasts [9]. Sucrose gradient centrifugation, however, depends on ultracentrifuges, which many laboratories lack [10]. As a result, the PCR method is the most widely used in chloroplast sequencing projects, despite being time-consuming and more costly [11]. As alternatives to sucrose gradient centrifugation, DNAse I treatment [12] and high salt precipitation [13] have succeeded in isolating cpDNA from some plant species, but their application to additional species has been limited, because neither readily balances sufficient yield against quality, that is, limited contamination with nuclear and mitochondrial DNA [14].
The rapid progress in next-generation technologies calls for new methods that isolate cpDNA with higher quality and yield, and in particular for a simplified isolation process that meets the needs of whole chloroplast genome sequencing. We modified the methods described above [9,12,13] to develop a new protocol and applied it to isolate cpDNA from three species, Oryza brachyantha, Leersia japonica and Prinsepia utihis. To test their purity, the three isolated cpDNA samples were subsequently sequenced using Illumina (Solexa) sequencing-by-synthesis technology.

Results and Discussion
The isolation of cpDNA
cpDNA isolation comprises three basic steps: separation of plastids from leaf tissue, lysis of the chloroplasts, and purification of the DNA. Because the isolation of intact chloroplasts is often the critical stage of the whole procedure, sucrose gradient ultracentrifugation is the method most commonly employed to separate nuclear DNA from cpDNA effectively. Using two grass species (O. brachyantha and L. japonica) and one rosid (P. utihis), we first performed cpDNA isolation following the previously described procedure [9]. Electrophoresis of the resulting DNA showed only a very weak band on the agarose gel, indicating very low cpDNA yield (Figure 1C). One likely reason is that this study used only 20 g of fresh leaves, whereas more than 100 g of leaf tissue is recommended [9]. Another possible explanation is that only a small chloroplast pellet was collected after sucrose gradient centrifugation, so that little cpDNA could be extracted. Because library preparation for whole genome sequencing needs a substantial amount of starting DNA, this method requires either repeated cpDNA isolation or a large amount of leaf material. Given the time consumed by sucrose gradient preparation, two alternative methods, DNAse I treatment and the high salt method, may be suitable replacements for sucrose gradient centrifugation.
The DNAse I treatment method uses DNAse I to digest nuclear DNA that adheres to the outer chloroplast membrane. Success in isolating cpDNA with this method was reported for only two of the many species attempted [9]. When we applied it to the three plant species in this study, however, we failed to isolate intact cpDNA, because all samples were degraded by the DNAse I (Figure 1B). This result is consistent with the fact that DNAse I digests not only nuclear DNA but also cpDNA that is not well protected within intact plastids [9,15].
The second alternative method uses a high NaCl concentration in the isolation buffers and does not involve any sucrose gradient centrifugation. This method has only been reported to succeed in isolating pea cpDNA [13]. Considering that increasing the NaCl concentration alone may not be enough to enhance cpDNA purity, we made several modifications to the method to broaden its application to as many taxa as possible. The final protocol (see Materials and Methods) isolated sufficient cpDNA from leaf material of the same three plant species (Figure 1A).
As a modification of sucrose gradient centrifugation, the high salt method significantly simplifies the cpDNA isolation process. Our first attempt to isolate cpDNA with this method also appeared successful, as it gave a relatively clearly defined DNA band. When we increased the amount of fresh leaves, however, higher DNA yield was accompanied by a greater likelihood of DNA degradation, indicating more nuclear DNA contamination (Figure 2). This observation suggests that the original method may not be suitable for isolating cpDNA of high purity. In the original protocol, four to six volumes (v/w) of cold isolation buffer (e.g., 100 ml of isolation buffer for 20 g of fresh leaves) may not be enough to homogenize the fresh leaves. We therefore increased the amount of isolation buffer from 5 to 20 volumes of fresh leaves (e.g., 400 ml of buffer A for 20 g of fresh leaves in our protocol) in the subsequent experiment. Even when 50 g of fresh leaves were used, a well-defined cpDNA band could be observed (Figure 2), suggesting that the modification led to successful isolation of cpDNA. About 20 g of leaves is likely closer to optimal, as it may carry over less contaminating nuclear DNA. Furthermore, two additional centrifugation steps (200 g for 20 min and 3500 g for 20 min) were used to discard the cell debris and to collect the chloroplast pellet, respectively. To reduce the nuclear DNA that adheres to the outer chloroplast membrane, we also incorporated extra steps to wash the chloroplast pellet with buffer B, further increasing the purity of the isolated cpDNA. Last but not least, chloroplasts were lysed using SDS and Proteinase K instead of cetyltrimethylammonium bromide (CTAB), followed by phenol/chloroform extraction. The final isolated cpDNAs were digested with HindIII and the result was visualized on a 0.8% agarose gel (Figure 3).
Among these modifications, the stepwise centrifugation steps were the most important, because they increase cpDNA purity by separating the chloroplasts from cell debris. If larger amounts of starting material (e.g., 50 g of fresh leaves) are used, a second centrifugation step at 200 g is necessary.
Of these four methods, our modified high salt method isolated cpDNA most efficiently and, most importantly, best balanced cpDNA yield and purity. Our lab has since applied this improved protocol to hundreds of plant species, and it has proved highly efficient for isolating cpDNA from a wider range of plant taxa (unpublished data).
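The buffer scaling described above is a plain volume-to-weight ratio. The following sketch reproduces the two worked examples from the text (20 g of leaves at 5 and at 20 volumes); the function name is illustrative and the v/w convention of 1 ml of buffer per gram per "volume" is an assumption.

```python
def buffer_volume_ml(leaf_mass_g: float, volumes: float) -> float:
    """Isolation buffer volume in ml, assuming 1 volume = 1 ml buffer per g of fresh leaf."""
    if leaf_mass_g <= 0 or volumes <= 0:
        raise ValueError("leaf_mass_g and volumes must be positive")
    return leaf_mass_g * volumes


# Original high salt protocol: about 5 volumes (v/w)
original_ml = buffer_volume_ml(20, 5)    # 20 g leaves -> 100 ml
# Modified protocol: 20 volumes (v/w)
modified_ml = buffer_volume_ml(20, 20)   # 20 g leaves -> 400 ml

print(f"original: {original_ml:.0f} ml, modified: {modified_ml:.0f} ml")
```

The same ratio can be used to scale the buffer when a different amount of leaf tissue is processed.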

Sequencing chloroplast DNAs using the second-generation Illumina sequencing technology
The vast improvements in DNA sequencing technologies offer unprecedented opportunities to perform phylogenomic studies based on whole chloroplast genome sequences. Multiplex sequencing with second-generation technology allows multiple samples to be sequenced in a single run, generating millions of reads that significantly increase the sequencing depth [16]. To test the purity of the cpDNA isolated with our protocol, we sequenced the three chloroplast genomes using Illumina sequencing technology. The sequencing reactions generated a total of 330 Mbp of sequence data: 5 Mbp for O. brachyantha, 21 Mbp for L. japonica and 304 Mbp for P. utihis (Table 1). A reference-guided chloroplast genome assembly was performed to roughly estimate the genome coverage (Figure 4). O. brachyantha (Figure 4A) and L. japonica (Figure 4B) were assembled against O. nivara, while P. utihis (Figure 4C) was assembled against Prunus persica.
Surprisingly, cpDNA purity, measured as the percentage of reads aligned to the reference genome, was relatively consistent across the three species, although the amount of sequence data varied greatly among them, ranging from 51,606 reads in O. brachyantha to 3,132,702 reads in P. utihis. The cpDNA reads accounted for 51.6% in O. brachyantha, 43.0% in L. japonica, and 44.2% in P. utihis (Figure 4A). Our results thus suggest that, at the cpDNA purity obtained with this modified method, 50 Mbp of sequence data would give at least 100× average coverage of the chloroplast genome, which is sufficient for assembly.
Previous studies [17,18] suggested that cpDNA usually makes up no more than 5% of the total DNA in angiosperms. In contrast, our protocol isolates cpDNA at a purity of about 40–50% (Table 1). The RCA-based (rolling circle amplification) cpDNA sequencing method [9] reported that approximately 10–40% of the resulting RCA products consisted of non-cpDNA [19]. In comparison, our method isolates cpDNA of improved quality at lower sequencing cost, although there is room to improve the cpDNA purity further.
In conclusion, this study provides a quick and efficient method for isolating cpDNA from angiosperms. Compared with the commonly used methods of sucrose gradient centrifugation and DNAse I treatment, our modified method performed well when tested on leaf material of the same three plant species, O. brachyantha, L. japonica and P. utihis. The cpDNA bands were clearly defined on the agarose gel. The three isolated cpDNA samples were subsequently sequenced with next-generation Illumina sequencing technology, and their purity reached ~40–50%, which is sufficient for genome assembly. In addition, we examined how genome coverage depends on the amount of sequence data, showing that only ~50 Mbp is needed to attain at least 100× average coverage of the chloroplast genome when the cpDNA purity reaches ~40–50%. Overall, this modified method can serve as an efficient cpDNA extraction procedure for chloroplast genome sequencing in angiosperms.
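The relationship between sequence yield, cpDNA purity and mean coverage can be sketched as follows. The genome size of 135 kb is an assumption used only for illustration (the text does not give the sizes of the three genomes), and the function names are hypothetical.

```python
def purity(aligned_reads: int, total_reads: int) -> float:
    """Fraction of reads that align to the reference chloroplast genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return aligned_reads / total_reads


def mean_coverage(total_bp: float, cp_fraction: float, genome_bp: float) -> float:
    """Mean depth of coverage contributed by the chloroplast-derived bases."""
    return total_bp * cp_fraction / genome_bp


def required_bp(target_depth: float, cp_fraction: float, genome_bp: float) -> float:
    """Total sequence data needed to reach target_depth at the given purity."""
    return target_depth * genome_bp / cp_fraction


ASSUMED_GENOME_BP = 135_000  # assumption: illustrative chloroplast genome size

# 50 Mbp of data at the lower end of the reported purity range (40%)
depth = mean_coverage(50e6, 0.40, ASSUMED_GENOME_BP)
needed = required_bp(100, 0.40, ASSUMED_GENOME_BP)
print(f"mean coverage: {depth:.0f}x; data needed for 100x: {needed / 1e6:.1f} Mbp")
```

With these assumed inputs the mean coverage exceeds 100×, in line with the estimate given in the text.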
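The enrichment implied by these percentages can be expressed as a simple ratio. The sketch below takes the 5% figure as an upper bound on the cpDNA fraction in total DNA, so the result is a lower bound on the fold enrichment; the function name is illustrative.

```python
def fold_enrichment(purified_fraction: float, baseline_fraction: float) -> float:
    """Ratio of the cpDNA fraction after isolation to the fraction in total DNA."""
    if baseline_fraction <= 0:
        raise ValueError("baseline_fraction must be positive")
    return purified_fraction / baseline_fraction


BASELINE = 0.05  # no more than 5% cpDNA in total DNA, per the text

low = fold_enrichment(0.40, BASELINE)   # lower end of the reported purity range
high = fold_enrichment(0.50, BASELINE)  # upper end of the reported purity range
print(f"at least {low:.0f}- to {high:.0f}-fold enrichment")
```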

Plant materials
The O. brachyantha and L. japonica (Poaceae) plants were grown in the greenhouse, while P. utihis (Rosaceae) was transplanted to the Botanical Garden of Kunming Institute of Botany, Chinese Academy of Sciences. For each plant species, ~20 g of fresh leaves were collected, cleaned with distilled water, and stored in a refrigerator at 4°C until further use.

Protocols
The four cpDNA isolation methods used in our study are described below.
A. Modified high salt method (Figure 5)
Reagents
Both BSA and DTT were added just before the start of the experiment.
Chloroplast isolation
All the following steps were carried out at 0°C unless otherwise stated.
1. Prior to extraction, collect about 20 g (fresh weight) of leaves and keep them in the dark for 48 to 72 hours at 4°C to decrease the starch level in the leaves.
2. Remove the veins from the leaves, cut the leaves into pieces (~1 cm) and homogenize them in 400 ml of ice-cold buffer A for 30 seconds. Filter the homogenate into centrifuge bottles through two layers of Miracloth (Merck), squeezing the cloth gently.
4. Repeat the centrifugation once. The chloroplasts remain suspended in the supernatant.
5. Centrifuge the supernatant at a higher centrifugal force of 3500 g for 20 min. The resulting pellet is the chloroplast pellet, with some contaminating nuclear DNA.
6. Add 250 ml of Buffer B to the pellet and resuspend it gently with a paintbrush to wash off the nuclear DNA attached to the chloroplast membrane. Then centrifuge at 3500 g for 20 min and discard the supernatant.
7. Resuspend the pellet in 250 ml of Buffer B again and centrifuge (3750 g for 20 min) to obtain the purified chloroplasts.
Chloroplast DNA isolation
8. Add 8 ml Buffer C, 1.5 ml 20% SDS, 20 µl β-mercaptoethanol (β-Me) and 30 µl Proteinase K (10 mg/ml) to the purified chloroplast pellet and incubate at 55°C for at least 4 hours or overnight, so that the chloroplasts are fully lysed.
9. Put the centrifuge bottles on ice for 5 minutes, add 1.5 ml 5 M KAc (pH 5.2) and continue chilling for 30 minutes. Then centrifuge at 10000 g for 15 min and discard the pellet.
10. Extract the supernatant twice with an equal volume of saturated phenol and chloroform:isoamyl alcohol (24:1), centrifuging at 10000 g for 20 min each time.
11. Add an equal volume of isopropyl alcohol (about 10 ml) to the upper clear aqueous phase. Then keep the centrifuge bottles at −20°C for 1 hour or overnight.
12. Centrifuge the aqueous phase at 10000 g for 20 min. Wash the cpDNA pellet repeatedly with ethanol (70%, 96%), air-dry it, and re-dissolve it in 50 µl TE buffer.
13. Treat the cpDNA sample with 2 µl RNAse and visualize the DNA band on a 0.8% agarose gel.
B. High salt method [13]
Chloroplast isolation
All the following steps were carried out at 0°C unless otherwise stated.
1. Prior to extraction, collect about 20 g (fresh weight) of leaves and keep them in the dark for 48 to 72 hours at 4°C to decrease the starch level in the leaves.
2. Cut the leaves into pieces (~1 cm) and homogenize them in 100 ml of ice-cold isolation buffer for 30 seconds. Filter the homogenate into centrifuge bottles through two layers of Miracloth (Merck), squeezing the cloth gently.
5. Resuspend the final chloroplast pellet in 10 ml of cold isolation buffer.
Chloroplast DNA isolation
6. Add 1/10 volume of 10% CTAB to lyse the chloroplasts. Incubate at 55°C for 1 to 2 hours.
7. Extract the supernatant twice with an equal volume of saturated phenol and chloroform:isoamyl alcohol (24:1), centrifuging at 10000 g for 20 min each time.
8. Add an equal volume of isopropyl alcohol (about 10 ml) to the upper clear aqueous phase. Then keep the centrifuge bottles at −20°C for 1 hour or overnight.
9. Centrifuge the aqueous phase at 10000 g for 20 min. Wash the cpDNA pellet repeatedly with ethanol (70%, 96%), air-dry it, and re-dissolve it in 50 µl TE buffer.
10. Treat the cpDNA sample with 2 µl RNAse and visualize the DNA band on a 0.8% agarose gel.
C. Sucrose gradient centrifugation [9]
Reagents
Chloroplast isolation
1. Prior to extraction, collect about 20 g (fresh weight) of leaves and keep them in the dark for 48 to 72 hours at 4°C to decrease the starch level in the leaves.
2. Cut the leaves into pieces (~1 cm) and homogenize them in 400 ml of ice-cold isolation buffer for 30 seconds. Filter the homogenate into centrifuge bottles through two layers of Miracloth (Merck), squeezing the cloth gently.
5. Gently load the resuspended pellet onto a step gradient consisting of 18 ml of 52% sucrose overlaid with 7 ml of 30% sucrose.
6. Centrifuge the step gradients at 25,000 rpm for 60 min at 4°C in a swinging bucket rotor.
7. Remove the chloroplast band from the 30–52% interface using a wide-bore pipette, dilute it with 40 ml wash buffer, and centrifuge at 1500 g for 15 min at 4°C.
8. Resuspend the chloroplast pellet in 2 ml wash buffer.
Chloroplast DNA isolation
9. Chloroplast DNA isolation followed steps 6-10 of the high salt method.
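The protocols quote centrifugation both as relative centrifugal force (g) and, for the gradient step, as rotor speed (rpm). The two are related by the standard formula RCF = 1.118 × 10⁻⁵ × r × rpm², with r the rotor radius in cm. The sketch below applies it; the 10 cm radius is a hypothetical value, since the text does not specify the rotor.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard conversion factor for radius in cm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


HYPOTHETICAL_RADIUS_CM = 10.0  # assumption: rotor radius is not given in the text

gradient_rcf = rcf_from_rpm(25_000, HYPOTHETICAL_RADIUS_CM)
pellet_rpm = rpm_from_rcf(3500, HYPOTHETICAL_RADIUS_CM)
print(f"25,000 rpm is about {gradient_rcf:,.0f} x g at r = 10 cm")
print(f"3500 x g needs about {pellet_rpm:,.0f} rpm at r = 10 cm")
```

Because the result scales linearly with the radius, the actual rotor specification should be substituted before relying on these numbers.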
D. DNAse I treatment [9]
In the DNAse I treatment method, the steps were the same as in the sucrose gradient centrifugation method, except that step 9 was replaced by a DNAse I treatment: add 20 µl DNAse I (10 mg/ml) and 250 µl 200 mM MgCl2 to the chloroplast solution and incubate at 37°C for 60 min. Then add 1 ml 0.5 M EDTA to terminate the reaction.

Chloroplast genome sequencing and data analysis
After cpDNA isolation with the modified high salt method, approximately 5–10 µg of DNA was sheared, followed by adapter ligation and library amplification according to the Illumina Sample Preparation Instructions. The fragmented cpDNAs were sequenced at both single-read using the Illumina Genome Analyzer IIx platform at the in-house facility at The Germplasm Bank of Wild Species in Southwestern China. The obtained paired-end reads (2×100 bp read lengths) were assembled against the reference genome sequence using the software program Geneious version 4.7 [20] to roughly estimate the genome coverage and cpDNA purity; reads that aligned to the reference genome sequence were counted as cpDNA sequence. The reference chloroplast genome sequences of O. nivara (NC_005973) and P. persica (NC_014697) were downloaded from GenBank.