Extraction and selection of high-molecular-weight DNA for long-read sequencing from Chlamydomonas reinhardtii

Frédéric Chaux; Nicolas Agier; Stephan Eberhard; Zhou Xu

doi:10.1371/journal.pone.0297014

Abstract

Recent advances in long-read sequencing technologies have enabled the complete assembly of eukaryotic genomes from telomere to telomere by allowing repeated regions to be fully sequenced and assembled, thus filling the gaps left by previous short-read sequencing methods. Furthermore, long-read sequencing can also help characterize structural variants, with applications in the fields of genome evolution and cancer genomics. For many organisms, the main bottleneck in sequencing long reads remains the lack of robust methods to obtain high-molecular-weight (HMW) DNA. For this purpose, we developed an optimized protocol to extract DNA suitable for long-read sequencing from the unicellular green alga Chlamydomonas reinhardtii, based on CTAB/phenol extraction followed by a size selection step for long DNA molecules. We provide validation results for the extraction protocol, as well as statistics obtained with Oxford Nanopore Technologies sequencing.

Citation: Chaux F, Agier N, Eberhard S, Xu Z (2024) Extraction and selection of high-molecular-weight DNA for long-read sequencing from Chlamydomonas reinhardtii. PLoS ONE 19(2): e0297014. https://doi.org/10.1371/journal.pone.0297014

Editor: Ramachandran Srinivasan, Sathyabama Institute of Science and Technology, INDIA

Received: August 25, 2023; Accepted: December 26, 2023; Published: February 8, 2024

Copyright: © 2024 Chaux et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All sequencing data, which include raw Nanopore FAST5 files and read FASTQ files, and genome assemblies (as FASTA files) have been submitted to the European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena/browser/home) under the project accession number PRJEB59713.

Funding: Research in ZX’s laboratory was supported by ANR grant “AlgaTelo” (ANR-17-CE20-0002-01; https://anr.fr/) and by Ville de Paris (Programme Émergence(s); https://www.paris.fr/appels-a-projets). SE was supported by the “Initiative d’Excellence” program of the French State (‘DYNAMO’, ANR-11-LABX-0011-01; https://anr.fr/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

In recent years, long-read sequencing technologies, such as the ones developed by Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (Nanopore), have emerged as a solution to the pitfalls of short-read technologies in the detection of structural variants and in assembling repeated sequences and other complex regions [1]. Additionally, because native DNA is used, long-read technologies can directly detect a variety of modified bases, including the most commonly studied methylated cytosines [2, 3]. For their applications in genome assembly and structural variant detection, these technologies typically sequence DNA molecules ranging in size from kilobases to hundreds of kilobases as a continuous read. Reads traversing repeated sequences are necessary to correctly assemble neighboring regions, with longer reads enabling more contiguous genome assemblies. Today, the major bottleneck in sequencing long reads is the difficulty of extracting high-quality DNA that is free of polyphenol and polysaccharide contaminants and long enough for this purpose. This is especially true for most plant tissues and algal cells, because polyphenols and polysaccharides are often co-extracted with DNA and can inhibit downstream applications such as sequencing [4, 5].

Chlamydomonas reinhardtii is a unicellular green alga that is widely used as a model organism to study photosynthesis and cellular motility [6], and is an organism of choice for biotechnological applications, with many synthetic biology tools currently being developed [7, 8]. In C. reinhardtii, as for other plants and algae, contending with phenolic and polysaccharide contaminants while preserving HMW DNA is a major challenge and requires an optimized protocol. PacBio and Nanopore sequencing have been performed on this organism, contributing to important advances in our understanding of its genome structure and content, base modifications and evolution [9–16]. However, it appears that a size selection step can substantially enrich for longer molecules, as noted in [14, 15] and as we demonstrate in this work. An efficient and well-documented protocol is therefore needed for sequencing projects that require long DNA molecules.

Here, we present a detailed protocol for efficiently extracting and selecting HMW DNA from C. reinhardtii cells. The protocol minimizes DNA-shearing manipulations [17] and comprises an additional step to enrich for HMW DNA. We validated the method by pulsed-field gel electrophoresis (PFGE) and by measuring read length from Nanopore sequencing.

Materials and methods

The protocol described in this peer-reviewed article is published on protocols.io, dx.doi.org/10.17504/protocols.io.8epv59j9jg1b/v2 and is included for printing purposes as S1 File.

Nanopore sequencing

Sequencing libraries were prepared as per the manufacturer’s recommendations, using the NEBNext companion module (E7180S, NEB) and Ligation Sequencing Kit SQK-LSK109 (Nanoporetech), except for the ligation time, which we increased to 30 min. For each run, 500 ng were loaded on MinION flow cells (R9.4.1, Nanoporetech) and sequenced for 6 h to 16 h, depending on flow-cell kinetics. Libraries were loaded at least twice, with a 1 h wash using the manufacturer’s washing buffer (EXP-WSH004) between runs. Basecalling was performed using Guppy (version 4.3.4) with parameters set to “high accuracy”.
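As an illustration of how basecalled reads can be turned into the per-read length and quality values discussed in the Results, the following is a minimal Python sketch and not the pipeline used in this work. It assumes plain 4-line FASTQ records with Phred+33 quality encoding, and it averages quality in error-probability space, a common convention for Nanopore reads; the function name `read_fastq_stats` is hypothetical.

```python
import math


def read_fastq_stats(lines):
    """Yield (read_id, length_bp, mean_quality) for each 4-line FASTQ record.

    Assumptions (not taken from the article): records are exactly four
    lines long and qualities are Phred+33 encoded.
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = next(it).rstrip("\n")
        next(it)  # '+' separator line, ignored
        quality = next(it).rstrip("\n")
        if not quality:
            continue  # skip empty reads to avoid dividing by zero
        # Convert each Phred score to an error probability, average the
        # probabilities, then convert the mean back to a Phred score.
        error_probs = [10 ** (-(ord(char) - 33) / 10) for char in quality]
        mean_error = sum(error_probs) / len(error_probs)
        mean_quality = -10 * math.log10(mean_error)
        read_id = header[1:].split()[0]
        yield read_id, len(sequence), mean_quality
```

The generator accepts any iterable of lines, so an open file handle can be passed directly, for example `with open("reads.fastq") as handle: stats = list(read_fastq_stats(handle))`, where the file name is a placeholder.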

Results

We extracted genomic DNA following the presented protocol (S1 File) and applied size selection using the Short Read Eliminator (SRE) kit (Circulomics). This easy-to-use method requires no dedicated device and relies on length-dependent precipitation of nucleic acids driven by polyvinylpyrrolidone crowding. Large amounts of small DNA fragments can be detrimental for long-read Nanopore sequencing [18], not only because the resulting reads are short, but also because these molecules can outcompete the longer ones for both adapter ligation and pore usage, thus yielding suboptimal results.
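The competition argument can be made concrete with a little arithmetic: for a fixed mass of DNA, the number of molecules, and therefore of ligatable ends, scales inversely with fragment length. The sketch below is illustrative only and is not part of the published protocol. It assumes an average mass of 650 g/mol per base pair, a common approximation; the function names and the example masses are hypothetical.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (assumed approximation)


def dsdna_fmol(mass_ng, length_bp):
    """Femtomoles of double-stranded DNA molecules for a mass and a length."""
    # ng -> g is 1e-9 and mol -> fmol is 1e15, hence the combined factor 1e6.
    return mass_ng * 1e6 / (length_bp * AVG_BP_MASS)


def molar_fraction_short(mass_short_ng, len_short_bp, mass_long_ng, len_long_bp):
    """Fraction of all molecules contributed by the short-fragment population."""
    short = dsdna_fmol(mass_short_ng, len_short_bp)
    long_ = dsdna_fmol(mass_long_ng, len_long_bp)
    return short / (short + long_)
```

Under these assumptions, a hypothetical 500 ng sample in which 50 ng consists of 2 kb fragments and 450 ng of 50 kb fragments would have about 74% of its molecules in the short population, even though that population is only 10% of the mass.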

The size distribution of the extracted DNA was assessed by PFGE and Nanopore sequencing, with and without size selection for HMW DNA. Samples were separated by PFGE, stained with ethidium bromide and imaged with UV light (Fig 1A). The DNA molecules extracted without size selection migrated as a large smear spread between approximately 1.5 and 150 kb. After size selection with the SRE kit, the upper part of the distribution remained unchanged while the low-molecular-weight fragments (< 10 kb) were visibly reduced. We made a similar observation after electrophoresis and staining of the samples in a 0.3% agarose gel (Fig 1B).

Fig 1. Visualization of extracted genomic DNA size distributions.

(a) PFGE using 0.5 μg of DNA prepared with (+) or without (-) SRE size selection, embedded in 30 μl of 0.5% low-melting agarose plugs, migrated in a 1% SeaKem GTG agarose (Lonza) gel. The ladder is a mix of PFG mid-range (N0342S, NEB) and GeneRuler 1 kb Plus (SM1331, Thermo Fisher). Electrophoresis conditions: 0.5X TBE (Tris Borate EDTA) buffer, 6 V.cm^-1, 120° angle, for 11 h, switching time ramp from 1 to 60 seconds. Gel stained with ethidium bromide and imaged with UV. (b) Standard gel electrophoresis (0.3% agarose) of the indicated samples. GeneRuler 1 kb Plus (SM1331, Thermo Fisher) is used as the ladder. See S3 Fig for the uncropped images.

https://doi.org/10.1371/journal.pone.0297014.g001

Size selection of DNA fragments before preparation of libraries for Nanopore sequencing led to a substantially decreased number of shorter molecules and an enrichment of longer ones (Fig 2A and 2B), without negatively affecting read quality (Fig 2D) and with no effect on genome-wide sequencing depth (S1 Fig). Size selection doubled the mean read length and increased the N50 from 12 kb to 17 kb, with reads in the top decile longer than 21 kb (S1 Table). The length distribution after size selection was robust across different experiments using two other independent biological samples, and reached an N50 of up to 20 kb and a top decile length of up to 27 kb (Fig 2C and S1 Table). The longest molecules we sequenced exceeded 100 kb; such reads are instrumental for genome assemblies. Indeed, we recently assembled the genome of C. reinhardtii based on these reads and found a genome size between 114 and 117.7 Mb [15], depending on the assembler, which is consistent with the 114 Mb of the recently released version 6 of the reference genome [16]. Overall, this protocol and the resulting quality and length of the DNA molecules are suitable for reaching highly contiguous genome assemblies.
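For readers less familiar with these summary statistics, the N50 is the read length such that reads of that length or longer contain at least half of all sequenced bases. The following Python sketch shows one way to compute the mean, the N50 and a top-decile threshold from a list of read lengths. It is an illustration under stated assumptions and not the code used to produce S1 Table: the function names are hypothetical and the top-decile threshold uses a simple nearest-rank 90th percentile.

```python
def mean_length(lengths):
    """Arithmetic mean of read lengths."""
    return sum(lengths) / len(lengths)


def n50(lengths):
    """Smallest length L such that reads >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


def top_decile_threshold(lengths):
    """Nearest-rank 90th percentile: the longest 10% of reads are at least this long."""
    ordered = sorted(lengths)
    if not ordered:
        raise ValueError("lengths must not be empty")
    rank = (9 * len(ordered) + 9) // 10  # integer ceiling of 0.9 * n
    return ordered[rank - 1]
```

Because the N50 weights reads by the bases they carry, it is less sensitive than the mean to a large number of very short reads, which is why both values are useful when comparing samples with and without size selection.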

Fig 2. Distributions of read length in Nanopore-sequenced datasets.

(a, b) Count percentage of (a) reads and of (b) bases as a function of read length obtained from genomic DNA of C. reinhardtii (experiment “A”, see S1 Table) with or without size selection (+SRE and -SRE). (c) Count of bases after size-selection (+SRE) as a function of read length obtained from three different biological samples (see S1 Table and S2 Fig). (d) Quality score for individual reads, grouped into bins of 0.1 log unit for samples “A-SRE” and “A+SRE”. The shaded areas represent the values between the 1^st and 3^rd quartiles.

https://doi.org/10.1371/journal.pone.0297014.g002
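The grouping described for panel (d), reads binned by 0.1 log unit of length with the interquartile range of quality per bin, can be sketched in Python as follows. This is a hedged illustration rather than the plotting code behind Fig 2: the function name, the input format of (length, quality) pairs, and the choice of the "inclusive" quantile method are assumptions.

```python
import math
from collections import defaultdict
from statistics import quantiles


def quality_quartiles_by_length(reads, bins_per_decade=10):
    """Group (length_bp, quality) pairs into log10-length bins.

    Returns {bin_start_in_log10_units: (q1, median, q3)}. With the default
    of 10 bins per decade, each bin spans 0.1 log unit. Bins holding fewer
    than two reads are skipped because quartiles are not defined for them.
    """
    bins = defaultdict(list)
    for length, quality in reads:
        index = math.floor(math.log10(length) * bins_per_decade)
        bins[index].append(quality)

    result = {}
    for index in sorted(bins):
        values = bins[index]
        if len(values) < 2:
            continue
        q1, median, q3 = quantiles(values, n=4, method="inclusive")
        result[index / bins_per_decade] = (q1, median, q3)
    return result
```

The first and third quartiles returned for each bin correspond to the kind of shaded band described in the caption, with the median as the central line.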

Supporting information

S1 File. Step-by-step protocol, also available on protocols.io: dx.doi.org/10.17504/protocols.io.8epv59j9jg1b/v2.

https://doi.org/10.1371/journal.pone.0297014.s001

(PDF)

S1 Table. Summary statistics for 6 DNA preparations and sequencing experiments.

Major limiting outputs are shown in red. ^a https://www.chlamylibrary.org and reference [19]. ^b with quality > 7. ^c as per manufacturer’s protocol (Monarch® HMW DNA Extraction Kit for Tissue Cat. no. T3060L, New England Biolabs). ^d cell lysis using DNeasy Maxi Plant (Cat. no. 68163, Qiagen) as in [20] and purification using Genomic-tip 100/G (Cat. no. 10243, Qiagen), then AMPure beads (Cat. no. A63880, Beckman Coulter).

https://doi.org/10.1371/journal.pone.0297014.s002

(PDF)

S1 Fig. Genome-wide sequencing depth normalized to the median, for all chromosomes, using DNA obtained with (+) or without (-) SRE size selection.

https://doi.org/10.1371/journal.pone.0297014.s003

(PDF)

S2 Fig. Count percentage of bases as a function of read length with alternative sample preparations without size selection (-SRE).

See S1 Table for details. Sample C was sequenced in the presence of control DNA (“DNA CS” from Oxford Nanopore sequencing), which peaked at 3 kb.

https://doi.org/10.1371/journal.pone.0297014.s004

(PDF)

S3 Fig. Raw images.

https://doi.org/10.1371/journal.pone.0297014.s005

(PDF)

Acknowledgments

We thank Samuel O’Donnell for his help in the initial development of this protocol.

References

1. Logsdon GA, Vollger MR, Eichler EE. Long-read human genome sequencing and its applications. Nat Rev Genet. 2020;21(10):597–614. pmid:32504078
2. Rand AC, Jain M, Eizenga JM, Musselman-Brown A, Olsen HE, Akeson M, et al. Mapping DNA methylation with high-throughput nanopore sequencing. Nat Methods. 2017;14(4):411–3. pmid:28218897
3. Feng Z, Fang G, Korlach J, Clark T, Luong K, Zhang X, et al. Detecting DNA modifications from SMRT sequencing data by modeling sequence context dependence of polymerase kinetic. PLoS Comput Biol. 2013;9(3):e1002935. pmid:23516341
4. Porebski S, Bailey LG, Baum BR. Modification of a CTAB DNA extraction protocol for plants containing high polysaccharide and polyphenol components. Plant molecular biology reporter. 1997;15(1):8–15.
5. Healey A, Furtado A, Cooper T, Henry RJ. Protocol: a simple method for extracting next-generation sequencing quality genomic DNA from recalcitrant plant species. Plant Methods. 2014;10:21. pmid:25053969
6. Harris EH. The Chlamydomonas Sourcebook: Elsevier/Academic Press; 2009.
7. Scaife MA, Nguyen G, Rico J, Lambert D, Helliwell KE, Smith AG. Establishing Chlamydomonas reinhardtii as an industrial biotechnology host. Plant J. 2015;82(3):532–46. pmid:25641561
8. Crozet P, Navarro FJ, Willmund F, Mehrshahi P, Bakowski K, Lauersen KJ, et al. Birth of a Photosynthetic Chassis: A MoClo Toolkit Enabling Synthetic Biology in the Microalga Chlamydomonas reinhardtii. ACS Synth Biol. 2018;7(9):2074–86. pmid:30165733
9. O’Donnell S, Chaux F, Fischer G. Highly Contiguous Nanopore Genome Assembly of Chlamydomonas reinhardtii CC-1690. Microbiol Resour Announc. 2020;9(37). pmid:32912911
10. Liu Q, Fang L, Yu G, Wang D, Xiao CL, Wang K. Detection of DNA base modifications by deep recurrent neural network on Oxford Nanopore sequencing data. Nat Commun. 2019;10(1):2449. pmid:31164644
11. Chaux-Jukic F, O’Donnell S, Craig RJ, Eberhard S, Vallon O, Xu Z. Architecture and evolution of subtelomeres in the unicellular green alga Chlamydomonas reinhardtii. Nucleic Acids Res. 2021;49(13):7571–87. pmid:34165564
12. Craig RJ, Hasan AR, Ness RW, Keightley PD. Comparative genomics of Chlamydomonas. Plant Cell. 2021. pmid:33793842
13. Lopez-Cortegano E, Craig RJ, Chebib J, Balogun EJ, Keightley PD. Rates and spectra of de novo structural mutations in Chlamydomonas reinhardtii. Genome Res. 2023;33(1):45–60. pmid:36617667
14. Payne ZL, Penny GM, Turner TN, Dutcher SK. A gap-free genome assembly of Chlamydomonas reinhardtii and detection of translocations induced by CRISPR-mediated mutagenesis. Plant Commun. 2023;4(2):100493. pmid:36397679
15. Chaux F, Agier N, Garrido C, Fischer G, Eberhard S, Xu Z. Telomerase-independent survival leads to a mosaic of complex subtelomere rearrangements in Chlamydomonas reinhardtii. Genome Res. 2023;33(9):1582–98. pmid:37580131
16. Craig RJ, Gallaher SD, Shu S, Salome PA, Jenkins JW, Blaby-Haas CE, et al. The Chlamydomonas Genome Project, version 6: Reference assemblies for mating-type plus and minus strains reveal extensive structural mutation in the laboratory. Plant Cell. 2023;35(2):644–72. pmid:36562730
17. Kovacic RT, Comai L, Bendich AJ. Protection of megabase DNA from shearing. Nucleic Acids Res. 1995;23(19):3999–4000. pmid:7479050
18. Delahaye C, Nicolas J. Sequencing DNA with nanopores: Troubles and biases. PLoS One. 2021;16(10):e0257521. pmid:34597327
19. Li X, Zhang R, Patena W, Gang SS, Blum SR, Ivanova N, et al. An Indexed, Mapped Mutant Library Enables Reverse Genetics Studies of Biological Processes in Chlamydomonas reinhardtii. Plant Cell. 2016;28(2):367–87. pmid:26764374
20. Eberhard S, Valuchova S, Ravat J, Fulnecek J, Jolivet P, Bujaldon S, et al. Molecular characterization of Chlamydomonas reinhardtii telomeres and telomerase mutants. Life science alliance. 2019;2(3). pmid:31160377

