Rapid and collective determination of the complete “hot-spring frog” mitochondrial genome containing long repeat regions using Nanopore sequencing

  • Yuka Asaeda,

    Roles Data curation, Formal analysis, Investigation, Writing – original draft, Writing – review & editing

    Affiliation Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Japan

  • Kento Shiraga,

    Roles Formal analysis, Investigation, Writing – review & editing

    Affiliation Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Japan

  • Makoto Suzuki,

    Roles Data curation, Writing – review & editing

    Affiliations Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Japan, Amphibian Research Center, Hiroshima University, Higashi-Hiroshima, Japan

  • Yoshihiro Sambongi,

    Roles Data curation, Writing – original draft, Writing – review & editing

    Affiliation Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Japan

  • Hajime Ogino,

    Roles Data curation, Supervision, Writing – review & editing

    Affiliations Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Japan, Amphibian Research Center, Hiroshima University, Higashi-Hiroshima, Japan

  • Takeshi Igawa

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    tigawa@hiroshima-u.ac.jp

    Affiliations Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Japan, Amphibian Research Center, Hiroshima University, Higashi-Hiroshima, Japan

Abstract

The mitochondrial genome (mt-genome) is a promising molecular marker for phylogenetics and population genetics. Recently, various mt-genomes have been determined rapidly using massively parallel sequencers. However, the control region (CR, also called D-loop) in mt-genomes remains difficult to determine precisely because of the presence of repeat regions. Here, using Nanopore sequencing, we succeeded in the rapid and collective determination of the complete mt-genome of the hot-spring frog, Buergeria japonica, and found that it was 22,274 bp long, including a CR (6,929 bp) with two types of tandem repeat motifs forming repeat regions. Comparison of assembly strategies revealed that combining long- and short-read data enabled efficient determination of the CR, whereas short-read data alone did not. The B. japonica CR was longer than that of a congeneric species inhabiting cooler climate areas, Buergeria buergeri, because of the long repeat regions in the former. During the thermal adaptation of B. japonica, the longer repeat regions in its CR may have accumulated after its divergence from B. buergeri.

Introduction

Recent advances in massively parallel sequencing technologies have facilitated the rapid acquisition of genomic information, leading to the development of a variety of applications. Such technologies also apply to mitochondrial genomes (mt-genomes) [e.g., 1–3], which are used as molecular markers for phylogenetics and population genetics. However, the control region (CR, also called D-loop) in the mt-genome of some animals is difficult to sequence because of the presence of repeat sequences. To overcome this problem, long-read sequencing technologies [4, 5] are beginning to contribute to the determination of sequences that have so far been difficult to read.

In amphibians, the CR includes direct and inverted tandem repeat sequences, which increase the total length of the mt-genome. Moreover, frequent gene rearrangements have been reported in modern anuran amphibians (Suborder Neobatrachia) [6]. The mt-genome of the bell-ring frog, Buergeria buergeri, also includes a long CR of approximately 4.6 kbp and a rearranged ND5 gene (translocation of ND5 next to the CR) [7]. However, the evolutionary origin of these mt-genome features has not been clarified because the mt-genome of only one species of the genus Buergeria has been reported so far. Recent molecular phylogenetic analyses based on several mitochondrial genes revealed that B. japonica, which inhabits large regions across Taiwan and the Ryukyu Archipelago in Japan, was the first species to diverge within the genus Buergeria [8, 9]. Thus, sequencing the B. japonica mt-genome can reveal the origin of mt-genome features in this genus. Apart from being important molecular markers, mitochondrial genes, which are involved in aerobic energy metabolism, are also known to contribute to temperature tolerance in some organisms [10, 11]. Because B. japonica is also known as the “hot-spring frog” and its tadpoles have high temperature tolerance [12–15], determining the B. japonica mt-genome sequence and comparing it with that of B. buergeri, which inhabits cooler climate areas, should provide clues to the evolutionary strategies for temperature adaptation.

In the present study, we sequenced and assembled the B. japonica mt-genome. We first obtained sequence data using Illumina and Oxford Nanopore Technologies (ONT) sequencers, and then examined several recent assembly tools. The results demonstrate sequencing and assembly strategies that successfully yield the complete B. japonica mt-genome, which can in turn help to shed light on organismal thermal adaptation in this species.

Materials and methods

Extraction of genomic DNA

A female B. japonica individual was collected in April 2018 at Seranma hot spring in Kuchinoshima, Tokara Islands, Japan. Permission to perform field sampling and experiments on Kuchinoshima Island was granted by the mayor of Toshima village. The individual was kept until its death in March 2019 at the Amphibian Research Center, Hiroshima University, in an aquarium with clean, slowly flowing water, in accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the Hiroshima University Animal Research Committee. Genomic DNA was extracted from the muscle of the freshly dead individual using DNA suisui-F (Rizo, Tsukuba, Japan) according to the manufacturer's instructions and dissolved in 50 μL of nuclease-free water. All procedures were approved by the Hiroshima University Animal Research Committee (Approval number: G17-9).

Genomic DNA sequencing using Illumina sequencer

The extracted genomic DNA was used for library construction with the NEBNext® DNA Library Prep Kit (New England Biolabs, MA, USA), and sequencing (150 bp paired-end) was conducted at Novogene Co. Ltd. (Beijing, China) on a NovaSeq 6000 sequencer.

Genomic DNA and amplicon sequencing using Nanopore sequencer

The extracted genomic DNA was also used for library construction for Nanopore sequencing using a ligation library preparation kit (LQK-LSK109, ONT). Following the manufacturer's instructions, the quality and molecular weight of the genomic DNA were measured using Qubit, and sequencing was conducted using a MinION sequencer (MinION Mk1B, ONT) and a Flongle flow cell (FLO-FLG001, ONT).

To sequence the mt-genome more efficiently, we also conducted amplicon sequencing. To amplify mitochondrial DNA fragments by PCR, two sets of oligo DNA primers were designed for two regions (spanning from ND6 to 12S rRNA and from 12S rRNA to Cytb) based on the sequence of B. buergeri (AB127977.1) [7] and partial sequences of Cytb from B. japonica (AB998751-63) [16]: Bb_ND6_13766_Fow 5’-CTCGGACACCCCTCATCACTCA-3’; Bb_12SrRNA_2722_Rev 5’-GAGCTGCACCTTGACCTGACGT-3’; Bb_12SrRNA_2559_Fow 5’-CAACGCCAGGGAATTACGAGCT-3’; Bj_Cytb_Rev 5’-AGGATTTTTGTAAGTGGGCGGAA-3’. Polymerase chain reaction (PCR) amplification was performed in a Bio-Rad Laboratories T100 thermal cycler in 20 μL reactions containing 10 μL KOD One® PCR Master Mix (TOYOBO, Osaka, Japan), 1 μL DNA solution, and 10 pmol of each primer. Thermal cycling consisted of an initial step at 95 °C for 3 min, followed by 35 cycles of 98 °C for 10 s and 68 °C for 3 min. PEG (polyethylene glycol) precipitation [17] was then conducted to remove the remaining primers, and the sample was dissolved in 10 μL of nuclease-free water. The DNA concentration of the PCR products (amplicons) was measured using a Qubit Fluorometer (Thermo Fisher Scientific) and the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific), and 150 ng of each amplicon was used for library preparation with the LQK-LSK109 ligation library preparation kit. Library preparation and sequencing were conducted in the same manner as for genomic DNA sequencing. Basecalling was performed with MinKNOW software (ONT) on a PC equipped with an NVIDIA Quadro P2200, in real time and in super accuracy basecalling mode.

De novo assembly of mt-genome

We attempted assembly of the B. japonica mt-genome using three different datasets: (1) Illumina short-read data (genomic DNA, 150 bp paired-end); (2) Nanopore long-read data (genomic DNA); and (3) Nanopore long-read data (amplicon) (Fig 1). For dataset (1), we used the “all” command of MitoZ 2.4a [18] after trimming adapters and low-quality sequences using Trimmomatic v0.39 [19]. For dataset (2), we used a previously developed pipeline [20]. Reads with an average quality score of 10 or higher were retained using NanoFilt [21], and reads derived from the mt-genome were then extracted using mtBlaster (https://github.com/nidafra92/squirrel-project/blob/master/mtblaster.py), with the mt-genome sequence of B. buergeri (AB127977.1) as the reference. The extracted reads were assembled using minimap2 v2.24-r1122 [22] with the runtime option ava-ont and miniasm v0.3-r179 [22]. The resultant contigs were polished using racon v1.5.0 [23] after re-mapping the raw reads to the contigs using minimap2; to improve sequence accuracy, this polishing was performed five times. Finally, the polished sequence was indexed using BWA-mem2 [24] and polished again using Nanopolish (https://github.com/jts/nanopolish). For dataset (3), reads with an average quality score of 10 or higher and a length of 1,000 bp or greater were retained using NanoFilt and assembled using Trycycler [25]. The resulting contigs were filtered to lengths between 15,000 bp and 25,000 bp using the ‘seq’ command of seqkit [26], and then aligned and merged in MEGA X [27]. Finally, after comparing the three assemblies, the complete mt-genome sequence of B. japonica was determined by combining the coding region from (1) and the CR from (3). The length of the CR was verified by examining the distribution and frequency of CR lengths in the raw reads of (3) using electronic PCR (ePCR) [28], with the 5’ and 3’ ends of the CR as a primer pair. The protein-coding genes were annotated based on the mt-genome of B. buergeri using the ‘Annotate from Database’ function of Geneious (version 9.1.8).
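
The read and contig thresholds described above can be illustrated with a minimal Python sketch. This is not the NanoFilt or seqkit implementation: the function names are hypothetical, and the mean quality is computed here by averaging per-base error probabilities, which is an assumption about how the average quality score is defined.

```python
import math


def mean_quality(quality_string, offset=33):
    """Mean Phred quality of a read, averaged over error probabilities.

    Assumption: quality characters use the Phred+33 encoding.
    """
    if not quality_string:
        return 0.0
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    return -10 * math.log10(sum(probs) / len(probs))


def keep_read(sequence, quality_string, min_quality=10, min_length=1000):
    """Thresholds used for dataset (3); set min_length=0 for dataset (2)."""
    long_enough = len(sequence) >= min_length
    return long_enough and mean_quality(quality_string) >= min_quality


def keep_contig(sequence, min_length=15000, max_length=25000):
    """Keep contigs whose length lies within the expected mt-genome range."""
    return min_length <= len(sequence) <= max_length
```

Under these thresholds, a 22,272 bp contig passes the length filter, whereas contigs shorter than 15,000 bp or longer than 25,000 bp are discarded before alignment.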

Fig 1. Flowchart of mitochondrial genome assembly of B. japonica using Illumina and Nanopore sequence data.

https://doi.org/10.1371/journal.pone.0280090.g001

Results

The total sequence yield from B. japonica genomic DNA using the Illumina sequencer (1) was 1,256,313,437 bp (8,375,809 paired reads), whereas the yields from genomic DNA (2) and mt-genome PCR amplicons (3) using the ONT MinION/Flongle sequencer were 5,491,346,786 bp (total reads: 1,106,685, read length N50: 9,354 bp, read length average: 4,962 bp) and 255,561,618 bp (total reads: 80,109, read length N50: 10,107 bp, read length average: 3,190 bp), respectively (Table 1). Data were deposited in the DDBJ Sequence Read Archive (DRA) under accession numbers DRR418996–8. De novo assembly of the short reads (1) using the MitoZ assembler resulted in a B. japonica mt-genome of 15,574 bp with a mean coverage depth of 943 per nucleotide. In contrast, assembly of the long reads from genomic DNA (2) using the pipeline [20] and from amplicons (3) using Trycycler [25] resulted in mt-genomes of 21,833 bp (mean coverage depth of 270 per nucleotide) and 22,272 bp (mean coverage depth of 2,253 per nucleotide), respectively.
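
For readers unfamiliar with the read-length statistics reported above, the following Python sketch shows one common way to compute the total yield, average read length, and N50 from a list of read lengths. The function name is hypothetical, and the N50 definition used here (the length of the read at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half of the total yield) is an assumption about how the reported values were derived.

```python
def read_length_stats(lengths):
    """Summarize read lengths: read count, total bases, mean length, and N50."""
    if not lengths:
        raise ValueError("at least one read length is required")
    total = sum(lengths)
    running = 0
    n50 = 0
    # Walk from the longest read down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "reads": len(lengths),
        "bases": total,
        "mean": total / len(lengths),
        "n50": n50,
    }
```

As a consistency check on the reported figures, dividing each Nanopore yield by its read count reproduces the stated averages: 5,491,346,786 / 1,106,685 is about 4,962 bp, and 255,561,618 / 80,109 is about 3,190 bp.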

Table 1. Summary of sequence data and results of assembly.

https://doi.org/10.1371/journal.pone.0280090.t001

The CR sequences obtained in (1), (2), and (3) differed in length: 229 bp, 6,488 bp, and 6,929 bp, respectively. The long-read assemblies in (2) and (3) contained two types of tandem repeat motif sequences [20–24 copies of repeat motif A (40 bp) and 168–203 copies of repeat motif B (20 bp)] (Table 1). These repeat motifs were also found in the assembly of (1), but only in small copy numbers (5 for motif A and 1 for motif B). Because the number of repeat motifs differed even between the assemblies of (2) and (3), we measured the length of the CR in the raw reads of (2) and (3) by ePCR. Histograms of CR length exhibited unimodal distributions, with peaks at 6,751–6,850 bp in (2) and 6,851–6,950 bp in (3) (Fig 2). In the coding region, 15 and 14 sites in the assemblies of (2) and (3), respectively, differed from those of (1), possibly because of homopolymer errors in Nanopore sequencing. Therefore, we combined the coding region from the assembly of (1) with the CR from the assembly of (3) as the final complete nucleotide sequence of the B. japonica mt-genome (Fig 1), resulting in a total mt-genome length of 22,274 bp. The mt-genome includes a 6,929 bp CR comprising conserved sequence blocks (CSB) 1–3, a termination-associated sequence (TAS), the replication origin of the H-strand, and a total of 227 repeat motif sequences (24 motif A repeats and 203 motif B repeats) (Fig 3).
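
The idea behind measuring CR length in raw reads can be sketched in Python as follows. This is a simplified illustration, not the ePCR program [28]: it uses exact string matching only, whereas Nanopore reads contain errors and a real analysis would need to tolerate mismatches. The function names are hypothetical, and the bin origin of 51 is an assumption chosen only so that the bins match the edges quoted in the text (6,751–6,850 bp and 6,851–6,950 bp).

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon_length(read, left_anchor, right_anchor):
    """Length from the start of left_anchor to the end of right_anchor.

    Both anchors are given on the same strand. The read is searched on both
    strands with exact matching; None is returned if no product is found.
    """
    for strand in (read, reverse_complement(read)):
        start = strand.find(left_anchor)
        if start == -1:
            continue
        end = strand.find(right_anchor, start + len(left_anchor))
        if end != -1:
            return end + len(right_anchor) - start
    return None


def length_histogram(lengths, bin_width=100, origin=51):
    """Count lengths in inclusive bins such as (6751, 6850) and (6851, 6950)."""
    counts = Counter()
    for length in lengths:
        lower = origin + ((length - origin) // bin_width) * bin_width
        counts[(lower, lower + bin_width - 1)] += 1
    return dict(counts)
```

With this binning, the 6,929 bp CR of the final sequence falls in the 6,851–6,950 bp bin.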

Fig 2. Representative histogram showing length distribution of the raw reads from the control region.

https://doi.org/10.1371/journal.pone.0280090.g002

Fig 3. Distribution of repeat motifs in the control region of B. japonica mitochondrial genome.

The upper and lower numbers indicate nucleotide positions from the 5’ end of the control region and from 5’ end of the ND5 gene, respectively.

https://doi.org/10.1371/journal.pone.0280090.g003

In the final complete nucleotide sequence of the B. japonica mt-genome, genes for 13 proteins, 2 ribosomal RNAs, and 22 transfer RNAs were annotated. These coding regions started with the ATG codon except COX1 (ATA), ND2 (ATT), ND4L (ATC), ND6 (ATT), and Cytb (ATT). COX1 and ATP8 were terminated by TAA, and COX3, ATP6, ND1, ND3, ND4, and ND4L were terminated by an incomplete stop codon, T. Moreover, COX2, ND5, and Cytb were terminated by AGA, and ND6 and ND2 were terminated by AGG and TAG, respectively. The ND5 gene was located next to the CR. The final assembly was deposited in the DDBJ database under accession number LC739528.
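
The start and stop codons listed above can be checked against the vertebrate mitochondrial genetic code (NCBI translation table 2), in which ATN and GTG can serve as initiation codons and TAA, TAG, AGA, and AGG as termination codons. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes that an incomplete stop codon appears as one or two trailing bases (T or TA) after the last full codon of the annotated coding sequence.

```python
START_CODONS = {"ATG", "ATA", "ATT", "ATC", "GTG"}
STOP_CODONS = {"TAA", "TAG", "AGA", "AGG"}


def classify_cds_ends(cds):
    """Report the start codon and the type of stop signal of a coding sequence."""
    cds = cds.upper()
    start = cds[:3]
    remainder = len(cds) % 3
    if remainder == 0:
        # Full final codon: check it against the mitochondrial stop codons.
        stop = cds[-3:]
        stop_kind = "complete" if stop in STOP_CODONS else "none"
    else:
        # Trailing T or TA: an incomplete stop codon that would be completed
        # to TAA by polyadenylation of the transcript.
        stop = cds[-remainder:]
        stop_kind = "incomplete" if stop in ("T", "TA") else "none"
    return {
        "start": start,
        "start_ok": start in START_CODONS,
        "stop": stop,
        "stop_kind": stop_kind,
    }
```

For example, a sequence beginning with ATA and ending with a single trailing T would be reported as having a valid start codon and an incomplete stop codon.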

Discussion

Using short- and long-read sequencers, we succeeded in the rapid and collective determination of the complete B. japonica mt-genome. The present assembly using only short reads (1) resulted in a short CR (229 bp) with a limited number of repeat motifs. This may be because the read length is insufficient to reconstruct the whole sequence: the sequence of a repeat region cannot be defined unless the reads cover both its start and its end. In addition, the number of repeat motifs is variable in both the raw genomic DNA reads (2) and the amplicon reads (3) (Fig 2); this variability may be attributable to somatic mutation (heteroplasmy) and/or PCR error. Since the advent of the Illumina sequencer, the mt-genomes of many species have been determined using only short-read sequencing with fully automated assemblers (e.g. MitoZ). Although amphibians tend to have a longer CR with repeat motifs, many published mt-genome assemblies of amphibian species may lack the CR sequence because they were assembled using short reads only. Therefore, for precise comparison of mt-genomes among species, a careful strategy is necessary to assemble the sequence data.
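
The read-length argument above can be made concrete with a small Python sketch. It asks whether a single read is long enough to cover a region plus some unique flanking sequence on each side. The function name and the 20 bp anchor are hypothetical, and treating all copies of each motif as one contiguous block is a simplifying assumption; the motif sizes and copy numbers are those of the final sequence reported in the Results.

```python
def spans_region(read_length, region_length, min_anchor=20):
    """True if a read can cover the region plus min_anchor bases on each side."""
    return read_length >= region_length + 2 * min_anchor


# Total repeat content from the final sequence (24 x 40 bp and 203 x 20 bp).
MOTIF_A_BP = 24 * 40    # 960 bp
MOTIF_B_BP = 203 * 20   # 4,060 bp
CR_BP = 6929

# A 150 bp Illumina read cannot span even the motif A repeats.
short_read_ok = spans_region(150, MOTIF_A_BP)    # False
# A read at the amplicon N50 (10,107 bp) can span the entire CR.
long_read_ok = spans_region(10107, CR_BP)        # True
```

Together the two motif types account for 5,020 bp of the 6,929 bp CR, which is far beyond what a 150 bp read can bridge.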

When comparing (2) and (3) for determining the longer CR, (3) is essentially sufficient and is more efficient because PCR enriches the mitochondrial sequence. While the amount of data generated in (2) (approximately 5 Gb) required a MinION flow cell, the data for (3) can be obtained with a Flongle flow cell. The total cost of (3) is about one tenth of that of (2). Taken together, when reliable primers for amplifying the mitochondrial genome are available, or can be designed from a provisional Illumina-based assembly, amplicon sequencing using the Flongle flow cell would be the optimal approach.

From the gene annotation of the B. japonica mt-genome, we confirmed that the gene arrangement is the same as that of the B. buergeri mt-genome, indicating that the rearrangement of the ND5 gene may have occurred in the ancestral lineage of the genus Buergeria after its divergence from other rhacophorid lineages. However, the B. japonica mt-genome (22,274 bp) is longer than that of B. buergeri (19,959 bp), the latter having been accurately determined by sequencing a series of deleted subclones of PCR fragments from the B. buergeri CR [7]. This difference is due to the presence of a much longer CR in B. japonica (~7 kbp) than in B. buergeri (~4.6 kbp). In addition, the two types of repeat motifs observed in the B. japonica CR are not homologous to those found in B. buergeri; thus, the repeat motifs in B. japonica may have accumulated independently after its divergence from the other Buergeria species (~10–30 Ma) [8, 16].

Intraspecific variation in CR length occurs in the eastern spadefoot toad (Pelobates syriacus) [29]. Longer mt-genomes have also been identified in rain frogs that inhabit sandy areas and have adapted to dry and hot environments in East and Southern Africa: Breviceps adspersus (28,757 bp), Breviceps poweri (28,059 bp), and Breviceps mossambicus (22,543 bp) [3]. These previous findings, together with our present results, suggest that similar selective forces may act on mt-genomes, especially on the CR, under extreme conditions. Functions of the CR include regulation of mitochondrial gene expression and DNA replication through the formation of the D-loop. Recently, R-loops, in which a third strand of RNA is incorporated in the CR, have also been shown to be important for mitochondrial DNA metabolism [30]. Elucidating the relationship between the tandem repeat motifs in the CR and the mitochondrial gene expression pattern may provide important insights into the thermal adaptation of ectotherms.

Acknowledgments

We would like to thank Dr. Quintin Lau for English editing.

References

  1. Chen W, Qin H, Zhao Z, Liao J, Chen H, Jiang L, et al. The mitochondrial genome and phylogenetic analysis of Rhacophorus rhodopus. Sci Rep. 2022;12: 1–10. pmid:35953583
  2. Igawa T, Okamiya H, Ogino H, Nagano M. Complete mitochondrial genome of Hynobius dunni (Amphibia, Caudata, Hynobiidae) and its phylogenetic position. Mitochondrial DNA Part B. 2020;5: 2241–2242. pmid:33366990
  3. Hemmi K, Kakehashi R, Kambayashi C, du Preez L, Minter L, Furuno N, et al. Exceptional enlargement of the mitochondrial genome results from distinct causes in different rain frogs (anura: Brevicipitidae: Breviceps). Int J Genomics. 2020;2020. pmid:32064272
  4. Rhoads A, Au KF. PacBio Sequencing and Its Applications. Genomics Proteomics Bioinformatics. 2015;13: 278–289. pmid:26542840
  5. Jain M, Olsen HE, Paten B, Akeson M. The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol. 2016;17: 1–11. pmid:27887629
  6. Xia Y, Zheng Y, Miura I, Wong PBY, Murphy RW, Zeng X. The evolution of mitochondrial genomes in modern frogs (Neobatrachia): Nonadaptive evolution of mitochondrial genome reorganization. BMC Genomics. 2014;15: 1–15. pmid:25138662
  7. Sano N, Kurabayashi A, Fujii T, Yonekawa H, Sumida M. Complete nucleotide sequence and gene rearrangement of the mitochondrial genome of the bell-ring frog, Buergeria buergeri (family Rhacophoridae). Genes Genet Syst. 2004;79: 151–63. Available: http://www.ncbi.nlm.nih.gov/pubmed/15329496
  8. Nishizawa T, Kurabayashi A, Kunihara T, Sano N, Fujii T, Sumida M. Mitochondrial DNA diversification, molecular phylogeny, and biogeography of the primitive rhacophorid genus Buergeria in East Asia. Mol Phylogenet Evol. 2011;59: 139–147. pmid:21296673
  9. Wang YH, Hsiao YW, Lee KH, Tseng HY, Lin YP, Komaki S, et al. Acoustic differentiation and behavioral response reveals cryptic species within Buergeria treefrogs (Anura, Rhacophoridae) from Taiwan. PLoS One. 2017;12: 1–23. pmid:28877201
  10. Li XC, Peris D, Hittinger CT, Sia EA, Fay JC. Mitochondria-encoded genes contribute to evolution of heat and cold tolerance in yeast. Sci Adv. 2019;5: 1–10. pmid:30729162
  11. Christen F, Desrosiers V, Dupont-Cyr BA, Vandenberg GW, le François NR, Tardif JC, et al. Thermal tolerance and thermal sensitivity of heart mitochondria: Mitochondrial integrity and ROS production. Free Radic Biol Med. 2018;116: 11–18. pmid:29294390
  12. Chen T-C, Kam Y-C, Lin Y-S. Thermal physiology and reproductive phenology of Buergeria japonica (Rhacophoridae) breeding in a stream and a geothermal hotspring in Taiwan. Zoolog Sci. 2001;18: 591–596.
  13. Wu C-S, Kam Y-C. Thermal tolerance and thermoregulation by Taiwanese Rhacophorid tadpoles (Buergeria japonica) living in geothermal hot springs and streams. Herpetologica. 2005;61: 35–46.
  14. Komaki S, Igawa T, Lin S-M, Sumida M. Salinity and thermal tolerance of Japanese stream tree frog (Buergeria japonica) tadpoles from island populations. Herpetological Journal. 2016;26: 209–213.
  15. Komaki S, Lau Q, Igawa T. Living in a Japanese onsen: field observations and physiological measurements of hot spring amphibian tadpoles, Buergeria japonica. Amphibia-Reptilia. 2016;37: 311–314.
  16. Tominaga A, Matsui M, Eto K, Ota H. Phylogeny and Differentiation of Wide-Ranging Ryukyu Kajika Frog Buergeria japonica (Amphibia: Rhacophoridae): Geographic genetic pattern not simply explained by vicariance through strait formation. Zoolog Sci. 2015;32: 240–247. pmid:26003978
  17. Lis JT, Schleif R. Size fractionation of double-stranded DNA by precipitation with polyethylene glycol. Nucleic Acids Res. 1975;2: 383–390. pmid:236548
  18. Meng G, Li Y, Yang C, Liu S. MitoZ: a toolkit for animal mitochondrial genome assembly, annotation and visualization. Nucleic Acids Res. 2019;47: e63–e63. pmid:30864657
  19. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30: 2114–2120. pmid:24695404
  20. Franco-Sierra ND, Díaz-Nieto JF. Rapid mitochondrial genome sequencing based on Oxford Nanopore Sequencing and a proxy for vertebrate species identification. Ecol Evol. 2020;10: 3544–3560. pmid:32274008
  21. de Coster W, D’Hert S, Schultz DT, Cruts M, van Broeckhoven C. NanoPack: Visualizing and processing long-read sequencing data. Bioinformatics. 2018;34: 2666–2669. pmid:29547981
  22. Li H. New strategies to improve minimap2 alignment accuracy. Bioinformatics. 2021;37: 4572–4574. pmid:34623391
  23. Vaser R, Sović I, Nagarajan N, Šikić M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 2017;27: 737–746. pmid:28100585
  24. Vasimuddin Md, Misra S, Li H, Aluru S. Efficient architecture-aware acceleration of BWA-MEM for multicore systems. 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE; 2019. pp. 314–324.
  25. Wick RR, Judd LM, Cerdeira LT, Hawkey J, Méric G, Vezina B, et al. Trycycler: consensus long-read assemblies for bacterial genomes. Genome Biol. 2021;22: 1–17. pmid:34521459
  26. Shen W, Le S, Li Y, Hu F. SeqKit: A cross-platform and ultrafast toolkit for FASTA/Q file manipulation. Zou Q, editor. PLoS One. 2016;11: e0163962. pmid:27706213
  27. Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: Molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol. 2018;35: 1547–1549. pmid:29722887
  28. Schuler GD. Sequence mapping by electronic PCR. Genome Res. 1997;7: 541–550. pmid:9149949
  29. Munwes I, Geffen E, Friedmann A, Tikochinski Y, Gafny S. Variation in repeat length and heteroplasmy of the mitochondrial DNA control region along a core-edge gradient in the eastern spadefoot toad (Pelobates syriacus). Mol Ecol. 2011;20: 2878–87. pmid:21645158
  30. Holt IJ. Survey and summary: The mitochondrial R-loop. Nucleic Acids Res. 2019;47: 5480–5489.