Abstract
Carpesium (Asteraceae) is a genus that contains many plant species with important medicinal value. However, the lack of chloroplast genome research on this genus has greatly hindered the study of its molecular evolution and phylogenetic relationships. This study used the Illumina sequencing platform to sequence three medicinal plants of the genus Carpesium: Carpesium abrotanoides, Carpesium cernuum, and Carpesium faberi, obtaining three complete chloroplast genome sequences after assembly and annotation. The three chloroplast genomes had typical quadripartite structures with lengths of 151,389 bp (C. abrotanoides), 151,278 bp (C. cernuum), and 151,250 bp (C. faberi). A total of 114 different genes were annotated, including 80 protein-coding genes, 30 tRNA genes, and 4 rRNA genes. Abundant SSR loci were detected in all three chloroplast genomes, most of them composed of A/T. Analysis of IR expansion and contraction indicated that the IR/SC boundary regions are relatively conserved among the three species. Using C. abrotanoides as a reference, most of the non-coding regions of the chloroplast genomes differed markedly among the three species. Five mutation hot spots (trnC-GCA-petN, psbI, petA-psbJ, ndhF, ycf1) with high nucleotide diversity (Pi) can serve as potential DNA barcodes for Carpesium species. Additionally, phylogenetic analysis of the three species suggests that C. cernuum is more closely related to C. faberi than to C. abrotanoides. Carpesium is also a monophyletic group closely related to the genus Inula. Complete chloroplast genomes of Carpesium species can help in studying evolutionary and phylogenetic relationships and are expected to provide genetic markers for identifying Carpesium species.
Citation: Shi X, Xu W, Wan M, Sun Q, Chen Q, Zhao C, et al. (2022) Comparative analysis of chloroplast genomes of three medicinal Carpesium species: Genome structures and phylogenetic relationships. PLoS ONE 17(8): e0272563. https://doi.org/10.1371/journal.pone.0272563
Editor: Pankaj Bhardwaj, Central University of Punjab, INDIA
Received: March 31, 2022; Accepted: July 22, 2022; Published: August 5, 2022
Copyright: © 2022 Shi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data are deposited in the GenBank database (accession numbers OM302256, OM302257, and OM302258).
Funding: This work was supported by the Science and Technology Support Program of Guizhou Province ([2020]4Y111) to MXW, and the Excellent Young Scientific and Technological Talents Project of Guizhou Province ([2019]5658) to QWS. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Asteraceae is the most diverse family of dicotyledons, with about 1,479 genera and 21,105 species distributed worldwide except in Antarctica [1]. Carpesium is a genus of Asteraceae with considerable medicinal value. It comprises about 21 species, distributed mainly across Eurasia [2]; 17 species and 3 varieties occur in China, mainly in the Southwest [3]. Plants of the genus, such as C. abrotanoides, C. cernuum, and C. divaricatum, have been widely used in folk medicine to treat mumps, folliculitis, toothache, colds, and fever [4]. Pharmacological and chemical studies have confirmed that they contain sesquiterpenoids with antibacterial, anti-inflammatory, antimalarial, antitumor, and antioxidant effects [5]. Among these species, C. abrotanoides is the most widely used and has the longest history as an herbal medicine. In China, the fruit of C. abrotanoides is called “He Shi” and is credited with antiparasitic properties and with eliminating accumulation [5, 6]. Moreover, its aerial parts are often used to treat bruises and fever [7]. C. cernuum and C. faberi are also used in folk medicine to treat lymph node nuclei, mastitis, fever, sore throat, toothache, blood stranguria, and other diseases [2, 8].
Morphological similarity among Carpesium plants has led to confusion over the botanical source of folk medicines, which affects the safety and efficacy of medicines obtained from these species to a certain extent. However, current research on the genus focuses on active chemical components and pharmacological activities. Taxonomic identification of its species still relies on morphology, which leads to inaccurate interspecific identification within the genus. Meanwhile, because genetic marker information is scarce, the phylogenetic position and genetic diversity of Carpesium remain poorly understood. Therefore, more discriminative genetic markers are needed to analyze the interspecific relationships of this genus and its position in Asteraceae, and to provide a reliable basis for the genetic identification of medicinal materials.
The chloroplast (cp) is an organelle with a double-membrane structure that originated from symbiotic cyanobacteria [9, 10]. It releases oxygen and converts solar energy into carbohydrates through photosynthesis, sustaining the life of the plant [11], and it also plays a vital role in amino acid and lipid metabolism [12]. The whole cp genome of tobacco was sequenced and annotated in 1986 [13], and many cp genomes have since been reported, attracting growing research interest. In most angiosperms, the cp genome has a quadripartite structure consisting of two inverted repeat regions (IRs), a large single-copy region (LSC), and a small single-copy region (SSC), with the IRs separating the LSC and SSC regions [14]. Generally, the cp genome varies in size from 120 to 180 kb [15], of which 60 to 130 kb encode genes mainly involved in photosynthesis and other metabolic processes [10]. Compared to the nuclear genome, the cp genome has haploid inheritance, a conserved structure, a smaller size, and a slow mutation rate [16–18], making it an ideal model for molecular identification of species, genetic diversity studies, and phylogenetic inference [19–21]. Recently, complete cp genome data from Artemisia, Panax, Physalis, Paeonia, Salvia, and other genera have been used to identify highly divergent regions and to make phylogenetic inferences, providing a reference for identification and phylogenetic studies of these species [22–26].
Given this, we sequenced and annotated the whole cp genomes of C. abrotanoides, C. cernuum, and C. faberi to explore the relationships among Carpesium species. Simple sequence repeats (SSRs), interspersed repeats, and IR expansion and contraction were then investigated, and mutation hot spots were screened. Additionally, a phylogenetic tree was constructed using the whole cp genomes of 38 species of Asteraceae. This study explored the genetic differentiation and structural characteristics of the genus Carpesium and its evolutionary relationships within Asteraceae at the molecular level. It also provides a basis for elucidating the evolutionary process of the genus Carpesium, revealing its phylogenetic relationships, and identifying species of the genus.
Materials and methods
Collection of plant materials, DNA extraction, and sequencing
Fresh leaves of C. abrotanoides and C. faberi were collected from Longli County, Guizhou Province, China, whereas those of C. cernuum were collected from Pingba County, Guizhou Province, China. Samples were immediately frozen in liquid nitrogen and stored at −80°C. Total DNA was extracted from the leaves using an EZNA Plant DNA extraction kit (OMEGA, USA) according to the manufacturer’s instructions. The quality and quantity of the extracted DNA were measured using a NanoPhotometer spectrophotometer (IMPLEN, USA) and a Qubit 2.0 Fluorometer (Life Technologies, USA), respectively. The genomes were sequenced on the Illumina NovaSeq Sequencing System to generate paired-end 2×150 bp reads, yielding about 7.06 Gb (C. abrotanoides), 5.48 Gb (C. cernuum), and 5.63 Gb (C. faberi) of raw data.
Cp genome assembly and annotation
Trimmomatic [27] was used to filter the raw data. NOVOPlasty [28] was then used to assemble the cp genomes, and GapCloser [29] was used to close the internal gaps. Finally, the reference genome of C. abrotanoides was used to correct the positions and orientations of the four cp regions (LSC/IRa/SSC/IRb). The genomes were annotated with CPGAVAS2 [30] and manually corrected to obtain the complete cp genome sequences. Whole cp genome maps were drawn with Chloroplot [31]. The annotated genome sequences were submitted to GenBank (accession numbers OM302256, OM302257, and OM302258).
Analysis of codon usage and repeat sequence
MEGA7 [32] was used to analyze synonymous codon usage and relative synonymous codon usage (RSCU) in the three Carpesium species. SSRs were identified with MISA (Beier et al., 2017) [33] using the following minimum thresholds: ten repeat units for mononucleotide SSRs, five for dinucleotide SSRs, four for trinucleotide SSRs, and three for tetranucleotide, pentanucleotide, and hexanucleotide SSRs. Interspersed repeats, comprising forward (F), reverse (R), complement (C), and palindromic (P) repeats, were identified with REPuter [34] using a minimal repeat size of 30 bp and 90% sequence identity (Hamming distance of 3).
Comparative genome analysis and sequence variation
The boundaries of the four regions (LSC, SSC, and the two IRs) of the cp genomes were visualized using IRscope (Amiryousefi et al., 2018) [35]. The three whole cp genomes were compared using the online genome analysis program mVISTA [36] in Shuffle-LAGAN mode, with C. abrotanoides as the reference. The complete cp genome sequences of the three Carpesium species were aligned with MAFFT (Katoh et al., 2005) [37], and DnaSP v.6.10 [38] was then used for sliding window analysis with a window length of 600 bp and a step size of 200 bp.
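The sliding window analysis can be sketched in Python as follows, assuming the sequences are already aligned to equal length and written in uppercase. Nucleotide diversity (Pi) is taken here as the mean number of pairwise differences per site, and columns containing gaps or ambiguous bases are skipped; DnaSP's exact treatment of such sites may differ, and the function names are hypothetical.

```python
from itertools import combinations


def window_pi(aligned, start, end):
    """Mean pairwise differences per site over columns [start, end), skipping gapped columns."""
    cols = [i for i in range(start, end) if all(s[i] in "ACGT" for s in aligned)]
    if not cols:
        return 0.0
    pairs = list(combinations(aligned, 2))
    diffs = sum(sum(a[i] != b[i] for i in cols) for a, b in pairs)
    return diffs / (len(pairs) * len(cols))


def sliding_pi(aligned, window=600, step=200):
    """Return (window midpoint, Pi) pairs along the alignment."""
    length = len(aligned[0])
    results = []
    for start in range(0, max(length - window, 0) + 1, step):
        end = min(start + window, length)
        results.append(((start + end) // 2, window_pi(aligned, start, end)))
    return results


if __name__ == "__main__":
    # Toy alignment (an assumption for illustration, not Carpesium data).
    toy = ["ACGTACGT", "ACGTACGA", "ACCTACGA"]
    for midpoint, pi in sliding_pi(toy, window=4, step=2):
        print(midpoint, round(pi, 4))
```

Windows whose Pi exceeds a chosen cutoff can then be mapped back to the annotation to name candidate hot spots.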
Phylogenetic analysis
The three sequenced cp genomes of Carpesium were combined with the whole cp genomes of 35 other species retrieved from the NCBI database (with Taraxacum mongolicum, Taraxacum officinale, and Lactuca sativa as outgroups) to construct a phylogenetic tree (S1 Table). MAFFT was used to align the complete cp genomes of all species, followed by manual correction [37]. The best nucleotide substitution model was selected with ModelFinder as implemented in IQ-TREE. IQ-TREE was then used to construct the maximum likelihood (ML) tree with 1,000 bootstrap replicates [39–41].
Results
Characteristics and structure of cp genome
The full-length cp genomes were 151,389 bp, 151,278 bp, and 151,250 bp for C. abrotanoides, C. cernuum, and C. faberi, respectively. As in most angiosperms, the Carpesium cp genomes had a typical quadripartite structure, comprising one LSC region (82,915 bp–83,059 bp) and one SSC region (18,426 bp–18,447 bp) separated by a pair of inverted repeats (IRa and IRb; 49,004 bp) (Fig 1; Table 1). The overall GC content of the three species was very similar, ranging from 36.6% to 36.7%. The GC content of the IR regions (43%) was higher than that of the LSC (35.7%–35.9%) and SSC (31.2%–31.3%) regions (Table 1).
The inner genes are transcribed clockwise, and the outer genes are transcribed counterclockwise. Different colors indicate genes with different functions. The light black of the inner circle indicates GC content, and dark gray indicates AT content.
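Regional GC percentages of the kind reported in Table 1 can be computed with a short Python sketch. The region coordinates passed to `regional_gc` are hypothetical placeholders; real boundaries would come from the genome annotation.

```python
def gc_content(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def regional_gc(genome, regions):
    """Map region name -> GC percentage, given 0-based half-open (start, end) coordinates."""
    return {name: gc_content(genome[start:end]) for name, (start, end) in regions.items()}


if __name__ == "__main__":
    # Toy genome and made-up region boundaries, for illustration only.
    toy_genome = "AATTGGCCAT"
    print(regional_gc(toy_genome, {"LSC": (0, 4), "IR": (4, 8)}))
```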
The complete cp genomes of C. abrotanoides, C. cernuum, and C. faberi each encoded 132 genes. Among them, 114 genes were unique, including 80 protein-coding genes, 30 transfer RNA (tRNA) genes, and 4 rRNA genes. Additionally, one gene (ycf1) was annotated as a pseudogene (Table 2). Among the 114 genes, 18 contained introns (12 protein-coding genes and 6 tRNA genes): 15 genes (atpF, ndhA, ndhB, petB, petD, rpl2, rpl16, rps16, rpoC1, trnA-UGC, trnG-UCC, trnI-GAU, trnK-UUU, trnL-UAA, and trnV-UAC) contained one intron, and 3 genes (rps12, ycf3, and clpP) contained two introns. Four of these genes (trnA-UGC, trnI-GAU, ndhB, and rpl2) occurred in both IR regions, whereas one gene (ndhA) was in the SSC region (S2 Table).
Codon usage
Amino acid frequency analysis and RSCU showed high similarities among the species. The protein-coding sequences of the C. abrotanoides, C. cernuum, and C. faberi cp genomes consisted of 26,112, 26,205, and 26,203 codons, respectively (S3 Table). Among the encoded amino acids, Cysteine was the least abundant (1.11%) and Leucine the most abundant (10.62%–10.64%), with Isoleucine (8.42%–8.46%) also frequent (Fig 2). Tyagi et al. (2020) [42] likewise reported that Leucine was the most abundant and Cysteine the least abundant amino acid in other angiosperm cp genomes. In the cp genomes of the three species, the codons AUG (Methionine) and UGG (Tryptophan) were unbiased, with RSCU = 1.00, because each of these amino acids is encoded by a single codon. The codons of the other amino acids showed clear differences in usage. With the exception of UUG, the preferred codons (RSCU > 1) ended in A or U.
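For reference, RSCU is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The following Python sketch computes it under the standard genetic code; it is an illustration of the definition rather than the MEGA7 procedure, and the function name `rscu` and the toy sequence are assumptions.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order ('*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return codon -> RSCU for a list of in-frame coding sequences (DNA or RNA)."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # ignore codons with ambiguous bases
                counts[codon] += 1
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input
        for c in codons:
            # observed count / (family total / number of synonymous codons)
            values[c] = counts[c] * len(codons) / total
    return values


if __name__ == "__main__":
    toy = rscu(["ATGTTATTATTGTGGTAA"])  # toy CDS, not Carpesium data
    for codon in ("ATG", "TGG", "TTA", "TTG"):
        print(codon, toy[codon])
```

Single-codon amino acids (Methionine, Tryptophan) always give RSCU = 1.00 under this definition, which is why AUG and UGG appear unbiased.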
Identification of SSRs and repeat sequences
MISA detected 41 SSRs in C. abrotanoides, 39 in C. cernuum, and 37 in C. faberi, comprising mononucleotide, dinucleotide, trinucleotide, and tetranucleotide repeats (S4 Table and Fig 3a). In all three species, mononucleotide A or T homopolymers were the most abundant among the four SSR types, which indicates that the SSRs usually comprise poly-A and poly-T and rarely tandem guanine (G) or cytosine (C), thereby contributing to the AT richness of the cp genome (Fig 3b–3d). Furthermore, mononucleotide repeats (53.85%–60.98%) were the most frequent, whereas trinucleotide repeats (4.88%–5.41%) were the least frequent (Fig 3e), suggesting that mononucleotide repeats may contribute more to genetic variation than other SSRs.
(a) The number of different SSR types. (b-d) The frequencies of different SSR types of three Carpesium species. (e) The proportion of different SSR types.
The repeat sequences of each cp genome were analyzed using REPuter. A total of 39–40 interspersed repeats of at least 30 bp were detected in the three species of Carpesium, all of them forward or palindromic repeats (S5 Table). Within each species, the numbers of the different repeat types differed, whereas the number of each repeat type was similar across species. Palindromic repeats were the most common (55%), followed by forward repeats (45%), in the C. abrotanoides and C. cernuum cp genomes. In the C. faberi genome, palindromic repeats accounted for 54% and forward repeats for 46% (Fig 4a). Most of the repeats were 30–40 bp long (Fig 4b).
(a) Repeat sequence types and number of repeats. (b) Number of repeat sequences of different lengths.
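To make the four repeat types and the Hamming distance threshold concrete, the Python sketch below classifies a pair of equal-length sequence segments. It only illustrates the definitions and is not how REPuter searches a genome; the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("segments must have equal length")
    return sum(x != y for x, y in zip(a, b))


def classify_repeat(a, b, max_dist=3):
    """Return the repeat types (F, R, C, P) under which segment b matches segment a."""
    variants = {
        "F": b,                               # forward: same orientation
        "R": b[::-1],                         # reverse: reversed, same strand
        "C": b.translate(COMPLEMENT),         # complement: complemented, not reversed
        "P": b.translate(COMPLEMENT)[::-1],   # palindromic: reverse complement
    }
    return [kind for kind, v in variants.items() if hamming(a, v) <= max_dist]


if __name__ == "__main__":
    # Toy 35 bp segment (an assumption for illustration, not Carpesium data).
    unit = "ACGGTCA" * 5
    print(classify_repeat(unit, unit))
    print(classify_repeat(unit, unit.translate(COMPLEMENT)[::-1]))
```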
IR expansion and contraction
Comparison of the IR/SC boundary regions showed that IR expansion and contraction were similar in the cp genomes of the three Carpesium species (Fig 5). The rpl22, rps19, rpl2, trnH, and psbA genes were located at or near the LSC/IR borders, whereas the ycf1 and ndhF genes were located at the SSC/IR borders. The ycf1 gene spanned the SSC/IRa border, and the pseudogene fragment ψycf1 was located in the IRb region, close to the SSC/IRb border. ndhF was 37 bp, 6 bp, and 6 bp away from the SSC/IRb border in C. abrotanoides, C. cernuum, and C. faberi, respectively. These results suggest that C. cernuum and C. faberi are more similar to each other than either is to C. abrotanoides.
Comparative genome analysis and divergence hotspot regions
The complete cp genomes of the three Carpesium species were aligned and plotted using mVISTA, with C. abrotanoides as the reference, to assess the levels of sequence divergence (Fig 6). The results showed higher sequence variation in non-coding regions than in protein-coding regions. The IR regions were more conserved than the LSC and SSC regions, and the rRNA genes were highly conserved, with almost no variation. The coding regions with the largest variation among the three cp genomes were matK, accD, rpoA, ccsA, psbI, ndhF, and ycf1, whereas the other genes were more conserved. Variant loci were considerably more frequent in intergenic regions than in genic regions; the variable intergenic regions included trnH-psbA, rps16-trnQ, trnC-petN, petA-psbJ, and psaA-ycf3. To characterize the highly variable regions further, we calculated nucleotide diversity (Pi) values using DnaSP v.6.10 (Fig 7). Five divergent loci (trnC-GCA-petN, psbI, petA-psbJ, ndhF, and ycf1) had Pi values ≥ 0.01. Of these, trnC-GCA-petN, psbI, and petA-psbJ were located in the LSC region and the other two loci in the SSC region; none was detected in the IR regions. These results confirm that the LSC and SSC regions were more variable than the two IR regions.
The y-axis represents the range of identity (50%–100%). The x-axis indicates the coordinate in the cp genome. The gray arrows above the comparison represent the gene orientation and location, and different colors indicate the various regions of the genome.
Phylogenetic analysis
The phylogenetic tree was reconstructed for 38 species of Asteraceae using the best-fit model TVM+F+R3. Most nodes had high bootstrap values (Fig 8). The Asteraceae species were divided into ten subgroups (Anthemideae, Astereae, Gnaphalieae, Inuleae, Heliantheae, Millerieae, Tageteae, Coreopsideae, and Carlininae), with slight differences in bootstrap support among them. Inuleae and Plucheinae were closely related, and within Inuleae the genera Blumea, Inula, and Carpesium formed a cluster. The three species of Carpesium formed a monophyletic clade, consisting of C. abrotanoides and a cluster of C. cernuum and C. faberi, with a bootstrap value of 100%. Additionally, the genus Carpesium was closely related to the Inula clade.
The data next to each column represent bootstrap test scores. Taraxacum mongolicum, Taraxacum officinale, and Lactuca sativa were set as outgroups.
Discussion
Cp genome analysis
The complete cp genomes of three species of Carpesium were obtained using Illumina NovaSeq sequencing, and comparative analysis showed that their genes and structures are highly conserved. Like other sequenced angiosperm cp genomes, the Carpesium cp genomes had a typical quadripartite structure composed of one LSC, one SSC, and two IR regions, underlining the highly conserved nature of cp genomes [10]. The cp genomes of C. abrotanoides, C. cernuum, and C. faberi ranged from 151,250 bp to 151,389 bp, suggesting that cp genome length in Carpesium is highly conserved and falls within the size range of most angiosperm cp genomes [15, 43]. The distribution of GC content in the cp genomes of the three species was similar to that in other angiosperms [44, 45]. The IR regions had the highest GC content of the four regions, followed by the LSC and SSC regions. The high GC content of the IR regions may be attributed to the presence of rRNA genes (rrn4.5, rrn5, rrn23, and rrn16) with low A/T content [46].
Codon usage affects whether genetic information is expressed correctly. Studying it helps in understanding the molecular evolution and environmental adaptation of species, the evolutionary relationships between species, and genome structure, and it is especially important in studies of gene expression [47, 48]. The same codons were used in the three species of Carpesium, including 61 amino acid codons (with AUG as the start codon) and 3 stop codons (UAA, UAG, and UGA). However, the codons encoding the 20 amino acids differed in how often they were used. For most amino acids, the preferred codons had A or U at the third position. These findings are consistent with those in other angiosperms [49, 50]. Our results also showed highly similar codon usage among the three species, suggesting that they may have experienced similar environmental pressures [22].
Studies on cp genomes have shown that repetitive sequences are important for duplication, deletion, and rearrangement events [51]. Repetitive sequences are also important in studying phylogeny and genome recombination [52]. Repeat analysis of the three Carpesium cp genomes detected a total of 39–40 repeat sequences in each, mostly 30–39 bp in length. SSRs, or microsatellites, are a commonly used class of molecular markers [53]. Chloroplast SSRs (cpSSRs) are uniparentally inherited, simple, and relatively conserved in structure [54]. They also share the high polymorphism, multiple alleles, and co-dominance of nuclear genomic SSR markers [55]. Therefore, they are widely used in studying population structure, genetic variation, species identification, and phylogeny [56]. In this study, 37–41 SSRs were detected, and mononucleotide (A/T) repeats were the most abundant type, consistent with other angiosperms [57]. The cpSSRs also showed considerable diversity, which may help in studying interspecific variation and in developing molecular markers for population genetics analysis.
IR expansion and contraction are common in cp genomes and are the leading cause of cp genome size variation [58, 59]. This study found that ndhF was 37 bp from the SSC/IRb boundary in C. abrotanoides but only 6 bp from it in the other two species, which might be one reason why the cp genome of C. abrotanoides is longer than those of C. cernuum and C. faberi.
mVISTA is a common tool for comparative genomic analysis that is used to rapidly identify conserved regions of DNA sequences [60]. In this study, mVISTA was used to compare the cp genomes of the three species. We found that sequence divergence among the cp genomes was low, that the IR regions were more conserved than the SC regions, and that coding regions were more conserved than non-coding regions, consistent with the cp genomes of most angiosperms [59]. The subsequent calculation of Pi values further clarified the variation in these regions. High variation was recorded in several genes (matK, accD, rpoA, ccsA, psbI, ndhF, and ycf1) and intergenic regions (trnH-psbA, rps16-trnQ, trnC-petN, petA-psbJ, psaA-ycf3, etc.). It has also been shown that matK, ycf1, trnH-psbA, rps16-trnQ, atpH-atpI, and psaA-ycf3 can be used as DNA barcodes for other plant taxa [61–64]. These highly variable regions can provide abundant information for resolving the interspecific relationships of Carpesium within the phylogeny of Asteraceae.
Phylogenetic analysis
Carpesium belongs to the family Asteraceae; its species are morphologically similar and widely distributed. Researchers have applied DNA sequences of cp regions (ndhF, trnL-F, trnH-psbA, rps16-trnQ, rpl32-trnL, ndhF-rpl32) and nuclear ribosomal regions (ITS, ETS) in taxonomic studies, which suggested that Carpesium is polyphyletic [65–70]. These studies are the foundation for the classification and identification of Carpesium. However, the relatively short length of the cp or nuclear sequence fragments limits their phylogenetic resolution, resulting in phylogenetic trees with low support values. Therefore, further phylogenetic classification of the genus Carpesium is required.
The complete cp genome is a powerful tool for resolving phylogenetic relationships among species because of its rich phylogenetic information, and it has been successfully used in phylogenetic studies of angiosperms [71, 72]. In this study, complete cp genomes of 38 species were used for phylogenetic analysis. The ML analysis showed that the sampled Carpesium species formed a monophyletic lineage with 100% support, closely related to genera such as Inula, Blumea, and Pluchea. The classical taxonomic approach places C. abrotanoides and C. faberi in Sect. Abrotanoides and C. cernuum in Sect. Carpesium. However, this study found that C. cernuum was more closely related to C. faberi, which deviates from the traditional morphological classification. Further study is needed to ascertain whether the traditional taxonomy reflects the true relationships among the species of this genus. Because only a few cp genome sequences of Carpesium and related Asteraceae are available, questions about species relationships within Carpesium and the infrageneric classification of the genus cannot yet be resolved. More complete cp genomes from this genus are therefore needed to analyze the affinities between its species accurately.
Conclusion
In this study, the cp genomes of three species of Carpesium (C. abrotanoides, C. cernuum, and C. faberi) were sequenced and annotated using high-throughput sequencing technology. Bioinformatic comparison of the three cp genomes revealed that their structure and gene content were highly similar and conserved, indicating a close relationship among the species. Approximately 40 SSR loci were identified in each genome, with potential for use as molecular markers in studying diversity within the genus Carpesium. Five mutation hot spots were also identified that could be used to develop DNA markers for discriminating between Carpesium species. Maximum likelihood (ML) tree analysis showed that the three Carpesium species clustered into a single branch and were closely related to Inula. This study of the cp genomes of three Carpesium species enriches the cp genome data for Carpesium and provides genetic resources for further species identification and phylogenetic studies of the genus.
Supporting information
S1 Table. List of species and GenBank accession numbers used in the phylogenetic analysis.
https://doi.org/10.1371/journal.pone.0272563.s001
(XLSX)
S2 Table. The intron containing genes of the cp genomes of three Carpesium species and their exon and intron lengths.
https://doi.org/10.1371/journal.pone.0272563.s002
(XLSX)
S3 Table. Codon usage in chloroplast genomes of three Carpesium species.
https://doi.org/10.1371/journal.pone.0272563.s003
(XLSX)
S4 Table. SSR type and number identified in three Carpesium species.
https://doi.org/10.1371/journal.pone.0272563.s004
(XLSX)
S5 Table. Repeat sequences (≥ 30 bp) identified in three Carpesium species.
https://doi.org/10.1371/journal.pone.0272563.s005
(XLSX)
References
- 1. Zhao C, Xu WF, Huang Y, Sun QW, Wang B, Chen CL, et al. Chloroplast genome characteristics and phylogenetic analysis of the medicinal plant Blumea balsamifera (L.) DC. Genet Mol Biol. 2021;44(4):e20210095. pmid:34826835.
- 2. Yang YX, Shan L, Liu QX, Shen YH, Zhang JP, Ye J, et al. Carpedilactones A-D, four new isomeric sesquiterpene lactone dimers with potent cytotoxicity from Carpesium faberi. Org Lett. 2014;16(16):4216–9. pmid:25079622.
- 3. Editorial Committee of Flora of China (ECFC). Flora republicae popularis sinicae vol. 75. Beijing: Science Press; 1979.
- 4. Yang YX. Studies on sesquiterpene lactones from Carpesium faberi. China Journal of Chinese Materia Medica. 2016;41(11):2105–11. pmid:28901108.
- 5. Zhang JP, Wang GW, Tian XH, Yang YX, Liu QX, Chen LP, et al. The genus Carpesium: a review of its ethnopharmacology, phytochemistry and pharmacology. J Ethnopharmacol. 2015;163:173–91. pmid:25639815.
- 6. Liu XY, Guo GW, Wang H. Killing Effect of Carpesium abrotanoides on Taenia asiatica Cysticercus. Chinese Journal of Parasitology and Parasitic Diseases. 2015;33(3):237–8. pmid:26541048.
- 7. Wang F, Yang K, Ren FC, Liu JK. Sesquiterpene lactones from Carpesium abrotanoides. Fitoterapia. 2009;80(1):21–4. pmid:18948175.
- 8. Park YJ, Cheon SY, Lee DS, Cominguez DC, Zhang Z, Lee S, et al. Anti-Inflammatory and Antioxidant Effects of Carpesium cernuum L. Methanolic Extract in LPS-Stimulated RAW 264.7 Macrophages. Mediators Inflamm. 2020:3164239. pmid:32848508.
- 9. Dyall SD, Brown MT, Johnson PJ. Ancient invasions: from endosymbionts to organelles. Science. 2004;304(5668):253–7. pmid:15073369.
- 10. Wicke S, Schneeweiss GM, dePamphilis CW, Müller KF, Quandt D. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 2011;76(3–5):273–97. pmid:21424877.
- 11. Daniell H, Lin CS, Yu M, Chang WJ. Chloroplast genomes: diversity, evolution, and applications in genetic engineering. Genome Biol. 2016;17(1):134. pmid:27339192.
- 12. Neuhaus HE, Emes MJ. Nonphotosynthetic metabolism in plastids. Annu Rev Plant Physiol Plant Mol Biol. 2000;51:111–40. pmid:15012188.
- 13. Shinozaki K, Ohme M, Tanaka M, Wakasugi T, Hayashida N, Matsubayashi T, et al. The complete nucleotide sequence of the tobacco chloroplast genome: its gene organization and expression. EMBO J. 1986;5(9):2043–9. pmid:16453699.
- 14. Palmer JD. Comparative organization of chloroplast genomes. Annu Rev Genet. 1985;19:325–54. pmid:3936406.
- 15. Li CY, Zhao YL, Xu ZG, Yang GY, Peng J, Peng XY. Initial Characterization of the Chloroplast Genome of Vicia sepium, an Important Wild Resource Plant, and Related Inferences About Its Evolution. Front Genet. 2020;11:73. pmid:32153639.
- 16. Palmer JD, Jansen RK, Michaels HJ, Chase MW, Manhart. Chloroplast DNA variation and plant phylogeny. Ann Missouri Bot Garden. 1988;75:1180–206.
- 17. Shaw J, Lickey EB, Beck JT, Farmer SB, Liu W, Miller J, et al. The tortoise and the hare II: relative utility of 21 noncoding chloroplast DNA sequences for phylogenetic analysis. Am J Bot. 2005;92(1):142–66. pmid:21652394.
- 18. Hansen DR, Dastidar SG, Cai Z, Penaflor C, Kuehl JV, Boore JL, et al. Phylogenetic and evolutionary implications of complete chloroplast genome sequences of four early-diverging angiosperms: Buxus (Buxaceae), Chloranthus (Chloranthaceae), Dioscorea (Dioscoreaceae), and Illicium (Schisandraceae). Mol Phylogenet Evol. 2007;45(2):547–63. pmid:17644003.
- 19. Wambugu PW, Brozynska M, Furtado A, Waters DL, Henry RJ. Relationships of wild and domesticated rices (Oryza AA genome species) based upon whole chloroplast genome sequences. Sci Rep. 2015;5:13957. pmid:26355750.
- 20. Zhang SD, Jin JJ, Chen SY, Chase MW, Soltis DE, Li HT, et al. Diversification of Rosaceae since the Late Cretaceous based on plastid phylogenomics. New Phytol. 2017;214(3):1355–67. pmid:28186635.
- 21. Chen XL, Zhou JG, Cui YX, Wang Y, Duan BZ, Yao H. Identification of Ligularia herbs using the complete chloroplast genome as a super-barcode. Front Pharmacol. 2018;9:695. pmid:30034337.
- 22. Shahzadi I, Abdullah, Mehmood F, Ali Z, Ahmed I, Mirza B. Chloroplast genome sequences of Artemisia maritima and Artemisia absinthium: Comparative analyses, mutational hotspots in genus Artemisia and phylogeny in family Asteraceae. Genomics. 2020;112(2):1454–63. pmid:31450007.
- 23. Nguyen VB, Linh Giang VN, Waminal NE, Park HS, Kim NH, Jang W, et al. Comprehensive comparative analysis of chloroplast genomes from seven Panax species and development of an authentication system based on species-unique single nucleotide polymorphism markers. J Ginseng Res. 2020;44(1):135–44. pmid:32148396.
- 24. Feng SG, Zheng KX, Jiao KL, Cai YC, Chen CL, Mao YY, et al. Complete chloroplast genomes of four Physalis species (Solanaceae): lights into genome structure, comparative analysis, and phylogenetic relationships. BMC Plant Biol. 2020;20(1):242. pmid:32466748.
- 25. Wu LW, Nie LP, Xu ZC, Li P, Wang Y, He CN, et al. Comparative and Phylogenetic Analysis of the Complete Chloroplast Genomes of Three Paeonia Section Moutan Species (Paeoniaceae). Front Genet. 2020;11:980. pmid:33193580.
- 26. Gao CW, Wu CH, Zhang Q, Zhao X, Wu MX, Chen RR, et al. Characterization of Chloroplast Genomes From Two Salvia Medicinal Plants and Gene Transfer Among Their Mitochondrial and Chloroplast Genomes. Front Genet. 2020;11:574962. pmid:33193683.
- 27. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20. pmid:24695404.
- 28. Dierckxsens N, Mardulyn P, Smits G. NOVOPlasty: de novo assembly of organelle genomes from whole genome data. Nucleic Acids Res. 2017;45(4):e18. pmid:28204566.
- 29. Luo RB, Liu BH, Xie YL, Li ZY, Huang WH, Yuan JY, et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. Gigascience. 2012;1(1):18. pmid:23587118.
- 30. Shi LC, Chen HM, Jiang M, Wang LQ, Wu X, Huang LF, et al. CPGAVAS2, an integrated plastome sequence annotator and analyzer. Nucleic Acids Res. 2019;47(W1):W65–73. pmid:31066451.
- 31. Zheng SY, Poczai P, Hyvönen J, Tang J, Amiryousefi A. Chloroplot: An Online Program for the Versatile Plotting of Organelle Genomes. Front Genet. 2020;11:576124. pmid:33101394.
- 32. Kumar S, Stecher G, Tamura K. MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets. Mol Biol Evol. 2016;33(7):1870–4. pmid:27004904.
- 33. Beier S, Thiel T, Münch T, Scholz U, Mascher M. MISA-web: a web server for microsatellite prediction. Bioinformatics. 2017;33(16):2583–5. pmid:28398459.
- 34. Kurtz S, Schleiermacher C. REPuter: fast computation of maximal repeats in complete genomes. Bioinformatics. 1999;15(5):426–7. pmid:10366664.
- 35. Amiryousefi A, Hyvönen J, Poczai P. IRscope: an online program to visualize the junction sites of chloroplast genomes. Bioinformatics. 2018;34(17):3030–1. pmid:29659705.
- 36. Frazer KA, Pachter L, Poliakov A, Rubin EM, Dubchak I. VISTA: computational tools for comparative genomics. Nucleic Acids Res. 2004;32(Web Server issue):W273–9. pmid:15215394.
- 37. Katoh K, Kuma K, Toh H, Miyata T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 2005;33(2):511–8. pmid:15661851.
- 38. Rozas J, Ferrer-Mata A, Sánchez-DelBarrio JC, Guirao-Rico S, Librado P, Ramos-Onsins SE, et al. DnaSP 6: DNA Sequence Polymorphism Analysis of Large Data Sets. Mol Biol Evol. 2017;34(12):3299–302. pmid:29029172.
- 39. Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2015;32(1):268–74. pmid:25371430.
- 40. Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods. 2017;14(6):587–9. pmid:28481363.
- 41. Hoang DT, Chernomor O, von Haeseler A, Minh BQ, Vinh LS. UFBoot2: Improving the Ultrafast Bootstrap Approximation. Mol Biol Evol. 2018;35(2):518–22. pmid:29077904.
- 42. Tyagi S, Jung JA, Kim JS, Won SY. Comparative Analysis of the Complete Chloroplast Genome of Mainland Aster spathulifolius and Other Aster Species. Plants (Basel). 2020;9(5):568. pmid:32365609.
- 43. Fan Y, Jin YN, Ding MQ, Tang Y, Cheng JP, Zhang KX, et al. The Complete Chloroplast Genome Sequences of Eight Fagopyrum Species: Insights Into Genome Evolution and Phylogenetic Relationships. Front Plant Sci. 2021;12:799904. pmid:34975990.
- 44. Xiang BB, Li XX, Qian J, Wang LZ, Ma L, Tian XX, et al. The Complete Chloroplast Genome Sequence of the Medicinal Plant Swertia mussotii Using the PacBio RS II Platform. Molecules. 2016;21(8):1029. pmid:27517885.
- 45. Zhou JG, Chen XL, Cui YX, Sun W, Li YH, Wang Y, et al. Molecular Structure and Phylogenetic Analyses of Complete Chloroplast Genomes of Two Aristolochia Medicinal Species. Int J Mol Sci. 2017;18(9):1839. pmid:28837061.
- 46. Zhao ZY, Wang X, Yu Y, Yuan SB, Jiang D, Zhang YJ, et al. Complete chloroplast genome sequences of Dioscorea: Characterization, genomic resources, and phylogenetic analyses. PeerJ. 2018;6:e6032. pmid:30533315.
- 47. Quax TE, Claassens NJ, Söll D, van der Oost J. Codon Bias as a Means to Fine-Tune Gene Expression. Mol Cell. 2015;59(2):149–61. pmid:26186290.
- 48. Parvathy ST, Udayasuriyan V, Bhadana V. Codon usage bias. Mol Biol Rep. 2022;49(1):539–65. pmid:34822069.
- 49. Munyao JN, Dong X, Yang JX, Mbandi EM, Wanga VO, Oulo MA, et al. Complete Chloroplast Genomes of Chlorophytum comosum and Chlorophytum gallabatense: Genome Structures, Comparative and Phylogenetic Analysis. Plants (Basel). 2020;9(3):296. pmid:32121524.
- 50. Somaratne Y, Guan DL, Wang WQ, Zhao L, Xu SQ. The Complete Chloroplast Genomes of Two Lespedeza Species: Insights into Codon Usage Bias, RNA Editing Sites, and Phylogenetic Relationships in Desmodieae (Fabaceae: Papilionoideae). Plants (Basel). 2019;9(1):51. pmid:31906237.
- 51. Li B, Zheng YQ. Dynamic evolution and phylogenomic analysis of the chloroplast genome in Schisandraceae. Sci Rep. 2018;8(1):9285. pmid:29915292.
- 52. Asaf S, Khan AL, Khan AR, Waqas M, Kang SM, Khan MA, et al. Complete Chloroplast Genome of Nicotiana otophora and its Comparison with Related Species. Front Plant Sci. 2016;7:843. pmid:27379132.
- 53. Kaur S, Panesar PS, Bera MB, Kaur V. Simple sequence repeat markers in genetic divergence and marker-assisted selection of rice cultivars: a review. Crit Rev Food Sci Nutr. 2015;55(1):41–9. pmid:24915404.
- 54. Kaundun SS, Matsumoto S. Heterologous nuclear and chloroplast microsatellite amplification and variation in tea, Camellia sinensis. Genome. 2002;45(6):1041–8. pmid:12502248.
- 55. Provan J, Powell W, Hollingsworth PM. Chloroplast microsatellites: new tools for studies in plant ecology and evolution. Trends Ecol Evol. 2001;16(3):142–7. pmid:11179578.
- 56. Saski C, Lee SB, Daniell H, Wood TC, Tomkins J, Kim HG, et al. Complete chloroplast genome sequence of Glycine max and comparative analyses with other legume genomes. Plant Mol Biol. 2005;59(2):309–22. pmid:16247559.
- 57. Saina JK, Li ZZ, Gichira AW, Liao YY. The Complete Chloroplast Genome Sequence of Tree of Heaven (Ailanthus altissima (Mill.) (Sapindales: Simaroubaceae), an Important Pantropical Tree. Int J Mol Sci. 2018;19(4):929. pmid:29561773.
- 58. Kim KJ, Lee HL. Complete chloroplast genome sequences from Korean ginseng (Panax schinseng Nees) and comparative analysis of sequence evolution among 17 vascular plants. DNA Res. 2004;11(4):247–61. pmid:15500250.
- 59. Zhang YJ, Du LW, Liu A, Chen JJ, Wu L, Hu WM, et al. The Complete Chloroplast Genome Sequences of Five Epimedium Species: Lights into Phylogenetic and Taxonomic Analyses. Front Plant Sci. 2016;7:306. pmid:27014326.
- 60. Ratnere I, Dubchak I. Obtaining comparative genomic data with the VISTA family of computational tools. Curr Protoc Bioinformatics. 2009;Chapter 10:Unit 10.6. pmid:19496056.
- 61. Dong WP, Liu J, Yu J, Wang L, Zhou SL. Highly variable chloroplast markers for evaluating plant phylogeny at low taxonomic levels and for DNA barcoding. PLoS One. 2012;7(4):e35071. pmid:22511980.
- 62. Hollingsworth PM, Li DZ, van der Bank M, Twyford AD. Telling plant species apart with DNA: from barcodes to genomes. Philos Trans R Soc Lond B Biol Sci. 2016;371(1702):20150338. pmid:27481790.
- 63. Yang J, Vázquez L, Chen XD, Li HM, Zhang H, Liu ZL, et al. Development of Chloroplast and Nuclear DNA Markers for Chinese Oaks (Quercus Subgenus Quercus) and Assessment of Their Utility as DNA Barcodes. Front Plant Sci. 2017;8:816. pmid:28579999.
- 64. Cui N, Liao BS, Liang CL, Li SF, Zhang H, Xu J, et al. Complete chloroplast genome of Salvia plebeia: organization, specific barcode and phylogenetic analysis. Chin J Nat Med. 2020;18(8):563–72. pmid:32768163.
- 65. Anderberg AA, Eldenäs P, Bayer RJ, Englund M. Evolutionary relationships in the Asteraceae tribe Inuleae (incl. Plucheeae) evidenced by DNA sequences of ndhF; with notes on the systematic positions of some aberrant genera. Org Divers Evol. 2005;5:135–46.
- 66. Pornpongrungrueng P, Borchsenius F, Englund M, Anderberg AA, Gustafsson MHG. Phylogenetic relationships in Blumea (Asteraceae: Inuleae) as evidenced by molecular and morphological data. Plant Syst Evol. 2007;269:223–43.
- 67. Englund M, Pornpongrungrueng P, Gustafsson MHG, Anderberg AA. Phylogenetic relationships and generic delimitation in Inuleae subtribe Inulinae (Asteraceae) based on ITS and cpDNA sequence data. Cladistics. 2009;25:319–52. pmid:34879610.
- 68. Yoo KP, Park SJ. A Phylogenetic Study of Korean Carpesium L. Based on nrDNA ITS Sequences. Korean J Plant Res. 2012;25:96–104.
- 69. Nylinder S, Anderberg AA. Phylogeny of the Inuleae (Asteraceae) with special emphasis on the Inuleae-Plucheinae. Taxon. 2015;64:110–30.
- 70. Gutiérrez-Larruscain D, Santos-Vicente M, Anderberg AA, Rico E, Martínez-Ortega MM. Phylogeny of the Inula group (Asteraceae: Inuleae): Evidence from nuclear and plastid genomes and a recircumscription of Pentanema. Taxon. 2018;67:149–64.
- 71. Jansen RK, Cai ZQ, Raubeson LA, Daniell H, Depamphilis CW, Leebens-Mack J, et al. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc Natl Acad Sci U S A. 2007;104(49):19369–74. pmid:18048330.
- 72. Leebens-Mack J, Raubeson LA, Cui L, Kuehl JV, Fourcade MH, Chumley TW, et al. Identifying the basal angiosperm node in chloroplast genome phylogenies: sampling one’s way out of the Felsenstein zone. Mol Biol Evol. 2005;22(10):1948–63. pmid:15944438.