Characterization of the complete chloroplast genome of Brassica oleracea var. italica and phylogenetic relationships in Brassicaceae

  • Zhenchao Zhang,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Writing – original draft, Writing – review & editing

    Affiliation Department of Vegetables and Flowers, Zhenjiang Institute of Agricultural Sciences, Jurong, China

  • Meiqi Tao,

    Roles Investigation, Software, Validation, Visualization, Writing – original draft

    Affiliation Department of Vegetables and Flowers, Zhenjiang Institute of Agricultural Sciences, Jurong, China

  • Xi Shan,

    Roles Investigation, Methodology, Resources, Validation, Visualization

    Affiliation Department of Vegetables and Flowers, Zhenjiang Institute of Agricultural Sciences, Jurong, China

  • Yongfei Pan,

    Roles Data curation, Investigation, Methodology, Resources

    Affiliation Department of Vegetables and Flowers, Zhenjiang Institute of Agricultural Sciences, Jurong, China

  • Chunqing Sun,

    Roles Formal analysis, Methodology, Resources

    Affiliation Department of Vegetables and Flowers, Zhenjiang Institute of Agricultural Sciences, Jurong, China

  • Lixiao Song,

    Roles Data curation, Formal analysis, Funding acquisition, Methodology, Resources

    Affiliation Department of Vegetables, Jiangsu Academy of Agricultural Sciences, Nanjing, China

  • Xuli Pei,

    Roles Funding acquisition, Investigation, Methodology, Software

    Affiliation College of Agriculture and Life Science, Kunming University, Kunming, China

  • Zange Jing,

    Roles Funding acquisition, Investigation, Methodology, Software

    Affiliation College of Agriculture and Life Science, Kunming University, Kunming, China

  • Zhongliang Dai

    Roles Conceptualization, Project administration, Supervision, Writing – review & editing

    zzc1981zzc@163.com

    Affiliation Department of Vegetables and Flowers, Zhenjiang Institute of Agricultural Sciences, Jurong, China

Abstract

Broccoli (Brassica oleracea var. italica) is an important B. oleracea cultivar with high economic and agronomic value. However, comparative genome analyses are still needed to clarify variation among cultivars and phylogenetic relationships within the family Brassicaceae. Herein, the complete chloroplast (cp) genome of broccoli was sequenced on the Illumina platform to provide basic information for genetic studies and to establish phylogenetic relationships within Brassicaceae. The whole genome was 153,364 bp, including two inverted repeat (IR) regions of 26,197 bp each, separated by a small single-copy (SSC) region of 17,834 bp and a large single-copy (LSC) region of 83,136 bp. The overall GC content of the cp genome was 36.36%, while the GC contents of the SSC, LSC, and IR regions were 29.1%, 34.15%, and 42.35%, respectively. The genome harbored 133 genes, including 88 protein-coding genes, 37 tRNAs, and 8 rRNAs, with 17 duplicated in the IR regions. The most abundant amino acid was leucine and the least abundant was cysteine. Codon usage analyses revealed a bias for A/T-ending codons. A total of 35 repeat sequences and 92 simple sequence repeats were detected, and the SC-IR boundary regions varied among the seven cp genomes compared. A phylogenetic analysis suggested that broccoli is closely related to Brassica oleracea var. italica MH388764.1, Brassica oleracea var. italica MH388765.1, and Brassica oleracea NC_041167.1. Our results are expected to be useful for further species identification, population genetics analyses, and biological research on broccoli.

Introduction

Broccoli is a nutrient-rich vegetable belonging to Brassica oleracea. It contains a wide range of nutrients, including vitamins A and K, antioxidants, β-carotene, calcium, riboflavin, and iron [1], as well as phytochemicals, such as phenols, flavonoids, glucosinolates, minerals, and selenium. The consumption of broccoli is beneficial to human health [2], exerting anti-inflammatory, anti-obesity, cholesterol-lowering, and anti-carcinogenic effects as well as high antioxidant activity [3, 4]. Broccoli was introduced to China as a special vegetable and was initially cultivated on a small scale. Over the past few decades, broccoli has played an increasing role in the booming vegetable industry and has become a growing source of income for farmers.

Chloroplasts (cp) are crucial organelles in plant cells, serving as a metabolic center of cellular reactions [5]. They play critical roles in carbohydrate metabolism, photosynthesis, and various molecular processes as well as in the regulation of physiology, growth, development, and stress responses [6, 7]. The typical cp genome of angiosperms has a quadripartite structure, with a small single-copy (SSC) region and a large single-copy (LSC) region separated by two inverted repeat (IR) regions [8]. The gene content and organization of cp genomes are highly conserved; however, IR expansions and contractions, gene loss, inversions, and rearrangements have been reported [9]. Owing to their high conservation and slow rates of evolution, cp genomes are invaluable for phylogenetic classification, DNA barcoding, and genetic engineering [10, 11].

Broccoli crops from the Brassica oleracea group likely originated in the Mediterranean basin and are linked to closely related species, e.g., Brassica cretica and Brassica montana [12]. The selection and domestication processes led to the spread and exchange of genetic materials with other Brassica oleracea cultivars. Intense trade and human migration among several continents promoted the spread of the crop worldwide since the 15th century, resulting in the development of new cultivars and hybrids, mainly in European and Asian countries. Adaptation to different soil and climatic conditions resulted from the cultivation and selection of genotypes with beneficial agronomical and qualitative traits [13].

Advances in high-throughput Illumina genome sequencing technologies have provided an opportunity to obtain and analyze the complete chloroplast genome of broccoli for analyses of its molecular and genomic characteristics. A sufficient knowledge of its genetic diversity is essential for the development of efficient strategies for its exploitation. Several complete cp genomes of Brassica oleracea are available in GenBank (Accession numbers KX681657.1, MH388765.1, MH388764.1, KX681654.1, KR233156.1, MG717288.1, MG717287.1, KX681655.1, and KX681656.1).

In this study, we sequenced and assembled the complete cp genome of broccoli cultivar 2001B (B. oleracea var. italica) for analyses of phylogenetic relationships with B. oleracea cultivars and other members of Brassicaceae. In particular, we de novo sequenced and assembled the complete cp genome of broccoli using the Illumina HiSeq2500 platform, followed by gene annotation and structural analyses, the identification of simple sequence repeat (SSR) markers, and reconstruction of evolutionary relationships among species in Brassicaceae. These results will hopefully improve our understanding of the cp genome and provide a theoretical basis for future scientific research on broccoli.

Materials and methods

DNA extraction and sequencing

Broccoli was planted in the experimental field of the Zhenjiang Institute of Agricultural Sciences in Jurong, Jiangsu Province, China (N31°58’, E119°9’). Fresh leaves were collected, wrapped in tin foil, frozen in liquid nitrogen, and immediately stored at -80°C. Total genomic DNA was extracted from approximately 5 g of leaves with Plant DNA Isolation Reagent (Takara, USA) following the manufacturer’s protocol. An Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) and a NanoDrop 2000 Microspectrophotometer were used to evaluate the quality and integrity of the extracted DNA. After purification, the DNA was used to build a sequencing library according to the manufacturer’s instructions. Paired-end (PE) libraries with insert sizes of 150 bp were constructed and sequenced on the Illumina HiSeq2500 platform (San Diego, CA, USA) according to standard protocols, including sample quality testing, library construction and quality testing, and library sequencing.

Cp genome assembly and annotation

High-quality clean reads were generated by trimming sequencing adapters and filtering out low-quality reads using Trimmomatic v. 0.3649. The clean reads were mapped onto the available cp genome reference of B. oleracea var. capitata (NCBI accession: KX681654.1) using Bowtie2 (version 2.2.5) [14] with default parameters and preset options. All cp-like reads were assembled into contigs using SPAdes 3.10.1 [15]. The contigs were then aligned against the Brassica oleracea var. capitata reference using the BLAST algorithm. The generated contigs and mate-pair reads were used for scaffolding with SSPACE (version 3.0) [16] to form a circular genome.

The tRNAs, rRNAs, and protein-coding genes of the plastome were annotated using the online tool CpGAVAS (http://www.herbalgenomics.org/0506/cpgavas) [17] and then manually corrected. BLAST (v2.2.31) and DOGMA (http://dogma.ccbb.utexas.edu/) were used to check the annotation results [18]. The online tool tRNAscan-SE (http://lowelab.ucsc.edu/tRNAscan-SE/) was applied with default settings to analyze the tRNAs [19]. The physical circular cp genome map was generated using OrganellarGenomeDRAW (http://ogdraw.mpimp-golm.mpg.de/index.shtml) [20] with default settings and checked manually. Relative synonymous codon usage (RSCU) was evaluated using CodonW v1.4.4 [21]. Long repetitive sequences were analyzed using REPuter (http://bibiserv.techfak.uni-bielefeld.de/reputer/) [22], and SSRs were identified using MISA v1.0 (http://pgrc.ipk-gatersleben.de/misa/misa.html) [23] with the following thresholds: 1–8 (mononucleotide motifs repeated at least 8 times), 2–5 (dinucleotide motifs repeated at least 5 times), 3–3 (trinucleotide motifs repeated at least 3 times), 4–3 (tetranucleotide motifs repeated at least 3 times), and 5–3 (pentanucleotide motifs repeated at least 3 times).
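The SSR thresholds listed above can be illustrated with a short scan for perfect tandem repeats. This is a minimal sketch and not MISA itself: the names `find_ssrs` and `is_primitive` and the toy sequence are assumptions made for illustration, and compound or imperfect SSRs are not handled.

```python
# Minimum repeat counts per motif length, following the thresholds in the text
# (1-8, 2-5, 3-3, 4-3, 5-3).
MIN_REPEATS = {1: 8, 2: 5, 3: 3, 4: 3, 5: 3}


def is_primitive(motif):
    """Return False if the motif is itself a repeat of a shorter unit (e.g. 'ATAT')."""
    return not any(
        len(motif) % k == 0 and motif == motif[:k] * (len(motif) // k)
        for k in range(1, len(motif))
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) tuples for perfect SSRs; start is 0-based."""
    seq = seq.upper()
    hits = []
    i = 0
    while i < len(seq):
        found = False
        for k in sorted(min_repeats):
            motif = seq[i:i + k]
            if len(motif) < k or not is_primitive(motif):
                continue
            # Count consecutive copies of the motif starting at position i.
            count = 1
            while seq[i + count * k:i + (count + 1) * k] == motif:
                count += 1
            if count >= min_repeats[k]:
                hits.append((i, motif, count))
                i += count * k  # jump past the repeat so it is reported once
                found = True
                break
        if not found:
            i += 1
    return hits


# Toy sequence (hypothetical): an A run, an AT repeat, and a TTC repeat.
toy = "GG" + "A" * 10 + "CGC" + "AT" * 6 + "GCA" + "TTC" * 3 + "G"
ssrs = find_ssrs(toy)
print(ssrs)
```

Scanning from left to right and trying the shortest motif first means that a run such as AAAAAAAAAA is reported as a mononucleotide SSR rather than as a dinucleotide AA repeat.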

Sequencing data and gene annotations of B. oleracea var. italica were submitted to the NCBI GenBank database (Accession Number: MN649876.1).

Cp genome comparison

The newly generated genome (MN649876.1) was compared with the genomes of Brassica oleracea var. italica (Accession Number: MH388765.1), Brassica oleracea var. capitata (Accession Number: MG717287.1), Brassica oleracea var. botrytis (Accession Number: KX681655.1), and Brassica oleracea var. gongylodes (Accession Number: KX681656.1). The boundaries between the LSC, IR, and SSC regions were also compared with those of six other cp genomes (Arabidopsis thaliana, Capsella bursa-pastoris, Brassica napus, Brassica juncea, Brassica nigra, and Bunias orientalis). Comparisons were performed using mVISTA (http://genome.lbl.gov/vista/index.shtml) in Shuffle-LAGAN mode [24], with the annotation of B. oleracea var. capitata as a reference. The IRB-LSC, IRB-SSC, IRA-SSC, and IRA-LSC boundaries were compared among the seven species using the cp genome annotations available in GenBank.

Phylogenetic analysis

The phylogenetic trees were constructed by aligning total chloroplast protein-coding sequences from 31 species in Brassicaceae obtained from the GenBank database, using Carica papaya (NC_010323.1) as an outgroup. MAFFT version 7.017 [25] was used to generate sequence alignments. FastTree v. 2.1.10 [26] was employed to construct a phylogenetic tree by the maximum likelihood (ML) method with the GTRGAMMA model and 1000 bootstrap replicates to evaluate node support.

Results

Characteristics of the broccoli cp genome

The newly generated genome (MN649876.1) was a typical quadripartite circular molecule 153,364 bp in length, containing a pair of IR regions (IRA and IRB) of 26,197 bp each, separated by an SSC region of 17,834 bp and an LSC region of 83,136 bp (Fig 1 and Table 1). The cp genome had a biased base composition (31.36% A, 32.28% T, 17.86% G, and 18.5% C), with overall AT and GC contents of 63.64% and 36.36%, respectively. The GC contents of the IR, LSC, and SSC regions were 42.35%, 34.15%, and 29.1%, respectively.
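The region lengths and base composition reported above can be cross-checked with a few lines of arithmetic. The following Python sketch only reuses the figures quoted in the text; the variable names are illustrative assumptions.

```python
# Region lengths in bp, as reported in the text (the two IR copies are identical).
region_length = {"LSC": 83_136, "SSC": 17_834, "IRa": 26_197, "IRb": 26_197}
# Region-level GC percentages, as reported in the text.
region_gc = {"LSC": 34.15, "SSC": 29.1, "IRa": 42.35, "IRb": 42.35}

genome_length = sum(region_length.values())

# The length-weighted mean of the regional GC values approximates the overall GC content.
overall_gc = sum(region_length[r] * region_gc[r] for r in region_length) / genome_length

# Base composition in percent, as reported in the text.
base_percent = {"A": 31.36, "T": 32.28, "G": 17.86, "C": 18.5}
at_percent = base_percent["A"] + base_percent["T"]
gc_percent = base_percent["G"] + base_percent["C"]

print(genome_length)
print(round(overall_gc, 2))
print(round(at_percent, 2), round(gc_percent, 2))
```

The four regions sum to the stated genome length, and the length-weighted regional GC values agree with the overall GC content to two decimal places.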

Fig 1. Physical map of the B. oleracea var. italica cp genome.

https://doi.org/10.1371/journal.pone.0263310.g001

Table 1. Summary of cp genome of B. oleracea var. italica.

https://doi.org/10.1371/journal.pone.0263310.t001

The genome harbored 133 genes, including 88 protein-coding genes (PCGs) (79 PCG species), 37 tRNA genes (30 tRNA species), and 8 rRNA genes (4 rRNA species) (Fig 1, Tables 1 and 2). Among these, 15 genes encoded small ribosomal subunit (SSU) proteins, 11 encoded large ribosomal subunit (LSU) proteins, and 4 encoded the DNA-directed RNA polymerase. Forty-five genes were associated with photosynthesis, including 5 encoding photosystem I subunits, 15 encoding photosystem II subunits, 12 encoding subunits of NADH dehydrogenase, 6 encoding the cytochrome b/f complex, 6 encoding subunits of ATP synthase, and 1 encoding the large subunit of RuBisCO. Five genes were associated with functions other than self-replication and photosynthesis, and eight genes had unknown functions. Thirty-four genes, including 14 tRNA genes, 2 rps7, 2 ndhB, 2 rpl2, 2 rpl23, 2 rrn5, 2 rrn4.5, 2 rrn23, 2 rrn16, 2 ycf2, and 2 ycf15, were duplicated in the IR regions. Most of the genes occurred as a single copy, whereas 18 gene species occurred in two copies, including 4 rRNA species (rrn4.5, rrn5, rrn16, and rrn23), 7 tRNA species (trnA-UGC, trnI-GAU, trnN-GUU, trnV-GAC, trnL-CAA, trnI-CAU, and trnR-ACG), and 7 PCG species (rps7, rpl2, rpl23, ndhB, ycf1, ycf2, and ycf15); in addition, one PCG species (rps12) occurred in three copies. Except for ycf1 and rps12, which resided within the LSC region, the other 17 duplicated gene species were located completely within the IR regions. Nine PCG species (rps16, rpl2, rpl16, rpoC1, ndhA, ndhB, petB, petD, and atpF) and five tRNA species (trnA-UGC, trnI-GAU, trnK-UUU, trnL-UAA, and trnV-UAC) contained a single intron, while three other PCG species (rps12, ycf3, and clpP) harbored two introns (Tables 2 and 3). The trnK-UUU gene had the largest intron (2557 bp), followed by the ndhA gene (1098 bp), whereas trnL-UAA had the smallest intron (311 bp). The intron in the trnK-UUU gene was 2555 bp, and the matK gene was contained within the intron.

Table 2. Gene contents in the cp genome of B. oleracea var. italica.

https://doi.org/10.1371/journal.pone.0263310.t002

Table 3. Lengths of introns and exons in genes in the B. oleracea var. italica cp genome.

https://doi.org/10.1371/journal.pone.0263310.t003

For comparative analyses, the new genome was compared with other genomes from GenBank (Tables 4 and 5). Except for B. oleracea var. italica (MH388765.1), which had a genome size of 153,365 bp, all genomes were 153,364 bp in size (Table 5). The tRNA genes detected in MN649876.1 and MH388765.1 were identical. In addition, a pseudogene, rps19, found in the MH388765.1 cp genome was not detected in the other genomes. KX681655.1 and KX681656.1 lacked 11 genes detected in MN649876.1 (Table 4). Compared with the reference genome, the B. oleracea var. gongylodes (KX681656.1) genome contained two indels and nine SNPs, with one of the indels involving the rpoC2 gene; the B. oleracea var. italica (MH388765.1) genome contained five SNPs and one indel, which involved the ycf1 gene; the B. oleracea var. capitata (MG717287.1) genome contained 12 SNPs, 10 of which involved the ycf1 gene; and the B. oleracea var. botrytis (KX681655.1) genome contained 2 SNPs (Table 5).
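SNP and indel counts of the kind summarized above are typically read off a pairwise alignment. The sketch below shows one simple way to tally them; it is an illustration under stated assumptions, not the procedure used in this study. The function name `count_variants` and the toy alignment are hypothetical, gaps are assumed to be written as `-`, and a run of consecutive gap columns on one side is counted as a single indel event.

```python
def count_variants(ref_aln, qry_aln):
    """Count SNPs and indel events between two aligned sequences of equal length."""
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have the same length")
    snps = 0
    indels = 0
    prev_gap_side = None  # None, "ref", or "qry"
    for r, q in zip(ref_aln.upper(), qry_aln.upper()):
        if r == "-" and q == "-":
            continue  # gap-only column carries no information
        if r == "-":
            gap_side = "ref"
        elif q == "-":
            gap_side = "qry"
        else:
            gap_side = None
            if r != q:
                snps += 1
        # A new indel event starts whenever a gap opens or switches sides.
        if gap_side is not None and gap_side != prev_gap_side:
            indels += 1
        prev_gap_side = gap_side
    return snps, indels


# Toy alignment (hypothetical): one SNP, one 2-bp insertion, one 1-bp deletion.
snps, indels = count_variants("ACGT--ACGTAC", "ACCTGGAC-TAC")
print(snps, indels)
```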

Table 4. Differences in annotated genes between the newly generated genome (MN649876.1) and other Brassica oleracea genomes.

https://doi.org/10.1371/journal.pone.0263310.t004

Table 5. Differences in genome size and genome divergence (SNPs and Indels) between the newly generated genome (MN649876.1) and other Brassica oleracea genomes.

https://doi.org/10.1371/journal.pone.0263310.t005

Examination of codon usage frequency

Based on the coding sequences (CDSs), the relative synonymous codon usage (RSCU) and codon usage frequencies were estimated (Table 6, Fig 2). The protein-coding genes in the cp genome comprised 26,681 codons in total. Among these codons, the termination codons were UAA, UAG, and UGA. AUG, encoding methionine, had the highest RSCU value (2.9901). The most abundant amino acid in the protein-coding genes was leucine (2829 codons, approximately 10.6% of the total), whereas cysteine was the least abundant (325 codons, 1.22%).
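For readers unfamiliar with RSCU, the commonly used definition divides the observed count of a codon by the count expected if all synonymous codons for that amino acid were used equally. The sketch below implements that textbook definition; it is not CodonW, and the function name `rscu`, the use of the standard codon assignments, and the toy CDS are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product

# Standard codon assignments in TCAG order; '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return {codon: RSCU} for every codon whose synonymous family was observed."""
    counts = defaultdict(int)
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        # Ignore any trailing partial codon.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1

    # Group synonymous codons by the amino acid (or stop) they encode.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for codons in families.values():
        family_total = sum(counts[c] for c in codons)
        if family_total == 0:
            continue  # amino acid not observed; RSCU undefined
        for c in codons:
            # observed count / (family total / number of synonymous codons)
            values[c] = counts[c] * len(codons) / family_total
    return values


# Toy CDS (hypothetical): ATG TTA TTA TTG TGG TAA
toy_rscu = rscu(["ATGTTATTATTGTGGTAA"])
print(toy_rscu["TTA"], toy_rscu["TTG"], toy_rscu["TGG"])
```

Under this definition, a value above 1 indicates a codon used more often than expected under equal usage, and an amino acid encoded by a single codon always has an RSCU of 1.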

Fig 2. Codon contents of 20 amino acids and stop codons in all protein-coding genes of the broccoli cp genome.

https://doi.org/10.1371/journal.pone.0263310.g002

Table 6. Codon usage in the B. oleracea var. italica cp genome.

https://doi.org/10.1371/journal.pone.0263310.t006

The codon-anticodon recognition patterns of the cp genome showed that the 30 tRNA species recognized codons for all 20 amino acids required for protein biosynthesis. The AT contents at the first, second, and third codon positions were 55.3%, 62.53%, and 71.21%, respectively. Moreover, of all 66 codons, 31 had RSCU values of >1, and most of these (29/31, 93.5%) ended with A or U, whereas 34 codons had RSCU values of <1, and most of these (31/34, 91.2%) ended with C or G. Trp was encoded only by the UGG codon, indicating no biased usage (RSCU = 1).
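The position-specific AT contents quoted above can be computed by tallying bases at each of the three codon positions across all CDSs. The following is a minimal sketch; the function name `at_by_codon_position` and the toy CDS are hypothetical and are not taken from the study.

```python
def at_by_codon_position(cds_list):
    """Return AT percentages at codon positions 1, 2, and 3 over all CDSs."""
    at_counts = [0, 0, 0]
    totals = [0, 0, 0]
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
        for i in range(usable):
            base = cds[i]
            if base not in "ACGT":
                continue  # skip ambiguous bases
            totals[i % 3] += 1
            if base in "AT":
                at_counts[i % 3] += 1
    return [100 * a / t if t else 0.0 for a, t in zip(at_counts, totals)]


# Toy CDS (hypothetical): GCA GAT CTT TAA
toy_at = at_by_codon_position(["GCAGATCTTTAA"])
print(toy_at)
```

An AT percentage that rises from the first to the third position, as in the toy example and in the values reported above, is the pattern expected when synonymous codons ending in A or U are preferred.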

Analyses of repeat sequences and SSRs

A total of 35 repeat sequences, including 12 forward (F), 20 palindromic (P), and 3 reverse (R) repeats, were detected in the broccoli cp genome using REPuter (Table 7). Repeat lengths were generally between 30 and 47 bp. The LSC, SSC, and IR regions harbored 22, 7, and 12 repeats, respectively. Most repeats were located in intergenic spacers (IGS), ycf genes, and intron sequences, whereas 13 repeats were located in psaA, psaB, trnS-GCU, trnS-GGA, and trnS-UGA.
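The repeat classes named above differ only in how the second copy relates to the first. The sketch below illustrates the distinction for exact matches; it is not REPuter's algorithm, and the function names and toy sequences are hypothetical. The complement class is included for completeness even though no complement repeats are reported here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def classify_repeat(first, second):
    """Classify an exact repeat pair as F, R, C, or P; return None if unrelated."""
    first, second = first.upper(), second.upper()
    if second == first:
        return "F"  # forward: identical copy on the same strand
    if second == first[::-1]:
        return "R"  # reverse: copy read backwards on the same strand
    if second == first.translate(COMPLEMENT):
        return "C"  # complement: complemented but not reversed
    if second == reverse_complement(first):
        return "P"  # palindromic: reverse complement (copy on the other strand)
    return None


unit = "ACGGT"
print(classify_repeat(unit, "ACGGT"), classify_repeat(unit, "TGGCA"),
      classify_repeat(unit, "TGCCA"), classify_repeat(unit, "ACCGT"))
```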

Table 7. Repeat sequences in the broccoli chloroplast genome.

https://doi.org/10.1371/journal.pone.0263310.t007

A total of 92 SSRs, including 66 mononucleotides (P1), 18 dinucleotides (P2), 3 trinucleotides (P3), and 5 tetranucleotides, were identified. Most were distributed in the LSC (58, 63.0%) and SSC regions (22, 23.9%), with some in the IR regions (12, 13.0%) (Tables 8 and 9, Fig 3). One mononucleotide SSR consisted of C repeat units, and the others consisted of A or T repeat units (98.49%), while dinucleotides included TA and AT repeats. Trinucleotides were the least prevalent type, with only 3 occurrences. Moreover, 37 SSRs were found within genes, and the remainder were found in intergenic regions.

Fig 3. Statistical summary of repeat sequences in the cp genome of broccoli.

https://doi.org/10.1371/journal.pone.0263310.g003

Table 8. Number of SSRs distributed in the SSC, LSC, and IR regions.

https://doi.org/10.1371/journal.pone.0263310.t008

IR junction characteristics

The expansion and contraction of the IR-SSC and IR-LSC boundaries of seven species, including B. oleracea var. italica, A. thaliana, C. bursa-pastoris, B. napus, B. juncea, B. nigra, and Bunias orientalis, were compared (Fig 4). In the figure, JLB, JLA, JSB, and JSA represent the IRb/LSC, IRa/LSC, IRb/SSC, and IRa/SSC junctions, respectively. The sizes of the LSC, IR, and SSC regions were similar in the cp genomes of the seven species, and the IR length varied from 26,035 bp in B. napus to 26,459 bp in C. bursa-pastoris (accession number: AP009371). The JLB border was within the coding region of rps19 in all seven species, with only a 1-bp difference in location among the cp genomes. Two genes, ycf1 and ndhF, crossed the JSB junction. Most of the ycf1 gene in the seven species was located in the IRB region, and 1–4 bp was located in the SSC region. Overlap between ycf1 and ndhF was detected at the JSB boundary in all studied cp genomes, with lengths of 35 bp to 38 bp. The ycf1 gene crossed the JSA junction in all cp genomes, and its length reflected changes in the JSA region. The tRNA gene trnH-GUG was located within the LSC region in all seven species, 2–30 bp from the JLA boundary. These results suggested that the IR border shifts were relatively minor, involving only a small number of genes, with differences in gene overlap lengths and in the distance of trnH-GUG from the JLA boundary.
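The four junctions follow directly from the region lengths once a linear coordinate system is chosen. The sketch below assumes the genome is linearized in the order LSC, IRb, SSC, IRa, starting at the first base of the LSC; this ordering, the function name `junction_positions`, and the convention of reporting the last base before each junction are assumptions made for illustration, so the coordinates may differ from those in a given GenBank record.

```python
def junction_positions(lsc, irb, ssc, ira):
    """Return the 1-based position of the last base before each junction.

    Assumes the linear order LSC -> IRb -> SSC -> IRa.
    """
    jlb = lsc        # LSC/IRb junction
    jsb = jlb + irb  # IRb/SSC junction
    jsa = jsb + ssc  # SSC/IRa junction
    jla = jsa + ira  # IRa/LSC junction; equals the genome length (circular wrap)
    return {"JLB": jlb, "JSB": jsb, "JSA": jsa, "JLA": jla}


# Region lengths of the broccoli cp genome as reported in the text.
broccoli = junction_positions(lsc=83_136, irb=26_197, ssc=17_834, ira=26_197)
print(broccoli)
```

With coordinates of this kind in hand, the distance of a border gene such as trnH-GUG from JLA is simply the difference between the gene's start or end coordinate and the junction position.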

Fig 4. Comparison of boundaries between the LSC, IR, and SSC regions in chloroplast genomes of seven species.

Genes are depicted by colored boxes. Boxes above or below the main line indicate adjacent border genes.

https://doi.org/10.1371/journal.pone.0263310.g004

Phylogenetic analysis

cpDNA-based phylogenetic analyses have provided insight into evolutionary relationships, population genetics, and classification in different plant taxa [27]. To investigate the taxonomic status and evolutionary relationships of Brassica oleracea var. italica within Brassicaceae, ML phylogenetic analyses were performed based on 56 complete cp genome sequences (Fig 5). The phylogenetic analysis revealed that all B. oleracea cultivars were closely related, forming a well-supported clade. The newly generated genome (Accession Number: MN649876.1) was classified as B. oleracea and formed a clade with Brassica oleracea (NC_041167.1). The two Brassica oleracea var. italica cultivars MH388765.1 and MH388764.1 formed a clade, as did B. oleracea var. gongylodes and B. oleracea MG717288.1. The phylogenetic results clarify the position of B. oleracea var. italica within Brassicaceae and provide a basis for future evolutionary studies.

Fig 5. Phylogenetic tree inferred by the maximum likelihood method based on the complete cp genomes from 56 species.

Bootstrap support values are shown at the nodes.

https://doi.org/10.1371/journal.pone.0263310.g005

Discussion

B. oleracea var. italica is an important vegetable among B. oleracea cultivars. In general, the gene content and genome organization of land plant chloroplast genomes are more highly conserved than those of mitochondrial and nuclear genomes. However, gene losses and inversions have been reported in Asteraceae, Leguminosae, and Gentianaceae [28–30]. In the present study, we compared the complete cp genomes and gene annotations of various B. oleracea cultivars with data available in the GenBank database. The size of the cp genome obtained in this study was similar to those of other B. oleracea varieties. However, the number of annotated genes differed among genomes; this may be explained by incomplete data, gene losses, or interspecific differences.

Our results indicated that GC content was not evenly distributed among genomic regions. The GC content of the IR regions was higher than that of the other regions, possibly because the GC content (an indicator of species relationships) of the four rRNA genes in these regions was high [11, 31]. The newly sequenced broccoli genome contained 133 genes, with high conservation in composition and arrangement, including self-replication genes, photosynthetic genes, other functional genes, and genes with unknown functions, consistent with previous research [32]. Furthermore, 23 genes contained one or two introns, and trnK-UUU had the largest intron. Introns play crucial roles in the regulation of gene expression, depending on conditions and location [33]. Codon usage is a key factor in cp genome evolution. In the broccoli cp genome, the most and least frequent amino acids were leucine and cysteine, respectively, as observed in other angiosperm genomes, such as Ananas comosus, Decaisnea insignis, Nasturtium officinale, and Magnolia zenii [34–36]. In the broccoli cp genome, AT was preferred over GC, especially at the second and third codon positions (62.53% and 71.21%, respectively), consistent with results obtained for many terrestrial species [37].

A repeat analysis revealed 12 forward, 20 palindromic, and 3 reverse repeats in the broccoli cp genome. Most of these repeats were located in intron sequences, intergenic spacers, and ycf genes, but several occurred in CDS regions and tRNAs. Repeat sequences are involved in sequence variation and genome rearrangements, and they coincide with many rearrangement endpoints in algal and angiosperm genomes [38, 39]. The organization of cp genome sequences is highly conserved, and SSR primers designed for cp genomes are transferable across species and genera. Accordingly, SSRs are widely used as molecular markers for genetic linkage map construction, population genetic analyses, polymorphism identification, plant breeding, and taxonomic analyses [40]. A total of 92 SSRs were detected in this study; 66 (71.7%) were of the P1 type, of which 65 (70.7% of all SSRs) consisted of A or T repeat units, while the P2 type consisted of TA and AT repeats. These findings agree with previous results [41, 42]. The phylogenetic analysis yielded 53 nodes with bootstrap values, among which 21 nodes had bootstrap values of 100% and 36 nodes had values greater than 90%. In the present study, the phylogenetic trees demonstrated that Brassica nigra and S. arvensis clustered into one subgroup, consistent with other research [43]. Moreover, the newly generated genome (Accession Number: MN649876.1) was closely related to NC_041167.1.

Plant cp genomes are considered highly conserved; however, their sizes and LSC/IRb/SSC/IRa boundaries change due to contraction or expansion at the borders of the IR regions [44]. Our results indicated that divergence in the IR borders among the seven species was related to the different positions of four genes, rps19, ycf1, ndhF, and trnH-GUG, in agreement with previous results [45, 46]. It is worth noting that the ycf1 gene was found at the JSA boundary, extending 1022 to 1034 bp into the IRA region, in all cp genomes analyzed. In addition, the trnH gene was located in the LSC region in all tested cp genomes, but its distance to the JLA boundary varied from 2 to 30 bp. Taken together, these results indicate that the seven cp genomes were relatively conserved and that boundary divergence at JSA and JLA was the main reason for the expansion and contraction of the IR regions in these species.

References

  1. Jang MW, Ha BJ. Effects of Broccoli on Anti-inflammation and Anti-oxidation According to Extraction Solvent. Journal of Food Hygiene & Safety. 2012;27(4).
  2. Finley JW, Ip C, Lisk DJ, Davis CD, Hintze KJ, Whanger PD. Cancer-protective properties of high-selenium broccoli. Journal of Agricultural and Food Chemistry. 2001;49(5):2679–83. pmid:11368655
  3. Latté K, Appel KE, Lampen A. Health benefits and possible risks of broccoli – An overview. Food & Chemical Toxicology. 2011;49(12):3287–309. pmid:21906651
  4. Lee JJ, Shin HD, Lee YM, Kim AR, Lee MY. Effect of Broccoli Sprouts on Cholesterol-lowering and Anti-obesity Effects in Rats Fed High Fat Diet. Journal of The Korean Society of Food Science and Nutrition. 2009.
  5. Krzysztof B, Burch-Smith TM. Chloroplast signaling within, between and beyond cells. Frontiers in Plant Science. 2015;6(781):781. pmid:26500659
  6. Spetea C, Hundal T, Lundin B, Heddad M, Adamska I, Andersson B. Multiple evidence for nucleotide metabolism in the chloroplast thylakoid lumen. Proceedings of the National Academy of Sciences USA. 2004;101: 1409–14. pmid:14736920
  7. Chung HY, Won SY, Kim YK, Kim JS. Development of the chloroplast genome-based InDel markers in Niitaka (Pyrus pyrifolia) and its application. Plant Biotechnology Reports. 2019;13: 51–61.
  8. Wicke S, Schneeweiss GM, Müller KF, Quandt D. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Molecular Biology. 2011;76(3–5): 273–297. pmid:21424877
  9. Chumley TW, Palmer JD, Mower JP, Matthew FH, Calie PJ, Boore JL, et al. The Complete Chloroplast Genome Sequence of Pelargonium × hortorum: Organization and Evolution of the Largest and Most Highly Rearranged Chloroplast Genome of Land Plants. Molecular Biology & Evolution. 2006;(11):2175–2190. pmid:16916942
  10. Hollingsworth PM, Forrest LL, Spouge JL, Hagibabaei M, Ratnasingham S, van der Bank M, et al. A DNA barcode for land plants. Proceedings of the National Academy of Sciences. 2009;106(31): 12794–12797. pmid:19666622
  11. Guo S, Guo LL, Zhao W, Xu J, Li YY, Zhang XY, et al. Complete chloroplast genome sequence and phylogenetic analysis of Paeonia ostii. Molecules. 2018;23: 246. pmid:29373520
  12. Gómez-Campo C, Gustafsson M. Germplasm of wild n = 9 Mediterranean species of Brassica. Botanika Chronika. 1991;10:429–434.
  13. Nuez F, Gómez Campo C, Fernández de Córdova P, Soler S, Valcárcel JV. Colleccion de semillas de coliflor y broccoli. Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria, Madrid, pgg. 1999;120.
  14. Langmead B, Salzberg S. Fast gapped-read alignment with Bowtie 2. Nature Methods. 2012;9: 357–359. pmid:22388286
  15. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: A new genome assembly algorithm and its applications to single-cell sequencing. Journal of Computational Biology. 2012;19: 455–477. pmid:22506599
  16. Boetzer M, Henkel CV, Jansen HJ, Butler D, Pirovano W. Scaffolding preassembled contigs using SSPACE. Bioinformatics. 2011;27: 578–579. pmid:21149342
  17. Liu C, Shi LC, Zhu YJ, Chen HM, Zhang JH, Lin XH, et al. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences. BMC Genomics. 2012;13: 715. pmid:23256920
  18. Wyman SK, Jansen RK, Boore JL. Automatic annotation of organelle genomes with DOGMA. Bioinformatics. 2004;20: 3252–3255. pmid:15180927
  19. Schattner P, Brooks AN, Lowe TM. The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic Acids Research. 2005;33: W686. pmid:15980563
  20. Lohse M, Drechsel O, Kahlau S, Bock R. OrganellarGenomeDRAW: a suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets. Nucleic Acids Research. 2013;41: W575–W581. pmid:23609545
  21. Peden JF. Analysis of codon usage. Biosystems. 1999; 5: 45–50. pmid:9287961
  22. Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, Giegerich R. REPuter: The manifold applications of repeat analysis on a genomic scale. Nucleic Acids Research. 2001;29: 4633–4642. pmid:11713313
  23. Mudunuri SB, Nagarajaram HA. IMEx: imperfect microsatellite extractor. Bioinformatics. 2007;23: 1181–1187. pmid:17379689
  24. Frazer KA, Pachter L, Poliakov A, Rubin EM, Dubchak I. VISTA: computational tools for comparative genomics. Nucleic Acids Research. 2004;32 (suppl_2): W273–W279. pmid:15215394
  25. Katoh K, Kuma K, Toh H, Miyata T. MAFFT version 5: Improvement in accuracy of multiple sequence alignment. Nucleic Acids Research. 2005;33: 511–518. pmid:15661851
  26. Price MN, Dehal PS, Arkin AP. FastTree 2: approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5: e9490. pmid:20224823
  27. Gu CH, Ma L, Wu ZQ, Chen K, Wang YX. Comparative analyses of chloroplast genomes from 22 Lythraceae species: inferences for phylogenetic relationships and genome evolution within Myrtales. BMC Plant Biology. 2019;19: 281. pmid:31242865
  28. Doyle JJ, Doyle JL, Ballenger JA, Palmer JD. The distribution and phylogenetic significance of a 50-kb chloroplast DNA inversion in the flowering plant family Leguminosae. Molecular Phylogenetics and Evolution. 1996;5: 429–438. pmid:8728401
  29. Walker JF, Zanis MJ, Emery NC. Comparative analysis of complete chloroplast genome sequence and inversion variation in Lasthenia burkei (Madieae, Asteraceae). American Journal of Botany. 2014;101: 722–729. pmid:24699541
  30. Sun SS, Fu PC, Zhou XJ, Cheng YW, Zhang FQ, Chen SL, et al. The complete plastome sequences of seven species in Gentiana sect. Kudoa (Gentianaceae): insights into plastid gene loss and molecular evolution. Frontiers in Plant Science. 2018;9: 493. pmid:29765380
  31. Shen XF, Wu ML, Liao BS, Liu ZX, Bai R, Xiao SM, et al. Complete chloroplast genome sequence and phylogenetic analysis of the medicinal plant Artemisia annua. Molecules. 2017;22: 1330. pmid:28800082
  32. Saski C, Lee SB, Daniell H, Wood TC, Tomkins J, Kim HG, et al. Complete chloroplast genome sequence of Glycine max and comparative analyses with other legume genomes. Plant Molecular Biology. 2005;59(2): 309–322. pmid:16247559
  33. Niu DK, Yang YF. Why eukaryotic cells use introns to enhance gene expression: Splicing reduces transcription-associated mutagenesis by inhibiting topoisomerase I cutting activity. Biol Direct. 2011;6: 24. pmid:21592350
  34. Redwan RM, Saidin A, Kumar SV. Complete chloroplast genome sequence of MD-2 pineapple and its comparative analysis among nine other plants from the subclass Commelinidae. BMC Plant Biology. 2015;15: 196. pmid:26264372
  35. Li B, Lin F, Huang P, Guo W, Zheng Y. Complete chloroplast genome sequence of Decaisnea insignis: genome organization, genomic resources and comparative analysis. Scientific Reports. 2017;7(1):10073. pmid:28855603
  36. Li YF, Sylvester SP, Li M, Zhang C, Li X, Duan YF, et al. The complete plastid genome of Magnolia zenii and genetic comparison to Magnoliaceae species. Molecules. 2019;24:261. pmid:30641990
  37. Wang W, Yu H, Wang JH, Lei WJ, Gao JH, Qiu XP, et al. The complete chloroplast genome sequences of the medicinal plant Forsythia suspensa (Oleaceae). International Journal of Molecular Sciences. 2017;18 (11): 2288. pmid:29088105
  38. Haberle RC, Fourcade HM, Boore JL, Jansen RK. Extensive rearrangements in the chloroplast genome of Trachelium caeruleum are associated with repeats and tRNA genes. Journal of Molecular Evolution. 2008;66: 350–361. pmid:18330485
  39. He S, Wang Y, Volis S, Li D, Yi T. Genetic diversity and population structure: implications for conservation of wild soybean (Glycine soja Sieb. et Zucc) based on nuclear and chloroplast microsatellite variation. International Journal of Molecular Sciences. 2012;13: 12608–12628. pmid:23202917
  40. Xue J, Wang S, Zhou SL. Polymorphic chloroplast microsatellite loci in Nelumbo (Nelumbonaceae). American Journal of Botany. 2012;99 (6): 240–244. pmid:22615305
  41. Li X, Li YF, Zang MY, Li MZ, Fang YM. Complete chloroplast genome sequence and phylogenetic analysis of Quercus acutissima. International Journal of Molecular Sciences. 2018;19: 2443. pmid:30126202
  42. Li XQ, Zuo YJ, Zhu XX, Liao S, Ma JS. Complete chloroplast genomes and comparative analysis of sequences evolution among seven Aristolochia (Aristolochiaceae) medicinal species. International Journal of Molecular Sciences. 2019;20: 1045. pmid:30823362
  43. Du X, Zeng T, Feng Q, Hu L, Zhu B. The complete chloroplast genome sequence of yellow mustard (Sinapis alba L.) and its phylogenetic relationship to other Brassicaceae species. Gene. 2020;731: 144340. pmid:31923575
  44. Yang KW, Nath UK, Biswas MK, Kayum MA, Yi G, Lee J, et al. Whole-genome sequencing of Brassica oleracea var. capitata reveals new diversity of the mitogenome. PLoS One. 2018;13(3): e0194356. pmid:29547671
  45. Bolger A, Lohse M, Usadel B. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics. 2014; 30: 2114–2120. pmid:24695404
  46. Li ZX, Liu YM, Fang ZY, Yang LM, Zhuang M, Zhang YY, et al. Development status, existing problems and coping strategies of broccoli in China. China vegetables. 2019;4: 1–5.