Combined Analysis of the Chloroplast Genome and Transcriptome of the Antarctic Vascular Plant Deschampsia antarctica Desv

Background Antarctic hairgrass (Deschampsia antarctica Desv.) is the only natural grass species in the maritime Antarctic. It has been researched as an important ecological marker and as an extremophile plant for studies on stress tolerance. Despite its importance, little genomic information is available for D. antarctica. Here, we report the complete chloroplast genome, transcriptome profiles of the coding/noncoding genes, and the posttranscriptional processing by RNA editing in the chloroplast system. Results The complete chloroplast genome of D. antarctica is 135,362 bp in length with a typical quadripartite structure, including the large (LSC: 79,881 bp) and small (SSC: 12,519 bp) single-copy regions, separated by a pair of identical inverted repeats (IR: 21,481 bp). It contains 114 unique genes, including 81 unique protein-coding genes, 29 tRNA genes, and 4 rRNA genes. Sequence divergence analysis with other plastomes from the BEP clade of the grass family suggests a sister relationship between D. antarctica, Festuca arundinacea and Lolium perenne of the Poeae tribe, based on the whole plastome. In addition, we conducted high-resolution mapping of the chloroplast-derived transcripts. Thus, we created an expression profile for 81 protein-coding genes and identified ndhC, psbJ, rps19, psaJ, and psbA as the most highly expressed chloroplast genes. Small RNA-seq analysis identified 27 small noncoding RNAs of chloroplast origin that were preferentially located near the 5′- or 3′-ends of genes. We also found >30 RNA-editing sites in the D. antarctica chloroplast genome, with a dominance of C-to-U conversions. Conclusions We assembled and characterized the complete chloroplast genome sequence of D. antarctica and investigated the features of the plastid transcriptome. These data may contribute to a better understanding of the evolution of D. antarctica within the Poaceae family for use in molecular phylogenetic studies and may also help researchers understand the characteristics of the chloroplast transcriptome.


Introduction
Chloroplasts are plant-specific organelles that conduct photosynthesis, providing essential energy for the synthesis of starch, fatty acids, pigments, and amino acids [1,2]. Chloroplasts contain their own DNA and genetic information. In higher plants, chloroplast genomes exist as circular DNA, with the size ranging from 120 kb to 150 kb, and generally have a highly conserved quadripartite organization composed of two copies of inverted repeats (IRs), which separate the large single copy (LSC) and small single copy (SSC) regions [3,4]. In vascular plants, chloroplast genomes usually contain 110-130 unique genes encoding 4 rRNAs, 30-31 tRNAs, and 80-90 proteins; these encode ribosomal proteins and RNA polymerase subunits involved in protein synthesis, thylakoid proteins, and the Rubisco large subunit for photosynthesis, as well as protein subunits for an NADH dehydrogenase complex, which mediates redox reactions [2,5]. Advances in high-throughput sequencing technologies have resulted in the full sequences of organelle genomes from a growing number of organisms [6]. Currently, plastid genome resources with >420 records have been established. These provide a vast amount of high-resolution information that can be exploited in phylogenetic and ecological studies, making it possible to track the evolutionary history of a species after obtaining the full sequence of its chloroplast genome.
The grass family (Poaceae), which occurs in nearly every terrestrial habitat, is one of the most diverse angiosperm families, including approximately 10,000 species over 700 genera. To date, 38 chloroplast genomes of grass species [32 from the BEP (Bambusoideae, Ehrhartoideae, Pooideae) clade and 6 from the PACMAD (Panicoideae, Arundinoideae, Chloridoideae, Micrairoideae, Aristidoideae, and Danthonioideae) clade] have been deposited into the GenBank database, and recent studies have tried to reconstruct the phylogeny of the subfamilies and genera in the Poaceae family using whole sequences of chloroplast genomes [7,8].
Extremophile plants have evolved tolerance to unfavorable environmental conditions, such as freezing temperatures, drought, high salinity, and high UV radiation. The genetic information on such species provides clues to the evolutionary or geological history of the species, as well as resources for genetic engineering. Antarctic hairgrass (Deschampsia antarctica Desv.) is the only native grass species that thrives in the harsh environment of Antarctica [9]. As an extremophile, it may be useful as a source of genes associated with stress tolerance [10]. It has also been suggested as an ecological marker of global warming because of its successful adaptation to climate change and its rapid spread [10,11]. Despite the importance of this geographically isolated terrestrial plant, its phylogenetic position is still controversial [12][13][14], and available genetic resources are limited.
Here, we obtained the complete chloroplast genome sequence of D. antarctica by high-throughput sequencing and de novo assembly. By comparison with the chloroplast genomes from other representative members of the BEP clade, we explored the deep-phylogenetic relationship of D. antarctica to other grass species at the genomic level. In addition, using combinatorial analysis of the RNA-seq data, we conducted high-resolution mapping of the chloroplast-derived transcripts to a reference chloroplast genome to demonstrate transcriptome profiles of the coding and noncoding genes and the posttranscriptional processing by RNA editing in the chloroplasts of D. antarctica. These data may contribute to a better understanding of the evolution of D. antarctica within the Poaceae family and the characteristics of the chloroplast transcriptome.

Ethics Statement
This study, including sample collection and experimental research conducted on these materials, was carried out in accordance with the law on activities and environmental protection in Antarctica, as approved by the Minister of Foreign Affairs and Trade of the Republic of Korea.

Gene groups (partial table): 1, Photosystem I: psaA, B, C, I, J, ycf3 a, ycf4; 2, Photosystem II: psbA, B, C, D, E, F, H, I, J, K, L, M, N, T, Z; 3, Cytochrome

Genome Assembly, Annotation, and Sequence Analysis
After trimming of low-quality reads and adapters, the raw reads were aligned to 330 publicly available chloroplast genomes downloaded from NCBI organelle genome resources. De novo assembly was performed on the collected chloroplast-related reads using Celera Assembler 6.1 (Celera Genomics, Alameda, USA). The assembled contigs were ordered with reference chloroplast genomes of two ryegrass species, Lolium multiflorum (NC_019651) and Festuca altissima (JX871939), which were identified as the top-hit species when the input reads were blasted against the nr database. The gaps were filled by realignment of input reads using Geneious R6 v6.1.5 (Biomatters Ltd., Auckland, New Zealand) and PCR-based Sanger sequencing using primers designed for gap-flanking regions (Table S1). The sequences from the junctions and highly variable regions were validated by Sanger sequencing. The complete plastome was annotated using the online software DOGMA with default parameters [16]. Repeat sequences were analyzed using REPuter [17].

Phylogenetic Analysis
Complete plastome sequences of nine Poaceae species (accession numbers are listed in Table S2) were aligned using the LAGAN program within the mVISTA online suite of computational tools [18]. Default parameters were applied, and the annotation framework of the perennial ryegrass chloroplast genome was used. The percentage identity between each plastome, all relative to that of D. antarctica, was subsequently visualized using an mVISTA plot [19]. The plastome-based phylogeny was reconstructed for the nine Poaceae species using the whole plastome alignment generated by LAGAN. The phylogenetic tree was constructed by the maximum parsimony method, as implemented in MEGA 5.2 [20]. Sites with gaps or missing data were excluded from the analysis, and statistical support was assessed by bootstrapping with 1000 replicates.
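As an illustration of how percentage identity can be computed from two aligned sequences while excluding sites with gaps or missing data, a minimal Python sketch follows. It is not the mVISTA or MEGA implementation (mVISTA reports identity over sliding windows); the function name `percent_identity` and the set of characters treated as missing are assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str, missing: str = "-N?") -> float:
    """Column-wise identity (%) of two aligned sequences.

    Columns where either sequence has a gap or missing-data character
    are skipped (complete deletion), mirroring the exclusion rule in the text.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x in missing or y in missing:
            continue  # exclude gapped or missing sites
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable sites after excluding gaps")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: 5 comparable columns, 4 of them identical.
    print(percent_identity("ACGT-A", "ACGAAA"))
```

The toy alignment is hypothetical and serves only to show the exclusion of the gapped column.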

Transcriptome and Small Noncoding RNA Analysis
We analyzed in-house RNA-seq data libraries generated from two sets of RNAs (mRNA and small RNA), obtained as described above. For transcriptome analysis, we analyzed combined data sets of mRNAs and small RNAs. The reads of the combined data sets were mapped to the complete chloroplast genome, and the filtered reads were collected using the Bowtie 2.0 program, allowing mismatches of ≤2 bp [21]. The filtered reads were remapped according to the genome annotation using Cufflinks to calculate the fragments per kilobase of exon per million fragments mapped (FPKM) values of the transcripts and TopHat for alignment of transcript variants [22]. For small noncoding RNA analysis, we collected the reads in the size range of 20-24 nt from the small RNA data set. The size-filtered reads were mapped using Bowtie 2.0 with the criterion of zero mismatches. To search for RNA-editing sites in the chloroplast genome, putative target sites were predicted using two independent methods: 1) the PREP-chloroplast [23] search program using the chloroplast-genome sequence and 2) SAMtools/BCFtools, which calls single-nucleotide polymorphisms (SNPs) and indels by comparing transcripts against references [24]. After prediction, the candidate sites were manually examined in the transcriptome data using the Integrative Genomics Viewer (IGV) genome browser.
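To make the FPKM normalization mentioned above concrete, the following Python sketch applies the plain definition (fragments per kilobase of exon per million mapped fragments). It is only an illustration: Cufflinks estimates abundances with a statistical model rather than this direct arithmetic, and the function name and example numbers are hypothetical.

```python
def fpkm(fragments: int, exon_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of exon per million mapped fragments.

    FPKM = fragments * 1e9 / (exon_length_bp * total_mapped_fragments)
    """
    if exon_length_bp <= 0:
        raise ValueError("exon length must be positive")
    if total_mapped_fragments <= 0:
        raise ValueError("total mapped fragments must be positive")
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


if __name__ == "__main__":
    # Hypothetical gene: 500 fragments on a 1,000 bp transcript,
    # in a library with 1,000,000 mapped fragments.
    print(fpkm(500, 1_000, 1_000_000))
```

Normalizing by both transcript length and library size is what allows short genes such as psbJ to be compared with long genes such as rpoC2 on the same scale.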

Chloroplast Genome Assembly and Validation
Illumina paired-end sequencing produced 153,346,825 raw reads with a sequence length of 101 bp and a total base number of 15,488,029,325. After quality trimming and alignment of the raw reads against the publicly available chloroplast genomes reported in NCBI, we collected 1,985,544 chloroplast-related paired reads with 191,735,269 bases. The subsequent de novo assembly resulted in 18 large contigs >3 kb (max: 50,269 bp, min: 3,046 bp). To order the contigs, the chloroplast genomes of L. multiflorum and F. altissima were used as references because these species were identified as the top-hit species when the input reads were blasted against the nr database. The resulting gaps were filled by alignment of the input reads using the Geneious program and PCR-based Sanger sequencing. The sequences from the junction regions (LSC-IRA, LSC-IRB, SSC-IRA, SSC-IRB) and the regions with high interspecific variability were validated by Sanger sequencing. The final D. antarctica chloroplast genome sequence has been submitted to GenBank (Accession No. KF887484).

Genome Organization and Gene Content
The size of the D. antarctica chloroplast genome was 135,362 bp, within the range of other Poaceae species, with a typical quadripartite structure (Figure 1). The LSC and SSC regions were 79,881 bp and 12,519 bp in size, respectively, separated by a pair of inverted repeats (IRa and IRb), which were both 21,481 bp in length. The GC content of the D. antarctica chloroplast genome was 38.3%, consistent with other reported Poaceae chloroplast genomes. The GC contents of the LSC and SSC regions were 36.3% and 32.4%, respectively, whereas that of the IR region was 43.85%.
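The region sizes and GC contents reported above can be cross-checked with simple arithmetic. The Python sketch below sums the four regions of the quadripartite structure, computes the length-weighted GC content from the regional values, and includes a generic GC-content function; the function names are illustrative and not taken from any tool used in the study.

```python
def gc_content(seq: str) -> float:
    """GC content (%) over unambiguous bases (A, C, G, T) only."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: LSC + SSC + two inverted-repeat copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) and GC contents (%) as reported in the text.
LSC, SSC, IR = 79_881, 12_519, 21_481
GC_LSC, GC_SSC, GC_IR = 36.3, 32.4, 43.85

total = quadripartite_length(LSC, SSC, IR)
weighted_gc = (LSC * GC_LSC + SSC * GC_SSC + 2 * IR * GC_IR) / total

if __name__ == "__main__":
    print(total)                  # total genome length in bp
    print(round(weighted_gc, 1))  # length-weighted genome GC content in %
```

The regional values reproduce the reported genome size of 135,362 bp, and the weighted mean agrees with the reported overall GC content of 38.3% after rounding.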
The D. antarctica chloroplast genome contained 81 unique protein-coding genes, 12 of which were duplicated in the IR, including rps7, rps12, rps15, rps19, rpl2, rpl23, ycf1, ycf2, ycf15, ycf68, ndhB, and partial ndhH. Additionally, 29 unique tRNA genes, representing all 20 amino acids, were distributed throughout the genome (1 in the SSC region, 20 in the LSC region, and 8 in the IR region). Four rRNA genes were also identified, with complete duplication in the IR regions. Altogether, the D. antarctica chloroplast genome contained 114 unique genes (Table 1). Among them, 14 genes contained a single intron (9 protein-coding genes and 5 tRNA genes), while ycf3 contained two introns. Of the 15 genes with introns, 10 were located in the LSC (7 protein-coding genes and 3 tRNAs; 9 contained one intron and 1 contained two introns), 1 in the SSC (a protein-coding gene with a single intron), and 4 in the IR region (2 protein-coding genes and 2 tRNAs, all 4 containing a single intron) (Table 2). The rps12 gene is a trans-spliced gene with a 5′-end exon located in the LSC region and duplicated 3′-end exons located in the IR region. The trnK-UUU gene contained the largest intron (2,486 bp), which included the matK gene.
On the basis of the sequences of protein-coding genes and tRNA genes within the chloroplast genome, the frequency of codon usage was deduced (Table 3). Among these codons, 2,466 (11.22%) encode leucine, while 321 (1.46%) encode cysteine, which are the most and least used amino acids, respectively. The codon usage is biased toward a high representation of A and T at the third codon position, which is similar to a previous report [25].
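A codon-usage tally of the kind summarized above can be sketched in a few lines of Python. The example counts codons in an in-frame coding sequence and reports the fraction ending in A or T; the function names and the toy sequence are hypothetical, and the sketch does not reproduce the exact procedure behind Table 3.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count codons in an in-frame coding sequence (trailing partial codon ignored)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def third_position_at_fraction(counts: Counter) -> float:
    """Fraction of codons whose third position is A or T."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no codons counted")
    at_ending = sum(n for codon, n in counts.items() if codon[2] in "AT")
    return at_ending / total


if __name__ == "__main__":
    # Toy CDS: ATG CTT CTA TAA (Met, Leu, Leu, stop).
    counts = codon_counts("ATGCTTCTATAA")
    print(counts)
    print(third_position_at_fraction(counts))
```

Applied to all annotated coding sequences, the same tally would yield per-codon counts from which amino acid frequencies and third-position A/T bias follow directly.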

Comparison with Other Poaceae Chloroplast Genomes
The availability of multiple complete Poaceae chloroplast genomes provides an opportunity to compare sequence variation within the family at the genome level. The sequence identity of seven Poaceae chloroplast genomes was plotted using the mVISTA program, with the annotation of D. antarctica as a reference (Figure 2, percent identity plot, as summarized in Table S3). The whole aligned sequences indicate that the Poaceae chloroplast genomes are largely conserved, although some divergent regions were found between these genomes. As in other plant species, the coding regions are more conserved than the noncoding regions. Of all genes, ycf1 appears to be the most divergent pseudogene. In addition, rpl32, ycf2, and rpoC2 also displayed high sequence divergence. The noncoding regions showed a higher sequence divergence than the coding regions among the eight Poaceae chloroplast genomes. In the aligned sequences, several intergenic regions were found to display high divergence, including trnG(UCC)-trnfM(CAU), trnY(GUA)-trnD(GUC), ndhF-rpl32, and rpl32-trnL(UAG). In addition, the intron sequences from trnK(UUU), trnL(UAA), and ndhA showed high sequence divergence.
The length variation was also examined among D. antarctica and the eight Poaceae chloroplast genomes. The most interesting region with length variation was the rbcL-psaI region, which contains four gene regions and three intergenic regions (Figure 3). Variation in the gene regions was detected as the presence or absence of an rpl23 translocation product and an accD pseudogene in the region between rbcL and psaI. The rpl23 gene was absent from L. perenne, F. arundinacea, and Brachypodium distachyon, and was present in the five other analyzed Poaceae species, including D. antarctica. Remnants of the accD gene were detected in D. antarctica, L. perenne, F. arundinacea, and Hordeum vulgare. This pseudogene was identified in rice but was not predicted in the other species according to DOGMA. Variation in the size of the intergenic regions was also detected among species of the Pooideae subfamily. Three intergenic regions occurred between the rbcL and psaI genes. The intergenic region between rbcL and rpl23 ranged from 288 bp (D. antarctica) to 498 bp (Triticum aestivum). Between rpl23 and accD, it ranged from 0 bp (B. distachyon) to 661 bp (H. vulgare), and between accD and psaI, it ranged from 141 bp (B. distachyon) to 392 bp (Agrostis stolonifera). In cases when a particular gene was absent, the boundaries of the intergenic regions were determined based on homologies between the species.

Phylogenomic Analysis
Phylogenomic analysis of representatives from the Pooideae subfamily, including D. antarctica, produced a single, well-supported tree using maximum parsimony (Figure 4). The tree is highly congruent with respect to species, and the two outgroup species belonging to the BEP clade (Bambusa oldhamii from Bambusoideae and Oryza sativa subsp. japonica from Ehrhartoideae) are basal to the remaining species, which form a separate resolved clade.

Repeat Sequence Analysis
Repeat regions of DNA are an important factor in genome recombination and rearrangement. We identified 69 repeats in D. antarctica, including 43 forward, 24 palindromic, and 2 reverse repeats with a length >20 bp and a sequence identity e-value <10⁻³, using the REPuter program (Table S4). Among the 69 repeats, 58 (84%) were 25-80 bp in length, 51 (63%) were 25-

Expression Analysis
We performed an expression analysis of the 81 chloroplast protein-coding genes using in-house RNA-seq data from leaf tissues of D. antarctica (Lee et al., unpublished data). The short reads were mapped to the D. antarctica chloroplast genome, and the numbers of reads corresponding to coding genes were calculated and normalized according to gene length (Table 4). The most abundant genes were ndhC, psbJ, rps19, psaJ, and psbA, with FPKM values >10,000. Thirteen genes (ccsA, ndhI, rpoA, rpoC2, rps2, ndhA, ndhD, ycf1, rps11, rps3, ycf2, rpoC1, and rpoB) had low expression, with FPKM values <100.

RNA Editing
RNA editing is a sequence-specific posttranscriptional modification resulting in conversion, insertion, and deletion of nucleotides in a precursor RNA. Such modifications are observed across organisms. In plants, RNA editing has been reported to occur with C-to-U or U-to-C (rare) conversions in mitochondria and plastids [26]. In the Deschampsia chloroplast genome, we first predicted 37 RNA-editing sites in 16 genes using the PREP-chloroplast program (Table S5). Using another method, we aligned read sequences from the RNA-seq data using variant searching tools comparing transcripts against a reference genome and confirmed 30 editing sites. The 30 nucleotide substitutions occur in 23 genes in the D. antarctica chloroplast genome, which results in 25 nonsynonymous amino acid changes (Table 5). Of the substitutions, 17 (54.8%) were C-to-U conversions, resulting in 14 nonsynonymous amino acid changes. In contrast, only 1 edit was a U-to-C conversion, which produced a synonymous change. Although RNA editing in plant plastids has been described as C-to-U and U-to-C conversion, we observed other types of edits at 13 sites, including 3 A-to-C, 3 A-to-G, 3 G-to-A, 1 G-to-C, 1 U-to-A, 1 A-to-U, and 1 U-to-G.
We calculated the ratio between the number of reads with an alternate base and the number of reads with the same base as the reference. The conversion rate of each edit varied with the locus (16-100%) (Table 5). However, some C-to-U edits in several genes showed very high editing rates (>90%), especially for atpA, ycf3, ndhK, petB, rpoA, rps8, ndhD, ndhG, and ndhA, suggesting that the edited RNAs for these genes are the common forms in the processed RNA pools in D. antarctica.
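The conversion rate at a candidate site can be illustrated with a short Python sketch. Because the reported rates are bounded at 100%, the sketch assumes the rate is expressed as the share of edited reads among all reads covering the site, alt / (alt + ref); this reading, the function names, and the read counts below are assumptions made for illustration and are not taken from Table 5.

```python
def editing_percent(alt_reads: int, ref_reads: int) -> float:
    """Share (%) of reads carrying the alternate (edited) base at one site.

    Assumes rate = alt / (alt + ref); reads with other bases are ignored.
    """
    if alt_reads < 0 or ref_reads < 0:
        raise ValueError("read counts must be non-negative")
    covered = alt_reads + ref_reads
    if covered == 0:
        raise ValueError("site has no read coverage")
    return 100.0 * alt_reads / covered


def is_highly_edited(alt_reads: int, ref_reads: int, threshold: float = 90.0) -> bool:
    """True if the editing rate exceeds the threshold (default >90%)."""
    return editing_percent(alt_reads, ref_reads) > threshold


if __name__ == "__main__":
    # Hypothetical site: 475 reads show U where the genome has C, 25 reads show C.
    print(editing_percent(475, 25))
    print(is_highly_edited(475, 25))
```

A per-site rate of this kind makes it straightforward to separate near-complete edits from partially edited sites when inspecting candidates in a genome browser.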

Discovery of Plastid Small Noncoding RNA in D. antarctica
Numerous small noncoding RNAs have been identified in the genomes of bacteria and the nuclear genomes of eukaryotes. Small noncoding RNAs are also transcribed from mitochondrial and plastid genomes [27][28][29]. In this study, we screened for small noncoding RNAs in our deep sequencing data from the small RNA library generated from D. antarctica leaf tissues. The reads between 20 and 24 nt in length were mapped to the chloroplast genome with 100% identity. In total, 12,753,636 reads were distributed unevenly throughout the chloroplast genome (Figure 6), including coding regions of psbA and rbcL, intergenic regions, regions encoding several tRNA genes, and inverted repeat regions in which most of the rRNA genes exist. To exclude RNA fragments that may have been generated from abundant RNA species, we compared the distribution of reads that were 20-24 nt in length with those longer than 30 nt. As a result, we identified 27 loci where short noncoding RNAs (sRNAs) of 20-24 nt length with unique sequences were abundantly expressed (Table 6).
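The size selection of 20-24 nt reads that precedes mapping can be sketched as follows in Python. The example parses well-formed four-line FASTQ records and keeps reads within the length window; the function names and toy records are hypothetical, and the exact-match mapping step (performed with Bowtie in the study) is not reproduced here.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str]  # (read name, sequence, quality string)


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from well-formed 4-line FASTQ text (no line wrapping)."""
    rows = [ln.rstrip("\n") for ln in lines if ln.strip()]
    for i in range(0, len(rows) - 3, 4):
        yield rows[i][1:], rows[i + 1], rows[i + 3]


def size_filter(records: Iterable[Record], min_len: int = 20, max_len: int = 24) -> Iterator[Record]:
    """Keep reads whose length lies in [min_len, max_len] nt."""
    for name, seq, qual in records:
        if min_len <= len(seq) <= max_len:
            yield name, seq, qual


if __name__ == "__main__":
    # Toy reads of 19, 20, 24, and 25 nt; only the 20 and 24 nt reads are kept.
    fastq = []
    for n in (19, 20, 24, 25):
        fastq += [f"@read{n}", "A" * n, "+", "I" * n]
    for name, seq, _ in size_filter(parse_fastq(fastq)):
        print(name, len(seq))
```

Filtering by length before mapping keeps degradation fragments of other sizes out of the small RNA analysis.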
The D. antarctica plastid sRNAs were not evenly distributed throughout the genome. The relative positions of the sRNAs showed that 19 of 27 (71%) were located in the noncoding regions (18 in intergenic regions and 1 in an intronic region). In particular, 30% and 11%, respectively, of the intergenic sRNAs were located at the 5′- and 3′-ends of genes (>100 bp from the start or termination codons) (Figure 7). Fifteen (55.6%) sRNAs were located within −150 to +50 bp from the start codon of genes, suggesting that proximity to the 5′-ends of genes is important.
To determine if the identified sRNAs are evolutionarily conserved, we compared the sequences of 27 sRNAs in D. antarctica with the sRNAs reported for other plant species by multiple sequence alignment [28,29]. In total, we found that 13 sRNAs are orthologous to plastid sRNAs found in Arabidopsis, rice, or barley (Figure 8, Figure S1, and Table 6). Among the pairs identified, four sRNAs (psbH-petB, atpH 5′ end, ndhB 5′ end, and petD_rpoA) showed >90% sequence homology, and their locations within the genome were the same in all of the species examined, suggesting these plastid sRNAs may be evolutionarily conserved across angiosperms (Figure 8).

Discussion
We obtained the complete sequence of the chloroplast genome of D. antarctica using whole genome sequencing data from total genomic DNA from leaves. As previous studies have reported, aligning all the reads against the plastid genome database allows the rapid and efficient assembly of the chloroplast genome [8,30,31]. By this method, we identified 1.2% of the total genomic reads as chloroplast-related sequences.
The chloroplast genome of D. antarctica has the typical features found in the genomes of other Poaceae species. Its genome size and GC content are 135,362 bp and 38.3%, respectively, similar to other Poaceae species. The subfamily Pooideae, which includes one-third of all grass species, has been divided into 13 tribes [14], but recent analyses have demonstrated wide variations between them. For example, neither Poeae nor Aveneae are monophyletic, and the components of these two groups are intermixed within a clade [13,32]. Traditional morphological phylogenetic studies placed Deschampsia within the tribe Aveneae. However, molecular studies inferred alternative phylogenetic positions of Deschampsia (i.e., Aveneae or Poeae), depending on the target sequences used for examination or the parameters used for grouping [12,13,[32][33][34][35]. In this study, we revised the phylogenetic position of D. antarctica using complete sequences of chloroplast DNA. A comparative analysis based on both whole plastome and open reading frame sequences of coding genes suggests that D. antarctica is more closely related to species in the Poeae tribe than to those in the Aveneae tribe. This is in agreement with the results of Davis and Soreng [13], Catalan et al. [33], and Nadot et al. [34], in which Deschampsia forms a closer relationship with species of the Poeae than with those of Aveneae, the placement suggested by Souto et al. [12] and Hsiao et al. [35]. However, in our genome structure analysis, we found an interesting region (rbcL-psaI) where both the rpl23 translocation product and accD pseudogene were found. This appears to be specific to Deschampsia, since other Poeae or Aveneae species have kept only one remnant of accD or rpl23 in the region, suggesting that this region could be molecular evidence for an intermixed lineage of Deschampsia.
For the transcriptome analysis of the chloroplast genome, we utilized RNA-seq data from libraries generated by two preparation methods (mRNA-seq and small RNA-seq). We found that a significant proportion of the reads from the RNA-seq data represent organelle-derived sequences, suggesting that eukaryotic RNA-seq results are a valuable resource for functional studies of organellar genes.
The transcriptome analysis of D. antarctica plastid RNAs revealed several interesting aspects of RNA metabolism. First, a search of the variant transcripts revealed numerous RNA-editing sites in the D. antarctica chloroplast genome. RNA editing has been observed in the chloroplasts of extant descendants of early land plants other than liverworts and mosses. In angiosperm plastids, RNA editing is mostly restricted to a C-to-U conversion, and the conversion occurs at about 30 different positions, whereas hornworts and fern plastids extensively edit U-to-C as well as C-to-U at >300 different positions [36]. A comparative analysis of eight land plants, including hornworts, ferns, and seed plants, suggested that chloroplast RNA editing is of monophyletic origin and evolved as a system to generate new variations [37]. Our transcriptome analysis revealed in situ editing sites beyond those predicted by computational tools (Table 5 vs. Table S5). According to the variant transcript search, the major form of RNA editing is C-to-U conversion (54.8%), and the conversion rate of C-to-U edits (>90%) is much higher than those of other edits. Some C-to-U edits in several genes, such as atpA, ycf3, ndhK, petB, rpoA, rps8, ndhD, ndhG, and ndhA, have been reported in other species [37], indicating that these edits are functionally conserved in plants. Comparison between the whole genome DNA and transcriptome data also showed that various types of edits exist and that their respective conversion rates differ. The difference in conversion rates among edits might be the result of tissue-specific, gene-specific, or developmental stage-specific RNA-editing patterns. Considering that mitochondrial RNA editing occurs with developmental and tissue specificity in plants [38][39][40], exploring whether tissue disparity exists in plastid RNA editing and the regulatory mechanisms that underlie it would be worthwhile.
We identified 27 plastid small noncoding RNAs in the D. antarctica chloroplast genome by high-resolution mapping of the transcriptome data. In Arabidopsis, rice, maize, and barley, small RNAs are expressed in plastids and their sequences correlate with the termini of processed mRNA [28,29]. These studies also suggested that the small RNAs are footprints of the RNA-binding pentatricopeptide repeat (PPR) proteins, which protect RNAs from exonucleolytic degradation. Our results support this hypothesis. We observed a large number of small RNAs expressed in the D. antarctica plastid, and these RNAs were not randomly distributed but were located in intergenic regions preferentially near the 5′- or 3′-ends of coding regions. This suggests that many small RNAs are evolutionarily conserved in their sequences and locations, which might have resulted from the functionally conserved gene regulatory system of higher plants.

Conclusions
Using Illumina high-throughput sequencing technology, we obtained the complete sequence of the D. antarctica chloroplast genome. This is the first chloroplast genome sequenced from a plant species endemic to Antarctica. Sequence divergence analysis with other plastomes of the BEP clade in the grass family suggests a sister relationship between D. antarctica and two species of the Poeae tribe, F. arundinacea and L. perenne. In addition, we conducted high-resolution mapping of the chloroplast-derived transcripts from RNA-seq data. As a result, we generated an expression profile for 81 protein-coding genes and identified ndhC, psbJ, rps19, psaJ, and psbA as the most highly expressed chloroplast genes in D. antarctica. Analysis of small RNA-seq data revealed that 27 small noncoding RNAs are preferentially located close to the 5′- or 3′-ends of genes. Also, >30 RNA-editing sites were found in the D. antarctica chloroplast genome, with a predominance of C-to-U conversions. These data will be very useful for molecular phylogeny studies of the evolution of Antarctic plants and for transcriptome studies specific to plant organelles.

Figure 8. To determine if the identified sRNAs are evolutionarily conserved, Deschampsia antarctica sRNAs were compared with the plastid sRNAs identified in Arabidopsis, rice, or barley [28,29]. The sequence alignments of sRNAs that have >90% sequence homology are shown. The multiple sequence alignments were performed with the ClustalW2 algorithm (http://www.ebi.ac.uk/Tools/msa/clustalw2/) and visualized with the Jalview program [41]. The consensus sequences between orthologous sRNAs are shown at the bottom of each alignment. doi:10.1371/journal.pone.0092501.g008