Small RNA molecules are short, non-coding RNAs identified for their crucial role in post-transcriptional regulation. A well-studied example includes miRNAs (microRNAs) which have been identified in several model organisms including the freshwater flea and planktonic crustacean Daphnia. A model for epigenetic-based studies with an available genome database, the identification of miRNAs and their potential role in regulating Daphnia gene expression has only recently garnered interest. Computational-based work using Daphnia pulex, has indicated the existence of 45 miRNAs, 14 of which have been experimentally verified. To extend this study, we took a sequencing approach towards identifying miRNAs present in a small RNA library isolated from Daphnia magna. Using Perl codes designed for comparative genomic analysis, 815,699 reads were obtained from 4 million raw reads and run against a database file of known miRNA sequences. Using this approach, we have identified 205 putative mature miRNA sequences belonging to 188 distinct miRNA families. Data from this study provides critical information necessary to begin an investigation into a role for these transcripts in the epigenetic regulation of Daphnia magna.
Citation: Ünlü ES, Gordon DM, Telli M (2015) Small RNA Sequencing Based Identification of MiRNAs in Daphnia magna. PLoS ONE 10(9): e0137617. https://doi.org/10.1371/journal.pone.0137617
Editor: Muhammad Barozai, University of Balochistan, PAKISTAN
Received: April 29, 2015; Accepted: August 20, 2015; Published: September 14, 2015
Copyright: © 2015 Ünlü et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: This work was supported by the Council of Higher Education of Turkey (http://www.yok.gov.tr/en/web/uluslararasi-iliskiler/anasayfa) to MT.
Competing interests: The authors have declared that no competing interests exist.
The discovery of regulatory functions for microRNA molecules has revolutionized gene expression studies, especially in the field of epigenetics. As small non-coding RNAs, miRNAs are approximately 22 nucleotides in length and work as post-transcriptional interfering agents primarily to control mRNA translation and degradation . miRNAs are initially generated by RNA polymerase II activity as ~200-nucleotide long primary miRNA transcripts (pri-miRNA) that fold to generate an imperfect hairpin structure. Pri-miRNAs are subsequently cleaved to generate 70-nucleotide long precursor miRNAs (pre-miRNA) that are exported to the cytosol. In the cytosol, pre-miRNAs are processed by the RNAse III enzyme, Dicer, to generate ~22 nucleotide double-stranded miRNAs. One of the two strands of the miRNA will be assemble with the RNA-induced silencing complex, thereby defining the mRNA target. Recently, miRNAs have come under intensive focus due to their relative abundance, their diversity, and their association with a wide range of cellular mechanisms. For example, miRNAs have been identified in mammals, plants, worms and insects and have been shown to be involved in regulating development, cellular differentiation, proliferation, and death . However, it is their potential involvement in cancer onset and progression that has garnered the attention of many miRNA researchers (reviewed in [2, 3, 4]). To date, many of miRNA studies have focused on a small group of model organisms including mouse, C. elegans, and A. thaliana with little effort placed on other sytems. Although interest in the identification of miRNA in non-model organisms has increased in recent years (e.g. Meloidogyne incognita , Apostichopus japonicas , Paulownia fortune ) dissecting the roles of miRNAs with an evolutionary perspective requires additional studies focused on a variety of suitable model organism from different fields.
Recently, Daphnia has emerged as a promising model organism with regards to epigenetic studies because of the availability of large genomic and ecological database sets and its unique biological attributes. Daphnia species are cyclic parthenogenetic with sex determination between clonal (parthenogenetic) or sexual reproduction controlled by environmental cues . They exhibit morphological alterations such as helmet and neck teeth development within one generation in the presence of predator stress . Detection of predator cues may also lead to the behavioral alteration called diurnal vertical migration (DVM). Characteristic of DVM, Daphnia will migrate into the dark and cold hypolimnetic layer of a lake during the day, staying in these unfavorable conditions (food scarcity and low temperature) to avoid fish predation . Clonally reproduce Daphnia populations are genetically identical but still show plasticity in behavioral, morphological, and physiological traits. Furthermore, a short life cycle, ease of cultivating under laboratory conditions, and recently described genome sequence data make Daphnia an ideal candidate to investigate epigenetic regulation across environmental conditions . In addition, the recent development of targeted gene knockout techniques in Daphnia may provide an unparalleled opportunity to investigate the epigenetic role of miRNA within these cells .
Despite these advantages, there is an absence in the availability of miRNA sequence data for Daphnia. According to the miRBase, there are only 50-conserved miRNA molecules belonging to D. pulex that have been identified by computational genomics . Inevitably, additional miRNA sequence data will be required to understand and elucidate miRNA mediated molecular regulatory mechanism in this organism. Bioinformatics is a useful approach to identify miRNA sequences from diverse groups of organisms. The identification of both conserved and species-specific miRNAs has primarily relied on computational tools that are homolog-based and utilize sequence conservation or are based on structural and thermodynamic prediction of pre-miRNA hairpin formation. Recently, additional tools have been developed for use in miRNA identification in cancer , plant , human , and mouse  systems. However, in all of these cases, sequence information of the target organism is required . Unfortunately, genomic information for D. magna is limited, making most of these available tools not applicable. Thus in this study we aimed to use modified Perl codes to identify conserved miRNA sequences within a D. magna small RNA library using D. pulex genome information.
Materials and Methods
D. magna culturing and total RNA isolation
D. magna strains were purchased from MBL Aquaculture and maintained in 2 L glass beakers filled with oxygenated, deionized water kept at ambient temperature (20°C) in a light controlled room (16h light and 8 h dark) for two weeks before RNA isolation. D. magna were fed daily with freshwater algae (Pseudokirchneriella subcapitata; 3.5 x 106 cells/mL) and YCT (yeast, cereal leaves, tetramin) mixtures at 7mL and 10mL per L of water, respectively.
Total RNA <200 nucleotides in length was extracted from 5 whole-body D. magna adults randomly selected from culture. Total RNA isolation was performed using mirVana miRNA Isolation Kit according to the manufacturer’s instructions (Life Technologies). Isolated nucleic acid was quantitated spectrophotometrically, and the integrity of the RNA checked by electrophoresis through a 15% acrylamide (19:1 acrylamide:bis acrylamide), 8M Urea, 1X TBE 0.75mm vertical gel followed by ethidium bromide staining.
Small RNA sequencing
Total small RNA samples were submitted to HudsonAlpha Institute for Biotechnology (Huntsville, Alabama) for quantitation, small RNA library preparation, and sequencing using the Illumina HiSeq 2000 next generation sequencing platform. Data for four million 50 base pair, single-end reads were analyzed for miRNA signatures by computational sequence analysis.
Computational sequence analysis
Small RNA library sequence data was processed using designed Perl codes (available upon request) to remove redundant sequences. A simple Perl code was also generated to blast small RNA sequences against a database that is compatible with Bioperl tool kit . A total of 30,424 known mature miRNA sequences belonging to 203 different species was downloaded in FASTA format via FTP from the miRBase miRNA database (ftp://mirbase.org/pub/mirbase/CURRENT/mature.fa.gz) . Given the short sequences, the code selected sequences having more than 90% identity to reduce the risk of false positives.
For secondary structure prediction studies, a Perl code was written to identify miRNA precursor sequences. The code searched for putative mature miRNA sequences in D. pulex chromosomal sequences (downloaded from NCBI) and extracted the matching sequence in addition to 80 nucleotides upstream and downstream of the reference miRNA. miRNA precursor sequences obtained in this manner were analyzed using the RNA Folding Form application at http://mfold.rna.albany.edu  to predict secondary structure using the default software settings. Structural output files in Vienna format were uploaded to the Mfold server using Structure Display and Free Energy Determination application for structure display .
Results and Discussion
Small RNA sequencing
D. magna small RNA (less than 50 nucleotide) library were sequenced using the Illumina sequencing platform. Approximately 4 million raw reads obtained from sequencing were processed by Perl codes to remove multiple reads, contaminants, and adaptor sequences. The resulting data file contained 815,699 sequences which were run in encoded Perl blast code against a database file containing all known miRNA sequences obtained from the NCBI database and formatted using Blastdb tool. Considering the short length of miRNA sequences, we only parsed the data having 90% or higher identity and higher than 0.1 E value. The analysis revealed 205 putative mature miRNA sequences belonging to 188 miRNA families (detailed data analysis is given in S1 and S2 Tables). As expected, the read length of 22 nucleotides was the most abundant at 30.2% of the total miRNAs identified. This was followed by reads with lengths of 20 nucleotids (27.8%) and 21 nucleotides (22.9%), respectively (Fig 1A). Base distribution analysis identified a strong preference for a purine at the first position of the mature miRNA, with 107 of the transcripts having a uracil at the first position and 43 starting with a adenine (Fig 1B).
(A) Nucleotide lengths of each identified miRNA sequences were analyzed. Lengths from 20 nucleotides to 25 nucleotides were identified and the total number of predicted miRNA sequences for each corresponding sequence length is indicated. (B) Base distribution at each position of the 205 identified miRNA sequences are indicated.
Conservation of putative D. magna miRNAs
Conservation of proposed D. magna miRNA sequences among other species is an important criteria for the initial validation of miRNA transcripts. Based on a cutoff of 90% sequence identify, the data set had the highest number of conserved miRNA transcripts when compared to human miRNAs (105 hits), which was followed by M. musculus (90 hits), D. melanogaster (45 hits), D. pulex (41 hits), and C. elegans (14 hits) miRNAs, respectively (Table 1 and S3 Table). Given that these numbers are dependent on the number of miRNA identified for each species, the data presented in Table 1 does not represent a homological comparison among species. However, the data does suggest that D. magna is a suitable model organism for the study of miRNA functions.
Validation of D. magna miRNAs by secondary structure prediction
Since D. magna chromosome sequence data is not currently available, D. pulex chromosomal scaffold sequence data obtained from the NCBI database was used as a chromosomal reference for the identification of miRNA precursor information. A simple Perl code was developed to extract precursor miRNA data from D. pulex chromosomal sequence for putative miRNAs that showed 100 percent homology with D. pulex. Given that 4 of the 41 D. magna miRNAs showed less than 100% homology with D. pulex, only 37 of the miRNAs were analyzed for secondary structure predictions. 5185 scaffold sequences, with sequence lengths ranging from 1,000 to 3,777,634 nucleotides, were searched for positive matches to the 37 D. magna putative mature miRNAs. Setting the matching region as the core, sequence from the region ~80 nucleotides upstream of the start of the mature miRNA and ~80 nucleotides downstream of the miRNA was selected (S4 Table). Extracted miRNA precursor sequences identified in this manner were analyzed using Mfold software to predict sequence secondary structures. In total, upstream and downstream nucleotide sequence information was obtained for 18 of the 37 D. pulex homologs and is available in S3 Table. All 18 fit parameters characteristic of microRNAs in that they: a) were predicted to form irregular stem-loop structures with the mature miRNA located on one of the two arms of the hairpin (Fig 2) and b) had a limited number of mismatches (including insertions and deletions) between the mature miRNA and its complementary passenger strand, with 4 and 5 being the most common number of mismatches (found in 11 of the 18 predicted miRNAs). The inability to determine secondary structure for the remaining 19 D. magna miRNAs was likely due to the incomplete and fragmented nature of the D. pulex scaffold sequences used for analysis.
The RNA Folding Form application in Mfold server was used to determine the secondary structure of precursor microRNA sequences obtained by parsing the nucleotide sequences selecting an 80 nucleotides upstream and downstream window. Default software settings were used for structural analysis and data downloaded in Vienna file format. To illustrate secondary structures, Vienna formatted data files were used in Structure Display and Free Energy Determination applications in Mfold server. For the ease of identification, the corresponding predicted miRNA sequences are underlined.
Since Daphnia is a promising model for development-based studies, we determined the number of predicted D. magna miRNAs that are likely to be involved in regulating various developmentally linked signaling pathways by comparing them to miRNAs known to be implicated in human (total of 145 miRNAs) and Mus musculus (total of 52 miRNAs) development (Table 2). We specifically focused on calcium, Hedgehog, WNT, Notch and chemokine signaling pathways in Homo sapiens and Mus musculus. Using data downloaded from TarBase , these miRNAs were compared against the 105 predicted miRNAs conserved with Homo sapiens and 90 predicted miRNAs conserved with Mus musculus (S5 Table). In total, 52 of the 205 miRNAs identified in D. magna are predicted to have a role in regulating cellular signaling cascades implicated in controlling developmental pathways.
Comparing putative D. magna miRNAs with homologs reported for Homo sapiens, Mus musculus, D. melanogaster, and C. elegans found that 55% of these miRNAs (105 of 188) were present in one or more of the four model organisms (S6 Table). Homologs for 13 of these 105 have only been reported for humans, 36 of the D. magna miRNAs have family members in humans and mouse but not D. melanogaster or C. elegans, and 2 miRNA family members were found in all 5 organisms (miR-124 and let-7).
We have identified 205 putative miRNA sequences expressed in D. manga. These are the first and the largest number of miRNA sequences directly isolated, and identified, by a comparative genomics approach in D. manga. These sequences included 41of the 45 miRNA sequences previously predicted for D. pulex and available from the MirBase database, confirming the strength of this approach. The identification of these sequences has greatly improved the miRNA database for Daphnia and research efforts focusing on epigenetic regulation in this model organism.
Identification of putative D. magna miRNAs were based on sequence conservation with known miRNAs, therefore it is likely that species-specific miRNAs were overlooked in our analysis and the 205 miRNAs identified in this study is an underestimate of the miRNAs present in D. magna. Given that many of the bioinformatics tools currently used to identify miRNAs rely of the availability of genome sequence information, the identification of additional miRNAs for this model organism will likely have to await the availability of a more complete genome data set.
When compared to human and model organisms, microRNA families identified in this study validate potential for using D. magna as a model for miRNAs based studies. As presented in S6 Table, there are several miRNA families conserved between human and D. magna for which no members have been identified in other model organisms such as mouse, C. elegans or D. melanogaster. Many of the miRNAs were identified by RNA sequencing of normal versus cancerous cells and differentiated versus undifferentiated stem cells [23, 24]. Results from additional studies have suggested a role for these miRNAs in regulating cell proliferation and differentiation [25, 26]. Given the potential importance of these miRNAs in normal cellular activities, the identification of miRNA homologs in D. magna may provide a unique opportunity to study the function of these miRNA family members in a more tractable system.
Using a bioinformatics approach and the D. pulex genome as the template, we were able to confirm that 18 of the 205 putative miRNAs were likely to have pre-miRNA transcripts with secondary structure predictions consistent with the formation of a hairpin, a defining characteristic of miRNAs. Support for the validity of the remaining miRNAs will likely require completion of the D. magna genome. In addition to microRNAs, small non-coding RNAs also include those of the piRNA family. These transcripts range in size from 26–30 nucleotides in length in D. melanogaster, but can be as short as 21 nucleotides as in the case of C. elegans . Unlike miRNAs, these transcripts do not form a double-stranded intermediate, utilize a distinct cellular pathway for maturation, and share little sequence conservation outside of a preference for a uracil at the 5’ position. Although many of the putative miRNAs identified in this study fit several of the descriptors for a piRNA (i.e. 21 nucleotides in length, uracil at position 1 of the mature RNA transcript), the identification of homologs with 90% or higher identity in other species argues against them falling into this category.
Based on sequence conservation with human and mouse, a number of the identified miRNAs are likely to be involved in regulating the expression pattern of genes involved in influencing various developmental pathways (Table 2), information that will provide a useful opportunity to understand and elucidate similar molecular mechanisms in D. magna. Recently developed gene manipulation techniques during the embryonic development of Daphnia , and the availability of the related set of common miRNA sequences are likely to provide a significant opportunity to reveal epigenetic regulation of embryonic stages in this model organism.
S1 Table. Raw data after miRNA sequence analysis.
Table represents the raw data that shows the parameter values for each sequence having 90% or higher identity and 0.1 E-value. The raw data was obtained after running the blast code using 815,699 reads that remained after the clean-up process. miRNAs that show homologues in humans and model organisms (Mus musculus, Drosophila melanogaster,Caenorhabditis elegans and D. pulex) are presented.
S2 Table. Predicted miRNA sequences and corresponding mature miRNA sequences.
Predicted miRNA sequences were named by adding a dma- suffix to homologous sequence names identified in other species. Trimmed versions of raw sequence read data are presented as predicted mature miRNA sequence. A simple Perl code was used to trim adaptor sequences from the raw data and calculate the sequence lengths.
S3 Table. Comparison of putative miRNA sequences from D. magna with available miRNAs.
Data is an expansion of that presented in Table 1. The total number of predicted sequence hits for each species was determined. The percent of conserved miRNA transcripts with D. magna was calculated by dividing the number of homologous miRNAs by the total number of miRNAs identified for each species.
S4 Table. Extracted miRNA precursor sequences from D. pulex genome.
Table represents the raw data for pairwise analysis of miRNA sequences that matched in Daphnia pulex genome. Extracted sequence and corresponding scaffold information are represented for the 18 predicted D. magna miRNAs.
S5 Table. Pairwise comparison of predicted D. magna miRNAs with those involved in development-related pathways in Homo sapiens and Mus musculus.
Table represents the list of predicted miRNAs involved in calcium, Hedgehog, WNT, Notch and chemokine signaling pathways in Homo sapiens and Mus musculus. For each signaling pathway, homologous miRNAs matching those downloaded from the TarBase database are presented. ‘Total’ represents the unique number of development specific miRNAs based on their involvement in one of the four signaling cascades: calcium, Hedgehog, WNT, Notch and/or chemokine pathways.
S6 Table. Comparison of miRNA family members between D. magna and other model species.
miRNA families with homologous predicted sequence hits were compared for their presence in Homo sapiens, Mus musculus, Drosophila melanogaster, and Caenorhabditis elegans. A predicted miRNA family member found in D. magna and one of the four model species is indicated with a green “√” sign; a red “X” indicates that a homolog for the putative D. magna miRNA was not found in that species.
We would like to thank the Abant İzzet Baysal University and Mississippi State University Department of Biological Sciences for their support.
Conceived and designed the experiments: MT DMG ESÜ. Performed the experiments: MT DMG. Analyzed the data: ESÜ. Contributed reagents/materials/analysis tools: DMG ESÜ. Wrote the paper: DMG MT ESÜ. Perl programming: ESÜ.
- 1. Carthew RW, Sontheimer EJ. Origins and mechanisms of miRNAs and siRNAs. Cell. 2009; 136: 642–55. pmid:19239886
- 2. Brenneck J, Hipfner DR, Stark A, Russell RB, Cohen SM. bantam encodes a developmentally regulated microRNA that controls cell proliferation and regulates the proapoptotic gene hid in Drosophila. Cell. 2003; 113(1): 25–36. pmid:12679032
- 3. Price C, Chen J. MicroRNAs in cancer biology and therapy: Current status and perspecives. Genes and Diseases. 2014; 1: 53–63. pmid:25473652
- 4. Orellan EA, Kasinski AL. MicroRNAs in Cancer: A historical perspective on the path from discovery to therapy. Cancers. 2015; 1388–1405. pmid:26226002
- 5. Wang Y, Mao Z, Cheng X, Liu F, Xiao L, Dai L, et al. Identification of microRNAs in Meloidogyne incognita using deep sequencing. PLoS ONE. 2015; 10(8): e0133491. pmid:26241472
- 6. Wang H, Liu S, Cui J, Li C, Hu Y, Zhou W, et al. Identification and characterization of microRNAs from longitudinal muscle and respiratory tree in sea cucumber (Apostichopus japonicus) using high-throughput sequencing. PLoS ONE. 2015; 10(8): e0134899. pmid:26244987
- 7. Niu S, Fan G, Deng M, Zhao Z, Xu E, Cao L. Discovery of microRNAs and transcript targets related to witches’ broom disease in Paulownia fortune by high-throughput sequencing and degradome approach. Mol Genet Genomics. 2015;
- 8. Reynolds CS. Daphnia: Development of model organism in ecology and evolution. Freshwater Reviews. 2011; 4: 85–87.
- 9. Lampert W. Daphnia: model herbivore, predator and prey. Polish Journal of Ecology. 2006; 54: 607–620.
- 10. Meester L, De Dawidowicz P, Loose C, van Gool E. Ecology and evolution of predator-induced behavior of zooplankton: depth selection behavior and diel vertical migration. The Ecoloy and Evolution of Inducible Defenses (Tollrian R, Harvell D, Eds) Princeton Univ Press; 1999.
- 11. Colbourne JK, Pfrender ME, Gilbert D, Thomas WK, Tucker A, Oakley TH. The ecoresponsive genome of Daphnia pulex. Science. 2011; 331(6017): 555–561. pmid:21292972
- 12. Kato Y, Shiga Y, Kobayashi K, Tokishita S, Yamagata H, Iguchi T. Development of an RNA interference method in the cladoceran crustacean Daphnia magna. Dev Genes Evol. 2011; 220: 337–345. pmid:21327957
- 13. Chen S, McKinney GJ, Nichols KM, Sepúlveda MS. In silico prediction and in vivo validation of Daphnia pulex microRNAs. PloS One. 2014; 9(1): e83708. pmid:24400076
- 14. Gomes CP, Cho JH, Hood L, Franco OL, Pereira RW, Wang K. A review of computational tools in microRNA discovery Front. Genet. 2013; 4: 81
- 15. Kou Y, Qiao L, Wang Q. Identification of core miRNA based on small RNA-seq and RNA-seq for colorectal cancer by bioinformatics. Tumor Biology. 2015; 36(4): 2249–2255. pmid:25412953
- 16. Bai J, Dan C, Zhang Y, Zhao G, Ding X. NGSmirPlant: comprehensive characterization of the small RNA transcriptomes of plants. Protein and Cell. 2015; 6(6): 397–402. pmid:25953031
- 17. Alon S, Erew M, Eisenberg E. DREAM: a webserver for the identification of editing sites in mature miRNAs using deep sequencing data. Bioinformatics. 2015; 31(15): 2568–2570. pmid:25840043
- 18. Stajich JE, Block D, Boulez K, Brenner SE, Chervitz SA, Dagdigian C, et. al. The Bioperl toolkit: Perl modules for the life sciences. Genome Research. 2002; 12(10): 1611–1618. pmid:12368254
- 19. Griffiths-Jones S, Saini HK, van Dongen S, Enright AJ. miRBase: tools for microRNA genomics. Nucleic Acids Research. 36 (Database issue) D154–8. pmid:17991681
- 20. Zuker M. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Research. 2003; 31(13): 3406–3415. pmid:12824337
- 21. Mathews DH, Sabina J, Zuker M, Turner DH. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. Journal of Molecular Biology. 1999; 288(5): 911–940. pmid:10329189
- 22. Vergoulis T, Vlachos IS, Alexiou P, Georgakilas G, Maragkakis M, Reczko M, et al. TarBase 6.0: capturing the exponential growth of miRNA targets with experimental support. Nucleic Acids Research. 2012; 40: D222–229. pmid:22135297
- 23. Jima DD, Zhang J, Jacobs C, Richards KL, Dunphy CH, Choi W, et al. Deep sequencing of the small RNA transcriptome of normal and malignant human B cells identifies hundreds of novel microRNAs. Blood. 2010; 116 (23): 118–127.
- 24. Yoo JK, Kim J, Choi S, Noh HM, Kwon YD, Yoo H, et al. Discovery and Characterization of Novel MicroRNAs During Endothelial Differentiation of Human Embryonic Stem Cells. Stem Cells and Development. 2012; 21(11): 2049–2057. pmid:22142236
- 25. Chen L, Lü MH, Zhang D, Hao NB, Fan YH, Wu YY, et al. miR-1207-5p and miR-1266 suppress gastric cancer growth and invasion by targeting telomerase reverse transcriptase. Cell Death and Disease. 2014; 5(1): e1034.
- 26. Chang L, Guo F, Huo B, Lv Y, Wang Y, Liu W. Expression and clinical significance of the microRNA-200 family in gastric cancer. Oncology Letters. 2015; 9(5): 2317–2324. pmid:26137064
- 27. Weick EM, Miska EA. piRNAs: from biogenesis to function. Development. 2014; 141: 3458–3471. pmid:25183868