Abstract
The MSM/Ms mouse, derived from the Japanese wild mouse, has unique characteristics compared to the widely used C57BL/6 mouse. To examine the usefulness of the MSM/Ms mouse for comparative genomic analysis, the expression of small RNAs was analyzed by large-scale sequence analysis for two mouse strains, C57BL/6 and MSM/Ms. As a trial, the expression of box C/D snoRNAs, which are the most abundant small RNAs in the cell, was analyzed. By comparing the read numbers for each fragment, 11 snoRNAs with single nucleotide polymorphisms (SNPs) were detected. One of these snoRNAs, SNORD53, was expressed only in MSM/Ms, and this snoRNA has a mutation in the box sequence in C57BL/6. Thus, it was demonstrated that the proposed experimental system using SNPs can give new insight into the regulation of gene expression.
Citation: Saito R, Otsu M, Kiyosawa H, Kawai G (2023) Expression analysis of box C/D snoRNAs with SNPs between C57BL/6 and MSM/Ms strains in male mouse. PLoS ONE 18(7): e0288362. https://doi.org/10.1371/journal.pone.0288362
Editor: Charalampos G. Spilianakis, University of Crete & IMBB-FORTH, GREECE
Received: March 1, 2023; Accepted: June 24, 2023; Published: July 10, 2023
Copyright: © 2023 Saito et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Among the experimental mouse strains (Mus musculus), the C57BL/6 mouse (B6) has been one of the most popular strains used in a variety of research fields [1–3]. Although more than 20 substrains have been established for B6, it has been widely used as a background strain [1–3]. In contrast, the MSM/Ms or Mishima mouse (MSM) shows unique characteristics compared to B6 [4–6]. For example, MSM is less likely to develop tumors than B6 [7], is tolerant to deafness [8], and shows behavioral traits such as high athletic activity and aggression [9, 10]. The genome sequence of MSM differs from that of B6 by about 1% [11, 12]. Thus, comparative genomics with B6 and MSM should contribute to a deeper understanding of gene regulation in the mouse.
It is well known that many small RNAs are expressed in the cell and have important biological roles. Recently, small RNAs (about 60 to 80 nt) were found during an analysis of transcription from sense–antisense transcript (SAT) loci [13, 14]. These small RNAs were detected from the exon-overlapping regions of SAT loci and may have regulatory roles in the mammalian genome. Thus, the search for small RNAs is expected to reveal novel biological functions.
In the present study, we focused on the expression of small nucleolar RNAs (snoRNAs) as a trial expression analysis because these are the most abundant small RNAs in cells. There are two types of snoRNA, C/D box snoRNA (SNORD) and H/ACA box snoRNA (SNORA), where SNORD forms a loose stem-loop and SNORA forms a tight two-hairpin structure [15]. Many snoRNAs guide the site-specific chemical modification of ribosomal RNA [16] and some snoRNAs are located at imprinted loci [17]. Comparison of snoRNA expression is anticipated to give novel insights into the mechanism of genomic imprinting.
To analyze small RNAs, we previously performed next generation sequencing (NGS) analysis of small RNA fractions and found several novel structured small RNAs in the brain of B6 [18]. In the present study, we applied a similar technique to B6 and MSM toward comparative genomics of small non-coding RNAs. As a first trial, we focused on snoRNAs, which are abundant and have several important functions in the cell as described above. This method can be used for the analysis of differential expression of several snoRNAs.
Materials and methods
Preparation of RNA samples
RNAs were extracted from the brains of male C57BL/6J mice (8–10 weeks, CLEA, Tokyo, Japan) and male MSM/Ms mice (8–10 weeks, provided by RIKEN BRC) by using TRIzol reagent (Invitrogen) according to the manufacturer’s protocol. Brains from typically two mice were homogenized for each subspecies before RNA extraction. Total RNA (60 μg) was then separated by 12% polyacrylamide/7 M urea denaturing gel electrophoresis. The gel was stained with ethidium bromide and irradiated with UV light to detect RNA bands. Small RNAs of about 40 to 140 nt in length were recovered from the gel to obtain 2.2 and 3.2 μg/μl samples for B6 and MSM, respectively. Mouse experiments were performed in accordance with the institutional guidelines of Chiba Institute of Technology and Kochi University, and were approved by the Animal Care and Use committees of both institutions under the approval numbers 18002 and J-00022, respectively. The mice were properly sacrificed by cervical dislocation.
Sequencing
DNA libraries for sequencing were prepared with three kinds of preparation kits: TruSeq Small RNA Library Prep Kit (TruSeq, Illumina, Inc.) [19], NEBNext Multiplex Small RNA Library Prep Set for Illumina (NEBNext, New England BioLabs Inc.) and SMARTer® microRNA-Seq Kit (SMARTer, Takara Bio USA, Inc.) [20–22]. In general, the manufacturers’ protocols were followed except for the final size selection step. In the final step, DNAs corresponding to small RNAs were selected with the SPRIselect size selection kit (Beckman Coulter, Inc.) using the double-sided selection protocol. The quality of the DNA libraries was confirmed with the Bioanalyzer 2100 (Agilent Technologies) and the High Sensitivity DNA kit. The concentrations of the DNA libraries were around 5 nM. Paired-end sequencing of 151 nt was performed on the MiniSeq system (Illumina) with the MiniSeq High Output Kit (300 Cycles), and the RNA sequences were extracted by home-made programs as described in Kiyosawa et al. 2015 [18]; for each pair of 150 nt paired-end reads, sequences overlapping by at least 10 nt were extracted as a small RNA.
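The extraction of a small RNA from an overlapping read pair can be illustrated with a minimal sketch. This is not the home-made program used in the study; the function names `reverse_complement` and `merge_pair` and the exact-match rule are assumptions made only for illustration. With 151-nt reads and a minimum overlap of 10 nt, the longest fragment such a merge could recover would be 2 × 151 − 10 = 292 nt, which is consistent with the upper length bound given in the Data analysis section.

```python
# Hypothetical sketch of paired-end merging; not the authors' program.
# Assumes adaptor-trimmed reads and requires an exact match in the overlap.

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(read1, read2, min_overlap=10):
    """Merge a read pair into one fragment, or return None.

    read2 is reverse-complemented so that both reads are on the same
    strand. The longest overlap is tried first, so that a short fragment
    covered entirely by both reads is returned whole.
    """
    rc2 = reverse_complement(read2)
    longest = min(len(read1), len(rc2))
    for overlap in range(longest, min_overlap - 1, -1):
        if read1[-overlap:] == rc2[:overlap]:
            return read1 + rc2[overlap:]
    return None


if __name__ == "__main__":
    # Toy 30-nt fragment sequenced with 20-nt reads (10-nt overlap).
    fragment = "ACGTTGCAAGGCTTACGGATCCATGCAAGT"
    r1 = fragment[:20]
    r2 = reverse_complement(fragment[-20:])
    print(merge_pair(r1, r2) == fragment)
```

A production tool would normally tolerate mismatches in the overlap and use base qualities; the exact-match rule here only keeps the sketch short.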
Data analysis
The obtained RNA sequences of 10–292 nt in length were processed mainly with Microsoft Excel and partly with home-made programs. The sequences from the brains of B6 and MSM were compared with each other, and sequences showing a more than 100-fold difference between MSM and B6, or more than 100 for one strain and 0 for the other, in read number per million total read number (rpm) were selected. Each sequence was mapped to the mouse genome (GRCm39/mm39) using BLAT in the UCSC Genome Browser (https://genome.ucsc.edu) or subjected to a BLAST search in the snoRNA orthological gene database snOPY [23].
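The selection rule described above can be sketched as follows. The study used Microsoft Excel and home-made programs, so this Python version is only an illustrative assumption; the function names `to_rpm` and `select_candidates` are hypothetical, and the handling of the exact boundaries (strictly more than 100-fold, at least 100 rpm for strain-specific sequences) is a guess that may differ from the actual processing.

```python
# Hypothetical sketch of rpm normalization and candidate selection.

def to_rpm(counts):
    """Convert raw read numbers per sequence to reads per million (rpm)."""
    total = sum(counts.values())
    return {seq: n * 1_000_000 / total for seq, n in counts.items()}


def select_candidates(rpm_b6, rpm_msm, fold=100.0, min_rpm=100.0):
    """Return sequences that differ strongly between the two strains.

    A sequence is selected if it is detected in both strains with a
    more than `fold`-fold difference in rpm, or if it is detected in
    only one strain with at least `min_rpm` rpm (boundary handling is
    an assumption).
    """
    selected = []
    for seq in sorted(set(rpm_b6) | set(rpm_msm)):
        b6 = rpm_b6.get(seq, 0.0)
        msm = rpm_msm.get(seq, 0.0)
        if b6 > 0 and msm > 0:
            if max(b6, msm) / min(b6, msm) > fold:
                selected.append(seq)
        elif max(b6, msm) >= min_rpm:
            selected.append(seq)
    return selected


if __name__ == "__main__":
    # Toy rpm values keyed by placeholder sequence names.
    b6 = {"a": 500.0, "b": 2.0, "c": 150.0, "d": 50.0}
    msm = {"a": 450.0, "b": 400.0}
    print(select_candidates(b6, msm))
```

In this toy input, `b` is selected through the fold-change rule and `c` through the strain-specific rule, while `a` (similar in both strains) and `d` (strain-specific but below 100 rpm) are not.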
Results
Detection of differential expressed sequences
An overview of the analysis in this work is summarized in Fig 1. RNAs were extracted from the brains of B6 and MSM, and DNA libraries were prepared with three kinds of DNA library preparation kits, TruSeq, NEBNext and SMARTer. These kits are optimized for small RNAs such as miRNAs. Then, each DNA library was sequenced and RNA sequences were extracted from the obtained sequence reads. Hereafter, the three data sets obtained from the three libraries are called analyses 1–3, respectively. By the NGS analysis, 30,063,505 reads were obtained in total, yielding 27,261,887 RNA sequences with lengths from 10 to 292 nt, a range that covers most SNORDs and SNORAs. Hereafter, the number of RNA sequences is called the read number. As shown in Table 1, 3,767,220–5,498,823 RNA sequences were obtained for each DNA library, containing 180,679–406,614 RNA fragments. To find candidates for differentially expressed RNAs, the correlation of the read numbers for each sequence was analyzed as shown in Fig 2. As shown in Fig 1, RNAs with 100-fold differences in rpm were selected. In each analysis, most RNAs showed similar read numbers between B6 and MSM (Fig 2, left for each panel) and a small number of RNAs showed 100-fold differences (outside of the lines in Fig 2). In addition, RNA sequences found in only one strain with more than 100 rpm in each analysis were also selected (Fig 2, categories 3 and 4 in right for each panel). The selected RNAs were annotated against the mouse genome (GRCm39/mm39) and snoRNAs were picked out. The numbers of selected RNAs are summarized in Table 2. Notably, the other selected RNAs included fragments of ribosomal RNAs (rRNAs), mitochondrial rRNAs, tRNAGlu, tRNAGly and long non-coding RNAs.
rpm: read number per million total read number.
(A–C) correspond to analyses 1, 2 and 3, respectively. Left: correlation in number of reads for each fragment (rpm) between B6 and MSM. Lines indicate the 100-fold difference. Right: distribution of read numbers for fragments which were detected in only one strain. The categories 1–4 indicate x < 10, 10 ≤ x < 100, 100 ≤ x < 1,000 and 1,000 ≤ x (rpm), respectively.
It is noted that several fragments with different 5′ and/or 3′ ends were found for each RNA. Sometimes, variations of the 3′ terminal residue were observed. These variations may have arisen during the processing of snoRNA and/or by degradation during the sample preparation. Thus, the number of RNAs in Table 2 is smaller than the number of selected sequences in Fig 2.
In the present study, we focused on SNORDs, which are the most abundant small RNAs in cells. As shown in Table 2, 14 to 19 SNORDs were selected for each library. Twenty-three SNORDs were detected in total (Fig 3). Expression of SNORD28 was observed only in analysis 1 and expression of SNORD104 was observed only in analysis 2. In contrast, eleven SNORDs, SNORD31, SNORD33, SNORD38, SNORD45c, SNORD49b, SNORD52, SNORD53, SNORD58b, SNORD100, SNORD115 and SNORD116, were commonly selected among the three analyses and, thus, we further analyzed these 11 SNORDs. The sums of read numbers in rpm for each SNORD are shown in S1 Table, and the target RNAs and host genes for these SNORDs are shown in S2 Table. Most of the selected SNORDs target rRNAs, except for SNORD115 and SNORD116, whose targets are unknown. In contrast, the host genes vary.
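The step of keeping only the SNORDs selected in all three analyses amounts to a set intersection, which can be sketched as below. The function names are hypothetical, and the input sets are deliberately truncated toy subsets (only the memberships of SNORD28, SNORD104, SNORD53 and SNORD115 stated above are used), not the full lists behind Fig 3.

```python
# Hypothetical sketch: count in how many analyses each SNORD was selected.
from collections import Counter


def count_support(analyses):
    """Map each SNORD name to the number of analyses that selected it."""
    support = Counter()
    for selected in analyses.values():
        support.update(set(selected))
    return support


def common_to_all(analyses):
    """Return the SNORDs selected in every analysis, sorted by name."""
    support = count_support(analyses)
    return sorted(name for name, n in support.items() if n == len(analyses))


if __name__ == "__main__":
    # Truncated toy subsets for illustration only.
    toy = {
        "analysis 1 (TruSeq)": {"SNORD53", "SNORD115", "SNORD28"},
        "analysis 2 (NEBNext)": {"SNORD53", "SNORD115", "SNORD104"},
        "analysis 3 (SMARTer)": {"SNORD53", "SNORD115"},
    }
    print(common_to_all(toy))
```

Counting the support per SNORD, rather than intersecting directly, also shows which SNORDs were found by one or two kits only.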
Analysis of the detected RNAs
Most SNORDs could be paired between sequences specific to B6 and MSM (Table 3), indicating that the differences in read number are caused by SNPs rather than by differences in expression. SNORDs other than SNORD58b, SNORD115 and SNORD116 have a single gene in the genome, and one or two SNPs between B6 and MSM were found (Table 3, S3 Table). The small read numbers for the counterpart of each strain pair are probably due to contamination during the RNA and/or DNA library preparations. They are unlikely to be errors in the sequence analysis because other variant sequences were hardly detected. In the case of SNORD49b, two kinds of sequences were found for MSM with similar read numbers. The reason for this is unknown. Three SNORD58b genes are encoded in introns of the ribosomal protein L17 gene, and two of them have SNPs as shown in Fig 4. Among the three genes, the products of Gm23301 and Gm26202 were identified in this study. Despite read numbers of about 30,000 for both Gm23301 and Gm26202, SNORD58b has hardly been detected. SNORD58b may not be expressed in the brain of B6. Moreover, Gm23301 and Gm26202 target the same RNA.
(A) Locus of Gm23301 and Gm26202 on the ribosomal protein L17 gene in Chromosome 18. Three snoRNAs are coded on the introns of this gene and two of them have SNPs between B6 and MSM. (B) SNPs in Gm23301 and Gm26202. Red: possible SNP residues. Orange: Box sequences. Green: Complementary sequence to target RNA.
SNORD115 and SNORD116 are coded in introns of the small nucleolar RNA host gene 14 (Snhg14) as gene clusters [24]. Although these snoRNAs have sequence variations in each strain, differential expression is still observed by single SNPs (Table 3 and S3 Table).
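The pairing of B6-specific and MSM-specific sequences that differ by one or two SNPs can be sketched as a simple mismatch comparison. This is an illustrative assumption rather than the procedure actually used: the function names are hypothetical, the sequences are made-up toy strings, and fragments are assumed to have identical 5′ and 3′ ends so that they can be compared position by position.

```python
# Hypothetical sketch: pair strain-specific sequences by mismatch count.

def snp_positions(a, b):
    """Return 0-based mismatch positions, or None if lengths differ."""
    if len(a) != len(b):
        return None
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def pair_strain_specific(b6_seqs, msm_seqs, max_snps=2):
    """Pair sequences that differ at 1 to `max_snps` positions."""
    pairs = []
    for b6 in b6_seqs:
        for msm in msm_seqs:
            diffs = snp_positions(b6, msm)
            if diffs is not None and 1 <= len(diffs) <= max_snps:
                pairs.append((b6, msm, diffs))
    return pairs


if __name__ == "__main__":
    # Toy sequences, not real snoRNA sequences.
    b6_only = ["ACGUACGUAC", "GGGGCCCCAA"]
    msm_only = ["ACGUACGAAC", "UUUUUUUUUU"]
    for b6, msm, diffs in pair_strain_specific(b6_only, msm_only):
        print(b6, msm, diffs)
```

Fragments with ragged ends would first need to be aligned or trimmed to a common region, which this sketch does not attempt.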
Analysis of SNPs
In the present study, most SNPs were located neither in the box sequences nor in the sequences complementary to the target RNAs (S3 Table), suggesting that these SNPs are functionally silent. The only exception is SNORD53, for which the SNP is in the first position of the box sequence CUGA (Fig 5). SNORD53 of MSM possesses the two consensus CUGA box sequences, whereas that of B6 has a mutation in the second box according to the mouse genome (GRCm39/mm39). The target of SNORD53 is 28S rRNA, and the second CUGA box sequence is conserved among mammals. It is possible that SNORD53 lost its function in B6, and this may be the reason for the absence of SNORD53 sequences in B6. The expression of the mutated SNORD53 may be suppressed, and/or the dysfunctional SNORD53 may be degraded in B6.
(A) Locus of SNORD53 on the WD repeat domain 43 gene in chromosome 17. The snoRNA is coded on the introns of this gene. (B) SNPs in SNORD53. Red: possible SNP residues. Orange: Box sequences. Green: Complementary sequence to target RNA.
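Checking whether a SNP disrupts a CUGA box can be sketched as a motif search on the two strain sequences. The function `box_positions` is a hypothetical helper, and the two sequences below are invented toy strings, not the real SNORD53 sequences; the substituted base in the B6-like string is likewise an assumption, chosen only so that the first position of the second box is changed.

```python
# Hypothetical sketch: locate CUGA box motifs and compare two variants.

def box_positions(rna, motif="CUGA"):
    """Return the 0-based start positions of every motif occurrence."""
    positions = []
    start = rna.find(motif)
    while start != -1:
        positions.append(start)
        start = rna.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    # Toy sequences: the second string differs at one position (index 19).
    msm_like = "UGAUGAUCUGAACGUUACGCUGAGC"
    b6_like = "UGAUGAUCUGAACGUUACGUUGAGC"
    kept = box_positions(b6_like)
    lost = sorted(set(box_positions(msm_like)) - set(kept))
    print("boxes in MSM-like sequence:", box_positions(msm_like))
    print("boxes lost in B6-like sequence:", lost)
```

A plain string search ignores positional context, so in a real sequence a CUGA occurrence outside the annotated box regions would also be reported.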
Discussion
In this study, differential expression of box C/D snoRNAs between two mouse strains, B6 and MSM, was analyzed by large-scale sequence analysis. In most cases, the apparent differential expression was found to be due to SNPs rather than to different regulation of gene expression. In most cases, no SNPs were found in functional sequences such as the box sequences or the sequences complementary to target RNAs. As described below, SNORD53 was identified only for MSM, indicating that SNP-based analysis is effective for the analysis of differential expression between strains. Thus, it was demonstrated that the comparison between the two strains gives new insight into the regulation of gene expression. It should be noted that SNORD115 and SNORD116, which form large gene clusters, are also basically distinguishable by SNPs despite the polymorphisms in these loci. For SNORD115, for example, B6-1 to 5 are B6 specific and MSM-1 and 2 are MSM specific, as shown in Table 3. It is noted that sequences shared between the two strains were not detected in this analysis. Functionally silent SNPs can be used for the discrimination of expressed alleles. Similar results were also obtained for SNORD116. This approach may therefore be useful for the functional analysis of SNORD115 and SNORD116.
Expression of SNORD53 may be different between the two strains; this snoRNA was found only for MSM. In fact, when the sequence of SNORD53 was searched for in the three data sets for B6, only negligible read numbers were found (3, 20 and 14 rpm for the respective analyses). SNORD53 in B6 has a mutation in the box sequence, and this may cause gene silencing or degradation of the product. Thus, it was demonstrated that the comparison between the two strains gives new insight into the regulation of gene expression. Because the genome sequence of MSM has been determined, it is possible to identify snoRNAs with SNPs from the genome sequences alone. However, the method in this study can show the existence of SNPs and the expression levels simultaneously. In addition, this method can be applied to strains for which the genome sequence has not been determined. Further analysis will give more information on the differential expression of small RNAs.
As shown in Fig 3, more than half of the snoRNAs were shared among the three results despite the different DNA library kits. In analysis 1 (TruSeq), the 5′ and 3′ adaptors were simply ligated and the products were amplified by PCR. In analysis 2 (NEBNext), the primers for reverse transcription were added after the 3′ adaptor ligation to prevent the formation of adaptor dimers. Then the 5′ adaptor was ligated and the products were amplified. In contrast, in analysis 3 (SMARTer), RNAs were circularized after the 3′ adaptor ligation and then reverse transcribed and amplified. SNORDs seem to be effectively incorporated into the DNA libraries by all three methods. It is noted that the sequences of snoRNAs obtained for C57BL/6J completely agreed with those of GRCm39/mm39, suggesting the accuracy of the sequencing as well as the suitability of C57BL/6J as a control subspecies for this work.
Some snoRNAs are located at imprinted loci and are affected by the expression mechanism of the imprinted gene [18]. snoRNAs are known to be involved in the regulation of gene expression, because disruption of the expression of an imprinted locus affects the expression of the snoRNA groups located there and causes abnormal development and diseases [23].
All mammals, including humans, inherit one set of chromosomes from each of the father and the mother, and the genes derived from both parents are expressed [15]. However, some genes are subject to a gene expression mechanism called genomic imprinting, in which only one of the two alleles is expressed in a parent-of-origin manner [25, 26]. Most genes that are subject to genomic imprinting control have important roles in regulating the development of individuals and the activities of living organisms [27]. It has been demonstrated that monoallelic expression of RNAs can be analyzed by using F1 hybrid mice between B6 and MSM [28]. In the F1 hybrid, several genes show expression patterns different from those of the fathers and mothers. For example, incompatibility due to strain difference arises from factors such as mismatched combinations of transcription regulatory factors [28]. To analyze this incompatibility, it is important to know the normal expression pattern, and the proposed method based on SNPs is useful for such analyses. In addition, the proposed experimental system using SNPs can be used as a tool for analyzing imprinting by providing reference data for the expression.
Supporting information
S1 Table. Read numbers of selected box C/D snoRNAs (SNORDs) in rpm.
https://doi.org/10.1371/journal.pone.0288362.s001
(PDF)
S2 Table. Properties of the selected box C/D snoRNAs (SNORDs).
https://doi.org/10.1371/journal.pone.0288362.s002
(PDF)
Acknowledgments
The authors thank Prof. Naoya Kenmochi and Dr. Tamayo Uechi for their helpful comments.
References
- 1. Mekada K, Yoshiki A. Substrains matter in phenotyping of C57BL/6 mice. Exp Anim. 2021;70(2):145–60. Epub 20210114. pmid:33441510.
- 2. Karp NA, Mason J, Beaudet AL, Benjamini Y, Bower L, Braun RE, et al. Prevalence of sexual dimorphism in mammalian phenotypic traits. Nat Commun. 2017;8:15475. Epub 20170626. pmid:28650954.
- 3. The International Mouse Phenotyping Consortium (IMPC) website. [Accessed on 17 June 2023.] Available from: https://www.mousephenotype.org/
- 4. Takada T, Mita A, Maeno A, Sakai T, Shitara H, Kikkawa Y, et al. Mouse inter-subspecific consomic strains for genetic dissection of quantitative complex traits. Genome Res. 2008;18(3):500–8. Epub 20080206. pmid:18256237.
- 5. Moriwaki K, Miyashita N, Mita A, Gotoh H, Tsuchiya K, Kato H, et al. Unique inbred strain MSM/Ms established from the Japanese wild mouse. Exp Anim. 2009;58(2):123–34. pmid:19448335.
- 6. RIKEN BioResource Research Center, MSM/Ms Strain Data Sheet. [Accessed on 17 June 2023.] Available from: https://knowledge.brc.riken.jp/resource/animal/card?brc_no=RBRC00209&__lang__=en
- 7. Nakanishi M, Tazawa H, Tsuchiya N, Sugimura T, Tanaka T, Nakagama H. Mouse strain differences in inflammatory responses of colonic mucosa induced by dextran sulfate sodium cause differential susceptibility to PhIP-induced large bowel carcinogenesis. Cancer Sci. 2007;98(8):1157–63. Epub 20070615. pmid:17573895.
- 8. Nemoto M, Morita Y, Mishima Y, Takahashi S, Nomura T, Ushiki T, et al. Ahl3, a third locus on mouse chromosome 17 affecting age-related hearing loss. Biochem Biophys Res Commun. 2004;324(4):1283–8. pmid:15504353.
- 9. Koide T, Moriwaki K, Ikeda K, Niki H, Shiroishi T. Multi-phenotype behavioral characterization of inbred strains derived from wild stocks of Mus musculus. Mamm Genome. 2000;11(8):664–70. pmid:10920237.
- 10. Takahashi A, Sugimoto H, Kato S, Shiroishi T, Koide T. Mapping of Genetic Factors That Elicit Intermale Aggressive Behavior on Mouse Chromosome 15: Intruder Effects and the Complex Genetic Basis. PLoS One. 2015;10(9):e0137764. Epub 20150921. pmid:26389588.
- 11. Takada T, Ebata T, Noguchi H, Keane TM, Adams DJ, Narita T, et al. The ancestor of extant Japanese fancy mice contributed to the mosaic genomes of classical inbred strains. Genome Res. 2013;23(8):1329–38. Epub 20130419. pmid:23604024.
- 12. Okumura K, Saito M, Isogai E, Wakabayashi Y. The Japanese Wild-Derived Inbred Mouse Strain, MSM/Ms in Cancer Research. Cancers (Basel). 2021;13(5). Epub 20210301. pmid:33804471.
- 13. Kiyosawa H, Mise N, Iwase S, Hayashizaki Y, Abe K. Disclosing hidden transcripts: mouse natural sense-antisense transcripts tend to be poly(A) negative and nuclear localized. Genome Res. 2005;15(4):463–74. Epub 20050321. pmid:15781571.
- 14. Okada Y, Tashiro C, Numata K, Watanabe K, Nakaoka H, Yamamoto N, et al. Comparative expression analysis uncovers novel features of endogenous antisense transcription. Hum Mol Genet. 2008;17(11):1631–40. Epub 20080218. pmid:18283053.
- 15. Bergeron D, Fafard-Couture É, Scott MS. Small nucleolar RNAs: continuing identification of novel members and increasing diversity of their molecular mechanisms of action. Biochem Soc Trans. 2020;48(2):645–56. pmid:32267490.
- 16. Bachellerie JP, Cavaillé J, Hüttenhofer A. The expanding snoRNA world. Biochimie. 2002;84(8):775–90. pmid:12457565.
- 17. Baldini L, Robert A, Charpentier B, Labialle S. Phylogenetic and Molecular Analyses Identify SNORD116 Targets Involved in the Prader-Willi Syndrome. Mol Biol Evol. 2022;39(1). pmid:34893870.
- 18. Kiyosawa H, Okumura A, Okui S, Ushida C, Kawai G. Secondary structure-based analysis of mouse brain small RNA sequences obtained by using next-generation sequencing. Genomics. 2015;106(2):122–8. Epub 20150520. pmid:26003051.
- 19. TruSeq® Small RNA Library Prep Reference Guide. [Accessed on 17 June 2023.] Available from: https://support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/samplepreps_truseq/truseqsmallrna/truseq-small-rna-library-prep-kit-reference-guide-15004197-02.pdf
- 20. NEBNext® Multiplex Small RNA Library Prep Set for Illumina® Set 1, Set 2, Index Primers 1–48 and Multiplex Compatible. [Accessed on 17 June 2023.] Available from: https://international.neb.com/-/media/nebus/files/manuals/manuale7300_e7330_e7560_e7580.pdf?rev=d0964a2e637843b1afcb9f7d666d07b2&hash=5B733FC9B41103A865143C75D0F3FC5D
- 21. SMARTer® microRNA-Seq Kit User Manual. [Accessed on 17 June 2023.] Available from: https://www.takarabio.com/documents/User%20Manual/SMARTer%20microRNA/SMARTer%20microRNA-Seq%20Kit%20User%20Manual_080618.pdf
- 22. Zhu YY, Machleder EM, Chenchik A, Li R, Siebert PD. Reverse transcriptase template switching: a SMART approach for full-length cDNA library construction. Biotechniques. 2001;30(4):892–7. pmid:11314272.
- 23. Yoshihama M, Nakao A, Kenmochi N. snOPY: a small nucleolar RNA orthological gene database. BMC Res Notes. 2013;6:426. Epub 20131023. pmid:24148649.
- 24. Cavaillé J, Buiting K, Kiefmann M, Lalande M, Brannan CI, Horsthemke B, et al. Identification of brain-specific and imprinted small nucleolar RNA genes exhibiting an unusual genomic organization. Proc Natl Acad Sci U S A. 2000;97(26):14311–6. pmid:11106375.
- 25. Wang T, Li J, Yang L, Wu M, Ma Q. The Role of Long Non-coding RNAs in Human Imprinting Disorders: Prospective Therapeutic Targets. Front Cell Dev Biol. 2021;9:730014. Epub 20211025. pmid:34760887.
- 26. Kaneko-Ishino T, Ishino F. The Evolutionary Advantage in Mammals of the Complementary Monoallelic Expression Mechanism of Genomic Imprinting and Its Emergence From a Defense Against the Insertion Into the Host Genome. Front Genet. 2022;13:832983. Epub 20220303. pmid:35309133.
- 27. Macdonald WA. Epigenetic mechanisms of genomic imprinting: common themes in the regulation of imprinted regions in mammals, plants, and insects. Genet Res Int. 2012;2012:585024. Epub 20120215. pmid:22567394.
- 28. Kondo S, Kato H, Suzuki Y, Takada T, Eitoku M, Shiroishi T, et al. Monoallelic, antisense and total RNA transcription in an in vitro neural differentiation system based on F1 hybrid mice. J Cell Sci. 2019;132(17). Epub 20190909. pmid:31409693.