Identification of a Putative Quantitative Trait Gene for Resistance to Obesity in Mice Using Transcriptome Analysis and Causal Inference Tests

It is still challenging to identify causal genes governing obesity. Pbwg1.5, a quantitative trait locus (QTL) for resistance to obesity, was previously discovered from wild Mus musculus castaneus mice and was fine-mapped to a 2.1-Mb genomic region of mouse chromosome 2, where no known gene with an effect on white adipose tissue (WAT) has been reported. The aim of this study was to identify a strong candidate gene for Pbwg1.5 by an integration approach of transcriptome analysis (RNA-sequencing followed by real-time PCR analysis) and the causal inference test (CIT), a statistical method to infer causal relationships between diplotypes, gene expression and trait values. Body weight, body composition and biochemical traits were measured in F2 mice obtained from an intercross between the C57BL/6JJcl strain and a congenic strain carrying Pbwg1.5 on the C57BL/6JJcl background. The F2 mice showed significant diplotype differences in 12 traits including body weight, WAT weight and serum cholesterol/triglyceride levels. The transcriptome analysis revealed that Ly75, Pla2r1, Fap and Gca genes were differentially expressed in the liver and that Fap, Ifih1 and Grb14 were differentially expressed in WAT. However, CITs indicated statistical evidence that only the liver Ly75 gene mediated between genotype and WAT. Ly75 expression was negatively associated with WAT weight. The results suggested that Ly75 is a putative quantitative trait gene for the obesity-resistant Pbwg1.5 QTL discovered from the wild M. m. castaneus mouse. The finding provides a novel insight into a better understanding of the genetic basis for prevention of obesity.


Introduction
Genome-wide quantitative trait locus (QTL) mapping is performed on the basis of either association analysis in outbred populations (called a genome-wide association study (GWAS)) or linkage analysis in three-generation pedigrees or designed crosses (a genome-wide QTL analysis). QTL mapping is an unbiased phenotype-driven, discovery-based approach without any a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 males was selected from a litter. In the set, three diplotypes for the subcongenic region were segregating. That is, two of the diplotypes were homozygous for either haplotype derived from the wild mouse or from the B6 mouse, and the other was heterozygous for both haplotypes. Finally, five sets of such F 2 segregating males were selected from four litters. All of the mice were weaned at 3 weeks of age. The mice were given standard chow (CA-1, Clea Japan), containing 5% crude fat and 3.5 kcal/g energy, and tap water ad libitum. Detailed husbandry conditions of the mice were described previously [16].

Growth and Body Composition Traits
All growth and body composition traits for the F 2 male mice used in this study were previously determined [16]. Briefly, body weights at 1, 3, 6, 10 and 14 weeks of age were recorded. Body weight gains at 1-3 weeks, 3-6 weeks, 6-10 weeks and 10-14 weeks of age were calculated. After overnight fasting, mice were killed on the day after 14 weeks under anesthesia. Total body length (from the tip of the nose to the end of the tail) and tail length (from the anus to the end of the tail) were measured, and head-body length was calculated by subtracting tail length from total body length. Relative head-body and tail lengths were obtained as a percentage of total body length. Blood was obtained by cardiopuncture. The lungs, spleen, liver, kidneys, testes and two sides of inguinal and epididymal WAT depots were weighed. The sum of these two parts of WAT weights was considered as total WAT weight in this study. In mice, the weight of WAT depots has been long and widely used as an indicator of fatness because of a high correlation of the WAT depot weight with total body fat weight [19]. The relative weight of each organ and WAT was obtained as a percentage of body weight at 14 weeks. Parts of the liver and epididymal WAT were stored in RNAlater (Life Technologies, Tokyo) for a few days and subsequently stored at -80˚C until total RNA extraction.

Food Intake
Three male mice of the SR1 strain and four males of the B6 strain at 9-10 weeks of age were housed individually in cages in which the mice had free access to tap water and the powder diet of CA-1. Food intake and body weight were measured every two or three days for 7 days. Relative introgressed genomic interval of the B6.Cg-Pbwg1/1Nga subcongenic strain (abbreviation SR1). B6.Cg-Pbwg1/ 1Nga was developed from the original B6.Cg-Pbwg1 congenic strain carrying the Pbwg1 growth QTL on mouse chromosome 2. The black bar shows the minimum introgressed interval derived from the wild Mus musculus castaneus mouse. The genomic region of the black bar in B6.Cg-Pbwg1 was previously exome-sequenced [16]. The black-bar region of SR1 was RNA-sequenced in this study. The gray bar shows the interval derived from the C57BL/6JJcl (B6) background strain. The hatched bar indicates the interval where recombination occurred. The map positions (Mb) of microsatellite markers (D2Mit#) and PCR-RFLP markers (rs#) previously developed [16] are shown on the horizontal line. The horizontal double-headed arrows indicate the maximum intervals of body weight and body composition QTLs (Pbwg1.#) previously defined [16]. The effect of the QTL allele derived from the wild castaneus mouse is shown together with the QTL name. The filled triangle indicates the map position of Ly75, a strong candidate gene for Pbwg1.5. WAT, white adipose tissue.

Serum Lipoprotein Analysis
The blood obtained by cardiopuncture was stored at room temperature for approximately 1 hour and overnight at 4˚C. Serum was separated by centrifugation at 3,500 rpm for 15 min and stored at -80˚C until analysis. Serum lipoproteins were analyzed using gel permeation high-performance liquid chromatography (GP-HPLC) according to the LipoSEARCH 1 system [20,21] established by Skylight Biotech Inc. (Akita, Japan). Four lipoprotein subclasses, chylomicron (CM), very-low-density lipoprotein (VLDL), low-density lipoprotein (LDL) and high-density lipoprotein (HDL), were calculated for cholesterol (C) and triglycerides (TG). The percentage of each level of C and TG subclasses in total C and total TG levels, respectively, was calculated.

RNA-sequencing (RNA-seq) Analysis
Total RNA was extracted from the livers of a set of F 2 segregating males (one individual per diplotype) and from epididymal WATs of three sets of F 2 segregating males (three individuals per diplotype) using Trizol reagent (Life Technologies, Tokyo) according to the manufacturer's instructions. The concentration and quality of the total RNA were examined by Nanodrop 1000 (Thermo Fisher Scientific, Yokohama, Japan) and Agilent 2100 Bioanalyzer (Agilent Technologies, Tokyo), respectively. For epididymal WAT, the total RNAs of three mice with the same diplotype were pooled because of their low total RNA concentrations. RNA-sequencing (RNA-seq) analysis and subsequent sequence data analyses were outsourced to Hokkaido System Science Co., Ltd (Sapporo, Japan). Briefly, sequence libraries were constructed from the total RNA using a TruSeq RNA sample prep kit (Illumina K.K., Tokyo) according to the manufacturer's instructions. RNA-seq analysis was performed with the next-generation sequencer Illumina Hiseq2000 using a single read of a 101-bp length. The raw sequence data were aligned to the UCSC Mouse Genome Browser GRCm38/mm10 assembly (RefSeq mm10) using TopHat software version 2.0.9 [22]. Values for fragments per kilobase of exon per million mapped fragments (FPKM) were calculated by Cufflinks software version 2.1.1 [22] using the RefSeq mm10 transcriptome. Genes with both FPKM > 0.1 [23] and log 2 (fold change) ! 0.58 (or -0.58) were considered to be differentially expressed in this study. The RNA-seq data were deposited in DDBJ Sequence Read Archive under the accession number of DRA003057.

Quantitative Real-Time PCR Analysis
Total RNA was extracted from the livers and epididymal WATs of five sets of F 2 segregating males using Trizol, as described above. Using a PrimeScript 1 RT reagent Kit with gDNA Eraser (Takara Bio, Otsu, Japan), genomic DNA contamination was removed by the gDNA Eraser enzyme and then cDNA was synthesized from 1 μg of total RNA according to the manufacturer's instructions. Quantitative PCR was performed in a 10.0-μl reaction volume on the StepOnePlus™ Real-Time PCR system (Life Technologies, Tokyo) with SYBR 1 Premix Ex Taq™ II (Tli RNaseH Plus) (Takara Bio, Otsu, Japan). Primer sequences for eight candidate genes, Ly75, Pla2r1 (phospholipase A2 receptor 1), Tank (TRAF family member-associated Nf-kappa B activator), Fap (fibroblast activation protein), Ifih1 (interferon induced with helicase C domain 1), Gca (grancalcin), Grb14 and Cobll1 (Cobl-like 1), and two endogenous control genes, Actb (actin, beta) and B2m (beta-2 microglobulin), were retrieved from the PrimerBank database [24] or designed with Primer Express 1 version 3.0 (Life Technologies, Tokyo). They were custom-synthesized and are listed in S1 Table. Cycle conditions of quantitative PCR were 95˚C for 30 sec and 40 cycles of 95˚C for 5 sec and 60˚C for 30 sec. All samples were analyzed in triplicate. In the liver, quantitative relative standard curves for the candidate and endogenous genes, with four serial dilution points of the B6 control cDNA (20 ng, 4 ng, 0.8 ng and 0.16 ng), were used to determine mRNA levels. Expression levels of the candidate genes were normalized to a composite level of Actb and B2m control genes. Dissociation curves, PCR amplification efficiencies and R 2 values were examined to determine the precision of qPCR. In epididymal WAT, the expression levels of the candidate genes were normalized to that of Actb and measured using the 2 −ΔΔCT method because of the low concentrations of fat RNA.

Causal Inference Test (CIT)
The relationships between diplotype (D), mRNA expression (R) and quantitative trait (T) were assessed by causal inference tests (CITs) [18] to determine whether R was the mediator between D, considered as a cause, and T, considered as the phenotypic outcome. Three possible relationship models among these three factors, i.e. causal, reactive and independent relationships, were obtained by the results of CITs (Fig 2). The CIT is composed of four  conditional tests that are carried out using the following four linear regression models.
Here β 0 is the overall mean, β 1 is the effect of the dependent variable of D or R, β 2 is the effect of the conditioning variable of T, D or R, and ε is the random error. For the four CITs, Test 1 determines whether D is associated with T using Model 1, Test 2 determines whether D is associated with R after conditioning on T using Model 2, Test 3 determines whether R is associated with T after conditioning on D using Model 3, and Test 4 determines whether D is independent of T after conditioning on R using Model 4. To declare a causal relationship, all of the conditional tests must be satisfied.

Statistical Analysis
The effect of litter on phenotypic values for body weight, body compositions, lipoproteins and gene expression levels obtained in F 2 males was tested using a linear model of the statistical discovery software JMP version 11.1.1 (SAS Institute, Cary, NC). The litter effect was significant for most of the traits at the nominal 5% level (see S1 Fig for example). Thus, residuals after removing the litter effect were used to perform CITs and to compare phenotypic differences among diplotypes by one-way analysis of variance (ANOVA) followed by Tukey's honestly significant difference (HSD) post hoc test. Food intake was compared between SR1 and B6 strains by Student's t-test at the nominal 5% level.

Trait Analyses
A total of 48 quantitative traits for body weight, growth, body length, organ weight and serum lipoprotein levels were measured in F 2 male mice with three diplotypes (B/B, B/C and C/C) for the subcongenic region (Fig 1), and all trait values are listed in Table 1. Among the 48 traits, 12 were significantly different among mice with the three diplotypes. Body weights at 10 and 14 weeks of age and body weight gain at 6-10 weeks of age were significantly higher in C/C mice than in B/B mice, despite the fact that the wild mice had only 60% of the body weight of B6 [9]. Percent CM-TG in C/C mice was also significantly higher than in B/B mice. In contrast, absolute and relative weights of epididymal and/or inguinal WATs and testes were significantly lower in C/C mice than in B/B mice. The LDL-C level was also significantly lower in C/C mice than in B/B mice.
Food intake was examined for 7 days in males of SR1 subcongenic and B6 strains. Average food intake (± standard error) per day per body weight was 0.156 ± 0.002 g in the SR1 strain and 0.150 ± 0.002 g in the B6 strain. There was no significant strain difference in average food intake.

RNA-seq Analysis
To screen for genes differentially expressed in the liver and epididymal WAT, RNA-seq was performed on F 2 male mice that were segregating for the SR1 subcongenic region (Fig 1). Total reads (input reads after trimming) ranging from 52,629,776 bp to 83,051,167 bp were obtained. Approximately 95% of the reads were correctly mapped to RefSeq mm10 in each diplotype by organ (S2 Table). S3 Table shows expression levels of all genes located in the SR1 subcongenic region between the two microsatellite markers D2Mit433 and D2Mit93 on mouse chromosome 2 (see Fig 1). A 5.8-Mb target region harboring Pbwg1.12 and Pbwg1.5 QTLs was physically defined in a previous study [16]. In the target region, only five genes (Ly75, Pla2r1, Tank, Fap and Gca) were differentially expressed in the liver, and only four genes (Fap, Ifih1, Grb14 and Cobll1) were differentially expressed in epididymal WAT ( Table 2). The expression of Ly75, Tank and Fap was elevated in the liver of mice with the C/C diplotype. Fap expression was also increased in The analysis was performed using the liver and epididymal WAT of F 2 mice with three diplotypes (B/B, B/C and C/C). The target region was physically defined by a previous study [16] (see Fig 1). WAT. In contrast, Gca expression in the liver and Ifih1 and Grb14 expression in WAT were decreased in the C/C mice. Both Pla2r1 expression in the liver and Cobll1 expression in WAT were different in direction of log 2 FC between C/C and B/C diplotypes for unknown reasons.

Quantitative Real-Time PCR Analysis
Although the differential expression of Tank in the liver and Cobll1 in WAT detected by RNAseq was not validated by quantitative real-time PCR analysis, that of Ly75, Pla2r1, Fap, Gca, Ifih1 and Grb14 in the liver and/or WAT was validated ( Table 3). The C/C mice had significantly higher expression levels of Ly75 and Fap than did the B/B mice. In contrast, the expression levels of Gca, Ifih1 and Grb14 in the B/B mice were significantly higher than those in the C/C mice. Uniquely, Pla2r1 expression level was lower in the B/C mice than in the B/B and C/C mice. Furthermore, both the amount and direction of gene expression were very similar in the results of RNA-seq and results of real-time PCR (Tables 2 and 3).

CIT Analysis
To identify genes mediating between diplotype and trait, the four component tests of the CIT were carried out as shown in Fig 2. As described earlier, among 48 quantitative traits measured, 12 were significantly associated with diplotype ( Table 1), meaning that the 12 traits passed Test 1 of CITs. For Ly75, Pla2r1, Fap and Gca genes differentially expressed in the liver, the subsequent three CITs were carried out and the results are summarized in Table 4. Among the four genes, Ly75, Fap and Gca passed Test 2. That is, diplotype was significantly associated with mRNA levels of these three genes after conditioning on each of the 12 traits. However, diplotype was significantly associated with Pla2r1 expression for only seven traits shown in Table 4. For Test 3 that conditioned on diplotype, Ly75 expression was significantly associated with percent WAT weight of inguinal and total depots. It was also marginally associated with epididymal WAT weight and percent epididymal WAT weight. Fap expression was significantly associated  with percent testes weight. Gca expression was significantly associated with LDL-C. However, Pla2r1 expression was not significantly associated with any traits. For Test 4, when conditioning on Ly75 expression, diplotype was independent of four WAT traits, i.e., epididymal, percent inguinal, percent epididymal and percent total depot weights. When conditioning on Pla2r1, Fap or Gca expression, diplotype was not independent of the corresponding traits. Taken together, only the Ly75 gene showed a causal relationship with WAT traits. As a typical example of the causal relationship, the results of all four tests for percent total WAT weight are shown in Fig 3. In addition, both Ly75 and Gca genes showed independent relationships with testes weight traits. The Pla21 gene also showed independent relationships with body weight gain at 6-10 weeks, total body length, WAT and lipoprotein traits, though their P values for Tests 2 and 4 were not so low ( Table 4). None of the genes differentially expressed in the liver had causal relationships with body weight and growth traits.
Although Fap, Ifih1 and Grb14 gene expression in epididymal WAT passed Test 2, only Ifih1 passed the subsequent two tests (Table 5). Ifih1 finally showed a reactive relationship with body weight gain at 6-10 weeks and an independent relationship with percent CM-TG. However, none of the genes differentially expressed in epididymal WAT showed causal relationships with body weight, growth, WAT and lipoptotein traits.

Discussion
The integration approach of gene expression and the CIT clearly demonstrated that the Ly75 gene is the mediator between diplotype and WAT weight, indicating that Ly75 is a strong candidate gene for the Pbwg1.5 QTL discovered from an untapped natural resource of wild M. m. castaneus mice in the Philippines. In contrast, the integration approach failed to find candidate genes for the Pbwg1.12 QTL for body weight because none of the genes differentially expressed in the liver and epididymal WAT was causally associated with body weight and growth traits.
Ly75 encodes DEC-205 (dendritic and epithelial cells, 205 kDa), which is also known as CD205 (cluster of differentiation 205). DEC-205 is an integral membrane protein homologous to the macrophage mannose receptor, and it is a novel endocytic receptor used by dendritic cells and thymic epithelial cells to direct captured antigens from the extracellular space to a specialized antigen-processing compartment [25]. Knockout mice for Ly75 have abnormalities in CD8-positive T cell morphology and cytotoxic T cell physiology [26], though the direct effect of Ly75 on WAT and related traits has not yet been investigated. Hence, it is not precisely known how Ly75 expression in the liver regulates WAT weight.
A previous microarray study [27] revealed 259 genes differentially expressed in the liver between SM/J and LG/J strains of mice fed a high-fat diet. The SM/J strain is more responsive Candidate Gene for Obesity-Resistant QT  than the LG/J strain for many obesity and diabetes traits. Most of the differentially expressed genes are associated with immune function, and 62 genes are located within intervals of QTLs previously mapped for obesity, diabetes and related traits [27]. Since high-fat diets are known to trigger an immune response through inflammation in many organs and tissues such as the liver and adipose tissue [28,29], genes associated with immune function can become candidate genes for obesity and diabetes QTLs. In fact, in humans, the SNP associated with LY75 gene expression has been reported to be relevant to type 2 diabetes mellitus [30]. In chickens, LY75 has been defined as a candidate gene for an adiposity QTL [31]. These facts thus reinforce the mouse Ly75 gene as a putative QTG for the Pbwg1.5 QTL with a preventive effect on obesity when mice are fed both low-fat standard and high-fat diets [17]. Previous exome sequencing analysis of the wild-derived congenic region on mouse chromosome 2 revealed nine nsSNPs, leading to amino-acid changes, in Ly75 exons, but none of the nsSNPs was predicted to be harmful to protein function [16]. We thus consider that some kinds of DNA changes in the promoter, enhancer or intronic region of Ly75 or epigenetic changes cause the expression difference in the liver Ly75 gene between wild and B6 mice. As shown in S4 Table, many genes associated with immune functions were differentially expressed in the liver of the F 2 mice used in this study. The differentially expressed genes were located on all chromosomes excluding the Pbwg1.5 QTL region of chromosome 2 and Y chromosome. Some of the genes associated with immune functions might be downstream genes for which expression is affected by Ly75 expression.
A quantitative complementation test, or a QTL-knockout interaction test, is often used to determine whether a candidate gene is a true QTG in rodents [32][33][34]. In the test, two experimental strains with different QTL alleles on different genetic backgrounds are each crossed to two tester strains, a knockout strain and its background strain with a wild-type allele. Trait values for F 1 hybrids produced from the four crosses are measured. An interaction effect between the knockout allele and the QTL allele is examined on the trait. If no interaction effect is obtained, then it is interpreted as evidence that the candidate gene is not a QTG. If the interaction effect is statistically significant, it is concluded that the candidate gene is a QTG, but the possibility that the interaction effect is caused by epistatic interaction between the candidate gene and other QTLs on different chromosomal regions cannot be entirely ruled out because the genetic backgrounds of the two experimental strains are different. To overcome this problem, a congenic strain carrying the QTL and its background strain with a wild-type QTL allele are used as experimental strains to perform the quantitative complementation test in a common genetic background [32,35]. Quantitative complementation tests using SM/J and LG/J mice revealed that the Capn10 (calpain-10) gene is a QTG for Adip1, an obesity QTL identified in a set of recombinant inbred strains between SM/J and LG/J, because highly significant interaction effects between Capn10 knockout and Adip1 genotypes were obtained for body weight, weights of fat pads, liver and heart, and serum leptin level [34]. Quantitative complementation tests in a common mouse genetic background suggested that the Pappa2 (pregnancy-associated plasma protein A2) gene is a QTG for a QTL with a small general effect on body size (tail length, bone length and body weight) found in a cross between DBA/2J and C57BL/6J mice; however, an interaction between Pappa2 and QTL genotypes was only significant for tail length and body weight at 3 weeks of age, whereas it was not significant for lengths of the skull and long bones and body weight at 6 weeks and 10 weeks [36]. Hence, even if a gene that has been knockouted is a QTG for a QTL with small phenotypic effects, the results of quantitative complementation tests would not always reach statistical significance. Another approach is to use a simple transgenic overexpression of a candidate gene, which has recently been proved efficient for positional cloning of a tail suspension QTL [5] and an adiposity QTL [7] in mice. We are now planning to perform a quantitative complementation test and/or a simple overexpression experiment using the Ly75 cDNA of M. m. castaneus in order to finally confirm the causality between Ly75 and Pbwg1.5.

Conclusions
The results of this study provided the first statistical evidence that Ly75 expression mediated between diplotype and WAT in mice, suggesting that Ly75 is a putative QTG for the obesityresistant Pbwg1.5 QTL discovered from the wild M. m. castaneus mouse. This finding provides a novel insight into a better understanding of the genetic basis for prevention of obesity.  Table. Expression of all genes in the SR1 subcongenic region detected by RNA-seq analysis using the liver and epididymal WAT of F 2 mice with three diplotypes (B/B, B/C and C/C). Genes with an expression level with FPKM > 0.1 [23] are focused on. a UP in parenthesis denotes an up-regulated gene with log 2 FC ! 1.0, DOWN indicates a down-regulated gene with log 2 FC -1.0, and DIS indicates a gene discrepant in direction of differentially expressed levels between the liver and WAT. b Based on the mouse RefSeq mm10. c R denotes regions where recombination occurred, and T denotes a target region physically defined by a previous study [16]. B, haplotype derived from C57BL/6JJcl; C, haplotype derived from the wild M. m. castaneus mouse; FPKM, fragments per kilobase of exon per million mapped fragments; FC, fold change; NA, not applicable. (XLSX) S4 Table. Differentially expressed genes on all chromosomes detected by RNA-seq analysis using the livers of F 2 mice with three diplotypes (B/B, B/C and C/C). The differentially expressed genes with FPKM > 0.1 [23] are listed at the threshold level of log 2 FC ! 0.58 (! 1.5 fold) or -0.58 ( 0.67 fold). The positions of the genes are based on the mouse RefSeq mm10. The genes associated with immune functions (see MGD [2]) are shown in bold font. a Q denotes the genomic region harboring two QTLs for body weight (Pbwg1.12) and resistance to obesity (Pbwg1.5). B, haplotype derived from C57BL/6JJcl; C, haplotype derived from the wild M. m. castaneus mouse; FPKM, fragments per kilobase of exon per million mapped fragments; FC, fold change.