Genome-wide analysis of methylation in giant pandas with cataract by methylation-dependent restriction-site associated DNA sequencing (MethylRAD)

The giant panda (Ailuropoda melanoleuca) is a native species to China. They are rare and endangered and are regarded as the ‘national treasure’ and ‘living fossil’ in China. For the time being, there are only about 2500 giant pandas in the world. Therefore, we still have to do much more efforts to protect the giant pandas. In captive wildlife, the cataract incidence of mammalian always increases with age. Currently, in China, the proportion of elderly giant pandas who suffering from cataract has reached 20%. The eye disorder thus has a strong influence on the physical health and life quality of the elderly giant pandas. To discover the genes associated with the pathogenesis of cataract in the elderly giant panda and achieve the goal of early assessment and diagnosis of cataract in giant pandas during aging, we performed whole genome methylation sequencing in 3 giant pandas with cataract and 3 healthy giant pandas using methylation-dependent restriction-site associated DNA sequencing (MethylRAD). In the present study, we obtained 3.62M reads, on average, for each sample, and identified 116 and 242 differentially methylated genes (DMGs) between the two groups under the context of CCGG and CCWGG on genome, respectively. Further KEGG and GO enrichment analyses determined a total of 110 DMGs that are involved in the biological functions associated with pathogenesis of cataract. Among them, 6 DMGs including EEA1, GARS, SLITRK4, GSTM3, CASP3, and EGLN3 have been linked with cataract in old age.


Introduction
Nowadays, a growing number of wild animals have been successfully placed in Zoo. Although the captive animals in Zoo live longer than those in the wild, the aged captive animals (e.g., Malayan Tapir) always encounter various age-related diseases including cataract. Cataract, characterized by the opacification of eye lens, is the most common cause for the blindness of almost all mammals, such as dogs, rhesus monkeys, and humans [1][2][3]. In addition, increasing age is considered to be the most important risk factor for cataract and a considerable number of cataract are classified as age-related cataract [4]. The loss of vision caused by age-related cataract has great influences on the health status of aged animals. As showed by one previous investigation on the captive rhesus monkeys, cataract attacked 20% of the rhesus monkeys at age of 20-22 years and the rate was still increasing after 26 years of age [1]. In addition, the giant panda (Ailuropoda melanoleuca), a world's most protected rare animals, is also attacked by cataract with age. The studies have shown that the average life span of wild giant pandas is about 15-20 years old, while those in captivity usually live longer and can reach to the age of about 25-30 years [5][6]. Generally, the lifespan of human is 4-4.5-fold longer than the giant panda. The giant panda at the age of 20 years approximately equals human at age of 80-90 years and those aged after 18 years are always served to be aged giant pandas. According to the national survey of eye diseases in aged giant pandas, conducted by Beijing Zoo in 2013, approximately 20% of the aged giant pandas suffered from cataract. Since the giant panda is still an endangered species, the protection of aged giant pandas from cataract has great significances. Hitherto, a growing number of evidence has shown that genetic factors have large influences on the severity of cataract and play important roles in the development of cataract [7]. For instance, oxidative stress and DNA damage are two common contributors to the many changes in development of age-related cataract [8][9][10]. Abundant evidence has revealed that genes related to these activities (i.e., oxidative stress and DNA damage) play an important role in the pathogenesis of age-related cataract, such as SOD1, PRDX6, and CRYBA4 [11][12][13]. Epigenetic modifications (e.g., DNA methylation, histone modifications, and non-coding RNA) refer to the alteration of gene activity without any changes in genomic sequence [14][15]. Currently, alteration in epigenetic patterns, in especial DNA methylation, has been closely linked with the cataractogenesis [16]. For example, a reduction of OGG1 and CRYAA expression caused by hypermethylation was observed in lens of eyes with age-related cataract [17][18]. All the existing evidence indicates that the abnormal DNA methylation changes have great contributions to the development of age-related cataract in giant pandas.
In this present study, we performed genome-wide DNA methylation analysis on 3 giant pandas with cataract and 3 healthy giant pandas by methylation-dependent restriction-site associated DNA sequencing (MethylRAD) [19]. Comparison of methylation patterns between the two groups led to the identification of a number of the differentially methylated genes (DMGs) according to the methylation level of CCGG/CCWGG sites. Further analyses showed that the DMGs are preferential located on KEGG pathways and GO terms that have close associations with cataract development. Among these DMGs, some genes (e.g., CASP3, HMGB1, EEA1, and GARS) indeed have been proved to be functioning in the pathogenesis of agerelated cataract. Taken together, our study illustrates the epigenetic basis of cataract development in giant panda and identifies potential targets for drug intervention in the therapy of age-related cataract. This research work will facilitate the development of precision medical measures for cataract specific to giant pandas.

Sampling and MethylRAD sequencing
The peripheral blood samples were collected from 6 female giant pandas, consisting of 3 cases with cataract and 3 healthy controls. A total of 2 ml blood was draw for each sample during the daily physical examination (without anesthetic). The genomic DNA of blood samples was extracted using phenol-chloroform method (EMD Millipore-516726, Sigma-Aldrich). Blood samples were initially stored at -80˚C. Construction of the MethylRAD library has been described by Wang et al. [19]. 3 μg genomic DNA for each sample was mixed with FspEI (5U/ μl) by a volume ratio of 1:0.8. Then, 30 × Enzyme activator was added to the mixture for digestion reaction, with a volume ratio on 0.5:1. After ligation of adaptor, the product was enriched and purified, and then amplified with PCR reactions. PCR product was further purified using QIAquick PCR Purification Kit. Finally, the short DNA fragments in each library were sequenced on Illumina HiSeq platform by the mode of single-end, 50-bp (Illumina Inc., USA).

Quality control and reads alignment
To get the clean reads with high quality, we filtered out the poor-quality reads using the threshold of over 15% of bases in a read with quality value of less than 30. In addition, we also removed those reads with a percentage of N greater than 8% [18]. Then, the reads with enzyme sites (enzyme reads) were extracted for subsequent analyses.

Quantitation and compare of methylation level between two groups
Since the consistency of amplification efficiency for the sequences with equal length, the methylation level of the sites (CCGG/CCWGG) can be quantified by the sequencing depth of the methylation tag. For the MethylRAD-sequencing, the methylation level of each site (CCGG/ CCWGG) was represented by RPM (reads per million) as the following formula [21]:

RPM ¼
site coverage reads number � 1; 000; 000 library high quality reads number ð1Þ In addition, the methylation level of one certain genic region including upstream/downstream 2000 bp of TSS (transcription start site), gene body, and upstream/downstream 2000 bp of TTS (transcription termination site) was calculated by the sum value of all the methylated sites that are located in the corresponding the region. The methylation data were then analyzed to identify the differentially methylated sites/genes (DMS/Gs) between the case and control groups using the edgeR Bioconductor package that is relied on the number of coverage reads on the sites or genes [22]. Here, the sites/genes with at least 3 reads coverage across the samples in at least one group were retained and those meeting the threshold of p value � 0.05 and | log2FC| > 1 were determined as DMS/Gs, with hypermethylated and hypomethylated.

Gene annotation and enrichment analysis
The gene annotation information of giant panda was also downloaded from the National Center for Biotechnology Information (NCBI) with the website: ftp://ftp.ncbi.nlm.nih.gov/ genomes/all/GCF/000/004/335/GCF_000004335.2_AilMel_1.0/GCF_000004335.2_AilMel_1. 0_genomic.gff.gz. The UTR regions of the genes were calculated by using SnpEff tool based on the annotation information [23]. The distributions of the methylated sites on various genomic sequences were calculated by BEDTools [24]. To perform gene set enrichment analysis, we obtained the available gene information of pathways and biological functions from databases of Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene Ontology (GO) and The Comparative Toxicogenomics Database (CDT) [25][26][27]. We utilized the hypergeometric test to calculate the statistical significance of genes enriched on each biological function.

Source of the study samples
In 2013, we looked into 55 old giant pandas in Chinese Zoo, among which 11 (8 females and 3 males) were suffering from cataract. Most of the sufferers were female and over 20 years old. Here, we obtained the genomic DNA samples of peripheral blood cells from 3 female giant pandas with cataract, 2 healthy female giant pandas and 1 male giant panda to perform subsequent genome-wide methylation study. As presented in Table 1, we numbered the 6 giant panda samples as YY-Y, BD-Y, YY-XK, LL-D, JN-D, and XX-XK, respectively. Among these samples, the YY-Y and BD-Y, were healthy ones with an age of approximately 20 years old, YY-XK was healthy with an age of 29 years old. LL-D, JN-D, and XX-XK were ill ones with an age of 36, 32, and 25 years old, respectively. LL-D was died in 2018.

MethylRAD of the samples
The methylomes of the peripheral blood cells from the 6 giant pandas were generated on HiSeq platform using MethylRAD sequencing (see Materials and Methods) [19]. Here, we obtained 3.62 ± 0.17 million raw sequencing reads, on average, for each sample. After removing the reads with low quality and the reads without enzyme sites, we retained about 1.72 million clean enzyme (FspEi) reads that covering CCGG/CCWGG sites in each sample for the subsequent analyses. We mapped the enzyme reads to the reference genome (AilMel 1.0) of giant panda by using SOAP software version 2.21 [20]. On average, about 1.31 million clean reads were mapped to the genome for each sample, the mapping ratio is about 76.66% (Table 2). In this study, we determined the reliable methylated sites by a cutoff of read-coverage no less than 3, and obtained 1 million CCGG sites and 0.32 million CCWGG sites, on average, for each sample. The average coverage depth of CCGG and CCWGG sites is 10.3 and 9.01, respectively (Table 3). Therefore, the sequencing reads satisfy the condition of following analyses.

Signatures of DNA methylation in the giant pandas
Then, we analyzed the distribution of the methylated CCGG/CCWGG sites on distinct genomic sequences, including Utr3prime, Utr5prime, Upstream, Exon, Intron, and intergenic regions (Fig 1). Among them, the sites were most located in intergenic and intron regions, the next in exon and upstream, and the least in Utr3prime and Utr5prime regions. In addition, results revealed that the distribution of methylated CCGG and CCWGG sites on various genomic regions was very similar in the 3 healthy giant pandas, while a marked difference was observed in the 3 giant panda that suffering from cataract. For instance, the number of methylated CCGG and CCWGG sites located in the various genomic regions was fewest in LL-D, moderate in XX-XK, and largest in JN-D. In addition, we found that the methylated CCGG and CCWGG sites in LL-D and XX-XK were smaller than the healthy giant pandas. However, for the methylated sites in JN-D, the number of CCGG sites was larger than healthy giant pandas, while the number of CCWGG sites was similar with healthy giant pandas.
In addition, we analyzed the overall methylation pattern of different positions on the genic regions. As presented in Fig 2 and S1 Table, the methylation level based on CCGG and CCWGG sites was gradually rising from the initial position of genic region, occurred a turning point of rising tread, then continued to rise and reach the highest value at the end position. Moreover, no obvious differences were observed in the 6 giant pandas.

Genes differentially methylated in the case and control groups
In the present study, we calculated the methylation level of genes based on the CCGG and CCWGG sites, respectively, and identified the DMGs between the case and control groups using a threshold of P value � 0.05 and absolute log2FC value > 1 (see Materials and Methods). Here, we identified a total of 116 DMGs by the CCGG sites, including 75 hypermethylated genes and 41 hypomethylated genes. Moreover, we determined 242 DMGs by the CCWGG sites, including 164 hypermethylated genes and 78 hypomethylated genes. The heatmap plot showed a different methylation pattern between the two groups, with genes highly methylated in patients presenting low methylation level in the healthy samples and vice versa. For both the CCWGG and CCGG sites, there were a large number of genes with hypermethylation in healthy group (Fig 3). Among them, there were 20 DMGs that were identified by both the CCGG and CCWGG sites, containing 12 hypermethylated genes and 8 hypomethylated genes (Fig 4 and S2 Table).

Pathway-level functions of the DMGs in the development of cataract
To determine the signaling pathways potentially associated with age-related cataract, we first performed KEGG enrichment analysis and searched the available annotation information from CTD database [25,27]. By using a threshold of P value < 0.05, we identified 15 and 55 enriched KEGG signaling pathways for the CCGG-and CCWGG-based DMGs, respectively. Among them, the CCGG-based signaling pathways contained 27 DMGs, including 22 hypermethylated genes and 5 hypomethylated genes; while the CCWGG-based signaling pathways contained 96 DMGs, including 80 hypermethylated genes and 16 hypomethylated genes. In addition, there were 3 enriched pathways according to both CCGG-and CCWGG-based DMGs. A total of 10 DMGs were located in these 3 signaling pathways, including 8 hypermethylated genes and 2 hypomethylated genes. We then summarized the findings of KEGG signaling pathways to assess their associations with cataract pathogenesis. The pathways associated with genetic information processing included base excision repair (cataract-related genes: HMGB1, hypomethylated in aged giant pandas with cataract), SNARE interactions in vesicular transport (STX19, hypermethylated), and RNA degradation (MPHOSPH6 and TTC37, both were hypermethylated). The pathways related with environmental information processing contained NF-kappa B signaling pathway (CCL19, hypomethylated), cAMP signaling pathway, HIF-1 signaling pathway (LOC100484901, PDK1, and EGLN3, all were hypermethylated). On the level of cellular process, there were 5 pathways, 3 of which had some associations with cataract, including cell cycle (ORC6 and CDC7, both were hypermethylated), apoptosis (CASP3, hypermethylated), p53 signaling pathway (TP53I3 and CASP3, both were hypermethylated). On the level of metabolism, there were 17 pathways, 7 of which were related with cataract, including drug metabolism-cytochrome P450 (FMO5 and GSTM3, both were hypermethylated), glycerolipid metabolism (LPL, hypermethylated), beta-Alanine metabolism (LOC100474209, hypermethylated), tyrosine metabolism (LOC100474209, hypermethylated),  Table 4 and S3 Table).

GO-level functions of the DMGs in cataract
Similarly, we also conducted GO enrichment analysis for the DMGs and then found the cataract-related GO term by CTD database [26][27].       Table 5 and S4 Table).

Roles of the DMGs in cataract
In this study, we identified a total of 338 DMGs between the groups. Among the DMGs, 116 have been previously supposed to be potentially linked with the development of cataract (S5 Table). Base on the results of enrichment analysis, we selected 16 DMGs associated with the cataract-related KEGG pathways and 108 DMGs involved in the cataract-related GO terms. By combing the results of KEGG and GO enrichment analyses, we obtained a total of 110 DMGs (Table 6), among which 6 have been linked with the cataract in old age in previous reports that were EEA1, GARS, SLITRK4, GSTM3, CASP3, and EGLN3.

Discussion
In the present study, we analyzed the methylation profile differences between the aged giant pandas suffering from cataract and healthy giant pandas and identified hundreds of DMGs in giant pandas with age-related cataract. Notably, we found no methylation differences on the genes (i.e., GLB1, CDKN2A and CDKN2B) highly correlated with aging [28][29], implying negligible effects of age on the DMGs. Further analysis showed that the genes with significant methylation differences between the case and control groups are indeed located on many biological processes related with cataract formation, such as base excision repair, p53 signaling pathway, and apoptosis [18,30,31]. Among them, p53 signaling pathway plays an important roles in the prevention of apoptosis of lens epithelial cells and cataractogenesis [31], the hypermethylation of genes (i.e., TP53I3 and CASP3) on this pathway would like to downregulate the     functions of p53-mediate signaling pathway and promote the development of cataract. In addition, other direct evidence comes from the certain genes that have been previously reported to be associated with cataract pathogenesis. For example, Glutathione S-Transferase Mu 3 (GSTM3, hypermethylated in giant pandas with cataract) was considered to prevent the agerelated cataract by protecting the lens from oxidative stress and a decreased expression level of GSTM3 was observed in the lens tissue of patients with age-related cataract, which correlated with the hypermethylation of GSTM3 promoters [32]. Since DNA methylation is reversible and can be influenced by the external factors [33], the research on the appropriate epigenetic drugs based on the specific cataract-associated genes would be wildly used in prevention of age-related cataract development in giant pandas. In this study, we shed light on the methylation characteristics of giant panda suffering from cataract and provide a number of candidate epigenetic therapeutic targets for the prevention and treatment of cataract in the aged giant panda. Nevertheless, the small sample size and the lack of functional experiments limit the practical utility of the findings in this study. Therefore, further efforts are needed to address the issues as follows: 1) the validation of DMGs in a large giant panda population; 2) the influences of certain aberrant DNA methylation events on gene activity; 3) the key genes with major contributions to the cataract development in age giant pandas; 4) the molecular mechanisms of key genes in the pathogenesis of age-related cataract. In addition, we also observed that some giant pandas with cataract can be self-healing after their living environments were changed. The contribution of reversible epigenetic modifications (e.g., DNA methylation) caused by environmental stimulus to this phenomenon will be explored in our future studies.

Conclusion
In short, we determined a number of DMGs that had potential roles in regulating the activity of cataract-related pathways, such as base excision repair, apoptosis, and p53 signaling pathway. Moreover, these findings were further supported by detailed genes with abnormal methylation pattern in giant pandas with cataract. For example, the CASP3 gene encodes a cysteineaspartic acid protease that served to as an apoptosis executor, and has been linked with cataract in rat [34][35]; HMGB1 plays an important role in protecting the keratinocytes from ultraviolet radiation-induced cell death and is thus involved in cataract formation [36]. Overall, all the results argue for an important role of aberrant methylation changes in the development of cataract in aged giant pandas.
Supporting information S1