Exploring Differentially Expressed Genes and Natural Antisense Transcripts in Sheep (Ovis aries) Skin with Different Wool Fiber Diameters by Digital Gene Expression Profiling

Wool fiber diameter (WFD) is the most important economic trait of wool. However, the genes specifically controlling WFD remain elusive. In this study, the expression profiles of skin from two groups of Gansu Alpine merino sheep with different WFD (a super-fine wool group [FD = 18.0 ± 0.5 μm, n= 3] and a fine wool group [FD=23.0±0.5μm, n=3]) were analyzed using next-generation sequencing–based digital gene expression profiling. A total of 40 significant differentially expressed genes (DEGs) were detected, including 9 up-regulated genes and 31 down-regulated genes. Further expression profile analysis of natural antisense transcripts (NATs) showed that more than 30% of the genes presented in sheep skin expression profiles had NATs. A total of 7 NATs with significant differential expression were detected, and all were down-regulated. Among of 40 DEGs, 3 DEGs (AQP8, Bos d2, and SPRR) had significant NATs which were all significantly down-regulated in the super-fine wool group. In total of DEGs and NATs were summarized as 3 main GO categories and 38 subcategories. Among the molecular functions, cellular components and biological processes categories, binding, cell part and metabolic process were the most dominant subcategories, respectively. However, no significant enrichment of GO terms was found (corrected P-value >0.05). The pathways that were significantly enriched with significant DEGs and NATs were mainly the lipoic acid metabolism, bile secretion, salivary secretion and ribosome and phenylalanine metabolism pathways (P < 0.05). The results indicated that expression of NATs and gene transcripts were correlated, suggesting a role in gene regulation. The discovery of these DEGs and NATs could facilitate enhanced selection for super-fine wool sheep through gene-assisted selection or targeted gene manipulation in the future.


Introduction
Fine wool sheep, also called Merino, is a world famous sheep breed that is known to produce high-quality fine wool. Fine wool sheep are distributed primarily in Australia, China, New Zealand, South Africa, Uruguay, Argentina and other countries [1]. Wool fiber diameter(WFD) is the most important economic trait of merino sheep and determines 75% of the value of wool fibers. The WFD variation-induced profit accounted for 61% of the total profits of wool [2,3]. Wool is formed from keratinocytes derived from a progenitor population at the base of the hair follicle(HF) [4]. The morphogenesis and growth of HF in sheep has been extensively studied since the 1950's and the developmental processes at the cellular level are reasonably well understood [5][6][7]. Dermal papilla (DP) cells are a population of mesenchymal cells at the base of the HF [8], and provide signals that contribute to specifying the size, shape and pigmentation of the wool [9]. It is well known that WFD is significantly associated with size of DP and matrix in mammals [4,6,[10][11][12][13][14][15][16][17], and is largely specified post-initiation, during the period of HFs growth and morphogenesis [18]. To illustrate the molecular mechanisms of controlling WFD, the expression profiles of different stage of fetal and adult sheep skin have also been generated by sequencing of expressed sequence tags (ESTs) and cDNA microarray [19,20]. However, the size of DP and matrix in mammals is also markedly influenced by genetic [10,11,21], physiological [13], nutrition [22], hormones [12] during the anagen phase of the hair cycle. Up to now, there are no studies on the molecular mechanisms of controlling WFD during the anagen phase, and the genes specifically controlling WFD remain elusive [23].
Knowledge of the genes controlling development of DP and matrix come from studies of the morphogenesis and cycle of HF of mice and human [24][25][26][27][28], It involves a series of signaling between the matrix and the dermal papill, such as Wnt/beta-catenin, EDA/EDAR/NF-κB, Noggin/Lef-1, Ctgf/Ccn2, Shh, BMP-2/4/7, Dkk1/Dkk4 and EGF [4,29]. The mutation, epigenetic modification and post-translational modification of any ligand or receptors in these pathways maybe affect WFD [30,31]. Therefore, in addition to the current efforts on proteinencoding genes, the attention should also be paid to a novel regulatory factor, non-coding RNA (ncRNA), such as micro RNA, long non-coding RNA and natural antisense transcripts (NATs) [32]. Among these ncRNAs, NATs are not only large in quantity but also play important roles in gene expression regulation in organisms [33]. NATs refer to a class of non-coding RNAs that are produced inside organisms under natural conditions and are expressed in many species [34][35][36]. NATs play an important role in transcript regulation at the mRNA and/or protein levels and in regulating various physiological and pathological processes, such as organ formation, cell differentiation and disease [37,38]. However, the roles of NATs in controlling WFD have not been described.
The next-generation sequencing (NGS)-based digital gene expression (DGE) profiling technologies developed in recent years constitute a revolutionary change in traditional transcriptome technology. Compared with ESTs and cDNA microarray, the strength of DGE profiling is that it is an "open system" and has better capability to discover and search for new information, providing a new way to identify novel genes and NATs that specifically control WFD [39]. However, the success or effectiveness of the search is largely dependent on the completeness and stitching quality of the genome sequences of the species studied [40]. It is exciting that the International Sheep Genomics Consortium (ISGC) has achieved initial success in assembling the reference sheep genome. The total length of the assembled genome, Sheep Genome v3.1, has reached 2.64 Gb, with only 6.9% gaps [41]. Needless to say, the release of a high-quality sheep genome reference sequence provides important resources for using NGS to study the skin transcriptome of sheep with different WFD and find genes and NATs that specifically control WFD [42,43].
In this study, NGS-based DGE profiling was conducted to analyze the differentially expressed genes (DEGs) and NATs in the skin of Gansu Alpine fine wool sheep with different WFD. A total of 47 significant DEGs and NATs were detected, including 9 up-regulated genes, 31 down-regulated genes and 7 down-regulated NATs. These DEGs and NATs may be useful in further study on molecular markers of controlling WFD in fine wool sheep.

Sequencing and assembly
NGS was performed on 6 individuals of the super fine wool group (sample nos.65505,65530,65540;n = 3) and the fine wool group (sample nos.5Y127,5Y212,5Y339; n = 3), and raw data greater than 5 Mb were obtained for all individuals (Table 1), submitted to the NCBI BioProject database, with the accession number PRJNA274817. There was a 3' adaptor sequence in the raw data, along with small amounts of low-quality sequences and various impurities. Impurity data were removed from the raw data, and clean tags greater than 4.7 Mb were obtained. The distinct tag number of each individual was greater than 0.11 Mb (Table 1).
The sheep genome reference sequence database includes 19,346 gene sequences, including 18,940 genes with "CATG" loci and accounting for 97.90% of the total number of genes. The total number of reference tags in the reference tag database is 173,207, including 169,843 unambiguous reference tags, accounting for 98.06% of the total number of tags. All clean tags were compared with reference genes and reference genomes, and the results indicated that 87.45%, 87.32%, 88.63%, 88.97%, 88.72% and 87.38% of the total numbers of clean tags of the 6 individuals from two groups (sample nos. 5Y127, 5Y212, 5Y339, 65505, 65530,and 65540, the same hereinafter), respectively, could be matched to the reference tags; of these, 49.05%, 50.26%, 51.86%, 52.41%, 51.48% and 50.37% of the clean tags could be uniquely located in the reference sequences (sense and antisense), and the ratios of distinct tag were 36.00%, 36.63%, 37.64%, 36.61%, 37.33% and 33.22% (Table 2), respectively; the numbers of unambiguous tagmapped genes were 10,391, 10,209, 10,666, 10,434, 10,137, 10,298,10763 and 9,965, respectively. (Table 2) These uniquely located sequences indicated that the key genes that regulate the WFD may exist in genes expressed in the individuals with different WFD.
There were a large number of possible unknown tags that are in the gene expression profile libraries of these six individuals. Accounting for 12.55% (617,198), 12.68% (638,292), 11.37% (552,279), 11.03% (543,900), 11.28% (585,017) and 12.62% (623,705) of the total number of clean tags were unannotated, respectively; the proportions of distinct tag number of unknown tag were 18.77% (22, (20,292) and 20.43% (29,382), respectively. These results suggested that there were many unknown genes in sheep skin tissue that may also play important roles in the regulation of HF development and wool growth.

Analysis of the expression profiling of NATs
Sense-antisense regulation is an important method for controlling gene expression. If the clean tags can be matched to the antisense strand of the gene, then it suggests that there are also transcripts for the antisense strand of this gene and that this gene may be subjected to sense-antisense regulation [44]. In the present study, we found that in the gene expression profiling libraries of the 6 individuals of the two groups, the percentages of genes with antisense transcripts were, respectively, 5.83%, 6.01%, 5.91%, 5.84%, 6.00% and 6.20% of the total numbers of the genes in the library, including 247,816 (5.04%), 262,353 (5.21%), 249,595 (5.14%), 247,940 (5.03%), 266,926 (5.15%) and 265,599 (5.37%) tags that could be exactly matched, respectively (Table 3); the number of tag species accounted for, respectively, 13.31%, 13.31%, 13.46%, 13.20%, 13.65% and 11.94% of the total numbers of tag species in the expression profile libraries of the 6 individuals from the 2 groups, including 15,192 (Table 3). The sense-antisense transcript ratios of all 6 expression profile libraries exhibited similar trends. The ratios between the tag number, distinct tag number and tag-mapped genes of sense and antisense transcripts were 10:1, 2:1 and 5:3, respectively; the spearman r between the sense and antisense transcripts expression were 0.533~0.557.

Detection of DEGs, NATs and validation
The clean tags of all six individuals were aligned with the reference tag database, allowing a maximum of one base mismatch. Unambiguous tags were annotated, and the number of raw clean tags that corresponded to the same gene was counted and then standardized to obtain the standardized expression level of each gene in the skin transcriptome of the 6 individuals. Noiseq software was used to select DEGs, NATs that exhibited different expression levels between S and F group. However, there were only 47 significant DEGs and NATs (log2Ratio (S/ F)!1, q-value!0.8).9 genes were up-regulated, 31 genes and all 7 NATs were down-regulated ( Table 4, Fig 1). Among the down-regulated genes, NT5C3L (Gene ID 101109197) showed the greatest expression difference (log2Ratio(S/F) = -10.59,q value = 0.89); among the up-regulated genes, CCNA2 (Gene ID 100144758) showed the greatest expression difference (log2Ratio (S/F) = 6.04, q value = 0.83). 38 significant expression genes had NATs, and only 2 had no found antisense transcripts: CCNA2 (Gene ID: 100144758) and prolactin-inducible protein homolog (Gene ID: LOC101114011).3 significant DEGs (AQP8, Gene ID: 101108013;Bos d2, Gene ID:101116281, and SPRR,Gene ID: 443313) had significant NATs. All 3 of these genes and their NATs were significantly down-regulated in the super-fine wool group (Table 4). 10 DEGs, NATs were used to validate selected differentially expressed transcripts identified from DGE profiling by Real-time PCR. The results from the real-time PCR confirmed the expression pattern of DGEs and NATs at two different groups in Gansu Alpine fine wool sheep. Gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and other databases were used for the functional analysis and signaling pathway annotations of these DGEs and NATs. Among all of the significant DEGs and NATs, 44 genes were annotated, and 3 genes located in the genome could not be annotated effectively. In total of 47 DEGs and NATs were summarized as three main GO categories and 38 subcategories (Fig 2). Among the molecular functions category, the top three were involved in binding and catalytic activity. Regarding cellular components, cell part, cell, organelle, membrane, organelle parts were the dominant groups. Within biological processes category, metabolic process and cellular process were the most dominant group. However, no significant enrichment of GO terms was found (corrected P-value >0.05). The pathways that were significantly enriched with significant DEGs and NATs were mainly the lipoic acid metabolism, bile secretion, salivary secretion and ribosome and phenylalanine metabolism pathways (P < 0.05).

Discussion
Wool is produced via synthetic processes by HFs, which are embedded in the skin of sheep [45]. There are two types of HFs, named primary HF and secondary HF, which are different in appearance and function. Secondary HFs of the fine wool sheep are the main hair follicle and are critical determinants of mean fiber diameter and other wool characteristics. WFD is highly correlated with the size of DP of HF [17], whose origin can be traced to the dermal condensate, one of the earliest features of the developing HF [19]. The charactering of the molecular controls of HF initiation, morphogenesis, branching and growth can facilitate enhanced selection for new sheep breeds with lower WFD [20]. In order to understand the molecular mechanisms of controlling WFD, Adelson et al. (2004) constructed three cDNA libraries from fetal and adult sheep skin, obtained 2,345 noredundant EST sequences and identified 61 ESTs expressed in the adult HF, which constituted a high priority candidate gene subset for further work aimed at identifying genes useful as selection markers or as targets for genetic engineering [19]. Norris et al. (2005) found 50 up-and 82 down-regulated genes with increased fetal and inguinal expression relative to adult by compared skin gene expression profiles between fetal day 82, day 105, day 120 and adult HF anagen stage using a combined ovine-bovine skin cDNA  [20]. However, comparing to sequencing of expressed sequence tags (ESTs) and cDNA microarray, DGE profiling has many unique advantages [46,47]. It is highly accurate and has a very low detection limit, giving it a very wide range of applications [48],such as detection of new transcripts [49],functional research of non-coding RNA [50,51]. In this study, six DGE profiling were conducted to analyze the DEGs and NATs in the skin of Gansu Alpine fine wool sheep with different WFD at the same age, gender, and nutrition level during the anagen. More than 4.7 Mb of clean tags were obtained for each sheep, with more than 0.11 Mb of distinct tags. A total of 47 significant DEGs and NATs were detected, including 9 up-regulated genes, 31 down-regulated genes and 7 down-regulated NATs. The HF comprises several concentric epithelial structures [29]. Wool fiber is enveloped by two epithelial sheaths, known as the inner root sheath (IRS) and the outer root sheath (ORS).

Genes and NATs in Sheep Skin with Various Fiber Diameters
At the distal end of the HF, IRS cells undergo apoptosis and liberate the wool fiber [29,52]. ORS cells contain proliferating cells derived from stem cells in the bulge that feed into the matrix compartment of the bulb [53]. The matrix is a proliferative zone located at the proximal end of the HF surrounding the DP [52]. During the anagen phase, the DP cells are thought to direct the matrix cells to proliferate and differentiate into the multiple cell types that form the wool fiber and its channel, the IRS [9]. During postnatal life, hair follicles show patterns of cyclic activity with periods of active growth and hair production (anagen), apoptosis-driven involution (catagen), and relative resting (telogen) [53][54][55]. However, Merino HFs are predominantly in anagen throughout growth, different to many animals such as the mouse, rabbit, and guinea-pig [30]. It is also well known that WFD is significantly associated with sizes of DP and matrix in mammals [4,6,[10][11][12][13][14][15][16][17],which are markedly influenced by genetic [10,11,21], physiological [13],nutrition [22], hormones [12] during the anagen phase of the hair cycle. However, the genes specifically controlling the size of DP and matrix remain elusive. In the present study, the up-regulated genes in the super fine wool group included 5 genes associated with the cell cycle and apoptosis: GSDMA3 [56], HSPA2 [57], RPS27A [58], PDCD6IP [59], DAP [60], CCNA2 [61] and FOS [62]; the down-regulated genes included 7 genes associated with promoting follicle cell proliferation and differentiation: KRT1 [63,64], KRT7 [65,66], HSD11B1 [67,68], S100A8 [41,69,70], NT5C3 L [71] and DNAJC12 [72]. These genes may through promoting HF cell apoptosis, inhibiting follicle cell proliferation and differentiation, thereby reduce the WFD.
White adiposities, dermal fibroblasts, smooth muscle and endothelial cells of the vasculature, neurons and resident immune cells also surround the HFs in the skin. Recent studies show that the growth of wool fiber is not only affected by the proliferation and differentiation of follicular cells but is also affected by other cells around the follicle [73,74]. The layer of intradermal adipocytes expands as the HF enters its anagen stage and then thins during telogen [75,76]. FATP4 -/-, Dgat1 -/-, or Dgat2 -/-Early B cell factor 1 (Ebf1 -/-) mice have decreased intradermal adipose tissue due to defects in lipid accumulation in mature adipocytes [77][78][79]. Interestingly, these mice also display abnormalities in skin structure and function, such as hair loss and epidermal hyperplasia. In the present study, in the super fine wool group, it was also determined that the genes related to fat synthesis (ABCC11 [80], GLYATL2 [81], and ACSM3 [82]) were significantly down-regulated and that LIPK, which is involved in lipolysis, was significantly up-regulated [83].
Genes with NATs are very common in animal and plant genomes, accounting for 7% to 9% of transcripts in plants and 5% to 30% of transcripts in animals, except for nematodes [84]. The present study indicated that more than 30% of the genes had NATs, which is consistent with the findings of the above mentioned studies. In the present study, correlation analysis was conducted on the sense and antisense transcripts in the DGEs of skin, and the results showed that the correlation coefficients between the sense and antisense transcripts were between 0.533 and 0.557, indicating that the WFD maybe were regulated by both sense and antisense transcripts. NATs regulate sense transcripts through positive or negative feedback by forming a sense-antisense bidirectional structure [85]. In this study, 3 out of 40 significantly differentially expressed genes had significantly different NATs: AQP8, Bos d2, and SPRR (type II small proline-rich protein). AQP8 belongs to the aquaporin subfamily of the AQP family [86]; it is located mainly in a variety of tissues and on the surfaces of acinar cells in various glands and plays important roles in gland secretion and the transportation of water, urea, ammonia and hydrogen peroxide [87][88][89][90]. Bos d 2 belongs to the lipocalin family of proteins [91]. In the skin sections, Bos d2 was found in the secretory cells of apocrine sweat glands and the basement membranes of the epithelium and HF [92]. It is assumed that Bos d2 is a pheromone carrier. [92]. The SPRR proteins comprise a subclass of specific cornified envelope precursors encoded by a multigene family clustered within the epidermal differentiation complex region [93]. Two SPRR1, seven SPRR2, one SPRR3 and one SPRR4 genes are located within a 300-kb area of the EDC [94,95] and are expressed in the epidermis, HFs and capillaries [96,97]. Several studies have suggested that the SPRRs are related to increased epithelial proliferation and to malignant processes [98]. Other than functioning as structural proteins, SPRRs also regulate gene expression levels, inhibit cell proliferation and promote differentiation [99]. The present study also found that AQP8, Bos d2, SPRR and their antisense transcripts were significantly down-regulated in the super fine wool group, suggesting that AQP8, Bos d2 and SPRR and their antisense transcripts were regulated by positive feedback. However, the mechanism by which AQP8, Bos d2 and SPRR regulate the WFD needs to be further studied.

Sheep skin sampling
Gansu Alpine fine wool sheep were bred in the Huang Cheng District of Gansu Province, China, by cross breeding Mongolian or Tibetan sheep with Xinjiang Fine Wool sheep and then with some fine wool sheep breeds from the Union of Soviet Socialist Republics, such as Caucasian sheep and Salsk sheep. The breed was approved by the Gansu provincial government in 1980. Gansu Alpine fine wool sheep were obtained from a sheep stud farm located in Zhangye city, Gansu Province. All experimental and surgical procedures were approved by the Institutional Animal Care and Use Committee, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Peoples Republic of China. Six unrelated 3 years old ewes at different WFD, and also as different DP size were selected and divided into super fine wool group (S) (WFD = 18.0±0.5μm; Diameter of secondary DP size = 3.2±0.2μm) and fine wool group (F) (WFD = 23.0±0.5μm; Diameter of secondary DP size = 4.1±0.2μm) (Fig 3). A piece of midside skin (2 mm in diameter) was collected via punch skin biopsy under local anesthesia using 1% procaine hydrochloride immediately placed in liquid nitrogen and stored at -80°C for subsequent analysis.

Total RNA extraction, library construction and deep sequencing
Total RNA was isolated from the tissues using the RNeasy Maxi Kit (Qiagen, Hilden, GER) according to the manufacturer instructions. RNA quality was verified using a 2100 Bioanalyzer RNA Nanochip (Agilent, Santa Clara, CA, USA), and the RNA Integrity Number (RIN) value was >8.5. Then, the RNA was quantified using the Nano Drop ND-2000 Spectrophotometer (Nano-Drop, Wilmington, DE, USA). Sequence tags were prepared using the Illumina Digital Gene Expression Tag Profiling Kit, according to the manufacturer's protocol. Sequencing libraries were prepared from 1μg of total RNA using reagents from the NlaIII Digital Gene Expression Tag Profiling kit (Illumina Inc., San Diego, CA, USA). mRNA was captured on magnetic oligo(dT) beads and reverse transcribed into double-stranded cDNA (SuperScript II, Invitrogen, Carlsbad, CA, USA). The cDNA was cleaved using the restriction enzyme NlaIII. An adapter sequence containing the recognition sequence for the restriction enzyme MmeI was ligated to the NlaIII cleavage sites. The adapter-ligated cDNA was digested with MmeI to release the cDNA from the magnetic bead, while leaving 17 bp of sequence in the fragment. The fragments were dephosphorylated and purified by phenol-chloroform. A second adapter was ligated at the MmeI cleavage sites. Adapter-ligated cDNA fragments were amplified by PCR, and after 15 cycles of linear PCR amplification, 95 bp fragments were purified by 6% TBE PAGE gel electrophoresis, denatured and the single-chain molecules were fixed onto the Illumina Sequencing Chip. A single-molecule cluster sequencing template was created through in situ amplification. Nucleotides labeled with different colors were used to perform sequencing by the sequencing by synthesis method; each tunnel can generate millions of raw reads with a sequencing length of 35 bp.

Determination of gene expression levels and detection of DEGs and NATs
Sequencing-received raw image data were transformed by base culling into sequence data, which was called raw data. Raw sequences were transformed into clean tags after removed all low quality tags such as short tags (< 21 nt), empty reads, and singletons (tags that occurred only once). Clean tags were classified according to their copy number (as a percentage of the total number of clean tags) and the saturation of the library was analyzed. All clean tags were mapped to sheep reference sequences(version:Oarv3.1) by SOAP2(version:2.21) with default parameters, and allowing one mismatches. To monitor mapping events on both strands, both sense and complementary antisense sequences were included [100].
A preprocessed database of all possible 17 base-long sequences of Oarv3.1 located next to the NlaIII restriction site was created as the reference tags, and only one mismatch was allowed. Following the common practice, only clean reads that can be uniquely mapped back to the reference tags were considered ("-r 0" option), and those mapped more than one location were discarded. Remainder clean tags were designed as unambiguous clean tags. When multiple types of remainder clean tags were aligned to the different positions of the same gene, the gene expression levels were represented by the summation of all. The number of unambiguous clean tags for each gene was calculated and then normalized to TPM (number of transcripts per million clean tags) = clean tag number corresponding to each gene number of total clean labels in the sample × 1,000,000 [101,102].
All TPM of 3 samples each group were integrated, and NOISeq (version: 2.8.0) [103]with default parameters, which is a novel nonparametric approach for the identification of DEGs and NATs, was used to detect the differentially expressed transcripts between super fine wool group (S) and fine wool group (F). To measure expression level changes between two groups, NOISeq takes into consideration two statistics: M(the log 2 -ratio of the two conditions) and D (the absolute value of the difference between conditions). The probability thresholds were P ! 0.8 and the TMM value in the lower expressed sample was !1. The higher the probability, the greater the change in expression between two groups. Using a probability threshold of 0.8 means that the gene is 4 times more likely to be differentially expressed than non-differentially expressed [104].

Strand-specific real-time quantitative RT-PCR
To confirm the differentially expressed sense and antisense transcripts between super fine wool group and fine wool group, ten genes were randomly selected to verify their expression levels of gene and NATs transcripts in skin by strand-specific qRT-PCR according to the protocol described in Haddad et al. (2007) [105]. Primers for real-time PCR were designed with Primer Express 3.0 (Applied Biosystems) ( Table 5). GAPDH was used as a reference control. Real time PCR was performed using SYBR Green master mix (TianGen) on the CFX96 Real-Time System (BioRAD, USA). The reaction was performed using the following conditions: denaturation at 95°C for 3 min, followed by 40 cycles of amplification (95°C for 30s, 60°C for 30s, and 72°C for 30s). Relative expression was calculated using the delta-delta-Ct method.

GO and KEGG enrichment analysis of differentially expressed transcripts
Gene ontology (GO), an international standardized gene functional classification system, offers a dynamic-updated controlled vocabulary and strictly defined concept to comprehensively describe the properties of genes and their products in any organism [106]. Kyoto Encyclopedia of Genes and Genomes (KEGG) database is a database resource for understanding functions and utilities of the biological system, such as the cell, the organism and the ecosystem from molecular information, especially for large-scale molecular datasets generated by genome sequencing and other high throughput experimental technologies (http://www.genome.jp/kegg/) [107]. all DEGs and NATs were mapped to GO-terms in GO database, looking for significantly enriched GO terms in DEGs comparing to the genome background GOseq R package(version:1.18.0) [108],while significantly enriched metabolic pathways or signal transduction pathways in DEGs and NATs were identified via pathway enrichment analysis using KEGG, public pathway-related database, and comparing with the whole genome background by KOBAS(version: 2.0) [109].
In all tests, P-values were calculated using Benjamini-corrected modified Fisher's exact test and 0.05 was taken as a threshold of significance. The calculating formula is: Where N is the number of all genes with GO or KEGG annotation; n is the number of DEGs or NATs in N; M is the number of all genes that are annotated to the certain GO terms or specific pathways; m is the number of DEGs or NATs in M.
Supporting Information S1