Transcriptomic Analysis of Ovaries from Pigs with High And Low Litter Size

Litter size is one of the most important economic traits for pig production as it is directly related to the production efficiency. Litter size is affected by interactions between multiple genes and the environment. While recent studies have identified some genes associated with prolificacy in pigs, transcriptomic studies of specific genes affecting litter size in porcine ovaries are rare. In order to identify candidate genes associated with litter size in swine, we assessed gene expression differences between the ovaries of Yorkshire pigs with extremely high and low litter sizes using the RNA-Seq method. A total of 1 243 differentially expressed genes were identified: 897 genes were upregulated and 346 genes were downregulated in high litter size ovary samples compared with low litter size ovary samples. A large number of these genes related to steroid hormone regulation in animal ovaries, including 59 Gene Ontology terms and 27 Kyoto Encyclopedia of Genes and Genomes pathways involved in steroid biosynthesis and ovarian steroidogenesis. From these differentially expressed genes, we identified a total of 11 genes using a bioinformatics screen that may be associated with high litter size in Yorkshire pigs. These results provide a list of new candidate genes for porcine litter size and prolificacy to be further investigated.


Introduction
Litter size is one of the most important economic factors in pig production and is affected by interactions between multiple genes and the environment [1,2].The size of litters varies between pigs in different breeding farms.Altering litter size by conventional breeding methods can be slow, and has shown low heritability.However, using marker assisted selection (MAS) can speed up genetic improvements in litter size traits [3].As the ovary directly mediates ovulation, it has a significant impact on the fecundity of mammals, and therefore, genetic differences in the ovaries may contribute to the observed differences in litter size [4,5].
RNA sequencing (RNA-Seq) can measure genes, both quantitatively and functionally, at the transcriptome level [17].To date, RNA-Seq has been used to study specific ovarian genes of important livestock animals (e.g., yak [18], and goat [19,20]), but transcriptomic studies of specific genes in pigs are rare, and have mainly focused on the molecular regulation mechanism of fat deposition and muscle development [21][22][23].Therefore, transcriptomic analysis of pig ovaries using RNA-Seq may explain heredity of porcine fecundity, and be used to identify key genes relating to litter size.
Yorkshire is one of the world's most famous lean pig breeds, and is known for its higher fertility and high litter size [24,25].However, in pig production, some Yorkshire pigs always have low litter sizes, which seriously restricts productivity.In the present study, we sampled ovaries of Yorkshire pigs with extremely high and low litter sizes, and analyzed the differential expression of genes (DEGs) using RNA-Seq.The aim of this analysis was to identify major genes that control prolificacy, and thus provide a molecular basis for genetic improvement of reproductive traits in swine.

Ethics statement
Yorkshire (a Danish lean-type breed) pigs were obtained from the Anhui Daziran Primary Pig Breeding Farm, Huaibei, Anhui, China.All experimental procedures and sample collection were performed according to the Regulations for the Administration of Affairs Concerning Experimental Animals (Ministry of Science and Technology, China; revised in June 2004) and approved by the Institutional Animal Care and Use Committee of Anhui Agricultural University, Hefei, China, under permit No. ZXD-P20140809.This report fully adhered to the ARRIVE Guidelines for the reporting of animal research [26].A completed ARRIVE guideline checklist is included (S1 Checklist).The animals were reared in the same environment and fed the same diet ad libitum during the experimental period.Food was withheld from the animals on the night before they were slaughtered.

Animals and ovary collection
A total of twelve healthy female pigs were used in this study from two groups: the extreme high litter size group (YH: n = 6), and the extreme low litter size group (YL: n = 6) (Table 1), representing pigs with high and low fecundity, respectively.In order to reduce, as far as is possible, the effects of age and parity on litter size, three pigs of similar age and parity from each group were selected as biological replicates for RNA-Seq.Their intact ovaries were rapidly harvested from their carcasses and immediately frozen in liquid nitrogen.All tissue samples were stored at −80°C until the total RNA extraction procedure was performed.

mRNA library preparation and sequencing
The ovaries were completely ground and total RNA was extracted using TRIzol (Invitrogen, Carlsbad, CA, USA).The quality of the total RNA was checked using the Agilent 2100 Bioanalyzer system (Santa Clara, CA, USA).A total amount of 3 μg RNA per sample was used as input material for the RNA sample preparations.Sequencing libraries were generated using NEBNext 1 Ultra TM RNA Library Prep Kit for Illumina 1 (NEB, USA) following manufacturer's recommendations.Briefly, mRNA was extracted from total RNA using oligo (dT) magnetic beads and sheared into short fragments of about 200 bases.These fragmented mRNAs were then used as templates for cDNA synthesis.The cDNAs were then PCR amplified to complete the library.The cDNA library was sequenced using an Illumina HiSeq TM 2000 platform.

Analysis of RNA-Seq data
Raw RNA-Seq reads were processed through in-house perl scripts.Clean reads were obtained by removing reads containing low quality reads and/or adaptor sequences from raw reads [27], and mapped to the pig genome (Sus scrofa 10.2) using TopHat software [28], allowing up to two base mismatches.The gene expression level was then calculated using the reads per kilo bases per million reads (RPKM) method [29].We considered the gene was expressed exclusively in one of the two groups if its RPKM 0.3 in one group and < 0.3 in the other group.Differential expression analysis was performed using the Benjamini and Hochberg's approach for controlling the false discovery rate [30].Genes with an adjusted P value < 0.05 were assigned as differentially expressed (DEG).
DEG lists were submitted to the databases of Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) for enrichment analysis of the significant overrepresentation of GO terms and KEGG-pathway categories [31,32].In all tests, P values were calculated using the Benjamini-corrected modified Fisher's exact test and 0.05 was taken as a threshold of significance.

Quantitative PCR analysis
The RNA-Seq results were validated using an RNA samples from the YH group (n = 6) and the YL group (n = 6), respectively.Six DEGs enriched in the ovarian steroidogenesis pathway, two DEGs (one upregulated, and the other downregulated in the YH group) were enriched for metabolic pathways, two DEGs with the highest numbers of reads in both YH and YL, and two genes (one expressed only in the YH group, and the other expressed only in the YL group) were analyzed by quantitative PCR (qPCR) (S1 Table ).Total RNA (1.0 mg) was used to synthesize first-strand cDNA using PrimeScript RT Master Mix (TaKaRa, Osaka, Japan).Q-PCR was performed using SYBR Premix Ex Taq (TaKaRa, Osaka, Japan) in the CFX96 Real-Time PCR Detection System (Bio-Rad, Hercules, CA, USA).The primers were designed using the Primer Express (Applied Biosystems) software.All the mRNA nucleotide sequences were obtained from the NCBI Entrez Nucleotide database (http://www.ncbi.nlm.nih.gov/sites/entrez?db = nuccore&itool = toolbar), or from the EMBL-EBI database (http://www.ebi.ac.uk).The primers used for qPCR are listed in S1 Table .The thermal cycling conditions were 95°C for 10 min, followed by 40 cycles of 95°C for 15 s and 60°C for 1 min.There were three replicates for each amplification.Cycle threshold (Ct) values were analysed using a linear mixed model as described below.
in which L is the effect of fecundity, A is the fixed effect of age, and P is the fixed effect of parity of animal [33].Then the Ct values were transformed to quantities using the comparative Ct method.Data was normalized using the porcine β-actin reference gene.Comparison of gene expression levels conditional on prolificacy levels was performed using the t-test, and correlations between qPCR and RNA-Seq measures were calculated.

Overview of sequencing data
After removing the low quality and adaptor sequences, we obtained approximately 52 to 66 million (M) clean reads for six RNA-Seq libraries, and high percentages of mapped reads ranging from 79.15 to 80.68%.Most mapped reads were located within an exon (71.6 to 85.6%) while a smaller percentage of mapped reads (less than 19.0%) were located within the introns and intergenic regions (Table 2).These results indicated that our six libraries were of high quality, and had high coverage of the pig genome.This allowed us to compare the ovary transcriptomes from pigs with high and low litter size.The data is available from the Sequence Read Archive (SRA) (Accession no.SRP058401, Bioproject: PRJNA283575).

Functional enrichment analysis of differentially expressed genes
To define the biological functions of the 1 243 DEGs, GO and KEGG analysis were carried out.Fifty-nine significantly enriched GO terms (corrected P < 0.05) were identified, including single-organism metabolic process, lipid metabolic process, oxidation-reduction process, lipid biosynthetic process, organic acid metabolic process, carboxylic acid metabolic process, oxoacid metabolic process, small molecule metabolic process, fatty acid metabolic process, cellular lipid metabolic process, catalytic activity, steroid metabolic process, and glucose metabolic process (Table 4 and S3 Table ).Meanwhile, 27 significantly enriched KEGG pathways were identified, including metabolic pathways, valine/leucine and isoleucine degradation, fatty acid metabolism, carbon metabolism, steroid biosynthesis, butanoate metabolism, ovarian steroidogenesis, biosynthesis of unsaturated fatty acids, PPAR signaling pathway, synaptic vesicle cycle, and biosynthesis of amino acids (Table 5 and S4 Table ).Among these GO terms and KEGG pathways, the steroid biosynthesis and ovarian steroidogenesis pathways are the ones related to steroid hormone regulation in animal ovaries and therefore likely to be contributing to litter size.However, as most GO and KEGG assignments and distributions are related to reproduction, growth and development, and metabolism, our results indicate that the DEGs are involved in a wide range of regulatory functions in porcine ovaries.

Validation of DEGs by qPCR
Twelve genes were used for qPCR analysis from the ovaries of the YH and YL groups to validate the expression profiles obtained by RNA-Seq.Consistent with the RNA-Seq findings, we verified that hydroxysteroid (17-beta) dehydrogenase 2 (HSD17B2), cytochrome P450 E class group 1 (CYP11A1), steroidogenic acute regulatory protein (STAR), low density lipoprotein receptor (LDLR), scavenger receptor class B member 1 (SCARB1), cytochrome c oxidase, subunit I domain (CO1), and cytochrome c oxidase subunit III domain (COX3) genes were all upregulated, while luteinizing hormone/choriogonadotopin receptor (LHCGR), and insulinlike growth factor 1 (IGF-1) genes were downregulated in high litter size samples.Matrix metallopeptidase 13 (MMP13) gene was not significant difference between the two groups.However, branched-chain amino acid aminotransferase II (BCAT2) and dopachrome tautomerase (DCT) genes were validated as different expression with P > 0.05 in YH versus YL groups (Table 6 and Fig 2).Therefore, results obtained from RNA-Seq were statistically confirmed for 83.3% of the tested genes by qPCR.In all cases, the relative fold change of gene expression was in the same direction between the RNA-Seq and qPCR data.

Identification of candidate genes for pig litter size
To accurately identify the key genes that may influence porcine litter size, we created a Venn diagram of the 10 most DEGs between high and low litter size samples (Table 3), the 10 most expressed genes in the ovaries of pigs with high litter size (Table 7), the 10 most expressed genes in the ovaries of pigs with low litter size (Table 7), and the 21 genes enriched in the steroid metabolic process and ovarian steroidogenesis of the above-mentioned (S5 Table ).The results showed that 11 genes appeared in the two or more sets, including the CO1, glutathione peroxidase 3 (GPX3), beta-microseminoprotein (MSMB), COX3, tissue inhibitor of metalloproteinase 1 (TIMP1), cytochrome b (CYTB), STAR, 3-beta hydroxysteroid dehydrogenase (HSD3B), CYP11A1, SCARB1, and hydroxysteroid (17-beta) dehydrogenase 2 (HSD17B2) (Fig 3).These 11 genes may be candidates for porcine fecundity and litter size.

Discussion
Ovaries are one of the most important animal reproductive organs.They directly mediate ovulation and female hormone secretion, which has a significant impact on the fecundity of mammals [34][35][36].Some specific genes relating to animal fecundity have been reported using transcriptome analysis of certain characteristics in ovaries [18,37], however similar studies on litter size in pigs had not been conducted to date.It is well-known that there is gene expression specificity in different tissues and cells.The ovaries contain a mixture of different tissue, and the expression of candidate genes may differ between them.We strove to ensure that we obtained intact ovaries and ground them completely for the purpose of RNA extraction, to ensure that the RNA-Seq results were representative of the complete porcine ovarian transcriptome.Additionally, the age of the sow is known to greatly affect litter size.In order to minimize this and the effect of parity, we selected six pigs of similar age and parity for RNA-Seq.Two sequencing libraries were constructed from extremely high and low litter size samples.High quality transcriptome data was generated (60 million clean reads; 80% genome mapping rate), which was sufficient for the quantitative analysis of gene expression.Previous studies of goat ovaries suggested that the most differentially expressed genes identified by RNA-Seq were likely to be important for improving litter size [19].Some of the top 10 most differentially expressed genes found in our study have been reported to be candidate genes involved in important metabolic processes.For example: the PCK1 gene has previously been shown to be associated with lipid metabolism in animals [38][39][40][41]; the HSD17B2 gene has been identified as a key regulator of steroid hormone metabolism [42][43][44][45][46]; the EGR4 gene is a candidate gene for reproductivity in animal ovary and testis [47][48][49][50]; and the RAB33A gene may affect the risk of developing certain disease, such as tuberculosis [51,52].However, while there was an association between HGD and meat quality traits reported in Chinese red cattle [53,54], these gene association studies have mainly concentrated on humans or animal models.Therefore, the role of these genes in pig fecundity requires further investigation.From our GO and KEGG analysis, we found that the functions of the 10 most DEGs between high and low litter size samples were mainly in the metabolic pathways, ovarian steroidogenesis, citrate cycle (TCA cycle), pyruvate metabolism, and the PPAR signaling pathway.It is accepted that the molecular regulation of animal traits is very complex and the relationship between genes and traits is often that of "one-to-many" or "many-to-one".The DEGs were not only enriched in reproduction-related pathways but also in those involved with lipid and fatty acid metabolism.This suggests that the genes may be associated with both reproduction and fat metabolism.It is easily appreciated that, in the life of organisms, all physiological processes interact with each other.The pathways mentioned above are, to a greater or lesser degree, involved in the development of follicular cells and oocytes.Consequently, functional studies should be performed with these DGEs in order to identify key candidate genes influencing reproductive traits in swine.
Previous research with gene expression microarrays and QTL mapping analysis, identified 27 candidate genes, co-localizing with QTL for porcine litter size traits, which fulfill biological, positional, and functional criteria [55,56].These genes, which include CYP19A1, CYP2E1, RBP4, MSRB2, and SLC16A3, mainly encode specific proteins, hormones and cytokines.CYP11A1 (which belongs to cytochrome P450 E class group 1), RBP4 (retinol-binding protein 4), and SLC5A10, SLC7A11 & SLC6A20B (which belong to the solute carrier family) were also identified in our study as DEGs in pigs, suggesting that they may also be relevant to porcine prolificacy (S2 Table ).However, apart from these DEGs, the other candidate genes found previously to co-localize with QTL for porcine litter size were not identified in our study.This may be because of inter-species differences, or it may be a consequence of an insufficiently large sample size.Genes that are highly expressed in reproductive tissues or cells may also indicate that the gene itself is activated, as shown in previous studies [57][58][59].Therefore, we identified the 10 most highly expressed genes in the ovaries of high and low litter size pigs, respectively (Table 7).Eight of these genes (CO1, GPX3, MSMB, COX3, TIMP1, STAR, HSD3B, and CYTB) were upregulated in the high litter size samples.Some of them may be involved in specific reproductive processes.For example, previous research has shown that MSMB participates in spermatogenesis [60], and that HSD3B and STAR regulate specific gonadal development and hormone metabolism pathways [61][62][63][64].The other most highly expressed genes may play other important roles in biological processes.CO1 and COX3 genes have been extensively utilized in polymorphic and phylogenetic analysis in some species [65,66] and GPX3 is associated with certain cancers [67,68].Furthermore, in order to identify more candidate genes relate to porcine litter size, we emphatically analyzed the 21 candidate genes (S5 Table ) that may be associated with ovarian steroid hormone secretion.We showed that 19 of these 21 genes were all upregulated in the ovaries of pigs with high litter size.This suggests that the ovarian steroidogenesis pathway may be activated.These 21 genes mainly contribute to ovarian steroidogenesis and development of follicular cells.
Finally, by comparing the most highly expressed genes in the high and low litter size samples with the 10 most DEGs, and the 21 genes enriched in the steroid metabolic process and ovarian steroidogenesis, we identified a total of 11 candidate genes (CO1, GPX3, MSMB, COX3, TIMP1, CYTB, STAR, HSD3B, CYP11A1, SCARB1, and HSD17B2) relating to porcine fecundity and litter size.Therefore, the reproductive roles of these 11 genes in pigs should be further investigated in specific ovarian cells (such as theca and granulosa cells) in order to determine their functions both in vitro and in vivo.

Conclusion
This study screened for DEGs in the ovarian tissues of extremely high and low litter size Yorkshire pigs using RNA-Seq.We identified 897 genes that were upregulated and 346 genes that were downregulated in the high litter size samples.After analyzing the function of these genes, we found 11 DEGs that may be relevant to the prolificacy of pigs.This new information provides a solid foundation for further studies of the molecular mechanisms underlying porcine prolificacy.In the future, biochemical and physiological analyses of these candidate genes will be conducted.

3 .
Uses the Sus scrofa 10.2 as the reference genome annotation to classify the mapping tags into the different regions.Ratio was calculated by the number of tags on each region divided by the total tags on the whole genome.doi:10.1371/journal.pone.0139514.t002059 genes were expressed only in YL, and 17 083 genes were co-expressed in both libraries (Fig 1A).A total of 1 243 genes were differentially expressed between the two groups, in which 897 genes were upregulated and 346 genes were downregulated in the YH group (Fig 1B and S2 Table

Fig 1 .
Fig 1. Comparative results of gene expression levels and differentially expressed gene distributions between the ovaries of Yorkshire pigs with extremely high (YH) and low (YL) litter size.(A) Venn diagram showing genes only expressed in the YH group (yellow circle), only expressed in the YL group (light red circle), and common to both groups (intersection).(B) Scatter plot of differentially expressed genes (YH vs. YL).Red points represent upregulated genes with log 2 (fold change) > 1 and padj < 0.05 (-log10 (padj) 1.3); Blue points represent downregulated genes with log 2 (fold change) < -1 and padj < 0.05 (-log10 (padj) 1.3).Green points represent genes with no significant difference.Fold change = gene normalized expression of the YH group / gene normalized expression of the YL group.doi:10.1371/journal.pone.0139514.g001

Fig 2 .
Fig 2. Verification of the RNA-Seq results using the qPCR method.Fold change values greater than 2 and P < 0.05 indicate overexpression in the YH group, and fold change values less than 0.5 and P < 0.05 indicate overexpression in the YL group.Genes with asterisk differ significantly (P < 0.05).doi:10.1371/journal.pone.0139514.g002

Fig 3 .
Fig 3. Venn diagram for identification of key genes influencing porcine litter size.T10DEGs represents the top 10 most DEGs between the high and low litter size samples; T10YH represents the top 10 most expressed genes in the high litter size samples; T10YL represents the top 10 most expressed genes in the low litter size samples; E21OS represents the 21 genes enriched in the steroid metabolic process and ovarian steroidogenesis.Six, in the light green overlapping set surrounded by the red triangle, represents the CO1, GPX3, MSMB, COX3, TIMP1, and CYTB genes; two, in the brown overlapping set surrounded by the red triangle, represents the STAR and HSD3B genes; two, in the light brown overlapping set surrounded by the red triangle, represents the CYP11A1 and SCARB1 genes; and one, in the violet overlapping set surrounded by the red triangle, represents the HSD17B2 gene.Therefore, 11 genes (CO1, GPX3, MSMB, COX3, TIMP1, CYTB, STAR, HSD3B, CYP11A1, SCARB1, and HSD17B2) were present in two or more assemblies.doi:10.1371/journal.pone.0139514.g003

Table 1 .
Prolificacy characteristics of Yorkshire pigs with high and low litter sizes.

Table 2 .
RNA sequencing results of mRNA from the ovaries of Yorkshire pigs with high and low litter sizes.

Table 3 .
Detailed information on the top 10 most differentially expressed genes.

Table 5 .
The top ten significantly enriched Kyoto Encyclopedia of Genes and Genomes pathways from differentially expressed genes (DEGs).

Table 7 .
Detailed information of the top 10 most expressed genes in the YH and YL groups.