Single Nucleotide Polymorphism Discovery in Bovine Pituitary Gland Using RNA-Seq Technology

Examination of bovine pituitary gland transcriptome by strand-specific RNA-seq allows detection of putative single nucleotide polymorphisms (SNPs) within potential candidate genes (CGs) or QTLs regions as well as to understand the genomics variations that contribute to economic trait. Here we report a breed-specific model to successfully perform the detection of SNPs in the pituitary gland of young growing bulls representing Polish Holstein-Friesian (HF), Polish Red, and Hereford breeds at three developmental ages viz., six months, nine months, and twelve months. A total of 18 bovine pituitary gland polyA transcriptome libraries were prepared and sequenced using the Illumina NextSeq 500 platform. Sequenced FastQ databases of all 18 young bulls were submitted to NCBI-SRA database with NCBI-SRA accession numbers SRS1296732. For the investigated young bulls, a total of 113,882,3098 raw paired-end reads with a length of 156 bases were obtained, resulting in an approximately 63 million paired-end reads per library. Breed-wise, a total of 515.38, 215.39, and 408.04 million paired-end reads were obtained for Polish HF, Polish Red, and Hereford breeds, respectively. Burrows-Wheeler Aligner (BWA) read alignments showed 93.04%, 94.39%, and 83.46% of the mapped sequencing reads were properly paired to the Polish HF, Polish Red, and Hereford breeds, respectively. Constructed breed-specific SNP-db of three cattle breeds yielded at 13,775,885 SNPs. On an average 765,326 breed-specific SNPs per young bull were identified. Using two stringent filtering parameters, i.e., a minimum 10 SNP reads per base with an accuracy ≥ 90% and a minimum 10 SNP reads per base with an accuracy = 100%, SNP-db records were trimmed to construct a highly reliable SNP-db. This resulted in a reduction of 95,7% and 96,4% cut-off mark of constructed raw SNP-db. Finally, SNP discoveries using RNA-Seq data were validated by KASP™ SNP genotyping assay. The comprehensive QTLs/CGs analysis of 76 QTLs/CGs with RNA-seq data identified KCNIP4, CCSER1, DPP6, MAP3K5 and GHR CGs with highest SNPs hit loci in all three breeds and developmental ages. However, CAST CG with more than 100 SNPs hits were observed only in Polish HF and Hereford breeds.These findings are important for identification and construction of novel tissue specific SNP-db and breed specific SNP-db dataset by screening of putative SNPs according to QTL db and candidate genes for bovine growth and reproduction traits, one can develop genomic selection strategies for growth and reproductive traits.


Introduction
In recent years, the development of RNA-seq technology has completely revolutionized the way of thinking in molecular biology, and allowed a more detailed insight into gene expression by examining expression level, sequence variation, gene structure, and strand-specificity simultaneously [1][2][3][4]. The bovine genome is about 3 GB (3 billion base pairs), and contains approximately 22,000 genes, of which 14,000 are common to all mammalian species. The genome assembly of a Hereford cow produced by the Baylor College of Medicine Human Genome Sequencing Centre (BCM-HGSC) has now allowed years of work in cattle QTL mapping to be associated with genes and other genomic features [5][6]. This high throughput genome sequencing and identification of abundant amount of DNA variants for the bovine genome have significantly facilitated the analysis of genetic variation among cattle breeds. However, very little is known regarding the transcriptomic expression of genes in cattle and the variation between breeds. However, very little is known regarding the SNP-driven transcriptomic expression of genes in cattle and the tissue level database of SNPs (SNP-db) that associate with expressed genes in the tissue. Therefore, importance of understanding single nucleotide variation in the mapped transcriptome is important to identify the trait-associated genomic mutations in shaping the phenotypes [7]. In this study, the pituitary gland transcriptomes of three cattle breeds: Polish HF, Polish Red, and Hereford were investigated. Polish HF and Polish Red are the native cattle breeds utilized both for dairy and beef productions, whereas the Hereford breed is used solely for beef production. It is well known that the pituitary gland transcriptome, due to its potential endocrinal role, has the major influence towards genetic improvement of economic traits, particularly the growth and reproduction trait. In general, the bovine pituitary gland is responsible for growth control at different stages of development. This involves complex interactions of several hormones and growth factors, acting in both an endocrine, and a paracrine or autocrine manner. In this context bovine growth and development in young animal ages, candidate genes such as the pituitary growth hormone (GH or somatotropin) and its receptor (GHR), insulin-like growth factors (IGF), IGF binding proteins (IGFBP) and the IGF-I receptor (IGF-IR), myogenic factors (MYFs), and somatostatin, plays an essential role in postnatal growth regulation, especially in nutrient utilization [8][9][10][11]. Generally, in cattle breeding practice and selection programs, the maximum genetic response is favorable for traits of economic importance, which includes growth traits.
In animal genomics, RNA-Seq could also provide a powerful tool for high-resolution genomic analysis of tissues and cell populations of domestic animals to identify novel mutations and transcripts affecting the major economic traits [12]. Gene expression thus is a useful marker for economic traits, suggesting that RNA-Seq could play a role in domestic animal breeding and selection programs viz., marker assisted selection (MAS), gene assisted selection (GAS) and genomics selection (GS) [13][14][15]. In animal genomics, detection of potential putative SNPs for economic traits has great potential in genetic improvement of all domestic livestock animals, including cattle. In the past decade, implementations of MAS, GAS and GS were highly recommended to the global cattle breeding program worldwide. In last 5-6 years, several studies have reported on the identification of novel SNP discoveries using RNA-seq technologies in cattle [16][17][18][19][20][21][22][23][24][25] (Table 1). To best of our knowledge, no studies have reported about the detections of SNPs in bovine pituitary gland except, some of our preliminary results presented at an international conference [26]. A brief summary of the global RNA-seq studies on the detection of SNPs in cattle is illustrated in Table 1. In this study, we used mRNA-Seq to characterize and compare the bovine pituitary transcriptomes of the Hereford, Polish Red, and Polish HF breeds, with respect to single nucleotide variations, CGs, QTLs regions and to identify the novel breed-specific SNPs.

mRNA sequencing and read alignment
To obtain the bovine pituitary gland transcriptome at a single-nucleotide resolution, two biological replicates of poly(A)-enriched mRNA of young bulls aged 6, 9, and 12 months from three cattle breeds were converted to barcoded strand-specific dUTP RNA-seq libraries, and sequenced on the Illumina NextSeq 500 sequencer. Sequencing generated a total of 113,882,3098 raw paired-end reads with a length of 156 bases. The reads were de-multiplexed to assign reads to each sequenced sample according to its index. An average of 63 million paired-end reads was obtained per library (n = 18). The FastQ sequence dataset of each library was submitted to NCBI-SRA experiment number SRS1296732 (http://www.ncbi.nlm.nih.gov/ sra?linkname=bioproject_sra_all&from_uid=312148). Using the BWA programme, breed-specific pituitary gland transcripts were mapped to the bovine reference genome (UMD3.1 assembly plus Y chromosome) and the results were summarized in Table 2 (Polish HF), Table 3 (Polish Red) and

SNP discoveries in cattle breeds
Using a stringent filtering parameter of read count (SAMtools package), with a minimum depth of 5, detection of putative SNPs were performed in three breeds. The Venn diagram is displayed to report the number of SNPs shared among three cattle breeds (Fig 1). For the detection of breed-specific putative SNPs, we utilized two stringent parameters, viz., a minimum 10 SNP reads per base with an accuracy 90%, and a minimum 10 SNP reads per base with an accuracy = 100%. The SAMtools identified single base substitutions (SNPs) as well as small insertions and deletions (indel); however, only SNPs results were considered to publish in this paper. A total of 13.7 million breed-specific SNPs positions were detected with the RNA-Seq reads, with an average of 765,326 (~0.76 million) SNPs per young bull (S1-S18 Tables). This high percentage of novel SNPs suggests that a large fraction of the genetic variability present in cattle breeds still remains to be discovered. Furthermore, based on the S1-S18 Tables, the raw SNP database (SNP-db) of pituitary gland transcriptome representing the investigated young bulls of the Polish HF, Polish Red, and Hereford breeds was constructed (S19 Table). Using two stringent filtering parameters, SNP-db records were trimmed to construct a highly reliable SNP-db and resulting in reduction of 95,7% and 96,4% cut-off mark of constructed raw SNPdb (S19 Table). SNPs (10reads 90%), 20,573 SNPs (10reads = 100%) for Polish Red breed, respectively. Furthermore, the SNP-db showed the highest percentage of 10reads 90% and 10reads = 100% filtered SNPs in Hereford breed (4.6% and 4.0%), followed by Polish HF breed (4.5% and 3.6%), and Polish Red breed (3.8% and 3.1%), respectively. Based on their types and occurrences, filtered SNP-dbs [S20 and S21 Tables] were further categorized into the following four sub-categories, as shown in S22 Table. 1. SNPs represented in any of six samples per given breed (SNP-type designated as ANY) 2. SNP detected in all six samples per breed (SNP-type designated as Breed-specific). The S22 Table showed that SNP categorization substantially reduced the SNP-db, resulting in highly reliable breed-specific, and breed-specific unique SNP-dbs. Moreover, the breed-specific UNIQUE SNP-db were further explored and illustrated into three S Tables for the Hereford (S23 Table), Polish Red (S24 Table), and Polish HF breeds (S25 Table), respectively. Finally, Eight putative SNPs from Polish HF were selected from one of the unique SNP-db (S25 Table), for SNPs validation experiment.
In the following sub-section of SNPs analysis, it was assumed that certain regions of Bos taurus genome are still unknown (base = N, according to recent mapping UMD3.1). After applying the both SNP filtering criteria (10 SNP reads with an accuracy 90% and = 100%), a total of 1039 best candidates of de novo SNPs reads were identified. (S26 and S27 Tables). Lastly, it was also assumed that certain regions of the reference Bos taurus genome still contained mutations or errors which were not correctly annotated. Such errors were corrected by analysing the SNPs database through successive annotations. After applying the filtering criteria of 10 SNP reads with an accuracy 90%, a total of 5.041 potential putative SNPs were identified, whereas, filtering criteria of 10 SNP reads with an accuracy of = 100%, a total of 567 potential putative SNPs was identified (S28 Table).
Breed-specific SNPs discoveries and QTLs/CGs analysis. For each breed, a set of 76 QTLs/CGs loci (http://www.animalgenome.org/cgi-bin/QTLdb/index) was comprehensively investigated (S29 Table). The chromosomal locations and SNP locations of identified putative SNPs loci for each breed representing three developmental ages were summarized in detail in supplement S30-S32 Tables respectively. This comprehensive QTLs/CGs analysis, allowed us to identify within breed and between breeds putative SNPs for each breed and developmental ages of young bulls (S33-S35 Tables). The results showed that most of the identified top 20 CGs were represented in all three breeds and developmental ages, however, numbers of SNPs hits loci varied across developmental ages. Furthermore, KCNIP4, CCSER1, DPP6, MAP3K5 and GHR genes loci were identified as top 5 potential CGs with highest SNPs hit loci in all three breeds and developmental ages, while CAST gene locus was identified as potential CGs expressed with more than 100 SNPs hit in Polish HF and Hereford breeds (S30-S35 Tables). Additionally, a phylogenetic tree [27] was constructed to see how similar the cattle within the same breed. Results revealed that all three breeds were clustered together for each breed and were separated from for each other (Fig 2). Thus, the study proved to indicate that SNP calling form RNA-seq data has been effective at identified genetic variation.

Breed-specific SNP Validation
In order to validate the breed-specific SNP discovery process, a subset of eight putative SNPs, identified as uniquely specific to the Polish HF breed, were evaluated using single-plex KASP™ assays (LGC Genomics), which are based on fluorescently labelled allele-specific PCR primers. The SNP identity with normal and mutant alleles, genome locations, SNP gene descriptions of selected SNPs are presented in S36 Table. An additional analysis for the estimation of direct relationship between breed and detected SNP polymorphisms was performed using the PROC MIXED SAS 9.2 package with the age as a random effect. Results revealed no significant association between developmental ages and SNPs selected for the validation. However, significant and highly association between breed and SNPs selected for validation were observed (S37 Table).
Based on this initial analysis, KASP™ SNP assay results were organized according to distributions of genotypes for each validated SNPs locus in all the investigated young bulls from three breeds (S38 Table). Furthermore, the distribution of and frequencies of SNPs alleles (S39 Table) results revealed the highest numbers and frequencies of mutant alleles in the SNP loci of all young bulls from the three breeds (n = 44), except, BTA4_77555734 and BTA19_27086284 loci. Moreover, the genotypes distributions of each validated SNP were further illustrated specifically for each breed, and are presented in S40 Table. Based on the observed genotypes and allele frequencies (S38-S40 Tables), the results revealed that the selected SNPs mutations were polymorphic to tested cattle breed population, and were in accordance with RNA-seq observations. Furthermore, a subset of eight breed-specific putative SNP makers from the Polish HF breed worked well to KASP™ SNP genotyping assay (LGC Genomics), and did not reveal either non-amplification or ambiguous clustering, except in few samples which did not respond to KASP™ assay because of poor DNA quality (Figs 3-10).
Based on the initial KASP™ SNP assays results (S38-S40 Tables), the statistical analysis of SNP validation was carried out by the Genepop software [28,29]. Using the Fisher's Exact Probability test, the genetic differentiation of SNP alleles and SNP genotypes results revealed significant differences in the SNP alleles frequencies (S41 Table), and SNP genotypes frequencies (S42 Table) among the investigated cattle breeds and SNP loci. Moreover, the comparison of allele and genotypic frequencies of selected SNPs among the investigated cattle breeds showed that highly significant differences between the Polish HF breed vs Polish Red breed, and the Polish HF breed vs Hereford breed, thus confirming the validation of selected Polish HF breed-specific SNP markers. Moreover, using a Markov chain method the SNP validations results were further confirmed by testing the deviation from Hardy Weinberg equilibrium (HWE) for each SNP locus (S43 Table), and each breed population (S44 Table). The results revealed highly significant differences in SNP genotypes and alleles frequencies among the cattle breeds across all eight SNP loci and all three breeds.

Discussion
In cattle genomics, the key challenge in SNP discovery is to distinguish true individual variants (occurring at a low to very low frequencies) from sequencing errors (which often occurs at higher frequencies). Breed-specific SNP discovery is critical for successful implementation of GAS and GS programs for genetics improvement in economically important traits [10][11][12]. With increasing attention being paid to genomic selection based on the next generation sequencing (NGS) by animal breeders, the primary goal of genome-wide SNP discovery using NGS is to identify large number of markers that provide complete set of genetic variants for economic traits. Herein this paper by using the NGS based RNA-seq strategy, the pituitary gland transcriptome analysis of the three cattle breeds was accomplished by performing the alignment of mapping read, followed by detections of breed-specific SNPs, and finally the validation of SNPs.

Alignment of mapping reads
The mapping of sequence reads was performed with BWA (STAR alignment tool) using their default settings on the same Linux based servers. The study of Wang and colleagues suggested both Bowtie and BWA tools could be utilized for fast and efficient alignment of Illumina short reads [30]. However, other study [31] found that BWA was superior to Bowtie at mapping short reads. For example, a higher percentage of read mappings were achieved using BWA (84.8%) compared to Bowtie (62.3% including the suppressed reads) when Illumina reads of eight genotypes were mapped onto the reference sequence. Therefore, in this paper, the BWA based assembly was chosen for downstream analysis because it produced higher percentages of mapped reads.
In cattle genomics, very few tissue-specific RNA-seq experiments on SNP discoveries were reported. According to the PubMed literatures, no RNA-seq experiment on SNP discoveries in bovine pituitary gland transcriptome has been conducted so far, besides our preliminary results [26]. A previous study on the longissimus thoraci transcriptome from three Limousine bull calves revealed approximately 36-46 million paired-end reads per library, and 63% to 67% of the mapped reads were aligned properly to paired end with almost negligible transcriptome contamination, using the BWA program [22].
By sequencing the milk transcriptome mapped uniquely onto the bovine genome, approximately 65% of the RNA-Seq reads were successfully mapped and aligned to the reference genome, and approximately 17,000-19,000 genes were expressed in milk [32]. In an another bovine RNA-seq experiment on rumen epithelium tissue, the study of Baldwin and co-workers found that~71% of the reads were mapped onto~17,000 different genes of Bos taurus reference genome [33].
A previous study of RNA-seq study on bovine milk transcriptome identified 118 million reads of 36 to 40 bp size, averaging 17 million short-sequence reads for each milk sample. Study further revealed that approximately 70% (82.7 million) reads were properly mapped and remaining 35 million were unmapped. Approximately 87.5% of uniquely mapped reads corresponded to total exon reads, and a small remaining fraction corresponded to total intron reads [16].
While analysing the bovine embryo, the RNA-seq experiment on single bovine blastocysts reported approximately 38 million sequencing reads per embryo, and 9,489 known bovine genes expressed with high correlations of expression levels between the samples. [20]. In another study using multiple tissues (i.e., hypothalamus, pituitary gland, ovary, uterus, and endometrium, longissimus dorsi muscle, adipose, and liver) identified an average of 30 million sequences reads from each sample tissue, and mapping of approximately 70 to 80% of the sequence reads to the 27,368 annotated genes in the bovine genome assembly UMD3.1.74 [23].
While analysing the RNA-seq experiment of bovine leukocyte transcriptomes from two taurine breeds (Holstein and Jersey), and one indicine breed (Cholistani), a total of approximately 21 million paired-end fragments were sequenced for each of the three breeds. The Tophat results revealed that an approximately 70% of the sequenced fragments were successfully aligned. The study further concluded that more than 90% aligned fragments were mapped to unique genomic regions [18].
Recently, the Qinchuan beef cattle transcriptome analysis in context to longissimus muscle growth and development identified approximately 2 billion pair-end reads of 100 bp in length for bovine longissimus muscle from three embryos and three adult bovines. For Emb135 and 30M samples, approximately 79% and 77% pair-end reads were mapped to the UMD 3.1 reference genome [19].

Breed-specific SNP discoveries
The alignment file (.gz and.txt format) generated by BWA was used as input files for bovine pituitary gland SNP discovery using SAMtools [34]. The SNP detections were performed using two different stringent parameters of SNP filtering process approaches. The overall constructed SNP database for three breeds yielded approx. 13.7 million of SNPs. While, after filtering to different stringent parameters, 0.59 million and 0.49 million SNPs were detected for the stringent parameters of with minimum 10 SNP reads per base and with an accuracy of 90% and = 100%, respectively.
In an analysis of the transcriptome, utilizing RNA-seq of bovine Longissimus thoraci from three Limousine bull calves a total of 34,376 different SNP were detected [22]. Amongst these SNPs, 8,974 (26%) were homozygous in all three sequenced samples, corresponding presumably to differences between Limousine and Hereford. This study also found that there were 30,998 bi-allelic SNPs mapping to coding regions, 38.6% of which were previously found and recorded in dBs [22].
In a study on milk transcriptome, approximately 100,000 SNPs located in genes expressed in milk samples from the Holstein cows were identified [16]. However, only 32% of SNPs (33,045) were polymorphic within their seven Holstein cows. In another study, [23], putative SNPs within three investigated cattle breeds i.e., the Holstein, Jersey, and Cholistani were reported. Results showed that among 22,100,344 surveyed bases in Holstein, 32,547 (0.147%) were reliably called SNPs, and 39,370 out of 19,967,581 (0.197%) bases were polymorphic within Cholistani. Using different SNPs variant calling tools, a total of 31,334 and 30,618 putative SNPs were identified for the longissimus pooled samples of adults and pooled samples of embryo. For both samples, the SNP detection stringency conditions of i) at least two unique mapping reads that support the polymorphic nucleotide, and ii) a quality score of 20 were utilized [19]. The potential applications of RNA-Seq present unique benefits in terms of SNP analysis because of its wide dynamic range and ability to identify functional sequence variants. The detection and categorization of SNPs within animal production systems have been performed extensively [15]. The use of these genetic variants as markers and predictors of performance in a large variety of economic traits (i.e. meat production, milk production, fertility, calving ease, etc.) is quite common in animal breeding practices [35][36][37]. In the context of early body growth and development trait, classifying SNPs between samples of varying viability, ages, or breed allows for discovery of novel markers of growth traits and characterization of critical regulatory mechanisms of body growth and development. Traditionally, the use of transcript sequence for SNP variant detection has been performed with various assays [38], and more recently in cattle using RNA sequencing data [16].

Conclusions
This is the first study to report breed-specific SNPs discovery using NGS based RNA-seq in bovine pituitary gland. The study generated nearly fourteen million breed-specific SNP-db based on the expressed genes of the bovine pituitary gland, with mapping accuracy comparable or better to previous works in Bos taurus. These databases of breed-specific SNP as well as identified putative SNPs within the QTLs/CGs for bovine body growth and development trait will improve the genomic resources available for cattle, especially for beef breeds, and may also prove useful to study the mechanisms underlying the genetic variability in meat quality and reproduction traits. KASP™ SNP assay was proven to be an efficient cost effective method to validate the breed specific putative SNPs originating from RNA-seq experiments.

Materials and Methods Animals
The bovine pituitary gland samples from young growing bulls were collected from Institute of Genetics and Animal Breeding, Polish Academy of Science (PAS), Jastrzębiec, Poland. After slaughtering of the animals, the collected tissues were immediately kept in liquid nitrogen, and finally stored in deep freezer at -80°C. All procedures involving animals were performed in accordance with the guiding principles for the care and use of research animals, and were approved by the local ethics commission (permission No. 3/2005) of Institute of Genetics and Animal Breeding, Polish Academy of Science (PAS), Jastrzębiec, Poland.
The experimental design comprised of 18 young bulls for RNA-seq experiment, and 44 young growing bulls for SNP validation experiment (Table 5), in a panel of three selected cattle breeds: Polish HF, Polish Red, and Hereford, respectively. All experimental animals were reared at Institute of Genetics and Animal Breeding, Jastrzębiec, Poland in a closed herd, and providing uniform feeding and environmental conditions.

Laboratory procedures
Total RNA was extracted and prepared from 50-60 mg of frozen bovine pituitary gland tissues (n = 18) using the guanidinium thiocyanate method [39] (TRIzol reagent: Invitrogen, Carlsbad, CA, USA). Preliminary RNA samples were evaluated with the Agilent BioAnalyzer using the Nano RNA Kit. Only samples with RNA Integrity Number (RIN) > 7.0, and 5 total μg were used for library preparation, and two biological replicates were used for each age/breed group. The mRNA isolation was carried out by using the Dynabeads 1 mRNA Direct™ kit (Thermo Fisher), and following recovery, polyA RNA was evaluated using the BioAnalyzer Nano mRNA assay, verify that samples contain less than 2.0% rRNA. For dUTP directional mRNA libraries preparation, 25-100ng mRNA was fragmented by chemical hydrolysis, converted to first strand cDNA with random hexamers, and second strand synthesized with dUTP according to the NEBNext Ultra Directional RNA library preparation Kit for Illumina (New England Bio Labs). The cDNA fragments were end-repaired, A-tailed, and ligated to the TruSeq y-tail single indexes from Illumina TruSeq DNA kit. Indexed libraries were cut with USER enzyme, and PCR amplified for 12 cycles. Finally, to achieve the highest quality data on Illumina sequencing platforms, optimum cluster deposition was made by quantitation of libraries using qPCR according to the Illumina Sequencing Library qPCR Quantification Guide (Kapa Biosciences). 156x156 bp paired-end sequence reads were generated using the Illumina NextSeq 500 platform High Output/300 cycle kits from Illumina.

Bioinformatics analysis
Pre-processing analysis of RNA-seq data and sequence quality control (QC) was performed prior to read alignment to reference genome. For all libraries, the adaptor sequences were removed using the cutadapt software with minimum overlap length was set to 10 and error rate was set to 0.05 [40]. After adaptor trimming, the low quality bases were trimmed from 3'end followed by sequence quality control (FastQC). The post processing analysis of RNA-seq data was performed i) to map the sequencing read (read alignment), as well as ii) to discover the breed-specific putative SNPs.
2. Breed-specific discoveries of novel putative SNPs in Polish HF, Polish Red, and Hereford cattle. For the analysis of SNPs discoveries (variant calling), the SAMtools mPileUp package to call SNPs and indels [34] was utilized to detect the putative breed-specific SNPs in bovine pituitary gland transcriptome. Using Microsoft Office Excel, the following two stringent parameters of SNP filtering were performed.
• Stringent parameters of SNP filtering with minimum 10 SNP reads per base and with an accuracy of 90%.
• Stringent parameters of SNP filtering with minimum 10 SNP reads per base and with an accuracy = 100%. Similarly, using Microsoft Office Excel the breed-specific SNP database of bovine pituitary gland were further trimmed to a highly reliable breed-specific SNP-db according to SNP type, de novo SNP and annotated SNPs, respectively.
3. Breed-specific SNPs discoveries and QTLs/CGs analysis. The publicly available animal quantitative trait loci (QTL) database (Animal QTLdb: http://www.animalgenome.org/cgibin/QTLdb/index) representing trait associated QTLs as well as candidate gene and association data (GWAS) mapped to bovine genomes was utilized to find putative SNPs within the candidate genes for bovine body growth and development trait [43]. A total of 76 potential QTLs/CGs (S29 Table) for bovine body growth and developmental trait was selected to investigate the RNA-seq SNP-db. To identify the putative SNPs within the CGs loci, stringent SNP filtering was carried out on both RNA-seq and bovine QTL databases using Microsoft Office Excel. KASP™ assays design and SNP validation. Eight breed-specific SNPs (Polish HF) were selected for KASP™ SNP assays validation experiment. The primer sequences are presented in Table 6. The genomic DNA of 44 young bulls from Polish HF, Polish Red, and Hereford were isolated from pituitary gland tissue using the MasterPure™ DNA purification kit with some modifications (Epicentre). All 44 DNA samples were then shipped to LGC genomics Teddington, Middlesex, TW11-0LY, UK (http://www.lgcgroup.com) to perform the KASP™ SNP genotyping assay including primer design. For each of selected breed-specific SNP marker, two allele specific forward primers and a common reverse primer were designed for use in fluorescence based competitive allele-specific PCR assays. Primers designed were done by LGC genomics, UK, by blasting the region of +200bp and -200pb around the nucleotide variation using the PrimerPicker [44]. The KASP™ genotyping reaction and PCR thermocycling conditions were carried out according to manufacturer's recommendations (LGC Genomics, UK). PCR reaction mixture was carried out in a 10 μl-volume consisting of 2.5 μl DNA (10 ng/μl), 2.5 μl 3. Genic and genotypic differentiation was tested for each locus for all populations, and for each population pair across all loci.
4. Significance of differences in allele frequencies between samples were determined by the Exact Fisher's method and the Markov Chain procedure [28], and the Fisher combined test was computed as a global test over loci to determine the overall significance.
5. Deviation from Hardy-Weinberg equilibrium for each population was calculated by a probability exact test using a Markov Chain method [29] as implemented in Genepop.
Supporting Information S1  Table. RNA-seq SNP-db of bovine pituitary gland tissue of young bull-9 of Polish Red cattle aged 9 months. (XLSX) S10 Table. RNA-seq SNP-db of bovine pituitary gland tissue of young bull-10 of Polish Red cattle aged 9 months. (XLSX) S11 Table. RNA-seq SNP-db of bovine pituitary gland tissue of young bull-11 of Polish Red cattle aged 12 months.
(XLSX) S12 Table. RNA-seq SNP-db of bovine pituitary gland tissue of young bull-12 of Polish Red cattle aged 12 months.
(XLSX) S13 Table. RNA-seq SNP-db of bovine pituitary gland tissue of young bull-13 of Polish HF cattle aged 6 months.
(XLSX) S14 Table. RNA-seq SNP-db of bovine pituitary gland tissue of young bull-14 of Polish HF cattle aged 6 months.
(XLSX) S15 Table. RNA-seq SNP-db of bovine pituitary gland tissue of young bull-15 of Polish HF cattle aged 9 months. (XLSX) S16 Table. RNA-seq SNP-db of bovine pituitary gland tissue of young bull-16 of Polish HF cattle aged 9 months. (XLSX) S17 Table. RNA-seq SNP-db of bovine pituitary gland tissue of young bull-17 of Polish HF cattle aged 12 months.
(XLSX) S18 Table. RNA-seq SNP-db of bovine pituitary gland tissue of young bull-18 of Polish HF cattle aged 12 months.