Fig 1.
Association mapping and fine-mapping of the major clinical mastitis resistance QTL on BTA6.
(A) Association mapping performed with imputed BovineHD variants on BTA6. The association signal near BTA 6:88.6 Mb, previously reported in other dairy cattle populations, was replicated in the current Dutch HF population. (B) Fine-mapping performed with imputed WGS variants in the BTA 6:84–93 Mb region. A strong association signal was found in a 200-kb region (BTA 6:88.5–88.7 Mb) spanning the GC gene. Our lead SNP (rs110813063, marked in green with a vertical dotted line) has not been reported as a candidate SNP in other CM fine-mapping studies. CM candidate SNPs from other fine-mapping studies are marked in yellow. (C) Conditional analysis including the GC CNV as a covariate nullifies the association signal.
Fig 2.
Discovery of multiallelic GC CNV using deeply sequenced genomes and familial structure.
(A) Schematic overview showing the lead associated SNP and the ~12-kb CNV overlapping with GC. GC is a reverse-oriented gene consisting of 14 exons, of which the last two are non-coding. Five CNV tagging SNPs are present within the GC CNV and are marked with black asterisks (the middle asterisk covers three tagging SNPs). Among them, the first SNP, which is also the lead SNP, is in perfect LD with the GC CNV (r² = 1), whereas the rest are in high LD (r² > 0.98). The hash marks in the upstream and intronic regions of GC indicate CM resistance candidate SNPs reported by others [13,14]. (B) The difference in sequencing depth between the CNV region and the normal region was used to infer copy numbers. (C) A histogram of fold change in read depth shows that the majority of animals have diploid copy numbers of 2, 5, and 8, with minor peaks at diploid copy numbers of 6, 7, 9, and 10. Based on these diploid CNs, we inferred haploid CNs of 1, 4, 5, and 6. The possible allelic combination(s) are shown above each diploid CN. The diploid CN10 could consist of either CN5/CN5 or CN4/CN6; however, our results showed that it was always CN4/CN6. (D) Familial information and background haplotypes were used to phase the copy numbers, revealing how the CNV segregates in trios. The upper family tree, drawn with animal symbols, shows diploid copy numbers, and the lower tree shows haploid copy numbers (the phased diploid CNs).
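The arithmetic behind panels (B) and (C) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the depth-to-copy-number estimator (twice the depth ratio, rounded) is an assumption, the function names are hypothetical, and the depth values in the usage line are invented. Only the haploid CNs 1, 4, 5, and 6 come from the caption.

```python
from itertools import combinations_with_replacement

# Haploid copy numbers named in the caption.
HAPLOID_CNS = (1, 4, 5, 6)


def diploid_cn_from_depth(cnv_depth, normal_depth):
    """Assumed estimator: a normal diploid region carries 2 copies,
    so the depth ratio is scaled by 2 and rounded to an integer."""
    return round(2 * cnv_depth / normal_depth)


def allelic_combinations(haploid_cns=HAPLOID_CNS):
    """Map each diploid CN to the unordered haploid pairs that sum to it."""
    combos = {}
    for a, b in combinations_with_replacement(sorted(haploid_cns), 2):
        combos.setdefault(a + b, []).append((a, b))
    return combos


combos = allelic_combinations()
for total in sorted(combos):
    print(total, combos[total])

# Hypothetical depths: 75x inside the CNV, 30x in the flanking region.
print(diploid_cn_from_depth(75, 30))
```

Diploid CN10 is the only value from 2 to 10 with two candidate pairs, (4, 6) and (5, 5), which is why phasing was needed to resolve it. The enumeration also yields 11 and 12, which the caption does not list among the observed peaks.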
Fig 3.
Characterization of the GC CNV tagging SNPs and allelic imbalance pattern.
(A) A schematic overview of the four structural haplotypes and the five tagging SNPs inside the GC CNV, shown together with allele frequencies. (B) The five GC CNV tagging SNPs, shown with their positions, rs IDs, alleles, and locations within GC. (C) Allelic imbalance pattern in Wt/Mul animals. At the five CNV tagging SNPs, these animals have more reads supporting the alternative alleles, so the tagging SNPs are called heterozygous but with strong allelic imbalance.
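The imbalance described in panel (C) can be illustrated with a small calculation. This is a sketch under a loudly stated assumption that the caption does not confirm: every copy on the multiplicated (Mul) haplotype carries the alternative allele, the single wildtype (Wt) copy carries the reference allele, and reads are sampled in proportion to copy number. The function name is hypothetical.

```python
from fractions import Fraction


def expected_alt_fraction(mul_copies, wt_copies=1):
    """Expected share of reads supporting the alternative allele in a
    Wt/Mul animal. Assumption: all Mul copies carry the alternative
    allele and the Wt copy carries the reference allele."""
    return Fraction(mul_copies, mul_copies + wt_copies)


# Haploid CNs of the multiplicated alleles named in the Fig 2 caption.
for cn in (4, 5, 6):
    print(cn, expected_alt_fraction(cn))
```

Under this assumption a heterozygous call would show an alternative-allele fraction well above the 1/2 expected for a balanced heterozygote.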
Fig 4.
Selection signature scan and trait association (clinical mastitis resistance and milk yield) plot.
(A) Zoomed-in view of a 10-Mb region with a strong selection signature signal (BTA 6: 84–93 Mb). Association mapping results from imputed WGS variants for CM resistance (dark blue) and MY (yellow) are shown in the upper panel; iHS results are shown in the lower panel (black). The CM resistance association peak lies to the left of the iHS peak, whereas the MY association peak lies to the right of it. The red vertical line marks the GC CNV. A 1-Mb region covering the GC CNV, the iHS lead SNP, and the MY lead SNP is marked in translucent blue. (B) The extended haplotype homozygosity of the 1-Mb region marked in panel (A) is shown, together with the four genes annotated in this region (top of the figure). The major haplotype, shown in the upper part (black), branches outwards, implying that recent positive selection acted on this haplotype. The non-selected haplotypes, shown in the lower part (blue), break down rapidly away from the iHS lead SNP. (C) Pairwise D’ and r² values between the GC CNV, the iHS lead SNP, and the MY lead SNP in the ~4,000 daughter-proven bulls. The panel was made using Haploview software [111].
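For readers unfamiliar with the two LD measures in panel (C), the standard definitions for a pair of biallelic loci can be sketched as below. This is not how Haploview was run for the figure: the function name is hypothetical, the frequencies in the example are invented, and treating the multiallelic GC CNV as biallelic (Wt versus Mul) is an assumption.

```python
def ld_stats(p_a, p_b, p_ab):
    """Return (D', r^2) for two biallelic loci.

    p_a, p_b: frequencies of allele A at locus 1 and allele B at locus 2.
    p_ab: frequency of the haplotype carrying both A and B.
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Invented frequencies: allele B occurs only on haplotypes carrying A.
d_prime, r2 = ld_stats(p_a=0.5, p_b=0.25, p_ab=0.25)
print(d_prime, round(r2, 3))
```

The example shows why both values are reported: D’ reaches 1 whenever one of the four possible haplotypes is absent, while r² stays low unless the allele frequencies at the two loci also match.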
Fig 5.
eQTL mapping and colocalization of fine-mapping and eQTL mapping results for GC and the non-coding RNA.
(A) A schematic overview of the GC gene structure and the position of the GC CNV. We detected two GC transcripts: the canonical form accounts for the majority of expression (98%), and an alternative form accounts for only a minor share (2%). (B) eQTL were mapped for the genes located in a 2-Mb bin (BTA6:87.68–89.68 Mb). Of the 13 genes annotated in this bin, GC was by far the most highly expressed (> 5,000 TPM), whereas the rest were lowly expressed or not expressed at all. eQTL were mapped for GC and SLC4A4. (C) CM resistance fine-mapping results are shown for the 2-Mb bin in which eQTL were mapped. The color scale indicates the degree of pairwise LD (r²) between the GC CNV and other SNPs. Gene annotations in this region are drawn as black bars. The six genes on the left are AMBN, JCHAIN, RUFY3, GRSF1, MOB1B, and DCK. (D) eQTL mapping results for GC (canonical transcript). (E) Correlation between P-values obtained from CM resistance fine-mapping and GC eQTL mapping (canonical transcript). The GC CNV is located in the upper right corner (ρ = 0.68), showing that it is significant for both fine-mapping and eQTL mapping. (F) The box plot shows altered GC (canonical transcript) expression depending on GC CNV genotype. (G) eQTL mapping results for GC (alternative transcript). (H) Correlation between P-values obtained from CM resistance GWAS and GC eQTL mapping (alternative transcript). The GC CNV is located in the upper right corner (ρ = 0.74), showing that it is significant for both fine-mapping and eQTL mapping. (I) The box plot shows altered GC (alternative transcript) expression depending on GC CNV genotype. Panels C–E, G, and H were made with the LocusCompare programme [99].
Fig 6.
Inspection of functional elements near the GC CNV.
Functional elements were inspected in the GC eQTL region using ChIP-seq (H3K27ac and H3K4me3) and ATAC-seq data. The GC CNV genotype of the ATAC-seq sample was Mul/Mul (inferred from CNV tagging SNP genotypes and the read-depth increase in WGS data). The GC CNV genotype of the ChIP-seq sample was unknown (S9 Table). (A) Zoomed-in view of the GC eQTL region. In this region, GC is the only annotated gene. The GC CNV is marked in translucent blue, and CM resistance candidate SNPs reported by other studies [13,14] are marked in translucent yellow. Other significant eQTL lead SNPs in this region are marked in translucent pink. We overlaid ChIP-seq data to identify putative enhancers and promoters (ChIP-seq tracks; red). Furthermore, liver ATAC-seq data revealed highly accessible chromatin regions, supporting the regulatory elements discovered with the ChIP-seq data sets (ATAC-seq tracks; blue). (B) We further zoomed in on the ATAC peak within the GC CNV and found that the ATAC peak overlaps with MER 115. The predicted hepatic transcription factor binding sites are marked in translucent grey. (C) Transcription factor binding motifs are shown together with the ATAC signal located inside the GC CNV.
Fig 7.
Summary of the key findings and hypothesis of physiological aspects linking GC expression and CM resistance.
A schematic overview summarizing the allele effects of the wildtype (CN 1) and multiplicated (CNs 4–6) alleles of the GC CNV, a likely causal variant for the major CM resistance QTL. The two alleles at the GC CNV locus lead to altered GC transcription, where the multiplicated alleles correspond to high GC expression. The bottom part shows the phenotypic association between the GC CNV and CM resistance, where the multiplicated allele is associated with low CM resistance. Finally, the grey-shaded area shows our hypothesis that the amount of DBP is positively related to GC expression. Further, we speculate that the amounts of DBP and free vitamin D are inversely correlated, because vitamin D bound by DBP is not biologically available. The solid arrows indicate relations based on our findings. The dotted arrows indicate relations based on our speculation.