Pleiotropic Effect of Common Variants at ABO Glycosyltranferase Locus in 9q32 on Plasma Levels of Pancreatic Lipase and Angiotensin Converting Enzyme

For forty-three clinical test values presumably associated to common complex human diseases, we carried out a genome-wide association study using 600K SNPs in a general Japanese population of 1,639 individuals (1,252 after quality control procedures) drawn from a regional cohort, followed by a replication study for statistically significant SNPs (p = 1.95×10−9–8.34×10−39) using an independent population of 1,671 from another cohort. In this single two-stage study, we newly found strong and robust associations of common variants at the ABO histo-blood glycosyltransferase locus in 9q32 with the plasma levels of pancreatic lipase (P-LIP), in addition to successful confirmation of the known ABO association of angiotensin converting enzyme (ACE) independent of the ACE1 gene in 17q23.2 with the ACE level. Our results are compatible with the previously reported association between the ABO gene and pancreatic cancer, and show that the effect of these common variants at the ABO locus on the P-LIP and ACE levels is largely opposing and pleiotropic.


Introduction
Genome-wide association studies using hundreds of thousands of single nucleotide polymorphisms (SNPs) have been revealing important genetic components underlying the common complex human diseases [1], even though their effect sizes are so modest or small as not to account for the original heritabilities of diseases [2]. In addition to such dichotomous traits, some quantitative characteristics such as body mass index (BMI), blood pressure or various kinds of clinical test values in general human populations are also attractive targets for genome-wide association study [3], [4] which are sometimes called as intermediate phenotype, endophenotype or biomarker presumably correlated to unobservable liability of diseases that has long been utilized as a theoretical tool to estimate diseases heritability [5]. With such quantitative endophenotypes underlying the common complex human diseases, association studies could be much more informative and powerful than with dichotomous traits themselves [6].
In order to identify genetic components affecting quantitative clinical test values, we carried out a population-based genomewide association study and a subsequent replication study for the statistically significant SNPs beyond a genome-wide significance level (5610 28 ) or the Bonferroni's corrected level by the number of phenotypes (5610 28 /43). For this two-stage design, we utilized two independent sample populations in Yamagata Prefecture located in the northeastern district of Japan; one from a regional cohort established in a small rural town, Takahata Town, for the 1st genome-wide genotyping, and another from a different cohort in the largest urban capital of the prefecture, Yamagata City, for the replication.

Results
Genome-wide genotyping in the 1st stage By applying standard quality control procedures (see the Methods for details) to the genome-wide genotyping data obtained using 600K SNP BeadChip (Illumina) in the Takahata population of 1,639 individuals, we eliminated low quality SNPs (i.e. low minor allele frequency, high missing rate or deviation from the Hardy-Weinberg equilibrium) and individuals with unusual statistics (i.e. high missing rate, high heterozygosity or cryptic relatedness) as well as potential population stratification [1], to have a high quality data set consisting of 436,670 SNPs in 1,252 individuals with 43 endophenotypic values (see a detailed list in the legend of Figure 5). By applying a standard linear regression analysis for each SNP in this data set with adjustment for (i.e. elimination of the potential confounding effect of) age and gender as covariates, we found strong associations of nine common variants at the ABO histo-blood glycosyltransferase locus in 9q32 with two endophenotypes, the plasma levels of P-LIP (Genomic inflation factor based on median chi-squared = 1.013) and ACE (1.011) (Figure 1) Table 2. Using this 1st data, we imputed unobserved variants on chromosome 9 based on data from 1000 Genomes project and test the effects of imputed variants on the ABO locus on the P-LIP and ACE levels (Tables S1 and S2 in File S1). These results show that there is no variant having lower p-value than that from the real data.
Previously reported associations of ABO with red blood cell related traits [4] or liver enzymes [7] in Asian populations were not found in our populations possibly due to specific factors to each study (c.f. sample size, local/global differences of populations, and/or measurement error inherent in such values). Moreover, although, in this 1st screening using the SNP array in our population, some SNPs were suspected around the genome-wide threshold for other phenotypes, we first focus on this only simultaneous effect of P-LIP and ACE to perform then a replication study in independent sample set.
Confirmatory genotyping in the 2nd stage By genotyping of these associated SNPs using a custom BeadChip (Illumina) in a different sample population of 1,671 individuals from the Yamagata cohort, we successfully confirmed all these associations at highly significant levels ranging from p = 6.58610 211 (rs500498) to 1.33610 221 (rs8176746) for P-LIP and from 2.96610 210 (rs2073824) to 8.34610 239 (rs495828) for ACE in 9q32, and also from 1.13610 229 (rs4461142) to 5.21610 2107 (rs4362) for ACE in 17q23.2, as listed in Tables 1  and 2. Of these SNPs, rs2073824 had a discordant minor allele between the Takahata and Yamagata populations probably due to poor clustering in the 2nd assay (Figures S1 and S2 in File S1), so we eliminated it for further analyses.

Adjustments for potential confounders
These associations were robust to additional adjustment for alcohol intake (for P-LIP) or smoking (for ACE) as potential confounders, and to the Box-Cox transformations for normalizing these endophenotypic distributions having a long-tail (Tables S3  and S4 in File S1). In addition, because the most common and largest potential confounder is treatment of hypertension such as  administration of some ACE inhibitors, we tried to control it by adjustment for the dichotomized treatment history for hypertension (i.e. treated or not) along with age and gender, and then found that these associations were robust to this adjustment and even to a subgroup analysis in which certainly untreated individuals in both populations (840 in Takahata and 844 in Yamagata) were combined in a single group to allow direct comparisons with foregoing analyses because of its compatible sample size (1,684) with the original 1st or 2nd populations (Tables S5 and S6 in File S1).

Association pattern and linkage disequilibrium structures
The detailed association patterns and linkage disequilibrium (LD) structures within about 180 kb of genomic segments around the ABO and ACE1 loci suggested that these genes themselves, rather than adjacent genes, affect each endophenotype in the two independent populations (Figures 2 and 3). Moreover, these patterns are very similar between the 1st and 2nd populations, implying that there is no serious bias from potential population structure.

ABO association with the P-LIP level
As the results of this two-stage study, we newly identified the robust and strong associations of eight common variants at the ABO locus with the plasma level (i.e. leaking level) of P-LIP potentially correlated to some pathological status of the pancreas. For example, the most significant SNP, rs8176746 that causes a nonsynonymous amino-acid change (Leu266Met) chiefly dividing between the B and other lineages (Table S7 in File S1), can account for about 6.7 percent of the total variance in the P-LIP level in the Takahata population (8.3% of that in the Yamagata), in terms of degrees of freedom adjusted R squared (ARS; also called coefficient of determination) listed in Table 1. Intriguingly, a robust association of the ABO histo-blood group with pancreatic cancer reported in earlier epidemiological studies has been established using a modern genome-wide association study [8].
Our finding that T (complementary A) allele of rs505922 is protective against the elevation of the P-LIP level is compatible with their results that the T allele in complete LD with the O allele may express a low risk of pancreatic cancer.

ABO association with the ACE level
In our single two-stage study, we successfully confirmed the known ACE1 and ABO associations with the ACE level [9] after adjustments for possible confounders. No SNP pair from these two loci in 9q32 and 17q23.3 shows strong linkage disequilibrium (D9 = 0.0019-0.15; r 2 = 8.47610 27 -0.00576 in Takahata; 0.0061-0.16 and 3.50610 25 -5.79610 23 in Yamagata) and has any joint effect (i.e. interaction) on the ACE level (p = 0.011-0.996 in Takahata; 0.0017-0.934 in Yamagata), indicating that they are independent of each other; the most significant SNP, rs4362, at the ACE1 locus can account for a large percentage of the total variance of the ACE level (31.4% in Takahata and 25.4% in Yamagata) and also rs495828 in ABO can independently account for a considerable proportion (8.7% in Takahata and 10.2% in Yamagata).

Pleiotrophism at the ABO locus
Box-plots of the plasma levels of P-LIP and ACE per each genotype of two significant SNPs at the ABO locus, rs8176749 and rs8176746, shared by both the endophenotypes, show that their effects are apparently parallel with respect to two traits, which tend to elevate the levels of both the enzymes with increase of minor allele in each of these SNPs (Figure 4a). One might, therefore, regard them as mere secondary effects resulting from a possible single pathologic change (i.e. confounding) such as renal insufficiency or hypertension. However, the P-LIP and ACE levels share no strong correlation to any of other endophenotypes including the creatinine level, urinary albumin and blood pressure in the Takahata population ( Figures 5 and 6), and none of these endophenotypes show any strong association signal on 9q32 and 17q23.2 ( Figure S3 in File S1), together suggesting that potential confounding is vanishingly small, if any, among them. Moreover, it is important to note that three of the remaining SNPs at the ABO locus, rs657152, rs500498 and rs505922, appear to have an opposite effect on the P-LIP and ACE levels, although the effect attained a genome-wide significance level only for P-LIP even in the combined population (Tables S8 and S9 in File S1); if one is elevated with an increase of a minor allele of any of these SNPs, the other would be reduced (Figure 4a). The lack of the association of these SNPs with the ACE levels, which are located in an anterior part of this gene near a variant determining the O group, may result in a largely discordant association pattern of the genomically-deduced ABO histo-blood groups with each enzyme level ( Figure 4b); for example, individuals with the OO group, which has been well known as being protective against pancreatic cancer [8], is similarly protective (p,0.001 in one-way ANCOVA) against the elevation of the P-LIP level compared to AA, but, in contrast, susceptible (p,0.001) to that of the ACE level. Together with the fact that there is no obvious correlation (Pearson's correlation coefficient r = 0.077) between the P-LIP and ACE levels even after eliminating the effect of the ACE1 locus, we conclude that the parallel effect of these common variants at the ABO locus on these endophenotypes is pleiotropic, rather than two different outcomes from a primary pathologic change, Figure 4. ABO locus and PLIP/ACE levels. (a) Box plots for the P-LIP and ACE level per each genotype of the significant SNPs at the ABO locus in the combined population of Takahata and Yamagata with adjustment for age and gender as covariates. These plots show the difference and direction of effects of each SNP on P-LIP and ACE. (b) One-way ANCOVA for the means of the P-LIP and ACE levels per genomically-deduced ABO group in the combined population with adjustment for age and gender as well as rs4356 (the effect of ACE1), in which AA is set as the baseline group in the corner-point parameterization denoted as ''base''. Asterisks denote that the significance level reached in the one-way ANCOVA (*** ,0.001). doi:10.1371/journal.pone.0055903.g004 according to a traditional definition as ''a number of distinct and seemingly unrelated phenotypic effects of a single gene [10].'' In addition, we applied a phenotype defined as the ratio of ACE and P-LIP values, namely ACE/P-LIP, to calculate the P-gain statistics [11], in order to assess the direction of the effect of these SNPs on the two traits. As listed in Table 3, of 17 SNPs originally associated with P-LIP or ACE, 14 SNPs passed the P-gain threshold (.43; the number of all tested traits) for both 1st and 2nd stage samples, suggesting that they have opposing effect to P-LIP and ACE respectively. In contrast, three remaining SNPs, rs8176749, rs8176746 and rs2073824, failed to pass this threshold because they have the same elevation direction between P-LIP and ACE. These results suggest that the pleiotropic effect of the ABO locus on P-LIP and ACE is partly opposing.
Following to the Ahn et al.'s (2012) approach [12], we performed multivariate linear regression model by using the clumped SNP group as predictors, where age and sex are set as covariates. As listed in Table 4, here appears no large difference of the results between untransformed and optimally Box-Cox transformed traits. For effects of the ACE1 on ACE, stringent significance appeared at G1 (P = 3.44610 2105 for untransformed; P = 1.04610 2103 for Box-Cox transformed), while G2 showed moderate significance (P = 7.59610 23 ; P = 6.73610 23 ) and G3 was not significant (P.0.05; P.0.05). For effects of the ABO on ACE, stringent significance appeared at G4 (P = 1.55610 227 ; P = 1.23610 226 ), G5 (P = 2.06610 214 ; P = 9.58610 214 ). For the ABO locus on P-LIP, G6 shows strong significance (P = 3.92610 214 ; P = 4.46610 217 ). The above results possibly means that these four clumped SNP groups (G1 and G2 on the ACE1, G4 and G5 on the ABO) have independent effect on ACE, while G6 on the ABO has independent effect on P-LIP. Of these groups, G5 and G6 share rs8176746 (associated with B blood as an alternative to O/A) but rs505922 (with O). On the other hand, G4 share no same SNP with G5 or G6, so it is independent. G6 may alternatively harbor converse effects by rs8176746 associated with B blood or by O with rs505922 on P-LIP. These results are largely consistent with results from single SNP analyses and ANCOVA for ABO blood type (Figure 4b), although there appears to be some difficulties for interpretation due to potential increase of variability through the multiple application of various kinds of statistical estimations.

Discussion
By the two-stage study using two independent sample populations, we identified the association of rs505922 at the ABO locus on the elevation of the leaking P-LIP. Because this finding is highly compatible with the previous report of the protective effect of T allele of rs505922 (i.e. O group) with pancreatic cancer, one might regard the elevated enzyme levels with increase of alternative C (complementary G) allele of rs505922 as a simple consequence from the tissue damage by preclinical pancreatic cancer. However, given the relatively low incidence of pancreatic cancer (e.g. approximately 10 per 100,000) [13], it is unlikely that as large as 5-8.3 percent of the total variance in the P-LIP level in the general population could be explained by preclinical pancreatic cancer. Rather, our findings may enable us to hypothesize that the ABO blood group affects some unknown common status in the pancreas (e.g. pancreatic steatosis [14], [15]), yielding the variation of the plasma level of the leaking enzyme, prior to irreversible pathological changes such as cancer development and progression promoted by additional risk factors in a small fraction of the susceptible population, although causal relationships between the distribution of the P-LIP level in the general population and the pathogenesis of pancreatic cancer (see ref. [16], [17]) are still obscure. Our findings would warrant further investigations of causal inference between the variation of the P-LIP level and cancer development due to the ABO gene, using prospective data in our cohorts, coupled with detailed echographic examinations (or other imaging technologies) and more mechanistic studies, in order to test our hypothesis of the unknown common status in the pancreas.
Many earlier serological studies and recent genomic studies have reported that the ABO histo-blood groups show weak but consistent associations with a large number of traits including red blood cell phenotypes [4], soluble Intercellular adhesion molecule 1 level [18], stomach cancer [19], diabetes [20], pancreatic cancer [21], and duodenal ulcer [22], together implying the possibility that this gene has a pleiotropic effect. Our findings are the first demonstration of the pleiotropic effect of common variants at the ABO locus on the plasma levels of P-LIP and ACE using a single genome-wide data at the genome-wide significance level, even though the clinical significance of the P-LIP and ACE levels is not so straightforward. Including the opposing (i.e. ''antagonistic'') effects on the P-LIP and ACE levels, the pleiotropism at the ABO locus may be supported by an extensive report of balancing selection on this gene [23], because such opposing effects could reduce the efficiency of negative selection pressure probably in combination with particular environmental perturbations, as suggested for several immune diseases [24], [25]. Moreover, such a pleiotropism seems to be widespread across a variety of diseaserelated genes such as particular alleles of major histocompatibility complex (MHC) [23], apolipoprotein E (ApoE) [25] and nucleotide-binding oligomerization domain protein 2 (NOD2) [26], implying the limitations (e.g. a serious source of drug sideeffects) on a conventional strategy for preventative and therapeutic interventions, based on knowledge from any single effect of genes related to common complex human diseases, as have been mentioned in the context of system biology for molecular pathway (i.e. cascade/network) [27], [28].
So far, no mechanistic studies have successfully deciphered the functional basis for the ABO associations. However, the modern methodologies including extensive applications of genome-wide association study to genomic cohorts would provide new insights into the biochemical/biological basis that could promote novel mechanistic experimentations, as in a lesson from a series of recent studies for the entangled MHC associations with a large number of traits far from adaptive immune system that has been revealing the hidden roles and pathways consisting of functional interactions among MHC and diverse molecules [23], [29], [30].

Subjects
The Takahata cohort was established for a baseline survey in a small rural town, Takahata Town, of Yamagata Prefecture in 2006-2008, whose total population size has been approximately 25 thousands through this period, whereas the Yamagata cohort is now ongoing in the urban prefectural capital, Yamagata City, having approximately 250 thousands residents, becoming part of our large genomic cohort initiative. For the 1st stage, we used genomic DNAs with 43 clinical test values (listed in the legend of Figure 5) from 1,639 individuals who completed the questionnaire for environmental exposures and informed consent for our modern prospective genomic cohort study. For the 2nd stage, from the Yamagata cohort, we collected 1,672 individuals enrolled in multiple sites as different as possible in order to avoid contamination of cryptic relatedness. The age and gender compositions are similar between them; average age is 61.3+/2 10.2 (male:female = 1:1.24) for Takahata and 63.3+/28.6 (1:1.13) for Yamagata. The detailed procedure of the DNA extraction has been described elsewhere [31]. This study was carried out under the approval by the ethical committee at Yamagata University and all other institutions involved.

Genotyping
Using genomic DNAs from the Takahata population, we carried out genotyping for 657,366 SNPs by Infinium Assay with Human660W-Quad BeadChip (Illumina) according to the standard procedure provided by Illumina. After eliminating probes for copy number variation (CNV) detection, we applied a standard SNP quality control (QC) filter to this genome-wide data; SNP locus missing rate (lmiss) .0.05, MAF ,0.01 and HWE p-value, 0.000001. We also adopted a standard sample QC filter; individual missing rate (imiss) .0.01 and high heterozygosity . 0.35. In addition, in order to avoid any bias from cryptic relatedness and potential population stratification, we eliminated higher lmiss samples from pairs with pair-wise IBD (identical by descent) sharing probabilityp p.(1/8+1/16)/2 and outliers until Tracy-Widom statistics p-value,0.05 for any principal component that they are calculated using a LD-based pruned autosomal SNP set (r ' 2,0.01) through a stringent SNP-QC filters (lmiss .0.01; MAF ,0.05; HWE p,0.05). Finally, we had 436,670 SNPs in 1,252 individuals (age = 61.1+/29.97; gender ratio = 1:1.28) for further analyses (Table S10 in File S1). For 20 SNPs in 9q32 and 13 in 17q23.2 around the significant SNPs that survived after imposing a genome-wide significance level or the Bonferroni's corrected level by the number of phenotypes to the 1st study, we newly carried out confirmatory genotyping in the Yamagata population by Golden Gate Assay with a custom BeadChip (Illumina) according to the standard procedure provided by Illumina as well. The quality of the genotypes in the second sample set was checked by the visual inspection of the clusters. Finally, we deduced the ABO histo-blood group for individuals in the two populations using rs505922, rs495828, rs8176749 and rs8176746, in addition to rs8176719 which is genotyped by PCR-RFLP according to a previous described procedure [32].

Statistical analysis
Under the assumption of the additive effect of minor allele dosage, we carried out linear regression for each SNP from Infinium and Golden Gate Assays on the plasma levels of P-LIP and ACE separately with adjustment by age and gender as covariates, in which the statistical significance of each regression coefficient was evaluated by p value from the Wald test using the PLINK software package [33]. In addition, we carried out linear regression for the P-LIP and ACE levels on the same SNP set by applying an additional adjustment by potential confounders, alcohol intake (for P-LIP) and smoking (for ACE), and then by applying the Box-Cox normalizing transformation. We also carried out regression diagnosis for these analysis ( Figure S4 in File S1). Finally, we carried out one-way ANCOVA (analysis of covariance) to compare mean levels of genomically-deduced ABO groups (i.e. six categories) in the combined population separately for the P-LIP and ACE, with adjustment for age and gender as well as rs4356 (i.e. the effect of ACE1), in which AA is set as the baseline group in the corner-point parameterization. These standard genetical/statistical analyses were done with the PLINK software package [33], the EIGENSTRAT software program [34] and the Haploview [35], in addition to the MASS package [36] and the R statistical environment [37].

Supporting Information
File S1 Supporting Information. Figure S1, Polar transformed cluster plot for the significant SNPs surviving the two-stage study for the Takahata and Yamagata. Figure S2, Polar transformed cluster plot for the significant SNPs surviving the two-stage study for the Takahata and Yamagata. Figure S3, Association patterns for SNPs in the 186 kb region from 136,040,829 to 136,227,260 of 9q32 with other clinical test values in the Takahata population, whose range is identical to that in Fig. 2a. The ABO locus is located within about 30 kb on the center of this range of 186 kb (Fig. 2a), which is enough to cover the ABO locus. Figure S4, The regression diagnosis plots for ACE and P-LIP after Box-Cox transformations or untransformations. Methods S1. Table S1, Associations of imputed genotypes in the ABO locus with P-LIP in Takahata population. Table  S2, Associations of imputed genotypes in the ABO locus with ACE in Takahata population. Table S3, Associations of ABO gene in 9q32 with P-LIP or ACE levels after adjustment by smoking or Box-Cox transformation of trait values. Table S4, Associations of ACE1 in 17q23.3 with ACE level after adjustment by smoking or Box-Cox transformation of trait values. Table S5, Associations of ABO gene in 9q32 with P-LIP or ACE levels after adjustment or subgroup by treatment. Table S6, Associations of ACE1 gene in 17q23.2 with ACE level after adjustment or subgroup by treatment. Table S7, SNPs associated with P-LIP and/or ACE levels at a genomewide significance level. Table S8, Associations of ABO gene in 9q32 with P-LIP and ACE in combined population. Table S9, Associations of ACE gene in 17q23.2 with ACE in combined population. Table S10, Sample numbers removed by each quality control filter. Text S1, Organization and all contributors list of Yamagata University Genomic Cohort Consortium. (PDF)