Abstract
Vascular Endothelial Growth Factor (VEGF) is the main player in angiogenesis. Because of its crucial role in this process, the study of the genetic factors controlling VEGF variability may be of particular interest for many angiogenesis-associated diseases. Although some polymorphisms in the VEGF gene have been associated with a susceptibility to several disorders, no genome-wide search on VEGF serum levels has been reported so far. We carried out a genome-wide linkage analysis in three isolated populations and detected a strong linkage between VEGF serum levels and the 6p21.1 VEGF region in all samples. A new locus on chromosome 3p26.3 significantly linked to VEGF serum levels was also detected in a combined population sample. Sequencing of the gene followed by an association study identified three common single nucleotide polymorphisms (SNPs) influencing VEGF serum levels in one population (Campora): two already reported in the literature (rs3025039, rs25648) and one new signal (rs3025020). A fourth SNP (rs41282644) was found to affect VEGF serum levels in another population (Cardile). All the identified SNPs contribute to the related population linkages (35% of the linkage explained in Campora and 15% in Cardile). Interestingly, none of the SNPs influencing VEGF serum levels in one population was found to be associated in the two other populations. These results allow us to exclude the hypothesis that the common variants located in the exons, intron-exon junctions, promoter and regulative regions of the VEGF gene may have a causal effect on the VEGF variation. The data support the alternative hypothesis of a multiple rare variant model, possibly consisting of distinct variants in different populations, influencing VEGF serum levels.
Citation: Ruggiero D, Dalmasso C, Nutile T, Sorice R, Dionisi L, Aversano M, et al. (2011) Genetics of VEGF Serum Variation in Human Isolated Populations of Cilento: Importance of VEGF Polymorphisms. PLoS ONE 6(2): e16982. https://doi.org/10.1371/journal.pone.0016982
Editor: Amanda Toland, Ohio State University Medical Center, United States of America
Received: September 14, 2010; Accepted: January 19, 2011; Published: February 9, 2011
Copyright: © 2011 Ruggiero et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by grants from the EU (Vasoplus-037254), the Italian Ministry of Universities (FIRB -RBIN064YAT), the Assessorato Ricerca Regione Campania, the Ente Parco Nazionale del Cilento e Vallo di Diano and the Fondazione Banco di Napoli to MC. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Angiogenesis, or the growth of new blood vessels, is required for any process that results in the accumulation of new tissue as well as many processes involving tissue remodelling. When the regulation of angiogenesis fails, blood vessels are formed excessively or insufficiently. It is thus a characteristic of multiple pathologies including cancer, cardiovascular disease, arthritis, psoriasis, macular degeneration, and diabetic retinopathy. In particular, insufficient angiogenesis can be a cause of ischemia, and excessive angiogenesis can result in tumor neovascularization and growth. The angiogenesis process is highly controlled through the balance of pro- and anti-angiogenic factors. VEGF is a crucial player in angiogenesis as it represents the principal pro-angiogenic factor. Throughout development, VEGF orchestrates the process of angiogenesis by regulating the growth, development, and maintenance of a healthy circulatory system[1]. During pregnancy, VEGF is involved in building the placenta. By exerting a powerful antiapoptotic action, VEGF promotes the growth of new blood vessels in tumorigenesis[2]. Because of the crucial role of VEGF, a study of the factors controlling its variability may be of particular interest for many angiogenesis-associated disease studies. The very high heritability of VEGF serum levels reported in the present study and elsewhere [3] suggests that genetic variability contributes to the variation of the trait in the population. Specific polymorphisms in the VEGF gene have been associated with a variation of protein levels [4], [5], [6] and with a susceptibility to several diseases, especially cancer development and progression [7]. However, no genome-wide search on this quantitative trait has been reported so far.
In this work we searched for new quantitative trait loci (QTLs) and polymorphisms influencing VEGF serum levels, in three isolated populations, each living in a different village in the remote hilly region of the Cilento and Vallo di Diano National Park, South Italy. As we recently reported [8], [9], each population is characterized by a large and unique genealogy, including the majority of the current population, the presence of inbreeding and a small number of founders.
We identified the 6p21.1 VEGF gene region as the main QTL for VEGF serum level variation, with a strong and consistent effect in all three populations. An additional and new QTL was detected on chromosome 3p26.3. With a weaker effect, this QTL was detected only in the combined sample of the three populations.
Focusing on the 6p21.1 signal, an extensive sequencing analysis of the VEGF gene was conducted in sub-samples from each of the three villages. Three SNPs were found to be significantly associated with VEGF serum levels in the village of Campora and a fourth SNP (rs41282644) was significantly associated with VEGF serum levels in the village of Cardile.
Altogether, the combination of information on linkage and association in these three population isolates with a common origin allows us to reject the hypothesis of a direct effect on VEGF serum levels of the four SNPs identified. The data suggest an effect of rarer variants, possibly different among the three populations. These results raise a crucial issue in the search for predictive and prognostic VEGF polymorphisms for tumors in the general population.
Results
The characteristics of the study samples are reported in Table 1. The individuals of the three populations have a comparable mean age but the proportion of women is higher in Cardile. We recently reported a significant increase in VEGF serum levels with ageing in a selected sample [10]. This finding was confirmed in the complete population samples of the three villages. No difference was observed in the VEGF serum levels between men and women (Figure 1). However, the VEGF serum levels were significantly higher in Campora compared to Gioi (p-value = 3.4E-03) and Cardile (p-value = 1.4E-03), while no difference was detected between Gioi and Cardile (p-value = 0.44).
The increase of VEGF levels with ageing is reported in each population sample with the related p-values. In Campora, the VEGF levels are higher than in Gioi and Cardile. The median values and 95% CI of the VEGF levels for each age class are reported.
Genome-wide linkage analysis
Genome-wide linkage analysis was performed in the three population samples on the sub-pedigree sets generated by the breaking procedure applied to each population genealogy. A very strong signal was found on chromosome 6p21.1, with the highest LOD score at the marker D6S459 in Campora (mean LOD score = 7.52, q-value = 2.10E-13), in Gioi (mean LOD score = 5.31, q-value = 3.92E-04), and in the combined sample (mean LOD score = 13.94, q-value = 7.27E-22), and at the nearest marker D6S282 in Cardile (mean LOD score = 6.56, q-value = 7.01E-05) (see Table 2). The 6p21.1 region corresponds to the position of the VEGF gene, which is located 0.5 Mb from the D6S282 marker and 2 Mb from the D6S459 marker.
An additional linkage was detected on chromosome 3p26.3 (mean LOD score = 2.68, q-value = 0.012) at marker D3S4559, able to reach statistical significance in the combined sample (Table 2). Additional signals were found in Campora, on chromosome 2p16.3 at marker D2S2156 (mean LOD score = 1.98, q-value = 0.016) and on chromosome 20q13.13 at marker D20S178 (mean LOD score = 1.94, q-value = 0.022) (Table 2).
VEGF gene variability
To explore gene variability in our population, an extensive sequencing of the VEGF gene was carried out in a total group of 136 individuals. In detail, the exons, intron-exon junctions, promoter and regulative regions were analyzed in 42 individuals from Campora, 49 individuals from Gioi and 45 individuals from Cardile. The individuals included in these three hereafter denoted “detection samples” were chosen to best represent the population's genetic diversity. Data from NCBI (Assembly GRCh37) report 77 SNPs (64 SNPs and 13 Ins/Del) in the regions of the VEGF gene included in our analysis. In our detection samples, 36 out of the 77 (32 SNPs and 4 Ins/Del) were detected in at least one population and 18 new polymorphisms (17 SNPs and 1 Ins/Del) were identified. Two SNPs (rs3025020 and rs833070) outside the sequencing regions but available from previous studies were included in the analysis. The SNP characteristics for the three population “detection samples” are presented in Table 3. Note that given the “detection sample” sizes, all but two of the 18 new SNPs were detected in only one individual (accuracy checked with a replication of the sequencing for these rare variants, in addition to the double strand sequencing applied to all variants), the two remaining SNPs being detected in two individuals from different populations (see Table 3). A schematic representation of the position of the SNPs identified along the VEGF gene is reported in the supplementary figure (Figure S1).
Association study on VEGF gene
In each “detection sample”, the SNPs with a minor allele frequency (MAF) above 5% were tested for association with the VEGF serum levels. Table 4 displays the results for all the SNPs with a significant association signal in at least one “detection sample” and for the SNPs repeatedly reported as associated with VEGF serum levels or correlated phenotypes in the literature. Significant associations were found between the VEGF serum levels and three common SNPs in Campora: the rs25648 variant located in the 5′UTR, the rs3025020 placed in the intron 6 and the rs3025039 located in the 3′UTR.
These associations were confirmed in the large population sample of Campora (Table 5). The TT genotype of the rs3025039 variant was associated with lower median VEGF levels (CC = 435.9 pg/ml vs TT = 295.2 pg/ml) whereas the TT genotype of the rs25648 and rs3025020 variants was associated with higher levels of VEGF (CC = 382.5 pg/ml vs TT = 489.7 pg/ml and CC = 365.2 pg/ml vs TT = 447.3 pg/ml, respectively). No linkage disequilibrium was observed among these three SNPs (LD computed in the population sample: rs25648-rs3025020 r2 = 0.001; rs25648-rs3025039 r2 = 0.001; rs3025020-rs3025039 r2 = 0.114).
Surprisingly, no significant associations were found between these three SNPs and the VEGF levels in the two population samples from Gioi and Cardile (Table 5). However, the allele frequencies are not significantly different in the three populations for SNP rs25648 and rs3025039, and although rs3025020 is less frequent in Gioi and in Cardile, (0.26 and 0.23 respectively versus 0.46 in Campora), it remains a common SNP in these two villages.
Nonetheless, a variant located in the 3′UTR, rs41282644, was significantly associated with the VEGF serum levels in the “detection sample” of Cardile and the association was confirmed in the population sample of this village but not in the population sample of Campora nor in that of Gioi (Table 5). In Cardile, the AA genotype was associated with a lower level of VEGF (AA = 118.2 pg/ml vs GG = 391.2 pg/ml). In contrast to the rs25648, rs3025039 and rs3025020 SNPs, the rs41282644 SNP has a very low frequency in the Caucasian reference population (MAF = 1% in the pilot 1 CEU sample from the 1000 Genome Project) but has become more frequent in the Cilento villages (Cardile population sample MAF = 11%, Gioi population sample MAF = 8%, Campora population sample MAF = 6%). Note that SNP rs41282644 is not strongly correlated with the rs25648, rs3025020 and rs3025039 SNPs (r2 = 0.011, 0.057 and 0 with these three SNPs respectively in the population sample of Cardile).
One significant association was found in Gioi between the rs2146323 variant, located in the intron 2, and the VEGF serum levels. However, this association was observed only in the detection sample (Table 4) and was not confirmed in the population sample of Gioi.
Interestingly, the linkage disequilibrium (LD) among the four SNPs associated in at least one population was not significantly different across the populations, as suggested by the results of the global LD comparison test proposed by Zaykin et al [11] and applied to each pair of populations: Campora/Cardile, p-value = 0.23; Campora/Gioi p-value = 0.63; Gioi/Cardile p-value = 0.33 (Figure S1).
Haplotypes were also analyzed in the three population samples using successively the three SNPs associated in Campora and all the four associated SNPs (three associated in Campora and one associated in Cardile) for haplotype reconstruction. More frequent haplotypes were tested for association with the VEGF serum levels. The results show that when the three SNPs associated in Campora were considered, two haplotypes were found to be associated with the VEGF serum levels in Campora, but were not associated in Gioi and in Cardile (Table 6). Interestingly, of the two associated haplotypes, one (C-C-T haplotype) included all the alleles that in the single SNP testing were associated with low levels of VEGF, while the other (T-T-C haplotype) included all the alleles associated with high levels of VEGF. Further, the association of the T-T-C haplotype with the VEGF levels was stronger compared to that of the C-C-T haplotype and it remains statistically significant also after correction for multiple testing (Table 6).
When all the four associated SNPs (three associated in Campora and one associated in Cardile) were used for haplotype reconstruction, only the T-T-C-G haplotype was still associated with the VEGF levels in Campora although only at the nominal level. No association was found between any of the haplotypes tested and the VEGF levels in Gioi and Cardile.
Linkage on chromosome 6 conditional to VEGF SNP genotypes
To evaluate the contribution of the associated SNPs to the linkage signals detected on 6p21.1, the linkage statistics were recomputed conditional on the associated SNPs. In Campora, the original linkage peak dropped from a LOD score of 7.52 to a LOD score of 6.82 when the rs3025039 genotypes were taken into account, to a LOD score of 6.35 in the case of the rs3025020 variant and to a LOD score of 6.47 in the case of rs25648. When the linkage statistic was computed conditional on the three SNP genotypes, the LOD score dropped to 5.00, highlighting the independence of these three association signals (Figure 2). A comparable decrease of the LOD score (35%) was obtained when the linkage analysis was conditioned on each of the two haplotypes (C-C-T and T-T-C) associated with the VEGF serum levels in this population.
The percentages reported correspond to the mean LOD score over all sub-pedigree sets analyzed conditional on SNP genotypes divided by the mean LOD score over all sub-pedigree sets analyzed unconditionally. A decrease of the linkage peak is observed after adjusting for the genotypes at each associated SNP. A greater effect is observed when the three SNPs detected in Campora are considered simultaneously.
Similarly, the LOD score of 6.56 detected in Cardile dropped to 5.58 when the linkage statistic was computed conditional on the rs41282644 SNP genotypes.
The same conditional analyses were carried out in the other population samples, respectively Gioi and Cardile for SNPs rs3025039, rs3025020 and rs25648 and Gioi and Campora for SNP rs41282644. As expected, no variation in the LOD score was observed in these samples (data not shown).
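As a rough check of how much of each linkage peak is removed by conditioning, the LOD scores quoted above can be turned into relative reductions. The sketch below uses only the values given in the text; the function name is illustrative and not part of the original analysis. With these rounded peak values the reduction comes out at roughly one third for the three Campora SNPs taken together and about 15% for rs41282644 in Cardile, close to the rounded percentages quoted in the paper.

```python
def lod_reduction_percent(lod_unconditional, lod_conditional):
    """Relative drop of the linkage peak after conditioning, in percent."""
    return 100.0 * (lod_unconditional - lod_conditional) / lod_unconditional


# (unconditional LOD, LOD conditional on SNP genotypes), as quoted in the text.
campora = {
    "rs3025039": (7.52, 6.82),
    "rs3025020": (7.52, 6.35),
    "rs25648": (7.52, 6.47),
    "all three SNPs": (7.52, 5.00),
}
cardile = {"rs41282644": (6.56, 5.58)}

for village, peaks in (("Campora", campora), ("Cardile", cardile)):
    for snp, (before, after) in peaks.items():
        drop = lod_reduction_percent(before, after)
        print(f"{village}, {snp}: {drop:.1f}% of the peak removed")
```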
Discussion
In this study, we reported a high heritability of VEGF serum levels in our three samples (0.86, 0.80 and 0.89) and a very consistent and strong linkage of this trait with the VEGF gene region. Our genome-wide search detected three additional linkage signals outside the VEGF gene region. A signal on chr3p26 was observed but only reached significance when the three population samples were combined to increase the power, which suggests a weaker effect of this QTL. Further, no clear candidate genes could be identified in this region. Two additional signals were found on 2p16.3 and 20q13.13. Although not consistent across the populations and not detected in the combined sample, these might be of interest since interesting candidate genes are located in these regions. In fact, the EPAS1 gene, located on 2p16.3, is known to be involved in the transcriptional regulation of VEGF [12] and the NCOA-3 (SRC-3) gene, located on 20q13.13, is part of a multi-subunit co-activation complex including the p300/CBP-associated factor and the CREB binding protein [13], that participates in the induction of hypoxia-responsive genes, including the VEGF gene [14].
Altogether, the genome-wide linkage results suggest that most of the genetic variability accounting for the VEGF heritability comes from the VEGF gene region on chr6p21.
Several SNPs in the VEGF gene have been associated with VEGF protein levels and/or with a susceptibility to (or the severity of) several cancers such as breast, lung, colorectal, bladder, prostate and gastric [6], [15], [16], [17], [18], [19]. As an increased VEGF expression has been associated with tumor progression and metastasis, these disease associations may well indirectly reflect the effect of genetic variation on VEGF levels. Among the VEGF SNPs, those frequently reported to be associated are: rs699947, rs833061, rs1570360, rs2010963, rs3025039 and rs25648 [7]. As recently discussed by Jain et al [7], the lack of consensus among association studies for these SNPs argues against them having a causal role in cancer development.
In our study, associations between rs3025039 and rs25648 and VEGF levels were detected in Campora but not in the two other villages, although these two SNPs have a similar frequency and LD pattern in all three villages. Associations with the other reported SNPs (rs699947, rs833061, rs1570360, rs2010963) could not be identified and new association signals were discovered: rs3025020 in Campora and rs41282644 in Cardile.
From the analysis of haplotypes involving the rs25648-rs3025020-rs3025039 SNPs, we note that in Campora the T-T-C haplotype is more strongly associated with the VEGF levels than the C-C-T haplotype and that it is still associated when the rs41282644 G allele is added (T-T-C-G) to the haplotype. However, the overall haplotype association results, although interesting, are less significant than the single SNP association results, as expected given that all of these are common SNPs and there is a very low LD between them.
All the associated SNPs in our study contribute to the linkage signal, but none of them explains the majority of the signal, even when considered together but independently (3 SNPs in Campora explain 35% of the linkage signal) or as a haplotype. The detection of different association signals in populations with a very similar genetic background and in which a strong linkage was detected, strongly suggests that these cannot point to functional variants, but only to proxies correlated to the functional variants.
Whether these variants are more likely to be rare or common, different or similar among populations remains an open question. Still, given that the LD patterns among common SNPs in the region are relatively similar, if common causal variants were involved, their association with rs3025039, rs25648, rs3025020 or rs41282644 should not be specific to Campora or Cardile. These SNPs should be proxies in all three populations. On the contrary, rare variants, more sensitive to genetic drift, could well display a discordant LD pattern with common variants among the three populations, explaining the discordant association results. A further study of the region, including a sequencing of the whole linkage region in larger sets of individuals, will be required to elucidate this hypothesis.
From a more methodological perspective, this work suggests that our study design, which provides complementary information on linkage and association in three isolated populations with similar common genetic variations but possibly divergent rarer variations, is particularly powerful for discriminating between causal and non-causal variants.
Materials and Methods
Population sample and VEGF measurement
The study includes 1,957 individuals, recruited through a population-based sampling strategy in three small isolated villages of the Cilento region, South Italy: 656 individuals from the village of Campora, 852 from the village of Gioi and 449 from the village of Cardile. The recruited sample represents about 85% of the living population of each village. Blood samples were collected in the morning after the participants had been fasting for at least 12 h. Aliquots of serum were immediately prepared and stored at −80°C, and were subsequently used for the assessment of VEGF levels. VEGF (pg/ml) was measured using an enzyme-linked immunosorbent assay, according to the manufacturer's instructions (Quantikine™, R&D Systems, Minneapolis, MN). The study design was approved by the ethics committee of Azienda Sanitaria Locale Napoli 1. The study was conducted according to the criteria set by the declaration of Helsinki and each subject signed an informed consent before participating in the study.
The Mann-Whitney U test for independent samples was used to compare the VEGF serum levels among population samples. The Kruskal-Wallis test was applied to assess the influence of age on VEGF serum variation. These analyses were performed with the SPSS software.
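To illustrate the kind of two-sample comparison described above, here is a minimal sketch of a Mann-Whitney U test in plain Python. It is not the SPSS procedure used in the study: it relies on a normal approximation without tie correction, which is crude for small samples, and the VEGF values below are invented toy data, not measurements from the villages.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Two-sided Mann-Whitney U test, normal approximation, no tie correction."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Count the pairs in which the value from sample_a exceeds the value
    # from sample_b; a tied pair counts for one half.
    u_a = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in sample_a
        for b in sample_b
    )
    u = min(u_a, n_a * n_b - u_a)
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u - mean_u) / sd_u
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_two_sided


# Hypothetical serum VEGF values (pg/ml) for two villages.
village_1 = [520.0, 610.5, 480.2, 700.1, 555.0]
village_2 = [310.0, 295.5, 402.3, 350.0, 288.9]

u_stat, p_value = mann_whitney_u(village_1, village_2)
print(f"U = {u_stat}, two-sided p = {p_value:.4f}")
```

In practice a statistics package with exact small-sample p-values and tie handling would be preferable; the sketch only makes the logic of the rank-based comparison explicit.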
Microsatellite Genotyping
A genome-wide scan of 1,122 microsatellites (average marker spacing of 3.6 cM and mean marker heterozygosity of 0.70) was performed by the deCODE genotyping service. All subjects having a VEGF measurement were genotyped. Mendelian inheritance inconsistencies were checked with the Pedcheck program [20].
Pedigree breaking and linkage analysis
In each village, the vast majority of the phenotyped individuals were connected through a unique deep pedigree. In Campora, 627 out of the 656 phenotyped individuals were included in a 3,049-member pedigree. In Gioi, 798 out of the 852 phenotyped individuals were related through a 4,190-member pedigree. In Cardile, a pedigree of 2,384 members connected 425 individuals out of the 449 phenotyped individuals.
The heritability of VEGF serum levels was estimated using the SOLAR software [21]. A log-transformation was applied to the trait to eliminate an excess of kurtosis. Gender and age were tested as covariates, and only age was retained in the final model. Residuals of the covariate regression were normally distributed and used for heritability estimations. The estimations of heritability were 0.86, 0.80 and 0.89 in Campora, Gioi and Cardile respectively.
The linkage analysis was performed following a procedure based on a multiple splitting of the genealogy, that we developed and already applied to various complex traits [22], [23]. This approach capitalizes on the fact that different family structures differ in their power to detect linkage [24] by successively considering the use of different splittings of the population pedigree. Different splittings of each large population genealogy into sets of sub-pedigrees were generated following a procedure that we previously described [25]. Briefly, the sub-pedigree sets were obtained with the clique-partitioning Jenti method [26], applying different constraints on the splitting procedure (minimum and maximum clique size, minimum relationship level among clique members, and maximum complexity of the resulting families). A selection of the most informative sets was made by maximizing the number of related phenotyped pairs of individuals included in the sets and by minimizing the similarity among the sets in terms of number of pairs in common. By using this approach 15 sub-pedigree sets in Campora, 16 in Gioi, 18 in Cardile and 25 in the combined sample were obtained. The characteristics of these sub-pedigree sets are reported in a supplementary table (Table S1).
A linear regression model of the log-transformed VEGF on age was applied and the residuals were used as a quantitative trait in the multipoint quantitative linkage analysis on each sub-pedigree set using the regression-based approach implemented in MERLIN-REGRESS [27]. The population mean and variance of VEGF were computed from all phenotyped individuals in each population separately, and in the combined sample for the combined analysis.
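The phenotype preparation described above (log-transformation followed by regression on age, with the residuals used as the trait) can be sketched as follows. This is an illustration only: the natural logarithm is an assumption, since the base of the transformation is not stated, and the ages and VEGF values are invented toy data.

```python
import math


def age_adjusted_residuals(vegf_pg_ml, ages):
    """Residuals of an ordinary least-squares fit of log(VEGF) on age."""
    # Natural log is an assumption; the text does not state the base used.
    log_vegf = [math.log(v) for v in vegf_pg_ml]
    n = len(ages)
    mean_age = sum(ages) / n
    mean_log = sum(log_vegf) / n
    sxx = sum((a - mean_age) ** 2 for a in ages)
    sxy = sum((a - mean_age) * (y - mean_log) for a, y in zip(ages, log_vegf))
    slope = sxy / sxx
    intercept = mean_log - slope * mean_age
    return [y - (intercept + slope * a) for a, y in zip(ages, log_vegf)]


# Hypothetical toy data: ages in years and serum VEGF in pg/ml.
ages = [25, 38, 47, 59, 66, 74]
vegf = [210.0, 305.5, 280.0, 420.3, 390.1, 515.8]

residuals = age_adjusted_residuals(vegf, ages)
print([round(r, 3) for r in residuals])
```

By construction the residuals have zero mean and are uncorrelated with age, which is what makes them usable as an age-adjusted quantitative trait in the linkage analysis.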
The contribution of the associated polymorphisms to the linkage signal on chromosome 6 was assessed by performing the linkage analysis on a new phenotype: the VEGF levels adjusted for age and SNP genotypes with a genotypic modeling of the SNP effect.
To take into account the multiple testing problem created by both the number of markers tested and the number of pedigree sets analyzed, we considered a parametric false discovery rate (FDR) approach.
For each marker in each population, the mean LOD score statistic over all the sub-pedigree sets was transformed into a test statistic whose theoretical null distribution is a standard normal [28]. Indeed, to estimate the q-values (which are, for each marker, the minimum FDR induced by the rejection of the null hypothesis), a model of the marginal distribution of the test statistic is required and the transformed test statistic is more easily modeled.
A K-component Gaussian mixture model with equal variances was chosen to model the marginal distribution of the transformed test statistics, as such a mixture model efficiently separates the empirical null distribution (likely to be composite and different from the theoretical one [28], [29]) from the alternative distribution. For a range of K values (from 2 to 15), the model parameters were inferred in a Bayesian framework by sampling from their joint posterior distributions using MCMC samplers implemented in the WinBUGS software [30]. From the different models, corresponding to different values of K, we selected the one having the highest log-likelihood. To estimate the q-values, without neglecting the fact that the empirical null distribution may be different from the theoretical one [28], [29], the null distribution in the mixture model was itself modeled by the mixture of the K0 first components (K0≤K). K0 was chosen such that the L1 distance between the estimated null density and the density of the theoretical null distribution (a standard normal distribution) was minimal. Finally, we report here the markers with a q-value below 5% [29], [31].
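The mixture-based q-value estimation described above can be written out as follows. This is a sketch under stated assumptions, not the authors' exact formulation: the symbols \pi_k, \mu_k and \sigma are introduced here for illustration, the components are assumed to be ordered so that the first K_0 form the null, and rejection is assumed to occur for large values of the transformed statistic z.

```latex
% Marginal density of the transformed statistic: K Gaussian components, common variance
f(z) = \sum_{k=1}^{K} \frac{\pi_k}{\sigma}\,
       \phi\!\left(\frac{z - \mu_k}{\sigma}\right),
\qquad \sum_{k=1}^{K} \pi_k = 1

% Empirical null: renormalised mixture of the first K_0 components
\pi_0 = \sum_{k=1}^{K_0} \pi_k,
\qquad
f_0(z) = \frac{1}{\pi_0} \sum_{k=1}^{K_0} \frac{\pi_k}{\sigma}\,
         \phi\!\left(\frac{z - \mu_k}{\sigma}\right)

% K_0 minimises the L1 distance to the theoretical null (standard normal density \phi)
K_0 = \arg\min_{1 \le j \le K} \int \left| f_0^{(j)}(z) - \phi(z) \right| \, dz

% FDR of the rejection region [z, \infty) and q-value of marker i
\mathrm{FDR}(z) = \frac{\pi_0 \,\bigl(1 - F_0(z)\bigr)}{1 - F(z)},
\qquad
q(z_i) = \min_{z \le z_i} \mathrm{FDR}(z)
```

Here F and F_0 denote the cumulative distribution functions of f and f_0, and f_0^{(j)} is the null density obtained when the first j components are taken as the null. Markers with q(z_i) below 0.05 would then be reported.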
Identification and genotyping of SNPs in the VEGF gene
To identify polymorphisms in the VEGF gene, the exons, intron-exon junctions, promoter and regulative regions were sequenced in the “detection samples” of individuals selected to best represent the genetic diversity of each village while maintaining reasonable sample sizes (42 individuals in Campora, 45 in Cardile and 49 in Gioi). All the individuals included in the “detection samples” were among the oldest individuals for which DNA was available with children, grand-children and great-grandchildren included in the population sample. The mean number of direct descendants (children, grandchildren and great-grandchildren) was 5.8 for the 42 individuals included in the Campora detection sample, 8.3 for the 45 individuals included in the Cardile detection sample and 9.12 for the 49 individuals included in the Gioi detection sample.
Altogether, 9.8 kb, corresponding to 50% of the entire gene, were analyzed. The oligonucleotide primers for the amplification and sequencing of these regions were designed using the primer prediction program Primer3 (Table S2). The PCR fragments were obtained in 20 µl reactions containing 0.2 mM of dNTPs, 0.8 µM of each forward and reverse primer, 1.5 mM of MgCl2, and 40 ng of genomic DNA as template, with 2 units of recombinant Taq DNA polymerase. The cycling conditions were as follows: 95°C for 3 min, followed by 35 cycles of 95°C for 30 sec, 60°C for 30 sec, and 72°C for 30 sec, and by a final extension at 72°C for 7 min. The PCR products were purified by using MultiScreen PCRµ96 Filter Plates (Millipore) and were sequenced on both strands using the Applied Biosystems BigDye v3.1 sequencing kit according to the manufacturer's recommendations on an Applied Biosystems 3730 DNA Analyzer Sequencer. The sequences were then analyzed using the SeqAnalysis and BioEdit software. The SNP discovery accuracy was ensured by sequencing in two replicates the fragments including the new SNPs.
As mentioned in the Results section, two SNPs were added to this panel: rs833070 (located in intron 2) and rs3025020 (located in intron 6). They were genotyped using the TaqMan SNP genotyping assay, and the SDS software was used for allele discrimination (Applied Biosystems, Foster City, CA, USA). The same technology was used to genotype, in the population samples, the five SNPs associated in the “detection samples” (rs25648, rs3025039, rs3025020, rs41282644 and rs2146323). The rate of successful genotypes was above 95% for each SNP.
Association testing
All frequent SNPs (MAF>5%) identified in the VEGF gene were tested for association with the log-transformed VEGF adjusted for age phenotype in the detection samples. The genotype frequencies of the tested SNPs are reported in a supplementary table (Table S3). Significant associations were then confirmed in the population sample of each village (656 individuals in Campora, 852 in Gioi and 449 in Cardile). To test for association while taking into account the relatedness between individuals, the phenotypes were regressed on the genotypes and a Wald Test was applied on the least square estimator of β (regression coefficient for the genotype covariate in the regression) with a variance of the estimator modified to account for the relatedness, using the genealogical information [9].
To correct for multiple testing, we applied the procedure proposed by Nyholt [32] and modified by Li and Ji [33]. Briefly, a number of independent tests (Meff) equivalent to the number of correlated SNPs tested was estimated from the LD pattern among the SNPs and a Bonferroni correction for Meff tests was applied to obtain the corrected p-value threshold.
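The multiple-testing correction described above can be sketched as follows. In the Li and Ji approach, the effective number of tests is computed from the eigenvalues of the SNP correlation matrix, each eigenvalue contributing an indicator of being at least 1 plus its fractional part. The eigenvalues below are hypothetical (they sum to 4, as for four SNPs) and are not taken from the study; the function names are illustrative.

```python
import math


def li_ji_meff(eigenvalues):
    """Effective number of independent tests from correlation-matrix eigenvalues."""
    meff = 0.0
    for lam in eigenvalues:
        x = abs(lam)
        # Each eigenvalue contributes I(x >= 1) plus its fractional part.
        meff += (1.0 if x >= 1.0 else 0.0) + (x - math.floor(x))
    return meff


def bonferroni_threshold(alpha, meff):
    """Per-test significance threshold after Bonferroni correction for meff tests."""
    return alpha / meff


# Hypothetical eigenvalues of a 4 x 4 SNP correlation (LD) matrix.
eigenvalues = [2.4, 1.1, 0.3, 0.2]

meff = li_ji_meff(eigenvalues)
threshold = bonferroni_threshold(0.05, meff)
print(f"Meff = {meff:.2f}, corrected p-value threshold = {threshold:.4f}")
```

With uncorrelated SNPs all eigenvalues equal 1 and Meff equals the number of SNPs, so the procedure reduces to an ordinary Bonferroni correction; the stronger the LD, the smaller Meff and the less stringent the threshold.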
The global comparison of LD among the rs25648, rs3025039, rs3025020 and rs41282644 SNPs in the population samples was conducted using the approach proposed by Zaykin et al [11]. Based on the composite LD coefficient proposed by Weir and Cockerham [34], this test contrasts the LD matrices with an empirical assessment of type I error. To account for the inter-individual relationship, all measures of LD were computed on sub-samples of poorly related individuals: 163 individuals for the Campora population sample, 104 individuals for the Cardile population sample and 111 individuals for the Gioi population sample.
The haplotypes were reconstructed taking advantage of family information and tested for association with the VEGF levels using the software FBAT. A biallelic test was performed in which each haplotype was tested against all the others pooled together and an additive model was applied. Two analyses were carried out successively. One used the haplotypes made of the three SNPs associated in Campora and the other used the haplotypes made of the four associated SNPs (three associated in Campora and one associated in Cardile). Only haplotypes having a frequency >1% were tested for association with the VEGF levels.
Supporting Information
Figure S1.
A) Schematic representation of the VEGF gene. The exons are reported in black, introns in white, regulative regions in dark grey, and promoter region in light grey. The position of the 56 SNPs identified in the gene is also indicated. The four SNPs associated with the VEGF levels are framed. B) LD patterns between the four associated SNPs in Campora, Gioi and Cardile. R-squared values are indicated.
https://doi.org/10.1371/journal.pone.0016982.s001
(TIF)
Table S1.
Characteristics of the sub-pedigree sets used in the linkage study for the VEGF serum levels in Campora, Gioi and Cardile.
https://doi.org/10.1371/journal.pone.0016982.s002
(DOC)
Table S2.
List of the primers designed to sequence the VEGF gene.
https://doi.org/10.1371/journal.pone.0016982.s003
(DOC)
Table S3.
Genotype frequencies in the detection samples of the 26 SNPs analyzed for association with the VEGF levels.
https://doi.org/10.1371/journal.pone.0016982.s004
(DOC)
Acknowledgments
We are especially grateful to the populations of Campora, Gioi and Cardile for their participation in the study. We thank Don Guglielmo Manna, Dr. Luciano Errico, Dr. Andrea Salati and Dr. Giuseppe Vitale for their help in interacting with the populations, and Antonietta Calabria, Raffaella Romano and Teresa Rizzo for organizing the study in the villages.
Author Contributions
Conceived and designed the experiments: DR ALL CB MC. Performed the experiments: DR TN RS LD. Analyzed the data: DR CD PB ALL CB MC. Contributed reagents/materials/analysis tools: CD TN MA LD. Wrote the paper: DR ALL CB MC.
References
- 1. Jain RK (2003) Molecular regulation of vessel maturation. Nat Med 9: 685–693.
- 2. Gerber HP, Dixit V, Ferrara N (1998) Vascular endothelial growth factor induces expression of the antiapoptotic proteins Bcl-2 and A1 in vascular endothelial cells. J Biol Chem 273: 13313–13316.
- 3. Pantsulaia I, Trofimov S, Kobyliansky E, Livshits G (2004) Heritability of circulating growth factors involved in the angiogenesis in healthy human population. Cytokine 27: 152–158.
- 4. Renner W, Kotschan S, Hoffmann C, Obermayer-Pietsch B, Pilger E (2000) A common 936 C/T mutation in the gene for vascular endothelial growth factor is associated with vascular endothelial growth factor plasma levels. J Vasc Res 37: 443–448.
- 5. Awata T, Inoue K, Kurihara S, Ohkubo T, Watanabe M, et al. (2002) A common polymorphism in the 5'-untranslated region of the VEGF gene is associated with diabetic retinopathy in type 2 diabetes. Diabetes 51: 1635–1639.
- 6. Hansen TF, Spindler KL, Lorentzen KA, Olsen DA, Andersen RF, et al. (2010) The importance of -460 C/T and +405 G/C single nucleotide polymorphisms to the function of vascular endothelial growth factor A in colorectal cancer. J Cancer Res Clin Oncol 136: 751–758.
- 7. Jain L, Vargo CA, Danesi R, Sissung TM, Price DK, et al. (2009) The role of vascular endothelial growth factor SNPs as predictive and prognostic markers for major solid tumors. Mol Cancer Ther 8: 2496–2508.
- 8. Colonna V, Nutile T, Astore M, Guardiola O, Antoniol G, et al. (2007) Campora: a young genetic isolate in South Italy. Hum Hered 64: 123–135.
- 9. Colonna V, Nutile T, Ferrucci RR, Fardella G, Aversano M, et al. (2009) Comparing population structure as inferred from genealogical versus genetic information. Eur J Hum Genet 17: 1635–1641.
- 10. Siervo M, Ruggiero D, Sorice R, Nutile T, Aversano M, et al. (2010) Angiogenesis and biomarkers of cardiovascular risk in adult subjects with metabolic syndrome. Journal of Internal Medicine Jun 23. [Epub ahead of print].
- 11. Zaykin DV, Meng Z, Ehm MG (2006) Contrasting linkage-disequilibrium patterns between cases and controls as a novel association-mapping method. Am J Hum Genet 78: 737–746.
- 12. Takeda N, Maemura K, Imai Y, Harada T, Kawanami D, et al. (2004) Endothelial PAS domain protein 1 gene promotes angiogenesis through the transactivation of both vascular endothelial growth factor and its receptor, Flt-1. Circ Res 95: 146–153.
- 13. Demarest SJ, Martinez-Yamout M, Chung J, Chen H, Xu W, et al. (2002) Mutual synergistic folding in recruitment of CBP/p300 by p160 nuclear receptor coactivators. Nature 415: 549–553.
- 14. Gray MJ, Zhang J, Ellis LM, Semenza GL, Evans DB, et al. (2005) HIF-1alpha, STAT3, CBP/p300 and Ref-1/APE are components of a transcriptional complex that regulates Src-dependent hypoxia-induced expression of VEGF in pancreatic and prostate carcinomas. Oncogene 24: 3110–3120.
- 15. Krippl P, Langsenlehner U, Renner W, Yazdani-Biuki B, Wolf G, et al. (2003) A common 936 C/T gene polymorphism of vascular endothelial growth factor is associated with decreased breast cancer risk. Int J Cancer 106: 468–471.
- 16. Koukourakis MI, Papazoglou D, Giatromanolaki A, Bougioukas G, Maltezos E, et al. (2004) VEGF gene sequence variation defines VEGF gene expression status and angiogenic activity in non-small cell lung cancer. Lung Cancer 46: 293–298.
- 17. Lin CC, Wu HC, Tsai FJ, Chen HY, Chen WC (2003) Vascular endothelial growth factor gene-460 C/T polymorphism is a biomarker for prostate cancer. Urology 62: 374–377.
- 18. Balasubramanian SP, Cox A, Cross SS, Higham SE, Brown NJ, et al. (2007) Influence of VEGF-A gene variation and protein levels in breast cancer susceptibility and severity. Int J Cancer 121: 1009–1016.
- 19. Garcia-Closas M, Malats N, Real FX, Yeager M, Welch R, et al. (2007) Large-scale evaluation of candidate genes identifies associations between VEGF polymorphisms and bladder cancer risk. PLoS Genet 3: e29.
- 20. O'Connell JR, Weeks DE (1998) PedCheck: a program for identification of genotype incompatibilities in linkage analysis. Am J Hum Genet 63: 259–266.
- 21. Almasy L, Blangero J (1998) Multipoint quantitative-trait linkage analysis in general pedigrees. Am J Hum Genet 62: 1198–1211.
- 22. Ciullo M, Bellenguez C, Colonna V, Nutile T, Calabria A, et al. (2006) New susceptibility locus for hypertension on chromosome 8q by efficient pedigree-breaking in an Italian isolate. Hum Mol Genet 15: 1735–1743.
- 23. Ciullo M, Nutile T, Dalmasso C, Sorice R, Bellenguez C, et al. (2008) Identification and replication of a novel obesity locus on chromosome 1q24 in isolated populations of Cilento. Diabetes 57: 783–790.
- 24. Goddard KA, Goode EL, Rozek LS, Jarvik GP (1999) Impact of family structure on the power of linkage tests using sib-pair methods. Genet Epidemiol 17 (Suppl 1): S575–579.
- 25. Bellenguez C, Ober C, Bourgain C (2009) A multiple splitting approach to linkage analysis in large pedigrees identifies a linkage to asthma on chromosome 12. Genet Epidemiol 33: 207–216.
- 26. Falchi M, Forabosco P, Mocci E, Borlino CC, Picciau A, et al. (2004) A genomewide search using an original pairwise sampling approach for large genealogies identifies a new locus for total and low-density lipoprotein cholesterol in two genetically differentiated isolates of Sardinia. Am J Hum Genet 75: 1015–1031.
- 27. Sham PC, Purcell S, Cherny SS, Abecasis GR (2002) Powerful regression-based quantitative-trait linkage analysis of general pedigrees. Am J Hum Genet 71: 238–253.
- 28. Efron B (2004) Large-scale simultaneous hypothesis testing: the choice of a null hypothesis. J Am Stat Assoc 99: 96–104.
- 29. Dalmasso C, Pickrell J, Tuefferd M, Genin E, Bourgain C, et al. (2007) A mixture model approach to multiple testing for the genetic analysis of gene expression. BMC Proc 1 (Suppl 1): S141.
- 30. Spiegelhalter D, Thomas A, Best N, Lunn D. WinBUGS User Manual Version 1.4.1. WinBUGS website (accessed 2010) [http://www.mrc-bsu.cam.ac.uk/bugs].
- 31. Efron B, Tibshirani R (2002) Empirical bayes methods and false discovery rates for microarrays. Genet Epidemiol 23: 70–86.
- 32. Nyholt DR (2004) A simple correction for multiple testing for single-nucleotide polymorphisms in linkage disequilibrium with each other. Am J Hum Genet 74: 765–769.
- 33. Li J, Ji L (2005) Adjusting multiple testing in multilocus analyses using the eigenvalues of a correlation matrix. Heredity 95: 221–227.
- 34. Weir BS, Cockerham CC (1979) Estimation of linkage disequilibrium in randomly mating populations. Heredity 42: 105–111.