Genes to Diseases (G2D) Computational Method to Identify Asthma Candidate Genes

  • Karine Tremblay,

    Affiliations Department of Medicine, Laval University, Québec, Quebec, Canada, University of Montreal Community Genomic Centre, Chicoutimi Hospital, Saguenay, Quebec, Canada

  • Mathieu Lemire,

    Affiliation Ontario Institute for Cancer Research, Toronto, Ontario, Canada

  • Camille Potvin,

    Affiliations Department of Medicine, Laval University, Québec, Quebec, Canada, University of Montreal Community Genomic Centre, Chicoutimi Hospital, Saguenay, Quebec, Canada

  • Alexandre Tremblay,

    Affiliation University of Montreal Community Genomic Centre, Chicoutimi Hospital, Saguenay, Quebec, Canada

  • Gary M. Hunninghake,

    Affiliation Channing Laboratory, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America

  • Benjamin A. Raby,

    Affiliation Channing Laboratory, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America

  • Thomas J. Hudson,

    Affiliations Ontario Institute for Cancer Research, Toronto, Ontario, Canada, McGill University and Genome Quebec Innovation Centre, Montreal, Quebec, Canada

  • Carolina Perez-Iratxeta,

    Affiliation Molecular Medicine, Ottawa Health Research Institute, Ottawa, Ontario, Canada

  • Miguel A. Andrade-Navarro,

    Affiliations Molecular Medicine, Ottawa Health Research Institute, Ottawa, Ontario, Canada, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada, Max Delbrück Center for Molecular Medicine, Berlin, Germany

  • Catherine Laprise

    Affiliations University of Montreal Community Genomic Centre, Chicoutimi Hospital, Saguenay, Quebec, Canada, Département des Sciences fondamentales, Université du Québec à Chicoutimi, Saguenay, Quebec, Canada



Asthma is a complex trait for which different strategies have been used to identify its environmental and genetic predisposing factors. Here, we describe a novel methodological approach to select candidate genes for asthma genetic association studies. In this regard, the Genes to Diseases (G2D) computational tool has been used in combination with a genome-wide scan performed in a sub-sample of the Saguenay−Lac-St-Jean (SLSJ) asthmatic familial collection (n = 609) to identify candidate genes located in two suggestive loci shown to be linked with asthma (6q26) and atopy (10q26.3), and presenting differential parent-of-origin effects. This approach combined gene selection based on the G2D data mining analysis of the bibliographic and protein public databases, or according to the genes already known to be associated with the same or a similar phenotype. Ten genes (LPA, NOX3, SNX9, VIL2, VIP, ADAM8, DOCK1, FANK1, GPR123 and PTPRE) were selected for a subsequent association study performed in a large SLSJ sample (n = 1167) of individuals tested for asthma and atopy related phenotypes. Single nucleotide polymorphisms (n = 91) within the candidate genes were genotyped and analysed using a family-based association test. The results suggest a protective association to allergic asthma for PTPRE rs7081735 in the SLSJ sample (p = 0.000463; corrected p = 0.0478). This association has not been replicated in the Childhood Asthma Management Program (CAMP) cohort. Sequencing of the regions around rs7081735 revealed additional polymorphisms, but additional genotyping did not yield new associations. These results demonstrate that the G2D tool can be useful in the selection of candidate genes located in chromosomal regions linked to a complex trait.


Asthma involves genetic and environmental factors in its development, chronicity and severity [1], [2]. Although some of its underlying mechanisms have been elucidated in recent years, more work is needed to gain a clearer understanding of its genetic determinants. The mapping of asthma has been one of the most important areas of human genetics in the last two decades. According to an overview by Blumenthal (2005), twelve complete and two incomplete genome scans for asthma have been published, identifying a total of twenty chromosomal regions linked to asthma [3]. Discrepancies often appeared between linkage studies [3]–[5], leading to the use of standard phenotype definitions and founder populations as a way to decrease phenotypic and genetic heterogeneity [6]–[8].

To date, common strategies have been employed to identify genes involved in asthma predisposition. Linkage studies followed by positional cloning identified six genes, while association studies identified over a hundred genes, the majority of these having quite small effects on asthma susceptibility (see [9], [10] for a review). Here, we describe a novel methodological approach that combines classical genetic approaches with a computational data-mining tool. A genome-wide scan for asthma and atopy in families originating from the Saguenay–Lac-St-Jean (SLSJ) founder population (Northeastern Quebec, Canada) [11]–[16] has been combined with the new Genes to Diseases (G2D) computational tool for the prioritization of asthma candidate genes [17], [18]. G2D selects candidates in chromosomal regions genetically linked to a disease by highlighting genes whose functions are related to the disease phenotype, according to a data mining analysis of the bibliographic and protein public databases, or according to their similarity to genes already known to be associated with the same or a similar phenotype.

Single nucleotide polymorphisms (SNPs) within the candidate genes prioritized by the G2D tool have subsequently been used to conduct an association study with asthma, atopy and allergic asthma phenotypes in the SLSJ asthma familial collection. The chromosomal regions around associated SNPs have then been sequenced in order to find causal mutations, and replication of the positive association findings has been assessed in the independent Childhood Asthma Management Program (CAMP) cohort. The goal was to apply the G2D tool in the search for genetic determinants associated with a complex trait, using asthma as a model. This approach led to the hypothesis-driven identification of ten genes that may not have been selected otherwise (LPA, NOX3, SNX9, VIL2, VIP, ADAM8, DOCK1, FANK1, GPR123 and PTPRE). Of these, one positive genetic association that withstood correction for multiple testing has been found between the protein tyrosine phosphatase receptor type E gene (PTPRE) and allergic asthma in the SLSJ sample.



Clinical evaluation and phenotyping criteria for the Saguenay–Lac-St-Jean (SLSJ) subjects have been described in recent reports [19]–[21] and are summarized in Table 1. This familial sample is predominantly composed of probands who reported an age of asthma onset below 12 years (81.6% of the probands). The mean age of onset for the probands is 7 years and the mean age of onset for the asthmatic family members is 22 years. The entire sample has been used for the association study, while the genome scan used the first 79 recruited families, which were those available at the time the genome scan was performed (n = 609 individuals; see supplementary Table S1 for subject characteristics and studied phenotypes). The Chicoutimi Hospital local ethics committee approved the study and all subjects provided informed consent.

Table 1. Clinical and phenotypic characteristics of the Saguenay–Lac-St-Jean association study sample subjects.

Genome Scan

DNA was extracted for all SLSJ participants from whole blood by using the QIAGEN genomic purification procedure (QIAGEN Inc., Valencia, CA). Genotyping was completed on 367 autosomal and 21 X-chromosome microsatellite markers evenly spaced throughout the genome (average spacing of 9.2 cM). The marker set is a modification of the Cooperative Human Linkage Centre Screening Set (version 6.0), showing an average heterozygosity of 0.72 in our data set. Each primer was amplified separately and then pooled into panels of eight markers, and products were interrogated using ABI 3700 sequencers (Applied Biosystems, Foster City, CA) with a size standard ladder. Duplicates of two CEPH control DNAs and one water control were included in each genotyping plate. In our data set of 195,939 autosomal genotypes, 91.4% of the alleles were called and the proportion of observed Mendelian errors was 1.2%. Linkage of chromosomal regions with putative genetic risk factors for a given trait was assessed by evaluating the extent of excess sharing of alleles identical by descent in affected relatives within families. Test statistics are reported on the LOD scale. Briefly, a multipoint, one-parameter likelihood ratio test that is robust against incompleteness of marker data (when the descent of alleles in a pedigree is not fully known) was used [22], [23]. We also evaluated the specific contribution of mothers and fathers to the test of linkage, to look for parent-of-origin effects. The above tests of linkage, linkage through mothers and linkage through fathers are described in detail in [24].

Genes to Diseases (G2D)

The G2D tool has been applied in the two best genome scan susceptibility regions: 6q26 between markers D6S476 and D6S305, and 10q26.3 between markers D10S1223 and D10S1248. We used two of the approaches considered in G2D [18] to pre-select gene lists in these regions. The first approach uses a description of the phenotype to point to genes in a region. This method works with automatically derived relationships between the disease symptoms (as MeSH C terms) and gene features (as Gene Ontology or GO terms [25]) that are obtained from the literature and the Entrez Gene database. We call this procedure the “PHENOTYPE” method. The second approach consists of automatically finding genes in a region that are similar to other genes previously associated with asthma. To do this, G2D measures the semantic distance between the annotations of the “known genes” and the annotations of genes in the problem region assigned by homology searches. We call this procedure the “KNOWN GENES” method. For details about the algorithm, see the G2D web site and [17], [18].

Association Study

SNP selection.

The HapMap database has been used to identify SNPs assumed to be polymorphic in the SLSJ population. TagSNPs were then selected with the Tagger program implemented in the Haploview software (version 3.32) [26] using an r² cutoff of 0.8 and a minor allele frequency (MAF) over 0.10 to cover each whole gene. SNPs were also prioritized according to their location (coding or untranslated regions; see supplementary Table S2). All SNPs are referred to by their reference sequence number (rs#).
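The tagging criteria above (pairwise r² of at least 0.8, MAF over 0.10) can be illustrated with a minimal sketch. It assumes phased haplotypes with alleles coded 0/1 and uses a simple greedy pairwise scheme; it is not the Tagger implementation, and the function names and toy haplotypes are hypothetical.

```python
def maf(haplotypes, i):
    """Minor allele frequency of SNP column i (alleles coded 0/1)."""
    p = sum(h[i] for h in haplotypes) / len(haplotypes)
    return min(p, 1 - p)


def r_squared(haplotypes, i, j):
    """Pairwise linkage disequilibrium r^2 between SNP columns i and j."""
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n
    p_b = sum(h[j] for h in haplotypes) / n
    p_ab = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else 0.0


def greedy_tags(haplotypes, r2_cutoff=0.8, maf_cutoff=0.10):
    """Pick tags greedily until every common SNP is captured at r^2 >= cutoff."""
    n_snps = len(haplotypes[0])
    untagged = {i for i in range(n_snps) if maf(haplotypes, i) > maf_cutoff}
    tags = []
    while untagged:
        def captured(i):
            return {j for j in untagged
                    if j == i or r_squared(haplotypes, i, j) >= r2_cutoff}
        # Choose the SNP that captures the most remaining SNPs (lowest index on ties).
        best = max(sorted(untagged), key=lambda i: len(captured(i)))
        tags.append(best)
        untagged -= captured(best)
    return tags


# Hypothetical toy data: SNP 0 and SNP 1 are perfectly correlated,
# SNP 2 is independent of both, SNP 3 is monomorphic.
toy_haplotypes = [
    (0, 0, 0, 0), (0, 0, 1, 0), (1, 1, 0, 0), (1, 1, 1, 0),
    (0, 0, 0, 0), (0, 0, 1, 0), (1, 1, 0, 0), (1, 1, 1, 0),
]
selected = greedy_tags(toy_haplotypes)
```

In this toy case one SNP of the correlated pair is enough to tag both, the independent SNP must be typed itself, and the monomorphic SNP is dropped by the MAF filter.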

SNP genotyping.

Eighty SNPs have been genotyped by the Sequenom® matrix-assisted laser desorption/ionization time-of-flight mass array spectrometer (Sequenom Inc., San Diego, CA) (Table S2). Sequenom primers were designed using the Sequenom SNP Assay Design software version 3.0 for iPLEX reactions. A total of 74 assays were designed for a single multiplex reaction. The assay group file containing the PCR primers and the iPLEX extension probes can be supplied on request to the corresponding author. The protocol and reaction conditions are in accordance with the manufacturer [27]. The genotypes were viewed and analyzed using the MassARRAY Typer software version 3.4 (Sequenom Inc., San Diego, CA). The ten remaining SNPs (Table S2) have been genotyped by the TaqMan® SNP Genotyping Assays (Applied Biosystems, Foster City, CA) using the Rotor-Gene™ real-time PCR (Corbett Research Ltd, Sydney, Australia). Protocol and method were supplied by the manufacturer and PCR conditions were optimized to get a good cluster separation between different genotypes (see supplementary Table S3 for PCR conditions). Genotypes were attributed by the Rotor Gene software using the scatter graph analysis option.

Statistical analysis.

Family-based association testing has been performed with the FBAT software (version 1.7) using an empirical estimate of the variance [28]–[30] to correctly account for linkage. A Sidak correction for multiple testing has been applied to the p-values, accounting for the effective number of independent phenotypes and SNPs, according to the definition of Li and Ji [31], as implemented in the SNPSpD program [32] (see the online supporting Text S1 file for supplementary details). Parent-specific transmission disequilibrium tests were performed using sib_tdt from the ASPEX package. Mendelian errors have been assessed by FBAT and Hardy-Weinberg equilibrium has been assessed with the Haploview software (version 3.32) [26].
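As an illustration of the Hardy-Weinberg check mentioned above, the following minimal sketch compares observed genotype counts with their expected values using a one-degree-of-freedom chi-square approximation. This is an assumption made for illustration and not necessarily the test implemented in Haploview; the function name and the counts in the usage note are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg proportions from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For example, `hwe_chi_square(25, 50, 25)` describes a SNP in exact Hardy-Weinberg proportions, whereas `hwe_chi_square(50, 0, 50)` describes a complete absence of heterozygotes and would be flagged at any conventional threshold.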

PTPRE Sequencing

Forty unrelated SLSJ subjects (validated with the BALSAC database [33]) who fully met the allergic asthma criteria and contributed to the PTPRE association were selected. PTPRE sequence information was obtained from the Ensembl database (release 46). The sequencing was divided into six regions that spanned 2.7 kb, from exon 2 to exon 4. Oligonucleotides and PCR conditions are listed in Table S4. Amplification products were purified with multiscreen PCR plates (Millipore Corporation, Billerica, MA), sequenced with BigDye terminator v3.1 chemistry following the manufacturer's instructions and analyzed on a 3100 Genetic Analyzer (Applied Biosystems, Foster City, CA). Sequence analysis was performed with CodonCode Aligner software (CodonCode Corporation). The identified SNPs with a MAF over 0.05 have been genotyped in the SLSJ sample using a Sequenom panel and analyzed with FBAT, as described above. Newly described SNPs have been submitted to the NCBI SNP database.

Replication Study

The PTPRE rs7081735 association has been assessed in the Childhood Asthma Management Program (CAMP) study [34], [35]. This analysis includes the 497 non-Hispanic white children and their parents for whom adequate DNA was available. The genotyping of PTPRE rs7081735 was performed using a TaqMan® SNP Genotyping Assay (Applied Biosystems, Foster City, CA) (see Table S3 for PCR conditions). Plates were scanned using the 7900HT Fast Real-Time PCR System (Applied Biosystems, Foster City, CA) and genotypes were assigned by the SDS 2.2 software using the scatter graph analysis option. The Institutional Review Board of the Brigham and Women's Hospital (BWH), as well as those of the other CAMP study centers, approved this study. Informed assent and consent were obtained from the study participants and their parents to collect DNA for genetic studies.


Genome Scan

The genome-wide linkage scan analysis revealed at least two regions showing suggestive evidence for linkage, as well as differential maternal and paternal contribution (Figure 1). The two regions are 6q26 for asthma (LOD = 1.54, p = 0.0038) and 10q26.3 for atopy (LOD = 2.82, p = 0.00016). In these two regions, affected sibs tend to share more alleles inherited from the mothers than from the fathers. The linkage tests through the mothers reach a LOD of 2.19 (p = 0.00074) in 6q26 for asthma and a LOD of 2.96 (p = 0.00011) in 10q26.3 for atopy. The respective LODs obtained through the fathers at the same loci are only 0.01 (p = 0.40) and 0.58 (p = 0.05). Moreover, even though the 10q region does not show strong linkage with asthma (LOD = 0.57, p = 0.051), the asthmatic sibs tend to share these alleles when received from their mothers (LOD = 2.82, p = 0.00016). Each of the two chromosomal regions shows LOD values (for the tests of linkage, the parent-specific LODs, or both) that are, in order of magnitude, consistent with what Lander and Kruglyak defined as “suggestive” for linkage [36], a level expected to occur once per whole genome scan on average. Because of their great hypothesis-generating potential, the 6q26 and 10q26 regions as well as the asthma and atopy phenotypes were selected for the following G2D and association studies.
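The relation between the LOD scores and p-values reported above can be sketched as follows, assuming that the one-parameter likelihood ratio statistic is referred to a one-sided standard normal distribution, that is, Z = sqrt(2 ln(10) LOD). This is an assumption that is consistent with the reported pairs rather than a statement of the software's exact procedure; the function name is hypothetical.

```python
import math


def lod_to_p(lod):
    """One-sided p-value for a one-parameter LOD score (normal approximation)."""
    if lod < 0:
        raise ValueError("LOD must be non-negative")
    # Z = sqrt(2 * ln(10) * LOD); the upper normal tail is 0.5 * erfc(Z / sqrt(2)),
    # and Z / sqrt(2) simplifies to sqrt(ln(10) * LOD).
    return 0.5 * math.erfc(math.sqrt(math.log(10) * lod))
```

Under this assumption, `lod_to_p(1.54)` and `lod_to_p(2.82)` come out close to the values given for 6q26 and 10q26.3; small differences are expected because the LOD scores are rounded to two decimals.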

Figure 1. Genome scan results summary.

Results from the tests of linkage with atopy (top three panels) and asthma (bottom three panels) reported on the LOD scale. For each phenotype, the top panel shows the results of the test of linkage (excess allele sharing), the middle panel shows results from the tests of linkage through mothers (excess allele sharing transmitted from mothers) and the bottom panel shows results from the tests of linkage through fathers (excess allele sharing transmitted from fathers).


We applied the two algorithms PHENOTYPE and KNOWN GENES to the 6q26 and 10q26.3 regions. For the PHENOTYPE method, we used the OMIM [37] record 600807 as input, particularly the MeSH C terms from the MEDLINE references in OMIM entry 600807, which deals with susceptibility to asthma and asthma-related traits. The most prevalent MeSH C terms are “Asthma” and “Bronchial Hyperreactivity”, but the list also includes “Hypersensitivity”, “Respiratory Hypersensitivity” and “Eosinophilia”. We thus considered that the terms that refer to asthma were “Asthma” and “Bronchial Hyperreactivity” and that the terms that refer to atopy were “Hypersensitivity”, “Respiratory Hypersensitivity” and “Eosinophilia”. The highest scoring GO terms associated with these MeSH C terms describe a variety of molecular functions and processes that include leukotriene and interleukin signaling, glutathione metabolism, etc. (see header of supplementary Table S5). We applied this method to the 6q26 region between D6S476 and D6S305, the two markers directly flanking the linkage peak seen in the region, which corresponds to the 10.48 Mb interval between positions 151,685,574 and 162,165,587 of chromosome 6. After discarding candidates that did not overlap with any known or hypothetical Entrez Gene sequence, 16 genes remained (see supplementary Table S5-A). A similar analysis was carried out between D10S1223 and D10S1248, the two markers directly flanking the linkage peak in 10q26.3, between positions 129,150,822 and 130,982,363 of chromosome 10. In that case, 12 candidates were obtained (see supplementary Table S5-B). For the KNOWN GENES method, we compiled a list of genes reported to be associated with asthma and atopy from the literature [38] and from the Genetic Association Database GAD [39] (supplementary Table S6). We then extracted the GO annotation of those genes in Entrez Gene [37]. We derived a scoring system for the candidates according to the minimal semantic distance between their GO annotation and the ones from the compiled known-gene list. For example, genes annotated with GO terms such as “dipeptidyl-peptidase IV activity” or “chemokine receptor binding” would score high as candidates. The method takes into account the hierarchical structure of GO as well as the specificity of GO terms. In that sense, similarity with more infrequent terms receives higher scores. For the complete list of GO terms see the header of supplementary Table S7. We applied the KNOWN GENES method to both regions 6q26 and 10q26.3, obtaining 15 and 10 genes, respectively, after filtering out those candidates that did not overlap with either known or hypothetical genes (see supplementary Table S7).
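The idea of scoring candidates by semantic closeness to known genes, with rarer terms counting for more, can be illustrated with a small sketch. It uses a Resnik-style information-content similarity on a hypothetical toy hierarchy; this is a swapped-in, generic technique for illustration and not the G2D scoring algorithm, and the term counts and function names are invented for the example.

```python
import math

# Hypothetical toy hierarchy (child -> parent). The real GO is a DAG in which
# a term can have several parents; a tree keeps the sketch short.
PARENT = {
    "binding": None,
    "receptor binding": "binding",
    "chemokine receptor binding": "receptor binding",
    "hormone receptor binding": "receptor binding",
    "ion binding": "binding",
}

# Hypothetical number of genes annotated directly with each term.
DIRECT_COUNT = {
    "binding": 10,
    "receptor binding": 6,
    "chemokine receptor binding": 2,
    "hormone receptor binding": 2,
    "ion binding": 20,
}


def lineage(term):
    """Return the term followed by all of its ancestors up to the root."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path


def usage(term):
    """Count annotations to the term itself or to any of its descendants."""
    return sum(n for t, n in DIRECT_COUNT.items() if term in lineage(t))


TOTAL = usage("binding")


def information_content(term):
    """Rarer (more specific) terms carry more information."""
    return math.log(TOTAL / usage(term))


def similarity(term_a, term_b):
    """Information content of the most informative shared ancestor."""
    shared = set(lineage(term_a)) & set(lineage(term_b))
    return max(information_content(t) for t in shared)


def score_candidate(candidate_terms, known_terms):
    """Best similarity between any candidate term and any known-gene term."""
    return max(similarity(c, k) for c in candidate_terms for k in known_terms)
```

With a known-gene list annotated with “chemokine receptor binding”, a candidate sharing that exact term scores highest, a candidate annotated with “hormone receptor binding” scores lower because the two terms only meet at “receptor binding”, and a candidate annotated with “ion binding” scores zero because the only shared ancestor is the root.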

Use of a complementary method based on genomic sequence.

We applied the G2D complementary Disease Gene Prediction (DGP) tool [40] to both genetically linked regions in order to predict the involvement of genes in inherited disease by their sequence features. This method analyses the probability of a gene to be associated with any disease phenotype. Genes with no associated phenotype and a probability greater than 0.7 were retained. With this criterion the DGP method identified one candidate in chromosome 10 (MMP21) and three candidates in the 6q26 region: TFB1M [MIM:607033], RGS17 [MIM:607191] and VIP [MIM:192320].

Final list of candidate genes.

Genes that received the highest scores in the pre-selected lists, and those that were pointed to by the two different analyses, were preferred, allowing the construction of a list of 17 candidates (displayed in Table 2). We then applied a candidate gene approach to select a final list of ten genes with the best biological potential related to asthma pathophysiology for genotyping (five in each chromosomal region): LPA, NOX3, SNX9, VIL2, VIP, ADAM8, DOCK1, FANK1, GPR123 and PTPRE (marked in bold in Table 2).

Table 2. Genes identified by G2D data mining analysis and those selected for the association study based on their number of appearance in G2D analyses and on their biological function.

Association Study

A final panel of 91 SNPs (75 tagSNPs and 16 non-tagSNPs) was selected among the ten candidate genes (supplementary Table S2). Of these, four were non-polymorphic, six failed Sequenom genotyping assays, none presented deviation from Hardy-Weinberg equilibrium (p-values >0.001) and one presented more than two Mendelian errors (Table S2). For the remaining 80 SNPs, genotypes of individuals with Mendelian errors were considered as missing data in the FBAT analyses. Genotyping presented a mean success rate of 99.0% for the Sequenom assays and a mean success rate of 98.3% for the TaqMan assays. Accounting for the residual correlation between the tagSNPs, it is estimated that the 80 partially correlated SNPs correspond to an effective number of 59 independent ones [31]. As for the effective number of phenotypes, simulation indicates that the three studied phenotypes (asthma, atopy and allergic asthma) correspond to 1.75 effective independent ones (see supplementary Text S1 file). Accordingly, the estimated total effective number of independent tests is 103.25 (59 effective independent SNPs×1.75 effective independent phenotypes). Thus, applying Sidak correction, the p-value threshold of significance is estimated to be 0.000483.
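The arithmetic behind the multiple-testing threshold can be sketched as follows. The effective numbers (59 SNPs, 1.75 phenotypes), the 0.05 level and the raw p-value of 0.000463 come from the text; the function names are hypothetical, and the code applies the textbook Šidák expressions, which may not be the exact computation performed in the study.

```python
def effective_tests(n_eff_snps, n_eff_phenotypes):
    """Total effective number of independent tests."""
    return n_eff_snps * n_eff_phenotypes


def sidak_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_tests)


def sidak_adjust(p_value, n_tests):
    """Family-wise corrected p-value for a single raw p-value."""
    return 1.0 - (1.0 - p_value) ** n_tests


n_tests = effective_tests(59, 1.75)            # 103.25
threshold = sidak_threshold(0.05, n_tests)     # exact Sidak form
linear_threshold = 0.05 / n_tests              # linear (Bonferroni-type) form
corrected = sidak_adjust(0.000463, n_tests)    # raw p-value for PTPRE rs7081735
linear_corrected = 0.000463 * n_tests
```

With these inputs the exact Šidák form gives a threshold of about 0.000497 and a corrected p-value of about 0.0467, while the linear form gives about 0.000484 and 0.0478. The reported values (0.000483 and 0.0478) lie closer to the linear form; the remaining difference may reflect rounding of the effective numbers, which the text does not detail.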

The FBAT single marker analyses were performed under an additive genetic model for each SNP and the three studied phenotypes. For the sake of brevity, only results showing a p-value under 0.05 before correction for multiple testing for one or more phenotypes are presented in Table 3. For the asthma phenotype, minor alleles LPA_rs12175867C, GPR123_rs11101913T, GPR123_rs11101932T and GPR123_rs12257731A were overtransmitted to the asthmatic probands, suggesting a susceptibility effect of these alleles for asthma (0.0085<p<0.047). Conversely, minor alleles DOCK1_rs1051039G, PTPRE_rs4369314A and PTPRE_rs7081735G were undertransmitted to the asthmatic probands, suggesting a protective effect of these alleles for asthma (0.010<p<0.049). For the atopy phenotype, only the ADAM8_rs11101672G and PTPRE_rs7081735G minor alleles were undertransmitted to the atopic probands, suggesting a protective effect (p = 0.039 and 0.037, respectively). Finally, for the allergic asthma phenotype, minor alleles LPA_rs12175867C, GPR123_rs11101913T, GPR123_rs11101932T and GPR123_rs12257731A were overtransmitted to the allergic asthmatic probands, suggesting a susceptibility effect of these alleles for allergic asthma (0.035<p<0.041). Conversely, minor alleles ADAM8_rs11101672G, GPR123_rs11101916A, GPR123_rs761777G, PTPRE_rs11016002A, PTPRE_rs4002572C and PTPRE_rs7081735G were undertransmitted to the allergic asthmatic probands, suggesting a protective effect of these alleles for allergic asthma (0.000463<p<0.037). None of the SNPs reported above showed a significantly greater extent of transmission distortion from mothers than from fathers, and thus they provide no insight into the parent-of-origin effects seen in the linkage results (not shown).
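The notion of over- and undertransmission used above can be illustrated with the classical transmission disequilibrium test for parent-offspring trios. This is a deliberately simplified stand-in: FBAT with an empirical variance handles general pedigrees and the presence of linkage, which this sketch does not. The function names and the counts in the usage note are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def tdt(transmitted, untransmitted):
    """Classical TDT for one allele.

    transmitted   -- times heterozygous parents passed the allele to an affected child
    untransmitted -- times heterozygous parents passed the other allele instead
    Returns (Z, two-sided p); Z < 0 means the allele is undertransmitted.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    # Under the null, each informative transmission is a fair coin flip,
    # so (b - c) / sqrt(b + c) is approximately standard normal.
    z = (transmitted - untransmitted) / math.sqrt(total)
    return z, two_sided_p(z)
```

For example, `tdt(30, 60)` returns a negative Z, the pattern described in the text as undertransmission and interpreted as a protective effect. As a check on the conversion, `two_sided_p(-3.166)` is about 0.0015, in line with the p-value that the Replication Study section reports for that Z value.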

Table 3. Significant Family-Based Association Test (FBAT) results between SNPs in the ten G2D candidate genes and the asthma, atopy and allergic asthma phenotypes under an additive genetic model.

PTPRE Sequencing

PTPRE rs7081735 shows the strongest association with allergic asthma (p = 0.000463) and is the only SNP that remains significant after multiple testing correction (corrected p = 0.0478). We thus sequenced strategic genomic regions around rs7081735, including coding regions, in order to identify a causal mutation. Figure 2 shows the PTPRE sequenced regions, the rs7081735 localization and the nine identified variants (diamonds), including four novel ones (c.86172A>G, c.140901G>A, c.140903G>A and c.141102C>T) (Table 4). It is worth noting that any of these identified polymorphisms may affect either PTPRE isoform. A family-based association analysis has been performed between the three studied phenotypes and the variants with a MAF over 0.05 that were not included in the first genotyping panel, which was the case for four of the nine identified variants (rs7911506, c.86172A>G, rs7895103 and c.140901G>A). Only rs7895103 showed a modest association with asthma, at a level that does not provide additional insights (p = 0.013, compared to p = 0.000463 for rs7081735).

Figure 2. Sequenced regions of the PTPRE gene and scaled locations of the identified SNPs.

The thick black boxes above the gene define its sequenced parts, which are identified by the same numbers used in Table 4, in which exact chromosomal positions are available. Studied SNPs are represented below the gene. The TagSNPs are marked with an asterisk (*) and the SNPs identified by sequencing with a diamond (◊). All PTPRE numbers for the discovered SNPs are based on the mRNA sequence NM_006504 (variant 1, receptor form) and on the NCBI SNP database (build 127). Image source: HapMap, October 2007 (Genome Browser, version 1.69); modified according to our study design.

Table 4. PTPRE sequenced regions and characteristics of identified SNPs.

Replication Study

Considering that the PTPRE rs7081735 association with allergic asthma is the strongest in the SLSJ sample, we evaluated it in an independent familial cohort, the Childhood Asthma Management Program (CAMP) [34]. The genotyping completion rate was 96% and no discordance was observed upon repeat genotyping of two random plates. The minor allele frequency was 0.33 and the SNP was in Hardy-Weinberg equilibrium (p = 0.90). Family-based association testing showed no significant association for asthma or atopy phenotypes (data not shown). However, to ensure that the CAMP study was an appropriate choice for replication and to assess whether the association could reflect childhood-onset rather than adult-onset asthma, we stratified the SLSJ association analyses, considering only the probands who reported an age of asthma onset below 12 years. The positive association found for PTPRE rs7081735 and allergic asthma (p = 0.000463, Table 3) remains significant when considering only those probands (number of families = 74; Z for the minor allele = −3.166; p = 0.001546). Even with the loss of statistical power due to the stratified analysis, this comparison suggests that the association found for PTPRE is probably more closely related to childhood asthma, supporting the choice of the CAMP cohort for the replication study.


This study proposes a novel approach for the selection of candidate genes for asthma association studies using the computational G2D tool to find genetic determinants for this disease. Based on a genome-wide scan performed in an asthmatic familial sample from the SLSJ founder population, we selected the two best-linked regions (6q26 and 10q26.3) and applied the G2D data mining approach [17], [18] to identify ten candidate genes for an association in the SLSJ sample. Among these, five (LPA, ADAM8, DOCK1, GPR123 and PTPRE) presented modest associations with asthma, atopy or allergic asthma. After corrections for multiple testing, only the PTPRE rs7081735 association to allergic asthma remained significant. These findings demonstrate that the G2D tool can be useful in the selection of candidate genes for asthma genetic studies.

Because the PTPRE association with allergic asthma remained significant after correction for multiple testing, we sequenced strategic regions around rs7081735, aiming to find the causal mutation. Sequencing allowed the identification of five known and four novel variants. Testing four SNPs with MAF >0.05 did not identify additional SNPs associated with asthma or atopy-related phenotypes in the SLSJ sample. Thus, rs7081735, located in the 5′ untranslated region, could be the causal variant, or be in linkage disequilibrium with an unknown causal mutation.

PTPRE is a member of the protein tyrosine phosphatase (PTP) family, which includes genes that are important regulators of signal transduction pathways involved in various cellular processes such as control of metabolic pathways, cellular adhesion, cell cycle progression and immune response [41], [42]. PTPRE encodes two different isoforms, cytoplasmic and transmembrane [43], produced from different promoters [44]. Receptor PTPs participate in transmembrane signaling and cellular adhesion processes, whereas intracellular PTPs take part in signal transduction within the cell [45]. For example, in mice, PTPε-deficient macrophages present abnormalities in the regulation of the respiratory burst and the production of cytokines in response to bacterial lipopolysaccharide, suggesting a role of the PTPε isoform in inflammation as well as in host defense [46]. However, PTPRE has been shown to be highly expressed in peripheral human monocytes and granulocytes, and antigen-receptor stimulation induces the expression of PTPRE in activated lymphocytes [47]. These observations and our finding suggest a protective effect of PTPRE in allergic asthma, and lead us to hypothesize that this potential protective role could involve leukocyte cellular processes that limit lung inflammation following allergen sensitization. Further work is needed to define the possible role of PTPRE in asthma pathophysiology.

The PTPRE rs7081735 association has been evaluated in the independent CAMP cohort. Results showed no positive association for asthma and atopic phenotypes. Given that the SLSJ and CAMP studies are both family-based, well powered [48], [49] and focused on childhood-onset asthma, the lack of replication of the PTPRE association in the CAMP study may be explained by other factors: natural variability of asthma history [50], [51], differences in proband mean age between samples (18 years for SLSJ and 8 years for CAMP), differences in the genetic background of the two populations [52] (SLSJ individuals descend predominantly from French European founders [11]–[16], whereas CAMP individuals come from a white North American admixed population), or population-specific gene-gene and gene-environment interactions [52]. Based on these observations, we conclude that the PTPRE rs7081735 association with allergic asthma is more penetrant in the SLSJ population, possibly resulting from an interaction with other genes and/or environmental factors that are more common in this founder population.

In summary, this study demonstrates that the G2D tool can be useful in the prioritization of candidate genes for a complex disease as it allowed us to find a novel asthma genetic association with the PTPRE gene in the SLSJ familial asthma sample. This association represents a potential protective factor for asthma pathogenesis, as it is more likely related to a childhood onset asthma. The present genetic study is an example of how the combination of different methodological approaches can be relevant to target asthma genetic determinants and to motivate further genetic and functional investigations.

Supporting Information

Table S1.

Clinical characteristics and studied phenotypes of the Saguenay-Lac-St-Jean genome-wide scan subjects

(0.06 MB DOC)

Table S2.

Characteristics of the 91 selected SNPs

(0.25 MB DOC)

Table S4.

Oligonucleotides used for PTPRE sequencing

(0.05 MB DOC)

Table S5.

Genes selected by G2D « PHENOTYPE » analysis

(0.07 MB DOC)

Table S6.

Genes known or suspected to be associated with asthma that were considered for the G2D « KNOWN GENES » analysis

(0.07 MB DOC)

Table S7.

Genes selected by G2D « KNOWN GENES » analysis

(0.06 MB DOC)


Acknowledgments

We thank the families of the Saguenay–Lac-St-Jean region for their participation in this study. We also thank Janet Murphy for her participation in the genome scan, as well as Alexandre Belisle, Pierre Lepage and Charleen Salesse for the genotyping assays. K. Tremblay is an AllerGen PhD trainee and is supported by a Fondation de l'Université Laval studentship. T.J. Hudson received an Investigator Award from the Canadian Institutes of Health Research (CIHR) and a Clinician-scientist Award in Translational Research from the Burroughs Wellcome Fund. M.A. Andrade-Navarro holds a Canada Research Chair in Bioinformatics. C. Laprise is the chairholder of the Canada Research Chair on genetic determinants in asthma and the director of the Genetics platform of the Respiratory Health Network of the Fonds de la recherche en santé du Québec. We also thank all subjects of the CAMP study for their ongoing participation. We acknowledge the CAMP investigators and research team, supported by NHLBI, for collection of CAMP Genetic Ancillary Study data. All work on data collected from the CAMP Genetic Ancillary Study was conducted at the Channing Laboratory of the Brigham and Women's Hospital under appropriate CAMP policies and human subjects protections.

Author Contributions

Conceived and designed the experiments: CPI MAAN CL. Performed the experiments: CP AT CPI CL. Analyzed the data: KT ML GMH CPI CL. Contributed reagents/materials/analysis tools: KT ML CP. Wrote the paper: KT.


References

  1. Maddox L, Schwartz DA (2002) The pathophysiology of asthma. Annu Rev Med 53: 477–498.
  2. Cookson WO, Moffatt MF (2000) Genetics of asthma and allergic disease. Hum Mol Genet 9: 2359–2364.
  3. Blumenthal MN (2005) The role of genetics in the development of asthma and atopy. Curr Opin Allergy Clin Immunol 5: 141–145.
  4. Sandford A, Weir T, Pare P (1996) The genetics of asthma. Am J Respir Crit Care Med 153: 1749–1765.
  5. Hoffjan S, Ober C (2002) Present status on the genetic studies of asthma. Curr Opin Immunol 14: 709–717.
  6. Rannala B (2001) Finding genes influencing susceptibility to complex diseases in the post-genome era. Am J Pharmacogenomics 1: 203–221.
  7. Ober C, Tsalenko A, Parry R, Cox NJ (2000) A second-generation genomewide screen for asthma-susceptibility alleles in a founder population. Am J Hum Genet 67: 1154–1162.
  8. Laitinen T, Daly MJ, Rioux JD, Kauppi P, Laprise C, et al. (2001) A susceptibility locus for asthma-related traits on chromosome 7 revealed by genome-wide scan in a founder population. Nat Genet 28: 87–91.
  9. Hoffjan S, Nicolae D, Ober C (2003) Association studies for asthma and atopic diseases: a comprehensive review of the literature. Respir Res 4: 14.
  10. Ober C, Hoffjan S (2006) Asthma genetics 2006: the long and winding road to gene discovery. Genes Immun 7: 95–100.
  11. Heyer E, Tremblay M (1995) Variability of the genetic contribution of Quebec population founders associated to some deleterious genes. Am J Hum Genet 56: 970–978.
  12. Scriver CR (2001) Human genetics: lessons from Quebec populations. Annu Rev Genomics Hum Genet 2: 69–101.
  13. Labuda M, Labuda D, Korab-Laskowska M, Cole DE, Zietkiewicz E, et al. (1996) Linkage disequilibrium analysis in young populations: pseudo-vitamin D-deficiency rickets and the founder effect in French Canadians. Am J Hum Genet 59: 633–643.
  14. Engert JC, Berube P, Mercier J, Dore C, Lepage P, et al. (2000) ARSACS, a spastic ataxia common in northeastern Quebec, is caused by mutations in a new gene encoding an 11.5-kb ORF. Nat Genet 24: 120–125.
  15. Richter A, Rioux JD, Bouchard JP, Mercier J, Mathieu J, et al. (1999) Location score and haplotype analyses of the locus for autosomal recessive spastic ataxia of Charlevoix-Saguenay, in chromosome region 13q11. Am J Hum Genet 64: 768–775.
  16. Lander ES, Schork NJ (1994) Genetic dissection of complex traits. Science 265: 2037–2048.
  17. Perez-Iratxeta C, Wjst M, Bork P, Andrade MA (2005) G2D: a tool for mining genes associated with disease. BMC Genet 6: 45.
  18. Perez-Iratxeta C, Bork P, Andrade-Navarro MA (2007) Update of the G2D tool for prioritization of gene candidates to inherited diseases. Nucleic Acids Res Jul 1: W212–W216.
  19. Poon AH, Laprise C, Lemire M, Montpetit A, Sinnett D, et al. (2004) Association of vitamin D receptor genetic variants with susceptibility to asthma and atopy. Am J Respir Crit Care Med 170: 967–973.
  20. Tremblay K, Lemire M, Provost V, Pastinen P, Renaud Y, et al. (2006) Association study between the CX3CR1 gene and asthma. Genes Immun 7: 632–639.
  21. Begin P, Tremblay K, Daley D, Lemire M, Claveau S, et al. (2007) Association of urokinase-type plasminogen activator with asthma and atopy. Am J Respir Crit Care Med 175: 1109–1116.
  22. Kong A, Cox NJ (1997) Allele-sharing models: LOD scores and accurate linkage tests. Am J Hum Genet 61: 1179–1188.
  23. Kruglyak L, Daly MJ, Reeve-Daly MP, Lander ES (1996) Parametric and nonparametric linkage analysis: a unified multipoint approach. Am J Hum Genet 58: 1347–1363.
  24. Lemire M (2005) A simple nonparametric multipoint procedure to test for linkage through mothers or fathers as well as imprinting effects in the presence of linkage. BMC Genet 6 (Suppl 1): S159.
  25. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, et al. (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25: 25–29.
  26. Barrett JC, Fry B, Maller J, Daly MJ (2005) Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21: 263–265.
  27. Oeth P, Beaulieu M, Park C, Kosman D, del Mistro G, et al. (2005) iPLEX assay: Increased plexing efficiency and flexibility for MassArray system through single base primer extension with mass-modified terminators. SEQUENOM Application Note.
  28. Lake SL, Blacker D, Laird NM (2000) Family-based tests of association in the presence of linkage. Am J Hum Genet 67: 1515–1525.
  29. Horvath S, Xu X, Lake SL, Silverman EK, Weiss ST, et al. (2004) Family-based tests for associating haplotypes with general phenotype data: application to asthma genetics. Genet Epidemiol 26: 61–69.
  30. Laird NM, Horvath S, Xu X (2000) Implementing a unified approach to family-based tests of association. Genet Epidemiol 19 (Suppl 1): S36–S42.
  31. Li J, Ji L (2005) Adjusting multiple testing in multilocus analyses using the eigenvalues of a correlation matrix. Heredity 95: 221–227.
  32. Nyholt DR (2004) A simple correction for multiple testing for single-nucleotide polymorphisms in linkage disequilibrium with each other. Am J Hum Genet 74: 765–769.
  33. Bouchard G, Roy R, Casgrain B, Hubert M (1989) [Population files and database management: the BALSAC database and the INGRES/INGRID system]. Hist Mes 4: 39–57.
  34. The Childhood Asthma Management Program Research Group (1999) The Childhood Asthma Management Program (CAMP): design, rationale, and methods. Control Clin Trials 20: 91–120.
  35. The Childhood Asthma Management Program Research Group (2000) Long-term effects of budesonide or nedocromil in children with asthma. N Engl J Med 343: 1054–1063.
  36. Lander E, Kruglyak L (1995) Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat Genet 11: 241–247.
  37. Wheeler DL, Barrett T, Benson DA, Bryant SH, Canese K, et al. (2007) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 35: D5–D12.
  38. Wills-Karp M, Ewart SL (2004) Time to draw breath: asthma-susceptibility genes are identified. Nat Rev Genet 5: 376–387.
  39. Becker KG, Barnes KC, Bright TJ, Wang SA (2004) The genetic association database. Nat Genet 36: 431–432.
  40. Lopez-Bigas N, Ouzounis CA (2004) Genome-wide identification of genes likely to be involved in human genetic disease. Nucleic Acids Res 32: 3108–3114.
  41. Alonso A, Sasin J, Bottini N, Friedberg I, Friedberg I, et al. (2004) Protein tyrosine phosphatases in the human genome. Cell 117: 699–711.
  42. Li L, Dixon JE (2000) Form, function, and regulation of protein tyrosine phosphatases and their involvement in human diseases. Semin Immunol 12: 75–84.
  43. Nakamura K, Mizuno Y, Kikuchi K (1996) Molecular cloning of a novel cytoplasmic protein tyrosine phosphatase PTP epsilon. Biochem Biophys Res Commun 218: 726–732.
  44. Tanuma N, Nakamura K, Kikuchi K (1999) Distinct promoters control transmembrane and cytosolic protein tyrosine phosphatase epsilon expression during macrophage differentiation. Eur J Biochem 259: 46–54.
  45. Schumann G, Fiebich BL, Menzel D, Hüll M, Butcher R, et al. (1998) Cytokine-induced transcription of protein-tyrosine-phosphatases in human astrocytoma cells. Brain Res Mol Brain Res 62: 56–64.
  46. Sully V, Pownall S, Vincan E, Bassal S, Borowski AH, et al. (2001) Functional abnormalities in protein tyrosine phosphatase epsilon-deficient macrophages. Biochem Biophys Res Commun 286: 184–188.
  47. Wabakken T, Hauge H, Finne EF, Wiedlocha A, Aasheim H (2002) Expression of human protein tyrosine phosphatase epsilon in leucocytes: a potential ERK pathway-regulating phosphatase. Scand J Immunol 56: 195–203.
  48. Lemire M, Roslin NM, Laprise C, Hudson TJ, Morgan K (2004) Transmission-ratio distortion and allele sharing in affected sib pairs: a new linkage statistic with reduced bias, with application to chromosome 6q25.3. Am J Hum Genet 75: 571–586.
  49. Hersh CP, Raby BA, Soto-Quiros ME, Murphy AJ, Avila L, et al. (2007) Comprehensive Testing of Positionally Cloned Asthma Genes in Two Populations. Am J Respir Crit Care Med 176: 849–857.
  50. Reed CE (2006) The natural history of asthma. J Allergy Clin Immunol 118: 543–548; quiz 549–550.
  51. Koh MS, Irving LB (2007) The natural history of asthma from childhood to adulthood. Int J Clin Pract 61: 1371–1374.
  52. Hirschhorn JN, Lohmueller K, Byrne E, Hirschhorn K (2002) A comprehensive review of genetic association studies. Genet Med 4: 45–61.