TGFA and IRF6 Contribute to the Risk of Nonsyndromic Cleft Lip with or without Cleft Palate in Northeast China

Nonsyndromic cleft lip with or without cleft palate (NSCL/P) is a common birth defect with a complex etiology. Multiple interacting loci and possible environmental factors influence the risk of NSCL/P. Twelve single nucleotide polymorphisms (SNPs) in seven candidate genes were tested using allele-specific primer extension for case-control and case-parent analyses in northeast China in 236 unrelated patients, 185 mothers and 154 fathers (including 128 complete trios), and 400 control individuals. The TGFA and IRF6 genes showed a significant association with NSCL/P. In IRF6, statistical evidence of an association between rs2235371 (p = 0.003), rs2013162 (p<0.0001) and NSCL/P was observed in case-control analyses. Family-based association tests (FBATs) showed over-transmission of the C allele at the rs2235371 polymorphism (p = 0.007). In TGFA, associations between rs3771494, rs3771523 (G3822A), rs11466285 (T3851C) and NSCL/P were observed in case-control and FBAT analyses. Associations between other genes (BCL3, TGFB3, MTHFR, PVRL1 and SUMO1) and NSCL/P were not detected.


Introduction
Nonsyndromic cleft lip with or without cleft palate (NSCL/P) is one of the most common birth defects, affecting 1 in 500 to 1000 births worldwide [1]. In China, this common malformation occurs at a rate of 1.6 per 1000 live births. NSCL/P presents a significant public health problem because treatment requires comprehensive surgical, orthodontic, phoniatric and psychological management [2].
Most recently, several genome-wide association studies (GWAS) reported a major susceptibility locus for NSCL/P on 8q24.21, as well as other susceptibility genes [7]. A case-control GWAS in Germany found significant evidence of association with markers in 8q24.21 [8], and a US case-control GWAS confirmed this region, with rs987525 being the most significant marker in both studies [9]. Mangold and colleagues subsequently used an expanded dataset from Europe and identified additional loci at chromosomes 10q25 (VAX1, ventral anterior homeobox 1) and 17q22 (NOG, noggin) that achieved genome-wide significance [10]. Beaty and colleagues performed a GWAS using case-parent trios from Europe, the United States, China, Taiwan, Singapore, Korea and the Philippines and found that SNPs in two genes not previously associated with CL/P (ABCA4 on chromosome 1p22.1 and MAFB on 20q12) achieved genome-wide significance, and that three potential candidate genes (PAX7 on 1p36, VAX1 on 10q25.3 and NTN1 on 17p13) had one or more SNPs near genome-wide significance [11]. These novel loci have subsequently been replicated in independent studies [12][13][14]. Leslie et al. resequenced ARHGAP29 and identified a nonsense variant, a frameshift variant, and fourteen missense variants that were overrepresented [15]. A meta-analysis based on the two GWAS identified six new risk loci [16].
Transforming growth factor alpha (TGFA) belongs to a large family of proteins that regulate cell proliferation, differentiation, migration and apoptosis. Ardinger et al. first reported an association between the Taq I variant in TGFA and NSCL/P [17]. TGFA has since been extensively investigated for linkage, association and gene-environment interactions, with inconsistent results. The mutations (C3827T, G3822A, and T3851C) in conserved regions of the 3′ untranslated region were reported to be associated with oral-facial clefts [18]. Sull et al. recently tested 17 SNPs in a region surrounding TGFA on chromosome 2p13 and reported over-transmission of the minor allele of rs3771494, without considering the parent of origin, in four populations [19].
The interferon regulatory factor 6 (IRF6) gene, located on chromosome 1q32.3-q41, has been frequently studied and is strongly associated with oral-facial cleft risk. This gene encodes interferon regulatory factor 6, which is a key element in oral and maxillofacial development. Scapoli et al. selected four SNPs based on Zucchero et al. [20] and reported that rs2013162 and rs2235375 have strong linkage disequilibrium with NSCL/P in an Italian family [21]. Srichomthong et al. suggested that IRF6 rs2235371 (V274I) is responsible for 16.7% of the genetic contribution to CL/P [22]. Large studies in different populations [23][24][25][26] have provided evidence that IRF6 is important in the etiology of NSCL/P.
Erdogan et al. used a novel allele-specific primer elongation microarray to test SNPs in human mitochondrial DNA (mtDNA) [35]. Jagomagi et al. performed SNP genotyping using an arrayed primer extension technique [36]. In this research, we designed a similar microarray to test whether 12 SNPs in 7 genes contribute to the risk of NSCL/P in a population from northeast China. This microarray is a high-throughput, efficient, accurate and convenient method for testing SNPs, and the current study of a northeast China population supplements previous association studies between candidate genes and NSCL/P.

Ethics
This study was approved by the ethics committee at the Liaoning Province Research Institute of Family Planning. Written informed consent was obtained from all study participants. Parents or legal guardians provided written consent on behalf of minors.

Patients and Families
We evaluated 236 unrelated patients with NSCL/P, their parents (185 mothers and 154 fathers, including 128 complete trios), and 400 control individuals. All participants were recruited from the Stomatology Hospital of China Medical University in northeast China between 2008 and 2012. All patients underwent a preoperative examination and questionnaire, and photographs were taken to record the shape of the lips, alveolar ridge, and hard and soft palates to diagnose cleft lip and palate. The patients had a physical examination to document the shape of the skull, eyes, nose, chin, neck, chest, feet, and hands to exclude syndromes that may be associated with cleft lip and palate, such as Albert syndrome, Edwards syndrome, and Pierre Robin syndrome. All the controls were recruited from healthy volunteers. Controls with a family history of clefts and other anomalies were excluded based on a questionnaire that evaluated the medical history of each control's family over three generations. A physical examination was also performed on the controls, particularly regarding movement of the lips and soft palate, to detect occult deformities. A five milliliter peripheral blood sample was collected. DNA was obtained from blood samples with a TIANamp Blood DNA kit (TianGen) and used as the template for PCR amplification.

Allele-specific Primer Extension Microarray Preparation
A total of 24 oligonucleotide tags used as capture probes were obtained from www.genome.wi.mit.edu. The 3′-amino-modified tags (Table 2) were synthesized by Nanjing Genescript Biological Technology Company. Tags were suspended at a concentration of 30 μM in 0.3 M carbonate buffer (pH 8.0) and spotted onto slides in duplicate using a Nano-Plotter 2.1 arrayer (GeSiM, Germany) at a rate of ten hits per spot in a humidified chamber (60% relative humidity). After spotting, the microarrays were incubated at room temperature overnight in a humid chamber and baked at 60 °C for 1 hour. Microarrays were rinsed in dH2O and dried.

Allele-specific Primer Design
Each allele-specific primer was composed of a reverse complement of the tag (com-tag) at forward and backward sequences. For each SNP, two allele-specific primers were designed to match the two SNP alleles (Table 2). In addition, a com-tag labeled with Cy3 was synthesized as a positive control.

Multiplex PCR and Allele-specific Primer Extension
PCR amplification was performed using the Multiplex PCR Assay Kit (Takara) in a 20 μL volume (Multiplex PCR mix2 10 μL, Multiplex PCR mix1 0.2 μL, 0.2 μM primers, and 50 ng DNA). Thermocycling was performed with an initial 60 second denaturation at 94 °C followed by 35 cycles of 30 seconds at 94 °C, 90 seconds at 57 °C, and 90 seconds at 72 °C, and a final extension at 72 °C for 10 minutes. The PCR product was used as a template for multiplex allele-specific primer extension (ASPE). To remove excess dNTPs and primers, 2 units of shrimp alkaline phosphatase (Takara) and 4 units of EXO I (Takara) were added to 10 μL of the PCR product, followed by incubation at 37 °C for 30 min and 96 °C for 10 min. A total of 10 μL of treated PCR product was added to 5 μL of ASPE mix containing 1× ThermoPol reaction buffer, 80 μM each of dATP, dGTP, and dTTP (Takara), 40 μM Cy3-dCTP (GE), 0.2 units of Vent exo- DNA polymerase (New England Biolabs), and 25 nM allele-specific primers. Extension conditions were as follows: initial 3 minutes at 94 °C; 35 cycles of 94 °C for 30 seconds, 60 °C for 30 seconds, and 72 °C for 1 minute; and a final extension at 72 °C for 5 minutes.

Hybridization, Scanning of Microarrays and Data Analyses
To prepare Cy3-labeled single-strand products, 10 μL of allele-specific extension products were added to a 10 μL mixture containing 1.33× SSC, 0.067% SSC, 0.033 mg/ml salmon sperm DNA, and 50 nM Cy3-labeled com-tag. The mix was denatured at 94 °C for 10 minutes and cooled for 5 minutes on ice. Hybridization was performed using a hybridization solution in a hybridization chamber at 60 °C for 2 hours. Hybridization was followed by two washes in 2× SSC/0.1% SDS (preheated to 50 °C) for 5 minutes and washes in 0.2× SSC/0.1% SDS for 1 minute; slides were then rinsed in dH2O and dried by centrifugation for 3 minutes at 500 rpm.
Microarrays were scanned at 100-μm resolution using a GenePix 4000B scanner, and TIFF images were imported into GenePix 6.0 software (Axon Instruments, USA). For each allele, the mean background intensity was subtracted from the mean pixel intensity. Alleles with a mean intensity lower than the cutoff (mean background + 1000) were excluded. SNP genotypes were identified by the allelic fraction (AF), which was calculated as follows: AF = allele B/(allele A + allele B). AF values >0.6, <0.4, and between 0.4 and 0.6 represented allele B homozygotes, allele A homozygotes, and heterozygotes, respectively. Microarray results can be acquired from Gene Expression Omnibus (accession number GSE45770).
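The allelic-fraction rule described above can be illustrated with a minimal Python sketch. The function name `call_genotype` and its parameters are hypothetical, and two details are assumptions because the text does not specify them: an allele whose raw signal falls below the cutoff is treated as contributing zero signal, and AF values of exactly 0.4 or 0.6 are called heterozygous.

```python
from typing import Optional


def call_genotype(raw_a: float, raw_b: float, background: float,
                  margin: float = 1000.0) -> Optional[str]:
    """Call a SNP genotype from two allele-specific spot intensities.

    raw_a, raw_b : mean pixel intensities of the allele A and allele B spots
    background   : mean background intensity
    margin       : cutoff is background + margin (1000 in the text)
    """
    cutoff = background + margin
    # Assumption: an allele below the cutoff contributes zero signal.
    net_a = raw_a - background if raw_a >= cutoff else 0.0
    net_b = raw_b - background if raw_b >= cutoff else 0.0
    total = net_a + net_b
    if total == 0:
        return None  # neither allele passed the cutoff: no call
    af = net_b / total  # AF = allele B / (allele A + allele B)
    if af > 0.6:
        return "BB"
    if af < 0.4:
        return "AA"
    return "AB"  # 0.4 <= AF <= 0.6
```

For example, with a background of 300, spot intensities of 10000 (allele A) and 11000 (allele B) give an AF of about 0.52 and a heterozygous call.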

Statistical Methods
All SNPs were assessed for Hardy-Weinberg equilibrium (HWE). Case-control statistical analysis was performed using the SPSS 17.0 statistical software package (http://www-03.ibm.com/software/products/us/en/spss-stats-standard/). Odds ratios and 95% confidence intervals were calculated for the case and control groups. The Family Based Association Test (FBAT) package (http://www.biostat.harvard.edu/fbat/default.html) was used to test for over-transmission of the target alleles in case-parent trios. Linkage disequilibrium (LD) was estimated using Haploview (http://www.broad.mit.edu/mpg/haploview) based on the genotypes of the control samples. TGFA and IRF6 haplotypes were calculated using the FBAT software package. Figure 1 shows the SNP microarray results. Each SNP AF was calculated as described, and the SNP genotypes were clearly identified.
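As an illustration of the case-control comparison, an odds ratio with a 95% confidence interval can be computed from a 2×2 table as sketched below. This is a minimal sketch assuming the standard logit (Woolf) interval; the exact SPSS procedure used in the study is not specified here, and the function name `odds_ratio_ci` and the counts in the usage note are hypothetical, not study data.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Woolf (logit) confidence interval for a 2x2 table.

    a: cases carrying the risk genotype      b: cases without it
    c: controls carrying the risk genotype   d: controls without it
    z: normal quantile (1.96 for a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the logit interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

With hypothetical counts of 60 and 40 among cases and 45 and 55 among controls, `odds_ratio_ci(60, 40, 45, 55)` gives an odds ratio of about 1.83 with an interval that excludes 1.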

Genotype and Case-control Comparisons
To ensure accuracy, 50 samples were run in duplicate, and 10% of all samples were sequenced directly. SNP genotypes were highly reproducible (99.5%), and direct sequencing demonstrated agreement with the microarray (99.6%).
Hardy-Weinberg equilibrium was assessed for the 12 SNPs. There was no evidence of deviation from HWE for any of the SNPs. Genotype distributions in cases were compared with those in controls (Table 3).
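The HWE assessment mentioned above can be sketched as a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. This is an assumed, commonly used form of the test; the text does not state which HWE test was applied, and the function name `hwe_chi_square` is hypothetical.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    Returns (chi-square statistic, p-value) for a biallelic SNP.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 0.0, 1.0  # monomorphic marker: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A large p-value is read as no evidence of deviation from HWE, as reported for the 12 SNPs.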
There was no evidence of genotypic association with the risk of NSCL/P in northeast China for the following SNPs: TGFA rs1058213 (C3827T), BCL3 rs1046881, TGFB3 rs3917201,
Table 2. SNP probes and allele-specific extension primer sequences. (Columns: Gene; SNPs; allele; Tag; Allele-specific extension primers.)
Although the OR of the AA genotype compared with GG was not statistically significant (p = 0.076), the OR of GA+AA compared with the GG homozygote was significant (p = 0.002). For rs11466285, a significant association was observed between patients and controls (p<0.05).
For IRF6 rs2235371 and rs2013162, a risk of NSCL/P was observed in all of the genetic models evaluated (

FBAT
FBAT showed strong associations of TGFA rs3771494, rs3771523 (G3822A), and rs11466285 (T3851C) and of IRF6 rs2235371 (V274I) and rs2013162 with NSCL/P (Table 4). These SNPs showed over-transmission of the minor alleles, with the exception of IRF6 rs2235371, which showed under-transmission of the T allele and over-transmission of the common C allele.
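The idea of over- and under-transmission can be illustrated with the classical transmission disequilibrium test (TDT), sketched below. This is a simplified stand-in and not the FBAT statistic itself: FBAT generalizes the approach and can also use families with missing parental genotypes, whereas this sketch handles only complete trios. The function name `tdt` and the genotype coding (number of copies of the target allele: 0, 1, or 2) are assumptions for illustration.

```python
import math
from typing import Iterable, Tuple


def tdt(trios: Iterable[Tuple[int, int, int]]) -> Tuple[int, int, float, float]:
    """Transmission disequilibrium test on complete case-parent trios.

    Each trio is (father, mother, child), coded as the number of copies
    of the target allele (0, 1, or 2). Only heterozygous parents are
    informative. Returns (transmitted, untransmitted, chi-square, p-value).
    """
    transmitted = untransmitted = 0
    for father, mother, child in trios:
        parents = (father, mother)
        n_het = sum(1 for g in parents if g == 1)
        if n_het == 0:
            continue  # no heterozygous parent: uninformative trio
        # Copies the child must have received from homozygous parents
        from_hom = sum(g // 2 for g in parents if g != 1)
        t = child - from_hom  # copies transmitted by heterozygous parents
        if not 0 <= t <= n_het:
            raise ValueError("Mendelian inconsistency in trio")
        transmitted += t
        untransmitted += n_het - t
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / n  # McNemar-type statistic, 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return transmitted, untransmitted, chi2, p_value
```

An excess of transmitted over untransmitted copies corresponds to over-transmission of the target allele; the reverse corresponds to under-transmission, as described for the T allele of rs2235371.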

LD and Haplotype Analysis
Pairwise LD was measured for TGFA and IRF6 (Table 5). LD analysis showed tight linkage between the SNPs in IRF6 and TGFA.
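For reference, the two usual pairwise LD measures, D′ and r², can be computed from haplotype and allele frequencies as in the sketch below. This assumes the two-locus haplotype frequency is already known; Haploview estimates it from unphased genotypes, a step not reproduced here. The function name `pairwise_ld` is hypothetical.

```python
from typing import Tuple


def pairwise_ld(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (D', r^2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A at locus 1
           and allele B at locus 2
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2
```

Values of D′ and r² close to 1 indicate strong LD between a pair of SNPs, while values near 0 indicate near independence.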

Discussion
We used a microarray with allele-specific primer extension to test 12 SNPs in 7 genes reported to be associated with NSCL/P and observed that SNPs in TGFA and IRF6 had statistically significant associations with NSCL/P based on case-control and case-parent analyses consisting of 236 patients and 400 controls from northeast China.
The TGFA gene has been well studied since Ardinger et al. first reported an association between the Taq I variant and NSCL/P [17]. However, conflicting results have been reported for different populations, owing to differences in study design, sample size, and the TGFA variants used [37]. Machida et al. tested SNPs in the 3′ UTR of TGFA and reported that there were no associations with NSCL/P [18], but Shiang et al. reported significant associations between those SNPs and cleft palate [38]. Letra et al. recently calculated the attributable fraction of high-risk alleles at IRF6 rs2235371 and TGFA rs1058213 in a Brazilian population and estimated the contribution of interactions between the two genes to be approximately 1% [39]. They also evaluated IRF6-TGFA interactions in 142 case-parent trios and detected significant over-transmission of high-risk alleles to the affected child (p = 0.001). Sull et al. examined associations between TGFA markers and NSCL/P in case-parent trios from four populations [19]. The authors genotyped 17 SNPs and reported significant transmission of the minor allele in rs3771494 (OR = 1.59, p = 0.004).
In this study, a significant association with rs3771494 (OR = 1.88, p<0.0001) was observed in the case-control analysis. FBAT analysis also showed over-transmission of the C allele in rs3771494 (p = 0.016). These consistent results suggest that TGFA is strongly associated with NSCL/P in populations in northeast China. We also observed an association between rs3771523 (OR = 1.73, p = 0.002) and rs11466285 (OR = 1.81, p = 0.001) and NSCL/P, and over-transmission of the A allele in rs3771523 (p = 0.01) and the C allele in rs11466285 (p = 0.03) based on FBAT analysis. However, a significant association was not observed between rs1058213 and NSCL/P in the present study (OR = 1.14, p = 0.427), which is in accordance with previous studies. Using HBAT analysis, we observed an over-transmission of the C-G-C-C haplotype (order: rs3771494-rs3771523-rs11466285-rs1058213).
IRF6 is the most frequently studied gene related to NSCL/P. Research has shown common alleles in IRF6 that are associated with NSCL/P, a finding that has been independently replicated in genome-wide association (GWA) [11,40] and candidate gene studies [41,42]. Animal models have also supported the role of IRF6 in NSCL/P [43,44]. Birnbaum et al. reported that the T allele in rs2235371 (V274I), coding for isoleucine, is underrepresented in NSCL/P patients compared with controls and may have a protective effect with respect to NSCL/P [40]. However, significant associations were not found between rs2013162 and NSCL/P, which differs from previous findings [21,23,45,46]. Letra et al. demonstrated a significant association of the V274I polymorphism (rs2235371) with complete left cleft lip/palate (p = 0.001) in a Brazilian Caucasian population [39]. Huang et al. found a significant association of rs2235371 and rs2235375, but not rs2013162, with NSCL/P in west China [47]. In the present study, both rs2235371 (p = 0.003) and rs2013162 (p<0.0001) showed significant differences between patients and controls. The rs2013162 results are in contrast to the Huang et al. study, which may be due to population heterogeneity between west and northeast China. FBAT analysis of rs2235371 showed under-transmission of the T allele and over-transmission of the C allele from parent to child, indicating that the minor T allele may be a protective factor in NSCL/P, which is consistent with a previous study [40]. In rs2013162, we observed over-transmission of the A allele (minor allele) and under-transmission of the C allele (common allele). In haplotype analyses of NSCL/P trios, the C-A (order: rs2235371-rs2013162) haplotype showed significant over-transmission (p = 0.03). These results suggest that the IRF6 gene is strongly associated with an increased risk of NSCL/P in populations in northeast China.
Small ubiquitin-related modifier (SUMO1) is strongly expressed in the upper lip, primary palate and medial edge epithelium (MEE) of the secondary palate [33]. Several studies [5,48] have shown associations between gene variations in SUMO1 and NSCL/P. Jia et al. reported that the C allele in rs7580433 is over-transmitted from parents to affected individuals in west China [49]. However, we did not observe a difference in SUMO1 rs7580433 between cases and controls (p = 0.561), and FBAT analysis of trios did not detect an association between rs7580433 and NSCL/P (p = 0.68).
We failed to replicate an association between candidate gene SNPs, including BCL3, TGFB3, MTHFR and PVRL1, and NSCL/P in our patient sample. This could be the result of (1) locus and allelic heterogeneity in NSCL/P, (2) selection of too few SNPs that did not span the entire gene, (3) a small number of analyzed patients and controls, and (4) lack of analysis of gene-gene and gene-environment interactions, which can play an important role in NSCL/P etiology.
In this study, we tested 12 SNPs in 7 candidate genes using a microarray technique and examined associations between these SNPs and NSCL/P in northeast China. We confirmed that polymorphic variants of TGFA and IRF6 are strongly associated with NSCL/P in a population in northeast China. This research supplements previous studies. To fully understand the genetic architecture of these genes, additional SNP-based and resequencing studies with larger patient samples are still needed.