Molecular Basis of Gene-Gene Interaction: Cyclic Cross-Regulation of Gene Expression and Post-GWAS Gene-Gene Interaction Involved in Atrial Fibrillation

Atrial fibrillation (AF) is the most common cardiac arrhythmia at the clinic. Recent GWAS identified several variants associated with AF, but they account for <10% of heritability. Gene-gene interaction is assumed to account for a significant portion of missing heritability. Among GWAS loci for AF, only three were replicated in the Chinese Han population, including SNP rs2106261 (G/A substitution) in ZFHX3, rs2200733 (C/T substitution) near PITX2c, and rs3807989 (A/G substitution) in CAV1. Thus, we analyzed the interaction among these three AF loci. We demonstrated significant interaction between rs2106261 and rs2200733 in three independent populations and combined population with 2,020 cases/5,315 controls. Compared to non-risk genotype GGCC, two-locus risk genotype AATT showed the highest odds ratio in three independent populations and the combined population (OR=5.36 (95% CI 3.87-7.43), P=8.00×10-24). The OR of 5.36 for AATT was significantly higher than the combined OR of 3.31 for both GGTT and AACC, suggesting a synergistic interaction between rs2106261 and rs2200733. Relative excess risk due to interaction (RERI) analysis also revealed significant interaction between rs2106261 and rs2200733 when exposed two copies of risk alleles (RERI=2.87, P<1.00×10-4) or exposed to one additional copy of risk allele (RERI=1.29, P<1.00×10-4). The INTERSNP program identified significant genotypic interaction between rs2106261 and rs2200733 under an additive by additive model (OR=0.85, 95% CI: 0.74-0.97, P=0.02). Mechanistically, PITX2c negatively regulates expression of miR-1, which negatively regulates expression of ZFHX3, resulting in a positive regulation of ZFHX3 by PITX2c; ZFHX3 positively regulates expression of PITX2C, resulting in a cyclic loop of cross-regulation between ZFHX3 and PITX2c. Both ZFHX3 and PITX2c regulate expression of NPPA, TBX5 and NKX2.5. These results suggest that cyclic cross-regulation of gene expression is a molecular basis for gene-gene interactions involved in genetics of complex disease traits.


Introduction
Genome-wide association studies (GWAS) have been highly successful in identifying common genomic variants that are associated with complex human diseases or traits. However, these common variants have small effects, and in aggregate explain only a small fraction of heritability for most diseases or traits. The major portion of heritability remains missing, and this represents a major dilemma in complex trait genetics referred to as "missing heritability". Genegene interaction has been proposed to be a contributor to the problem of missing heritability.
Gene-gene interaction has been long known to have an impact on an organism's phenotype, for example, the color of a flower in plants and the color of a fly's eye. However, it has been challenging to detect gene-gene interaction in human GWAS. Moreover, no gene interaction was functionally validated. Considering the potentially large number of gene-gene interaction, identification of true and casual interaction has been proven to be a daunting task. However, without doubt, studies of gene-gene interaction will contribute to the understanding of inheritance, particularly inheritance of important diseases and traits, and provide insights into the biological pathways and molecular mechanisms of disease pathogenesis.
Atrial fibrillation (AF) is the most common cardiac arrhythmia seen at the clinical setting and accounts for approximately one-third of hospitalizations for cardiac rhythm disturbances [1]. The prevalence of AF is 0.4%-1.0% in the general population, and increases with age, reaching 8% in people over 80 [1]. A similar prevalence rate of 0.77% was found for AF in the Chinese population [2]. AF accounts for 15% of all strokes, worsens heart failure, and independently increases the risk of stroke 5-fold and risk of cardiac death up to 1.9-fold [3]. Genetic factors play an important role in the pathogenesis of AF. The heritability of polygenic liability to AF has been estimated to be 0.62 [4].
To date, several major GWAS have been reported for common complex AF and identified variants in ten chromosomal loci that were associated with AF. The first GWAS for AF identified significant association between SNP rs2200733 near the PITX2c gene encoding paired-like homeodomain 2 transcript c on chromosome 4q25 and AF in several populations of European ancestry as well as one Hong Kong population [5]. Our group later reported that SNP rs2200733 confers a significant risk in the mainland Chinese Han population, too [6]. Then, two independent GWAS identified significant association between AF and SNPs rs2106261 [7] and rs7193343 [8], both of which are located in the ZFHX3 gene encoding zinc finger homeobox 3 on chromosome 16q22. We have found that rs2106261, but not rs7193343, showed significant association with AF in the Chinese Han population [9]. Later, a common variant in KCNN3 (encoding potassium intermediate/small conductance calcium-activated channel, subfamily N, member 3), rs13376333, was found to be associated with lone AF [10]. However, we found that rs13376333 did not show significant association with AF in the Chinese Han population [9]. Ellinor et al [11] identified six susceptibility loci for AF through meta-GWAS analysis. We have shown that only one SNP, rs3807989 at the CAV1 locus (encoding caveolin 1) among the six loci, were associated with AF in the Chinese Han population [12]. In this study, we studied the gene-gene interaction for three AF loci replicated in the Chinese population, i.e. SNP rs2106261 in ZFHX3, rs2200733 near PITX2c, and rs3807989 in CAV1. We provide strong genetic evidence that SNP rs2200733 near PITX2c and rs2106261 in ZFHX3 interact with each other, resulting in a synergistic effect that increases the odds ratios (ORs) to risk of AF. Most importantly, we also carried out a series of cellular and molecular studies to identify the molecular mechanisms underlying the gene-gene interaction. We found that PITX2c and ZFHX3 cross-regulate each other's expression as well as expression of downstream genes such as NPPA (encoding atrial natriuretic factor or ANF), providing a novel molecular basis for their interaction at the molecular genetic level.

Results
Significant association between SNP rs2200733 near PITX2c on 4q25 and rs2106261 in ZFHX3 on 16q22 and AF in three independent populations We previously reported that among the first three genetic loci for AF identified by GWAS in European ancestry populations, only rs2200733 at the PITX2c locus on 4q25 and rs2106261 in ZFHX3 on 16q22, but not rs13376333 in KCNN3, were replicated in the Chinese Han populations [6,9]. We, therefore, carried out a deeper study to determine whether there is gene-gene interaction between rs2200733 and rs2106261. We utilized a case control design which involves three independent populations. The initial association study was carried out with 569 AF patients and 1,996 non-AF control samples (referred to as the Discovery population). The positive findings in the Discovery population were validated in two independent replication populations. The first replication population consisted of 641 AF cases and 1,692 controls (referred to as Replication I population). The second replication population consisted of 810 cases and 1,627 controls (referred to as Replication II population). The clinical characteristics of the three study populations are shown in S1 Table. We first examined the association of AF with each GWAS SNP individually. There was no deviation from the Hardy-Weinberg equilibrium for the two SNPs, rs2200733 and rs2106261 in the control groups of the three populations (S2 Table).

Gene-gene interaction between ZFHX3 variant rs2106261 and PITX2c variant rs2200733
To study the interaction between rs2106261 (G to A substitution, risk allele = A) and rs2200733 (C to T substitution, risk allele = T), we first defined the frequencies of nine possible two-locus genotypes (3 2 genotypes: GGCC, GGCT, GGTT, AGCC, AGCT, AGTT, AACC, AACT, AATT) in cases and controls of the three independent study populations. Then, we used the wild type non-risk GGCC genotype (non-risk homozygote for both loci) as baseline or reference, and estimated the OR for each of the eight other genotypes. As shown in Table 1 and Fig 1 and S1 Fig, compared with the GGCC non-risk reference genotype, the double risk homozygous genotype AATT showed a dramatically increased risk for AF with the highest ORs of 4.81 (95% CI 2.88-8.04) (Pobs = 3.83×10 -10 ) and 6.64 (95% CI 3.64-12.11) (Padj = 6.38×10 -10 ) before and after adjustment for covariates of age and gender, respectively, in the Discovery population. This interesting finding was replicated in two independent AF  Table). In all three independent populations as well as the combined population, ORs for genotype AATT (double risk homozygotes for both rs2106261 and rs2200733) were significantly higher than the ORs for each single-risk homozygotes (GGTT, homozygous risk genotype for rs2106261; AACC, homozygous risk genotype for rs2200733) ( Table 1 and S5  Table). Moreover, the OR of 6.64 for double-risk homozygote AATT was higher than the combined ORs for the two single-risk homozygotes GGTT and AACC together (2.14+1.25 = 3.39) Odds ratios (ORs) for each two-locus genotype for GWAS SNPs rs2106261 in the ZFHX3 gene and rs2200733 close to the PITX2c gene involved in the pathogenesis of AF after adjustment for covariates. For two SNPs, there are a total of 9 genotypes. The wild type or non-risk GGCC genotype was used as the reference and ORs for other genotypes were estimated against the reference genotype using multivariable logistic regression analysis by including the age and gender as covariates. A. Analysis of ORs in the Discovery population. B. Analysis of ORs in the Replication I population. C. Analysis of ORs in the Replication II population. D. Analysis of ORs in the combined population with the Discovery, Replication I and Replication II cohorts. *P<0.01.  Fig 1 and S1 Fig). These data provide genetic evidence for interaction between ZFHX3 variant rs2106261 and PITX2c variant rs2200733, which generates a synergistic effect that markedly increases the risk of AF.
Two other genotypes, GGTT and AGTT, significantly increased risk of AF compared to reference non-risk genotype GGCC, consistently in all three populations (Padj<0.006 after Bonferroni correction) ( Table 1, Fig 1 and S1 Fig).
Molecular basis of gene-gene interaction: PITX2c positively regulates the expression of ZFHX3 via miR-1 To substantiate the novel finding of the genetic interaction between rs2106261 and rs2200733 as identified by the analyses above, we carried out functional studies to identify the underlying molecular mechanism of the interaction. The PITX2c gene near rs2200733 has been demonstrated to be an AF gene using mouse models and shown to regulate several genes in the atria [13][14][15]. Because PITX2c encodes a transcriptional factor, we hypothesized that PITX2c would regulate the expression of ZFHX3, generating a synergistic effect for gene-gene interaction. To test this hypothesis, we transfected HCT116 cells with a PITX2c-specific siRNA and a negative control siRNA (NC control) and then used real-time RT-PCR analysis to measure the expression level of ZFHX3. As shown in Fig 2, knockdown of PITX2c expression by siRNA significantly decreased the expression level of ZFHX3 (P = 4.00×10 -3 ) (Fig 2A and 2C). In a parallel study, overexpression of a FLAG-tagged PITX2c protein by transfection of a p3×FLAG--PITX2c expression plasmid significantly increased the expression level of ZFHX3 (P = 0.01) (Fig 2B and 2C). These studies indicate that PITX2c positively regulates expression of ZFHX3.
To explore the molecular mechanism by which PITX2c regulates ZFHX3, we searched for a potential PITX2c binding site at the ZFHX3 promoter and regulatory region, but failed to find one. Because PITX2c was shown to negatively regulate the expression of miR-1 (microRNA 1-1) [15], we hypothesize that PITX2c may regulate expression of ZFHX3 through miR-1. To test this hypothesis, we transfected HCT116 cells with miR-1 mimics and control microRNA mimics and measured the expression level of ZFHX3. Both real-time RT-PCR analysis and Western blot analysis showed that miR-1 mimics significantly decreased expression of ZFHX3 at both mRNA (P = 4.00×10 -4 ) and protein levels (P = 6.84×10 -5 ), although the effect on the protein level was more robust (Fig 3A-3C). This interesting finding of down-regulation of ZFHX3 by miR-1 was confirmed in another cell line, SW620 at the ZFHX3 mRNA (P = 0.01) and protein levels (P = 4.89×10 -4 ) (Fig 3D-3F). These results suggest that miR-1 negatively regulates expression of ZFHX3.
To explore the molecular mechanism by which miR-1 regulates ZFHX3, we performed bioinformatic analysis by searching two databases, DIANA TOOLs and microRNA.org-Target and Expression, and found that the 3'-untranslated region (3'-UTR) of ZFHX3 contained two potential targeting sites for miR-1 ( Fig 3G). We cloned each region containing a miR-1 binding site downstream of the firefly luciferase coding region in the pMIR-REPORT luciferase vector, resulting in luciferase reporters pMIR-ZFHX3-3'-UTR-1 (cloned genomic region: chr16: 72819500 to 72820662) and pMIR-ZFHX3-3'-UTR-2 (cloned genomic region: chr16: 72818241 to 72819390), respectively ( Fig 3H). Each reporter was co-transfected with miR-1 mimics (100 nM) into HCT116 cells and luciferase assays were carried out. A schematic diagram shows luciferase reporters containing the potential miR-1 binding site or the related mutated site ( Fig 3H). As shown in Fig 3I, miR-1 mimics significantly reduced luciferase activities from pMIR-ZFHX3-3'-UTR-2, but not that from pMIR-ZFHX3-3'-UTR-1. Mutation of the miR-1 binding site in pMIR-ZFHX3-3'-UTR-2 from CATTCCA to TGCGAAC abolished the miR-1-mediated reduction of the reporter luciferase activity ( Fig 3I). These data suggest that miR-1 negatively regulates expression of ZFHX3 by targeting to the second binding site at the 3'-UTR of ZFHX3.  Expression of ZFHX3 is negatively regulated by miR-1. HCT116 (A-C) and SW620 (D-F) cells were transfected with miR-1 mimics and negative control mimics (NC) and used for isolation of RNA samples for real-time RT-PCR analysis or for isolation of protein extracts for Western blot analysis for the expression levels of ZFHX3 mRNA and protein. A. Real-time RT-PCR analysis revealed that the miR-1 mimics reduced the expression of ZFHX3 by 20% in HCT116 cells (P = 0.004). B, C. Western blot analysis revealed that the miR-1 mimics reduced the expression of the ZFHX3 protein by 54% in HCT116 cells (P = 6.84×10 -5 ). D. Real-time RT-PCR analysis revealed that the miR-1 mimics reduced the expression of ZFHX3 by 27% in SW620 cells (P = 0.01). E, F. Western blot analysis revealed that the miR-1 mimics reduced the expression of the ZFHX3 protein by 45% in SW620 cells (P = 4.887×10 -4 ). G. Identification of two putative miR-1 binding sites at the 3'-UTR of ZFHX3 by bioinformatic analysis and alignment of miR-1 binding sequences across species. H. A schematic diagram shows luciferase reporters containing the potential miR-1 binding site or the related mutated site. I. MiR-1 targets the second miR-1 binding site to regulate expression of ZFHX3. Luciferase assays revealed that compared to negative control mimics, miR-1 mimics significantly reduced luciferase activities from pMIR-ZFHX3-3'-UTR-2, but not from pMIR-ZFHX3-3'-UTR-1. *P<0.05; **P<0.01. The AF SNP rs2106261 identified by GWAS is located within the ZFHX3 gene, therefore, we consider ZFHX3 as a strong candidate gene for AF at the chromosome 16q22 locus. As our genetic studies indicate a gene-gene interaction between PITX2c and ZFHX3, we hypothesized that ZFHX3 may regulate expression of PITX2c. Interestingly, knockdown of ZFHX3 expression by a specific siRNA significantly decreased expression of PITX2c about 2-fold (P = 5.00×10 -3 ) (Fig 4A and 4C). Conversely, overexpression of ZFHX3 significantly increased expression of PITX2c by 2.96-fold (P = 2.00×10 -3 ) (Fig 4B and 4D). Knockdown of ZFHX3 expression by siRNA reduced the transactivation activity from a reporter with a 1.5 kb DNA fragment upstream of the PITX2c transcriptional start site fused to the luciferase gene (PITX2c-PGL3) by 1.97-fold (P = 5.00×10 -3 ) (Fig 4E).
Molecular basis of gene-gene interaction: Both PITX2c and ZFHX3 positively regulate the expression of NPPA Several earlier studies showed that PITX2c regulates the expression of the NPPA gene encoding ANF (a cardiac protein hormone), but conflicting results on either positive regulation or negative regulation were obtained in different studies [13,15,16]. We tested the regulation of NPPA by PITX2c in HCT116 cells. As showed in Fig 5A, knockdown of PITX2c expression using siRNA significantly reduced expression of NPPA by 60% (P = 3.20×10 -4 ). Overexpression of PITX2c by transfection of HCT116 cells with p3×FLAG-PITX2c significantly increased NPPA expression by 2.42 fold (P = 0.01) (Fig 5B).
No significant gene-gene interaction between ZFHX3 variant rs2106261 and CAV1 variant rs3807989 or between PITX2c variant rs2200733 CAV1 variant rs3807989 GWAS in European ancestry populations have identified ten genetic loci for AF [5,7,8,10,11]. We analyzed these loci in the Chinese Han population for their association with AF. We found that in addition to the ZFHX3 locus and the PITX2c locus reported previously [6,9], one other locus, rs3807989 in CAV1 encoding caveolin-1, also showed significant association with AF, whereas no significant association was identified for other loci [12]. Therefore, we also analyzed gene-gene interactions between rs2106261 and rs3807989 and between rs2200733 and rs3807989. The classical gene-gene analysis by comparing ORs for the nine two-locus genotypes did not reveal any significant synergistic effect between rs2106261 and rs3807989 ( Table 2). The OR for the double risk homozygotes for both rs2106261 and rs3807989 (AAGG) was 1.25, which is smaller than the product of the ORs (1.18+1.15) for each single- risk homozygotes (AAAA, homozygous risk genotype for rs2106261; GGGG, homozygous risk genotype for rs3807989) ( Table 2). These results suggest that there is no interaction between the ZFHX3 locus and the CAV1 locus for AF. Similarly, the OR for the double risk homozygotes for both rs2200733 and rs3807989 (TTGG) was 1.08, which is smaller than the product of the ORs (1.00+0.77) for each single-risk homozygotes (TTAA, homozygous risk genotype for rs2200733; CCGG, homozygous risk genotype for rs3807989) ( Table 2). These results suggest that there is no interaction between the PITX2c locus and the CAV1 locus for AF.
Real-time RT-PCR analysis showed that knockdown of either ZFHX3 or PITX2c increased the expression level of CAV1 (Fig 7). Similar results were obtained with Western blot analysis (Fig 7). On the contrary, knockdown of CAV3 did not significantly affect the expression of ZFHX3 or PITX2c (Fig 8). Together, these data suggest that there is no cyclic cross-regulation between ZFHX3 and CAV1 or between PITX2c and CAV1.

Analysis of gene-gene interactions by alternative gene-gene interaction programs
Many gene-gene programs have been developed in recent years, therefore, we also analyzed interaction among SNPs rs2200733/PITX2c, rs2106261/ZFHX3 and rs3807989/CAV1 using RERI and INTERSNP programs. RERI (relative excess risk due to interaction) analysis was developed to quantify the extent of synergistic effect by adopting a fundamental measure of additive interaction and relative excess risk due to interaction (RERI) [19]. Here we used this strategy to investigate interaction between rs2106261 and rs2200733 in terms of risk alleles A and T in the combined population. The RERI analysis can distinguish the additive effect from the synergistic effect [19]. A significant RERI value higher or lower than 0 is considered to demonstrate a synergistic effect, whereas a non-significant RERI value indicates an additive effect [19]. The results are shown in Table 3. First, we analyzed the synergistic effect when exposed to one copy of risk alleles at any one locus or both loci (H1). No significant synergistic effect was observed between rs2106261 and rs2200733 (RERI = 0.22 (95% CI -0.20-0.54), P = 0.13; RERI = 0.18 (95% CI -0.29-0.52), Padj = 0.22 after adjustment of covariates of age and gender). Second, we assessed the synergistic effect when exposed to two copies of risk alleles at any one locus or both loci (H2). A significant synergistic effect was detected between rs2106261 and rs2200733 with a RERI value of 2.26 (95% CI 1.06-3.73) (P<1.00×10 -4 ; RERI = 2.87 (1.48-4.69), Padj<1.00×10 -4 after adjustment of covariates of age and gender). Third, we assessed the synergistic effect when exposed to two copies of risk alleles at one locus and one copy of risk alleles at the other locus (H3). A significant synergistic effect was detected between rs2106261 and rs2200733 with a RERI value of 0.99 (95% CI 0.29-1.79) (P<1.00×10 -4 ; RERI = 1.29 (95% CI 0.44-2.33), Padj<1.00×10 -4 after adjustment of covariates of age and gender). These results provided statistical genetic evidence for the interaction between rs2106261 and rs2200733.

Discussion
In this study, we show that gene-gene interaction plays an important role in generation of disease phenotype by identifying gene-gene interaction involved in the pathogenesis of a cardiac disorder, AF. We employed a multi-stage case control association design to compare the frequencies of all nine two-locus genotypes from GWAS SNPs rs2106261 in the ZFHX3 gene and rs2200733 close to the PITX2c gene. Our study involves a careful design with a Discovery population consisting of 569 cases and 1,996 controls, Replication I population with 641 cases and 1,692 controls, and Replication II population composed of 810 cases and 1,627 controls. The combined population has 2,020 cases and 5,315 controls, and is considered to represent a considerably large sample size in the modern population studies for AF. We consider this point as strength of this study. When SNP rs2106261 and rs2200733 were analyzed together, two-locus genotype AATT (double risk homozygote) showed the highest odds ratio (OR) of 6.64 (95% CI 3.64-12.11) (P = 6.38×10 -10 ), 4.04 (95% CI 2.23-7.32) (P = 4.34×10 -6 ), 5.70 (95% CI 3.34-9.71) (P = 1.58×10 -10 ) and 5.36 (95% CI 3.87-7.43) (P = 8.00×10 -24 ) in the Discovery, Replication I, II, and combined population, respectively, when compared to wild type non-risk genotype GGCC. The Breslow-Day test showed that the ORs for AATT were significantly higher than ORs for GGTT or AACC in all populations (P = 5.26×10 -5 vs. GGTT and 2.94×10 -22 vs. AACC in the combined population) and higher than the combined ORs for both GGTT and AACC (5.36 vs. 3.31 in the combined population). We also analyzed gene-gene interaction using the RERI analysis and identified synergistic effects between SNP rs2106261 and rs2200733 when exposed two copies of risk alleles at any one locus or both loci (H2) (P<1.00×10 -4 ) or when exposed to two copies of risk alleles at one locus and one copy of risk alleles at the other locus (H3) (P<1.00×10 -4 ) ( Table 3). Analysis using the INTERSNP program revealed significant genotypic interaction between SNP rs2106261 and rs2200733 under an additive × additive model, but not under other models (Table 5). Overall, our studies establish that gene-gene interaction is involved in the pathogenesis of AF. Most importantly, our results suggest that gene-gene interaction accounts for heritability of human disease because it generates synergistic effects that markedly increase disease risk. The present study identifies the interaction between two common GWAS loci for AF. Ritchie et al [22] previously found that the risk alleles of common variants rs2200733 and rs10033464 at the 4q25 PITX2c AF locus could predict whether carriers of rare mutations in SCN5A (encoding the cardiac sodium channel), NPPA, KCNA5 (encoding potassium voltagegated channel, shaker-related subfamily, member 5), and NKX2.5 (encoding transcriptional factor NK2 homeobox 5) developed AF, suggesting potential interaction between common variants and rare mutation in familial AF. Moreover, Lubitz et al [23] studied AF risk signals within nine GWAS loci and found that there are at least four distinct AF susceptibility signals at the 4q25 AF locus upstream of PITX2c that may increase the risk of AF by 5-fold together. Our cellular and molecular biological studies here on rs2200733/PITX2c and rs2106261/ ZFHX3 identify a fundamental molecular mechanism underlying gene-gene interaction. SNP rs2200733 on 4q25 was the first genomic variant for AF identified by GWAS [5] and located 146 kb from the PITX2c gene encoding a paired-like homeodomain transcription factor 2 involved in the asymmetrical development of the heart and other organs [24][25][26]. Heterozygous knockout PITX2c mice developed atrial arrhythmias (atrial flutter, atrial tachycardia) upon programmed stimulation [13]. Kirchhof et al [14] showed that PITX2c is expressed in human and mouse left atria. Isolated hearts from heterozygous PITX2c knockout mice developed AF upon programmed stimulation and showed shortened action potential duration [14]. Chinchilla et al [15] showed that the expression level of PITX2c was decreased in AF patients and that atria-specific, but not ventricle-specific knockout of PITX2c, resulted in differences in action potential amplitude and increased expression of miR-1. Therefore, all evidence to date strongly suggests that PITX2c should be the causative gene for AF at the 4q25 locus. SNP rs2106261 is located within the ZFHX3 gene. ZFHX3 encodes a transcription factor [27] which contains four homeodomains and seventeen zinc fingers [28]. The ZFHX3 transcription factor appears to regulate myogenic [29] and neuronal differentiation [30]. Although the function of ZFHX3 in cardiac tissue is unknown, it is expressed in mouse hearts [31]. Here we show that PITX2c and ZFHX3 positively cross-regulates each other. PITX2c negatively regulates expression of miR-1, which negatively regulates expression of ZFHX3 by targeting a miR-1-binding site at the 3'-UTR, resulting in a positive regulation of ZFHX3 by PITX2c (Fig 9). Interestingly, ZFHX3 positively regulates expression of PITX2c. The net effect is a cyclic loop of cross-regulation between ZFHX3 and PITX2c (Fig 9). A cyclic loop of cross-regulation of two risk genes for a disease is expected to generate synergistic effects, which further increase disease risk, and therefore provides a novel molecular mechanism for gene-gene interaction. One important future direction is to determine whether this novel mechanism applies to other human disease and to plant and animal phenotypes in general.
On the molecular level, how does the cyclic loop of cross-regulation between ZFHX3 and PITX2c generate interaction between the two genes and increase risk of AF? The expression level of miR-1 was shown to be reduced in human AF patients, which was correlated with upregulation of potassium channel Kir2.1 and increased potassium current I K1 responsible for AF maintenance. PITX2c negatively regulates expression of miR-1, which increases I K1 , resulting in AF. Decreased miR-1 increases expression of ZFHX3, which increases expression of PITX2c, further decreases expression of miR-1 and increases risk of AF (Fig 9). ZFHX3 increases expression of PITX2c, which decreases expression of miR-1 and increases risk of AF (Fig 9). PITX2c positively regulates expression of ZFHX3, which further increases expression of PITX2c, and leads to down-regulation of miR-1 expression and increased risk of AF (Fig 9). In addition to miR-1, PITX2c and ZFHX3 may regulate NPPA, TBX5, NKX2.5 or other downstream target genes to increase risk of AF (Fig 9). These results provide novel insights into the roles of genegene interaction in the pathogenesis of AF.
One other important insight from this study is that not all risk genes for AF interact each other. We have previously shown that genomic variants increase susceptibility of cardiovascular diseases in a population-specific manner. Although some variants increase disease risk in both European ancestry populations and Asian populations, but other variants show significant disease association only in Asian populations [9,11,32]. For AF, we found that among ten GWAS variants for AF identified in European ancestry populations, only three were associated with risk of AF in the Chinese population, including SNPs rs2106261 in the ZFHX3 gene, rs2200733 at the PITX2c locus, and rs3807989 in the CAV1 gene. Despite the robust gene-gene interaction identified for rs2106261/ZFHX3 gene and rs2200733/PITX2c, we did not identify any significant interaction between rs2106261/ZFHX3 and rs3807989/CAV1 with all three gene-gene interaction programs (Tables 2, 4 and 5). For interaction between rs2200733/ PITX2c and rs3807989/CAV1, inconsistent results were obtained. Analysis for the OR for each multi-locus genotype and RERI analysis did not find gene-gene interaction between rs2200733 and rs3807989, whereas the INTERSNP program found significant interaction under a model of additive by additive. Future studies are needed to reconcile the differences between different programs developed for studying gene-gene interaction.
One limitation of the present study is that our statistical analysis was not adjusted for principal components to correct for possible stratification in Chinese samples due to a limited number of SNPs genotyped in the study populations. Genetic interaction may be especially susceptible to small degrees of population stratification, however, this may be unlikely given the replication of the finding in multiple populations.
In summary, we have found that gene-gene interaction can generate synergistic effects that markedly increase disease risk, therefore, accounting for a portion of heritability of human disease. Our identification of the gene-gene interaction between SNPs rs2106261 in the ZFHX3 gene and rs2200733 at the PITX2c locus provide significant insights into the pathogenesis of AF. We further show that PITX2c and ZFHX3 positively regulate each other at the molecular level, generating a loop of cross-regulation between PITX2c and ZFHX3. Our data provide an interesting molecular basis for some gene-gene interaction at the molecular genetic level.

Study subjects and preparation of genomic DNA samples
The subjects involved in the present study include AF patients and non-AF controls selected from the GeneID database [6,9,[32][33][34][35][36][37][38][39]. All study subjects are of Han ethnic origin based on self-description. The study was approved by the Ethics Committee of Huazhong University of Science and Technology and the Ethics Committees from local hospitals, and consistent with the guideline in the Declaration of Helsinki. Written informed consent was obtained from the participants.
The diagnosis of AF was made by multiple experienced cardiologists and cardiac electrophysiologists using data from 12-lead surface electrocardiograms (ECGs) or Holter recordings. The ECG characteristics of AF include the absence of P waves, the presence of rapid oscillations or fibrillatory waves (F waves), and irregular R-R intervals [40][41][42]. The controls are healthy individuals who do not have AF at the time of physical examinations or from medical records.
Details of study subjects and GeneID and preparation of genomic DNA samples were described in S1 Text.

Prediction of potential miR-1 binding sites
Details of bioinformatics prediction of miR-1 binding sites were described in S1 Text.

Plasmids, siRNAs, and microRNA mimics
Details of plasmids, siRNAs and microRNA mimics were described in S1 Text.
Cell culture and dual luciferase reporter assays HCT116 and SW620 cells were cultured and transfected with plasmid DNA, siRNAs, and microRNA mimics using Lipofectamine 2000 and the Opti-MEM I reduced serum medium as described [43,44]. Luciferase activities were measured using the Dual-Glo luciferase assay kit (Gibco Life Technologies, Gaithersburg, MD, USA) as described previously by us [43,45]. Each experiment was performed in triplicate and repeated at least three times. Details of cell culture and luciferase assays were described in S1 Text.

Real-time PCR analysis
The expression levels of PITX2c, ZFHX3, NPPA, CAV1, NKX2.5, TBX5, KCNQ1, and SCN1B were measured using real-time RT-PCR analysis with SYBR green I mix as described by us previously [36,44] and described in detail in S1 Text. Primers for real-time RT-PCR analysis are listed in S6 Table. Western blot analysis Western blot analysis was carried out as described by us previously [43,44] and described in detail in S1 Text.

Statistical analysis
The genotyping data for all SNPs are included in S8-S13 Tables. The genotyping data from the control group for each SNP were first tested for the Hardy-Weinberg equilibrium using PLINK1.06 (http://pngu.mgh.harvard.edu). If a P value was >0.01, the genotyping data were considered to be in the Hardy-Weinberg equilibrium. Genotypic frequencies in controls were all in Hardy-Weinberg equilibrium (P>0.01). For case-control association analysis, we used Pearson's 2×2 and 2×3 contingency table χ 2 tests as implemented in PLINK1.06 (http://pngu. mgh.harvard.edu) to compute the P values for allelic and genotypic associations, respectively. The same PLINK1.06 program was used to estimate the odds ratio (OR) and 95% confidence interval (CI) for each association. In order to exclude confounding factors, multivariable logistic regression analysis was performed using SPSS 17.0 to adjust for gender and age.
For analysis of gene-gene interaction, SNP rs2106261 in ZFHX3 or SNP rs2200733 at the PITX2c locus each has two alleles (G vs. A for rs2106261; C vs. T for rs2200733). The two SNPs together generate nine different genotypes. We defined the homozygous, non-risk (or protective) two-locus genotype GGCC as the reference group, and then estimated the OR of AF for each of the other eight two-locus genotypes GGCT, GGTT, AGCC, AGCT, AGTT, AACC, AACT, and AATT in relation to the reference genotype. The Pearson's 2×2 contingency table χ 2 test was used to compute the nominal P values, ORs, and 95% CIs for each genotypic association using PLINK1.06. The Breslow-Day test was carried out to test whether the ORs between two different genotypes showed a statistically significant difference.
Gene-gene interaction was also measured by a relative excess risk due to interaction (RERI) analysis [19]. The RERI analysis analyzes was suggested to be more meaningful for disease prevention and intervention in public health [46], and advocated to be more biologically interpretable compared to that measured on the multiplicative scale [47]. A synergistic effect was defined as the extent of the combined effect of the exposures in excess of the sum of their individual effects [48]. We adopted a fundamental measure of RERI versus additive interaction to quantify the extent of synergistic effect in this study. The original form of RERI was defined as RERI = RR 11 -RR 10 -RR 01 +1, where subscript 11, 10 and 01 denote relative risks (RR) for doubly-exposed and individually-exposed to each risk factor when treating doubly-unexposed as a reference. When a RERI value equals to 0, it indicates a perfect additive model. Any significant deviation from 0 indicates a synergistic (+, positive values) or antagonistic (-, negative values) interaction. In a case-control study, RERI can be calculated by substituting ORs for RRs, yielding RERI = OR 11 -OR 10 -OR 01 +1. Although simply replacing RRs with ORs would induce an exaggeration problem for ORs [19,49], especially for a high prevalent disease, it is shown that RERI in terms of ORs is a good approximation of RERI in terms of RRs in a disease such as AF with a prevalence rate of 0.4%~0.8%) [49]. Under this circumstance, an OR is a good approximation of the RR. The statistical significance of RERI values in terms of ORs was addressed by the 95% confidence intervals based on the "MOVER" method, which utilizes the asymmetric intervals for ORs [19]. Since the SNPs are bi-allelic, it is of interest to explore if the interaction exists (1) when doubly-exposed to one copy of risk alleles (i.e. doubly heterozygous genotype) and (2) when doubly-exposed to two copies of risk alleles (i.e. double homozygous risk genotypes). In both scenarios, the doubly-unexposed is referred to as the homozygous non-risk genotype (e.g. GGCC). In addition, we tested the interaction when exposed one additional copy of risk alleles given being exposed to one copy of risk alleles. Note that the doubly-unexposed in this scenario is the doubly heterozygous genotype (e.g. AGCT). The P values were estimated by 10,000 times of bootstrap sampling. The P value of 0.05 or less than and the 95% CI of RERI through zero was considered to show statistical significance.
We also conducted a 4 degree of freedom test for genotypic interaction with logistic regression developed by Cordell and Clayton [20], which was implemented in the software INTERSNP [21] as Logistic Regression test #6. This model partitions the variance in AF risk into Additive and Dominant terms for each main effect, then into Additive by Additive, Additive by Dominant, Dominant by Additive and Dominant by Dominant terms. The test yielded ORs and 95% CI for each interaction term along with the global P values for the four terms. P values for individual terms were computed using Wald tests. We also used Logistic Regression test #5 in the INTERSNP program to test for additive interaction on a multiplicative ORs scale.
In molecular studies with quantitative data, a standard Student's t-test was used to compare the means between two groups of variables. A P value of 0.05 or less was considered to show statistical significance.