Comprehensive Exploration of the Effects of miRNA SNPs on Monocyte Gene Expression

We aimed to assess whether pri-miRNA SNPs (miSNPs) could influence monocyte gene expression, either through marginal association or by interacting with polymorphisms located in 3'UTR regions (3utrSNPs). We then conducted a genome-wide search for marginal miSNPs effects and pairwise miSNPs × 3utrSNPs interactions in a sample of 1,467 individuals for which genome-wide monocyte expression and genotype data were available. Statistical associations that survived multiple testing correction were tested for replication in an independent sample of 758 individuals with both monocyte gene expression and genotype data. In both studies, the hsa-mir-1279 rs1463335 was found to modulate in cis the expression of LYZ and in trans the expression of CNTN6, CTRC, COPZ2, KRT9, LRRFIP1, NOD1, PCDHA6, ST5 and TRAF3IP2 genes, supporting the role of hsa-mir-1279 as a regulator of several genes in monocytes. In addition, we identified two robust miSNPs × 3utrSNPs interactions, one involving HLA-DPB1 rs1042448 and hsa-mir-219-1 rs107822, the second the H1F0 rs1894644 and hsa-mir-659 rs5750504, modulating the expression of the associated genes. As some of the aforementioned genes have previously been reported to reside at disease-associated loci, our findings provide novel arguments supporting the hypothesis that the genetic variability of miRNAs could also contribute to the susceptibility to human diseases.


Introduction
MicroRNAs (miRNAs) represent a class of small (,19-29 nucleotides) non coding RNAs that participate in gene posttranscriptional regulation. By binding to complementary target sites that are mainly located in gene 3'UTR regions, miRNAs inhibit mRNA translation either via mRNA degradation or via repression of mRNA translation [1]. A complete or nearly complete match of the miRNA with its target sequence generally results in a decrease of gene expression while a mismatch lead to a repression of mRNA translation. In general, miRNAs participate in regulating the expression of genes located remote from their genomic sequence; however when miRNAs are located within gene introns they are highly likely to modulate the expression of the host gene [2,3].
According to the latest miRNA reference database (miRBase release 18, www.mirbase.org) [4], it is estimated that more than 1,500 miRNAs could exist in humans. A given miRNA may have several mRNA targets and participates in the regulation of a network of genes with genomic sequence similarities [5]. Reciprocally, a given mRNA may harbour in its 3'UTR region several different miRNA target sites and then be under the control of a set of miRNAs. It is estimated that, overall, about 50% of the genome would be subject to regulation by miRNAs [6,7], making them one of the most important component of a cell. It is then not surprising to find miRNAs associated with a large number of human diseases (,300 diseases according to the human miRNA disease database [8]) including cardiovascular and metabolic disorders [9][10][11][12].
As with any genomic sequence, miRNAs are prone to nucleotide variations that may have non negligible effects. The presence of a single nucleotide polymorphism (SNP) in the long miRNA primary (pri-miRNA) may affect its maturation process, its expression or the binding of the mature form to its target, which would then influence the expression of the target genes [13,14]. This is the case, for example, for rs11614913 located in the pri-miRNA-196. It is hypothesized that this SNP affects miR-196a-2 expression, alters the miRNA-target binding site and influences cancer risks [15,16]. The existence of a SNP in the miRNA genomic sequence may create mature miRNA variants, named isomiRs, whose predicted targets could differ from the original miRNA's targets [17]. In addition, the expression of miRNAs is known to be regulated by transcriptional factors, and by polymorphisms within the transcription factor binding sites, which may then modulate miRNA expression [18]. Finally, the presence of a SNP in the miRNA target sequences could also influence the expression of the targeted mRNAs [19,20]. As an example, the rs58186-C allele located in the 3'UTR region of the AGTR1 gene has been shown to decrease the efficiency of the binding of miR-155 to this gene. leading to an increase in AGTR1 expression [20].
In this study, we conducted a genome-wide investigation of the effect of pri-miRNA SNPs (miSNPs) on monocyte gene expression in a large epidemiological study of healthy subjects for whom genome-wide monocyte gene expressions and genotype data have been collected, as part of the Gutenberg Health Study [21][22][23][24]. We also conducted a genome-wide search for pair-wise interactions between miSNPs and SNPs located in 3'UTR regions (3utrSNPs). We reasoned that such investigation could help to identify novel miRNA-sensitive regulation of gene expression in a key cell type participating in several disease processes including inflammation, atherosclerosis and immunity [25]. miSNPs effects identified were further validated for replication in a second large monocyte expression dataset, the Cardiogenics Transcriptomic Study (CTS) [26].

Results
The Gutenberg Health Study (GHS) comprised 1,467 individuals (750 men and 717 women) [23]. All these individuals were typed for common SNPs using the Affymetrix Genome-Wide Human SNP Array 6.0 and their monocyte expression profiles were obtained from the Illumina HT-12 v3 Beadchip. Detailed description of these genome-wide expression and genotype data has already been provided elsewhere [21][22][23][24].

Probes and SNPs selection
The GRCH37 release of the Human reference genome and the 17 th release of the miRNA database [4] were used to identify SNPs located within pri-miRNA sequences and 3'UTR regions. The number of miSNPs genotyped in GHS, or that could be substituted according to the SNAP software [27] by a ''proxy'' genotyped SNP in strong correlation (when expressed in terms of a pairwise linkage disequilibrium (LD) r 2 greater than 0.90) was 294, representing 258 distinct miRNAs.
The pre-processing of the expression data (see Methods) identified 22,004 probes covering 15,786 genes of ''perfect'' quality score according to ReMOAT [28] and not harboring a SNP in their genomic sequence. These probes were then tested for association with all genotyped miSNPs.
The search for interactions between miSNPs and 3utrSNP was restricted to probes targeting genes known to contain SNPs in their 3'UTR region that were either directly genotyped in GHS, or tagged by genotyped SNPs (r 2 .0.90). This led to the selection of a subsample of 8,768 probes characterizing 6,147 genes. In these genes, the total number of 3utrSNPs (or ''proxy'') that were further studied was 10,783. The distribution of the number of 3utrSNPs per gene is given in Table 1.

Association of miSNPs with gene expression
GHS discovery phase. This analysis can be viewed as an ancillary study of the whole genome-wide association study between all genotyped SNPs and all expressions already conducted in GHS and whose results can be found in a publicly available resource [23]. At the Bonferroni correction level of 7.73610 29 (ie. 0.05/(294622,004)), fifty-seven associations between miSNPs and gene expression were significant (Table S1). However, forty-eight of these associations implicated miSNPs proxies mapping the genomic region of the genes they were associated with. We interrogated the GHS express database to identify the SNPs showing the strongest association with the associated expression among those with p,5.50610 25 and located within 1Mb of the probe genomic sequence, thereafter referred to as the best cis eSNPs [23]. In six cases, the miSNP proxies were the best cis eSNPs. After adjusting for the effect of the best cis eSNPs, most miSNPs association vanished and only seven (bold lines in Table  S1) remained significant at p = 7.73610 29 . Most of these 48 cis miSNPs associations are then likely due to LD between miSNPs and ''true'' cis eSNPs. Nevertheless, this must be investigated in greater depth as in several examples the corresponding miRNA was located within an intron of the associated gene, and could therefore participate in the regulation of the host gene.
Search for miSNP 6 3utrSNP interactions GHS discovery phase. Each 3utrSNP was tested for interaction with all miSNPs with respect to the expression levels of the probes tagging the 3utrSNP-associated gene. Interactions were assessed using a standard linear regression analysis where both SNPs coded as 0,1,2 were included to the model together with the corresponding interaction term. Analyses were adjusted for age and sex. The total number of tested interactions was 4,890,102.
Instead of applying the standard Bonferroni correction to handle multiple testing, we followed the suggestion by Pare et al. [29] and adopted a weighted-Bonferroni correction according to the p-value of the Levene's test. This consists in prioritizing 3utrSNPs according to the significance of the test for a difference in the variance of expressions according to genotypes. This strategy relies on the statistical property that a significant difference in phenotypic variances according to sub-groups could be a marker for interaction phenomena.
Using this weighted-Bonferroni correction, 51 miSNP 6 3utrSNP interactions were genome-wide significant at p,1.02610 28 (Table 4). Note, only 31 would have been declared significant according the standard Bonferroni procedure ( Table 4). Seventeen of the detected interactions involved the RFPL1 rs13053624 that was found to interact with 17 miSNPs over 16 distinct miRNAs to modulate RFPL1 expression (probe ILMN_1797383). One of these interacting miRNAs was hsamir-3674. Interestingly, according to microSNiPer database [30], RFPL1 is predicted to harbor a SNP, rs13053817, in a potential target site for hsa-mir-3674 that is, according to the SNAP database, in nearly complete association with the identified rs13053624 (r 2 = 0.90). No other strong biological and bioinformatics evidence could be obtained from public databases (miRanda [31], TargetScan [5], DianaMicro [32], PicTar [33], mirBase [4]) in favour of the 30 other genes we identified through our interaction search (Table 4).
Replication in CTS. The fifty-one genome-wide significant interactions were tested for replication in CTS. However, only eight interactions could be replicable, which did not include the aforementioned interaction involving RFPL1 rs13053624.
Using the same linear regression model, further adjusted for disease status as for the discovery phase, two interactions replicated in CTS at the Bonferroni-corrected level of 6.25610 23 ( Table 5).
The first replicated interaction involved the HLA-DPB1 rs1042448 and hsa-mir-219-1 rs107822 tagged by the rs3128923/rs213208 and rs3117222/rs439205 pairs in GHS and CTS, respectively. These two loci are distant from about 100 kb and the corresponding tag SNPs were in modest linkage disequilibrium (LD), r 2 = 0.58 and r 2 = 0.56, in GHS and CTS, respectively. In GHS, the haplotype analysis of the rs107822 and rs1042448 proxies revealed that the HLA-DPB1 rs1042448-A proxy allele (i.e the allele at the proxy SNP that can be used to tag the rs1042448-A allele) was associated with a strong increase in HLA-DBP1 expression (b = +0.61, p = 1.64610 2105 ) when carried on the same haplotype as the hsa-mir-219-1 rs107822-C proxy allele ( Figure 1). Conversely, when associated with the hsa-mir-219-1 rs107822-T proxy allele, the increasing effect of the HLA-DPB1 rs1042448-A proxy allele was significantly reduced  This interaction remained significant (p = 2.81610 212 ) when the haplotype analysis was further adjusted on the best cis eSNP observed for HLA-DBP1 expression, rs3128963 (p = 2.30610 2151 ) (see GHS_Express database [23]). The same pattern of associations was observed in CTS ( Figure 1). The HLA-DPB1 rs1042448-A proxy allele was associated with a strong significant increase in HLA-DPB1 expression (b = +0.63, p = 5.24610 262 ) when carried on the same haplotype as the hsamir-219-1 rs107822-C proxy allele. The corresponding increase when the rs1042248-A proxy allele was associated with the hsamir-219-1 rs107822-A proxy allele was significantly reduced (p = 2.68610 220 ) and did no longer reach significance (b = +0.05; p = 0.23) (Figure 1).
The second replicated interaction involved the H1F0 rs1894644 and hsa-mir-659 rs5750504 tagged by the rs763137/ rs2899293 and rs1894644/rs6000905 pairs in GHS and CTS, respectively ( Figure 2). These two loci are distant from about 40 kb and the corresponding tag SNPs were in low LD, r 2 = 0.15 and r 2 = 0.14, in GHS and CTS, respectively. In GHS and in CTS, the H1F0 rs1894644-T proxy allele was associated with a strong increase in H1F0 expression (b = +0.65, p = 1.71610 253 and b = +0.79, p = 1.36610 240 , respectively) when it was on the same haplotype as the rs5750504-T proxy allele. Conversely, when the rs1894644-T proxy allele was on the same haplotype as the rs5750504-A proxy allele, the corresponding increase in H1F0 expression was lower (b = +0.23, p = 9.74610 213 and b = +0.26, p = 7.25610 28 , respectively). The test for homogeneity of the H1F0 rs1894644 effect according to the rs5750504 proxy was significant p = 3.03610 212 and p = 5.67610 210 in GHS and CTS, respectively, validating the interaction detected through standard linear regression analysis (p = 2.98610 210 and p = 1.37610 28 , respectively). Note that, in GHS, the rs763137 SNP involved in this interaction was the best cis eSNP for H1F0 (p = 1.10610 262 ).
As shown in Table S3, the two replicated interactions were consistent in CAD and healthy subjects composing CTS. (1) The rs1463335 was tagged by the rs317657 and rs998022 in GHS and CTS, respectively. The rs146335 is located on chromosome 12, at position 69,667,075. As a consequence, the association observed with LYZ and YEATS4 are considered as cis-associations, the remaining eight as trans-associations. (2) Regression coefficient associated with the rare miSNP allele under an additive effect model, adjusted for age and gender

Discussion
Coupling genome-wide association and expression studies have been an attractive strategy to disentangle the architecture of the genetics of gene expression and to assess whether gene expression dysregulation could mediate the effect of SNPs on disease risk identified through genome-wide association studies [23,34]. To our knowledge, such studies [23,[34][35][36][37] mainly focused on assessing marginal associations of single SNPs with gene expression. Even if SNP 6 SNP interactions have often been advocated as a potential source of phenotype variability [38,39], there has been few attempt to assess at the genome-wide scale whether such SNP 6 SNP interactions could influence gene expression variability. This is likely due to the statistical and computing burdens associated with such investigations characterized by a huge number of tested interactions and the very large sample size required to detect genome-wide significance. We postulated that focusing on plausible ''biological'' interactions could be one strategy to dig into the complex architecture of SNP 6 SNP interactions. This is why we undertook what we think is the first systematic and comprehensive search for interactions between SNPs located in the genomic sequence of miRNAs and SNPs located in the 3'UTR gene regions that could participate in monocyte gene expression. This search for interactions was preceded by a genome-wide investigation of miSNPs effect on monocyte expression to assess whether miSNPs could influence gene expression, in particular, through trans regulation.
These investigations were conducted in the Gutenberg Health Study where the extensive genome-wide study of marginal SNP associations with monocyte expressions had previously been reported and the results stored in a publicly available resource [23], and we replicated the significant findings in the Cardiogenics study.
Our survey of marginal miSNP effect has pointed out the hsamir-1279 miRNA mapping to chromosome 12q15 as a candidate regulator of 10 genes in monocytes. Indeed, we observed that the hsa-mir-1279 rs1463335 tagged by rs317657 or rs1463335 was (1) Regression coefficient of the interaction term when both miSNP and 3utr proxy SNPs coded 0/1/2 according to the number of carried rare alleles are introduced in a linear regression model together with their interaction term.
(2) P-value of the interaction test obtained in GHS when the Levene test p-value was used under a weighted-Bonferroni framework.  robustly associated in cis with LYZ expression and in trans with CNTN6, CTRC, COPZ2, KRT9, LRRFIP1, NOD1, PCDHA6, ST5 and TRAF3IP2. The bioinformatics prediction of the LYZ gene as a target for hsa-mir-1279 miRNA supports this hypothesis. The lack of strong correlation between the expression of these 10 genes, together with the trans association observed after adjusting for LYZ expression, could suggest that these nine genes could also be targets for the hsa-mir-1279, despite the absence of such prediction by current bioinformatics tools. However, the observation of positive associations with LYZ and NOD1, but of negative associations with the other genes, is puzzling as we could have expected, at first sight, a similar pattern of associations if all these genes were target for hsa-mir-1279. Functional experimental work is needed to characterize the role of hsa-mir-1279 in the regulation of these genes in-depth, in particular TRAF3IP2 as this gene was identified in two independent GWAS as a susceptibility locus for psoriasis [40,41]. Our results, if confirmed, could open therapeutics perspectives as it is possible to use artificial miRNA targets to modify gene expression [42,43]. A trans association pattern was also recently observed at the locus 12q15 using an unsupervised gene networks analysis of the same datasets [24]. The rs11177644 located in the 3'UTR region of the YEATS4 gene was also found associated in cis to LYZ and YEATS4 and in trans with a module of 36 genes including the CNTN6, CTRC, COPZ2, KRT9, LRRFIP1, NOD1 and ST5 discussed above. However, unlike what we observed here with hsa-mir-1279 rs1463335, the trans associations with rs11177644 had been found mediated by cis regulation mechanisms. Using a standard linear regression analysis (see above), we then tested whether these two SNPs could interact to contribute to the identified trans associations. We did not observe any strong evidence for such phenomenon as the lowest p-value for interaction was p = 8.53610 24 for PCDHA6 (data not shown).

Polymorphisms
After adjusting for rs317657, the rs11177644-G allele was associated with decreased LYZ expression (b = 20.208, p = 9.56 10 284 ). After adjusting for rs11177644, the rs317657-T allele was associated with decreased LYZ expression (b = 20.072, p = 1. 24 10 210  and YEATS4 expression, strongly associated with increased levels of NOD1 (p = 8.30610 213 ), and decreased levels of the eight other genes, with p-values ranging from 2.21610 26 to 2.52610 238 (Table 6). These results suggest that the associations observed at the 12q15 locus are much more complex as initially hypothesized. It appeared that YEATS4 and LYZ expressions could be under the influence of a common cis eSNP, but the latter would also be additionally influenced by a miSNP contributing to trans associations. As discussed in the following paragraph, further investigating including molecular experiments are required to dissect this complex pattern of association. Two interactions miSNPs 6 3utrSNPs were robustly identified, the first involving HLA-DPB1 rs1042448 and hsa-mir-219-1 rs107822, the second the H1F0 rs1894644 and hsa-mir-659 rs5750504. In both cases, the identified 3'UTR rare alleles were found to strongly increase the expression of the associated genes, but these over-expressions were highly reduced in carriers of miSNPs rare alleles. The identified miSNPs are not located within the mature sequence of the associated miRNAs but in their pri-miRNA sequences. These rare alleles could either be associated with increased miRNA expression or could tag for yet-unknown miSNPs within mature sequences leading to the production of isomiRs. It could be speculated that the associated miRNAs or isomiRs would then target the identified 3'UTR regions made sensitive to miRNAs regulation by the identified 3'UTR variants, variants that could create novel motifs for miRNAs' binding and would lead to reduction of the per se effect of the 3'UTR variant. Molecular constructs are required to assess such hypothesis. We further checked whether the identified miSNPs could interact with other 3'UTR SNPs located in genes in the vicinity of the HLA-DBP1 and H1F0 loci. We did not observe any suggestive evidence (P,0.05) for such interaction suggesting that the identified miRNA regulation would be specific to HLA-DBP1 and H1F0.
The identified interactions involved SNPs in modest LD but located within a genomic distance of less than 100 kb. Several examples have already been observed where a given miRNA participates to the regulation of a gene located in its very close vicinity [2,3,44,45]. Nevertheless, one cannot exclude the possibility that the detected interactions are tagging for other complex haplotypic effects spanning a large distance and over several genes, five genes lying between HLA-DPB1 and hsa-mir-219-1 and three between H1F0 and hsa-mir-659 ( Figure 3). Additional functional experiments would be required to biologically characterize the detected statistical interactions.
Little is known about H1F0 in human diseases except that it codes for a histone family member protein. Interestingly, hsa-mir-659 has been shown to influence the risk of dementia [46] through a mechanism that could involve histone deacetylation [47,48]. Although speculative, the joint contribution of H1F0 and hsa-mir-659 on the risk of dementia could deserve further attention. Conversely the HLA-DPB1 gene has been associated with several complex diseases such as pulmonary hypertension, hepatitis B infection and systemic sclerosis [49][50][51]. In addition, hsa-mir-219-1 was suggested to play a role in schizophrenia and in N-methyl-D-aspartate (NMDA) glutamate receptor signaling, two pathophysiological mechanisms linked to HLA-DPB1 [52,53] making our results of valuable information for scientists interested in these pathologies.
Several limitations of this work must be acknowledged. First, because our investigation was conducted on genotyped data of common SNPs, only 258 miRNAs were covered by our study, which represent less than one-quarter of the hypothesized total number of human miRNAs. Second, only one cell type was studied where not all genes are expressed. Therefore not all possible association could be explored. Third, expression were measured using the microarray technology that may be less efficient than emerging mRNA deep-sequencing methods for measuring, especially low abundant, mRNA levels [54,55]. Because a given miRNA can bind several genes and a given 3'UTR can be a target for several miRNAs, compensation phenomena are proposed to explain the relative low impact of miRNA regulation on mRNA expression generally observed [56]. Therefore, genetic effects associated with miRNA and 3'UTR SNPs are hypothesized to be a modest size and very large sample size would be required to detect them. Despite having robustly identified two interactions, we cannot then exclude that other interactions with lower magnitude could have been missed due to power considerations, even if the two genome-wide expression datasets used in this work are among the largest collected so far in human epidemiological studies. Third, by discarding from our investigations probes harboring a SNP in their genomic sequence to avoid any bias in the results of the association analyses, some miRNA-sensitive regulatory mechanisms associated to genes tagged by probes matching their 3'UTR region may have been missed. Last, our investigation was conducted in monocytes and results observed may not be portable to other cells or tissues.
Nevertheless, our study illustrates that the proposed strategy searching for interaction between miSNPs and 3'UTR SNPs in genome-wide expression studies could be an alternative to bioinformatics prediction tools to identify miRNA targeted genes.

Ethics Statement
This work was based on two genome-wide expression studies, the Gutenberg Health Study (GHS) for the discovery phase and the Cardiogenics Transcriptomic Study (CTS) for the replication stage. Both studies were approved by the Institutional Ethical Committee of each participating center and by the local and federal data safety commissioners (Ethik-Kommission der Landesä rztekammer Rheinland-Pfalz) for GHS. These two studies have already been extensively described in [21][22][23] for GHS and in [26,57] for CTS.

Gutenberg Health Study
This analysis was conducted in a population-based sample of 750 men and 717 women aged 35-74 years, of European descent. Monocytic RNA was isolated from peripheral blood monocytes by negative selection using RosetteSep Monocyte Enrichment Cocktail (StemCell Technologies, Vancouver, Canada), Trizol extraction and purification by silica-based columns. Expression profiles were assessed using the Illumina HT-12 v3 BeadChip (Illumina, CA, USA) with ,48,000 probes covering 37,804 genes, and generated data were pre-processed using Beadstudio. Values from probes with #1 bead were re-imputed using SVD impute from the pcaMethods R package [58]. Data were normalized using quantile normalization and VST transformation as implemented in the lumi R package. To avoid spurious associations due to hybridation difference, probes that contained SNPs or were not annotated to be of ''perfect'' quality according to ReMOAT [28] (Reannotation and Mapping of Oligonucleotide Arrays Technologies, http:// remoat.sysbiol.cam.ac.uk) were discarded. Individuals were typed for genome-wide genotype data using the Affymetrix Genome-Wide Human SNP Array 6.0 (Affymetrix, CA, USA). SNP analysis was restricted to autosomal SNPs with minor allele frequency .0.01, call rate .0.98 and Hardy-Weinberg equilibrium testing p-value .10 24 .

Cardiogenics Study
The present study included monocyte expression data from 758 individuals from European descent, 363 patients with coronary artery disease and 395 unrelated healthy individuals.
Monocyte RNAs were isolated from whole blood using CD14 micro beads (Miltenyi) and expression profile was processed in a single center using the Illumina HumanRef-8 v3 beadchip array (Illumina Inc., San Diego, CA) containing 24,516 probes corresponding to 18,311 distinct genes. After hybridization, array images were scanned using the Illumina BeadArray Reader and probe intensities were extracted using the Gene expression module (version 3.3.8) of the Illumina BeadStudio software (version 3.1.30). Raw intensities were processed in R statistical environment using the Lumi and beadarray packages. All array outliers were excluded and only arrays with high concordance in terms of gene expression measures (pairwise Spearman correlation coefficients within each cell type .0.85) were included in the analyses.
Genomic DNA was extracted from peripheral blood leucocytes by standard procedures (Qiagen). Genome-wide genotyping was carried out using one of two Illumina arrays; the Sentrix Human Custom 1.2 M array and the Human 610 Quad Custom array. Data from the two arrays was combined as described in [59]. SNP analysis was restricted to autosomal SNPs with minor allele frequency .0.01, call rate .0.95 and Hardy-Weinberg equilibrium testing p-value .10 25 .

Statistical analysis
The association of miSNP proxies with probe expression was tested by use of a standard linear regression model under the assumption of additive allele effects (i.e. proxy genotype coded as 0/1/2 according the number of rare alleles). Pair-wise SNPs interactions on probe expression were tested using a standard linear regression model in which both SNP (miSNP and 3utrSNP) genotypes were coded as 0,1,2 together with the corresponding product term for interaction. All analyses were adjusted for age and gender, and additionally for disease status in CTS.
In the Gutenberg Health Study, a weighted-Bonferroni procedure was applied to identify genome-wide significant interactions. Each 3utrSNP was first assessed using the Levene statistic [29] testing the equality of associated-probe expression variance across genotypes. The resulting log(p-value) was then used to weight the interaction p-value obtained from the linear regression analysis. This strategy is expected to be more powerful than a standard Bonferroni correction procedure [60,61] as it gives more weight to interaction involving probes showing higher differences in inter-genotype variance.
For each 3utrSNP u (u = 1 to N utr ) associated with a Levene test p-value q u , we define a standardized weight w u as w u~Nutr xN miSNP ð Þ log q u ð Þ P Nutr i~1 N miSNP log q u ð Þ such as P N i~1 w i~N where N utr , N miSNP , N are the total number of studied 3utrSNPs, miSNPs and interactions, respectively. Each interaction p-value P i is then weighted by the w u corresponding to the 3utrSNP that is involved in the interaction, leading to a weighted p-value P i *. Each Pi* that is then below 0.05/N is then declared genome-wide significant at the 0.05 type I error.
In Cardiogenics, the standard Bonferroni threshold was used to declare significance.
Identified interactions between pairs of SNPs were illustrated through haplotype analyses conducted by the THESIAS software implementing a Stochastic-EM algorithm for haplotype-based association analysis [62]. All other statistical analyses were performed in R v. 2.12.0.

Supporting Information
Table S1 Genome-wide significant (p,7.7 10 29 ) associations of miSNPs on monocyte gene expression in the Gutenberg Health Study. (1) SNPs showing the strongest association (with P,5 10 25 ) with gene expression within 1Mb of the associated probe. (2) Regression coefficient associated with the rare miSNP allele under an additive effect model, adjusted for age and gender. (3) P-value of the association between miSNP and gene expression. (4) P-value of the association between miSNP and gene expression adjusted for the best cis eSNP. (5) Pairwise r2 between miSNP and best cis eSNPs in GHS. (6) The best cis eSNP and the associated-miSNP coincide. (XLSX)

Table S2
Cis and trans-associations observed with the hsa-mir-1279 rs1463335 (1) separately in CAD patients and healthy subjects of the Cardiogenics Transcriptomic Study. (1) The rs1463335 was tagged by the rs998022 in CTS. The rs146335 is located on chromosome 12, at position 69,667,075. As a consequence, the association observed with LYZ and YEATS4 are considered as cis-associations, the remaining eight as trans-associations. (2) Regression coefficient associated with the rare miSNP allele under an additive effect model, adjusted for age and gender. (3) P-value of the association between miSNP and gene expression.

(DOCX)
Table S3 Patterns of detected miSNPs 6 3utrSNPs interaction separately in CAD and healthy subjects of the Cardiogenics Transcriptomic Study. (1) Regression coefficient of the interaction term when both miSNP and 3utr proxy SNPs coded 0/1/2 according to the number of carried rare alleles are introduced in a linear regression model together with their interaction term. (2) P-value of the interaction test derived from the standard linear regression analysis in CTS. Bold p-values correspond to the detected interactions that were significant after Bonferroni correction in the whole CTS. (DOCX)