Genetic Dissection of the Drosophila melanogaster Female Head Transcriptome Reveals Widespread Allelic Heterogeneity

Modern genetic mapping is plagued by the “missing heritability” problem, which refers to the discordance between the estimated heritabilities of quantitative traits and the variance accounted for by mapped causative variants. One major potential explanation for the missing heritability is allelic heterogeneity, in which there are multiple causative variants at each causative gene with only a fraction having been identified. The majority of genome-wide association studies (GWAS) implicitly assume that a single SNP can explain all the variance for a causative locus. However, if allelic heterogeneity is prevalent, a substantial amount of genetic variance will remain unexplained. In this paper, we take a haplotype-based mapping approach and quantify the number of alleles segregating at each locus using a large set of 7922 eQTL contributing to regulatory variation in the Drosophila melanogaster female head. Not only does this study provide a comprehensive eQTL map for a major community genetic resource, the Drosophila Synthetic Population Resource, but it also provides a direct test of the allelic heterogeneity hypothesis. We find that 95% of cis-eQTLs and 78% of trans-eQTLs are due to multiple alleles, demonstrating that allelic heterogeneity is widespread in Drosophila eQTL. Allelic heterogeneity likely contributes significantly to the missing heritability problem common in GWAS studies.


Introduction
Uncovering the genetic basis of quantitative phenotypes is a central, yet unresolved problem in biology. There is a major discrepancy between the heritability estimates of most quantitative traits and the amount of heritable variation accounted for by all variants localized to a causative site. This phenomenon is often referred to as the ''missing heritability'' problem. Several hypotheses have been offered as possible explanations, including widespread epistasis [1], the infinitesimal model (many, very small effect loci influencing the phenotype of interest that are difficult to detect statistically) [2][3][4], rare alleles of large effect, that are also statistically difficult to detect [5][6][7], and widespread allelic heterogeneity (many independent effects segregating at each causative locus) [7]. This quest to understand the genetic basis of complex traits has given rise to a community-based strategy of creating freely-available genetic resource populations in model organisms such as mice [8][9][10], Arabidopsis thaliana [11,12], maize [13][14][15][16], and Drosophila melanogaster [17][18][19][20]. Those organisms with the greatest genetic resources and with a community of researchers focused on a single system provide a logical starting point toward finding the missing heritability associated with quantitative phenotypes. In addition, the experimental designs of some of these resources are well suited to test different hypotheses for the sources of missing heritability. For example, Bloom et al. [21] used a large segregant pool from a two line yeast cross to demonstrate that epistasis is not a major contributor to the heritability of most traits. In particular, resources that have a well-defined multi-haplotype structure can be used to identify the extent of allelic heterogeneity [22] owing to the ability to estimate trait means for each haplotype at each mapped QTL. By focusing effort on these community resources, the hope is that we will gain a better understanding of the causes of missing heritability problem.
Much of the genetic variation underlying whole organism phenotypes is thought to be due to regulatory variation, i.e., variants influencing gene expression [23][24][25][26]. Causative loci are linked to whole organism phenotypes through the transcriptome, an interrelated network of transcripts whose abundances influence the resulting phenotype. The transcript abundances of most genes are quantitative traits themselves and have heritabilities comparable to typical whole-organism phenotypes [24,26,27]. Increasingly, expression quantitative trait locus (eQTL) mapping is being used to identify the source of genetic variation in transcript abundances with the ultimate goal of linking variation at the nucleotide level to variation in gene expression and to variation in visible phenotypes. Expression QTL studies have shown that most genes have local (cis) eQTL that tend to be located near the transcription start site and to be of fairly large effect. Distant regulatory effects (trans-eQTL) are more difficult to identify, likely because they are more numerous and are of smaller average effect, leaving a great deal of variation in transcript abundance unexplained [23,24,26,27]. There is a growing movement toward identifying the causative quantitative trait nucleotides (QTN) underlying cis-eQTL, often with the assumption there is a single causative site [28][29][30]. However, if most eQTL harbor allelic heterogeneity [31], identifying a single causative variant will cause researchers to miss a significant portion of the genetic variation [7].
Here we describe transcriptome-wide mapping in female head tissue in the Drosophila Synthetic Population Resource (DSPR) [17,18], one of the major genetic reference panels in the Drosophila model system. Our goals are two-fold. First, we aim to provide a comprehensive map of cisand trans-eQTL for female head tissue in the DSPR. A key advantage of genetic reference panels is the potential to integrate phenotypes measured at multiple levels on genetically identical individuals. Incorporating eQTL data with visible phenotype data can increase mapping power and help users identify candidate genes [9,23,25,32]. Second, we use the large set of discovered eQTL to quantify the number of alleles segregating at each causative locus, providing an evaluation of the degree of allelic heterogeneity at both cisand trans-eQTL. The hypothesis that allelic heterogeneity is prevalent in quantitative traits has not been tested directly, in part because it is difficult to do so using a genome-wide association (GWAS) framework. Within loci, linkage disequilibrium makes it very difficult to distinguish between two SNPs tagging two independent causative sites versus a single causative site. In addition, the step-wise regression approaches used, for example [2,33], to identify multiple SNPs in a gene region associated with a phenotype lack power. The result is that the majority of GWAS that have identified multiple SNPs at a single locus using conditional analysis rarely identify more than two such SNPs despite very large sample sizes e.g. [2] but see [33]. In contrast, mapping in the DSPR and other multi-parental advanced generation intercross mapping panels take a haplotype based approach, providing a natural way to distinguish between multiple alleles at each QTL and a way to ascertain the potential contribution of allelic heterogeneity to the missing heritability problem.

Results and Discussion
We mapped genome-wide expression variation using transheterozygote F1 individuals from 596 crosses between DSPR population A (pA) females and population B (pB) males, thus avoiding mapping variation for inbreeding depression. Gene expression was assayed using Nimblegen 126135 K arrays, and we analyzed the resulting data using a custom data analysis pipeline (see methods) to identify all significant eQTL.

The female head eQTL map
We identified a total of 7922 eQTLs corresponding to 7850 transcripts out of a total of 11064 transcripts tested ( Figure 1). Details for all eQTLs are in Table S1. Of these, 7704 transcripts were associated with a single cis-eQTL, 71 were associated with both cisand trans-eQTL, and 75 were associated with only trans-eQTL. A small percentage of eQTLs (,7%; Table 1) were associated with only a single recombinant inbred line (RIL) population (pA or pB; see methods), but for most eQTL fitting both pA and pB was necessary to explain the eQTL signal, indicating that causative variants were present in both populations.
The amount of variation explained by our mapped eQTLs was high (Figure 2), though our stringent, experiment-wise permutation-based correction for multiple tests severely limits our ability to detect QTL of small effect. Not surprisingly, the variance explained by cis-eQTLs was higher than trans-eQTLs [24]. Our cis-eQTLs explained a median of 24% of the phenotypic variance, and 855 eQTL explained more than 50% of the phenotypic variance. Using our heritability estimates for each transcript abundance, we estimated the percentage of the heritability each eQTL explained. The median for the percent heritability explained by each eQTL was 73%. Our trans-eQTLs explained lower levels of variance, the median phenotypic variance explained was 15%, and the median percent heritability explained was 38%. However, if heritability values are underestimated, and/or we overestimate the effects of eQTLs (which is likely due to the Beavis effect [34]), these values will be inflated. This effect is obvious for the set of eQTL estimated to explain greater than 100% of the heritability (Figure 2A).
We have provided a comprehensive map of eQTLs for female head tissue in the Drosophila model system within the constraints of our statistical power. There is little doubt many smaller effect eQTLs exist that we were not able to identify given our conservative statistical threshold. Our use of transheterozygote individuals means that we not only avoid the effects of inbreeding depression, but we have also obtained estimates for all eQTL for both pA and pB DSPR populations.

Author Summary
For traits with complex genetic inheritance it has generally proven very difficult to identify the majority of the specific causative variants involved. A range of hypotheses have been put forward to explain this so-called ''missing heritability''. One idea-allelic heterogeneity, where genes each harbor multiple different causative variants-has received little attention, because it is difficult to detect with most genetic mapping designs. Here we make use of a panel of Drosophila melanogaster lines derived from multiple founders, allowing us to directly test for the presence of multiple alleles at a large set of genetic loci influencing gene expression. We find that the vast majority of loci harbor more than two functional alleles, demonstrating extensive allelic heterogeneity at the level of gene expression and suggesting that such heterogeneity is an important factor determining the genetic basis of complex trait variation in general.
eQTLs Are Multiallelic in Drosophila Overall, our results confirm what many other researchers have observed, widespread large effect cis-eQTLs and smaller effect trans-eQTLs [23,24,26,27]. One of the major advantages of a stable genetic panel is the ability to measure multiple traits at multiple levels on genetically identical individuals, which allows for the potential to combine these sources of data to identify causative genes [9,23,25,32]. We expect this dataset to be very useful to DSPR users, particularly those interrogating phenotypes measured in females with relevance to neuroanatomy or behavior. All of the raw and analyzed data are freely available at http://FlyRILs.org/Data. The data have also been deposited in NCBI's Gene Expression Omnibus [35] and are accessible through GEO Series accession number GSE52076.

Trans-eQTL hotspots
We identified regions of the genome associated with a high trans-eQTL density to identify eQTL regulating the expression of several other genes (trans hotspots). There were two regions of high trans-eQTL density, TQTLA and TQTLB ( Figure 4; Table 2). These clusters regulate several genes distributed throughout the genome, as is apparent in Figure 1. We used a gene ontology term finder [36] to determine whether the sets of genes regulated by these trans-eQTL were related to a common process. The set of 16 genes regulated by TQTLA showed enrichment for circadian rhythm of gene expression (2 of the 16 genes regulated by TQTLA; P = 0.0007). We used principal components analysis on the set of 16 genes to create a composite variable. All 16 genes load fairly evenly on the first principal component (absolute value range: 0.08-0.20). We then correlated this composite variable with expression measures for each gene in the TQTLA region to identify possible candidate genes. The gene timeless (tim) was highly correlated with the TQTLA composite variable (r = 0.90), and it does have a significant cis-eQTL. All other genes in the interval had a correlation with an absolute value of less than 0.5. Additionally, after correlating the expression of each of the 16 transcripts regulated by TQTLA with the expression of all genes in the TQTLA region, timeless showed the maximum pairwise correlation in all 16 cases (absolute value of correlation   [37] and is involved in transcriptional regulation of circadian rhythm [38]. Not all genes in the TQTLA interval are included in our expression set. For example, some genes may have been dropped due to the presence of SNPs in probes, or were not included in the Nimblegen probe set to begin with. For TQTLA, 23 genes in the interval are not represented in the expression set. However, none of these genes are associated with any terms involving circadian rhythm, regulation of gene expression, or transcription (http:// FlyBase.org) [39], and we therefore do not consider any of these likely candidate genes.
The genes associated with TQTLB are enriched for several GO terms including compound eye pigmentation (2/11 genes; P = 0.005), the umbrella term: single-organism metabolic process (6/11 genes; P = 0.007), and several specific metabolic process terms: tryptophan metabolic process (2/11 genes; P = 0.008), indolalkylamine metabolic process (2/11 genes; P = 0.0008), indole-containing compound metabolic process (2/11 genes; P = 0.002), aromatic amino acid family metabolic process (2/11 genes; P = 0.006). Once again we performed PCA to create a composite variable. sugarbabe (sug) was the gene most highly correlated with the TQTLB composite variable (r = 20.63) and does have a significant cis-eQTL. All other genes in the interval had a correlation with an absolute value of less than 0.4. Loadings were again fairly even (absolute value range for all other genes: 0.08-0.39). Pairwise correlations between the transcripts associated with TQTLB and the expression measures in the interval showed sugarbabe to be most highly correlated in all cases except two: gene CG5321 and gene CG6834 (absolute value of correlation range for all other genes: 0.40-0.52). These two genes were also the two with the lowest loading values on the composite variable. The correlation between the estimated haplotype effects for the cis-eQTL for sugarbabe, and the effects for the trans-eQTLs were moderate (mean absolute value correlation: 0.24; min: 0.005; max: 0.44). The gene sugarbabe (sug) is expressed in the adult head [37], is involved in regulation of transcription [40], is involved in regulation of response to starvation [41], and is part of the insulin-like growth factor signaling pathway [41]. The 21 genes not included in the interval are not associated with any terms involving metabolism, regulation of gene expression, or transcription (http://FlyBase.org) [39].
We have identified two trans hotspots, and, in both cases, we were able to use our expression dataset to narrow the causative gene to a single likely candidate gene. Previous eQTL studies have identified many more trans hotspots that regulate many more genes (hundreds or thousands) than our two identified hotspots (TQTLA: 16 genes; TQTLB: 11 genes; e.g. [27,42], reviewed in [24,26]). However, while some of these global regulators of gene expression have been confirmed as true signals, most notably in yeast [43,44], Kang et al. [43] show how hotspots can result from confounding factors such as batch effects. In our dataset, we employed PCA to correct for possible batch effects [45]. This method has been shown to increase power to detect eQTL [29,45,46], however, it makes identifying even true trans global regulators impossible. The signal that results from a global regulator is statistically indistinguishable from an unmeasured batch effect. In addition, even true global regulators can confound the detection of other true eQTLs, and correcting for these true global regulators increases the power to detect these other associations [43,45]. It is possible to distinguish true trans hotspots from batch effects using biological replicates [43], but for our study we chose to maximize the number of RILs rather than increase replication to maximize our statistical power to map eQTL. As a result, we are unable to detect many trans hotspots in this study. However, our stringent statistical correction does give us increased confidence that the eQTL we do identify are indeed true signals.

Most eQTLs are multiallelic
The vast majority of our eQTLs appear to be multiallelic ( Figure S1). In 95% of cases, the number of alleles estimated at cis-eQTL was 3 or greater. For trans-eQTL this percentage was somewhat lower, at 78%. Figure 5 shows an example of an eQTL where the best model is a two allele model and of an eQTL where the full haplotype model is the best model. In cases where we estimated multiple alleles, we were able to explain additional phenotypic variance compared to the best two allele model ( Figure S2), sometimes as much as an additional 27%. We investigated our ability to accurately estimate the number of alleles by performing a simulation designed to provide the highest power to distinguish between different alleles (see methods). Our simulation revealed that our estimator underestimates the number of alleles in 63% of cases, correctly estimates the true number of alleles in 26% of cases, and overestimates the number of alleles in 10% of cases ( Figure 6). This bias toward underestimating the number of alleles gets increasingly severe as the true number of alleles increases. Our simulations with a lower effect size (5%) and normally distributed allelic effects both resulted in an even stronger bias toward underestimating the true number of alleles. Our allele number distribution for cis-eQTLs is no doubt composed of a mixture of eQTLs of varied numbers of true alleles. Overall, it is closest to the distribution we obtain for a simulation of ,5 alleles. So while most of our estimates for cis-eQTL are for 3-4 alleles, many may be determined by many more alleles.
Our results indicate widespread allelic heterogeneity for both cisand trans-eQTLs. The focus of mapping studies is often to identify the single causative variant underlying a significant signal, the implicit assumption being that the causative loci are biallelic. cis-eQTL in particular, with their large effects, are thought to be more likely than other traits to have a simple genetic architecture and be biallelic [22,[28][29][30]. Baud et al. [22] found some support for this idea when comparing a two allele model to the full haplotype model in hippocampus eQTLs in the heterogeneous stock mouse resource [32]. They found that in 97% of cases, the two allele model was superior for cis-eQTLs while trans-eQTLs were more likely to be multiallelic [22]. However, in contrast to these findings, cis-eQTLs have been found to be multiallelic in Drosophila [47], Arabidopsis [42], and humans [31,48]. Our results strongly confirm the result of multiallelism in Drosophila with 95% of cis-eQTLs estimated to be due to 3 or more alleles. This result indicates that in Drosophila, widespread allelic heterogeneity exists at one of the most basic levels of genetic variation: cis-regulatory variation.
Widespread allelic heterogeneity is one potential explanation for the missing heritability problem in the study of complex traits. Allelic heterogeneity presents a statistical challenge for GWAS [7]. GWAS utilize natural populations and interrogate each SNP (or other specific variant) for association with the phenotype of interest. At the single gene level, it is difficult to distinguish between simple linkage disequilibrium between a single causative variant and other, nearby neutral SNPs, and multiple independent causative SNPs. If GWAS focus only on the strongest association at a locus, in the presence of allelic heterogeneity that individual variant will account for less of the variation than the entire gene, causing the effect of the locus to be underestimated [7]. In this respect, haplotype-based mapping approaches, such as the one described here, have an advantage because entire haplotypes (and thus an entire set of causative variants associated with a single gene) are tested together. The effect size associated with the causative gene will tend to be larger and easier to detect in this framework. This effect, combined with the more favorable frequencies of alleles in linkage based panels could explain why these studies tend to explain very large proportions of the heritable variation [9,21,49], while GWAS grapple with large amounts of missing heritability. However, one drawback of current haplotypebased methods is that they do not have single gene resolution and therefore identifying the causative gene within the QTL interval can be a significant challenge. Furthermore, while identifying the causative loci under allelic heterogeneity is easier with haplotype based methods, the subsequent identification of the causative SNPs within the loci is made much more complicated by heterogeneity [17,18,50].  Allelic heterogeneity is typical for Mendelian diseases (http:// www.omim.org/) and it has been suggested as the likely model for quantitative traits [51]. There is a growing body of empirical [2,17,22,31,42,47,50] and theoretical [7] support for this idea. For example, one of the largest GWA studies found support for allelic heterogeneity for human height by identifying several cases of multiple SNPs likely associated with the same gene [2]. Even age related macular degeneration, the first successful GWA study [52], has subsequently been shown to harbor multiple functional alleles [53][54][55][56]. Our results should therefore not be surprising. However, they do suggest the community should focus on developing experimental designs and analytical methods, e.g., [7], that function well under a model of allelic heterogeneity.

Mapping population
We used RILs from the DSPR (http://FlyRILs.org) to map genome-wide expression variation. The DSPR has been described in detail previously. Complete details of the development of the DSPR, founder whole genome re-sequencing, and RIL genotyping are described in [17]. The development of the hidden Markov model to infer the mosaic structure of the RILs and the power and mapping resolution of the DSPR for QTL mapping are described in [18]. Briefly, the DSPR is a multi-founder advanced intercross panel consisting of a set of over 1700 RILs of Drosophila melanogaster. Two 8-way synthetic populations (pA and pB) were created from two independent sets of 7 inbred founder lines (A1-A7 or B1-B7) with one additional line (AB8) shared by both populations. Each synthetic population was maintained as two independent replicate subpopulations (pA1 and pA2 or pB1 and pB2), kept at a large population size, and allowed to freely recombine for 50 generations. At generation 50, each subpopulation gave rise to ,500 RILs via 25 generations of full-sib mating. The genomes of the original fifteen inbred founder lines have been completely resequenced, and the complete underlying founder haplotype structure of all RILs in the panel has been determined via Restriction-Associated DNA (RAD) sequencing along with a hidden Markov model (HMM).
In order to avoid potentially mapping QTL for inbreeding depression, we phenotyped trans-heterozygote F1 individuals from crosses between pA females and pB males. The crosses were done to maintain the subpopulation structure by crossing pA1 to pB2 and pA2 to pB1. In both cases, we arbitrarily crossed pA and pB RILs with the same line number (i.e., pA1 1 *pB2 1 , …, pA1 n *pB2 n , pA2 1 *pB1 1 , …, pA2 n *pB1 n ). For each of 596 crosses, we generated 4-6 replicate cross vials containing 10 virgin pA females and 10 pB males and cleared the adults after 24-48 hours to maintain roughly equal larval density across experimental vials. Both the inbred RIL parents and the experimental trans-heterozygous cross progeny were raised on standard cornmeal-yeast-molasses media at 25uC, 50% relative humidity, and on a 12:12 light:dark regime.

RNA isolation and arrays
Progeny from each cross vial were allowed to emerge and mate in the source vial for 2-4 days. Then 250-300 females were harvested over CO 2 from the multiple replicate vials. Since we did not isolate virgin females on eclosion, females are very likely mated. These experimental females were kept for 24 hours in fresh vials to minimize any effects of the anesthesia before the heads were isolated (3-5 days old). Heads were removed by transferring the females without anesthesia to a 50 ml conical bottom centrifuge tube, freezing in liquid nitrogen, vigorously vortexing, and sieving using dry ice-chilled brass analytical sieves (mesh sizes 0.0165 and 0.0278 inches), separating heads from bodies and from legs and wings. Head samples were stored at 280uC until RNA isolation.
We did not have any technical or biological replicates aside from the effect of pooling 250-300 individuals, collected from multiple source vials, for each sample. This was intentional because we are mainly interested in the variance among RILs. There were two exceptions to this lack of replication. Crosses A1.2996B2.299 and A1.3506B2.350 were prepared independently twice.
RNA was isolated using TRIzol Reagent (Life Technologies), cleaned up using RNeasy Mini spin columns (Qiagen), concentrated-if necessary-using a vacuum centrifuge, and shipped to the Carver Center for Genomics Microarray Center at the University of Iowa for cDNA synthesis and array hybridization. We used Nimblegen 126135 K arrays to assay genome-wide gene expression. These arrays assay 16,637 transcripts with eight 60 bp probes per transcript. Each array holds 12 different crosses.

Data analysis pipeline
All data analysis was performed in R [57]. Initially, we performed standard quantile normalization and corrected for background effects using the normalize and backgroundCorrect functions in the oligo package to correct for any overall array effects [58][59][60][61]. We then created a custom probe-to-transcript map using the most recent version of the CDS file available at FlyBase (v. 5.48). We blasted all probe sequences against the CDS, requiring an exact match [62,63]. We eliminated any probe sequences without an exact 60 bp match to a transcript (6842 probes). We did not require a unique match given many transcripts from the same gene share portions of their sequences. Thus a single probe can correspond to multiple transcripts.
Single nucleotide polymorphisms in probe sequences are known to affect array hybridization and thus expression measurement [64][65][66][67][68]. We took advantage of the availability of full genome sequences for all 15 founder lines to identify SNPs within probe sequences. We first updated the alignment and SNP calling for the founder re-sequencing data using the Burrows-Wheeler Aligner (BWA) [69] with the following switches: -m 50000000 -R 5000, followed by the SAMtools [70] mpileup command (the initial alignment used Mosaik and a custom SNP caller, see [17]) to obtain an accurate, comprehensive list of SNPs in the founder lines (http://FlyRILs.org/Data, Release 3). We also applied the following filters: 1) at least one founder was fixed for the minor allele and at least three founders were fixed for the major allele (given a coverage of 106), 2) minimum overall coverage of 90 (5 per sample), and 3) maximum overall coverage of 3600. A large proportion of our probe sequences contained SNPs segregating in the set of DSPR founder lines. Because we have the full genome sequences in silico of all RILs in the panel, we were able to identify all positions in probes that are SNPs in our RIL panel and test for the effect of each SNP on the expression measurement. We discarded any probes containing multiple SNPs (22018 probes). For probes containing a single SNP, we used the haplotype probabilities from the hidden Markov model to infer the probability each RIL harbored the minor allele and assigned a genotype value to each cross by adding the paternal and maternal probabilities. In the case of perfect certainty, genotype values are: 2 = AA, 1 = Aa, and 0 = aa. We then tested for the effect of the SNP on the expression measurement by fitting the following model: where y is the expression measurement, S is subpopulation, M is the cross genotype at the marker, and b s and b m are the corresponding effect estimates. We then eliminated all probes with a p-value less than 0.05 (21141 probes).
Following re-mapping of probes and elimination of probes with SNPs affecting expression, transcripts were associated with a variable number of probes instead of each transcript being associated with exactly 8 probes as in the original NimbleGen array design. We eliminated any transcript associated with fewer than four probes. Next, we performed standard RMA using the basicRMA function in the oligo package [61] to combine probespecific data and generate a single expression measure per transcript. Many genes are associated with multiple transcripts. Whether the expression of different transcripts can be independently assessed is dependent on how many probes uniquely map to each transcript. We calculated pairwise correlations between each transcript in each set of transcripts associated with a single gene. If all of the pairwise correlations between the set of transcripts were . = 0.95, we used the average expression for the gene. Otherwise, we mapped each transcript separately. We will refer to all expression measures (including those averaged across transcripts for a single gene) simply as transcripts for clarity.
We followed the methods of [29,46] and used principal components analysis (PCA) to minimize batch effects [45] and increase our power to detect QTL. Following quantile normalization of each transcript to coerce each transcript distribution to be normal, we performed PCA on the entire set of transcripts. We selected the first 10 principal components to correct our expression measurements. The percentage of the variance explained by each remaining principal component was below 1% (Figure S3). We then fit the following model where y i is the ith expression measurement, S is subpopulation, x j is the jth principal component, and b s,i and b j are the corresponding effect estimates. We used the resulting residuals for the remaining analyses. We performed an additional round of quantile normalization on these residuals to ensure normality. We estimated the narrow-sense heritabilities for all transcripts by fitting a linear mixed model using the polygenic function in the GenABEL package [71]. Briefly, the model includes a random effect polygenic term whose variance is determined by the kinship matrix between RIL crosses. We calculated the kinship matrix using the genome-wide haplotype assignments resulting from the HMM. At each position spaced every 0.025 cM, we calculated the probability of identity by decent and averaged these across the genome to obtain the relationship coefficient. Our kinship matrix is thus estimated over genetic distance. We then used the polygenic function to calculate heritabilities for each transcript [71].
To map eQTLs, we first selected transcripts expressed above background levels. We utilized the two replicated samples, A1.2996B2.299 and A1.3506B2.350, to identify the point where measurements were less repeatable and excluded all transcripts with expression levels below this point ( Figure S4). This cutoff excluded approximately 23% of transcripts. For all included transcripts, we performed haplotype-based genome scans by fitting the following model at regularly spaced positions every 10 KB across the genome (11768 positions; http://FlyRILs.org/Data, Release 3).
where y r,i is the ith transcript, m is the grand mean, G A,j are the genotype probabilities for the jth paternal RIL, G B,j are the genotype probabilities for the jth maternal RIL, and b A,j , and b B,j are the corresponding effect estimates. Because we assayed only females, the model for the X chromosome is the same as for the autosomes. At each position, we calculated the F-statistic for the overall effect of genotype and obtained LOD scores.
To identify the statistical significance threshold, we performed 1000 permutations of the expression measures [72]. The entire set of expression measures was permuted together to maintain the correlation structure in the dataset. We used these permutations to determine a conservative genome-wide, experiment-wise 5% significance threshold (threshold = 14.99). We also determined a separate threshold for cis-eQTL. We defined cis-eQTL as QTL occurring within 1.5 cM of the transcription start [18] site for each transcript (1.5 cM is our typical confidence interval width). To define a cis-only threshold, we only included the LOD scores for the positions within 1.5 cM of the transcription start for each gene (threshold = 14.4).
We identified all peaks with LOD scores exceeding the abovedefined thresholds. When multiple nearby peaks were identified, we determined whether their 3 LOD drop intervals overlapped, and, if so, only the peak with the highest LOD score was retained. We expect 3 LOD drops to be a conservative estimate of the 95% confidence interval. Standard 2 LOD drops have been shown to be overly narrow for pA6pB cross designs [18]. It should be noted however, that confidence intervals on QTL locations are not true 95% confidence intervals and effect size, sample size, and the number of haplotypes in the model affect the degree of coverage. We also calculated Bayes credible intervals, for which 95% coverage tends to be more consistent [73,74].
In a pA6pB cross, a mapped QTL may be due to genomic variation at that position in only one population or in both. We identified peaks associated with only a single population using Akaike's Information Criterion (AIC). We calculated the AIC for three models: pA alone, pB alone, and pA & pB. The smallest AIC indicates the model with the best fit. Thus any cases in which the lowest AIC resulted from a reduced model, the QTL peak was concluded to be due to variation in a single population.
We identified trans-eQTLs influencing multiple transcripts by estimating the trans-eQTL density across the genome using a 500 kb sliding window with a step size of 1 kb. Our estimate of density included only unique genes, not transcripts to avoid counting multiple transcripts associated with a single gene as independent events. If trans-eQTL density in a window exceeded the density expected by chance under a Poisson distribution, we concluded it was a significant trans hotspot. This threshold for a Poisson distribution given the total number of trans-eQTLs (147), the window size (500 kb), the size of the genome tested (118 Mb) and the Bonferonni corrected P-value threshold (117,741 tests; P = 4.2610 27 ) is a trans-eQTL density greater than 6. We delineated the size of these hotspot regions as the lowermost and uppermost confidence interval bound for any trans-eQTL peak included in a window exceeding a density of 6.
Our initial scan identified 3 trans hotspots but upon further investigation, we determined one to be a false signal resulting from a single gene family. All of the eQTL peaks associated with this hotspot represent 13 members of a single gene family located on the X chromosome: Stellate (Ste). In addition, members of this family also occur at an unlocalized region in the heterochromatin on the X chromosome. The ''trans-'' eQTL we map regulating this family is located at the very tip of the X chromosome, making it very likely we are tagging this heterochromatic location of Stellate members, and it is in fact an additional cis effect. In fact, all thirteen members show two peaks, one cis peak and a second ''trans'' peak at the tip of the X, indicating most of our probes for these genes are tagging multiple members of this gene family. In addition, Stellate is expressed in adult males and involved in spermatogenesis (http://FlyBase.org) [39]. It is likely we are seeing high expression due to large numbers of copies of gene family members (,200 copies) [75]. We therefore excluded this trans hotspot.

Estimating the number of alleles at eQTLs
We estimated the number of alleles at each eQTL using a model comparison technique similar to the method employed by Yalcin et al. [76] and Baud et al. [22] The major difference in our approach is that we consider models with more than 2 alleles and do not restrict our analysis to specific SNPs in the QTL interval. The merge analysis employed by Baud et al. [22] considered all two allele models associated with a single SNP within the QTL interval. We simply assign different alleles to different haplotypes without those necessarily corresponding to SNPs in the interval. This method also allows us to consider models with several alleles. For each eQTL, at the peak position, we fit all possible models for different numbers of alleles, fitting a maximum of 11337 models at each eQTL. We first estimated the haplotype means at the peak, sorted these means, and then fit all possible models that did not change the order of the haplotype means for 2, 3, 4, 5, 6, 7, 8, and 16 (the full model allowing different estimates for AB8 in pA RILs and AB8 in pB RILs) alleles ( Figure S5). We only included haplotypes at the peak that occurred at least 5 times (at a probability of greater than 95%) in our set of crosses. Haplotypes at lower frequencies lead to inaccurate estimates of haplotype means with large standard errors. For each possible allele grouping, individual founder haplotype probabilities in each allele group were summed to obtain a probability each RIL harbored each allele group. For example, if haplotypes A3 and A5 are grouped as a single allele named allele 1, and the probabilities a given RIL cross harbors the A3 or A5 haplotype are 0.90 and 0.03 respectively, then the probability that RIL cross harbors allele 1 is 0.93 (i.e., the probability the RIL cross harbors either A3 OR A5 and thus allele 1). Alleles were only combined within pA and within pB given that the pA and pB sets of probabilities are independent. The model fit was as follows: where y r,i is the ith transcript, m is the grand mean, na is the number of pA allele groupings, nb is the number of pB allele groupings, G A,c are the genotype probabilities for the cth paternal allele group, G B,d are the genotype probabilities for the dth maternal allele group, and b A,c , and b B,d are the corresponding effect estimates. The model with the lowest Pvalue was chosen as the best model and the number of alleles associated with this model was recorded. We also explored using Akaike's information criterion (AIC) to choose the best model, however simulations revealed a higher error rate using AIC (see below). Table S3 provides hard coded genotype assignments for all RIL crosses at all significant eQTL.

Simulation
To test our method of estimating the number of alleles associated with QTL, we simulated QTL stemming from between 2 and 15 different alleles and subsequently estimated the number of alleles using the model comparison methodology described above. We intentionally set up this simulation to make distinguishing different alleles as easy as possible. We performed 1000 iterations for each of 2, 3, 4, 5, 6, 7, 8 and 15 alleles (the full model assuming the same effect for AB8 in the pA and pB panels). For each iteration, we randomly selected 600 pA RILs and 600 pB RILs from the DSPR panel and randomly paired them to create pA-pB crosses. We then simulated a QTL in this set of RIL crosses at a randomly selected position in the genome with the chosen number of alleles. We assigned the different alleles equal effects, because we found equal effects gave higher power to distinguish different alleles compared to pulling effects from a normal distribution ( Figure S6). For example, for a four allele model each founder haplotype was randomly assigned an effect of 1, 2, 3, or, 4. We assumed an additive model to calculate a genetic effect for each cross. We generated a set of random normal ) to correspond to environmental variance where z = the percent of the phenotypic variance explained by the QTL and s 2 G is the genetic variance at the QTL. The percent of the total phenotypic variance explained by the QTL was randomly chosen from our observed distribution of phenotypic variance explained by cis-eQTLs. These effects tend to be quite large, however, we found large effects lead to higher power to distinguish different alleles ( Figure S7). We then estimated the number of alleles at our simulated QTL as described above. We used two methods to determine the best model: 1) the model with the lowest P-value, and 2) the model with the lowest AIC. Our results showed the method using P-values had a greater accuracy (P-value method: 26% accuracy; AIC method: 19% accuracy). More importantly, the AIC method overestimates the true number of alleles more often, estimating more than two alleles in 83% of cases when the true number of alleles is two (Table S2). We prefer the method that is more conservative, meaning it has a greater tendency to underestimate rather than overestimate the number of alleles, and we therefore use the P-value method in all subsequent analysis ( Figure S8). Complete sensitivity information for the different methods and the different simulation models can be seen in Figures S5, S6, S7 and in Table S2.  Figure S8 The true number of alleles versus the estimated number of alleles for a simulation identical to that described in the main text but with AIC determining the best model instead of the lowest P-value. The size of each circle and the number displayed denotes the percentage of times each number of alleles is estimated for a given true number of alleles. The estimated number of alleles for our cisand trans-eQTL using the AIC method are shown at the top of the plot in blue. (PDF)

Table S1
Complete details for all eQTL. Columns are as follows: Name = eQTL identifier, TID = transcript identifier (CG name) for transcripts mapped separately, gene identifier otherwise, GID = gene identifier (CG name), chr = chromosome location of eQTL peak, peakp = physical position of eQTL peak, peaklpL = lower confidence interval bound using 3 LOD drop (physical position), peakupL = upper confidence interval bound using 3 LOD drop (physical position), peaklpB = lower confidence interval bound using Bayesian credible interval (physical position), peakupB = upper confidence interval bound using Bayesian credible interval (physical position), peakg = genetic position of eQTL peak, peaklgL = lower confidence interval bound using 3 LOD drop (genetic position), peakugL = upper confidence interval bound using 3 LOD drop (genetic position), peaklgB = lower confidence interval bound using Bayesian credible interval (genetic position), peakugB = upper confidence interval bound using Bayesian credible interval (genetic position), LOD = LOD score at eQTL peak, Pvar = percent phenotypic variance explained by eQTL peak, h2 = heritability of transcript abundance, psdist = physical distance to transcript start site, gsdist = genetic distance to transcript start site, cis = true/false for whether eQTL is cis, GlocC = chromosomal location of transcript, GlocP = physical location of transcript start site, GlocG = genetic location of transcript start site. (TXT)

Table S2
Sensitivity of the minimum P-value and AIC method of estimating different alleles for different simulation models. For each, the probability of estimating 2 or more alleles given a true value of 2 or more alleles is displayed. (DOC)

Table S3
Hard coded founder genotype assignments at all significant eQTL. Each RIL at each eQTL peak is assigned the most likely founder genotype, given the probability is greater than 0.95. This corresponds to a 2 digit number with the assignment from the population A RIL and the population B RIL. E.g. the number 24 indicates that RIL cross has an A2B4 genotype. If the highest founder genotype probability is less than 0.95 it is coded as uncertain. The number 9 indicates an uncertain assignment.
Column names for columns 5 to 601 are the maternal RIL ID. The paternal RIL is the RIL with the matching number in the corresponding subpopulation (see Methods). Other columns are: Name = eQTL identifier, TID = transcript identifier (CG name) for transcripts mapped separately, gene identifier otherwise, GID = gene identifier (CG name), chr = chromosome location of eQTL peak. (ZIP)