Genome-wide expression profiling Drosophila melanogaster deficiency heterozygotes reveals diverse genomic responses

Deletions, commonly referred to as deficiencies by Drosophila geneticists, are valuable tools for mapping genes and for genetic pathway discovery via dose-dependent suppressor and enhancer screens. More recently, it has become clear that deviations from normal gene dosage are associated with multiple disorders in a range of species including humans. While we are beginning to understand some of the transcriptional effects brought about by gene dosage changes and the chromosome rearrangement breakpoints associated with them, much of this work relies on isolated examples. We have systematically examined deficiencies on the left arm of chromosome 2 and characterize gene-by-gene dosage responses that vary from collapsed expression through modest partial dosage compensation to full or even over compensation. We found negligible long-range effects of creating novel chromosome domains at deletion breakpoints, suggesting that cases of changes in gene regulation due to altered nuclear architecture are rare. These rare cases include trans de-repression when deficiencies delete chromatin characterized as repressive in other studies. Generally, effects of breakpoints on expression are promoter proximal (~100 bp) or within the gene body. Genome-wide effects of deficiencies are observed at genes with regulatory relationships to genes within the deleted segments, highlighting the subtle expression network defects in these sensitized genetic backgrounds.

Author summary
Deletions alter gene dose in heterozygotes and bring distant regions of the genome into juxtaposition. We find that the transcriptional dose response is generally varied, gene-specific, and coherently propagates into gene expression regulatory networks. Analysis of deletion heterozygote expression profiles indicates that distinct genetic pathways are weakened in adult flies bearing different deletions even though they show minimal or no overt phenotypes. While there are exceptions, breakpoints have a minimal effect on the expression of flanking genes, despite the fact that different regions of the genome are brought into contact and that important elements such as insulators are deleted. These data suggest that there is little effect of nuclear architecture and long-range enhancer and/or silencer promoter contact on gene expression in the compact Drosophila genome.


Introduction
Deficiency (Df) is a genetic definition for mutations that affect contiguous loci on a chromosome [1]. They are now known to be a result of DNA deletion [2] and have many important uses in genetic analysis. Dfs are part of an important series of tests for defining the nature of mutant alleles according to Muller's morphs [3] where, for example, an allele is said to be an amorph when, in the homozygous condition, it exhibits the same phenotype as when uncovered by a Df encompassing the locus. Genetic mapping by complementation tests using a series of defined Dfs is also common, although not necessarily definitive, since dose-dependent interactions between loci (non-allelic non-complementation) can also result in mutant phenotypes [2]. Many dominant dose-dependent suppressor and enhancer mutations had been identified in Drosophila by the 1930s [4], and screening for non-allelic modifiers of mutant phenotypes is one of the most important uses for large collections of Dfs that tile the genome. The genetic interactions uncovered in such screens can be extremely informative, since gene pairs showing dose-dependent interactions often encode near neighbors in genetic pathways or subunits of the same protein complex. "Df kit" screens for modifiers of a gene of interest can thus rapidly identify regions where genes encoding members of the same pathway reside [5]. However, despite the undisputed utility of Dfs, we know relatively little about how these widely used tools globally impact the transcriptome.
Lee et al. Df/+ expression
Drosophila melanogaster shows very little clear haploinsufficiency [2], with most mutant alleles recessive to the wild type allele. The largest group of haploinsufficient loci is the Minutes, which encode ribosomal proteins or elongation factors [6], suggesting that there is a very strong requirement for diploidy when it comes to ribosome biogenesis. However, like many other animals, Drosophila is sensitive to large-scale reductions in gene dose. In a classic study, the entire genome was examined for dosage effects using synthetic deletions generated through crosses between translocation-bearing flies [7], and this segmental aneuploidy screen demonstrated that, outside of haploinsufficient regions, deleterious effects of gene dose reduction are generally dependent on the amount of material removed rather than the particular locus. This pioneering work suggested that there are many small additive or cumulative effects of reduced gene dose and, as the extent of a deleted segment grows, more genes in any given pathway are perturbed [8]. Thus, the effects of dose alteration accumulate, propagate, and eventually collapse the network. The limit of approximately a 1% deletion of the genome that Drosophila tolerates [7] is likely to reflect the connectivity of the gene network and the limits of network robustness [8]. The small effects associated with dose reduction are the main reason that Dfs are so useful in enhancer and suppressor screens. The dose changes in pairs of genes close in a network result in a phenotype, even though dose reduction of either alone is without overt consequence.
With the more recent application of genomic approaches, we are beginning to understand more about the effect of gene dose on the expression of autosomal hemizygous (one-copy) genes in Drosophila. In general, gene expression goes down when gene dose is reduced, but not by 2-fold [8-16]: genes tend to be expressed at a higher level than expected if there were a simple one-to-one relationship between copy number and expression level. Such modest, but measurable, autosomal dosage compensation could be due to the biochemical properties of pathways and the regulatory interactions commonly found in molecular biology [17] or to a more global response to aneuploidy that specifically recognizes aneuploid segments and increases expression of all genes in that segment [12]. The latter is analogous to the sex chromosome dosage compensation system that globally increases expression of the single X in wild type Drosophila males [18].
There has been some debate as to whether the non-sex-chromosome (autosomal) dosage compensation response is due to a general effect, elevating the expression of all genes, or to a gene-by-gene effect consistent with classic gene regulation [8,10]. Within the genome there are many genes that show a consistent response, which could be due to a general system, but there are also dramatic outliers, where expression of one-copy genes collapses or actually increases. The latter are more consistent with disrupted positive or negative feedback loops. The best evidence for gene-by-gene regulatory compensation is the coherent propagation of expression changes across gene expression networks observed in Df/+ flies [8]. The dose effects for essentially the entire genome have been probed in highly aneuploid Drosophila cell lines [15,16], but the vast numbers of changes in these cell lines make interpretation of propagation extremely challenging. Cell lines have also evolved copy number states and show variable degrees of dosage compensation, which confounds analysis. One way to help address issues relating to mechanisms of autosomal dosage compensation would be to obtain a larger sample of expression-profiled Df genotypes. We have therefore examined the effects of chromosome arm 2L Dfs (Df(2L)) on transcription in adult females and males in two genetic backgrounds, generating a total of 815 expression profiles in biological duplicate (or greater).
We report on three aspects of the effect of Df(2L)s on the transcriptome. First, we show that one-copy gene expression is generally locus-specific, suggesting that biochemical processes and molecular regulatory circuits account for most autosomal dosage compensation. However, the genome is organized into chromatin domains flanked by insulators [19] and we also provide evidence that deletions within chromatin domains associated with repressive chromatin marks result in superior compensation or even overexpression. This counter-intuitive effect of increasing expression of one-copy genes suggests that there is a trans effect of Dfs that can weaken repressive domains. Surprisingly, these effects were preferentially found in females.
Second, Df breakpoints bring together two regions of the genome that are usually distant in the linear chromosome. This can result in breakpoint proximal changes due to transcription unit fusions, or local changes due to juxtaposition of regulatory regions such as enhancers, and it may be expected that fusing domains in cis would generally result in altered expression within novel chromatin domains. In agreement with previous work on inversions [20], our results suggest that there is very little functional long-range promoter communication with enhancers or silencers, and that disrupting chromatin domains is generally innocuous in terms of transcription. Third, genes function in networks; thus, perturbations should act at a distance in genomic or 3D nuclear space due to information propagation through dynamic biological systems controlled by transcriptional regulators. We find strong support for this type of network structure in the expression profiles, since we observed that reduction in transcript levels from one-copy genes propagates to primary network neighbors and is dissipated after tertiary network separation. This suggests that we can learn much about the logic of gene networks by measuring how they respond to dose changes in hemizygous conditions without overt phenotypes, rather than profiling mutants with morphological, physiological, or behavioral phenotypes that complicate pathway analysis.

RESULTS
To systematically investigate the effects of deletions on transcription, we expression profiled a set of molecularly defined hemizygous DrosDel fly genotypes [21] uncovering approximately 68% of the euchromatic portion of the left arm of chromosome 2 (2L; Figure 1A). We examined gene [...] [22] from two sub-pools to characterize measurement variance and ratiometric performance, and determined low expression cutoffs based on an evaluation of intergenic expression. At a minimum we used biological duplicates for each Df and each sex for a total of 815 expression profiles, which are available in the Gene Expression Omnibus (GEO, accessions GSE61509 and GSE73920).

Gene dose responses
To compare one-copy expression to two-copy expression for individual genes on 2L we took advantage of the fact that there were many Df/+ genotypes where a given gene was two-copy. Therefore, for any given gene, we took the median of two-copy gene expression in all lines as a reference for the expression when that gene was only in one-copy. To summarize the typical responses of genes to their own dose, we pooled the data for one-copy gene expression across all Df/+ genotypes within the isogenic or hybrid backgrounds. In both genetic backgrounds, we observed a clear reduction in gene expression from one-copy (p < 0.001, Mann-Whitney U test) as compared to two copies. However, our analysis confirmed previous reports that reduced expression is not 2-fold [8,12-14]. We observed a mean 1.1-fold compensation in response to gene dose reduction (Figure 1B,C). As in previous work, we observed that compensation was not due to a uniform effect on all genes, as one-copy gene expression was skewed towards compensation (Figure 1D,E; Pearson's second coefficient of skewness = 0.14-0.31 for one-copy genes compared to 0.01-0.03 for two-copy genes; kurtosis = 6.6-11.5 for one-copy genes compared to 12.8-14.4 for two-copy genes), with extended tails in the distributions of one-copy gene expression values (not shown in the truncated plots). These data indicate that different genes show differences in compensation responses. We observed similar (but not identical, as will be important later) compensation in females and males (Figure 1F,G).

[...] expression are expected to be more sensitive to noise and therefore might require tighter expression level control. However, we observed no increased compensation for genes expressed at low levels in our study (Figure 1H,I) and this was independent of the low-expression cutoff. Low gene expression in whole animals can be due to low uniform expression in most cells or high expression in limited cell types. Compensation has also been reported to be biased for broadly expressed genes [13]. We therefore asked if temporal or spatial heterogeneity, or representation of Gene Ontology (GO) terms, correlated with compensation, but again we observed no significant trend (not shown). It is likely that data compression and increased contributions of technical noise to low-level gene expression measurements contributed to overestimating compensation at low expression levels in previous work and confounded subsequent analysis (see Discussion). Thus, while the dose response has a gene-specific component, we were not able to explain that response by particular gene expression levels or functional gene categories.
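The reference-based comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the function names, the nested-dictionary layout, and the toy values are hypothetical, and "fold compensation" is assumed here to mean the observed one-copy/two-copy ratio divided by the 0.5 expected in the absence of compensation.

```python
import math
from statistics import mean, median, stdev


def one_copy_log2_ratios(expression, copy_number):
    """Return log2(one-copy / median two-copy) keyed by (gene, genotype).

    expression[genotype][gene]  -> expression value (for example FPKM)
    copy_number[genotype][gene] -> 1 if the Df uncovers the gene; genes
                                   not listed are assumed to be two-copy
    """
    genes = {g for profile in expression.values() for g in profile}
    ratios = {}
    for gene in genes:
        # Reference: median expression over genotypes where the gene is two-copy.
        two_copy = [expression[gt][gene] for gt in expression
                    if gene in expression[gt]
                    and copy_number[gt].get(gene, 2) == 2]
        if not two_copy:
            continue
        reference = median(two_copy)
        if reference <= 0:
            continue
        for gt in expression:
            value = expression[gt].get(gene)
            # Skip missing or zero values, for which log2 is undefined.
            if value and copy_number[gt].get(gene, 2) == 1:
                ratios[(gene, gt)] = math.log2(value / reference)
    return ratios


def fold_compensation(log2_ratio):
    """Observed one-copy ratio divided by the uncompensated expectation (0.5)."""
    return (2 ** log2_ratio) / 0.5


def pearson_second_skewness(values):
    """Pearson's second coefficient: 3 * (mean - median) / standard deviation."""
    return 3 * (mean(values) - median(values)) / stdev(values)


# Toy data (hypothetical): g1 is one-copy in DfA, g2 is one-copy in DfB.
expression = {
    "DfA": {"g1": 55.0, "g2": 100.0},
    "DfB": {"g1": 100.0, "g2": 50.0},
    "DfC": {"g1": 100.0, "g2": 100.0},
}
copy_number = {"DfA": {"g1": 1}, "DfB": {"g2": 1}, "DfC": {}}
ratios = one_copy_log2_ratios(expression, copy_number)
```

In the toy data, g1 is expressed at 55% of its two-copy reference (partial compensation) while g2 is expressed at exactly 50% (no compensation). Applying `pearson_second_skewness` to the pooled one-copy ratios gives the kind of skew statistic quoted in the text.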

Gene-specific dosage response examples
To further explore the influence of locus, sex and genetic background on the dosage response, we used overlapping Dfs to increase the number of expression measurements from one-copy genes. This analysis has the added advantage of determining if a particular Df used to uncover a gene altered the response. We examined a region near the middle of 2L (cytological regions 33-34) with five distinct Dfs, and a second closer to the centromere (cytological regions 36-37) with four different Dfs (Figure 2A-D). We observed a complex variety of expression variance patterns, compensation responses, and sex- or allele-biased compensation depending on the individual locus. One-copy gene expression is generally noisier than two-copy gene expression, which we discuss at length elsewhere (Cho et al, companion paper), but this is also gene-specific. For example, we observed a wide range of responses to reducing the dose of nubbin (nub), from overcompensation (>2-fold increase) to anticompensation (>2-fold decrease), which also showed some genetic background-specificity, as we observed better compensation in the modENCODE OregonR background (Figure 2A,B). In contrast, the hook gene showed no compensation across 24 different experiments (Figure 2C,D). We observed partial compensation of the Multidrug-Resistance like Protein 1 (MRP) locus in females (Figure 2A), but variable compensation in males (Figure 2B). We also observed a sex-biased response in the case of Similar to deadpan (Sidpn), which was overcompensated in females (Figure 2C) and partially compensated in males (Figure 2D). As expected, based on the correlation between compensation in females and males across 2L (see Figure 1F,G), we found that many genes showed similar responses. For example, we observed over-compensation of CG15485 (Figure 2A,B) and anti-compensation of CG17572 (Figure 2C,D) in both sexes. Most ribosomal protein encoding genes are haploinsufficient, resulting in a Minute phenotype. We note with interest that the ribosomal-protein-encoding gene RpL30 (Figure 2C-E) showed evidence of compensation, consistent with the lack of a Minute phenotype reported for mutations in this gene [6]. A second ribosomal protein encoding gene (RpL7-like, Figure 2F) also showed compensation. That these two loci are rare examples of ribosomal protein encoding genes that are not haploinsufficient genetically and are exceptionally well compensated at the transcriptional level supports the idea that stoichiometric mRNA levels of ribosomal protein encoding genes are ultimately important for ribosome function [6].
We observed one case where the particular uncovering Df correlated with a specific response. In males, the cluster of the ACXA, ACXB, ACXC, and ACXE genes showed very good compensation when uncovered by Df(2L)ED775, but much poorer compensation when uncovered by four other Dfs (Figure 2A,B). The increased compensation when these genes were uncovered by Df(2L)ED775 was also allele-specific as the effect was only observed in the isogenic background. The amount of DNA removed by Df(2L)ED775 was more extensive than most of the Dfs used in the study (Figure 1A), raising the possibility that the extent of a deletion contributes to compensation. However, we observed no significant relationship between the length of hemizygous segments and dose responses in our experiments (Figure 2G,H).

Nuclear architecture and dosage responses
Our data do not support the idea that specific functional classes of genes or gene features, such as length, expression breadth or level, are associated with distinct dosage responses. However, we did notice that some blocks of genes showed common compensation responses and speculated that these might correspond to a particular chromatin state. [...] (Figure 3B,C). These repressive domains show overlapping characteristics between the DamID and Hi-C studies, as well as being enriched in LaminB binding. When we specifically looked at LaminB domains, we also observed improved compensation (Figure 3D). We were very surprised to find that all these improved compensation distributions were only observed in females: we found no significant correlations, or even hints of a trend, between repressive chromatin and compensation in males. We also observed improved compensation in regions of Polycomb group (PcG) protein occupancy (Figure 3B), but not in the structural domains enriched in those proteins from Hi-C (Figure 3C). Again, we observed a correlation between PcG occupancy and compensation only in females. In addition to the increased median (and mean) compensation levels in these repressive chromatin domains, we observed an increased range of responses. Thus, there is a greater heterogeneity in the compensation response within these domains. We observed a modest but significant decrease in compensation in one of the two types of active chromatin in both sexes based on occupancy (Figure 3B). Active regions of the "Yellow" type, which are enriched in H3K36me3 domains and in genes with broad expression patterns [24], showed significantly worse compensation, while active regions of the "Red" type showed the global dosage response.
The locations of chromatin domains were defined from different samples (e.g. cells or embryos) than our RNA-Seq analysis (adult flies), but these data indicate that the female dosage response is different from males in contiguous regions regardless of underlying cause.

[Figure caption fragment: Scatterplots that compare one-copy gene expression relative to two-copy gene expression between the isogenic genetic background and hybrid genetic background in females (C) and males (D). A subset of genes in (C) represents "better compensated" genes identified by clustering analysis (Green). r = Pearson's correlation coefficient. Slopes are from linear regression. P values are from F-tests.]

Breakpoints and gene expression
[...] (Figure 5A). We also observed an ambiguous effect of breakpoints in HP1 domains on two-copy gene expression, but only in females, and the directionality of the change in expression differed in the two genetic backgrounds we assayed. We found slight and occasionally significant increased expression of two-copy genes flanking breakpoints in "Null" domains (Figure 5A). Breakpoints in other chromatin domains showed no significant changes in the expression of breakpoint proximal two-copy genes. Our data suggest that LamB repressive domains are more sensitive to de-repression in trans than in cis. This suggests that LamB domain repression is additive or cooperative across homologs.

Deletions can also remove cis-regulatory regions such as enhancers and silencers. For example, [...] all have a common breakpoint just upstream of the CG31646 promoter, deleting a region bearing a known CNS regulatory region [29] (Figure 5B). In males, and especially in females, these deletions resulted in dramatic overexpression of CG31646, suggesting that a silencer was removed by each of these deletions. To determine if the effects of structural rearrangements on gene expression are common, we centered all the breakpoints from the Dfs used in this study and plotted expression flanking the breakpoint as well as the median absolute deviation to summarize the results. We observed no significant change in expression with distance from the breakpoint, with the exception of genes within 100 bp of a breakpoint (Figure 5C). Even this breakpoint proximal effect is probably less significant than it appears, since the spike of increased expression in the isogenic background is due almost exclusively to the Dfs in Figure 5B. Thus, despite the fact that the deletions in our study removed 2,100 insulators identified in an embryo study [30], and have breakpoints that disrupt 437 chromatin domains identified in a Hi-C study [25], we find little evidence that these play major roles in transcription. These data suggest that hemizygous chromatin rarely alters expression on the deletion homolog and that bringing two separated regions into a novel configuration has little effect on the expression of two-copy genes near breakpoints. These data support the idea that the Drosophila genome is compact: most genes are regulated with promoter proximal regulatory sequences and show little long distance effects of chromatin structure. The regional effects of LamB domains in females are an exception.
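The breakpoint-centered summary described above (expression of flanking two-copy genes as a function of distance from the breakpoint, summarized by the median and the median absolute deviation) could be sketched as follows. This is a minimal illustration, not the code used for Figure 5C: the function names, the bin edges, and the toy records are hypothetical.

```python
from bisect import bisect_left
from statistics import median


def median_abs_deviation(values):
    """Median of the absolute deviations from the median."""
    centre = median(values)
    return median([abs(v - centre) for v in values])


def summarize_by_distance(records, bin_edges):
    """Summarize expression changes in distance bins around breakpoints.

    records   : iterable of (distance_bp, log2_ratio), where distance_bp is
                the signed distance of a two-copy gene from the nearest
                breakpoint and log2_ratio is its expression change
    bin_edges : ascending upper bounds in bp; bin i covers distances in
                (bin_edges[i-1], bin_edges[i]]
    Returns a list of (upper_edge, n_genes, median, MAD) for non-empty bins.
    """
    bins = [[] for _ in bin_edges]
    for distance, log2_ratio in records:
        index = bisect_left(bin_edges, abs(distance))
        if index < len(bins):  # genes beyond the last edge are ignored
            bins[index].append(log2_ratio)
    return [(edge, len(vals), median(vals), median_abs_deviation(vals))
            for edge, vals in zip(bin_edges, bins) if vals]


# Toy records (hypothetical): only the breakpoint-proximal genes change.
records = [(50, 1.0), (-80, 3.0), (500, 0.1), (900, -0.1),
           (5000, 0.0), (20000, 2.0)]
summary = summarize_by_distance(records, [100, 1000, 10000])
```

Pooling both sides of the breakpoint through `abs(distance)` is a simplification; keeping the sign would allow upstream and downstream flanks to be summarized separately.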

Propagation through gene networks
The general absence of breakpoint proximal effects of Dfs on transcription of two-copy genes does not mean that there is no effect of deletions on the rest of the genome. We observed tens of [...]

We observed similar overall patterns of network propagation and dissipation in both the isogenic and hybrid backgrounds in both sexes. However, the precise genes that changed in response to a given Df differed by sex and by genetic background. For example, Df(2L)ED136/+ males showed many more expression changes than females in both backgrounds (Figure 8A,B). In males, other than the one-copy genes, only four genes showed a significant change in both backgrounds. Of the genes showing differential expression in Df(2L)ED136/+ males, only CG18600 was also differentially expressed in females. This gene is expressed preferentially in gonads and male accessory gland in wild type flies [40]. Globally, it was strikingly common for different genes to respond in the two sexes and backgrounds (Figure 8C,D). Significant changes in the expression of the one-copy genes showed 27% overlap in the genetic backgrounds, indicating that dose responses were similar among alleles from different backgrounds (also see Figures 2,3).
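The idea of propagation to primary network neighbors and dissipation at greater separation can be made concrete with a breadth-first search over a gene network. The sketch below is illustrative only: the adjacency-list format, the function names, and the toy network are assumptions, and the network source and the statistical testing used in the study are not reproduced here.

```python
from collections import deque


def network_distance(neighbors, sources):
    """Breadth-first search: fewest edges from any source gene to each gene."""
    distance = {gene: 0 for gene in sources}
    queue = deque(sources)
    while queue:
        gene = queue.popleft()
        for neighbor in neighbors.get(gene, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[gene] + 1
                queue.append(neighbor)
    return distance


def fraction_changed_by_distance(neighbors, one_copy_genes, changed_genes):
    """Fraction of two-copy genes with changed expression at each separation.

    Separation 1 = primary neighbors of one-copy genes, 2 = secondary, etc.
    """
    totals, hits = {}, {}
    for gene, d in network_distance(neighbors, one_copy_genes).items():
        if d == 0:  # the one-copy genes themselves are not counted
            continue
        totals[d] = totals.get(d, 0) + 1
        if gene in changed_genes:
            hits[d] = hits.get(d, 0) + 1
    return {d: hits.get(d, 0) / totals[d] for d in sorted(totals)}


# Toy undirected network (hypothetical): gene "a" is uncovered by the Df.
neighbors = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
fractions = fraction_changed_by_distance(neighbors, {"a"}, {"b", "c"})
```

A real analysis would compare these fractions against the genome-wide background rate of expression change, since a raw fraction alone does not show enrichment.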
There was a much larger group of two-copy genes that showed significant changes in gene expression. In our analysis of chromosome 2L we observed that 10,418 genes display significant changes at least once in any of the test genotypes (76% of genes). In striking contrast to the similarity in dose responses among one-copy genes, the genes responding to the perturbation were usually different. We observed < 1% overlap between backgrounds (p = 0.39 to 0.61) and even fewer genes showed changed expression in both sexes and in both backgrounds. This is perhaps unsurprising given that there are ~600K heterozygous SNPs and Indels in the hybrid background relative to the isogenic background (Supplemental file 1) and given the pervasive sex-bias in Drosophila gene expression. Thus, while there are coherent pathway responses to hemizygous driver perturbations, the exact path through network space was highly dependent on sex and genetic background.

[...] ranging from no compensation to nearly 2-fold up-regulation of hemizygous gene expression [9-11,13-15,41]. Some of the differences in compensation values are probably due to biology, such as the varied responses of aneuploid tissue culture cells [15]. However, data compression in microarray-based studies also contributes to overestimating dosage compensation, especially at low expression levels where array responses are nonlinear [42-44]. For example, a microarray study where stringent expression cutoffs were applied to measure compensation levels [10] resulted in the same 1.1-fold compensation of hemizygous genes that we report here. Reanalysis with the same stringent method (not shown) results in 1[...]

There has been debate about whether there is a regional response to reduced gene dose in Drosophila [8,10,12,13]. In our analysis of chromosome 2L, we found that many genes showed poor or partial compensation, while others showed excellent dosage compensation, which is consistent with feedback and buffering models. Given the propagation of gene expression changes from one-copy segments to two-copy genes through regulatory network connections, it is clear that gene dosage perturbs gene networks. Thus, we suggest that traditional gene regulation involving feedback can explain the vast majority of the dosage compensation response on chromosome 2L. However, we also identified blocks of well compensated or over compensated genes in females and could correlate these with repressive chromatin domains. Such chromatin-based responses to autosomal aneuploidy are analogous to the chromosome-wide MSL and POF systems that alter chromosome-wide expression from the X and the ancestral X (the current chromosome 4) in Drosophila [45].
Interestingly, the chromatin domains resulting in superior compensation were repressive, with diagnostic LamB and/or PcG enrichment. The PcG proteins can mediate pairing-dependent silencing [46], which has the counter-intuitive effect of increasing expression of one-copy genes. We observed the same effect in some clusters of one-copy genes in this study. In C. elegans, the two X chromosomes in XX hermaphrodites are down-regulated to counteract the increased X chromosome expression that X0 males use to equilibrate X and autosomal gene expression [47]. XX down-regulation is achieved by strengthening the attachment of both X chromosomes to the repressive regions while this is relaxed in males with one copy of the X to increase expression [48-50]. The increased repression of two-copy genes relative to one-copy genes due to deletions in LaminB domains is similar, suggesting a plausible model for the evolution of X chromosome dosage compensation. Genetic material, including pairing-dependent repressive domains, is progressively lost from neo-Y chromosomes as they diverge from the X homolog on evolutionary timescales. The loss of pairing-dependent repressive domains could lead to regional dosage compensation prior to evolution of a specific X chromosome-wide mechanism.
Curiously, we observed this regional LaminB and PcG dosage compensation response only in females. On the one hand, this may reflect ascertainment bias since we used domains defined by work in tissue culture cells and embryos [24-26] and it is possible that the arrangement of domains in the adult fly could be significantly different and sex-biased. On the other hand, sex-specific differences in the nature of heterochromatin have also been noted [51] and it is possible that there is a general weakening of repressive domains in males, reducing the possibility for regional autosomal dosage compensation due to further de-repression. The fact that a group of well-compensated genes was only found in females, regardless of where they were located, favors a female-biased derepression model. Clearly, additional experiments will be required to investigate this curious finding.

Breakpoints
Df breakpoints bring two regions of the genome together that are usually distant in the linear chromosome. This can result in breakpoint proximal changes due to transcription unit fusion, as occurs in many cancers and has been especially well studied in immune cell tumors [52]. However, we observed only one such case in our analysis: Df(2L)ED680 results in a fusion transcript of taiman and mini-white (the marker for deletion). It is also clear that some genes have enhancers and silencers located many kb from the promoter [53-55], and that the genome is organized into chromatin domains flanked by insulator regions, which could facilitate regional transcriptional control [56]. Deficiencies delete insulator sites, resulting in the creation of novel arrangements of insulator pairs. If this creates a novel gene expression regulatory milieu, then transcription should be altered. Our analysis indicates that across the ~20 Mb of the genome we surveyed, the vast majority of the regulatory information is within the gene body or ~100 bp upstream. This agrees with work where inversions generated within Drosophila neighborhoods of co-expressed genes failed to disrupt co-regulated gene expression [20]. It is possible that there are highly deleterious cases where generation of a Df is dominant lethal, but even this is likely to be rare. In a study that generated a large number of FRT (Flippase Recognition Target) deletions, 6% of the pairs failed to produce a deletion [57]. The majority of regions can be joined without dominant lethality. We suggest that effects of DNA topological domains and long-range enhancer promoter interactions are rare in Drosophila adults.

Network interactions
We observed substantial changes in gene expression throughout the genome, not just in the one-copy regions, and these are likely due to "error" propagation as is expected in a dynamic biological system. The primary two-copy network neighbors of one-copy genes change expression in response to reduced dosage of genes uncovered by deletions. In many cases Dfs [...]

[...] raising the possibility that they are not deletions, although they were homozygous lethal. Additionally, Df(2L)ED1050 and Df(2L)ED123 complemented mutations that they should uncover, suggesting that they are not correctly identified Dfs. Inclusion/exclusion of these three Dfs did not alter overall compensation values at the rounding levels reported here. We excluded them from one-copy analysis, but they serve as additional controls.

We mixed 400 ng of total RNA in 50 µl of nuclease-free water with 50 µl of 2:5 dilution of [...]

RNA-Seq data analysis
RNA-Seq results were analyzed as in Cho et al. (companion paper) with a minor difference in handling the ERCC spike-ins. Briefly, the short reads generated from the analysis were mapped onto the Drosophila reference genome (Release 5, with no "chrU" and "chrUextra" scaffolds) using TopHat 2.0.11 [65]. We used Cufflinks 2.2.1 [66] to measure gene expression levels in Fragments per Kilobase per Million mapped reads (FPKM). We also measured FPKM values from intergenic regions as in [14,67], which we used to determine expression cutoff levels as 0.6829118 for the isogenic background results, and 0.8140542 for the hybrid genetic background (see below). We used HTseq 0.6.1p1 [67] [68] and "voom" in the R limma package [68] to obtain raw read counts at the gene level, and to call differential gene expression as in Cho et al. (companion paper). Benjamini-Hochberg multiple hypothesis correction of the P values was used throughout the manuscript. FPKM values for the spike-ins were determined separately, because abundant transcripts influence the estimation of gene expression. In the calculation, the number of reads from both genes and spike-ins, not solely from spike-ins, was used as the denominator of FPKM to infer the lowest detectable expression levels of genes. In describing gene expression changes, we report expression values for genes that produce polyA+ mRNA in the gene model, since we followed a poly-A purification protocol. For example, we detected expression of histone transcripts, but the values were highly variable between libraries due to their lack of poly-A tails, and essentially followed the values of residual rRNAs in the sequencing libraries [69]. Expression results presented were robust to normalization methods used in different bioinformatics tools (FPKM, DESeq, and TMM).
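The point about the FPKM denominator can be illustrated with the basic FPKM definition, fragments × 10^9 / (feature length in bp × total mapped fragments). The sketch below is not Cufflinks' model, which additionally estimates effective lengths and resolves isoforms; the function names, feature names, and counts are hypothetical.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Basic FPKM: fragments per kilobase of feature per million mapped."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def fpkm_with_shared_denominator(gene_counts, spike_counts, lengths):
    """FPKM for genes and spike-ins using one library-wide denominator.

    The denominator is the sum of fragments assigned to genes AND spike-ins,
    so spike-in values sit on the same scale as the gene values.
    """
    total = sum(gene_counts.values()) + sum(spike_counts.values())
    combined = {**gene_counts, **spike_counts}
    return {name: fpkm(count, lengths[name], total)
            for name, count in combined.items()}


# Toy library (hypothetical names and counts): one million fragments in total.
gene_counts = {"geneA": 900_000, "geneB": 90_000}
spike_counts = {"spike1": 10_000}
lengths = {"geneA": 2000, "geneB": 1000, "spike1": 500}
values = fpkm_with_shared_denominator(gene_counts, spike_counts, lengths)
```

With a spike-in-only denominator (10,000 fragments in the toy library), the spike-in value would be 100-fold larger and could not be read against the gene-level expression cutoffs.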
We used an Expectation-Maximization method to identify the fully compensated group of genes in Figure 4 using the "mclust" package in R (doi:10.1007/s00357-[...])

We observed outstanding biological replicate profiles (Pearson's r > 0.9) in all 396 isogenic samples (Figure 9A). However, we observed 10% outliers in the single fly profiles, and therefore we removed all samples where Pearson's r < 0.9, and we used the two duplicates with the highest correlation. The sexual identity of a sample is self-reported in the expression profile (Figure 9B,C). Detection of low-level gene expression is complicated by the contributions of noise, which vary between libraries. To mitigate this problem we measured reads from intergenic regions (trimmed to account for variation in transcription start and stop sites) and determined the 95th percentile as a low expression cutoff (Figure 9D). While some of this intergenic expression may be due to strain-specific transcripts or un-annotated genes, much is likely to be due to noise such as ectopic Pol-II initiation, inclusion of contaminating genomic DNA, sequencing, and/or mapping errors. Ratios and data compression measurements are critical for dosage compensation analysis. We used pools of ERCC controls in each sample library to produce 1.5:1, 1:1, and 1:1.5 ratios across a > 2^15 input concentration range (Figure 9E). Ratio measurements show a clear increase in scatter when input was low. However, even at very low input, there was only modest compression, and no evidence of compression in the useful range for this work.

[...] Read Archive (SRR630490). We mapped the raw reads to the reference genome using Bowtie 2 [70] with default parameters. We used Samtools to call SNPs from the mapping result [71]. The calls were filtered to have a quality score of 20 or more (-Q 20). Any calls that had more than twice the average read depth were discarded. Based on the mapping, we incorporated substituted bases, or SNPs, as previously described [72] to produce a "w1118 SNP-substituted genome". The differences between the w1118 SNP-incorporated genome and OregonR DNA-Seq results were identified using Samtools as described above.
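The two SNP filters described above (a minimum quality score of 20 and a maximum of twice the average read depth) amount to a simple per-call predicate. The sketch below operates on already-parsed calls and does not invoke Samtools; the record layout, the function name, and the toy values are hypothetical, and the average depth is assumed to be supplied by the caller.

```python
def filter_snp_calls(calls, mean_depth, min_quality=20.0, max_depth_factor=2.0):
    """Keep calls with quality >= min_quality and depth <= factor * mean depth.

    calls      : list of dicts with keys 'pos', 'qual' and 'depth'
    mean_depth : average read depth (assumed to be computed elsewhere)
    """
    depth_limit = max_depth_factor * mean_depth
    return [call for call in calls
            if call["qual"] >= min_quality and call["depth"] <= depth_limit]


# Toy calls (hypothetical): the average depth is taken to be 30 reads.
calls = [
    {"pos": 101, "qual": 50.0, "depth": 28},  # kept
    {"pos": 202, "qual": 10.0, "depth": 30},  # dropped: low quality
    {"pos": 303, "qual": 60.0, "depth": 61},  # dropped: more than 2x depth
    {"pos": 404, "qual": 20.0, "depth": 60},  # kept: both limits inclusive
]
kept = filter_snp_calls(calls, mean_depth=30.0)
```

Excess depth is a common proxy for collapsed repeats or duplicated sequence, which is presumably why such calls are discarded.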

Data Access
The gene expression profiles generated in this study are available in GEO with accession numbers of GSE61509 (isogenic genetic background) and GSE73920 (hybrid genetic background).