
Figure 1.

Overview of the data collection and sorting.

A) Exons of coding sequences were extracted from the annotated genome of D. melanogaster using Extractor and joined electronically using Analyst to obtain complete coding sequences. These coding sequences were then automatically aligned against the genomes of D. melanogaster, D. simulans and D. sechellia with Megablast. Analyst scanned the resulting alignments for the best hits and assembled from them the coding sequences of the three species. B) Analyst also calculated the coverage, the percentage not covered, and the divergence in sim-sec, sim-mel and sec-mel, as well as in the control mel-mel, and organized these data in a table. C) To minimize artifacts due to incomplete clone representation in the genomic libraries, the coding sequences were filtered and only genes with the same coverage in D. simulans and D. sechellia were retained. To avoid genes truncated by Megablast (usually genes with small exons), only genes with a mismatch of up to 1% in the control mel-mel were retained. After these two filters were applied, a new table like the one exemplified in C) was generated for each chromosome.
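The two filters described in C) can be sketched as follows. This is a minimal illustration, not the code of Analyst: the `GeneRow` fields, the function names and the `coverage_tol` parameter are assumptions introduced here, and "same coverage" is read as exactly equal coverage percentages unless a tolerance is supplied.

```python
from dataclasses import dataclass


@dataclass
class GeneRow:
    """One row of the per-gene table (field names are hypothetical)."""
    gene: str
    coverage_sim: float      # % of the coding sequence covered in D. simulans
    coverage_sec: float      # % of the coding sequence covered in D. sechellia
    mismatch_mel_mel: float  # % mismatch in the mel-mel control


def passes_filters(row, max_control_mismatch=1.0, coverage_tol=0.0):
    """Apply the two filters from the caption to a single gene.

    coverage_tol is an assumption: 0.0 means the two coverages must be equal.
    """
    same_coverage = abs(row.coverage_sim - row.coverage_sec) <= coverage_tol
    clean_control = row.mismatch_mel_mel <= max_control_mismatch
    return same_coverage and clean_control


def filter_genes(rows, max_control_mismatch=1.0, coverage_tol=0.0):
    """Return only the rows that pass both filters, in their original order."""
    return [
        r for r in rows
        if passes_filters(r, max_control_mismatch, coverage_tol)
    ]
```

A gene covered 100% in D. simulans but 95% in D. sechellia is dropped by the first filter, and a gene with 2.5% mismatch in the mel-mel control is dropped by the second.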


Figure 2.

Patterns of divergence along chromosomes and two screening methods.

A, B, C, D and E) Each graph corresponds to a chromosome or chromosome arm (X, 2L, 2R, 3L and 3R), where the genes are ordered from the least divergent to the most divergent in sim-sec (abscissa) and sim-mel (ordinate). The horizontal and vertical dashed lines delimit the averages plus one standard deviation of the divergence between sim-sec (horizontal) and sim-mel (vertical). The upper left quadrants delimit genes found by the method of averages and standard deviations. Note that the divergences of most genes in all 5 graphs are clustered in a quadrant that can be roughly delimited between the abscissa values of 0%–5% and ordinate values of 0%–10% (red rectangles). In this quadrant, the genes fit a linear distribution well (P<0.0001). To better delimit the quadrant in which the divergence is linear in each chromosome, the data were divided into percentiles of divergence in sim-sec, sim-mel and sec-mel. (A') shows the percentiles for the X chromosome. Since each point in these curves represents one percentile, the percentage of genes that diverge linearly is equal to the number of points that can be transected by a straight line. Once this linear interval is defined, the values on the x and y axes become known and can be used to redefine the quadrant of linear divergences (lower left quadrant, in blue). The region where the genes in sim-sec vary the least and the genes in sim-mel vary the most is the adjacent upper quadrant to the left of the point where the horizontal and vertical lines cross (gray quadrant).
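The method of averages and standard deviations can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the function name and input layout are hypothetical, the sample standard deviation is used because the caption does not say which estimator was applied, and the upper left quadrant is read as sim-sec divergence below its cutoff together with sim-mel divergence above its cutoff.

```python
from statistics import mean, stdev


def upper_left_quadrant(genes):
    """Select genes with low sim-sec and high sim-mel divergence.

    genes maps a gene name to (sim_sec, sim_mel) divergences in percent.
    Each cutoff is the average plus one standard deviation on its axis.
    The sample standard deviation is an assumption of this sketch.
    """
    sim_sec = [x for x, _ in genes.values()]
    sim_mel = [y for _, y in genes.values()]
    x_cut = mean(sim_sec) + stdev(sim_sec)
    y_cut = mean(sim_mel) + stdev(sim_mel)
    return sorted(
        name for name, (x, y) in genes.items()
        if x < x_cut and y > y_cut
    )
```

The cutoffs are computed separately for each chromosome or chromosome arm, so the function would be called once per graph.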


Table 1.

Number of genes identified by the screening methods 1 and 2.


Figure 3.

Distribution of ancestral alleles in the three major chromosomes.

The 20 division coordinates used are those of D. melanogaster. Almost all divisions have one or more ancestral alleles. The dashed lines indicate the average number of alleles plus one standard deviation. Note that some divisions have a higher density of ancestral alleles than others.
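The dashed-line threshold can be sketched as follows. The function name and inputs are hypothetical, divisions without any ancestral allele are counted as zero, and the sample standard deviation is assumed because the caption does not specify the estimator.

```python
from collections import Counter
from statistics import mean, stdev


def dense_divisions(allele_divisions, all_divisions):
    """Return divisions whose allele count exceeds the average plus one SD.

    allele_divisions: the division of each ancestral allele (one per allele).
    all_divisions: every division of the chromosome, including empty ones.
    """
    counts = Counter(allele_divisions)
    per_division = [counts.get(d, 0) for d in all_divisions]
    cutoff = mean(per_division) + stdev(per_division)
    return [d for d, n in zip(all_divisions, per_division) if n > cutoff]
```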


Figure 4.

The biological functions of 48 ancestral alleles defined by Gene Ontology.

The graph shows only genes with biological functions assigned by assays or inferred from sequence similarity or phenotype. It does not include general biochemical properties such as phosphorylation, transcription initiation, signal transduction and proteolysis, among others. The complete list of genes can be found in the supporting information.
