Retroviral Integration Process in the Human Genome: Is It Really Non-Random? A New Statistical Approach

Figure 4

Integration distance (ID) from the nearest gene transcription start site (TSS).

In this picture, six hypothetical genes of different lengths and orientations (blue arrows) are scattered along a chromosome (x-axis). The purple piecewise linear function represents the distance from the TSS of the nearest gene. This function has discontinuities exactly at the midpoints of the intervals between consecutive genes. Even if integrations occur at random in this setting, the resulting distribution of distances from TSSs (projected on the y-axis, gray plot) is a mixture of Uniform distributions, which produces the observed bell-shaped curve. Note that the ID distribution is asymmetric around zero, because gene orientations and gene lengths determine which TSS is used to compute the distances (a symmetric distribution would be observed if the distance from the nearest TSS, rather than from the TSS of the nearest gene, were plotted; data not shown).
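The construction described in this caption can be illustrated with a minimal simulation sketch. Everything in it is an assumption made for illustration and is not taken from the article: the gene coordinates in `GENES`, the chromosome length, the function names, the rule that the nearest gene is the one whose body is closest to the integration site, and the sign convention (distance measured along the direction of transcription, so that positions downstream of the TSS are positive).

```python
import random

# Hypothetical genes as (start, end, strand); coordinates are illustrative only.
GENES = [
    (100, 300, +1),
    (500, 650, -1),
    (900, 1400, +1),
    (1600, 1700, -1),
    (2200, 2500, -1),
    (2800, 2900, +1),
]
CHROM_LENGTH = 3000  # hypothetical chromosome length


def tss(gene):
    """Return the TSS: the start for a + strand gene, the end for a - strand gene."""
    start, end, strand = gene
    return start if strand > 0 else end


def gap_to_gene(pos, gene):
    """Distance from pos to the gene body (0 if pos lies inside the gene)."""
    start, end, _ = gene
    if pos < start:
        return start - pos
    if pos > end:
        return pos - end
    return 0


def nearest_gene(pos, genes=GENES):
    """Assumed rule: the nearest gene is the one whose body is closest to pos."""
    return min(genes, key=lambda g: gap_to_gene(pos, g))


def integration_distance(pos, genes=GENES):
    """Signed distance from pos to the TSS of the nearest gene.

    Assumed sign convention: positive downstream of the TSS (in the
    direction of transcription), negative upstream.
    """
    gene = nearest_gene(pos, genes)
    return (pos - tss(gene)) * gene[2]


def simulate(n, seed=0):
    """Draw n uniformly random integration sites and return their distances."""
    rng = random.Random(seed)
    return [integration_distance(rng.uniform(0, CHROM_LENGTH)) for _ in range(n)]


if __name__ == "__main__":
    distances = simulate(10_000)
    upstream = sum(d < 0 for d in distances)
    print(f"upstream: {upstream}, downstream or at TSS: {len(distances) - upstream}")
```

Within the territory of a single gene, uniformly distributed sites give uniformly distributed distances, so the pooled sample is a mixture of Uniform distributions, one component per gene. Under these assumptions the function also jumps at the midpoint between two consecutive genes: for example, positions just below and just above 400 are assigned to different genes and therefore to different TSSs.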

doi: https://doi.org/10.1371/journal.pcbi.1000144.g004