Figure 1.
Nested Inversions Are Always Amalgamated by ST-Synteny
The dot-plot of a signed permutation of anchors (green) between two genomes is shown. Since the anchors are signed, they are represented as ±45-degree segments. Blocks were constructed by ST-Synteny (red) and GRIMM-Synteny (blue). ST-Synteny amalgamates everything into one block. GRIMM-Synteny produces the correct blocks. If the genome on the horizontal axis is taken to be the identity permutation 1 2 3 4 5, then the genome on the vertical axis is the signed permutation −3 2 −1 −5 4.
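As a small illustration of why this permutation is described as nested inversions, the sketch below builds −3 2 −1 −5 4 from the identity by applying signed reversals. The helper name `reverse_segment` and the particular order of reversals are assumptions made for illustration; this is one sequence that produces the permutation, not necessarily the history intended in the figure.

```python
def reverse_segment(perm, i, j):
    """Return a copy of perm with positions i..j (0-based, inclusive)
    reversed in order and flipped in sign (a signed reversal)."""
    flipped = [-x for x in reversed(perm[i:j + 1])]
    return perm[:i] + flipped + perm[j + 1:]

identity = [1, 2, 3, 4, 5]

# Outer inversion of anchors 1..3, then an inner inversion of anchor 2 alone.
p = reverse_segment(identity, 0, 2)  # [-3, -2, -1, 4, 5]
p = reverse_segment(p, 1, 1)         # [-3, 2, -1, 4, 5]

# Outer inversion of anchors 4..5, then an inner inversion of anchor 4 alone.
p = reverse_segment(p, 3, 4)         # [-3, 2, -1, -5, -4]
p = reverse_segment(p, 4, 4)         # [-3, 2, -1, -5, 4]

print(p)
```

Each inner reversal lies entirely inside the segment flipped by the preceding outer reversal, which is the nesting the figure title refers to.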
Figure 2.
Asymmetric Treatment of Genomes by ST-Synteny
In comparing two genomes, ST-Synteny may produce different synteny blocks depending on which one is chosen as the reference genome. The synteny blocks produced by ST-Synteny are shown as red boxes around the anchors.
(A) The genome shown on the y-axis is the reference genome 1, …, 10, and the genome shown on the x-axis is represented as a permutation π of the reference.
(B) The same anchor arrangement is shown, but the x-axis is taken as the reference genome 1, …, 10 and the y-axis is the inverse permutation π⁻¹. Although the anchor arrangements are identical, ST-Synteny with parameters w = 2, Δ = 1 produces different blocks depending on which genome is the reference genome.
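The relationship between the two panels can be sketched in a few lines: swapping which genome is the reference replaces a permutation by its inverse, and the dot-plot points are simply transposed. The permutation `pi` below is a hypothetical five-element example, not the one drawn in the figure, and signs are ignored for simplicity.

```python
def inverse(perm):
    """Inverse of an unsigned permutation of 1..n given as a list."""
    inv = [0] * len(perm)
    for position, value in enumerate(perm, start=1):
        inv[value - 1] = position
    return inv

pi = [3, 1, 4, 2, 5]          # hypothetical example permutation
pi_inv = inverse(pi)          # [2, 4, 1, 3, 5]

# Dot-plot points with one genome as reference ...
points = {(i, v) for i, v in enumerate(pi, start=1)}
# ... and with the other genome as reference.
points_inv = {(i, v) for i, v in enumerate(pi_inv, start=1)}

# Swapping the axes of one plot gives exactly the other plot.
transposed = {(y, x) for x, y in points}
print(pi_inv, transposed == points_inv)
```

A block-finding procedure that treats the genomes symmetrically would return the same blocks for `pi` and `pi_inv` up to this transposition; the figure shows a case where ST-Synteny does not.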
Figure 3.
Breakpoint Reuse Rates in Simulations
The simulated number of microrearrangements is k, and the microrearrangement size is w. The same simulated rearrangements were analyzed three ways.
(A) ST-Synteny simulation, with signs of blocks determined using ST-Synteny's majority sign rule.
(B) ST-Synteny simulation, with signs of blocks determined using GRIMM-Synteny's separable permutation rule.
(C) GRIMM-Synteny simulation. Anchors have length 1 for comparison with ST-Synteny.
Figure 4.
GRIMM-Synteny and ST-Synteny on the Same Simulated Data
The genomic dot-plot is shown in thick green. The synteny blocks identified by GRIMM-Synteny are shown as blue rectangles, and those identified by ST-Synteny as dashed red rectangles. When blocks from the two methods have the same coordinates, they appear as dashed blue/red rectangles. Signs of the blocks are shown as diagonals. Tiny blocks have been artificially enlarged for visibility and do not actually protrude into other blocks. The simulated human genome has anchors 1 through 5,000. The simulated mouse genome was generated as π = Simulation(5000, 15, 500, 5). Blocks were identified via GRIMM-Synteny(π, 8, 3) and ST-Synteny(π, 5, 3).
Figure 5.
Synteny Blocks between Human and Mouse X Chromosomes
Blocks for the X chromosomes were constructed by GRIMM-Synteny (blue) based on anchor coordinates and ST-Synteny (red) based only on anchor permutations. Anchors are shown in green. Small blocks deleted by ST-Synteny are shown in black.
Table 1.
GRIMM-Synteny versus ST-Synteny Applied to Human and Mouse X Chromosomes
Table 2.
Randomized Unichromosomal Rearrangement Simulation with Human X Chromosome Anchors
Table 3.
Randomized Multichromosomal Rearrangement Simulation with Human Anchors
Table 4.
Randomized Unichromosomal Rearrangement Simulation with Mouse X Chromosome Anchors from Mouse/Rat Alignment
Table 5.
Breakpoint Reuse Rate versus Block Size and Gap Threshold Used in GRIMM-Synteny on Simulated Genome-Based Random Breakage Model
Table 6.
Sequence-Based versus Gene-Based Rearrangements
Figure 6.
Distribution of Human Intergenic Regions within Synteny Blocks or within Breakpoint Regions
(A) Regions of length ≤1 Mb and (B) regions of length >1 Mb that lie within synteny blocks (blue) or that lie within breakpoint regions or span both breakpoint regions and synteny blocks (red). Data derived from NCBI Human version 34 and Mouse version 30.
Figure 7.
Breakpoint Reuse Rates as a Function of Upstream Regulatory Region Size in Intergenic Breakage Model Simulations
Genes from NCBI Human version 34 were each extended by length 0 to 210 kb upstream, thus shortening or eliminating the intergenic regions. In the intergenic breakage model, simulated reversals were performed with breakpoints chosen uniformly among the nucleotides remaining in the shortened intergenic regions, while in the random breakage model, breakpoints were chosen uniformly among all nucleotides in the genome. Then blocks were derived, and the breakpoint reuse rate was computed.
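The two breakpoint-sampling schemes described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation code used for the figure: the function names, the toy gene coordinates, and the genome length are hypothetical, intervals are half-open, and every gene is assumed to lie on the forward strand so that "upstream" means lower coordinates.

```python
import bisect
import random

def shortened_intergenic(genes, genome_length, upstream):
    """Intervals [start, end) left uncovered after extending each gene
    upstream by `upstream` nucleotides (forward strand assumed)."""
    extended = sorted((max(0, s - upstream), e) for s, e in genes)
    free, cursor = [], 0
    for s, e in extended:
        if s > cursor:
            free.append((cursor, s))
        cursor = max(cursor, e)
    if cursor < genome_length:
        free.append((cursor, genome_length))
    return free

def sample_breakpoint(intervals, rng):
    """Pick one nucleotide uniformly among all nucleotides in `intervals`.
    Raises ValueError if the intervals contain no nucleotides."""
    lengths = [e - s for s, e in intervals]
    cumulative, total = [], 0
    for n in lengths:
        total += n
        cumulative.append(total)
    k = rng.randrange(total)                   # index among remaining nucleotides
    idx = bisect.bisect_right(cumulative, k)   # interval containing that index
    offset = k - (cumulative[idx] - lengths[idx])
    return intervals[idx][0] + offset

# Toy data (hypothetical): three genes on a genome of 1,000 nucleotides.
genes = [(100, 200), (400, 500), (520, 600)]
genome_length = 1000
rng = random.Random(0)

free = shortened_intergenic(genes, genome_length, upstream=50)
intergenic_bp = [sample_breakpoint(free, rng) for _ in range(1000)]
random_bp = [sample_breakpoint([(0, genome_length)], rng) for _ in range(1000)]
print(free)
```

With a large enough `upstream` value the intergenic regions vanish entirely, in which case `sample_breakpoint` raises an error rather than returning a position; a full simulation would need to handle that case explicitly.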