Fig 1.
A) Three- and two-generation family configurations useful for inferring recombination events (type I in blue on the left and type II in yellow on the right). B) An example pedigree from which one type I family (in blue) and two type II families (in yellow with stripe patterns) can be extracted.
Fig 2.
The steps of Recombulator-X are shown in the left column of the figure, while a simplified example using just three X-STRs is shown on the right.
1) Preprocessing: Recombulator-X reads the PED file, performs preliminary quality checks, extracts the informative type I and type II families, and phases every female whose father is available. 2) Likelihood computation, which depends on the family type: for a phased mother (type I family), the likelihood (L1) of each possible recombination pattern is computed and the values are summed. Here, the red crosses indicate genetic incompatibilities (mutations of more than one repeat), while the single red lines correspond to compatible single-step mutations. When the grandfather is not available, and the mother therefore cannot be phased (type II family), this process is repeated for each possible maternal phase (L2). 3) Optimization: the likelihood of the entire dataset is computed by multiplying together the likelihoods of all families, and Recombulator-X searches for the parameter values (recombination and mutation rates) that maximize this global likelihood.
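The three steps in this caption can be illustrated with a minimal brute-force sketch. It is not the Recombulator-X code: all function names are hypothetical, only sons are modelled, a single recombination rate `r` and a single mutation rate `mu` are shared by all markers, single-step mutations up or down are taken to be equally likely, and maternal phases get a uniform prior. A coarse grid search stands in for a numerical optimizer.

```python
from itertools import product
from math import prod


def emission(transmitted, observed, mu):
    """P(observed allele in the son | transmitted maternal allele)."""
    step = abs(observed - transmitted)
    if step == 0:
        return 1.0 - mu
    if step == 1:
        return mu / 2.0  # one repeat up or down, equally likely (assumption)
    return 0.0  # more than one repeat: treated as incompatible


def likelihood_phased(hap_a, hap_b, child, r, mu):
    """Type I (L1): sum over every inheritance vector for one son."""
    haps = (hap_a, hap_b)
    total = 0.0
    for v in product((0, 1), repeat=len(child)):
        p = 0.5  # either maternal haplotype may start the transmitted one
        for i, observed in enumerate(child):
            if i > 0:
                p *= r if v[i] != v[i - 1] else 1.0 - r
            p *= emission(haps[v[i]][i], observed, mu)
        total += p
    return total


def likelihood_unphased(genotype, children, r, mu):
    """Type II (L2): average over maternal phases (uniform prior, assumption)."""
    phases = list(product((0, 1), repeat=len(genotype) - 1))
    total = 0.0
    for flips in phases:
        flips = (0,) + flips  # fix marker 1 so mirror-image phases are not counted twice
        hap_a = tuple(g[f] for g, f in zip(genotype, flips))
        hap_b = tuple(g[1 - f] for g, f in zip(genotype, flips))
        total += prod(likelihood_phased(hap_a, hap_b, c, r, mu) for c in children)
    return total / len(phases)


def dataset_likelihood(type1, type2, r, mu):
    """Global likelihood: product of the per-family likelihoods."""
    like = 1.0
    for hap_a, hap_b, children in type1:
        like *= prod(likelihood_phased(hap_a, hap_b, c, r, mu) for c in children)
    for genotype, children in type2:
        like *= likelihood_unphased(genotype, children, r, mu)
    return like


def grid_search(type1, type2, r_grid, mu_grid):
    """Return the (r, mu) pair on the grid with the highest global likelihood."""
    return max(product(r_grid, mu_grid),
               key=lambda p: dataset_likelihood(type1, type2, *p))


# Toy data with three markers; alleles are repeat counts.
type1_families = [((10, 12, 15), (11, 14, 17), [(10, 12, 17), (11, 14, 17)])]
type2_families = [(((10, 11), (12, 14), (15, 17)), [(10, 12, 15)])]
best = grid_search(type1_families, type2_families,
                   r_grid=[0.05, 0.15, 0.25, 0.35, 0.45],
                   mu_grid=[0.0, 0.01])
```

The enumeration visits 2^M inheritance vectors for M markers (and, for type II families, 2^(M-1) phases on top of that), so this form is only workable for a handful of markers.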
Fig 3.
Mean time needed to compute the likelihood for one family typed at up to 10,000 markers.
Each implementation is shown in a different colour, while the line style indicates the family type. The y-axis is on a log scale. For each implementation, the number of markers was progressively increased until the computation time exceeded one second per family.
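The stopping rule described in this caption can be sketched as a small timing loop. The function name `benchmark` and its parameters are hypothetical, and `compute` is a placeholder for whichever likelihood implementation is being timed; this is not the benchmarking script behind the figure.

```python
import time


def benchmark(compute, sizes, budget_s=1.0, repeats=3, clock=time.perf_counter):
    """Time compute(n) for increasing n; stop once the mean exceeds budget_s."""
    results = []
    for n in sizes:
        start = clock()
        for _ in range(repeats):
            compute(n)
        mean = (clock() - start) / repeats
        results.append((n, mean))  # keep the point that crossed the budget
        if mean > budget_s:
            break
    return results
```

Passing the clock as a parameter keeps the loop testable with a simulated timer.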
Fig 4.
Recombination and mutation rate estimation times as a function of the number of markers, using the fastest (dynamic-numba) likelihood implementation.
Two simulated datasets were tested: one of 100 type I families (A) and one of 100 type I and 100 type II families (B). Times are in seconds, with a logarithmic axis for the right panel.