Conceived and designed the experiments: AW DF RA GS IS LZ YR YD ND YR DB ET ES ES. Performed the experiments: AW DF RA GS IS LZ YR AH YD ET. Analyzed the data: AW DF RA SI TS ES ES. Contributed reagents/materials/analysis tools: SK GS IS LZ YR AH YD ND YR DB ET. Wrote the paper: AW DF RA SI GS YR YD ND DB ET ES ES.

The authors have declared that no competing interests exist.

The depth of a cell of a multicellular organism is the number of cell divisions it underwent since the zygote, and knowing this basic cell property would help address fundamental problems in several areas of biology. At present, the depths of the vast majority of human and mouse cell types are unknown. Here, we show a method for estimating the depth of a cell by analyzing somatic mutations in its microsatellites, and provide what are, to our knowledge, the first reliable depth estimates for several cell types in mice. According to our estimates, the average depth of oocytes is 29, consistent with previous estimates. The average depth of B cells ranges from 34 to 79 and is linearly related to mouse age, suggesting a rate of one cell division per day. In contrast, various types of adult stem cells underwent on average fewer cell divisions, supporting the notion that adult stem cells are relatively quiescent. Our method for depth estimation opens a window for revealing tissue turnover rates in animals, including humans, which has important implications for our knowledge of the body under physiological and pathological conditions.

All the cells in our body are descendants of a single cell: the fertilized egg. Some cells are relatively close descendants, having undergone a small number of cell divisions, while other cells may be hundreds or even thousands of divisions deep. So far, science has been unable to provide even gross estimates for the depths of the vast majority of human and mouse cells. In this study, we show that precise depth estimates of cells can be obtained from the analysis of non-hazardous mutations that spontaneously accumulate during normal development. The concept behind the method is simple: deeper cells tend to acquire more mutations and “drift away” from the original DNA sequence of the fertilized egg. Knowing how deep cells are is the key to many fundamental open questions in biology and medicine, such as whether neurons in our brain can regenerate, or whether new eggs are created in adult females.

Direct observation of cell divisions, which was used to reconstruct the cell lineage of the 959 somatic cells of

Our work develops the notion of genetic molecular clocks into a quantitative method for depth estimation of single cells of any type. When a cell divides, its DNA is replicated with almost perfect fidelity, yet somatic mutations occur in every cell division

(A) The depth of a cell is the number of divisions it underwent since the zygote. The figure shows a tiny part of the cell lineage tree of an organism – a binary tree representing the exact pattern of cell divisions of its developmental history from a single cell to its current state. The tree depicts not only the lineage relations between cells, but also their depths, obtained by projecting them to the depth axis. A correlation between genetic distance and cell depth is shown in a small fraction (5 MS alleles) of the genome. Each allele is assigned a relative allelic value – a whole number equal to the difference between the number of repeats of that allele and the number of repeat units of the corresponding allele in the zygote. Mutations are coloured in red. (B) Computer simulations of MS mutations and depth estimations based on a maximum likelihood approach. Cells at various depths were simulated accumulating MS stepwise mutations according to wild-type and MMR-deficient mutation rates (^{−5} and

We use the zygote genome as a reference against which mutations are determined (

These simulations assume that the mutational behaviour of MS alleles is simple, consistent, and completely known to us. In practice this is not the case: although some macro-properties of MS mutational behaviour are known

(A) Our method for

Successful depth estimation based on our suggested method depends on the fulfilment of three conditions: (i) there is a good linear correlation between reconstructed and actual node depths in CCTs; (ii) this correlation is similar between similar experiments, i.e. a multiplier obtained from the correlation in one experiment can be used in another; (iii) this multiplier can be reliably transferred from ^{2} = 0.94 and 0.87 for CCTs A and C, respectively) and a newly-created mouse CCT. Multipliers of human CCTs are very similar – 411 and 421 for CCTs A and C, respectively. Depth estimations of nodes from one CCT based on a fit obtained from the other CCT are extremely precise: the average error when estimating the depth of a node from CCT A based on a fit obtained from CCT C is 6.4%±4.1% (and 11%±11% vice versa, for the estimation of CCT C nodes based on a fit obtained from CCT A). The multiplier of mouse CCT is different (256), reflecting the differences between mutational behaviour of our human and mouse MS sets. To further demonstrate that multipliers can be transferred between similar experiments, we performed computer simulations in which a multiplier obtained from one randomly generated tree was used to estimate depths of cells of other similar random trees. These simulations show that when 100 alleles (with mutation rate ^{−6}).
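The multiplier transfer described above can be illustrated with a minimal sketch. It assumes that the multiplier is the slope of a least-squares line through the origin relating reconstructed depth to actual depth, which the text does not state explicitly. The reconstructed depths are hypothetical toy values; only the multipliers 411 and 421 come from the text.

```python
def fit_multiplier(reconstructed, actual):
    """Least-squares slope of a line through the origin (assumed fit form):
    actual depth ~ multiplier * reconstructed depth."""
    numerator = sum(r * a for r, a in zip(reconstructed, actual))
    denominator = sum(r * r for r in reconstructed)
    return numerator / denominator


def mean_relative_error(multiplier, reconstructed, actual):
    """Average of |estimated - actual| / actual over all nodes."""
    errors = [abs(multiplier * r - a) / a for r, a in zip(reconstructed, actual)]
    return sum(errors) / len(errors)


# Hypothetical reconstructed depths (sums of edge lengths), not data from the paper.
reconstructed_c = [0.01, 0.02, 0.05, 0.10]
# Toy "actual" depths lying exactly on the reported fit of CCT C (multiplier 421).
actual_c = [421 * r for r in reconstructed_c]

# Transfer the reported multiplier of CCT A (411) to the toy CCT C nodes.
transfer_error = mean_relative_error(411, reconstructed_c, actual_c)
```

With noise-free toy data the transfer error reduces to the relative difference between the two multipliers, 10/421 or about 2.4%. The reported errors of 6.4% and 11% are larger, presumably because real nodes scatter around each fit.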

Next, we checked whether depth analysis is sensitive to the number of analyzed cells and to the specific choice of analyzed alleles. To test the former, we generated random trees with 50 leaves and simulated MS stepwise mutations at various rates. We reconstructed the trees with increasing subsets of leaves (3–50). Depth estimates of a single leaf (included in all subsets) varied by less than 5% between reconstructions, demonstrating that our method is robust to the number of analyzed cells (data not shown). To test the latter, we calculated a fit for the mouse CCT by bootstrapping the data 1000 times (see

We applied the method and estimated depths of 163 cells of various types sampled from four MMR-deficient (Mlh1−/−, see ^{2} = 0.97;

| Mouse | Cell Type | Source | Estimated depth |
|---|---|---|---|
| ML7 (75), Age: 5.5 wk | Satellite cell (57) | Various muscles and myofibers | 37.5±1.9 |
| | Oocyte** (8) | Right ovary | 28.6±5.4 |
| | B cell** (10) | Spleen | 33.8±3.8 |
| ML2 (26), Age: 10 wk | Satellite cell (10) | Various muscles and myofibers | 28.4±3.2 |
| | Kidney stem cell (8) | Kidney | 40.1±7.7 |
| | B cell* (8) | Spleen | 67.1±8.7 |
| ML4 (25), Age: 13 wk | Satellite cell (12) | Various muscles and myofibers | 36.3±3.6 |
| | Mesenchymal stem cell (5) | Femur/tibia | 27.6±7.4 |
| | Hematopoietic stem cell (2) | Femur/tibia | 24.0±4.0 |
| | B cell* (5) | Spleen | 78.6±35.4 |
| | NK cell* (1) | Spleen | 42.0 |
| ML8 (37), Age: 40 wk | Tumor** (23) | Thoracic cavity/lung | 237.0±8.4 |
| | Epithelial** (14) | Lung | 117.3±18.0 |

Depth estimates of various cells sampled from mice aged 5.5–40 weeks. (A) Box plots of depths according to cell type and mouse age. The box (blue) spans the middle 50% of the data, from the lower to the upper quartile (the median is red). The ends of the vertical lines (whiskers) indicate the minimum and maximum data values, unless outliers (marked by ‘+’) are present, in which case the whiskers extend to a maximum of 1.5 times the inter-quartile range. Stars mark cell types whose average depths differ significantly (p<0.05). (B) Average depths of satellite cells and B cells as a function of mouse age. While depths of satellite cells did not correlate with age, depths of B cells showed a linear correlation with age (R^{2} = 0.97), corresponding to about one cell division per day. Error bars denote standard errors of the mean.
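The age dependence of B cell depth can be checked against the tabulated averages with a short sketch. It assumes an ordinary unweighted least-squares line through the three B cell averages (5.5, 10 and 13 weeks). The fitting procedure actually used for panel B is not described here, so this is an illustration rather than a reproduction.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Average B cell depths from the table (mice ML7, ML2 and ML4).
ages_weeks = [5.5, 10, 13]
b_cell_depths = [33.8, 67.1, 78.6]

# Convert age to days so that the slope reads as divisions per day.
ages_days = [7 * weeks for weeks in ages_weeks]
slope, intercept, r_squared = linear_fit(ages_days, b_cell_depths)
```

Under these assumptions the slope comes out a little under one division per day and R² is close to the reported 0.97, in line with the caption's "about one cell division per day".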

The DNA of analyzed cells was amplified either by

In conclusion, we developed a method for estimating depths of cells

Horwitz and colleagues also recently developed a method for cell lineage analysis based on somatic mutations in polyguanine repeat DNA sequences

The reconstructed tree obtained by Horwitz and colleagues

Our depth estimations of oocytes were highly similar to previous reports, providing an independent confirmation for the precision and correctness of our method. Nevertheless, depth estimations may be imprecise to some extent due to various factors, such as the stochastic nature of mutations, differences between

Mlh1−/− MEF cells (obtained from Michael Liskay, OHSU) were grown in medium composed of DMEM low glucose (Gibco) supplemented with 10% Fetal Bovine Serum, 1% Non-essential amino acids, and Gentamycin (70 µg/ml). The CCT was created as previously described

Identifiers of cells at various depths were simulated based on a symmetric stepwise mutation model, according to which each MS allele mutates with probability ^{−5}) or MMR-deficient (_{2} (estimated_depth / real_depth)|. Normally, increasing depth improves (lowers) the score. However, in shallow cells with slow mutation rates (e.g. cells 10 divisions deep in wild-type simulations) there are usually no mutations, hence the estimated depth is zero and the score increases with depth since the difference between the estimated and real depths increases.
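A minimal sketch of such a simulation follows. It assumes that in each division every allele independently gains or loses one repeat with equal probability when it mutates. The function names and the `mu` parameter are illustrative, and the mutation rates used in the study are not reproduced here. The score follows the surviving fragment of the formula, read as an absolute log2 ratio; how a zero estimate was scored cannot be recovered from the text, so the sketch rejects it.

```python
import math
import random


def simulate_identifier(depth, n_alleles, mu, rng):
    """Relative allelic values of one cell after `depth` divisions.

    Each allele starts at 0 (the zygote value) and, in every division,
    mutates with probability `mu` by +1 or -1 repeat with equal chance.
    """
    values = [0] * n_alleles
    for _ in range(depth):
        for k in range(n_alleles):
            if rng.random() < mu:
                values[k] += rng.choice((-1, 1))
    return values


def depth_score(estimated_depth, real_depth):
    """Assumed reading of the score: |log2(estimated / real)|, lower is better."""
    if estimated_depth <= 0:
        # The original treatment of zero estimates is not recoverable.
        raise ValueError("score is undefined here for a non-positive estimate")
    return abs(math.log2(estimated_depth / real_depth))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
identifier = simulate_identifier(depth=50, n_alleles=100, mu=0.01, rng=rng)
```

A score of 1 means the estimate is off by a factor of two in either direction, so over- and underestimates are penalized symmetrically.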

This algorithm was used for estimating depths of cells in computer simulations (in the case the mutational behaviour of MS is simple and completely known). We assume that the mutational behaviour of each MS allele is defined by a Probability Vector (PV; p_{i} is the probability of the allele to mutate by _{i} = 1). For every MS allele, given its initial value (number of repeats) at the root, fill up a table whose rows are indexed by number of repeats, whose columns are indexed by depth (starting at 1 and ending at the maximum conceivable depth of a leaf), and whose (
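The table-filling step can be sketched as follows, assuming that each table entry holds the probability of observing a given number of repeats at a given depth, and that the estimated depth is the one maximizing the joint likelihood over all alleles. For brevity the sketch uses relative allelic values (root value 0) and a single probability vector shared by all alleles, whereas the text allows one per allele. The names `allele_tables` and `ml_depth` are illustrative.

```python
import math


def allele_tables(pv, max_depth):
    """tables[d][v] = P(relative allelic value == v after d divisions).

    `pv` maps a step size (change in repeat number per division) to its
    probability; the probabilities are expected to sum to 1.
    """
    dist = {0: 1.0}  # at the root the allele equals its zygote value
    tables = [dist]
    for _ in range(max_depth):
        nxt = {}
        for value, p in dist.items():
            for step, q in pv.items():
                nxt[value + step] = nxt.get(value + step, 0.0) + p * q
        dist = nxt
        tables.append(dist)
    return tables


def ml_depth(observed, pv, max_depth):
    """Depth in 1..max_depth that maximizes the likelihood of `observed`."""
    tables = allele_tables(pv, max_depth)
    best_depth, best_ll = None, -math.inf
    for d in range(1, max_depth + 1):
        ll = 0.0
        for v in observed:
            p = tables[d].get(v, 0.0)
            if p == 0.0:
                ll = -math.inf  # this depth cannot produce the observation
                break
            ll += math.log(p)
        if ll > best_ll:
            best_depth, best_ll = d, ll
    return best_depth


# Illustrative symmetric stepwise PV; the rate is hypothetical.
pv = {-1: 0.05, 0: 0.90, 1: 0.05}
estimate = ml_depth([1, -1, 0, 2, 0, 1, 0, -1, 0, 0], pv, max_depth=200)
```

Working in log-likelihoods avoids numerical underflow when many alleles are combined, and the table needs to be built only once per probability vector.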

Only alleles which were successfully amplified and analyzed in at least 20% of mouse CCT and Mlh1−/− samples were analyzed. We generated 10^{6} random permutations of the percent of mutations in the

The loci for

A lineage tree was reconstructed for each Mlh1−/− mouse (using NJ and the ‘Absolute Distance’ function), and for each cell the reconstructed depth was obtained, which is the sum of edge lengths in the reconstructed tree. The estimated depth of each cell was its reconstructed depth multiplied by the multiplier obtained from the mouse CCT. We calculated the 95% confidence interval of the regression coefficient, and used its lower and upper bounds as multipliers for obtaining the 95% confidence interval of the depth estimate of each cell. Depth recalculations assuming WGA artefact mutations were calculated as follows: for each WGA sample
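The conversion from reconstructed to estimated depth can be sketched as below. The tree, its edge lengths and the confidence bounds of the multiplier are hypothetical; only the mouse CCT multiplier (256) is taken from the text. The tree is represented as a child-to-parent map, which is an illustrative choice rather than the format used in the study.

```python
def reconstructed_depth(node, parent, edge_length):
    """Sum of edge lengths on the path from the root to `node`."""
    depth = 0.0
    while node in parent:  # the root has no parent entry
        depth += edge_length[node]  # length of the edge above `node`
        node = parent[node]
    return depth


def estimate_depth(recon_depth, multiplier, ci_low, ci_high):
    """Estimated depth and its interval from the multiplier and its bounds."""
    return recon_depth * multiplier, recon_depth * ci_low, recon_depth * ci_high


# Hypothetical reconstructed tree: root -> a -> b -> c.
parent = {"a": "root", "b": "a", "c": "b"}
edge_length = {"a": 0.0625, "b": 0.03125, "c": 0.0625}

recon = reconstructed_depth("c", parent, edge_length)
# 256 is the mouse CCT multiplier; 240 and 272 are hypothetical 95% bounds.
depth, lower, upper = estimate_depth(recon, 256, 240, 272)
```

Because the estimate is a simple product, the uncertainty of the multiplier carries over proportionally to every cell in the tree.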

Materials and Methods for obtaining cell identifiers for ML2, ML4 and ML7 cells

(0.05 MB DOC)

Estimated depths of each analyzed cell

(0.22 MB DOC)

List of microsatellite loci used for analysis of mouse CCT

(0.17 MB DOC)

Table of identifiers of mouse CCT nodes

(0.05 MB XLS)

Cell identifiers for ML2, ML4, ML7 and ML8 samples

(0.61 MB XLS)

We thank Uriel Feige for suggesting the algorithm for estimating simulated depths and Amit Mishali for the design and preparation of the figures. Ehud Shapiro is the Incumbent of The Harry Weinrebe Professorial Chair of Computer Science and Biology and of The France Telecom – Orange Excellence Chair for Interdisciplinary Studies of the Paris “Centre de Recherche Interdisciplinaire” (FTO/CRI).