Did Ralph and Coop allow for nonrandom recombination, leading to "hot spots" and "cold regions"??
Posted by Gary_S_Collins on 08 May 2013 at 05:47 GMT
My autosomal DNA was analyzed by 23andMe earlier this year. Out of a total of 660 other individuals who have DNA segments matching mine, 150 surprisingly have matching segments on chromosome 22 between about 17 and 21 million base-pairs (minimum shared segment 8.0 cM). Assuming that other 23andMe subscribers are a relatively random lot, what can explain such an anomaly? Lacking any reason to suspect that there is a defect in the way that 23andMe calculates matches, are there special features of that stretch of chr22?
I submit that the explanation lies in high variability in the probability of recombination along the lengths of the chromosomes, leading to "hotspots", regions where recombination is much more frequent than if it occurred at random, and "coldspots", regions in which recombination has lower frequency.
Information about recombination variability in chromosome 22 is given in the paper http://bioinformatics.bc..... See especially Figure 3, which shows the "Genetic Distance (cM)" as a function of the "Physical Distance (mega base-pairs)". The arrows indicate hotspots, with "colder" intervening regions. There appears to be a very "cold" region between about 17 and 20 Mb, the same region in which I have many matches, and it is especially cold for females. The data in that study are based on analyses of DNA of just eight families.
Consider an archetypal ancestor. In each descending generation, suppose (for the sake of argument) that the ancestor’s genetic inheritance is, on average, "sliced and diced" by a factor of two due to recombination, with another factor of two due to pairing of chromosomes from each parent. On average, children in the nth descendant generation would then share 2^-n of the ancestor’s DNA, with each chromosome divided into segments of about 4^-n of its length. Thus, segments of 25% after one generation, 6.3% after two generations, 1.6% after three, etc. The stretch from 16000000 to 21000000 on chr22 is more than 9% of the length. Thus, it appears completely impossible that 150 out of my 660 matching “cousins” should share a segment as long as 9% of chr22 unless they were all something like first or second cousins. In addition, they should likely share additional segments, which they don’t.
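The back-of-the-envelope arithmetic above can be written out as a short Python sketch. It only reproduces the simplified model stated in the comment (segment length shrinking by a factor of four per generation); the function names are made up for illustration, and the 9% threshold is the figure quoted above, not an independently checked value.

```python
def segment_fraction(n):
    """Segment length as a fraction of the chromosome after n generations,
    under the comment's simplified model (a factor of 4 per generation)."""
    return 4.0 ** -n


def first_generation_below(threshold):
    """First generation at which the model's segment fraction drops below threshold.

    threshold must be greater than 0, otherwise the loop never ends.
    """
    n = 1
    while segment_fraction(n) >= threshold:
        n += 1
    return n


for n in (1, 2, 3):
    print(f"generation {n}: {100 * segment_fraction(n):.2f}% of the chromosome")

# The comment puts the shared stretch at "more than 9%" of chr22.
print("segments fall below 9% at generation", first_generation_below(0.09))
```

Under this toy model a segment covering 9% of the chromosome is already longer than the typical segment two generations down, which is the basis for the "first or second cousins" remark.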
The math may not be entirely correct, but my central point is that the nonrandom statistical nature of recombination may well lead to propagation of segments from one very distant ancestor over many, many generations, segments that are decidedly not from multiple, more recent ancestors. I am not sufficiently expert to assess whether Ralph and Coop have explicitly or implicitly taken such “cold spots” into account, but I think that they must do so.
RE: Did Ralph and Coop allow for nonrandom recombination, leading to "hot spots" and "cold regions"??
Bonnie_E_Schrack replied to Gary_S_Collins on 08 May 2013 at 12:48 GMT
That was the section entitled, "IBD Rates along the Genome."
RE: RE: Did Ralph and Coop allow for nonrandom recombination, leading to "hot spots" and "cold regions"??
Gary_S_Collins replied to Bonnie_E_Schrack on 08 May 2013 at 17:26 GMT
Thank you, Bonnie. I'm examining that section and the supplemental figures more closely.
RE: RE: RE: Did Ralph and Coop allow for nonrandom recombination, leading to "hot spots" and "cold regions"??
gmcoop replied to Gary_S_Collins on 09 May 2013 at 00:41 GMT
That's an interesting example, Gary. We'd make one correction to the interpretation: your example is an interesting anomaly not so much because of the length but because of the clustering, since nearly 1/4 of the long shared blocks are found on chromosome 22 (less than 2% of the genome). However, we don't know the specifics of 23andMe's IBD caller, so we can't comment on whether something is wrong there. Are you saying that the shared blocks are clustered in one particular region? If so, could you give us the coordinates of that region, and we’ll see if it looks unusual in our IBD data as well [and report back to you].
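To get a rough sense of how strong that clustering is, here is a minimal Python sketch of a binomial tail calculation. It rests on loud assumptions that are not from the thread: each of the 660 matches is treated as one independent block, placed uniformly along the genome, with a 2% chance of landing on chromosome 22 (the upper bound quoted above). Real IBD blocks are neither independent nor uniformly placed, so this is only an order-of-magnitude illustration.

```python
import math


def log_binom_pmf(n, k, p):
    """Natural log of the binomial probability of exactly k successes in n trials."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )


def binom_tail(n, k_min, p):
    """Probability of at least k_min successes in n trials (0 < p < 1)."""
    return sum(math.exp(log_binom_pmf(n, k, p)) for k in range(k_min, n + 1))


n_matches = 660      # matches reported in the thread
n_on_chr22 = 150     # matches falling in the chr22 region
p_chr22 = 0.02       # assumed share of the genome, from "less than 2%"

print("expected under the uniform assumption:", n_matches * p_chr22)
print("P(at least 150 on chr22):", binom_tail(n_matches, n_on_chr22, p_chr22))
```

Working in log space with `math.lgamma` avoids overflow in the binomial coefficients; the tiny tail probability simply says the uniform assumption cannot account for the clustering, not what the actual cause is.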
Bonnie's pointing to the right part of the paper, but let us add more details. First, we measured lengths on the genetic map, rather than in base pairs, which normalizes things so that the average number of crossovers per unit length per generation is constant across the genome. So, if we have the right genetic map, then we've got the expected number of crossovers per unit length per generation right, which is all we really need for our analysis. The sort of variation you're pointing out will only affect the variance, not the average number of crossovers [and our findings are robust to that type of effect].
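As an illustration of what measuring on the genetic map means in practice, here is a small Python sketch that converts physical positions to genetic positions by linear interpolation. The map values are invented toy numbers chosen to include one "cold" stretch; they are not taken from the Icelandic map or from any real chromosome, and the function names are hypothetical.

```python
from bisect import bisect_right

# Toy genetic map: (physical position in Mb, cumulative genetic position in cM).
# Invented values; the 5-10 Mb stretch is deliberately "cold".
TOY_MAP = [(0.0, 0.0), (5.0, 10.0), (10.0, 11.0), (15.0, 25.0)]


def to_cm(pos_mb, gmap=TOY_MAP):
    """Linearly interpolate a physical position (Mb) onto the genetic map (cM)."""
    xs = [x for x, _ in gmap]
    if pos_mb <= xs[0]:
        return gmap[0][1]
    if pos_mb >= xs[-1]:
        return gmap[-1][1]
    i = bisect_right(xs, pos_mb) - 1
    (x0, y0), (x1, y1) = gmap[i], gmap[i + 1]
    return y0 + (y1 - y0) * (pos_mb - x0) / (x1 - x0)


def genetic_length(start_mb, end_mb, gmap=TOY_MAP):
    """Genetic length (cM) of the physical interval [start_mb, end_mb]."""
    return to_cm(end_mb, gmap) - to_cm(start_mb, gmap)


print(genetic_length(0.0, 5.0))    # 5 Mb at an ordinary rate
print(genetic_length(5.0, 10.0))   # 5 Mb inside the cold stretch
```

In this toy map the two 5 Mb intervals differ tenfold in genetic length, so a block that looks long in base pairs can be short in cM, which is why block lengths are compared on the genetic scale.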
But do we have the map right? The map we use was generated from an Icelandic study. Comparisons to other genetic maps for Europeans [by groups including <a href=”http://www.ncbi.nlm.nih.g...> and others] show that these maps have good concordance [at least at the scale we need]. For our work we need the genetic map we use to be a good approximation to the time-averaged genetic map over the past few thousand years. However, some variants that modify recombination (inversions, and alleles that control the location of hotspots) are at slightly different frequencies in some modern populations, and these frequencies are also unlikely to have been constant in time (we've worked on aspects of these modifiers too, see <a href=http://www.ncbi.nlm.nih.g...> and <a href=”http://www.ncbi.nlm.nih.g...>). This is why we see a few places with distortions in the length spectrum (supplemental figure S1); these are due to a known inversion and potentially new unidentified inversions. The effect is not large, however, and since these deviations only occupy a small proportion of the genome, they don't affect our results significantly (a point we talk about more in the discussion).
Peter and Graham
RE: RE: RE: RE: Did Ralph and Coop allow for nonrandom recombination, leading to "hot spots" and "cold regions"??
Sorry, the HTML links seem to have failed in that comment. The first should have read:
"[by groups including ours, http://www.ncbi.nlm.nih.g... and others]"
and the last two references should have been
I should also note that we measured our IBD blocks on the genetic map given in http://www.ncbi.nlm.nih.g... as well (i.e. another European population), and the results were very consistent.