Recombination varies greatly among species, as illustrated by the poor conservation of the recombination landscape between humans and chimpanzees. Thus, shorter evolutionary time frames are needed to understand the evolution of recombination. Here, we analyze its recent evolution in humans. We calculated the recombination rates between adjacent pairs of 636,933 common single-nucleotide polymorphism loci in 28 worldwide human populations and analyzed them in relation to genetic distances between populations. We found a strong and highly significant correlation between similarity in the recombination rates corrected for effective population size and genetic differentiation between populations. This correlation is observed at the genome-wide level, but also for each chromosome and when genetic distances and recombination similarities are calculated independently from different parts of the genome. Moreover, and more relevant, this relationship is robustly maintained when considering presence/absence of recombination hotspots. Simulations show that this correlation cannot be explained by biases in the inference of recombination rates caused by haplotype sharing among similar populations. This result indicates a rapid pace of evolution of recombination, within the time span of differentiation of modern humans.
Citation: Laayouni H, Montanucci L, Sikora M, Melé M, Dall'Olio GM, Lorente-Galdos B, et al. (2011) Similarity in Recombination Rate Estimates Highly Correlates with Genetic Differentiation in Humans. PLoS ONE 6(3): e17913. https://doi.org/10.1371/journal.pone.0017913
Editor: Carles Lalueza-Fox, Institut de Biologia Evolutiva - Universitat Pompeu Fabra, Spain
Received: December 22, 2010; Accepted: February 14, 2011; Published: March 28, 2011
Copyright: © 2011 Laayouni et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This research was funded by grants BFU2007-63657, BFU2009-13409-C02-02 and SAF-2007-63171 awarded by Ministerio de Educación y Ciencia (Spain), by the Direcció General de Recerca of Generalitat de Catalunya (Grup de Recerca Consolidat 2005SGR/00608 and 2009 SGR 1101), and by the National Institute for Bioinformatics (www.inab.org), a platform of Genoma España. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
The recombination rate is neither constant along chromosomes nor across species. The rate within genomes has been observed to vary at both the megabase level, with different chromosomal regions in the human genome showing differences in their recombination rates ,  and at a finer level, due to the existence of recombination hotspots , , . Comparisons of the human and the chimpanzee genomes have revealed poor conservation of recombination landscapes, in contrast to the high level of DNA sequence conservation observed among these species , . Multiple lines of evidence suggest that sequence changes in the zinc-finger protein PRDM9 may be responsible for hotspots differences among species , . Recombination rates have also been compared among human populations, revealing large-scale conservation , while some differences in hotspot intensities and some population-specific hotspots have been described at a finer scale , , , , . Finally, different studies have shown the existence of individual variation in recombination ,  and its heritability has been investigated, along with its biological consequences .
Measuring the fine-scale recombination rate is experimentally challenging, the resolution of recombination maps experimentally obtained is limited by relatively few meioses and a low density of markers, exception made by the recent paper by Kong et al. . However good estimates can be obtained by applying population-genetic methods to DNA sequences . Statistical methods have been developed to infer the fine-scale structure of recombination rate variation from genome-wide scale data . One of the widely used methods is implemented in the LDhat package, which is based on a composite-likelihood approach. Simulations have shown that LDhat produces largely unbiased rate estimates of the fine-scale genetic map . More recently, Khil and Camerini-Otero  have shown that present-day genetic crossovers are well predicted by a population averaged hotspot map computed from linkage disequilibrium data.
Differences in recombination rates among human populations provide a useful temporal framework to analyze the evolution of the recombination landscape, as they are recent enough to capture fast evolutionary changes. The basal branches of the genetic diversification of human populations happened some 150,000 years ago, a much shorter time than the split between humans and chimpanzees (around 6.5 million years). The comparison of the recombination patterns among human populations provides a means to verify whether recombination landscapes evolve over time. To address this issue, we analyzed whether differences in recombination rates among human populations are correlated with their genetic differences computed as genetic distances. Whole genome estimations of recombination rates based on SNP data are already available for HapMap samples which, however, consist only of four populations for HapMap Phase I and II ,  and 11 populations for HapMap Phase III . Here we computed the recombination rates using data for 660,918 SNPs on the Illumina HumanHap650K Beadchips genotyped in the full HGDP-CEPH panel samples ,  for 28 populations belonging to six continental groups representing most of the worldwide human diversity .
Materials and Methods
Recombination rate estimation
We considered the H971 subset of the Human Genome Diversity Cell Line Panel (HGDP-CEPH) recommended by Rosenberg . The 51 original HGDP-CEPH population samples  were re-grouped into 39 populations based on geographic and ethnic criteria as in . To avoid small sample sizes, the analysis was performed on genotypes from 28 populations belonging to six continental groups, with sample sizes over 19 individuals (a list of used populations and their number of individuals is presented in Table 1). We used data for 660,918 SNPs on the Illumina HumanHap650K Beadchips successfully genotyped in the full HGDP-CEPH panel samples , . SNPs are spaced 4.4 kb apart on average, an appropriate length given that hotspots occur every 200 Kb or less and their widths are 1–2 Kb , . Population recombination rates were calculated between neighboring SNPs according to the method implemented in the rhomap program  within the LDhat package . LDhat methods have been demonstrated to give highly similar results to alternative approaches in human and chimpanzee datasets ,  and are computationally practicable for genome wide variation surveys. For a reliable estimation of the recombination rates, loci with more than 10% missing data in at least one population were discarded from the analysis. After this cleaning procedure, the total number of SNPs included in the analysis was 636,933 (96% of all the SNPs in the HGDP). The number of SNPs for each chromosome is reported in Table 2. For each population, 5 independent runs of the rhomap program were carried out (with parameters: iterations = 10.000.000, sampling = 5.000, burnin = 100.000). For each pair of adjacent SNPs we obtained 5 estimates of the population recombination rate (4Ner/kb) in each population and the median of these 5 estimates was used in the analysis.
Since population recombination rates (ρ) are dependent on the effective population size (ρ = 4Ner), estimates of the population recombination rate in each population were normalized by θ = 4Neµ, a scaled population mutation rate obtained from the same individuals and populations, where µ is the genome-wide average microsatellite mutation rate per locus and per generation . We have used a measure obtained though microsatellites because they represent a totally independent set of data and thus there will not be problems of circularity; moreover they refer to exactly the same populations. As there is no evidence of mutation rates varying among human groups, this correction produces values that are not biased by effective population size.
Correlation between genetic distance and recombination dissimilarity
We obtained a Spearman rank correlation matrix for the recombination rates among all pairs of populations. Each correlation value was obtained by comparing the values of corrected ρ (see above) for all pairs of adjacent typed SNPs between a population pair. In order to simplify the comparison with the genetic distance, the Spearman correlation values were turned into a dissimilarity measures by subtracting them from 1. The obtained 28×28 matrix is then a measure of the dissimilarity of recombination rates between each pair of populations.
The differentiation among human populations was estimated through the FST measure  among each pair of populations. FST values were calculated using a routine implemented in the PopGen module of BioPerl  and stored in a 28×28 matrix.
The matrix of recombination dissimilarity and that of genetic distance (FST matrix), were compared using a standardized Mantel test  by randomly permuting 9,999 times the rows and columns of one of the matrices. Statistical analyses were implemented using the R statistical software.
To further investigate the effect of the sharing of haplotypes and, hence of linkage disequilibrium patterns (which are at the base of the recombination rate estimates) on the relationship between genetic distance and recombination landscape, we designed a simulation study. We simulated human demography under a model in which the recombination rate was the same for all the simulated populations, and we sought to determine whether the correlation between genetic distance and inferred recombination similarity between simulated populations was similar to the observed in empirical data.
The simulations were carried out with the COSI program  which provides a simulation of the human demography under a three-population model based on the HapMap populations. This model was specifically designed to generate sequences that closely resemble empirical data of three human populations (African, European and Asian) by means of simulating a human-like demography and a variable recombination rate along the sequences, allowing for presence and absence of hotspots. Cosi consists of two programs which are run one after the other. The first generates a random local recombination map based on the distribution seen in the deCODE genetic map for the autosomes . The second, is the coalescent program itself and it builds up a coalescent network taking into account the local recombination map generated previously. Therefore, each simulation will generate a different recombination landscape with different number of hotspots and coldspots. Specifically, the model was calibrated to obtain realistic FST values that mimic the divergence found among the three human populations being simulated and to obtain similar values of the frequency distribution of alleles, among other parameters. We performed 1000 simulations using the best-fitting demographic model provided by COSI. For each simulation, we set the length of the simulated sequences to 1 Mb and adopted a sample size of 56 sequences for European and Asian populations and 42 for the African population with the aim of having the same amount of individuals as in a three chosen equivalent HGDP populations (Yoruba, French and Japanese). In each simulation, the distribution of the recombination rate is the same for the three simulated populations; this leads to simulated genotypes of different populations that share common haplotypes but have not experienced differences in their recombination rate. Finally, as SNPs included on Illumina's HumanHap650Y Genotyping BeadChip are tagSNPs with r2 higher than 0.8, and in order to have a similar ascertainment bias in the simulations and in the observed data, we performed a selection of tagSNPs with the same criteria using Haploview software with the pairwise option .
In order to compare simulated data to a consistent empirical dataset, we randomly chose, along the whole genome, 1000 non-overlapping 1 Mb long windows, and we analyzed them across the three populations of Yoruba, French and Japanese.
We then computed FST and recombination rates, following the same procedure as before, for real and simulated data. If the shared haplotypes were the main source of the high correlation found between recombination and genetic distance, we expect to observe this correlation also in the simulated data.
Results and Discussion
Exploratory analysis of recombination rates
Population recombination rates were computed between 636,933 neighboring SNPs for 28 populations. As the recombination rate was estimated through several runs for each population, and to test for the agreement of estimates between runs of the same chromosome, 10 runs were performed for chromosome 22 for all populations. We carried out a repeated measure ANOVA testing population and run as the main effects and pairs of adjacent SNPs as a covariate. No statistical significance of runs was found, but population and pairs of adjacent SNPs were highly significant (data not shown). This result reflects that the noise in the estimation procedure is low in relation to differences between populations.
Table 1 shows the mean estimated recombination rate for all populations, grouped according to their geographical region. Results indicate considerable variation in recombination rates between populations, with low recombination rates for populations from East Asia. A repeated measure ANOVA shows that differences between populations are highly significant (F27, 636,910 = 59,479.8; p<0.00001; R2 = 8.5%). A Friedman ANOVA test shows similar results (ANOVA χ2 = 2,255,369; p<0.000001; df = 27). Post hoc analysis using a Bonferroni correction for the repeated ANOVA test show that differences between populations remain significant, except for two homogenous groups from Central South Asia: Pathan, Burusho and Brahui on the one hand; and Mozabite, Balochi and Makrani on the other. Figure 1 shows the estimated recombination rate (scaled by the genome-wide average microsatellite mutation rate) along chromosome 22 for 6 populations (one from each continental region; all populations are shown in Figure S1). The figure shows a similar pattern for all populations; but substantial variation can be detected by close observation. For example, North East China and Maya exhibit fewer hotspots than the other populations. A hotspot located just before 20 Mbp in all populations is absent (or much weaker) in North East China. A hotspot region located around 25 Mbp is absent (or much weaker) in Bedouin and French, but present in other populations. More variation is observed when considering all populations (Figure S1). This variation is consistent with previous reports in other genomic loci and genome-wide .
Genetic distance and recombination similarity between populations
Spearman rank correlation between population recombination estimates were obtained by comparing the values of corrected recombination ρ/θ for all pairs of adjacent typed SNPs between each population pair. The differentiation among human populations was estimated through the FST measure  among each pair of populations. The correlation values in recombination between population pair and FST measures were stored as dissimilarity and distance matrices respectively and compared using a standardized Mantel test . A significant Mantel's r correlation of 0.894 (p<0.0001) was observed, indicating that differences in recombination rates among populations increase with their genetic distance (Figure 2). In other words, genetic differentiation across human populations explains a considerable amount of recombination differences among them. This result also stands when the analysis is independently performed for each chromosome; then the Mantel test correlation ranges from 0.761 for chromosome 16 to 0.946 for chromosome 10 (Table 2).
Genetic differentiation estimated through FST is a measurement of allele frequency differences. LD based estimation of recombination rate is also affected by allele frequencies at the marker loci. To explore the effect of allele frequency on recombination estimates, SNPs were placed into 10 bins based on their minor allele frequency calculated using the global allele frequency of all populations together. An analysis of variance using recombination estimates as dependent variable and MAFs as a mean effect shows significant differences between categories for all populations, with bins with high MAFs showing high recombination estimates (F9, 636,901 = 776,8 p = <0.0001; Figure S2). However no differences were observed between bins with MAFs higher than 0.2. The correlation between genetic differentiation and recombination dissimilarity using only SNPs within each MAFs category remains very high and strongly significant (Table 3). The mean Mantel's r correlation from the different MAFs categories is slightly smaller than the one obtained from individual chromosomes using all SNPs together (0.858 and 0.878 respectively). We also calculated this correlation using only SNPs with a minor allele frequency (MAFs) above 0.1 and above 0.2 in the 28 analyzed populations. This analysis includes 141,921 SNPs and 34,706 SNPs respectively and show similar results to the mean analysis (Figure 3), the Mantel's r correlation is 0.855 for SNP with MAFs high than 0.1 and 0.830 for SNPs with MAFs high than 0.2 and highly significant for both (p<0.00001).
It can be argued that these results could be driven by a bias in the selected populations, that is, by similar recombination rates in populations belonging to the same continental group, due to the presence of common or shared haplotypes, and - not to smaller changes in crossing-over rate. To test this hypothesis, we repeated our analysis considering only one population per continental group, thus avoiding redundancy in the genetic composition of the populations in our dataset. In particular, the analysis was performed with data from Yoruba (Africa), French (Europe), Bedouin (Middle East/North Africa), Burusho (Central/South Asia), Han (East Asia) and Maya (America) populations. The observed correlation remained very high (Mantel's r = 0.863, p = 0.002) and was statistically significant even with the low number of pairwise comparisons.
To test for the impact of using the same data set for estimating recombination and genetic distance, we performed a Mantel test between the FST matrix calculated for each individual chromosome versus the recombination dissimilarity matrix computed on all the other chromosomes. This makes the estimates of recombination and genetic distance effectively independent since they are estimated from different parts of the genome, and avoids any problem of circularity in our analysis. Results are presented in Table S1. The correlation remains and is still highly significant in all cases (p<0.00001).
Alternatively, comparisons of recombination rates among populations can be evaluated by attending to the presence or absence of recombination hotspots. This analysis has a special value as it is much less dependent on the accuracy of estimating the recombination rates. We defined a hotspot in each population as a recombination rate that exceeds 5 times the mean rate, with a threshold of 0.1×4Ner/Kb. A total of 22,413 hotspots have been detected at least in one population each. The number of hotspots vary among populations, from 2,582 for South China to 8,042 for Palestinian (no correlation between the number of hotspots and population sample size was observed, Pearson correlation test r = −0.08 p>0.05; Spearman correlation test r = 0.34 p>0.05).
Taking into account only the common hotspots shared by all populations within a given continental region, the proportion of shared hotspots between continental regions is maximum between Europe and Middle East and North Africa (0.52), Europe and Central South Asia (0.44) and between Middle East and North Africa and Central South Asia (0.41). These values are, as expected, much lower when considering Sub-Saharan African or East Asian populations (Table 4). An interesting outcome from this analysis is the number of hotspots common to non African human populations compared to Sub-Saharan Africans. The proportion of hotspots shared between these two groups is only 17.4%, which is a small proportion given the recent out of Africa origin of non African population, and also show that the pace of evolution of hotspots is substantial. Figure S3 shows, as an example, patterns of recombination rates for SNPs where a hotspot event was detected in at least one population. Most variation is observed between continental groups while there is a substantial pattern sharing among populations belonging to the same continental group.
We calculated the Jaccard distance between each pair of populations to measure the overall difference in presence/absence of hotspots (defined as the size of the intersection divided by the size of the union of sample sets; in this distance the absence of a hotspot in a given position in two populations does not contribute to the similarity between them as would be in the case of a simple matching coefficient). Comparing this distance matrix with the FST matrix, highly significant results were obtained (Mantel's r = 0.866, p<0.0001), suggesting that differences in the location of recombination hotspots increases with genetic differentiation between human populations.
With the Mantel test analysis using only one population from each continent, we have shown that the effect of haplotype sharing in closely related populations does not explain the correlation between genetic differentiation and recombination. However, it is possible that the sharing of haplotypes and, hence of linkage disequilibrium patterns, had a considerable effect also on distant populations, since its origin can be traced back to the Out of Africa origin of modern humans. To disentangle this point, we performed a simulation study designed to recognize the impact of using shared haplotypes on the estimates of recombination rates.
As the number of simulated populations is only three, the Mantel test cannot provide a robust comparison. To compare the relationship between recombination similarities and genetic differentiation in the three populations being simulated and in the three corresponding HGDP populations, we performed a Spearman correlation of the values of recombination between all neighboring SNPs in the 1Mbp region and their FST values, for both simulated and empirical data. This is a more stringent test than the previous overall comparison between FST and recombination patterns, since, rather than general means, data points correspond now to 1000 windows a 1 Mb each. The correlation between recombination values and genetic distance for empirical data are 0.26, 0.25 and 0.27 for Yoruba-French, Yoruba-Japanese, and French-Japanese respectively (all significant). The values here are much lower than before as they refer to correlations between FST and the correlation of the recombination values for windows of 1 Mb in two populations and not to distances (in the Mantel test, two matrices of FST and recombination dissimilarity between points of populations were compared). Conversely, these values are only 0.05, 0.06 and 0.09 for the simulated African-European, African-Asian and European-Asian (only the last comparison was marginally significant). This shows that, within the simulated populations, FST and recombination rate were not correlated despite sharing common haplotypes, whereas they are clearly correlated within the three studied populations. The common origin of haplotype structure, as illustrated in the simulation data, is unlikely to have contributed to a large part of the correlation between genetic distances and structure of the recombination landscape.
In the simulations we have not considered the possible impact of natural selection and its consequences both on the estimated taxes of recombination (as they rely on measures of linkage disequilibrium) and on FST measures. As its impact is likely to be restricted to the relative low number of regions that could be under very recent positive selection and acting differentially among populations , it is unlikely to have a genome wide impact. Nonetheless, the relationship between recombination, population differentiation and selection in humans is still a working and open field.
The results of this study reveal the footprint of the evolutionary history of human populations on the recombination rate. The large differences found in the comparison of the recombination landscapes among humans and chimpanzees ,  showed that recombination evolves quickly. Here, we give evidence that, even at the narrow timescale separating human populations, on the order of tens of thousands of years, differences appear to be detectable and to be correlated with genetic differentiation among populations. Recombination rate appears to be a rapidly changing parameter, indicating that the underlying factors shaping the likelihood of a recombination event, such as DNA sequences controlling recombination rate variation, also change. The change is strongly detectable also in terms of presence or absence of recombination hotspots even if at the present stage it is not possible to measure the relative importance between changes in intermediate recombination rates and the appearing or disappearing of recombination hotspots. This is consistent with recent data showing that allelic variants of PRDM zinc fingers are significantly associated with variability in genome hotspots among humans . The results obtained in this work contribute to the growing perception of recombination not as fixed feature of the genome of a species, but as a phenotype with ample genetic variation.
Recombination rate estimates (4Ner/kb) corrected for effective population size for successive SNP-pairs for chromosome 22 and in each of 28 populations, grouped into geographical regions.
Mean of the recombination estimate (4Ner/kb) for all populations calculated for 10 categories of SNPs based on their minor allele frequency. MAFs are calculated using the global allele frequency of all populations together.
Heatmap showing patterns of hotspots observed for 300 SNPs of chromosome 22 for the 28 populations, grouped according to their geographical region. The first 300 SNPs of chromosome 22, for which a hotspot is present in at least one population, are reported on the x axis. In color, for each population the value of the recombination estimate (4Ner/kb) corrected for effective population size for that SNP in a gradient from blue (low recombination values) to green (high recombination values).
Mantel's r values between the genetic distance and recombination dissimilarity matrices. First row shows chromosome for which the genetic distance was calculated; first column show the chromosome for which the recombination matrix was calculated. All value were highly significant (p<0.00001).
We are grateful to Brandon Invergo for reviewing the manuscript. We would also want to acknowledge two anonymous reviewers for their comments and suggestions that greatly improved the manuscript.
Conceived and designed the experiments: HL F. Casals F. Calafell JB EB DC AN JG PA. Performed the experiments: HL LM F. Casals. Analyzed the data: HL LM MS MM GMD BLG KMM. Wrote the paper: HL F. Casals F. Calafell JB.
- 1. Kong A, Gudbjartsson DF, Sainz J, Jonsdottir GM, Gudjonsson SA, et al. (2002) A high-resolution recombination map of the human genome. Nat Genet 31: 241–247.
- 2. Myers S, Bottolo L, Freeman C, McVean G, Donnelly P (2005) A fine-scale map of recombination rates and hotspots across the human genome. Science 310: 321–324.
- 3. Crawford DC, Bhangale T, Li N, Hellenthal G, Rieder MJ, et al. (2004) Evidence for substantial fine-scale variation in recombination rates across the human genome. Nat Genet 36: 700–706.
- 4. McVean GA, Myers SR, Hunt S, Deloukas P, Bentley DR, et al. (2004) The fine-scale structure of recombination rate variation in the human genome. Science 304: 581–584.
- 5. Ptak SE, Hinds DA, Koehler K, Nickel B, Patil N, et al. (2005) Fine-scale recombination patterns differ between chimpanzees and humans. Nat Genet 37: 429–434.
- 6. Winckler W, Myers SR, Richter DJ, Onofrio RC, McDonald GJ, et al. (2005) Comparison of fine-scale recombination rates in humans and chimpanzees. Science 308: 107–111.
- 7. Myers S, Bowden R, Tumian A, Bontrop RE, Freeman C, et al. (2010) Drive against hotspot motifs in primates implicates the PRDM9 gene in meiotic recombination. Science 327: 876–879.
- 8. Baudat F, Buard J, Grey C, Fledel-Alon A, Ober C, et al. (2010) PRDM9 is a major determinant of meiotic recombination hotspots in humans and mice. Science 327: 836–840.
- 9. Serre D, Nadon R, Hudson TJ (2005) Large-scale recombination rate patterns are conserved among human populations. Genome Res 15: 1547–1552.
- 10. Bertranpetit J, Calafell F, Comas D, Gonzalez-Neira A, Navarro A (2003) Structure of linkage disequilibrium in humans: genome factors and population stratification. Cold Spring Harb Symp Quant Biol 68: 79–88.
- 11. Conrad DF, Jakobsson M, Coop G, Wen X, Wall JD, et al. (2006) A worldwide survey of haplotype variation and linkage disequilibrium in the human genome. Nat Genet 38: 1251–1260.
- 12. Evans DM, Cardon LR (2005) A comparison of linkage disequilibrium patterns and estimated population recombination rates across multiple populations. Am J Hum Genet 76: 681–687.
- 13. Graffelman J, Balding DJ, Gonzalez-Neira A, Bertranpetit J (2007) Variation in estimated recombination rates across human populations. Hum Genet 122: 301–310.
- 14. Coop G, Przeworski M (2007) An evolutionary view of human recombination. Nat Rev Genet 8: 23–34.
- 15. Coop G, Wen X, Ober C, Pritchard JK, Przeworski M (2008) High-resolution mapping of crossovers reveals extensive variation in fine-scale recombination patterns among humans. Science 319: 1395–1398.
- 16. Kong A, Barnard J, Gudbjartsson DF, Thorleifsson G, Jonsdottir G, et al. (2004) Recombination rate and reproductive success in humans. Nat Genet 36: 1203–1206.
- 17. Kong A, Thorleifsson G, Gudbjartsson DF, Masson G, Sigurdsson A, et al. (2010) Fine-scale recombination rate differences between sexes, populations and individuals. Nature 467: 1099–1103.
- 18. Stumpf MP, McVean GA (2003) Estimating recombination rates from population-genetic data. Nat Rev Genet 4: 959–968.
- 19. Smith NG, Fearnhead P (2005) A comparison of three estimators of the population-scaled recombination rate: accuracy and robustness. Genetics 171: 2051–2062.
- 20. Khil PP, Camerini-Otero RD (2010) Genetic crossovers are predicted accurately by the computed human recombination map. PLoS Genet 6: e1000831.
- 21. (2005) A haplotype map of the human genome. Nature 437: 1299–1320.
- 22. Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, et al. (2007) A second generation human haplotype map of over 3.1 million SNPs. Nature 449: 851–861.
- 23. Altshuler DM, Gibbs RA, Peltonen L, Altshuler DM, Gibbs RA, et al. (2010) Integrating common and rare genetic variation in diverse human populations. Nature 467: 52–58.
- 24. Li JZ, Absher DM, Tang H, Southwick AM, Casto AM, et al. (2008) Worldwide human relationships inferred from genome-wide patterns of variation. Science 319: 1100–1104.
- 25. Jakobsson M, Scholz SW, Scheet P, Gibbs JR, VanLiere JM, et al. (2008) Genotype, haplotype and copy-number variation in worldwide human populations. Nature 451: 998–1003.
- 26. Cann HM, de Toma C, Cazes L, Legrand MF, Morel V, et al. (2002) A human genome diversity cell line panel. Science 296: 261–262.
- 27. Rosenberg NA (2006) Standardized subsets of the HGDP-CEPH Human Genome Diversity Cell Line Panel, accounting for atypical and duplicated samples and pairs of close relatives. Ann Hum Genet 70: 841–847.
- 28. Gardner M, Gonzalez-Neira A, Lao O, Calafell F, Bertranpetit J, et al. (2006) Extreme population differences across Neuregulin 1 gene, with implications for association studies. Mol Psychiatry 11: 66–75.
- 29. Jeffreys AJ, Neumann R, Panayi M, Myers S, Donnelly P (2005) Human recombination hot spots hidden in regions of strong marker association. Nat Genet 37: 601–606.
- 30. Auton A, McVean G (2007) Recombination rate estimation in the presence of hotspots. Genome Res 17: 1219–1227.
- 31. Fearnhead P, Donnelly P (2001) Estimating recombination rates from population genetic data. Genetics 159: 1299–1318.
- 32. Weir BS, Cockerham CC (1984) Estimating F-statistics for the analysis of population structure. Evolution Int J Org Evolution 38: 1358–1370.
- 33. Stajich JE, Hahn MW (2005) Disentangling the effects of demography and selection in human history. Mol Biol Evol 22: 63–73.
- 34. Sokal RR, Rohlf FJ (1995) Biometry. New York: Freeman. 3rd edition.
- 35. Schaffner SF, Foo C, Gabriel S, Reich D, Daly MJ, et al. (2005) Calibrating a coalescent simulation of human genome sequence variation. Genome Res 15: 1576–1583.
- 36. Barrett JC, Fry B, Maller J, Daly MJ (2005) Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21: 263–265.
- 37. Grossman SR, Shylakhter I, Karlsson EK, Byrne EH, Morales S, et al. A composite of multiple signals distinguishes causal variants in regions of positive selection. Science 327: 883–886.