Fig 1.
HLA sampling (A) is much more widespread than that of the 1,000 Genomes Project (B), which represents the current reference panel of human genome-wide diversity.
A. Circles represent the populations that were HLA typed. Colors and size of the circles correspond to the number of common alleles observed in each population (scales in the bottom left corner). B. Circles represent the region of origin for each of the 26 populations of the 1,000 Genomes Project. Each population has a color that corresponds to the associated geographical region: Europe (blue), Africa (yellow), Americas (red), South Asia (purple), and East Asia (green). A-B. Maps were generated using the ggplot2 package in R [16] and the world database.
Fig 2.
The HLA diversity in the 1,000 Genomes Project panel represents only 78% of the expected diversity for alleles with a frequency >1%.
The top part of the figure shows the percentage match (Y axis) between the expected HLA diversity at different frequency cutoffs (X axis) and the HLA diversity observed in the 1,000 Genomes Project panel. For each frequency cutoff, the number of expected alleles is displayed at the top of each histogram. The bottom part of the figure displays the same information on a locus-by-locus basis for the 1% cutoff.
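As a rough illustration of the percentage match described in this legend, the following minimal sketch counts how many of the alleles expected above a given frequency cutoff are also observed in a panel. The function name `percent_match`, the toy allele frequencies, and the reading of "expected" as "frequency strictly above the cutoff" are all assumptions made for illustration; they are not taken from the original analysis.

```python
def percent_match(expected_freqs, observed_alleles, cutoff):
    """Return (percent of expected alleles also observed, number of expected alleles).

    expected_freqs: dict mapping allele name -> frequency (assumed input format).
    observed_alleles: iterable of allele names seen in the panel.
    cutoff: frequency threshold; alleles strictly above it count as expected.
    """
    expected = {allele for allele, freq in expected_freqs.items() if freq > cutoff}
    if not expected:
        # No allele passes the cutoff, so the match is undefined.
        return float("nan"), 0
    matched = expected & set(observed_alleles)
    return 100.0 * len(matched) / len(expected), len(expected)


# Hypothetical toy data, not real HLA frequencies.
expected_freqs = {
    "A*01:01": 0.12,
    "A*02:01": 0.25,
    "B*07:02": 0.03,
    "B*15:01": 0.008,
}
observed = {"A*01:01", "A*02:01", "B*15:01"}

for cutoff in (0.01, 0.05):
    pct, n_expected = percent_match(expected_freqs, observed, cutoff)
    print(f"cutoff >{cutoff:.0%}: {pct:.1f}% match ({n_expected} expected alleles)")
```

Running the same function on the alleles of a single locus would give the locus-by-locus view shown in the bottom part of the figure.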
Fig 3.
The common HLA alleles that are missing in the 1,000 Genomes Project panel define a worldwide distribution with variable frequencies that range from low (1–2.5%) to high (>10%).
A. Worldwide distribution of the populations harboring alleles that are missing in the 1,000 Genomes Project. Colors and size of the circles correspond to the number of common alleles observed in each population (scale in the bottom left corner). The map was generated using the ggplot2 package in R [16] and the world database. B. Geographical distribution of the alleles that are missing in the 1,000 Genomes Project. For each region, the number of distinct alleles is given together with their maximum frequencies.