Figure 1.

Population structure of Caribbean and neighboring populations.

A) Areas in red indicate countries of origin of newly genotyped admixed population samples and blue circles indicate new Venezuelan (underlined) and other previously published Native American samples. B) Principal Component Analysis and C) ADMIXTURE [12] clustering analysis using the high-density dataset containing approximately 390 K autosomal SNP loci in common across admixed and reference panel populations. Unsupervised models assuming K = 3 and K = 8 ancestral clusters are shown. At K = 3, Caribbean admixed populations show extensive variation in continental ancestry proportions among and within groups. At K = 8, sub-continental components show differential proportions in recently admixed individuals. A Latino-specific European component accounts for the majority of the European ancestry among Caribbean Latinos and is exclusively shared with Iberian populations within Europe. Notably, this component is different from the two main gradients of ancestry differentiating southern from northern Europeans. Native Venezuelan components are present in higher proportions in admixed Colombians, Hondurans, and native Mayans.

Figure 2.

Diagram of the analytical strategy used for reconstructing migration history and sub-continental ancestry in admixed genomes.

The starting point consists of genome-wide SNP data from family trios. Unrelated individuals are used to estimate global ancestry proportions with ADMIXTURE, whereas full trios are selected for BEAGLE phasing and PCA-based local ancestry estimation using continental reference samples. From here, two orthogonal analyses are performed: 1) Ancestry-specific regions of the genome are masked to separately apply PCA to European, African, and Native American haplotypes combined with large sub-continental reference panels of putative ancestral populations. We refer to this methodology as ancestry-specific PCA (ASPCA) and the code is packaged into the software PCAmask. 2) Continental-level local ancestry calls are used to estimate the tract length distribution per ancestry and population, which is then leveraged to test different demographic models of migration using Tracts software.
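The masking step in analysis 1 can be illustrated with a minimal sketch. It assumes one phased haplotype stored as a list of 0/1 alleles and a parallel list of per-site continental ancestry calls. The function names and the labels "EUR", "AFR", and "NAT" are hypothetical and do not reflect the PCAmask interface.

```python
from typing import List, Optional


def mask_to_ancestry(haplotype: List[int],
                     local_ancestry: List[str],
                     target: str) -> List[Optional[int]]:
    """Keep alleles at sites called as `target` ancestry; mask all others as None."""
    if len(haplotype) != len(local_ancestry):
        raise ValueError("haplotype and ancestry calls must have equal length")
    return [allele if call == target else None
            for allele, call in zip(haplotype, local_ancestry)]


def fraction_retained(masked: List[Optional[int]]) -> float:
    """Share of sites that survive masking (0.0 for an empty haplotype)."""
    if not masked:
        return 0.0
    return sum(allele is not None for allele in masked) / len(masked)


# Toy haplotype with hypothetical ancestry calls per site.
hap = [0, 1, 1, 0, 1, 0]
calls = ["EUR", "EUR", "AFR", "AFR", "NAT", "EUR"]
eur_only = mask_to_ancestry(hap, calls, "EUR")
```

The masked haplotype would then be combined with the sub-continental reference panel in a PCA that tolerates missing data; that step is not shown here.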

Figure 3.

Demographic reconstruction since the onset of admixture in the Caribbean.

We used the length distribution of ancestry tracts within each population from A) insular and B) mainland Caribbean countries of origin. Scatter data points represent the observed distribution of ancestry tracts, and solid-colored lines represent the distribution predicted by the model, with shaded areas indicating 68.3% confidence intervals. We used Markov models implemented in Tracts to identify which demographic model best fits the observed data. Insular populations are best modeled when allowing for a second pulse of African ancestry, and mainland populations when a second pulse of European ancestry is allowed. Admixture time estimates (in number of generations ago), migration events, volume of migrants, and ancestry proportions over time are given for each population under the best-fitting model. The estimated onset of admixture among insular populations is consistently older (16–17 generations ago) than that among mainland populations (14 generations ago).
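The observed tract length distribution that serves as input to this kind of model fitting can be sketched as follows. The sketch assumes local ancestry calls for one haplotype are available as (start_cM, end_cM, ancestry) segments on a single chromosome. The segment format, function names, and bin width are illustrative assumptions, not the Tracts input format or API.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Segment = Tuple[float, float, str]  # (start_cM, end_cM, ancestry label)


def tract_lengths(segments: List[Segment]) -> Dict[str, List[float]]:
    """Merge contiguous same-ancestry segments and return tract lengths per ancestry."""
    lengths: Dict[str, List[float]] = {}
    current: Optional[Segment] = None
    for start, end, ancestry in sorted(segments):
        if current is not None and current[2] == ancestry and current[1] == start:
            # Same ancestry continues without a gap: extend the open tract.
            current = (current[0], end, ancestry)
        else:
            if current is not None:
                lengths.setdefault(current[2], []).append(current[1] - current[0])
            current = (start, end, ancestry)
    if current is not None:
        lengths.setdefault(current[2], []).append(current[1] - current[0])
    return lengths


def length_histogram(lengths: List[float], bin_width_cm: float) -> Counter:
    """Count tracts per length bin; bin i covers [i * width, (i + 1) * width)."""
    return Counter(int(length // bin_width_cm) for length in lengths)


# Toy segments: two adjacent AFR calls form a single 30 cM tract.
segments = [(0.0, 10.0, "AFR"), (10.0, 30.0, "AFR"),
            (30.0, 35.0, "EUR"), (35.0, 100.0, "NAT")]
per_ancestry = tract_lengths(segments)
```

Pooling such histograms across haplotypes gives, per ancestry and population, the kind of scatter points plotted in the figure.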

Table 1.

Models of Migration into the Caribbean after the advent of admixture.

Figure 4.

Sub-continental origin of Native American components in the Caribbean.

A) Ancestry-specific PCA analysis restricted to Native American segments from admixed Caribbean individuals (colored circles) and a reference panel of indigenous populations (gray symbols) from [11], grouped by sampling location. Darker symbols denote countries of origin with populations clustering closer to our Caribbean samples. Indigenous Colombian populations were classified into East and West of the Andes to ease the interpretation of their differential clustering in ASPCA. Population labels are shown for samples defining PC axes and representative clusters within locations. B) ADMIXTURE model for K = 16 ancestral clusters considering additional Latino samples, a representative subset of African and European source populations, and 52 Native American populations from [11], plus three additional Native Venezuelan tribes genotyped for this project. Vertical thin bars represent individuals and white spaces separate populations. Native American populations from [11] are grouped according to linguistic families reported therein. Labels are shown for the populations representing the 12 Native American clusters identified at K = 16. Clusters involving multiple populations are identified by those with the highest membership values. C) Map showing the major indigenous components shared across the Caribbean basin as revealed by ADMIXTURE at K = 16 from B): Mesoamerican (blue), Chibchan (yellow), and South American (green). Colored bars represent individuals and their approximate sampling locations. Bars pooling genetically similar individuals from more than one population are plotted from left to right following north to south coordinates as listed by population labels. Guarani, Wichi, and Chane from northern Argentina are pooled with Arara, but only the location of the latter is shown to allow a zoomed view of the Caribbean region (see [11] for the full map of sampling locations). The thick arrow schematically represents the most widely accepted origin of the Arawak expansion from South America into the Greater Antilles around 2,500 years ago according to linguistic and archaeological evidence [30]. Asterisks next to population labels denote Arawakan populations included in our reference panel. The thin arrow indicates gene flow between South America and Mesoamerica, possibly following a coastal or maritime route, accounting for the Mayan mixture and supporting pre-Columbian back migrations across the Caribbean.

Figure 5.

Sub-continental origin of European haplotypes derived from admixed genomes.

ASPCA is applied to haploid genomes with >25% European ancestry derived from insular Caribbean (black symbols) and mainland populations (gray symbols) combined with a reference panel (colored labels) of 1,387 POPRES European samples with four grandparents from the same country [15], and 54 additional Iberian individuals (in yellow) from [24]. PC1 values have been inverted and axes rotated 16 degrees counterclockwise to approximate the geographic orientation of population samples over Europe. Population codes are detailed in Table S1 and regions within Europe are labeled as in [16]. Inset map: countries of origin for POPRES samples color-coded by region (areas not sampled in gray and Switzerland in an intermediate shade of green to denote shared membership with EUR W, EUR C, and EUR S). Most Latino-derived European haplotypes cluster around the Iberian cluster. One of the two Haitian individuals included in the analysis clustered with French-speaking Europeans (black arrow), in agreement with the colonial history of Haiti and illustrating the fine-scale resolution of our ASPCA approach.
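The plot orientation described in this caption (inverting PC1, then rotating by 16 degrees) can be written as a small coordinate transform. The sketch below makes two assumptions that the caption does not spell out: the sign flip is applied before the rotation, and the rotation moves the plotted points counterclockwise about the origin. The function name is hypothetical.

```python
import math
from typing import Tuple


def orient_pcs(pc1: float, pc2: float, angle_deg: float = 16.0) -> Tuple[float, float]:
    """Flip the sign of PC1, then rotate the point counterclockwise by angle_deg."""
    x = -pc1  # inversion of PC1 (assumed to come first)
    y = pc2
    theta = math.radians(angle_deg)
    # Standard 2D rotation of a point about the origin.
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))


# A sample lying on the negative PC1 axis ends up 16 degrees above the positive x axis.
x_new, y_new = orient_pcs(-1.0, 0.0)
```

The transform only changes how the plot is oriented; distances between samples, and therefore the clustering, are unaffected.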

Figure 6.

Sub-continental origin of Afro-Caribbean haplotypes of different sizes.

A) Map of West Africa showing locations of reference panel populations. Samples in black are more likely to represent the origin of short ancestry tracts and those in red of long ancestry tracts, according to B) assignment probabilities for each putative ancestral population of being the source for short (<50 cM in black) and long (>50 cM in red) ancestry tracts. African ancestry tracts for Puerto Ricans are shown and results for all populations are available in Figure S16. C) Proportion of African ancestry of inferred Mandenka origin as a function of block size in the combined set of Caribbean genomes. By running PCAdmix within the previously inferred African segments, we obtained posterior probabilities for Mandenka versus Yoruba ancestry. Overall, we found evidence for a differential origin of the African lineages in present-day Afro-Caribbean genomes, with shorter (and thus older) ancestry tracts tracing back to Far West Africa (represented by Mandenka and Brong), and longer (and thus younger) tracts tracing back to Central West Africa.
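The size-stratified summary in panel C can be sketched under simple assumptions. Each African tract is given as a pair (length in cM, posterior probability of Mandenka rather than Yoruba origin), tracts are split at the 50 cM threshold from the caption, and the Mandenka share in each class is a length-weighted mean. The weighting scheme, the handling of tracts of exactly 50 cM, and the function name are assumptions, not the authors' procedure.

```python
from typing import Dict, List, Optional, Tuple

Tract = Tuple[float, float]  # (length_cM, posterior probability of Mandenka origin)


def mandenka_share_by_size(tracts: List[Tract],
                           threshold_cm: float = 50.0) -> Dict[str, Optional[float]]:
    """Length-weighted mean Mandenka posterior for short and long tracts."""
    weighted = {"short": 0.0, "long": 0.0}
    total = {"short": 0.0, "long": 0.0}
    for length, p_mandenka in tracts:
        # Assumption: a tract of exactly threshold_cm is counted as long.
        size_class = "short" if length < threshold_cm else "long"
        weighted[size_class] += length * p_mandenka
        total[size_class] += length
    # None marks a size class that contains no tracts.
    return {k: (weighted[k] / total[k] if total[k] > 0 else None) for k in total}


# Toy input: short tracts lean Mandenka, long tracts lean Yoruba.
toy_tracts = [(10.0, 0.8), (30.0, 0.6), (60.0, 0.2), (90.0, 0.4)]
shares = mandenka_share_by_size(toy_tracts)
```

Using finer length bins instead of two classes would give a curve of Mandenka share against block size, analogous to the one plotted in panel C.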
