Conceived and designed the experiments: SDJP GSK. Performed the experiments: MFM JPG FSGK FK LAVM MRM JAP CR VAS. Analyzed the data: SDJP GSK MHH AKCRS EBO MARS. Contributed reagents/materials/analysis tools: GDP RCM MOM MEAM FRS. Wrote the manuscript: SDJP GSK.
The authors have declared that no competing interests exist.
Based on pre-DNA racial/color methodology, clinical and pharmacological trials have traditionally considered the different geographical regions of Brazil as being very heterogeneous. We wished to ascertain how such diversity of regional color categories correlated with ancestry. Using a panel of 40 validated ancestry-informative insertion-deletion DNA polymorphisms, we estimated individually the European, African and Amerindian ancestry components of 934 self-categorized White, Brown or Black Brazilians from the four most populous regions of the Country. We uncovered great ancestral diversity between and within the different regions. In particular, color categories in the northern part of Brazil diverged significantly in their ancestry proportions from their counterparts in the southern part of the Country, indicating that diverse regional semantics were being used in the self-classification as White, Brown or Black. To circumvent these regional subjective differences in color perception, we estimated the general ancestry proportions of each of the four regions in a form independent of color considerations. To do so, we multiplied the proportions of a given ancestry in a given color category by the official census information about the proportion of that color category in the specific region, to arrive at a “total ancestry” estimate. Once such a calculation was performed, there emerged a much higher level of uniformity than previously expected. In all regions studied, the European ancestry was predominant, with proportions ranging from 60.6% in the Northeast to 77.7% in the South. We propose that the immigration of six million Europeans to Brazil in the 19th and 20th centuries, a phenomenon described and intended as the “whitening of Brazil”, is in large part responsible for dissipating previous ancestry dissimilarities that reflected region-specific population histories.
These findings, of both clinical and sociological importance for Brazil, should also be relevant to other countries with ancestrally admixed populations.
Continental populations of the world vary considerably in their predisposition to diseases and in the allele frequencies of important pharmacogenetic loci, probably as a result of genetic drift, but also because of adaptation to local selective factors such as climate and available nutrients. In many countries, skin color has traditionally been used in clinical and pharmacological studies as a phenotypic proxy for geographical ancestry. Brazil is no exception.
The Brazilian population was formed by extensive admixture from three different ancestral roots: Amerindians, Europeans and Africans. This resulted in a great variability of skin pigmentation, with no discontinuities between Black and White. For instance, in a single small fishing village in Brazil, Harris and Kotak
However, the Instituto Brasileiro de Geografia e Estatística (IBGE), which is responsible for the official census of Brazil, has employed only a few pre-established color categories, which are based on self-classification. Since 1991 they have numbered five: White (“branca”), Brown (“parda”), Black (“preta”), Yellow (“amarela”) and Indigenous (“indígena”). Brown (“pardo”) emerged as a synthesis of a variety of classifications, such as “caboclo”, “mulato”, “moreno”, “cafuzo”, and other denominations that express the admixed character of the Brazilian population
In general, there is academic support for the IBGE classification system, which is the only source of information on color categories at a national level
In 2008 IBGE ascertained a population of
With an area of 8,511,960 km², Brazil has a territory of continental size (the fifth largest in the world), and different regions have diverse population histories. For instance, the North had a large influence of the Amerindian root, the Northeast had a history of strong African presence due to slavery, and the South was mostly settled by European immigrants. These different compositions were quite evident in our studies of mtDNA haplotypes of White Brazilians
When we look into the Brazilian census data on the proportion of each color category according to region, we indeed can see noticeable differences (
Region | State | Population (× 10³) | White | Brown | Black |
Brazil | | 189,953 | 92,003 (48.43%) | 83,196 (43.80%) | 12,987 (6.84%) |
North | Pará | 7,367 (3.88%) | 1,530 (20.77%) | 5,374 (72.95%) | 398 (5.40%) |
Northeast | Ceará | 8,472 (4.46%) | 2,800 (33.05%) | 5,370 (63.39%) | 257 (3.03%) |
Northeast | Bahia | 14,560 (7.67%) | 2,999 (20.60%) | 9,149 (62.84%) | 2,328 (15.99%) |
Southeast | Rio de Janeiro | 16,203 (8.53%) | 8,509 (52.51%) | 5,302 (32.72%) | 2,328 (14.37%) |
South | Santa Catarina | 6,091 (3.21%) | 5,297 (86.96%) | 608 (9.98%) | 160 (2.63%) |
South | Rio Grande do Sul | 10,856 (5.72%) | 8,776 (80.84%) | 1,495 (13.77%) | 529 (4.87%) |
The Population column shows the total population of Brazil and the population of each state, expressed in absolute values (in thousands) and, in parentheses, as a percentage of the total for the whole Country. The color-category columns likewise give absolute numbers and, in parentheses, the percentage of that state's population self-categorized in that category. The percentages for Whites, Browns and Blacks do not add to 100% because each State has individuals who belong to color categories that are distinct from the ones shown. Data obtained from
We have already shown that a set of 40 short insertion-deletion (indel) polymorphisms was sufficient for an adequate characterization of human population structure at the global level
In the present study we used these loci to estimate the Amerindian, European and African genomic ancestry of 934 Brazilians from the four most populous geographical regions of the Country, self-categorized as White, Brown and Black.
In the present work we established the genotype of 934 self-classified White, Brown or Black Brazilians at 40 autosomal short insertion-deletion polymorphisms (indels) dispersed in the human genome. The allele frequencies at these loci are shown in
The regions with a square label were analyzed in this work. The cities and respective states where the samples were collected are shown with a star.
Each point represents a separate individual and the ancestral proportions can be determined by dropping a line parallel to each of the three axes. The graphs were drawn using the Tri-Plot program
Region (State) | Ancestral root | White mean | White s.e. | Brown mean | Brown s.e. | Black mean | Black s.e. | Color-independent “Total Ancestry” |
North (Pará) | European | 0.782 | 0.026 | 0.686 | 0.034 | 0.524 | 0.031 | 0.697 |
North (Pará) | African | 0.077 | 0.011 | 0.106 | 0.016 | 0.275 | 0.023 | 0.109 |
North (Pará) | Amerindian | 0.141 | 0.022 | 0.209 | 0.030 | 0.201 | 0.026 | 0.194 |
Northeast (Bahia) | European | 0.668 | 0.037 | 0.603 | 0.060 | 0.539 | 0.034 | 0.606 |
Northeast (Bahia) | African | 0.244 | 0.033 | 0.308 | 0.057 | 0.359 | 0.014 | 0.303 |
Northeast (Bahia) | Amerindian | 0.088 | 0.012 | 0.089 | 0.020 | 0.101 | 0.031 | 0.091 |
Northeast (Ceará) | European | 0.758 | 0.032 | 0.728 | 0.029 | N.S. | N.S. | |
Northeast (Ceará) | African | 0.133 | 0.017 | 0.144 | 0.021 | N.S. | N.S. | |
Northeast (Ceará) | Amerindian | 0.109 | 0.021 | 0.128 | 0.015 | N.S. | N.S. | |
Southeast (Rio de Janeiro) | European | 0.861 | 0.016 | 0.675 | 0.028 | 0.427 | 0.032 | 0.737 |
Southeast (Rio de Janeiro) | African | 0.074 | 0.011 | 0.238 | 0.025 | 0.495 | 0.032 | 0.189 |
Southeast (Rio de Janeiro) | Amerindian | 0.065 | 0.007 | 0.087 | 0.012 | 0.079 | 0.009 | 0.074 |
South (Rio Grande do Sul) | European | 0.855 | 0.021 | 0.442 | 0.037 | 0.431 | 0.062 | 0.777 |
South (Rio Grande do Sul) | African | 0.053 | 0.019 | 0.444 | 0.035 | 0.459 | 0.052 | 0.127 |
South (Rio Grande do Sul) | Amerindian | 0.093 | 0.006 | 0.114 | 0.016 | 0.110 | 0.026 | 0.096 |
South (Santa Catarina) | European | N.S. | N.S. | N.S. | N.S. | 0.293 | 0.031 | |
South (Santa Catarina) | African | N.S. | N.S. | N.S. | N.S. | 0.596 | 0.030 | |
South (Santa Catarina) | Amerindian | N.S. | N.S. | N.S. | N.S. | 0.111 | 0.012 | |
N.S. = Not studied.
The most evident diversity in the ancestral Amerindian, European and African proportions of the different color categories, both between and within the different regions of Brazil, was seen in individuals self-assessed as Brown. For instance, in the North (Belém, PA) self-classified Brown individuals had, on average, 68.6% European ancestry, followed by 20.9% Amerindian ancestry and 10.6% African ancestry, while in the South they had, on average, 44.2% European, 11.4% Amerindian and 44.4% African ancestries.
To estimate the significance of the pairwise differences observed between the samples of individuals self-classified as Brown in diverse regions, we used a specially designed Monte Carlo randomization test of the distance D between the means, described in detail in the
Since we have six comparisons, we need to control for type I error. Applying the Bonferroni correction
Since both the census proportions of each color category and the trihybrid ancestry of Brazilians vary according to region, we decided to merge the two sets of data and estimate what we have called the “total ancestry” of a given region. This has the advantage of circumventing the different regional semantics of what it means “to be” White, Brown or Black. To calculate the total ancestry we multiply the proportion of a given ancestry in each color category by the census proportion of that color category in the specific region and sum the products over the color categories, arriving at an ancestry estimate that is independent of color.
In order to show how the calculation of the “total ancestry” was done, let us take the example of European ancestry in the North region (state of Pará) using the data from
The “total ancestry” estimates thus calculated for all regions are shown in the rightmost column of
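As a worked illustration of this calculation, the following minimal sketch combines the census proportions for Pará with the mean European ancestry of each color category, both taken from the tables above. The function name `total_ancestry` is hypothetical, and the rescaling of the census proportions so that they sum to one is an assumption of this sketch (the three categories shown do not cover the entire population); under that assumption the result agrees with the tabulated value.

```python
# Census proportions of each color category in Pará (from the census table)
census_share = {"White": 0.2077, "Brown": 0.7295, "Black": 0.0540}

# Mean European ancestry per color category in Pará (from the ancestry table)
european_mean = {"White": 0.782, "Brown": 0.686, "Black": 0.524}


def total_ancestry(ancestry_by_color, share_by_color):
    """Census-weighted mean of one ancestry component across color categories.

    Assumption of this sketch: census shares are rescaled to sum to 1,
    because other color categories exist but were not sampled.
    """
    weight_sum = sum(share_by_color.values())
    weighted = sum(
        ancestry_by_color[color] * share_by_color[color]
        for color in share_by_color
    )
    return weighted / weight_sum


print(round(total_ancestry(european_mean, census_share), 3))  # 0.697
```

The same function applied to the African and Amerindian means of a region gives the other two components of that region's total ancestry.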
The results obtained showed that there is in fact a smaller level of variability between the different regions than had been observed in the census data of color categories or in the ancestry proportions of the different color classes (
We here present results of the molecular estimation of the European, African, and Amerindian ancestry in 934 individuals belonging to different color categories and originating from four regions of Brazil (
In a previous publication
Another important observation is the considerable variability in the ancestry of color categories in different regions, most manifest in Brown and Black individuals. For instance, self-classified Brown individuals from the North had on average 68.6% European ancestry, while in the South they had on average 44.4% African ancestry. Also, for individuals self-classified as Black we can see considerable, but highly discrepant, levels of European ancestry, varying from 29.3% in Santa Catarina to 53.9% in Bahia. The most uniform category was that of individuals self-classified as White, who consistently had a predominant European ancestry, varying from 66.8% in Bahia (BA) to 85.5% in Rio Grande do Sul and 86.1% in Rio de Janeiro.
It is noteworthy that the regional subjective differences in color perception revealed by our ancestry analysis appear to run counter to expectations based on pre-genomic racial/color methodology. For instance, Osorio
One possible explanation for this might be the effect of darker pigmentation by sun exposure. Jablonski and Chaplin
Independent of the reason, it is evident that ancestrally people who are White, Brown or Black in the northern part of Brazil are different from their counterparts in the southern part of the Country. This shows that, as has been pointed out before
To avoid relying on color categories, we estimated the general ancestry proportions of the different regional samples independent of color categories. To do that, we multiplied the proportions of a given ancestry in a given color category by the census proportion of that color category in the specific region, to arrive at an ancestry estimate independent of color. Once such a correction was performed on the basis of the relative proportion of Amerindian, European and African ancestries, there emerged a higher level of uniformity than expected. In all regions studied the European ancestry was predominant, with proportions ranging from 60.6% in the Northeast to 77.7% in the South (
Each point represents a separate region, as follows: (1) North (Pará), (2) Northeast (Bahia), (3) Southeast (Rio de Janeiro) and (4) South (Rio Grande do Sul). The graph was drawn using the Tri-Plot program
This is novel genetic information about the Brazilian people that needs to be placed in a historical and phylogeographical context. First, we will compare these results with our previous observations with uniparental genetic markers in Brazilians.
We earlier examined DNA polymorphisms in the non-recombining portion of the Y-chromosome and in the hypervariable region of mitochondrial DNA (mtDNA) in the four main regions of the Country (the same four regions analyzed in the present paper, although with samplings from different states). The vast majority of Y-chromosomes, independent of the region, proved to be of European origin
The proportions of Amerindian and African maternal ancestry were higher in the previous investigation using mtDNA than in the regional total ancestry averages calculated in the present study using biparental markers. However, it is interesting to note that both studies agree in that the highest level of Amerindian ancestry could be found in the North region (54% for mtDNA; 19.4% in the present study) and the highest level of African ancestry belonged to the Northeast region (44% for mtDNA; 30.3% in the present study), exactly as expected from known historical and anthropological studies of Brazilians
As mentioned previously, Brazil is the home of genetically heterogeneous people, the product of five centuries of admixture between Amerindians, Europeans and Africans. However, such admixture has occurred in a sexually asymmetric fashion, as a result of the colonization model employed by the Portuguese. Indeed, we know that few women came from Portugal to Brazil in the period from the arrival of the Europeans in 1500 until 1808, when the Portuguese Court fled the Napoleonic invasion of the Iberian Peninsula and relocated to Rio de Janeiro
Initially, the whole population was composed of indigenous Amerindians. Little is known about their number when the Portuguese arrived in 1500
The slave traffic started in the middle of the 16th century, extending until 1850 and resulting in the forced relocation of an estimated 4 million Africans to Brazil
Let us take, as a generic example, the mating of a white European male with a Black African slave woman in Brazil. Because of the Brazilian social race identification system based primarily on phenotype, the children with dark skin pigmentation and other African iconic individual components of color would be considered Black, while those with light colored skin and other European iconic individual components of color would be considered White, even though they would have exactly the same proportion of African and European alleles
It is worth noting that 1.72 million slaves (42.9% of the total) arrived in Brazil during the first half of the 19th century, a time by which the number of Amerindians in Brazil had dwindled due to strife and/or European-borne disease. Most likely, the main contribution of Amerindians to the formation of the Brazilian people occurred in the first two or at most three centuries of its colonization, no longer being of high importance in the early 19th century, when larger and larger portions of Brazilians moved from rural areas to the cities. Since Africans (up until 1850) and Europeans (up until the 20th century) continued to arrive in Brazil and to participate in the gene pool, the Amerindian ancestry component was diluted across color lines to the levels that we observe presently, but without losing its mtDNA representation because of the sexual asymmetry of the relationships. The resulting highly admixed Brazilian population can be assessed by the proportions of the color categories in the first Brazilian census, in 1872, which were 19.7% Black, 42.2% Brown and 38.1% White.
In 1850, the forced arrival of Africans stopped due to prohibition of the slave trade. At the same time the Government started a campaign to stimulate the immigration of Europeans to Brazil. This process, which has been called the “Whitening of Brazil”, had complex economic and sociological causes and was tinged with racist ideology
This huge demographic event is probably responsible for the noteworthy dissipation of previously established regional differences in ancestries, as the European component of ancestry became uniformly preponderant, with similar proportions of 69.7%, 60.6%, 73.7% and 77.7% in the North, Northeast, Southeast and South, respectively.
How can we explain why no similar wash-out occurred with respect to matrilineal ancestry? We believe that the regional disparities in mtDNA ancestry were maintained because, once again, in the immigration wave of Europeans there was a significant excess of males. When they admixed with Brazilian women there was rapid Europeanization of the genomic ancestry, but preservation of the established matrilineal pattern. There is demographic information to corroborate this possibility. First, of 1,222,282 immigrants from all origins that arrived in the Port of Santos in the period 1908–1936 the sex ratio (males/females) was 1.76
Understanding the heterogeneity and admixture of Brazilians within and between geographical regions has important clinical implications for the design and interpretation of clinical trials, the practice of clinical genetics and genomic medicine, the implementation of pharmacogenetic knowledge in drug prescription, and the extrapolation of data from other, more homogeneous populations.
Let us take the case of VKORC1, a key enzyme of the vitamin K cycle that is a molecular target of the coumarin anticoagulant warfarin. Polymorphisms of the
We genotyped the
Another example was provided by the
These results show that the heterogeneity of our population cannot be adequately represented by arbitrary “race/color” categories. In a pharmacogenetic context, this implies that each person must be treated as an individual rather than as an “exemplar of a color group”
Based on traditional demographic racial/color methodology, clinical and pharmacological trials in Brazil have usually considered the different regions of the Country as very heterogeneous. Our results show that, when viewed in the light of molecular population genetics, these classical paradigms are inadequate, since the genomic ancestry of individuals from different geographical regions of Brazil is more uniform than expected.
Our results have considerable sociological relevance for Brazil, because the race question presently figures prominently in Brazilian political life
The relevance of our work also extends beyond Brazil's borders. Because of its heterogeneous Amerindian, European and African ancestral roots, Brazil has been an important model for the population genetics and pharmacogenetics of admixed populations. Our article demonstrates how critical it is to use genomic tools to reevaluate and modernize previous regional population models established using conventional demographic, anthropological and sociological studies. The same approach should also be applied to other countries that contain ancestrally admixed populations.
The Research Ethics Committee of the Instituto Nacional do Câncer (INCA) approved on July 15, 2005 the protocol of the study “Characterization of polymorphisms of pharmacogenetic interest and correlation with genetics ancestry” as well as the written Informed Consent form. On August 11, 2008 the Research Ethics Committee of the Instituto Nacional do Câncer (INCA) approved the enlargement of the study and carried forward the approval of the written Informed Consent form. The samples were anonymized after collection.
We studied 934 unrelated Brazilians from different geographical regions of Brazil (
The North region was represented by 203 unrelated, healthy individuals (92 men, 111 women) from the Amazonian state of Pará (PA -
Two different samples were collected in the Northeast region: (i) 82 individuals were ascertained from healthy students and work personnel at the University of Ceará, in Fortaleza, Ceará (CE -
The Southeast sample was made up of 264 unrelated, healthy individuals (162 men, 102 women) from the state of Rio de Janeiro (RJ -
Two different samples were obtained in the South region: (i) 189 individuals ascertained from blood donors in Porto Alegre, Rio Grande do Sul (RS -
DNA from each individual was independently typed for the following 40 biallelic short insertion/deletion polymorphisms (indels): MID-1 (rs3917), MID-15 (rs4181), MID-17 (rs4183), MID-51 (rs16343), MID-89 (rs16381), MID-107 (rs16394), MID-131 (rs16415), MID-132 (rs16416), MID-150 (rs16430), MID-159 (rs16438), MID-170 (rs16448), MID-258 (rs16695), MID-278 (rs16715), MID-420 (rs140709), MID-444 (rs140733), MID-468 (rs140757), MID-470 (rs140759), MID-663 (rs1305047), MID-788 (rs1610874), MID-857 (rs1610942), MID-914 (rs1610997), MID-918 (rs1611001), MID-1002 (rs1611084), MID-1092 (rs2067180), MID-1100 (rs2067188), MID-1129 (rs2067217), MID-1291 (rs2067373), MID-1352 (rs2307548), MID-1428 (rs2307624), MID-1537 (rs2307733), MID-1549 (rs2307745), MID-1586 (rs2307782), MID-1642 (rs2307838), MID-1654 (rs2307850), MID-1759 (rs2307955), MID-1763 (rs2307959), MID-1847 (rs2308043), MID-1861 (rs2308057), MID-1943 (rs2308135), MID-1952 (rs2308144). In this list, the MID number relates to the nomenclature of Weber et al.
This set of 40 indels was previously validated as useful in ancestry estimation through the study of the HGDP-CEPH Diversity Panel
To estimate the proportion of Amerindian, European and African ancestry in each Brazilian, we applied a model-based clustering algorithm using the
Triangular graphs of the genomic proportions of Amerindian, European and African ancestry of each individual were obtained using the Tri-Plot program
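For readers who want to reproduce such triangular graphs without the Tri-Plot program, the sketch below shows one conventional way to map three ancestry proportions onto a point inside an equilateral triangle. It is an illustration only and not a description of how Tri-Plot works internally; the function name `ternary_to_xy` and the placement of the vertices (European at the lower left, African at the lower right, Amerindian at the top) are assumptions of this sketch.

```python
import math


def ternary_to_xy(european, african, amerindian):
    """Map three ancestry proportions to Cartesian coordinates.

    Assumed vertex placement (hypothetical, chosen for illustration):
    European = (0, 0), African = (1, 0), Amerindian = (0.5, sqrt(3)/2).
    Inputs are rescaled so that they sum to 1 before conversion.
    """
    total = european + african + amerindian
    african_share = african / total
    amerindian_share = amerindian / total
    x = african_share + 0.5 * amerindian_share
    y = amerindian_share * math.sqrt(3) / 2
    return x, y


# Total ancestry of the North (Pará), taken from the ancestry table
x, y = ternary_to_xy(0.697, 0.109, 0.194)
```

A point close to the lower-left vertex therefore corresponds to predominantly European ancestry, which is where the four regional total-ancestry estimates fall under this vertex placement.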
For statistical testing of the proportions of European, African and Amerindian ancestry in the different samples we developed a Monte Carlo resampling method, which has the advantage of being completely non-parametric
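The general idea of a non-parametric randomization test on the distance between two sample means can be sketched as follows. This is a generic illustration and not the authors' implementation: the use of Euclidean distance for D, the shuffling of group labels, the number of iterations, and all function names are assumptions of this sketch.

```python
import math
import random


def mean_vector(rows):
    """Component-wise mean of a list of (European, African, Amerindian) tuples."""
    n = len(rows)
    return [sum(row[k] for row in rows) / n for k in range(len(rows[0]))]


def distance(sample_a, sample_b):
    """Euclidean distance D between the mean ancestry vectors (assumed metric)."""
    mean_a, mean_b = mean_vector(sample_a), mean_vector(sample_b)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(mean_a, mean_b)))


def randomization_test(sample_a, sample_b, n_iter=2000, seed=0):
    """Return the observed D and a Monte Carlo p-value.

    Individuals are pooled and randomly reassigned to two groups of the
    original sizes; the p-value is the share of shuffles whose D is at
    least as large as the observed one (with the usual +1 correction).
    """
    rng = random.Random(seed)
    observed = distance(sample_a, sample_b)
    pooled = list(sample_a) + list(sample_b)
    size_a = len(sample_a)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        if distance(pooled[:size_a], pooled[size_a:]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_iter + 1)
```

Because no distributional form is assumed for the ancestry proportions, a test of this kind remains valid for bounded, skewed data such as individual ancestry estimates. When several pairwise comparisons are made, the resulting p-values still need a multiple-testing correction such as Bonferroni.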