Dermatoglyphics from All Chinese Ethnic Groups Reveal Geographic Patterning

Completion of a survey of dermatoglyphic variables for all ethnic groups in an ethnically diverse country like China is a huge research project, and an achievement that anthropological and dermatoglyphic scholars in the country could once only dream of. However, through the endeavors of scientists in China over the last 30 years, the dream has become reality. This paper reports the results of a comprehensive analysis of dermatoglyphics from all ethnic groups in China. Using cluster analysis and principal component analysis of dermatoglyphics, it has been found that Chinese populations can be generally divided into a southern group and a northern group. Furthermore, there has been considerable debate about the origins of many Chinese populations and about proper assignment of these peoples to larger ethnic groups. In this paper, we suggest that dermatoglyphic data can inform these debates by helping to classify a Chinese population as a northern or southern group, using selected reference populations and quantitative methods. This study is the first to assemble and investigate dermatoglyphics from all 56 Chinese ethnic groups. It is fortunate that data on population dermatoglyphics, a field of physical anthropology, have now been collected for all 56 Chinese ethnic groups, because intermarriage between individuals from different Chinese ethnic groups occurs more frequently in recent times, making population dermatoglyphic research an ever more challenging field of inquiry.


Introduction
Each person's set of fingerprints is different, but fingerprints for an individual remain stable over a lifetime. These characteristics have made fingerprints very useful as tools for law enforcement officials in many criminal cases. Fingerprints also vary considerably among different groups of people, and can be useful as tools for tracing individuals to particular populations. Because fingerprints are highly variable and genetically influenced, they have important significance for forensic science, anthropology, ethnology, genetics, and medicine [1,3,4].
Population dermatoglyphics is a field of research within physical anthropology. It focuses on the dermatoglyphics of different ethnic groups [1][2][3][4]. The investigation of population dermatoglyphics in China began in 1910 (Taiwan), and more than fifty papers on dermatoglyphics were published prior to 1971, though they reported on only a limited number of dermatoglyphic variables [5]. Only a small number of research projects on dermatoglyphics were carried out in Mainland China before 1964, and large-scale investigation and research on dermatoglyphics did not begin until 1977 [6][7][8][9][10]. Over the past 30 years, through the endeavors of many dermatoglyphic researchers in China, we have jointly completed a large research project on the dermatoglyphics of the Chinese people.
China has a population of 1.3 billion people, and a total of 56 different ethnic groups are recognized in the country [6,10,11,16,19]. The Han Chinese group has the greatest population with 1.2 billion members. We have now successfully completed an investigation and analysis of dermatoglyphics for all 56 Chinese ethnic groups. One result of this study has been a recognition that dermatoglyphics among Han Chinese show strong diversity. Table 1 lists the geographic area, sample size and published references for all populations studied in China. If a sample's abbreviation has an asterisk "*" after the name, it is a combined sample. In our study, an ethnic group may have samples from several populations, and the data from these populations are combined into one sample. The complete dataset of dermatoglyphic variables for the Chinese ethnic groups is listed in Table 2. This study is the first to assemble and investigate dermatoglyphics from all 56 Chinese ethnic groups.
Figure 1 shows the results of a cluster analysis performed on the 156 samples. These samples include 122 population samples, 31 combined samples [...]
Geographically, this can be treated as an area of transition between the southern group and the northern group, or as a mixed area. For the physical characters of dermatoglyphics, there is a process of gradual diffusion from south to north or from north to south. Migration and mixing of many ethnic groups are still restricted by geographical barriers.

Results from the Cluster Analysis of 56 Chinese Ethnic Groups
Northern group (72-154). This group contains 83 samples. Clusters 115-126 contain samples from southern China. Therefore, this cluster could be seen as a transition area between the northern group and the southern group. In the northern group, there are several ethnic groups from Xinjiang province (Kazak, Kirgiz, Uygur, Uzbek, Tatar, Tajik) and the Salar of Qinghai Province. These seven samples constitute a cluster by themselves. With the exception of the Salar, the fingerprint frequency of whorl (W) among these six Xinjiang samples is significantly lower than the frequency of loop (L) (p < 0.01), and the frequency of true pattern in the third interdigital area (III) of the hands is higher than 20%. Our Xinjiang samples express clear characters similar to those of the peoples of Central and Western Asia, and they could be treated singly as a "northwest group".
Africans and Caucasian Americans, working as outgroups, occupy clear and suitable positions on the cluster tree. The Gin-Vietnamese sample clusters in the southern Chinese group. Caucasian Americans first cluster with the Tajik and then cluster with the northwest samples. Africans form the most peripheral cluster.
Combined samples representing 31 ethnic groups are included in the cluster analysis. The frequencies of their dermatoglyphic variables were calculated using the population size of each population sample. The combined samples tend to show a general picture for a specific ethnic group.
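The weighting described above can be sketched as follows. This is only a minimal illustration, assuming that each combined frequency is the sample-size-weighted mean of the population frequencies; the function name `combine_frequencies` and the numbers are hypothetical and are not taken from Table 2.

```python
def combine_frequencies(samples):
    """Return the sample-size-weighted mean frequency (in percent).

    `samples` is a list of (sample_size, frequency_percent) pairs, one pair
    per population sample of the same ethnic group. Assumption: the combined
    value is formed by weighting each population by its sample size.
    """
    total_n = sum(n for n, _ in samples)
    if total_n == 0:
        raise ValueError("a combined sample needs at least one individual")
    return sum(n * freq for n, freq in samples) / total_n


# Hypothetical whorl (W) frequencies for two populations of one ethnic group.
whorl = [(400, 50.0), (600, 45.0)]
combined_whorl = combine_frequencies(whorl)
print(f"combined W frequency: {combined_whorl:.2f}%")
```

Under this assumption, the larger population pulls the combined value toward its own frequency, which is why a combined sample gives a general picture of the ethnic group rather than of any single locality.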
Sichuan, a province in southwestern China, is home to many minorities with large populations. Ten (Han-8, Miao-2, Qiang-1-2-*, Tibetan-6-3-4, Tujia, Yi-4) of 11 samples from this province cluster in the northern group (including the Qiang combined sample), with only one sample (Yi-1) clustering in the southern group. During the past three centuries, the population in Sichuan has increased from 100 thousand to 100 million. Most likely, Sichuan is a place of migration and fusion of peoples.
Han Chinese are represented by 16 samples (including combined samples): 4 samples (Han-4-6-11-9) cluster in the southern group and 12 samples (Han-8-10-1-2-14-7-12-*-15-13-5-3) cluster in the northern group. Han-2 and Han-14 are neighboring samples in the northern group on the cluster tree, although Han-2 was collected in the south and Han-14 in the north. Two samples (Han-6-9) were collected from the north but cluster in the southern group. Nine samples (Han-8-10-1-2-7-12-15-13-5) that were collected from the south actually cluster in the northern group. Three samples of Han Chinese in Shanghai (Han-10-15-13), each with more than 1000 persons, all cluster in the northern group. Within clusters 109-114, there is a section containing many samples of Han Chinese. Samples of Han Chinese do not cluster into a single group. Han Chinese is the ethnic group with the largest population in China and throughout the world. Cluster analysis indicates that Han Chinese samples from different places (east, northwest, northeast and southwest) tend to cluster together with local minorities. Therefore, the dermatoglyphic characters of Han Chinese express strong nationwide diversity.
Many large migrations through history, including migrations from south to north and from north to south, as well as migrations relating to the opening of the Silk Road for interchange between the east and the west, have divided the original ethnic groups into different populations. For example, migratory populations such as Mongol-2 and Hui-2-7-3 who migrated from northern China to southern China cluster with a neighboring ethnic group (southern group). This indicates correlations between physical characters of dermatoglyphics and geographical areas. Clearly, there can be large differences between migratory populations and the original population within the same ethnic group.
All nine Tibetan samples (including combined samples) cluster with the northern group although they are geographically located in southwestern China. There are five Tibetan samples (Tibetan-*-3-Ind.-5-8) in cluster 85-98, where Tibetan populations are relatively concentrated. Tibetan dermatoglyphics show characters of the northern group. Therefore, it seems that Tibetans are a northern group and not a "southern group from India" as has been suggested by scholars. It seems likely that Tibetans originated from the ancient Qiang people in northern China.
Tibetan-4 is a population whose origin is debated; its members are known as the Baima Tibetan people of Sichuan province. On the cluster tree, Tibetan-4 clusters with Gansu Tibetan (T.B.-7) in northwestern China. This suggests that there is a difference between the Baima Tibetan people and Tibetan people living in Tibet. Tibetan migrants in India (T.B.-1) cluster with the Tibetan sample from the Lhasa area (T.B.-5), expressing a close relationship between these two populations.
Mang is a population that has not yet been assigned to an ethnic group. In the cluster tree, Mang clusters with Miao-2 and Russ (138,139). This result does not help to assign them to a particular ethnic group.
Regarding the Miao samples, Miao-1 was collected from Hainan Island (province) and clusters in the southern group, and Miao-3-2 in Sichuan and Guizhou provinces cluster in the northern group. This result may be explained by evolution of physical characters occurring in populations that are isolated in an island setting.
Minnan Han Chinese (Han-2) is the largest population in Taiwan. Their dermatoglyphics are similar to the mainland northern group [10]. Minnan people in Taiwan come from the southern part of Fujian Province, and Minnan people in Fujian originate from northern China. The dermatoglyphics of Hakka Han Chinese in Taiwan (Han-1) are also similar to the northern group [9].
Yi people in Yunnan Province are represented by two samples in the analysis: Samei (Yi-2) and Luoluobo (Yi-3). They cluster in the southern group and the northern group, respectively. Differences between these Yi populations are obvious.
Six samples, including Bai-2, Yi-5, Jino-2, Hani-2-3, and Blang-2 studied by Haiguo Zhang and colleagues, and five different samples collected from the same ethnic groups (Bai-1, Yi-2, Jino-1, Hani-1, Blang-1) studied by Anlu Jin and colleagues, all cluster in the southern group. Also, Derung-2 [6] studied by Haiguo Zhang and colleagues and Derung-1 [11] studied by Anlu Jin and colleagues both cluster in the northern group. Scholars from different research teams can obtain similar results using different samples collected from the same ethnic groups in Yunnan province. This fact demonstrates that the technical analysis standard [1][2][3] and the variables standard [6,19,31] required by the Chinese Dermatoglyphics Association (CDA) have great value and effectiveness.

Discussion
Dermatoglyphic characteristics can divide Chinese populations into a southern group and a northern group, taking the Yangtze River or 30°-33° N latitude as the boundary. This conjecture is similar to the results of dermatoglyphic research conducted in 1998 [6]. Previous studies from anthropometrics, HLA and immunoglobulin have also suggested that Chinese ethnic groups can be divided into northern and southern groups, and that they may be of different origins [16]. Since there are great differences between the southern and northern groups, it is better to use data collected from local ethnic groups as references for medical applications and genetic studies.
There has been much debate about the origins of many Chinese populations and about proper assignment of these peoples to ethnic groups. Dermatoglyphic data can inform these debates by helping to classify a population as a northern or southern group. In order to make such assignments, we selected 29 samples from the dataset as reference populations (as population marker, PM). The 29 reference populations were limited to northern ethnic groups that actually cluster into the northern group, and southern ethnic groups that actually cluster into the southern group. In addition, preference was given to populations with larger sample sizes. Two outgroups, Africans and Caucasian Americans (as supervisory marker, SM), were also used to make such assignments.
There are 11 clustering methods available for cluster analysis in SAS software. If a clustering method is suitable for assigning a population to the northern or southern group, it should divide the 29 reference populations and 2 outgroups into four groups in the cluster tree: a southern group, a northern group, an African group and a Caucasian group. After selection, we found five usable clustering methods: average linkage, the complete (longest distance) method, the flexible-beta method, McQuitty's similarity method, and Ward's minimum-variance method. All these methods can classify the 31 samples into 4 large groups. Although each of these five methods results in a different position (Y axis) in the clustering figure or a different clustering distance (X axis) for each population, the positions of the populations within the four groups are relatively stable. Figure 2 is an example of the results for the average linkage method, from which the cluster figure for the 31 samples and the Han Chinese in Shanghai (Han-10) has been drawn. The results from the cluster analysis show that the Han-10 sample should be assigned to the northern group.
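The assignment procedure can be illustrated with a small sketch of average-linkage clustering. This is not the SAS procedure used in the study; it is a pure-Python approximation under stated assumptions: Euclidean distance, only two variables instead of the 11 CDA variables, and entirely hypothetical sample names and values (`North-A`, `Query`, and so on).

```python
import math


def average_linkage(points, n_clusters):
    """Agglomerative clustering with average linkage.

    `points` maps a sample name to a list of variable values. Clusters are
    merged pairwise until `n_clusters` remain; returns a list of name sets.
    """
    clusters = [{name} for name in points]

    def distance(a, b):
        # Mean Euclidean distance over all cross-cluster pairs of samples.
        pairs = [(x, y) for x in a for y in b]
        return sum(math.dist(points[x], points[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge it.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] | clusters[j]
        del clusters[j]
    return clusters


# Hypothetical (whorl %, loop %) values; not data from this study.
samples = {
    "North-A": [52.0, 45.0],
    "North-B": [51.0, 46.0],
    "South-A": [46.0, 51.0],
    "South-B": [45.0, 52.0],
    "African": [20.0, 70.0],    # outgroup
    "Caucasian": [33.0, 62.0],  # outgroup
    "Query": [50.5, 46.5],      # population to be assigned
}
clusters = average_linkage(samples, n_clusters=4)
group = next(c for c in clusters if "Query" in c)
print(sorted(group))
```

In this toy setting the query sample joins the northern reference populations while the two outgroups stay apart, which mirrors the acceptance criterion described above: a usable method must return the four expected groups before the placement of the query sample is interpreted.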
We also conducted principal component analysis on these 32 samples, and used PCI and PCII to make a scatter diagram (Figure 3). The Han Chinese in Shanghai (Han-10) were also assigned to the northern group in this analysis. Principal component analysis and cluster analysis produced identical results. Although Shanghai is south of the Yangtze River, these two analyses assign this city to the northern group. Not surprisingly, only 14% of individuals in the sample have both parents from Shanghai. Shanghai is a typical immigrant city.
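As a rough illustration of how the first two principal components and their shares of variance can be obtained, the sketch below uses power iteration with deflation on a covariance matrix. It is an assumption-laden toy: the study used SAS, the text does not state whether a covariance or correlation matrix was analysed, and the two-variable data and all function names here are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of `rows` (samples x variables)."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[k] for row in rows) / n for k in range(p)]
    centered = [[row[k] - means[k] for k in range(p)] for row in rows]
    return [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
        for a in range(p)
    ]


def leading_component(matrix, iterations=500):
    """Largest eigenvalue and its unit eigenvector, by power iteration."""
    p = len(matrix)
    # Asymmetric start vector, so it is unlikely to be orthogonal to the answer.
    vec = [float(k + 1) for k in range(p)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = [sum(matrix[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break  # no variance left in any direction
        vec = [x / norm for x in nxt]
    value = sum(vec[a] * matrix[a][b] * vec[b] for a in range(p) for b in range(p))
    return value, vec


def first_two_components(rows):
    """Return (variance share, direction) for the first two components."""
    cov = covariance_matrix(rows)
    p = len(cov)
    val1, vec1 = leading_component(cov)
    # Deflation: subtract the first component, then extract the next one.
    deflated = [
        [cov[a][b] - val1 * vec1[a] * vec1[b] for b in range(p)] for a in range(p)
    ]
    val2, vec2 = leading_component(deflated)
    total = sum(cov[k][k] for k in range(p))
    return (val1 / total, vec1), (val2 / total, vec2)


# Hypothetical (whorl %, loop %) values for four samples.
rows = [[52.0, 45.0], [51.0, 47.0], [46.0, 51.0], [45.0, 53.0]]
(r1, vec1), (r2, vec2) = first_two_components(rows)
print(f"PCI share: {r1:.4f}, PCII share: {r2:.4f}")
```

Projecting each centered sample onto `vec1` and `vec2` would give the coordinates for a scatter diagram of the kind shown in Figure 3. With only two variables the two shares necessarily sum to 1; with 11 variables they would not.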
According to the principal component analysis, the first four components can explain 83.51% of the variance (41.61%, 20.73%, 10.62% and 10.54%, for each component respectively). In a previous study of 38 loci (130 alleles, including blood groups, HLA, red cell enzymes, serum proteins etc.) in 33 Chinese ethnic groups (106 populations), principal component analysis showed that the first four components could only explain 65.8% of the variance (30.4%, 17.2%, 12.2% and 6.0%, for each component respectively) [16]. Thus, the first four components of these dermatoglyphic data explain 17.71 percentage points more of the variance than did those of the genetic markers. This research demonstrates that dermatoglyphics, although a classical discipline, still shows vitality and good future prospects.
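The arithmetic behind this comparison can be checked directly. The short sketch below only re-adds the percentages quoted in the text; the variable names are illustrative.

```python
# Variance explained by the first four components, as quoted in the text.
dermatoglyphic = [41.61, 20.73, 10.62, 10.54]
genetic = [30.4, 17.2, 12.2, 6.0]

derm_total = round(sum(dermatoglyphic), 2)
gen_total = round(sum(genetic), 2)

# Difference between the reported totals, in percentage points.
gap = round(83.51 - gen_total, 2)

print(derm_total, gen_total, gap)
```

The four rounded dermatoglyphic figures add up to 83.50, which agrees with the reported 83.51% to within rounding of the individual components, and the genetic figures add up to exactly 65.8.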
The Mang are a population that has not been assigned to any of the 56 Chinese ethnic groups. Therefore, we conducted a cluster analysis to determine their most closely related group. Figure 4 shows a cluster tree that includes the 31 reference samples and the Mang. The results show that the Mang cluster with the southern group. We also conducted principal component analysis on the 32 samples, and used PCI and PCII to make a scatter diagram (Figure 5). The Mang are also assigned to the southern group in this analysis. This result fits with the fact that they currently reside in southern China. Dermatoglyphic data, coupled with cluster analysis and principal component analysis, are a useful tool for assigning Chinese populations to the northern or southern group. Dermatoglyphic data from Chinese ethnic groups can also be used as reference populations or outgroups when doing anthropological research.
The standard of technical analysis for dermatoglyphics used for this research is called the Cummins' standard or the Euro-American standard [1][2][3], because it was strongly promoted by an American, H. Cummins, but was originally suggested by F. Galton (1822-1916) and E. R. Henry (1850-1931) from the U.K. [1]. The Chinese Dermatoglyphics Association (CDA) follows this Euro-American standard. According to CDA standards, 11 dermatoglyphic variables must be included in all research: total finger ridge count (TFRC), a-b ridge count (a-b RC), percentage frequencies of the arch (A), ulnar loop (Lu), radial loop (Lr) and whorl (W), percentage frequencies of true pattern in the thenar area (T/I), second interdigital area (II), third interdigital area (III), fourth interdigital area (IV) and hypothenar area (H).
SAS software was used to perform cluster analysis (see Figure S1 in Supporting Information File S1) and principal component analysis using a 156 × 11 data matrix. From these two analyses, we created a cluster tree and a scatter diagram using PCI and PCII (see Figure S2 in Supporting Information File S2). We also developed some computer programs for frequency calculation and weighting using QBASIC or C++.
Dermatoglyphic data from other research teams used in this paper have been carefully checked. The total frequency for several dermatoglyphic variables must add up to 100%. If the total did not reach 100%, this could have been caused by publication error or miscalculation, and needed to be corrected. No data were included in the research when there was no way to correct for such errors.
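A consistency check of this kind might look like the sketch below. It assumes that the variables expected to total 100% are the four finger pattern frequencies (A, Lu, Lr, W) and allows a small rounding tolerance; both the choice of variables and the tolerance value are assumptions made for illustration, and the function name is hypothetical.

```python
def frequencies_sum_to_100(freqs, tolerance=0.1):
    """Return True if the percentage frequencies total 100 within `tolerance`.

    `freqs` maps a pattern label to its percentage frequency. The default
    tolerance is an assumed allowance for rounding in published tables.
    """
    return abs(sum(freqs.values()) - 100.0) <= tolerance


# Hypothetical published values for one sample.
consistent = {"A": 2.5, "Lu": 45.0, "Lr": 2.5, "W": 50.0}
suspect = {"A": 2.5, "Lu": 45.0, "Lr": 2.5, "W": 48.0}

print(frequencies_sum_to_100(consistent))
print(frequencies_sum_to_100(suspect))
```

A sample that fails such a check would then be corrected against its source where possible, or excluded when no correction can be made.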
All dermatoglyphics were obtained by ink print. All our analyses on dermatoglyphics were based on these ink prints.

Supporting Information
Supporting Information File S1