Genetic Variation of 25 Y-Chromosomal and 15 Autosomal STR Loci in the Han Chinese Population of Liaoning Province, Northeast China

In the present study, we investigated the genetic characteristics of 25 Y-chromosomal and 15 autosomal short tandem repeat (STR) loci in 305 unrelated Han Chinese male individuals from Liaoning Province using the AmpFlSTR® Yfiler® Plus and Identifiler™ PCR amplification kits. Population comparisons were performed between the Liaoning Han population and other ethnic groups to better understand its genetic background. For the Y-STR loci, the overall haplotype diversity was 0.9997 and the discrimination capacity was 0.9607. Gene diversity values ranged from 0.4525 (DYS391) to 0.9617 (DYS385). Rst values and two multi-dimensional scaling plots showed minor differences when the Liaoning Han population was compared to the Jilin Han Chinese, Beijing Han Chinese, Liaoning Manchu, Liaoning Mongolian, Liaoning Xibe, Shandong Han Chinese, Jiangsu Han Chinese, Anhui Han Chinese, Guizhou Han Chinese and Liaoning Hui populations; by contrast, major differences were observed when it was compared to the Shanxi Han Chinese, Yunnan Bai, Jiangxi Han Chinese, Guangdong Han Chinese, Liaoning Korean, Hunan Tujia, Guangxi Zhuang, Gansu Tibetan, Xishuangbanna Dai, South Korean, Japanese and Hunan Miao populations. For the autosomal STR loci, the power of discrimination (DP) ranged from 0.8177 (TPOX) to 0.9621 (D2S1338), and the power of exclusion (PE) ranged from 0.2988 (TH01) to 0.7521 (D18S51). No statistically significant differences were detected at any STR locus between the Liaoning Han and the China Dong or Shaanxi Han populations. The results showed that the 25 Y-STR and 15 autosomal STR loci in the Liaoning Han population were valuable for forensic applications and human genetics, and that the Liaoning Han was an independent endogenous ethnicity with a unique subpopulation structure.


Introduction
Liaoning Province, located in the northeast of China, is known in Chinese as "the Golden Triangle" because of its shape and strategic location. It was established in 1907 under the name Fengtian and renamed Liaoning in 1929, and had an estimated population of approximately 43.91 million in 2014 (www.stats.gov.cn). The population is mostly Han Chinese (83.94%), with minorities of Manchus (12.88%), Mongols (1.60%), Hui (0.632%), Koreans (0.576%) and Xibe (0.317%). Liaoning Han individuals mainly migrated from the Shandong Peninsula during the hundred-year period beginning in the second half of the 19th century. "Chuang Guandong" describes this movement of the Han Chinese population, especially from the Shandong Peninsula and Zhili, into Manchuria [1]. During the first two centuries of the Manchu Qing Dynasty, Liaoning Province was the traditional homeland of the ruling Manchus, with only certain Manchu Bannermen, Mongol Bannermen, and Chinese Bannermen allowed in. The region, now known as Northeast China, has an overwhelmingly Han population. After the establishment of the People's Republic of China at the end of the Chinese Civil War, further immigration was organized by the Central Government to "develop the Great Northern Wilderness", eventually pushing the population over 100 million people [2,3]. Thus, it is necessary to investigate the genetic background of the Liaoning Han population and to compare its genetic distance from other populations. Additionally, it is of interest to observe how much admixture has taken place over the past 100 years between Han Chinese and other groups.
Y-chromosomal short tandem repeats (Y-STRs) are a useful tool for inferring genetic genealogy [4] and the trajectories and timing of ancient human migrations [5,6]. The non-recombining region of the Y chromosome may help reveal the ethnic and regional representation of the Han Chinese population owing to its significant phylogeographic information content [7,8]. It can supply an informative reference for investigating patterns of genetic variation in the Han Chinese population across East Asia, considering that the genetic and cultural diversity among East Asian populations is still not fully understood. Autosomal STR loci are usually applied in forensic personal identification and paternity testing, where they provide high discrimination power without being influenced by linkage disequilibrium. They can also be used to uncover population genetic background and structure [9]. Population data for autosomal STR loci can be used to construct phylogenies and clarify genetic structure using genetic distance measurements, neighbor-joining dendrograms and principal component analysis based on genotype frequencies [10].
Therefore, we investigated the frequencies of 25 Y-STR and 15 autosomal STR loci in the Liaoning Han population to expand the available population information for forensic medicine and human genetic diversity. Population comparisons were performed between the Liaoning Han population and other ethnic groups to better understand the genetic background of the Liaoning Han population.

Study population
Three hundred and five blood samples were collected from unrelated healthy male individuals living in Liaoning Province, Northeast China, after obtaining written informed consent. The blood was then spotted onto filter paper. Samples were obtained and analyzed after approval from the Ethics Committee of China Medical University.

DNA extraction, PCR amplification, and genotyping
Genomic DNA was extracted using Chelex-100 [11]. PCR amplification was performed using the AmpFlSTR® Yfiler® Plus and Identifiler™ PCR amplification kits (Thermo Fisher Scientific, CA, USA) in a GeneAmp® PCR 9700 (Thermo Fisher Scientific, CA, USA) thermal cycler, according to the respective manufacturer specifications. The AmpFlSTR® Yfiler® Plus amplification kit (Thermo Fisher Scientific, Waltham, MA, USA) can co-amplify 25 Y-STR loci with six dyes, including seven rapidly mutating loci [12]. The AmpFlSTR® Identifiler™ PCR Amplification kit (Thermo Fisher Scientific) can co-amplify 15 autosomal STR loci and the Amelogenin locus with five dyes. Fragments of the 25 Y-chromosomal and 15 autosomal STR loci were produced simultaneously. Separation and detection of amplicons were performed on an Applied Biosystems™ 3500 Series Genetic Analyzer (Thermo Fisher Scientific, Waltham, MA, USA). Data were analyzed using GeneMapper ID v4.1 software (Thermo Fisher Scientific, Waltham, MA, USA). Control DNA 007 was included as a standard reference in each batch of genotyping. We strictly followed the recommendations of the DNA Commission of the International Society of Forensic Genetics (ISFG) for Y-STR analysis [13].

Data analysis
For Y-STR loci, allele frequencies and gene diversity were calculated using PowerMarker v3.25 [14]. Haplotype frequencies, random match probabilities (sum of squares) and haplotype diversity were calculated using Arlequin Software v3.5 [15]. The discrimination capacity (DC) was determined as the proportion of different haplotypes in the sample [16]. A cluster structure of Y-STR haplotypes was generated using the YHRD database (http://www.yhrd.org/). To compare data from the studied Liaoning Han population with other published data, genetic distance (Rst statistics) was measured by analysis of molecular variance (AMOVA) and visualized using two multi-dimensional scaling (MDS) Rst plots via YHRD online tools (http://www.yhrd.org/Analyse/AMOVA).
For autosomal loci, sample allele frequencies and exact Hardy-Weinberg equilibrium (HWE) tests were calculated using PowerMarker v3.25 [14]. Values for power of discrimination (DP), polymorphism information content (PIC), power of exclusion (PE), and heterozygosity (He) were calculated using PowerStats v1.2 software [17], as modified by Raquel et al. to handle large numbers of samples [18]. Pairwise genetic distances (Fst) and p values for each locus were calculated between populations using Arlequin v3.5 software [15]. Furthermore, Nei's standard genetic distance between populations was generated with the Phylip 3.69 package [19] and visualized with Treeview software [20]. Because relevant published data are limited, the reference populations used for the Y-STR and the autosomal STR comparisons differ.
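The single-locus forensic parameters named above can be illustrated with a minimal Python sketch. It assumes the commonly used definitions (DP as one minus the sum of squared genotype frequencies, PIC in the Botstein form, PE as h^2(1 - 2hH^2) with h the observed heterozygosity and H = 1 - h); it is not the PowerStats implementation, and the function name and toy genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def forensic_parameters(genotypes):
    """Single-locus parameters from a list of (allele_a, allele_b) tuples.

    Illustrative helper only; formulas follow commonly used definitions.
    """
    n = len(genotypes)
    # Treat (a, b) and (b, a) as the same genotype.
    geno_counts = Counter(tuple(sorted(g)) for g in genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)
    p = [c / (2 * n) for c in allele_counts.values()]

    # Observed heterozygosity (h) and homozygosity (H).
    h = sum(c for (a, b), c in geno_counts.items() if a != b) / n
    H = 1 - h

    # Power of discrimination: 1 - sum of squared genotype frequencies.
    dp = 1 - sum((c / n) ** 2 for c in geno_counts.values())
    # Polymorphism information content.
    pic = 1 - sum(x ** 2 for x in p) - sum(
        2 * x ** 2 * y ** 2 for x, y in combinations(p, 2)
    )
    # Power of exclusion.
    pe = h ** 2 * (1 - 2 * h * H ** 2)
    return {"He": h, "DP": dp, "PIC": pic, "PE": pe}


# Toy genotypes for one locus (made-up data, not from this study).
toy = [("6", "9"), ("9", "9"), ("7", "9"), ("6", "7")]
params = forensic_parameters(toy)
print(params)
```

With real data, the function would be applied to each of the 15 loci in turn; the four-genotype toy sample is only there to make the arithmetic easy to follow by hand.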

Y-chromosomal STR
Two hundred and ninety-three different haplotypes were observed among the 305 unrelated individuals. Of these, 281 were unique and 12 were each shared by two individuals (S1 Table). Null alleles were found in nine individuals at DYS448 and in one individual at DYS385. Haplotype diversity was high (0.9997 ± 0.0003). Likewise, the random match probability was low (0.0035), with a DC of 0.9607. Gene diversity values of the 25 loci ranged from 0.4525 (DYS391) to 0.9617 (DYS385) (S2 Table). Allele frequencies ranged from 0.7016 (DYS438) to 0.0033 (DYS389I, DYS389II, DYS458, YGATAH4, DYS448, DYS391, DYS456, DYS439, DYS481, DYS533, DYS576, DYS627, DYS460, DYS518, DYS449, DYF387S1 and DYS385). Cluster analysis was performed for the 12 haplotypes that were observed twice. Ancestry information showed that the haplotypes of the Liaoning Han population most likely belonged to the East Asian-Sino Tibetan-Chinese culture, which corresponds with its history, culture, and geographical distribution (Fig 1). The high information content of the 25 Y-STR loci in the Liaoning Han population will be useful in forensic medicine and will enrich the Han Chinese population database.
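The summary statistics above follow directly from the reported haplotype counts. The sketch below is a minimal Python check, assuming the usual definitions (match probability as the sum of squared haplotype frequencies, haplotype diversity as n/(n-1) times one minus that sum, DC as distinct haplotypes divided by sample size). The function name and the placeholder haplotype labels are hypothetical; only the counts (281 singletons, 12 haplotypes seen twice) come from the text.

```python
from collections import Counter


def haplotype_summary(haplotypes):
    """Return (distinct haplotypes, DC, match probability, haplotype diversity)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    # Random match probability: sum of squared haplotype frequencies.
    match_prob = sum((c / n) ** 2 for c in counts.values())
    # Unbiased haplotype diversity.
    diversity = n / (n - 1) * (1 - match_prob)
    # Discrimination capacity: distinct haplotypes / sample size.
    dc = len(counts) / n
    return len(counts), dc, match_prob, diversity


# Placeholder labels standing in for the real 25-locus haplotypes.
sample = [f"unique_{i}" for i in range(281)]
sample += [f"shared_{i}" for i in range(12) for _ in range(2)]

k, dc, pm, hd = haplotype_summary(sample)
print(k, round(dc, 4), round(pm, 4), round(hd, 4))
# prints: 293 0.9607 0.0035 0.9997
```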

Autosomal STR
The distribution of allele frequencies, forensic efficiencies, and statistical parameters across the 15 autosomal STR loci are presented in S3 and S4 Tables. Among the 157

Population comparison
For Y-STR loci, we compared our haplotype data with those of five populations submitted to the YHRD database (Release 51): Austrian [21], German [22], Polish [23], African and Native American [24]. Rst values for genetic distance demonstrated that haplotypes of the Liaoning Han population were significantly different from those of the other five populations (all p values < 0.05/5 after Bonferroni correction). As shown in the MDS plot (Fig 2), there were significant differences between the Liaoning Han population and the five populations. Furthermore, in order to comprehensively investigate the genetic substructure of the Liaoning Han population, the population comparison using the 16 [32], Gansu Tibetan (YP001032), Hunan Tujia (YP001037), Liaoning Xibe [33], Guangxi Zhuang (YP000591), Japanese [34] and Korean [35]. Rst values for genetic distance demonstrated that haplotypes of the Liaoning Han population were significantly different from those of the other 22 populations (all p values < 0.05/22 after Bonferroni correction; Table 1). As shown in the MDS plot (Fig 3), minor differences were observed when the Liaoning Han population was compared to the Jilin Han Chinese, Beijing Han Chinese, Liaoning Manchu, Liaoning Mongolian, Liaoning Xibe, Shandong Han Chinese, Jiangsu Han Chinese, Anhui Han Chinese, Guizhou Han Chinese and Liaoning Hui populations; by contrast, major differences were observed when it was compared to the Shanxi Han Chinese, Yunnan Bai, Jiangxi Han Chinese, Guangdong Han Chinese, Liaoning Korean, Hunan Tujia, Guangxi Zhuang, Gansu Tibetan, Xishuangbanna Dai, South Korean, Japanese and Hunan Miao populations.
For autosomal STR loci, S5 Table presents pairwise Fst and p values for differentiation tests between the Liaoning Han ethnic group and nine additional published populations [36-44]; statistically significant differences (p < 0.05/15) were found between the Liaoning Han population and the China Miao population at five STR loci, the China Bouyei population at four STR loci, the China Uygur and Jinan Han populations at three STR loci, the Japanese population at two STR loci, and the Korean and Shanghai Han populations at one STR locus. No statistically significant differences were detected at any STR locus between the Liaoning Han and the China Dong or Shaanxi Han populations. Table 2 shows genetic distances between populations. Fig 4 shows the clustering in the unrooted phylogenetic tree, which mirrors the historical and geographical backgrounds of the populations compared. Culturally, because most people in Northeast China trace their ancestry back to migrants from the Chuang Guandong era, Northeastern Chinese are more uniform than people from other geographical regions of China. Therefore, people from the Northeast tend to identify themselves first as "Northeasterners" before affiliating with individual provinces and cities (http://chinaneast.xinhuanet.com).
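The genetic distances reported in Table 2 are Nei's standard genetic distances computed from allele frequencies. As a minimal Python sketch of that measure, assuming the standard form D = -ln(Jxy / sqrt(Jx Jy)) and not the Phylip implementation, the hypothetical function below takes one {allele: frequency} dictionary per locus for each population; the toy frequencies are made up.

```python
import math


def nei_standard_distance(freqs_x, freqs_y):
    """Nei's standard distance between two populations.

    freqs_x, freqs_y: lists with one {allele: frequency} dict per locus,
    in the same locus order. Illustrative helper only.
    """
    jx = jy = jxy = 0.0
    for fx, fy in zip(freqs_x, freqs_y):
        alleles = set(fx) | set(fy)
        jx += sum(fx.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(fy.get(a, 0.0) ** 2 for a in alleles)
        jxy += sum(fx.get(a, 0.0) * fy.get(a, 0.0) for a in alleles)
    # Dividing each sum by the number of loci cancels in the ratio,
    # so the raw sums are sufficient here.
    return -math.log(jxy / math.sqrt(jx * jy))


# Toy single-locus frequencies (made-up data, not from this study).
pop_a = [{"8": 1.0}]
pop_b = [{"8": 0.5, "9": 0.5}]
d_same = nei_standard_distance(pop_b, pop_b)
d_diff = nei_standard_distance(pop_a, pop_b)
print(d_same, d_diff)
```

Identical frequency profiles give a distance of zero, and the distance grows as the shared allele frequencies diverge; a pairwise matrix of such values is the usual input for building an unrooted tree.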
For the Han Chinese population, previous studies showed that it is intricately sub-structured and clusters roughly into two (northern Han and southern Han) or three (northern Han, central Han and southern Han) subgroups [45-47]. The distinction between southern and northern Han populations was reported by Chu et al. using the neighbor-joining method based on STR data [48]. The Han Chinese share common predecessors, the Yan Emperor and the Yellow Emperor, in the Yellow River Basin. However, the Han population has formed relationships and coexisted with other ethnic groups for thousands of years [49]. The Liaoning Han population belongs to the northern Han subgroup according to its geographic distribution, history and culture. The population comparison based on Y-STR loci showed that the Liaoning Han was an independent endogenous ethnicity with a unique subpopulation structure. A previous study showed that the Liaoning Han were genetically close to the Manchu: not as close as to the Han populations of Jilin and Beijing, but closer than to other ethnic groups [50]. This result might indicate that the Liaoning Han gradually integrated with native groups, such as the Manchu, Mongolian and Xibe, following their geographical migration, which is consistent with the historical records [9]. However, the autosomal STR comparison showed no significant difference between the Liaoning Han and the China Dong or Shaanxi Han populations, which seems to contradict the Y-STR results. This might reflect differences between the two types of genetic markers. Consequently, the Liaoning Han population retains its unique genetic characteristics.
There are two potential limitations in the present study. First, in the absence of whole-genome data, analysis of Y-chromosomal and autosomal STR loci alone cannot provide fully precise and reliable data for population comparison. Second, the groups included in the Y-STR and the autosomal STR population comparisons differ, owing to the limited availability of relevant data. Thus, more genetic investigations are needed to better understand the characteristics of the Liaoning Han Chinese population.

Conclusion
The population comparison demonstrates that the Liaoning Han population is an independent endogenous ethnicity that retains its unique genetic characteristics. In summary, the reported allele frequencies and haplotype distributions of the 25 Y-STR and 15 autosomal STR loci in the Liaoning Han population are informative for forensic investigation and paternity testing. The results could also help infer genetic genealogy and ancient human migration patterns.
Supporting Information S1