GST M1-T1 null Allele Frequency Patterns in Geographically Assorted Human Populations: A Phylogenetic Approach

Genetic diversity in drug metabolism and disposition is mainly attributed to inter-individual variation in polymorphisms of drug-xenobiotic metabolizing enzymes (XMEs). Among the XMEs, the glutathione-S-transferase (GST) gene loci are important candidates for investigating diversity in allele frequency, because deletion mutations in the GST M1 and T1 genes are associated with various cancers and genetic disorders in all major Population Affiliations (PAs). The present population-based phylogenetic study therefore aimed to uncover the frequency distribution pattern of GST M1 and T1 null genotypes among 45 Geographically Assorted Human Populations (GAHPs). The frequency distribution pattern of the GST M1 and T1 null alleles was determined using data derived from the literature for 44 populations from Africa, Asia, Europe and South America, together with genomic data for the PA from Gujarat, a region in western India. Allele frequency counting for the Gujarat PA and scatter plot analysis of the geographical distribution of the PAs were performed in SPSS-21. The GST M1 and GST T1 null allele frequency patterns of the PAs were computed with the Seqboot and Gendist programs of the Phylip software package (version 3.69) and the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) in Mega-6. Allele frequencies from the South African Xhosa tribe, East African Zimbabwe, East African Ethiopia, North African Egypt, Caucasians, South Asian Afghanistan and South Indian Andhra Pradesh were identified as seven probable patterns of GST M1-T1 null genotypes among the 45 GAHPs investigated. The patterned null allele frequencies demonstrated here for the first time address the missing link in GST M1-T1 null allele frequencies among GAHPs.


Introduction
Metabolic activities play an important role in shaping the livelihood of an organism. Drug-xenobiotic metabolizing enzyme (XME) systems are among the most investigated pathways involved in maintaining the health status of an individual. Among the numerous drug-related genes investigated, the glutathione-S-transferases (GSTs) of the Phase II XMEs play an important role in cellular protection and in cellular resistance to drugs through glutathione conjugation reactions. GST classes convert active endogenous and/or exogenous carcinogenic compounds to their detoxified forms. Among the GST classes, GST M1 and GST T1 are associated with a loss of function caused by a structural deletion (null mutation); moreover, they modify the detoxification ability of individuals exposed to tobacco or carcinogenic pollutants in the environment [1]. Genotoxins such as aromatic hydrocarbon epoxides, and products of oxidative stress such as DNA hydroperoxides and polycyclic aromatic hydrocarbon diol epoxides, are detoxified by GST M1, while constituents of cigarette smoke such as alkyl halides, benzo(a)pyrene diol epoxide and acrolein are detoxified by GST T1 [1,2]. Several factors, such as environmental pollution, dietary habits and activity-dependent genetic differences, have been reported as modulators of GST expression and of susceptibility to xenobiotic compound detoxification [3]. Numerous recent studies have hypothesized that the difference in metabolic rate of the M1 and T1 classes of GST is a risk factor associated with cancers of the bladder, pancreas, upper aerodigestive tract, lung, esophagus and head and neck, with melanoma, and with Balkan endemic nephropathy [4][5][6][7][8][9].
Further, inter-individual differences in drug disposition and efficacy have been investigated by various authors [10], and the observed frequency distributions of GST M1-T1 genotypes among different populations are reported to depend on ethnicity or PA [10,11]. Drugs are the main hope of remedy for people around the globe with various metabolic and genetic disorders, but their effectiveness has been reported to be influenced by unidentified polymorphic patterns in drug metabolism genes among different ethnic groups or PAs [11][12][13]. Although researchers from different PAs have analyzed the frequencies of GST M1 and T1 null genotypes and their possible risk association with various disorders, they have not been able to report a conclusive association in all major PAs [14,15]. Recent advances in molecular techniques have opened a new era of pharmacogenomics, and several researchers are investigating the relationship between genetic diversity and allelic frequency of GST classes to gain insight into genetic predisposition or susceptibility among various ethnic groups or PAs. In this context, probing the genetic variation in GST classes is essential for genomic epidemiological studies and for developing new drugs that are effective in the majority of PAs [9,16,17]. The allele frequency pattern of GST M1-T1 null genotypes in different PAs is yet to be explored to explain several phenomena related to risk association with genetic diseases and to drug disposition [10]. A study including a statistically valid number of subjects from the various major PAs could address the frequency distribution pattern in geographically assorted human populations (GAHPs); however, it would be tedious and might require a very large population size [16].
Therefore, the present study aimed to uncover the genetic distance based ancestral origin or genetic affinity among GAHPs in order to address the paradigm of GST M1-T1 null allele frequency diversity. We are currently exploring how best to do this for the large number of populations in the present analysis, so as to understand the frequency distribution pattern in GAHPs. Data for the GST M1 and GST T1 loci investigated in this study were derived from the literature for 44 different populations from Africa, Asia, Europe and South America, and from genomic data of the Gujarat PA, a region in western India. The GST M1-T1 null allele frequencies of the 45 GAHPs were used for phylogenetic analysis by the unweighted pair group method with arithmetic mean (UPGMA) based on pairwise genetic distances, and seven patterns of GST M1-T1 null allele frequency with the highest genetic affinity are demonstrated here for the first time. The patterns of null allele frequencies reported in this study add insights that may help to determine a conclusive risk association of the GST M1-T1 loci with several cancers or genetic disorders.

Subjects
The present investigation includes GST M1 and GST T1 null allele frequencies of 45 GAHPs from 39 studies. The null allele frequency of the Gujarat population was determined in this study from 504 healthy unrelated volunteers of Gujarati origin with a mean age of 60 years. After each subject had signed the informed consent to participate in the study, a 2 mL blood sample was collected. Data for the remaining 44 populations were collected from studies by various authors (Table 1). Several studies of the same ethnicity were included to give the analysis adequate statistical support and to minimize the variation in polymorphism frequency within ethnic groups, while the data gathered from Naveen et al. [18] gave allele frequencies for the combined Tamilnadu and Pondicherry PAs. The study was approved by the institutional ethics committee of Shrimathi Vasantben Ratilal Desai Cancer Research Centre, Rajkot Cancer Society, India.

DNA isolation and Genotyping
Genomic DNA was isolated from whole blood by the method of Lahiri and Nurnberger [19]. The GST M1 and T1 polymorphisms were identified by the multiplex polymerase chain reaction (PCR) method of Huang et al., with the albumin gene as internal control [20]. Amplified PCR products were visualized in a 2% agarose gel and the band patterns were analyzed for polymorphism.
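The counting step that follows gel reading can be sketched as below. This is a minimal illustration, not the authors' SPSS procedure: it assumes each sample is recorded as band present or absent for the albumin control, GST M1 and GST T1, that a missing control band marks a failed reaction, and that a missing GST band with the control present is scored as the null genotype. The function name `null_frequencies` and the example gel calls are hypothetical.

```python
def null_frequencies(samples):
    """Return the fractions of GSTM1-null and GSTT1-null samples.

    Each sample is a dict of booleans saying whether a band was seen for
    'albumin' (internal control), 'GSTM1' and 'GSTT1'. Samples whose
    control band is missing are treated as failed PCRs and skipped
    (an assumption of this sketch).
    """
    valid = [s for s in samples if s["albumin"]]
    if not valid:
        raise ValueError("no sample with a visible control band")
    n = len(valid)
    m1_null = sum(1 for s in valid if not s["GSTM1"])
    t1_null = sum(1 for s in valid if not s["GSTT1"])
    return m1_null / n, t1_null / n


# Hypothetical gel calls for illustration only, not data from the study.
gel_calls = [
    {"albumin": True, "GSTM1": True, "GSTT1": True},
    {"albumin": True, "GSTM1": False, "GSTT1": True},
    {"albumin": True, "GSTM1": True, "GSTT1": False},
    {"albumin": True, "GSTM1": False, "GSTT1": False},
    {"albumin": False, "GSTM1": False, "GSTT1": False},  # failed PCR, skipped
]
m1_freq, t1_freq = null_frequencies(gel_calls)
print(m1_freq, t1_freq)
```

Note that a multiplex assay of this kind only separates the homozygous deletion from all other genotypes, so the value counted here is the null genotype frequency.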

Statistical analysis
The distributions of GST M1 and GST T1 null alleles in the Gujarati population were calculated by the frequency counting method in SPSS-21 (4-27AEA) for Windows. The standard genetic distance (DST) between different PAs for GST M1-T1 null allele frequencies was calculated by Nei's (1972) method in Phylip version 3.69 [52,53]. The least DST values between the PAs were used to compute clades supported by more than 50% of 1000 bootstrap replicates by the Felsenstein (1989) method, and the phylogenetic trees were then constructed in Mega-6 by the UPGMA method [54][55][56]. Finally, the clusters of PAs found among the geographically assorted human populations in the phylogenetic tree were used in a scatter plot to analyze their geographical distribution. The longitude (X-axis) and latitude (Y-axis) of the different continental regions, summarized in Table 4, were used to construct the scatter plot in SPSS-21. The online World Atlas resource was used to obtain the latitude and longitude of the respective geographic locations (http://www.worldatlas.com/aatlas/latitude_and_longitude_finder.htm [57]).
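For readers unfamiliar with Nei's (1972) standard genetic distance, a minimal sketch of the calculation is given here. It is not the Gendist program itself. It assumes that each locus is coded with two alleles (null and non-null) and that the reported null frequency is used directly as the null allele frequency; the function names and the second population are hypothetical. The distance is D = -ln(J_XY / sqrt(J_X J_Y)), where J_X and J_Y are the sums of squared allele frequencies within each population and J_XY is the sum of cross products.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's (1972) standard genetic distance between two populations.

    pop_x and pop_y are lists of loci; each locus is a list of allele
    frequencies that sums to 1, in the same allele order for both.
    """
    j_x = sum(p * p for locus in pop_x for p in locus)
    j_y = sum(q * q for locus in pop_y for q in locus)
    j_xy = sum(p * q
               for locus_x, locus_y in zip(pop_x, pop_y)
               for p, q in zip(locus_x, locus_y))
    return -math.log(j_xy / math.sqrt(j_x * j_y))


def two_allele_loci(m1_null, t1_null):
    """Code GST M1 and GST T1 as two-allele loci (assumption of this sketch)."""
    return [[m1_null, 1.0 - m1_null], [t1_null, 1.0 - t1_null]]


gujarat = two_allele_loci(0.200, 0.355)  # null frequencies reported in Results
other = two_allele_loci(0.500, 0.200)    # hypothetical population
d_same = nei_standard_distance(gujarat, gujarat)
d_other = nei_standard_distance(gujarat, other)
print(round(d_other, 4))
```

The distance of a population to itself is zero, and the measure is symmetric, so only one triangle of the pairwise matrix needs to be stored.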

Results
Phylogenetic tree for GST M1-T1 null allele frequency in GAHPs
The frequencies of GST M1 and GST T1 null genotypes in the Gujarat population of India were 0.200 and 0.355, respectively, in the present study. The pairwise genetic distance matrices for the 20 continental regions and the 45 GAHPs are given in Tables 2 and 3, respectively. The phylogenetic analyses of 20 different continental regions (Fig. 1) and 45 GAHPs (Fig. 2) for GST M1-T1 null allele frequency were performed on the pairwise genetic distance matrices by the UPGMA method in MEGA-6 [52,56]. The Consense program, which retained clusters found in more than 50% of 1000 bootstrap replicates, was used to assess the reliability of the constructed phylogenetic trees [53,54]. Nei's DST values ranged from 0.0001 to 0.007 (Table 2) and from 0.0006 to 0.008 (Table 3) [...] Fig. 2. Nevertheless, the South Asian Indian [...] (Fig. 2).
Table 2. [...] (Fig. 1). Abbreviations used were the same as those in Table 1.
Table 3. The values represented in the table were computed between the population affiliations by Nei's (1972) standard genetic distance (DST) method and were used in the phylogenetic tree of 45 geographically assorted human populations for GST M1-T1 null allele frequency (Fig. 2). Abbreviations used were the same as those in Table 1.
Fig. 1. [...] Table 2, and clusters with more than 50% of 1000 bootstrap replicates were included in the consensus tree obtained by the Felsenstein (1989) method.
However, Mongolia (East Asia) showed a smaller DST value to the Pakistan allele frequency (0.000686) than to the Caucasian allele frequency (0.046671). The European continental regions, which showed the least DST values (0.002064 to 0.004708) to the South Asian Afghanistan allele frequency, clustered with it for GST M1-T1 null allele frequency in the phylogenetic tree of 20 different continental regions (Fig. 1). Nevertheless, the phylogenetic tree of 45 GAHPs (Fig. 2) clustered only 13 of the 16 European PAs investigated in this study (Sweden, Finland, Denmark, Netherlands, Germany, France, Italy, Spain, Greece, Bulgaria, Poland, Slovakia and Russia) with the South Asian Afghanistan allele frequency (least DST values ranging from 0.001176 to 0.00723), while the other 3 PAs [Slovenia (0.002462), Czech Republic (0.004116) and UK (0.006718)] clustered with the North African Egypt allele frequency. The Singapore-Malay and Indonesia PAs from South East Asia showed the least DST values to East Asia (China, 0.001246) and Caucasians (0.003788), respectively, for GST M1-T1 null allele frequency. Nevertheless, their counterparts from the same continental region were the most diverse PAs among the 45 GAHPs investigated, showing PA admixture from North Africa (Egypt, 0.002571) and East Africa (Ethiopia, 0.007943) for the Philippines, and from South Asia (India, Andhra Pradesh, 0.004738) and East Africa (Ethiopia, 0.004922) for Vietnam, as shown in Table 3 and Fig. 2.
Fig. 2. [...] Table 3. Other aspects were the same as those in Fig. 1. The major groups of GST M1-T1 null allele frequencies were from the populations of the Xhosa tribe, Zimbabwe, Ethiopia, Egypt, Afghanistan, Caucasians and Andhra Pradesh. Abbreviations used were the same as those in Table 1.
GST M1-T1 null allele frequency patterns among the GAHPs
The effect of isolation by geographical distance on population differentiation [51] was assessed in a scatter plot of the phylogenetic clusters of the 45 GAHPs for GST M1-T1 null allele frequency against the latitudes and longitudes of the 20 continental regions representing PAs from Africa, Asia, Europe and America (Table 4). The scatter plot in Fig. 3 suggests three major geographical splits for the seven GST M1-T1 null allele frequency clusters or patterns observed in the phylogenetic tree of 45 GAHPs (Fig. 2). The South African Xhosa allele frequency pattern (I), observed mostly in continental regions of Africa, suggests an "Africa" split in the scatter plot, with least population differentiation from Nigeria (West Africa), Cameroon (Middle Africa) and Namibia (South Africa). However, the GST M1-T1 null allele frequency patterns of PAs from other African continental regions, namely East African Zimbabwe (II), East African Ethiopia (III) and North African Egypt (IV), showed least population differentiation from PAs of non-African continental regions such as South Asia (India and Iran), South East Asia (Philippines and Vietnam), Southern Europe (Slovenia), Eastern Europe (Czech Republic) and Northern Europe (UK), irrespective of geographical isolation, which suggests an "out of Africa" split. Finally, the remaining three non-African GST M1-T1 null allele frequency patterns, from South Asian Afghanistan (V), Caucasians (VI) and South Asian Indian Andhra Pradesh (VII), were geographically distributed across continental regions of Asia, Europe and America but not Africa, which suggests an "other than Africa" split among the 45 GAHPs in the scatter plot.
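Where a numerical geographical distance between two plotted locations is wanted, rather than a visual comparison of latitude and longitude, the great-circle distance can be computed with the haversine formula. The sketch below is an optional illustration and was not part of the reported SPSS analysis; the function name, the mean Earth radius of 6371 km and the example coordinates are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above 1
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Illustrative points only: a quarter of the equator, and a point to itself.
quarter = great_circle_km(0.0, 0.0, 0.0, 90.0)
same = great_circle_km(10.0, 20.0, 10.0, 20.0)
print(round(quarter, 1), same)
```

Pairing such distances with the corresponding DST values would give one point per population pair for a genetic versus geographical distance comparison.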

Discussion
Understanding the genetic variation and diversity among geographically assorted human populations (GAHPs) is an active topic in population genetics, with a wide range of neutral and adaptive genetic markers being employed to uncover the patterns of genetic diversity [58]. Assessing the extent of diversity in allele frequency distribution among populations of different ethnicity, region, country or continent is difficult; nevertheless, understanding this phenomenon is essential [8,11,16,17], and recent advances in molecular techniques have improved the view of inter-individual genetic variation in GAHPs. Allele frequencies of a large number of neutral markers, or of even a few candidate markers that duplicate or decay to favor new environments and lead to rapid adaptations, are often used to investigate such patterns [58]. The paradigm of allele frequency among populations holds the key to the existing problem of inter-individual genetic variation in xenobiotic metabolizing enzymes (XMEs), in particular the decay or null allele frequency of glutathione-S-transferase classes such as Mu 1 (GST M1) and Theta 1 (GST T1), which are considered major risk factors for various diseases including several types of cancer [8][9][10]. Therefore, the present investigation analyzed the pattern of GST M1-T1 null allele frequency among GAHPs using a phylogenetic approach. GST M1-T1 null allele frequency data for a set of 20 different continental region PAs (Table 2) and 45 GAHPs (Table 3) were compiled from 38 previously reported works and from genomic data of the Indian Gujarat PA generated in this study (Table 1), and the respective phylogenetic trees (Figs. 1 and 2) were constructed by the UPGMA method based on Nei's (1972) standard genetic distance (DST), retaining clusters found in more than 50% of 1000 bootstrap replicates obtained with the Felsenstein (1989) program [52,54,56].
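The UPGMA step named above can be illustrated with a short sketch. It is not the MEGA-6 implementation and includes no bootstrap support; it only shows the clustering rule, in which the closest pair of clusters is joined repeatedly and the distance to a merged cluster is the size-weighted mean of the distances to its two parts. The function name, the tree representation and the four-population matrix are hypothetical.

```python
from itertools import combinations


def upgma(names, matrix):
    """Build a UPGMA tree from a symmetric distance matrix.

    names  -- list of population labels
    matrix -- matrix[i][j] is the distance between names[i] and names[j]
    Returns a nested tuple (left, right, height); leaves are plain labels.
    """
    # Active clusters: id -> (subtree, number of leaves)
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {frozenset((i, j)): matrix[i][j]
            for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        # Join the closest pair of active clusters
        pair = min(dist, key=dist.get)
        a, b = sorted(pair)
        tree_a, n_a = clusters.pop(a)
        tree_b, n_b = clusters.pop(b)
        height = dist.pop(pair) / 2.0
        # Distance to the merged cluster: size-weighted arithmetic mean
        for c in clusters:
            d_ac = dist.pop(frozenset((a, c)))
            d_bc = dist.pop(frozenset((b, c)))
            dist[frozenset((next_id, c))] = (
                (n_a * d_ac + n_b * d_bc) / (n_a + n_b))
        clusters[next_id] = ((tree_a, tree_b, height), n_a + n_b)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


# Hypothetical distances for four populations, not values from Table 2 or 3.
labels = ["A", "B", "C", "D"]
distances = [
    [0.000, 0.002, 0.006, 0.010],
    [0.002, 0.000, 0.006, 0.010],
    [0.006, 0.006, 0.000, 0.004],
    [0.010, 0.010, 0.004, 0.000],
]
tree = upgma(labels, distances)
print(tree)
```

In this toy matrix A joins B first, C joins D next, and the two pairs are joined last at half their average distance.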
In addition to the clusters based on ancestral origin or genetic affinity for GST M1-T1 null allele frequency shown in the phylogenetic trees, the positive correlation between genetic distance and geographical distance was analyzed in a scatter plot (Fig. 3) to assess the effect of isolation by distance on population differentiation. Taken together, the observations from the phylogenetic trees and the scatter plot of the different PAs reveal seven probable patterns of GST M1-T1 null allele frequency among the GAHPs, in concordance with reports of archeological signatures, ancient gene flows and sex-specific components [59][60][61][62].
The genetic affinity and geographical distribution of the 20 different continental regions, comprising the 45 GAHPs investigated in this study (Figs. 1-3), revealed for the first time an allele frequency pattern for GST M1-T1 null genotypes shared among Namibia (South Africa), Nigeria (West Africa), Cameroon (Middle Africa), Gujarat (South Asian Indian) and the Xhosa tribe (South Africa). We report here the Xhosa allele frequency (I), with major genetic affinity towards populations from Africa (Namibia, Nigeria, Cameroon), as an "Africa" split pattern for GST M1-T1 null allele frequency, in agreement with the linkage disequilibrium computed for loss of variants in GST classes by Polimanti et al. (2013). The phylogenetic trees (Figs. 1 and 2) and the scatter plot analysis (Fig. 3) demonstrated another three patterns for GST M1-T1 null genotypes: the East African Zimbabwean allele frequency (II) in the population from India (South Asia), the East African Ethiopian allele frequency (III) in populations from Iran (South Asia) and Somalia (East Africa), and the North African Egyptian allele frequency (IV) in populations from Slovenia (Southern Europe), Czech Republic (Eastern Europe) and UK (Northern Europe). The Ethiopian (III) and Egyptian (IV) allele frequency patterns for GST M1-T1 null genotypes found in this study are in concordance with the genome-wide diversity study in the Levant by Haber et al. (2013), who found two major groups, one close to Africans and Middle Easterners and the other closer to modern-day Europeans [61].
Further, these Zimbabwean, Ethiopian and Egyptian patterns from African populations, which show high genetic affinity towards non-African populations for GST M1-T1 null allele frequency, are reported in this study as an "Out of Africa" split, corroborating the findings of Templeton (2002), who reported the out-of-Africa theory of migration and the ancestral root of allele frequency admixture [63].
The allele frequency of the population from South Asian Afghanistan, which shows high genetic affinity to the majority of European PAs investigated in this study (Tables 2 and 3 and Figs. 2 and 3), is reported as pattern V for GST M1-T1 null genotypes, in accordance with the reports of various authors [60,62,64]. Further, the population in Pakistan (South Asia) is reported here with the Afghanistan (South Asia) pattern for GST M1-T1 null allele frequency (Table 3), although it also showed genetic affinity to PAs from Mongolia (East Asia), Europe (South, East and West) and Andhra Pradesh (South India), corroborating the earlier report of Templeton (2002), who described considerable overlap among East Asian, European and South Indian populations [64]. Moreover, the allele frequency pattern from Caucasians (Americans and Canadians), found among East Asians (Fig. 1) in this study, is identified as pattern VI for GST M1-T1 null allele frequency and reported for the first time. Finally, the allele frequency from the South Indian Andhra Pradesh PA showed the least genetic distance (Table 3) to populations from Pakistan (South Asia), Vietnam (South East Asia) and Brazil (South America), irrespective of population differentiation by geographical isolation [52], and is reported as pattern VII for GST M1-T1 null genotypes (Figs. 2 and 3) among the 45 GAHPs in this study. The observation that South East Asian and South American PAs share the null allele frequency pattern of the South Indian Andhra Pradesh PA is in agreement with reports of an agro-pastoral system in South India that acted as an agricultural center and a source of dispersion for lineages from different preexisting populations [60].
Furthermore, the East African patterns from Zimbabwe (II) and Ethiopia (III) found in India and Iran (South Asia), and the South Asian pattern from South Indian Andhra Pradesh (VII) found in Vietnam (South East Asia), are in concordance with the migration pattern of Homo sapiens from East Africa with demographic expansions reported by Field and Lahr (2006), who used geographic information systems to investigate dispersal during oxygen isotope stage 4 [62]. The GST M1-T1 null allele frequency of South East Asian PAs is reported in this study as a complex admixture of the Zimbabwe (II), Ethiopia (III) and Andhra Pradesh (VII) patterns. Finally, the scatter plot analysis (Fig. 3) demonstrates the allele frequency patterns from South Asian Afghanistan (V), Caucasians (VI) and South Indian Andhra Pradesh (VII) as an "Other than Africa" split among the 45 GAHPs for GST M1-T1 null genotypes with respect to their geographical distribution. This "Other than Africa" split is in agreement with the concept of later migration of populations into regions other than Africa [60,64]. In conclusion, the seven patterns of GST M1-T1 null allele frequency from the Xhosa tribe (I), Zimbabwe (II), Ethiopia (III), Egypt (IV), Afghanistan (V), Caucasians (VI) and South Indian Andhra Pradesh (VII) reported in this study compare well with earlier studies suggesting that PAs of relatively recent origin show comparatively small genetic differences and high genetic affinity among them [11,46,52,58]. These seven patterns (I-VII) of GST M1-T1 null allele frequency may help to address the missing link in the many genomic epidemiological studies that lack a conclusive risk association [9,16,17].
The "Africa" (I), "Out of Africa" (II, III and IV) and "Other than Africa" (V, VI and VII) splits among the 45 GAHPs reported in this study need to be explored further to rationalize the GST M1-T1 null allele frequency patterns in world populations.