Geographically Structured Populations of Cryptococcus neoformans Variety grubii in Asia Correlate with HIV Status and Show a Clonal Population Structure

Cryptococcosis is an important fungal disease in Asia with an estimated 140,000 new infections annually, the majority of which occur in patients suffering from HIV/AIDS. Cryptococcus neoformans variety grubii (serotype A) is the major causative agent of this disease. In the present study, multilocus sequence typing (MLST) using the ISHAM MLST consensus scheme for the C. neoformans/C. gattii species complex was used to analyse nucleotide polymorphisms among 476 isolates of this pathogen obtained from 8 Asian countries. Population genetic analysis showed that the Asian C. neoformans var. grubii population has limited genetic diversity and a largely clonal mode of reproduction when compared with the global MLST dataset. HIV-status, sequence types and geography were found to be confounded. However, a correlation between sequence types and isolates from HIV-negative patients was observed among the Asian isolates. Observations of high gene flow between the Middle Eastern and the Southeastern Asian populations suggest that immigrant workers in the Middle East were originally infected in Southeastern Asia.


Introduction
Cryptococcosis is one of the main fungal diseases in Asia due to the AIDS pandemic and is caused by members of the Cryptococcus neoformans/C. gattii species complex [1,2]. In South and Southeast Asia, the number of HIV-infected patients that annually acquire cryptococcosis is estimated to be over 140,000 [3], with the majority of cases being caused by C. neoformans var. grubii [4-12]. The causative agent is an encapsulated opportunistic pathogenic basidiomycetous yeast. Cryptococcosis caused by C. neoformans var. grubii has also been reported to occur in immunocompetent individuals in the Asian region, e.g. from China, Japan, Korea and Taiwan [8,13-16]. In Vietnam, cryptococcosis in both immunocompromised and immunocompetent individuals was found to be mainly caused by C. neoformans var. grubii [17]. C. neoformans var. grubii (serotype A) has a global distribution and is found in avian excreta, especially from pigeons, and decaying wood [18-21]. The other variety, C. neoformans var. neoformans (serotype D), also has a worldwide distribution but is more frequently encountered in Europe [22,23]. C. gattii (serotypes B and C), a sibling species of C. neoformans, is associated with many tree species in tropical and subtropical regions [18,24-27] and is a major cause of cryptococcal meningitis in immunocompetent individuals. This latter species has also been reported as a causative agent in immunocompromised individuals, particularly HIV-infected patients and solid-organ transplant patients [19,28-31]. Since 1999, C. gattii has emerged in various outbreaks, e.g. at Vancouver Island (British Columbia, Canada), the Pacific Northwest of the United States and more recently in Mediterranean Europe [28,32-35].
Several molecular typing methods, including PCR-fingerprinting, randomly amplified polymorphic DNA (RAPD), PCR-restriction fragment length polymorphism (PCR-RFLP), amplified fragment length polymorphism (AFLP), microsatellite typing, multilocus microsatellite typing (MLMT) and multilocus sequence typing (MLST), have been developed for the investigation of the epidemiology of species belonging to the C. neoformans/C. gattii species complex [30,31,36-41]. MLST has several advantages over other commonly used typing methods, because the technique is highly reproducible and MLST sequence data can be stored in internet databases, such as http://www.mlst.net/ and http://mlst.mycologylab.org. Thus, the data are portable and exchangeable between laboratories. Recently, seven unlinked genetic loci, i.e. CAP59, GPD1, IGS1, LAC1, PLB1, SOD1 and URA5, that represent housekeeping genes, virulence factor coding genes and the intergenic spacer of the ribosomal DNA have been selected for MLST analysis of the C. neoformans/C. gattii complex by the International Society of Human and Animal Mycoses (ISHAM) working group on "Genotyping of C. neoformans and C. gattii" [39].
Previous studies that used MLST and AFLP to investigate the population structure of C. neoformans var. grubii showed a correlation between both methods and grouped the isolates into three genetically different subgroups, named AFLP1/VNI, AFLP1A/VNII/VNB and AFLP1B/VNII [36,39,42,43]. The AFLP1/VNI and AFLP1B/VNII genotypes occur globally and form a monophyletic cluster, whereas the AFLP1A/VNB genotype occurs in Southern Africa, especially Botswana, but has also been reported from Brazil [36,43]. Previously, recombination has been observed within subpopulations in Botswana, but at the global scale reproduction is mainly clonal [43]. MLST has also been used to trace the putative origin of Cryptococcus populations [31,32,34]. Simwami and coworkers (2011) showed a correlation between MLST types among Thai and African C. neoformans var. grubii isolates that supported the hypothesis of long-distance dispersal from the African continent to Asia within the last 5,000 years [41].
In the current study, MLST was employed to determine the genetic diversity and epidemiological relationships of a collection of clinical and environmental C. neoformans var. grubii isolates that originated from various geographic locations in Asia, including countries from East, South/Southeast Asia and the Middle East.

MLST determination
DNA from each isolate was amplified by PCR in 25 µl reaction volumes for each of the seven MLST loci using the primers and protocols described in Table 1. Each amplicon was subsequently sequenced using the BigDye v3.1 Chemistry kit (Applied Biosystems, Foster City, CA) with the same primers as used to obtain the amplicons. Sequencing reaction products were purified with Sephadex G-50 Superfine columns (Amersham Biosciences, Piscataway, NJ) and a MultiScreen HV plate (Millipore, Billerica, MA). An ABI 3700XL DNA analyzer (Applied Biosystems) was used to determine the forward and reverse DNA sequences. Consensus sequences were manually edited using SeqMan v8.0.2 (DNASTAR, Madison, WI) and were subsequently aligned with MEGA v5.05 (www.megasoftware.net). Allele Types (ATs) were assigned to each of the seven loci, resulting in a seven-digit allelic profile for each isolate. The allelic profiles were then defined as Sequence Types (STs) according to the ISHAM MLST consensus scheme for the C. neoformans/C. gattii species complex (http://mlst.mycologylab.org). All sequences have been deposited in GenBank under the accession numbers KC529683 to KC533008 (Table C in Supplementary Tables S1) and novel ATs have been added to http://mlst.mycologylab.org/. ATs analysed in Simwami et al. (2011) [41] had the indels removed in order to make them compatible with the then-current MLST dataset. However, the current MLST scheme (http://mlst.mycologylab.org) includes indels, and therefore we realigned the entire set of sequences from the latter study in our analyses. This required the reassignment of a number of ATs from the dataset of Simwami et al. (2011) [41] (Table A in Supplementary Tables S1). For the global comparison we used data from http://mlst.mycologylab.org and recent reports by Cogliati and colleagues (2013) [44] and Mihara and colleagues (2012) [16].
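The step from a seven-locus allelic profile to a sequence type is in essence a table lookup. The sketch below illustrates that idea only; the function names, the allele numbers and the ST labels are hypothetical and are not taken from the ISHAM database, whose real profile table would have to be downloaded separately.

```python
# Illustrative sketch: map a seven-locus allelic profile to a sequence type (ST).
# The locus order follows the ISHAM consensus loci named in the text.
LOCI = ("CAP59", "GPD1", "IGS1", "LAC1", "PLB1", "SOD1", "URA5")


def profile_key(alleles):
    """Return the allelic profile as a tuple ordered by LOCI."""
    missing = [locus for locus in LOCI if locus not in alleles]
    if missing:
        raise ValueError(f"missing loci: {missing}")
    return tuple(alleles[locus] for locus in LOCI)


def assign_st(alleles, st_table):
    """Look up the ST of one isolate; None means the profile is not in the table."""
    return st_table.get(profile_key(alleles))


# Made-up allele numbers and ST labels, for illustration only.
example_table = {
    (1, 1, 1, 1, 1, 1, 1): "ST-A",
    (1, 1, 2, 1, 1, 1, 1): "ST-B",
}
isolate = dict(zip(LOCI, (1, 1, 2, 1, 1, 1, 1)))
novel = dict(zip(LOCI, (9, 1, 2, 1, 1, 1, 1)))
```

A profile that returns None would correspond to a candidate novel ST that needs to be submitted to the database curators.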

Nucleotide diversity
The extent of DNA polymorphism, including haplotype diversity (Hd), nucleotide diversity (π), number of polymorphic sites (S), average number of nucleotide differences (k) and Watterson's estimate of the population-scaled mutation rate per sequence (θs), was calculated using DnaSP v5.10 (http://www.ub.edu/dnasp/) [47]. Tajima's D, Fu & Li's D*, Fu & Li's F* and Fu's Fs, tests for neutrality, were also calculated using DnaSP.

Table 2. Distribution of sequence types (STs) of C. neoformans var. grubii isolates among different countries. Columns: Country; Sequence types (STs); Total.
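As a rough guide to what the polymorphism statistics in the Nucleotide diversity section measure, the following sketch computes the number of polymorphic sites, the average number of pairwise differences, nucleotide diversity per site, Watterson's estimate per sequence, and haplotype diversity from a toy alignment. It uses the textbook definitions and assumes equal-length sequences without gaps or ambiguity codes; it is not the DnaSP implementation, and the four sequences are invented.

```python
from collections import Counter
from itertools import combinations


def diversity_stats(seqs):
    """Basic polymorphism statistics for aligned, gap-free sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")

    # S: alignment columns showing more than one nucleotide state
    s_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)

    # k: mean number of differences over all pairs of sequences
    pair_diffs = [sum(a != b for a, b in zip(x, y))
                  for x, y in combinations(seqs, 2)]
    k = sum(pair_diffs) / len(pair_diffs)

    # Watterson's estimate per sequence: S divided by the harmonic number a1
    a1 = sum(1.0 / i for i in range(1, n))

    # Haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)
    counts = Counter(seqs)
    hd = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

    return {"S": s_sites, "k": k, "pi": k / length,
            "theta_S": s_sites / a1, "h": len(counts), "Hd": hd}


# Invented toy alignment, for illustration only.
toy = ["AAAA", "AAAT", "AATT", "AAAA"]
stats = diversity_stats(toy)
```

For real MLST alignments, sites with gaps would need an explicit policy (DnaSP offers options for this), which the sketch deliberately leaves out.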

Recombination
Two common statistics for multilocus linkage disequilibrium analysis, the index of association (I_A) and rBarD, were estimated using Multilocus v1.3b (http://www.agapow.net/software/multilocus/) [52]. These statistics test the null hypothesis of free recombination (i.e. no linkage disequilibrium). The observed values of I_A and rBarD were compared against the expected values obtained with 1,000 randomized data sets. Using these criteria, p < 0.05 indicates that the null hypothesis of free recombination should be rejected and, consequently, indicates the presence of substantial clonal reproduction. In addition, the Pairwise Homoplasy Index (PHI) test implemented in SplitsTree v4.0 (http://www.splitstree.org/) [53] and the pairwise linkage disequilibrium analysis implemented in DnaSP v5.10 using Fisher's exact test were used to detect recombination events among populations using separate alignments for all seven MLST loci. We also used the reticulated network analysis using the CASS algorithm [54] to detect recombination among the Asian population using alignments of concatenated sequences for all seven MLST loci. The genome of C. neoformans var. neoformans strain B3501 (= CBS 6900) was used as an outgroup for the CASS network analysis.
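The logic of the index of association and rBarD, and of the randomization test used to assess them, can be sketched as follows. The code follows the definitions as they are usually given for haploid allelic profiles (variance of pairwise multilocus distances compared with the sum of single-locus variances) and shuffles alleles within each locus to simulate free recombination. It is an illustrative sketch, not a reimplementation of Multilocus v1.3b; the function names and the ten-isolate dataset are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt


def _variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def association_stats(profiles):
    """Return (I_A, rBarD) for a list of haploid allelic profiles."""
    n_loci = len(profiles[0])
    pairs = list(combinations(range(len(profiles)), 2))
    # 0/1 distance at every locus for every pair of isolates
    per_locus = [[int(profiles[a][j] != profiles[b][j]) for a, b in pairs]
                 for j in range(n_loci)]
    totals = [sum(column) for column in zip(*per_locus)]
    v_obs = _variance(totals)                       # observed variance
    locus_vars = [_variance(d) for d in per_locus]
    v_exp = sum(locus_vars)                         # expected if loci are independent
    cross = 2 * sum(sqrt(x * y) for x, y in combinations(locus_vars, 2))
    if v_exp == 0 or cross == 0:
        raise ValueError("need at least two polymorphic loci")
    return v_obs / v_exp - 1, (v_obs - v_exp) / cross


def permutation_p(profiles, n_rand=1000, seed=0):
    """One-sided p-value for rBarD under free recombination."""
    rng = random.Random(seed)
    observed = association_stats(profiles)[1]
    columns = [list(column) for column in zip(*profiles)]
    hits = 0
    for _ in range(n_rand):
        # Shuffle each locus independently to break associations between loci
        shuffled = [rng.sample(column, len(column)) for column in columns]
        if association_stats(list(zip(*shuffled)))[1] >= observed:
            hits += 1
    return (hits + 1) / (n_rand + 1)


# Hypothetical fully clonal dataset: two genotypes, three loci, five isolates each.
clonal = [(1, 1, 1)] * 5 + [(2, 2, 2)] * 5
ia, rbard = association_stats(clonal)
p_value = permutation_p(clonal, n_rand=200)
```

With complete linkage between loci, as in this toy dataset, rBarD reaches its maximum of 1, whereas I_A grows with the number of loci; this is why rBarD is often preferred when datasets with different numbers of loci are compared.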

Phylogenetic relationships
The minimum spanning tree that represented the comparison between the original sources of C. neoformans isolates and their allelic profiles was generated by Phyloviz v1.0 using the goeBURST algorithm (http://goeburst.phyloviz.net/) [55].
Phylogenetic analyses were performed using the Neighbor-joining method with 1,000 bootstrap replicates implemented in MEGA v5.10. The substitution model of this analysis was the uncorrected genetic distances (p-distance) model.

neoformans var. grubii isolates from Hong Kong as described previously [8]. All other data were taken from the Pan et al., 2012 study [8]. Recently published epidemiological cutoff values (ECVs) for C. neoformans var. grubii of AMB, 5FC, FLU, ITR, VOR and POS were implemented in this study. The ECVs for C. neoformans var. grubii are 8 µg/ml for 5FC and FLU, 0.25 µg/ml for ITR, VOR and POS, and 1 µg/ml for amphotericin B (AMB) [56,57].
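Applying the ECVs amounts to comparing each minimum inhibitory concentration (MIC) with the cutoff for the corresponding drug. The sketch below assumes the usual convention that an MIC at or below the ECV is called wild-type and an MIC above it non-wild-type; the function name and the example MICs are hypothetical, and MICs must be given in the same units as the ECVs stated in the text.

```python
# ECVs for C. neoformans var. grubii as listed in the text [56,57]
ECV = {"AMB": 1.0, "5FC": 8.0, "FLU": 8.0,
       "ITR": 0.25, "VOR": 0.25, "POS": 0.25}


def classify(drug, mic):
    """Label one MIC as wild-type or non-wild-type relative to the ECV.

    Assumption: MIC <= ECV is wild-type, MIC > ECV is non-wild-type.
    """
    if drug not in ECV:
        raise KeyError(f"no ECV defined for {drug}")
    return "wild-type" if mic <= ECV[drug] else "non-wild-type"


# Hypothetical MICs for one isolate, for illustration only.
example_mics = {"AMB": 0.5, "5FC": 16.0, "FLU": 8.0, "VOR": 0.5}
calls = {drug: classify(drug, mic) for drug, mic in example_mics.items()}
```

An ECV separates wild-type from non-wild-type populations and is not a clinical breakpoint, so a non-wild-type call does not by itself predict treatment failure.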

Statistical analyses
Analysis of molecular variance (AMOVA), implemented in Arlequin v3.5.1.3, was used to analyze the hierarchical structuring of genetic variation among Asian C. neoformans populations using the concatenated MLST sequences. Significance was assessed by computing pairwise distance matrices from the MLST sequences using 10,000 permutations [58].

MLST analysis
The 476 C. neoformans var. grubii isolates in our dataset were obtained from 228 HIV-positive patients, 134 HIV-negative patients, and 47 individuals with unknown HIV status (Table A in Supplementary Tables S1), as well as 67 isolates from avian droppings from Chiang Mai, Thailand and Tokyo, Japan (Table B in Supplementary Tables S1). All isolates possessed mating-type α and serotype A (i.e. were αA). The genetic diversity of the 476 C. neoformans var. grubii isolates as assessed by MLST revealed 28 sequence types (STs). Most isolates from the Chinese, Hong Kong, and Japanese populations belonged to ST5. Fourteen percent of the Thai isolates (n = 30) belonged to this ST, which was significantly rarer in Thailand than in China (Chi-square p < 0.0001). ST4 and ST6 were found to be the major MLST types in Thailand, while ST93 was dominant in India and Indonesia (Chi-square p < 0.0001). Fifteen isolates from the Middle East were distributed among 10 STs (Table 2). Most STs from this area consisted of a single isolate, except ST5 and ST31. Most isolates of these latter two STs were obtained from immigrant workers that originated from India, Indonesia, Philippines, and Sudan (Table A in Supplementary Tables S1). Among the 409 clinical isolates, 24 STs were identified and 16 of them contained clinical isolates only (Tables 3-4). STs 4-6 and 93 were the predominant STs and accounted for 83 (20.3%), 142 (34.7%), 78 (19.1%) and 52 (12.7%) isolates, respectively (Chi-square p < 0.0001). The remaining STs consisted of few isolates, except ST31 and ST77 that consisted of 10 (2.4%) and 14 (3.4%) isolates, respectively (Table 3). The majority of isolates from HIV-negative patients belonged to ST5 (n = 92; 68.7%), while the majority of isolates from HIV-positive people belonged to STs 4, 5, 6 and 93 that accounted for 72 (31.6%), 27 (11.8%), 68 (29.8%) and 41 (18%) isolates, respectively (Table 3).
We investigated how genetic variation was structured across the Asian clinical isolate dataset (i.e. isolates from HIV-positive, -negative and unknown HIV-status patients) using AMOVA. This analysis showed that allelic variation within populations (88.38%) was higher than that observed among populations (11.62%) (p < 0.0001) (Table 4). When we compared clinical isolates from HIV-positive patients categorized into three regions, East Asia, Middle East and South/Southeast Asia, the variance within populations was approximately 92% (p = 0.0654 ± 0.0025), indicating that significant variation in MLST genotypes occurred among individuals within each regional population group. In contrast, MLST genotypic variation within populations of isolates from HIV-negative patients (36.80%) was smaller than the variance observed among populations (63.20%) (p < 0.0001). Chi-squared tests showed a relationship between HIV status and STs (p < 0.0001; Cramer's V = 0.474) (Sheet S1). A binary logistic regression test showed that ST5 is associated with HIV status (p < 0.0001) and the standardized (adjusted) Pearson residuals showed that ST5 correlated with isolates obtained from HIV-negative patients (Sheet S1). Of the 92 ST5 isolates, almost all were sampled from East Asia, including China (n = 70; 76.1%) and Japan (n = 20; 21.7%).
Only clinical and environmental isolates from Chiang Mai, Thailand and Tokyo, Japan could be compared as no environmental isolates could be studied from the other regions. Sixty-seven environmental isolates from Chiang Mai, Thailand, and Tokyo, Japan, belonged to 12 STs, of which five STs (i.e. ST141, ST176, ST188, ST190 and ST193) contained environmental isolates only. STs 4, 5 and 6 were the predominant STs found among the environmental isolates (Chi-square p < 0.0001). The majority of environmental isolates from Chiang Mai, Thailand, belonged to ST4 (n = 22; 38.6%) and ST6 (n = 18; 31.6%) (Chi-square p = 0.0042), while almost all Japanese environmental isolates belonged to ST5 (n = 8; 80%) (Fisher's exact test p = 0.064) (Table 5).
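To make the association measure concrete, the sketch below computes a Pearson chi-square statistic and Cramer's V for a contingency table. As input it uses a 2 x 2 table collapsed from counts given in the text (ST5 versus all other STs, for 134 HIV-negative and 228 HIV-positive patients). Because the published test was run on the full ST-by-HIV-status table in Sheet S1, this collapsed table is only an illustration and is not expected to reproduce V = 0.474; the function name is hypothetical and no continuity correction is applied.

```python
from math import sqrt


def chi_square_and_cramers_v(table):
    """Pearson chi-square statistic and Cramer's V for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Cramer's V rescales chi-square by sample size and table dimensions
    k = min(len(table), len(table[0])) - 1
    return chi2, sqrt(chi2 / (n * k))


# Rows: HIV-negative, HIV-positive. Columns: ST5, all other STs.
# Counts collapsed from the text: 92 of 134 and 27 of 228 isolates were ST5.
collapsed = [[92, 134 - 92],
             [27, 228 - 27]]
chi2, cramers_v = chi_square_and_cramers_v(collapsed)
```

Cramer's V ranges from 0 (no association) to 1 (complete association), which makes it easier to compare across tables of different size than the chi-square statistic itself.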

Association between sequence types and geographic origin of Asian C. neoformans isolates
In order to determine the distribution of STs in different geographical locations, minimum spanning trees and phylogenetic analyses were undertaken based on allelic profiles using the goeBURST algorithm and analysis of concatenated sequences with the Neighbor-joining algorithm, respectively (Figure 1 and 2). Three lineages were observed in the minimum spanning tree. Group 1 contained mostly isolates of STs 5, 186, 193 and 194 that originated from China, Hong Kong, and Japan, and also contained 30 out of 222 (13.6%) isolates from Thailand. Group 2 contained mostly isolates from Thailand (n = 184; 82.9%). The predominant STs in this group were ST4 and ST6. Group 3 comprised most of the Indian and Indonesian isolates that belonged to STs 31, 77 and 93 (Figure 1A).
Phylogenetic analysis of the Asian isolates also showed three clusters. Cluster I/VNI contained three major STs (ST4, 5 and 6) that contained C. neoformans isolates from China, Hong Kong, Japan, and Thailand. Most Indian and Indonesian isolates occurred in cluster II/VNI, whereas cluster III/VNII contained one ST (ST40) with only isolates from India (Figure 2). The Middle East isolates showed a more scattered distribution (Figure 1A and 2). Two isolates of ST31 came from Qatar, but they were isolated from Indian and Sudanese immigrant workers, suggesting that their geographical origins lie elsewhere. Among the clinical isolates, the minimum spanning tree and Maximum Likelihood tree showed an association of the predominant STs, including STs 4, 6 and 93, with C. neoformans isolates from HIV-positive patients, while ST5, one of the predominant STs, contributed mainly to isolates from HIV-negative patients (Figure 1B and 2; Figure A in Supplementary Figures S1).
The global C. neoformans var. grubii MLST dataset that contained 179 isolates originating from Africa (n = 45), North/South America (n = 31), Asia (n = 55) and Europe (n = 48) was compared using the goeBURST algorithm with the isolates from Asia. Most Asian C. neoformans isolates clustered together in one group, but a few Asian isolates showed a scattered distribution. Two clusters of African isolates and one cluster of European isolates were observed. Some of the isolates from those regions showed a scattered distribution, as did the North/South American isolates (Figure 1C). Phylogenetic analysis using Neighbor-joining showed three clades among the global C. neoformans var. grubii isolates. Clade I/VNII contained isolates from Africa, North/South America, Asia and Europe, clade II/VNB contained 17 STs from African isolates and one ST comprising European isolates, and almost all Asian STs occurred in clade I/VNI that also contained isolates from other global regions (n Africa = 27, n North/South America = 23, n Europe = 45) (Figure 3).

Nucleotide diversity
Nucleotide sequences of all seven loci studied (CAP59, GPD1, IGS1, LAC1, SOD1, PLB1 and URA5) had between 6 and 15 polymorphic sites (Table 6). Locus IGS1 had the highest nucleotide diversity (π) of 0.0045, followed by LAC1 (π = 0.0018) and GPD1 (π = 0.0014). The average number of nucleotide differences per sequence, i.e. the k-value, of most loci ranged from 0.046 to 0.867, except for locus IGS1 that had a higher k value of 3.274. Locus LAC1 showed the highest mutation rate (θs = 2.226), while the other loci had low θs values ranging from 0.890 to 2.077. The number of haplotypes (alleles) at each locus ranged from 3 for CAP59 and SOD1 to 7 for LAC1. Haplotype diversity ranged from 0.013 for SOD1 to 0.658 for LAC1. The neutrality tests, including Tajima's D, Fu & Li's D*, Fu & Li's F* and Fu's Fs, showed significant evidence of purifying selection for all loci, except IGS1 that showed some evidence of balancing selection (Table 6). The number of polymorphisms of the concatenated sequences of C. neoformans var. grubii isolates obtained from the East Asian region, including China, Hong Kong and Japan, was lower than that of the South/Southeast Asian isolates (i.e. India, Indonesia and Thailand), and that of the Middle East isolates (i.e. Kuwait and Qatar) (Figure 4A and Table E in Supplementary Tables S1). The highest nucleotide diversity (π = 0.002), the highest average number of nucleotide differences per sequence (k = 7.962), and the highest haplotype diversity (Hd = 0.924) were found in C. neoformans isolates from Kuwait and Qatar. C. neoformans isolates from South/Southeast Asia had 75 polymorphic sites (S) and 21 different haplotypes (h), and a high mutation rate per sequence (θs = 11.816).
Within the South/Southeast region, haplotype diversity (Hd) of each population was similar, while other nucleotide polymorphism estimates of each population, such as number of polymorphic sites (S), nucleotide diversity (π), mutation rate (θ) and the average number of nucleotide differences per sequence (k), were different (Figure 4A, Table E in Supplementary Tables S1). A significant signal of purifying selection was observed in two C. neoformans populations, namely the one from East Asia (i.e. the Japanese population) and the South/Southeast Asian one (i.e. the Indian population), whereas evidence of a balancing selection or expansion of rare polymorphisms was found in the Indonesian population (Table E in Supplementary Tables S1). Compared to the global MLST dataset, the Asian population had lower values of nucleotide diversity (π) and haplotype diversity (Hd) of 0.0016 and 0.780, respectively, than those of the African  Table F in Supplementary Tables S1). The Asian population had lower numbers of polymorphic sites (S) and number of haplotypes (h) of 81 and 32, respectively, than those from the African populations (S = 124; h = 34), but they were higher than those from the North/South American (S = 79; h = 16) and European populations (S = 78; h = 17) (Figure 4B and Table F in Supplementary Tables S1). Neutrality tests indicated that the variation of all populations was neutral and population sizes did not change. However, the North/South American population showed evidence for overdominant selection, whereas the remaining populations showed purifying selection or population expansion.

Population structure of Asian C. neoformans var. grubii
Genetic differences and the level of gene flow between each pair of populations from the three Asian regions studied were estimated using two statistics, FST and Nm, using concatenated MLST sequences (4,022 bp). Genetic differences between the East Asian and the South/Southeast Asian populations (FST = 0.351), and between the East Asian and Middle East populations (FST = 0.233), were higher than those between the South/Southeast Asian and Middle East populations (FST = 0.019) (Figure 4A, Table G in Supplementary Tables S1). High levels of gene flow, indicated by an Nm value of > 1, were observed between the South/Southeast Asian population and the Middle East population (Nm = 25.76), and, secondly, between the East Asian population and the Middle East population (Nm = 1.64). However, the Nm value between the South/Southeast Asian and the Middle East populations was much higher than that between the East Asian and the Middle East populations. When the MLST data of the Asian isolates were compared to those from the African, North/South American, and European continents, FST and Nm estimates were between 0.193 and 0.222, and between 1.76 and 2.09, respectively (Figure 4B, Table H in Supplementary Tables S1), indicating the presence of slight genetic differences, but the occurrence of significant gene flow between the Asian population and those from Africa, North/South America, and Europe.
Clusters of Asian C. neoformans var. grubii populations were estimated using different numbers of populations that ranged from K = 2 to K = 10 using Structure. The Evanno method implemented in the Structure Harvester showed that the highest delta K, an ad hoc statistic, was produced at K = 3 (Figure BA in Supplementary Figures S1). This implies that K = 3 is a good estimate for the actual number of populations included in this study, thus suggesting that three real genetic population clusters occur among the Asian C. neoformans var. grubii isolates that do not fully corroborate the geographically identified populations. The distribution of these three populations differed between the countries (Figure 5A). Almost all cryptococcal isolates from China, Hong Kong and Japan, as well as some isolates from Thailand, belonged to population I. The Thai and part of Indonesian isolates formed

Table 5. Distribution of sequence types (STs) of C. neoformans isolates from clinical and environmental samples from Thailand and Japan.

The population structure of the Asian isolates was the same as described above, but two other major populations occurred, namely an African and North/South American population, and a European one. The African and American populations were genetically diverse. Some isolates contained haplotypes occurring among isolates from Asia and Europe and a few of the European isolates shared haplotypes that occurred in isolates from other continents (Figure 5B). However, whether these isolates represent acquisitions from the local environment, or are due to the patient traveling with an in situ latent infection is not known and requires further sampling of environmental isolates.
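The paired fixation index and gene flow values reported above appear to follow the island-model relation for haploid organisms, Nm = (1/F_ST - 1)/2. This is an inference from the published numbers rather than a statement in the text about how DnaSP was configured. The short sketch below applies that assumed relation to the reported fixation indices; small deviations from the published Nm values are expected because the fixation indices are rounded to three decimals.

```python
def nm_from_fst(fst):
    """Gene flow estimate from a fixation index.

    Assumption: haploid island model, Nm = (1/F_ST - 1) / 2.
    """
    if not 0 < fst < 1:
        raise ValueError("the fixation index must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 2.0


# Fixation indices reported in the text for the three regional comparisons
reported = {
    "East Asia vs South/Southeast Asia": 0.351,
    "East Asia vs Middle East": 0.233,
    "South/Southeast Asia vs Middle East": 0.019,
}
estimates = {pair: nm_from_fst(fst) for pair, fst in reported.items()}
```

Under this relation a fixation index below one third corresponds to Nm above 1, which is consistent with only the two Middle East comparisons being described as showing high gene flow.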

Recombination
The index of association (I_A) and rBarD values were estimated from the allelic data set to determine the presence of clonality and recombination among Asian C. neoformans var. grubii populations. For all isolates in the entire Asian population and in those from each region (i.e. East Asia, the Middle East and South/Southeast Asia) both I_A and rBarD values strongly rejected the null hypothesis of free recombination (Table 7). However, two recombination events were observed using the pairwise linkage disequilibrium routine implemented in DnaSP (Table I in Supplementary Tables S1). One event occurred at locus GPD1 and the other at locus IGS1. Results of the Pairwise Homoplasy Index (PHI) test (Table J in Supplementary Tables S1) showed that no recombination occurred within each locus across all Asian populations; however, two recombination events were observed among concatenated sequences of isolates from East and South/Southeast Asia. CASS analysis  Figures S1) showed that no recombination occurred among concatenated sequences of Asian isolates. When the global MLST dataset was included in the allelic data, rBarD showed that the overall population was in significant linkage disequilibrium. The PHI test did not detect recombination events occurring within each locus, but could detect the presence of recombination events among concatenated sequences of the global MLST dataset (Table J in Supplementary Tables S1).

Discussion
Previous studies on the genetic structure of C. neoformans var. grubii from Thailand using MLST data showed limited genetic variation [41] with the majority of isolates belonging to STs 4, 5, and 6 (designated as ST44, 45 and 46, respectively, in the original paper by Simwami et al., 2011, Table A in Supplementary Tables S1). Two of these predominant STs (i.e. ST4 and ST6) differ only in four nucleotides at a single locus [41]. In the current study, we increased the size of the Asian MLST dataset to include nearly 500 C. neoformans var. grubii isolates originating from three broadly defined regions, namely East Asia (China, Hong Kong and Japan), South/Southeast Asia (India, Indonesia and Thailand), and the Middle East (Kuwait and Qatar). We found that 99.8% (n = 475) of these isolates belonged to lineage VNI, 0.2% (n = 1) were VNII and 0% were VNB. The C. neoformans var. grubii population from the East Asian region showed less genotypic variation than those from the South/Southeast Asian and the Middle East regions, and most isolates belonged to ST5. This latter genotype was previously found to be the main ST in China, Japan and South Korea [13,14,16] and was reported previously as the MLST M5 genotype [13,14] [41], the Asian C. neoformans var. grubii population was found to be less diverse than the African, the North/South American, and the European populations. The population genetic structure of the Asian population was found to be different from the African, and the North/South American and European populations, and contained very few isolates that shared haplotypes occurring in these other populations. On the other hand, the African, North/South American, and European populations also contained some isolates that contained haplotypes occurring among Asian isolates.
These findings are in agreement with previous investigations that showed a high genetic diversity of the African population, especially genotype VNB, and less genetic diversity of the Asian population. Note that the VNB lineage also contains Brazilian isolates [36,40,41,59,60]; thus, additional sampling at the global scale may show a broader occurrence of this genotype.
In the current study, no evidence of recombination was detected in the entire Asian region, in the smaller regions, or at the global scale using the CASS, Multilocus (rBarD), and PHI (in case of separated loci) analyses. These results suggest that the entire Asian C. neoformans var. grubii population is largely clonal as was previously shown for the Thai population only [41]. However, the pairwise linkage disequilibrium analysis showed a recombination event among sequences of the GPD1 and IGS1 loci and the PHI test also detected recombination events among concatenated sequences in the East and South/Southeast Asia populations. This may be due to non-meiotic reproduction as previously reported [36,41,43] amongst isolates that are of the same α-mating type. However, despite this limited recombination, clonal propagation of genotypes predominated, leading to a widespread occurrence and overrepresentation of clonal genotypes as has also been seen in several other pathogens, such as Neisseria meningitidis, Mycobacterium tuberculosis, Fusarium oxysporum, and Leishmania tropica [61].
Our analyses revealed a significant association between predominant sequence types (STs) and their geographical origin in Asia that was not encountered before. These local geographic differences could result from different founder effects and/or regional factors, and may be due to environmental and climate differences [6,8,62,63]. Sequence type ST5 was the predominant MLST genotype found in East Asia and the North of Thailand. Due to the association of C. neoformans with birds, dispersal may have occurred via the East Asian-Australian flyway [8,64], and may contribute to the broader distribution of these genotypes. As most bird migrations generally happen twice a year depending on weather conditions, this may also contribute to the observed limited dispersal and low gene flow estimates between East Asia and Northern Thailand. Thus, it seems that Asian C. neoformans isolates efficiently reproduce clonally, and that rare recombination events may result in an increased genetic variation at some locales, e.g. due to same-sex mating or rare MATa x MATa crossings [65].
A scattered MLST distribution of the Middle East population was observed, similar to previous findings using microsatellite analysis [8]. Owing to a low number of isolates (n = 15) with ten haplotypes, the Middle East C. neoformans var. grubii population showed the highest haplotype diversity in our dataset. Seven out of these 15 C. neoformans var. grubii isolates were obtained from immigrant workers, mainly originating from South/Southeast Asia, who may have acquired the yeast in their home countries and, subsequently, carried the pathogen when moving to the Middle East region.
Cryptococcal species are known to have a high prevalence of subclinical infection due to infection in childhood [66]. Immigrants and tourists in Europe showed the same phenomenon of being infected by strains that were obtained from their home-country [31,46,67]. The observed high level of gene flow between the Middle East and the South/Southeast Asian populations supports this human migration hypothesis. Minimum spanning tree analysis showed that most isolates from India and Indonesia belonged to the same STs (i.e., ST77 and ST93) and a high level of gene flow was observed between these two populations. This observation, unfortunately, is not easy to explain and requires more sampling especially from the environment in these regions.
HIV-status, STs and geographic origins were found to be confounded. However, a significant correlation occurred between the predominant ST5 and HIV-negative patients in Asia. Thus, our study reinforced that genetic differences occur between C. neoformans var. grubii isolates from HIV-positive and HIV-negative patients in Asia [8,17,68,69]. Therefore, this finding may explain the observed correlation of MLST genotypes and HIV status in our study. To unravel the effect of geographically determined genetic variation in the pathogen and its link to HIV status, extensive sampling of environmental isolates is needed across the region in order to decouple the HIV-status of individuals from the geographical origin of the isolates. Next to possible genetically determined differences amongst Cryptococcus isolates, human factors, such as anti-interferon-γ and anti-granulocyte-macrophage colony-stimulating factor autoantibodies that have been observed in Taiwanese and Thai patients to be associated with adult-onset immunodeficiency without HIV-infection [70], may contribute to the observed specific pathogen-host correlations.
Recently, epidemiological cutoff values (ECVs) have been defined for the major antifungals against C. neoformans and C. gattii [56,57]. When we used those values, the interpretation of the overall susceptibility results of all isolates, including those presented by Pan et al. [8], differed slightly from the interpretation given by these last authors. Twenty-four clinical isolates mainly from Indonesia and Thailand showed high MIC values of 5FC, FLU and VOR. As 5FC is not used in those two countries, resistance to this compound is unlikely to be induced by patient treatment and the origin of this resistance needs further studies.

Conclusion
MLST typing showed significant genotypic variation between C. neoformans var. grubii populations originating from different Asian regions. Each country had a unique distribution of STs, especially of the predominant STs. Overall, the Asian population showed limited genetic diversity and reproduction is mainly clonal when compared with data from the global C. neoformans var. grubii MLST dataset. A correlation between STs and HIV-negative status, and resistance traits was observed. A largely clonal reproduction strategy helps to maintain these regional differences that are clinically relevant due to their association with the HIV-status of the patients, which also differs between the regions studied.

Supporting Information
Sheet S1 Chi-square test of Asian C. neoformans according to sequence types (STs) and HIV status.