Molecular epidemiology of Paracoccidioides spp. recovered from patients with paracoccidioidomycosis in a teaching hospital from Minas Gerais State of Brazil

Introduction: Paracoccidioidomycosis (PCM) is caused by several species of the Paracoccidioides genus, which can be differentiated by interspecific genetic variations, morphology and geographic distribution. The correlation of intraspecific variability with the clinical and epidemiological aspects of these species remains unclear. This study aimed to sequence the loci GP43 (exon 2) and ARF of 23 clinical isolates of Paracoccidioides spp. from patients in the Southeast Region of Brazil.
Methodology and main findings: GenBank was used to compare the present sequences (23) with previously described sequences (151) that included ARF and GP43. A high polymorphism rate was identified among the 23 isolates in comparison to the other 151. Among the isolates, 22 (95.65%) were S1/P. brasiliensis and 1 (4.35%) was identified as PS2/P. americana. A total of 45 haplotypes were found, as follows: 19 from S1/P. brasiliensis (13 from the present study), 15 from P. lutzii, 6 from PS2/P. americana (1 from the present study), 3 from PS3/P. restrepiensis and 2 from PS4/P. venezuelensis. Moreover, haplotypes exclusive to a clinical origin or geographical area were found. S1/P. brasiliensis (HD = 0.655 and K = 4.613) and P. lutzii (HD = 0.649 and K = 2.906) presented the highest rates of polymorphism among all species, and 12 isolates of the present study clustered within S1b/P. brasiliensis. The GP43 locus showed higher variability and was found to be the main driver of species differentiation.
Conclusions: The results described herein show a high intraspecific genetic variability among S1/P. brasiliensis isolates and confirm the predominance of this species in the Southeast region of Brazil. The finding of haplotypes exclusive to a clinical origin or geographical area suggests a correlation of the molecular profile with the clinical form and geographic origin of patients with PCM.

Introduction
Paracoccidioidomycosis (PCM) is caused by thermodimorphic fungi of the Paracoccidioides genus and is considered one of the most prevalent endemic systemic mycoses in Latin America [1,2]. Nearly 80% of all PCM cases in Latin America are diagnosed in Brazil, where it is the 8th leading cause of death among chronic infectious diseases. PCM fulfills all the criteria to be considered a neglected nosological entity [3,4].
Classically, PCM presents two clinical forms: the acute/subacute form, commonly described in children and young adults, who present severe, progressive systemic symptoms related to the mononuclear phagocytic system and the skin; and the chronic form, which represents 80-90% of all cases and occurs mainly in adult men with pulmonary and mucosal involvement [5,6]. Since 1989, when the first case of PCM associated with HIV infection was described, over 200 patients with this coinfection have been reported. These patients exhibited faster development of a more aggressive clinical picture, overlapping acute and chronic symptoms, and frequent systemic dissemination. Several experts have therefore suggested a third clinical form associated with immunodeficiency, named "mixed" PCM [7][8][9].
The taxonomy of the agents of PCM is constantly evolving as new technologies and approaches are introduced. At present, the P. brasiliensis complex is composed of at least 5 genetically isolated groups: S1/P. brasiliensis sensu stricto, with strong population structure in Brazil and harboring S1a and S1b, two distinct populations found in the Midwest, Southeast and South regions of Brazil [10,11]; PS2/P. americana, found in Venezuela and Southeastern Brazil; PS3/P. restrepiensis and PS4/P. venezuelensis, whose distributions overlap in Colombia and Venezuela [12][13][14]; and P. lutzii, which comprises a single species found in Ecuador and the Central Western/Amazonian regions of Brazil [12,13,15,16]. Divergence times among species pairs range from 0.03 to 33 million years and may be explained by geographical overlap [10]. In addition, Brazil and Venezuela may harbor more than one species of Paracoccidioides, which opens the possibility of gene exchange between those species [17] and therefore of the emergence of new admixed species, as recently described [18].
Molecular characterization of clinical isolates of Paracoccidioides spp. allows a better understanding of correlations involving species/genotype, geographical distribution, clinical phenotype, host preference, reinfection frequency, pathogen evolution and therapeutic response [19,20].
The present study aimed to characterize, using phylogenomic and population genetics tools, the cryptic species of Paracoccidioides clinical isolates recovered from patients with PCM diagnosed and treated at the teaching hospital of the Universidade Federal do Triângulo Mineiro, Minas Gerais State, Brazil. This region is considered a PCM-endemic zone, with reports of both the P. brasiliensis and P. lutzii complexes.

Ethics statement
All samples used in this study were retrieved from the culture collection of the Mycology Laboratory of the Triângulo Mineiro Federal University. All data were deidentified. Institutional human research ethics approval for the study was obtained from the Research Ethics Board of the Triângulo Mineiro Federal University (protocol CIBIO/UFTM 50, 18/06/2015). The need for consent was waived by the Ethics Board.

Clinico-epidemiological data collection
The University Hospital of the Federal University of Triangulo Mineiro serves the population of the 27 municipalities that make up the macro Southern Triangulo Mineiro region and has an estimated coverage of one million inhabitants, which corresponds to about 11% of the total population of the State of Minas Gerais (Brazilian Institute of Geography and Statistics, 2019) [21]. The medical records of patients with a diagnosis of PCM confirmed by culture and direct examination who were admitted to the Infectious and Parasitic Diseases ward from 2008 to 2019 were reviewed retrospectively. The most relevant demographic, epidemiological, clinical and outcome data in the context of PCM were obtained.

Fungal isolates
Twenty-three clinical isolates identified by conventional mycological methods were included in this study [22]. Isolates were obtained from the following clinical sources: 1 from cerebrospinal fluid (CSF), 4 from lymph nodes, 3 from bronchoalveolar lavage (BAL), 2 from skin fragments, 2 from peripheral blood, 2 from lung fragments and 9 from oral lesions (S1 Table). Isolates were maintained in Fava-Netto agar tubes incubated at 37 °C for yeast growth and subcultured every 30 to 60 days [23].

DNA extraction and loci selection
Genomic DNA was extracted from yeast cells using the phenol-chloroform-isoamyl alcohol method as described previously [24]. DNA quantity and integrity were measured by photometry in a NanoDrop Lite (Thermo Scientific) [25]. The loci ARF and GP43 were chosen for identification of Paracoccidioides spp. isolates because they have the most complete sequence data available. The sequences of the present 23 isolates were added to the 151 sequences previously deposited in GenBank (https://blast.ncbi.nlm.nih.gov/Blast.cgi) (S1 Table).

Haplotype analysis
Genetic polymorphism analysis was performed on the concatenated sequences of the ARF and GP43 loci. PCR products were purified using the PCR Purification Kit (250), cat. no. 28106 (QIAGEN), and submitted to Sanger sequencing. The resulting sequences were edited using ChromasPro v. 1.7.6, available at http://technelysium.com.au/ChromasPro.html. Only sequences with a Phred quality score > 20 were included, which limits the probability of an incorrect base call to 1 in 100 (99% accuracy). Consensus sequences were obtained from the forward and reverse reads using ChromasPro 1.7.6.
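The link between the Phred threshold and the stated 1-in-100 error rate follows from the standard definition Q = -10 log10(P). The short Python sketch below is illustrative only and is not the authors' pipeline; the function names are hypothetical, and applying the cutoff to every base of a read (rather than to a mean score) is an assumption, since the text does not say how the threshold was applied.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Probability that a base call with Phred score q is wrong (P = 10^(-Q/10))."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred score corresponding to an error probability p (Q = -10 log10 P)."""
    return -10 * math.log10(p)


def passes_threshold(base_qualities, min_q: float = 20) -> bool:
    """Assumed filter: keep a read only if every base scores strictly above min_q."""
    return all(q > min_q for q in base_qualities)


if __name__ == "__main__":
    # Hypothetical per-base quality values for two reads.
    print(phred_to_error_prob(20))
    print(passes_threshold([35, 42, 28, 31]))
    print(passes_threshold([35, 18, 28, 31]))
```

Under this definition a score of 20 corresponds to an error probability of 0.01, and a score of 30 to 0.001, so a stricter cutoff would reduce the expected number of miscalled bases tenfold.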

Phylogenetic analysis
The phylogenetic analysis was performed in MEGA 7.0 [29,31]. Consensus sequences of the isolates and those obtained from GenBank were aligned using the Clustal W2 algorithm available at https://www.ebi.ac.uk/Tools/msa/clustalw2/ [32]. The allelic sequences for each isolate were concatenated, and the evolutionary relationships were inferred by constructing an unrooted maximum likelihood (ML) phylogenetic tree with 1000 bootstrap replicates. In addition, the data set was subjected to neighbor joining (NJ), maximum parsimony (MP), and unweighted pair group method with arithmetic mean (UPGMA) analyses [29]. The species of Paracoccidioides were confirmed by their phylogenetic clustering with the reference type strains of each species in the unrooted ML tree.
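As a rough illustration of the concatenation step, the Python sketch below joins aligned ARF and GP43 sequences per isolate and groups isolates with identical concatenated sequences into haplotypes. It is a minimal sketch under stated assumptions, not the procedure used in MEGA or DnaSP: the function names, isolate names and toy sequences are hypothetical, sequences are assumed to be already aligned and gap-free, and the H labels it assigns are arbitrary and do not correspond to the haplotype numbers reported in this study.

```python
from collections import defaultdict


def concatenate_loci(arf: dict, gp43: dict) -> dict:
    """Join the aligned ARF and GP43 sequences of each isolate present in both sets."""
    shared = sorted(set(arf) & set(gp43))
    return {isolate: arf[isolate] + gp43[isolate] for isolate in shared}


def collapse_haplotypes(concatenated: dict) -> dict:
    """Group isolates whose concatenated sequences are identical."""
    groups = defaultdict(list)
    for isolate, seq in concatenated.items():
        groups[seq].append(isolate)
    # Arbitrary labels: largest group first, ties broken by sequence.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return {f"H{i}": sorted(isolates) for i, (_, isolates) in enumerate(ordered, start=1)}


if __name__ == "__main__":
    # Toy, made-up sequences; real loci are hundreds of bases long.
    arf = {"iso1": "ACGT", "iso2": "ACGT", "iso3": "ACGA"}
    gp43 = {"iso1": "TTGA", "iso2": "TTGA", "iso3": "TTGA", "iso4": "TTGC"}
    for label, isolates in collapse_haplotypes(concatenate_loci(arf, gp43)).items():
        print(label, isolates)
```

Isolates lacking either locus (here the hypothetical iso4) are dropped, which mirrors the requirement that only isolates with both ARF and GP43 sequences can be compared on concatenated data.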

Nucleotide diversity
DnaSP 5.10 [33] was used to calculate the extent of DNA polymorphism, including the number of polymorphic sites (S), nucleotide diversity (π), number of haplotypes (h), haplotype diversity (Hd), and average number of nucleotide differences (k). The neutrality tests Tajima's D, Fu & Li's D*, Fu & Li's F*, and Fu's Fs were also calculated. Negative or positive results of these tests provide evidence of purifying or balancing selection, respectively. The Watterson estimator (theta) method was used to determine the degree of recombination within the population using DnaSP 5.10. The presence of recombination was also checked by measuring the phylogenetic compatibilities of nearby polymorphic sites along single and concatenated sequences in SplitsTree v. 4.13.1 (https://mybiosoftware.com/splitstree-computephylogenetic-networks.html) [34]. This analysis was performed by applying the uncorrected (observed, 'P') distances in the characters transformation using the neighbor-net algorithm [34]. The pairwise homoplasy index (PHI) was used to assess statistical significance for recombination.
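For readers unfamiliar with these summary statistics, the Python sketch below computes S, Hd, k and π from a small alignment using their textbook definitions: Hd = n/(n-1) · (1 - Σ p_i²), k as the mean number of pairwise differences, and π = k/L. It is a simplified sketch, not a reimplementation of DnaSP: function names and the toy alignment are hypothetical, and sequences are assumed to be aligned, of equal length and free of gaps or ambiguous bases.

```python
from collections import Counter
from itertools import combinations


def segregating_sites(seqs) -> int:
    """S: number of alignment columns with more than one nucleotide state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def haplotype_diversity(seqs) -> float:
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def mean_pairwise_differences(seqs) -> float:
    """k: average number of nucleotide differences over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs) -> float:
    """pi: k divided by the alignment length."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


if __name__ == "__main__":
    # Toy alignment of four made-up sequences (three distinct haplotypes).
    alignment = ["ACGTAC", "ACGTAC", "ACGTTC", "ACCTTC"]
    print("S  =", segregating_sites(alignment))
    print("h  =", len(set(alignment)))
    print("Hd =", round(haplotype_diversity(alignment), 4))
    print("k  =", round(mean_pairwise_differences(alignment), 4))
    print("pi =", round(nucleotide_diversity(alignment), 4))
```

The per-pair proportion of differing sites used inside the k calculation is the same quantity as the uncorrected P distance applied in the neighbor-net analysis, so π can be read as the mean uncorrected P distance across all pairs.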

Statistical analyses
Statistical analyses were performed using DnaSP 5.10.

Clinical and epidemiological results
Of the 23 patients with PCM evaluated, 18 (78.3%) were men, and the mean age was 37.4 years. The chronic form was characterized in 10 (43.48%) of the cases, the acute/subacute form in eight (34.78%) and the mixed form in five (21.74%) (Table 1). The diagnosis of PCM was confirmed in 23/23 (100%) of the cases by culture and additionally in 11/23 (47.83%) by histopathology and in 12/23 (52.17%) by KOH direct examination. Patients with the acute and mixed forms were treated with amphotericin B followed by itraconazole, whereas patients with the chronic form received itraconazole. Of the 23 patients, 8 (34.78%) were HIV infected. Among patients co-infected with HIV, most were male (6/8, 75%), the average age was 32.9 years, five (62.5%) presented the mixed clinical form, five (62.5%) were from Minas Gerais State (MG), seven (87.5%) presented S1/P. brasiliensis infection and the most common outcome was cure (75%).
The haplotype with the highest number of isolates was H2, which contains 54 S1/P. brasiliensis isolates from different states of Brazil and from Argentina. These isolates had different origins (clinical, armadillo and soil). The reference isolate B17 (Pb18) of S1b/P. brasiliensis was also included in this haplotype. The haplotype with the second highest number of isolates was H41, with 24 S1/P. brasiliensis isolates of clinical and armadillo origin from Venezuela and Colombia (Fig 3).
Most haplotypes (41/45, 91%) include isolates of clinical origin. Haplotypes with isolates exclusively from armadillo (H21/isolate B8) and soil (H2 and H44) were also found, the latter haplotype (isolates V1 and V2) being described only in Venezuela. Isolates from penguin feces and dog food were each described in a single haplotype, H22 and H15, respectively. Most haplotypes (36/45, 80%) are from Brazil. However, haplotypes exclusive to Argentina (H23, H24 and H43) and Venezuela (H44 and H45) were also found. The haplotype distribution by states of Brazil shows that Minas Gerais presented the highest number of haplotypes (11), of which nine were exclusive to this state (Fig 3).

[Figure caption; beginning not recoverable] ... isolates applying the neighbor-net algorithm with the uncorrected P distance model to show the diversity and branching ambiguities attributable to recombination events. The observation that isolates are linked to each other by multiple pathways, forming an interconnected network rather than a single bifurcating tree, is suggestive of recombination. The PHI test for recombination implemented in SplitsTree showed significant evidence (p<0.0001) for recombination in ARF+GP43. For the single loci, the allele types (ATs) are shown; for the concatenated sequences, the haplotypes (H) found are shown. The species are differentiated by color as follows: red, PS2/P. americana; blue, PS3/P. restrepiensis; pink, PS4/P. venezuelensis; yellow, S1/P. brasiliensis; and green, P. lutzii. ATs and H exclusive to a given species are shown in the color of that species. ATs and H exclusive to isolates from the present study are marked by a frame.
https://doi.org/10.1371/journal.pntd.0009956.g001

[Figure caption] The five species of Paracoccidioides are marked with different colors. The haplotypes are described according to the names or numbers of the isolates that compose each haplotype. When the number of isolates is three or fewer, all isolates are listed in parentheses. When the number is more than three, the number of [...]

Discussion
The epidemiological characterization of species of the Paracoccidioides genus includes the genetic analysis of isolates, which has pointed to potential relationships with their geographic distribution, clinical presentation, therapeutic response and host preference, among others, as has been described for other fungi that cause disease in humans [51].
In the epidemiological context and natural history of PCM, important factors such as the frequent migration of individuals for work and the long latency period of the infection make it difficult to define the exact location where patients were infected and to correctly associate the identified genotype with the geographic location where the patient becomes ill [14,52].
Herein 23 isolates of Paracoccidioides spp. obtained from patients with various clinical forms of PCM were evaluated. The molecular characterization of these isolates through the sequencing of the loci GP43 and ARF, showed predominance of S1/P. brasiliensis (22/23). This finding is in agreement with other authors who reported the high occurrence of this phylogenetic species in South America and Brazil, mainly in the Southeast and South states, where PCM is highly endemic [4,[52][53][54].
In accordance with the taxonomic evolution, S1/P. brasiliensis was proposed to be constituted by two lineages, S1a and S1b [11], which was later endorsed by other authors [18,52,55]. The S1b lineage was associated with most isolates of Paracoccidioides spp. and it is considered by different authors as the most recombinant and variable lineage [11,56,57].
The analysis of the GP43 and ARF sequences showed that the majority of the isolates in the present study were identical to, or differed by only a few nucleotides from, isolates of the S1b type. The predominance of the S1b lineage helps explain the high genetic variability found, which, even with a small number of isolates, was higher than that observed for isolates from other Brazilian states and Latin American countries where PCM is endemic [11,52,55].
The isolate identified as PS2/P. americana (isolate 13, H9) was recovered from a lung fragment culture from a 42-year-old male patient, born in Ribeirão Preto, São Paulo State, who had recently been diagnosed with HIV infection and simultaneously presented PCM in its chronic clinical form. Isolates of this species have previously been described in different states of Brazil and in Venezuela, Uruguay and Argentina, and most of them were obtained from patients with the chronic form of PCM [14,39,52,58]. PS2/P. americana has been recovered from armadillos (Dasypus novemcinctus) in different places [18,35] and from a female Doberman dog with generalized lymphadenomegaly [14,58,59]. It has been suggested that this species could be less virulent than S1/P. brasiliensis, because it was recovered from patients with mild clinical forms of PCM [60]. However, in line with other authors, some evidence suggests that PS2/P. americana can be as virulent as S1/P. brasiliensis [58,61].
[Figure caption, continued] ... isolates that compose the haplotype, followed by the haplotype number in bold (H), the country from which the isolates originated, the state of origin, and the type of clinical presentation and/or source of the isolate. The countries where the isolates were recovered are abbreviated according to the alpha-2 code of ISO 3166.

PLOS NEGLECTED TROPICAL DISEASES
The genomic sequencing of PS2/P. americana exhibited a lower frequency of introgressions and genetic exchanges when compared to other species of the P. brasiliensis complex [55]. Isolates 769 and 133 clustered polyphyletically with PS2/P. americana isolates (Fig 3). For isolate 769, genomic peculiarities and allele sharing between S1/P. brasiliensis and PS2/P. americana have been described, suggesting that it may correspond to a hybrid or carry ancestral polymorphisms in its genome [13,14]. This could be favored by the overlapping ecological niches of the two species and by the current evidence of sexual reproduction in the genus Paracoccidioides [62,63].
When evaluated using ARF+GP43 sequences [58], isolate 769 also grouped with S1/P. brasiliensis isolates and, when different loci were used, with P. lutzii isolates [13]. Moreover, in an analysis with another set of isolates, this isolate grouped with P. lutzii using the ARF sequences and with P. brasiliensis using the GP43 sequences [27]. Taken together, these findings may partly explain why this isolate clustered polyphyletically into PS2/P. americana in the present study.
The GP43 locus presented the highest polymorphism and the best ability to discriminate among the species, with most ATs unique to a single species. This locus encodes the glycoprotein GP43, considered the immunodominant epitope used for the diagnosis of PCM and the pivotal molecule for the identification of Paracoccidioides spp. [64,65].
The PbGP43 gene is composed of two exons separated by a 78 bp intron and apparently has a single copy [65]. Variability in the sequence of exon 2 has been described in different isolates since the 1990s, when correlations between these sequences and the origin of the isolates and their virulence in animal models were first described [65,66]. Studies of this locus with more isolates and with the inclusion of more loci supported the definition of the concept of phylogenetic species of Paracoccidioides [12,57].
Isolates MS1, MS2 and 5979 grouped with P. lutzii isolates in a polyphyletic manner (Fig 3), similar to what was observed by Macedo et al. 2019 [58]. The genetic diversity of these isolates could be explained by the lack of consensus on their identification and by the sharing of ATs between P. lutzii and S1/P. brasiliensis, as evidenced by other authors and corroborated herein [16,27,39,58].
The sharing of ATs could also help to explain the non-differentiation of the species and/or their polyphyletic origin in some of the analyses carried out. The possibility of disagreement in the separation by phylogenetic species is described in the genealogical concordance phylogenetic species recognition (GCPSR) approach. This approach has been used to differentiate species in several fungal genera and holds that different genetic loci may present different genealogies within the same species due to recombination. However, the genealogy of the different loci must be concordant within the same species due to the effects of genetic isolation and drift [13,15,68].
Despite the small number of isolates evaluated herein, it was possible to identify a group of three isolates with different haplotypes (H7, H8 and H10) that grouped monophyletically, were of clinical origin and came from patients from Minas Gerais State. Morphological differences between Paracoccidioides species and their corresponding geographic areas had already been described [27,50,56,57]. Additionally, differences in virulence between species [69,70] and the presence of distinct genetic profiles with variable capacity to infect mice have also been reported [71]. However, the correlation between clinical isolates and the geographic origin of patients must be interpreted with caution, as these isolates may have been acquired in regions different from those where the patient originated or was diagnosed with PCM [72].
Although the geographic region of the patients with PCM evaluated herein borders the areas where P. lutzii has been described, none of the isolates was characterized as such. Similarly, in the Southeast region of Brazil, only one isolate of this species was reported among 46 clinical and four environmental isolates evaluated [52]. Another study with 40 clinical and environmental samples from different geographic origins in South America found P. lutzii in 20% of the samples, recovered from patients from the Midwest region of Brazil [39], where case-series reports of this species are sparse and the geographic mapping of its distribution is still incipient. The natural habitat of this species has not been well elucidated, and it has not yet been isolated from armadillos [53,[73][74][75].
Despite the small number of isolates included, the data herein presented confirm the predominance of S1b/P. brasiliensis in Minas Gerais as already described for other states from the Southeast Region of Brazil. Moreover, a significant intraspecific variability and a potential correlation of the molecular profile with the clinical form and geographic origin of patients with PCM can be observed.
Thus, the evaluation of a larger number of isolates, together with the analysis of sequence data deposited in GenBank from other Brazilian regions and Latin American countries where PCM is endemic, can help to expand the mapping of the geographic distribution of Paracoccidioides spp. and to elucidate the hypotheses about the correlation of their molecular profile with the PCM clinical forms, virulence, therapeutic response and host preference, among others.
Supporting information S1