Genetic variation analysis and relationships among environmental strains of Scedosporium apiospermum sensu stricto in Bangkok, Thailand

The Scedosporium apiospermum species complex is an emerging filamentous fungi that has been isolated from environment. It can cause a wide range of infections in both immunocompetent and immunocompromised individuals. We aimed to study the genetic variation and relationships between 48 strains of S. apiospermum sensu stricto isolated from soil in Bangkok, Thailand. For PCR, sequencing and phylogenetic analysis, we used the following genes: actin; calmodulin exons 3 and 4; the second largest subunit of the RNA polymerase II; ß-tubulin exon 2–4; manganese superoxide dismutase; internal transcribed spacer; transcription elongation factor 1α; and beta-tubulin exons 5 and 6. The present study is the first phylogenetic analysis of relationships among S. apiospermum sensu stricto in Thailand and South-east Asia. This result provides useful information for future epidemiological study and may be correlated to clinical manifestation.


Introduction
The Scedosporium apiospermum species complex is a group of filamentous fungi that have been reported in cystic fibrosis (CF) patients [1]. It can be isolated from the environment, especially in human-impacted areas such as playgrounds, industrial and agricultural zones [2]. In Thailand, S. apiospermum has been reported in brain abscesses of near-drowning and renal transplant patients [3,4], and S. boydii infections have also been reported in brain tissue of renal transplant patient [5]. Additionally, two Swiss tourists who nearly drowned in the tsunami disaster in Thailand were found to be infected with S. apiospermum [6].
In order to overcome the difficulties in identifying Scedosporium species by routine microbiological methods, several molecular techniques have been proposed such as quantitative real-time PCR (qPCR), PCR-based reverse line blot (PCR-RLB), and loop-mediated isothermal amplification (LAMP) [11,12]. Additionally, globally standardized genotyping of S. apiospermum and S. boydii, the Multi-Locus Sequences Typing (MLST) scheme, was developed by Bernhardt et al. [13]. The MLST scheme amplifies sequences at five genetic loci-actin (ACT), calmodulin exons 3 and 4 (CAL), the second largest subunit of RNA polymerase II gene (RPB2), ß-tubulin exons 2-4 (BT2), and manganese superoxide dismutase (SOD2) and has been found to have strong repeatability [13][14][15]. The allele types (ATs) and sequences types (STs) numbers of the consensus MLST scheme can be obtained through the Fungal MLST Database (http://mlst.mycologylab.org/). In our previous study, we investigated the spatial distribution of the S. apiospermum species complex in soil in Bangkok, Thailand. We found that the S. apiospermum species complex is widespread in soil across Bangkok and detected predominance of S. apiospermum sensu stricto (72%) [16].
Here, we continue the study by considering the genetic variation and relationships among S. apiospermum sensu stricto isolated from soil in Bangkok. The data may be used for further epidemiological research, for which it is important to recognize different strains and subspecies.

Materials and methods Strains
We used 48 isolates of S. apiospermum sensu stricto from our stock collection. Each isolate originated from soil and had previously been typed by PCR of the beta-tubulin gene (exons 5 and 6) [16]. Each isolate was labeled as TMMI (Department of Microbiology and Immunology, Faculty of Tropical Medicine, Mahidol University yeast and moulds culture collection) with a unique identification number (Table 1).

DNA extraction
DNA was extracted with an E.Z.N.A. Fungal DNA mini kit (Omega Bio-tek). The DNA samples were quantified and the quality was checked with a NanoDrop 2000 spectrophotometer (Thermo Fisher, Wilmington, DE, USA) and stored at −20˚C until further use.

Molecular biology technique
For PCR, sequencing and phylogenetic analysis we used eight genes: ACT, CAL, RPB2, BT2, SOD2, internal transcribed spacer (ITS), transcription elongation factor 1α (TEF-1α), and 28S large subunit ribosomal RNA (LSU). PCR amplification of eight gene regions was carried out with the specific primer pairs listed in Table 2. Each PCR reaction mixture was performed in 25-μl final volume containing: 0.5 μM of each primer, KAPA 2G Fast HS ReadyMix PCR kit with loading dye (KAPA Biosystems, USA), nuclease-free water and genomic DNA. PCR amplifications were carried out in a T100 Thermal Cycler (Bio-Rad) according to the following protocol: an initial step of 96˚C for 6 min, followed by 35 cycles of 94˚C for 1 min, an annealing temperature that specific with each gene ( Table 2) for 1 min and 72˚C for 45 s and a final extension step of 72˚C for 10 min. Then, 5 μL of the PCR products were electrophoresed in a 1.5% agarose gel containing SERVA DNA Stain G (SERVA Electrophoresis GmbH, Germany) in 1X TBE buffer, and photographed using a Gel Doc XR+ system (Bio-Rad). Each PCR product was purified using a FavorPrep TM GEL/PCR Purification Mini Kit (Favorgen Biotech Corporation, Taiwan) and sequenced using gene-specific both forward and reverse primers at the AITbiotech Pty Ltd (Singapore). The retrieved sequence files were analyses using BioEdit software (http://www.mbio.ncsu.edu/bioedit/bioedit.html) and compared with existing sequences in GenBank using BLASTn (http://blast.ncbi.nlm.nih.gov/Blast.cgi).  Table 3). Genotype variation of each gene was analyzed by MLSTest v1.0.1.23 software, which was downloaded from http://ipe.unsa.edu.ar/software [21]. The ATs number was created by  [20] using the same software and allele profiles were used to assign the STs number. A phylogenetic network of each loci was performed with the neighbor-net algorithm by SplitsTree4 and downloaded from http://www.splitstree.org/ [22]. Sequences were concatenated by the following order respectively; ACT, CAL, RPB2, BT2, SOD2, ITS, TEF-1α and TUB. The best model of evolution for concatenated data set were selected from the Bayesian Information Criterion (BIC) in MEGA7 [23]. Model with the lowest BIC score was chosen to construct a maximum likelihood phylogenetic tree. A phylogenetic tree of the 48 aligned sequences excluding gaps and missing data for the heuristic search was obtained by applying the neighbor-joining method to a matrix of pairwise distances estimated using the maximum likelihood approach based on the best model in MEGA7 [23]. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. Codon positions included were 1st+2nd+3rd+Noncoding. There were a total of 4,841 positions in the final dataset. A bootstrap analysis was conducted with 1000 replications and bootstrap values ! 50% were shown above branches.

Nucleotide sequences accession numbers
The generated nucleotide sequences were deposited in GenBank under accession numbers and listed in Table 1.

Results
PCR amplification of the eight genes was successful for all strains, with a single band investigated on gels after electrophoresis. The BLASTn algorithm was used for sequence similarity searching in the NCBI database. Sequence-based identities with a cutoff of ! 99% were considered significant [24,25]. Genotypic variation profiling of S. apiospermum sensu stricto was generated and showed 37 good different STs from 48 strains ( Table 4). The numbers of ATs variation at each loci were 8 (ACT), 7 (CAL), 11 (RPB2), 12 (BT2), 18 (SOD2), 15 (ITS), 6 (TEF-1α), and 8 (TUB) ( Table 5). The number of polymorphisms, typing efficacy, and power of discrimination (95% confidential interval) were calculated (Table 3).
We found diverse genetic relationships among the genotyped variants of 48 S. apiospermum sensu stricto after analyzing each gene with a neighbor-net algorithm (Fig 1). The best model for concatenated data set (ACT, CAL, RPB2, BT2, SOD2, ITS, TEF-1α and TUB, respectively) analyses was HKY+G+I (HKY: Hasegawa-Kishino-Yano; +G: Gamma distribution; +I: invariable sites) and the BIC score was 18004.65987. Therefore, the maximum likelihood phylogenetic tree of concatenated data set was created based on the HKY model [26]. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories (+G, parameter = 0.0500). The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 48.7606% sites) (Fig 2).
To compare the genetic relationship of strains in Thailand with other regions of the world, the ACT, CAL, RPB2, BT2 and SOD2 sequences from previous studies in France, China and Japan were downloaded from GenBank. Data for these five genes were concatenated. TN93+G +I model (TN93: Tamura-Nei) was chosen according the best model of evolution analyses (the BIC score was 17282.89945). Then, a phylogenetic tree constructed by maximum likelihood analysis based on the TN93 model [27]. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories (+G, parameter = 0.1587)). The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 42.7147% sites). As a tree (Fig 3), 82 nucleotide sequences comprised 2,958 positions were involved. S. apiospermum sensu stricto was strongly clustered together among Thai, French, Chinese and Japanese isolates (strongly supported by bootstrap value of 100%) and were subdivided into two groups (Group I and Group II).

Gene ATs Frequency
In terms of discriminatory power (DP) evaluation, SOD2 showed the highest DP (0.932); TUB showed the lowest DP, which was 0.688. Additionally, there were 37 STs, which could be grouped; each isolate had one ST type except ST3, ST9, ST11, ST13, ST18, and ST21, which had 2, 5, 2, 4, 2, and 2 ST types, respectively. In our study, SOD2 provided the highest number of alleles (18), and another study showed a similar high number of alleles for SOD2 [13]. Moreover, we combined five loci (ACT, CAL, RPB2, BT2 and SOD2) for objective genetic relationship analysis between Thailand (representing South-east Asia), China and Japan (Asia), and France (Europe): (i) to assess the relationship between clinical and environmental strains of S. apiospermum; and (ii) to assess the global variation between S. apiospermum strains. Our results detected a close relationship between the environmental strains from Thailand and the clinical strains from France [15], China and Japan [28]. Therefore, French, Chinese and  2  2  TMMI005, TMMI009   3  2  TMMI006, TMMI067   4  4  TMMI010, TMMI057, TMMI060, TMMI061   5  14  TMMI011, TMMI016, TMMI017, TMMI020, TMMI034, TMMI037, TMMI038,  TMMI039, TMMI042, TMMI043, TMMI044, TMMI046, TMMI050, TMMI055   6  1  TMMI025   7  1  TMMI048   8  1  TMMI053 https://doi.org/10.1371/journal.pone.0181083.t005   Japan isolates originated in Thailand also. These data suggest that S. apiospermum sensu stricto isolates retrieved from different regions and different countries shared a close genetic relatedness. One limitation of our study was a lack of clinical isolates from Thailand to compare with our environmental strains. We hope that our data may be useful for other researchers in future study. Interestingly, TEF-1α used as a marker in an MLST scheme for S. aurantiacum (http://mlst.mycologylab.org/) including other filamentous fungi such as Fusarium [29], did not use S. apiospermum or S. boydii. In our data, TEF-1α presented the lowest number of alleles (6) and the DP was also quite low (0.688). This result may explain why TEF-1α was not used for S. apiospermum or S. boydii.
In summary, we here present the first phylogenetic analysis of relationships among S. apiospermum sensu stricto in Thailand and the South-east Asian region. The results provide valuable knowledge to assist future study and perhaps link the relationships of species in clinical settings.
Supporting information S1 Table. Strains, specimens, countries and GenBank accession numbers of 34 sequences. (All sequences were using for phylogenetic analysis of the concatenated sequences of ACT, CAL, RPB2, BT2 and SOD2.) (DOCX)