Contrasting genetic diversity and structure among Malagasy Ralstonia pseudosolanacearum phylotype I populations inferred from an optimized Multilocus Variable Number of Tandem Repeat Analysis scheme

The Ralstonia solanacearum species complex (RSSC), composed of three species and four phylotypes, comprises globally distributed soil-borne bacteria with a very broad host range. In 2009, a devastating potato bacterial wilt outbreak was declared in the central highlands of Madagascar, which reduced the production of vegetable crops including potato, eggplant, tomato and pepper. A molecular epidemiology study of Malagasy RSSC strains carried out between 2013 and 2017 identified R. pseudosolanacearum (phylotypes I and III) and R. solanacearum (phylotype II). A previously published population biology analysis of phylotypes II and III using two MultiLocus Variable Number of Tandem Repeats Analysis (MLVA) schemes revealed an emergent epidemic phylotype II (sequevar 1) group and endemic phylotype III isolates. We developed an optimized MLVA scheme (RS1-MLVA14) to characterize phylotype I strains in Madagascar and to understand their genetic diversity and structure. The collection included isolates from 16 fields of different Solanaceae species sampled in the Analamanga and Itasy regions (highlands) in 2013 (123 strains) and in the Atsinanana region (lowlands) in 2006 (25 strains). Thirty-one haplotypes were identified, two of them being particularly prevalent: MT007 (30.14%) and MT004 (16.44%) (sequevar 18). Genetic diversity analysis revealed significantly contrasting levels of diversity according to elevation and sampling region. More diverse at low altitude than at high altitude, the Malagasy phylotype I isolates were structured in two clusters, probably resulting from different historical introductions. Interestingly, the most prevalent Malagasy phylotype I isolates were genetically distant from regional and worldwide isolates. In this work, we demonstrated that the RS1-MLVA14 scheme can resolve differences from regional to field scales and is thus suited for deciphering the epidemiology of phylotype I populations.

Introduction
appeared very early during vegetative growth and the BW-tolerant cultivars, developed by the National Centre for Rural Development and Applied Research (FIFAMANOR) and distributed to farmers, became susceptible to RSSC [19].
In 2013, a wide sampling campaign was conducted in Madagascar's central highlands. A total of 1224 isolates were collected: 10% belonged to phylotype I, 72% to phylotype IIB-1 and 18% to phylotype III [19]. Only the population genetics of phylotypes IIB-1 and III were analysed, using the MLVA schemes RS2-MLVA9 for phylotype II isolates and RS3-MLVA16 for phylotype III isolates [19], derived from previously published studies [20,21]. That study revealed contrasting population structures between phylotype IIB-1 and phylotype III. Its findings suggest that phylotype IIB-1 isolates, reported in Madagascar for the first time, were introduced and spread massively via latently infected potato seed tubers, whereas Malagasy phylotype III isolates appeared to be endemic [19].
The present study aims to develop and apply an optimized MLVA scheme specifically designed for phylotype I isolates in order to: (i) assess the level of genetic diversity and structure of phylotype I populations isolated in Madagascar's central highlands and lowlands, (ii) compare the population genetic structures of phylotype I with Malagasy phylotypes IIB-1 and III isolates and (iii) analyse the potential genetic links of the Malagasy phylotype I isolates with isolates from other countries.

Bacterial isolates and phylogenetic assignment
Two collections of Malagasy RSSC isolates belonging to phylotype I and maintained at the Plant Protection Platform (Saint-Pierre, Réunion, France) were used in this study. The first collection included 123 isolates from Analamanga (n = 28) and Itasy (n = 95), regions in the central highlands (mean elevation = 1057.23 m). The second collection included 25 isolates from the Atsinanana region in the lowlands (mean elevation = 28.31 m) (S1 Table). The regional difference in elevation provided the opportunity to study the influence of altitude on the genetic diversity and structure of phylotype I populations. In addition, the Analamanga and Itasy regions are in the main vegetable producing areas in Madagascar, where a devastating BW outbreak occurred in 2013. In the Atsinanana region, several cases of BW were also reported. Madagascar's main seaport is in this region (Toamasina), which allowed the study of the potential influence of frequent commercial exchanges of plant material on the genetic diversity of phylotype I isolates.
The 148 Malagasy isolates were collected from stem fragments of seven plant species (Solanum lycopersicum, S. tuberosum, S. melongena, S. aethiopicum, S. nigrum, S. scabrum, Capsicum annuum) collected from 16 fields (Fig 1) in the Atsinanana region (4 plots), the Analamanga region (4 plots) and the Itasy region (8 plots). All the Malagasy isolates were checked by multiplex PCR [29] in order to confirm their assignment to RSSC and phylotype I.
In order to identify the potential genetic relationships between Malagasy and international RSSC phylotype I isolates, 145 isolates from a representative worldwide collection (South-West Indian Ocean, Africa, Asia, Americas) kept at the Plant Protection Platform (Saint-Pierre, Réunion, France) were also included in this study (S2 Table).
All RSSC isolates were stored on Cryobank® microbeads (Microbank®, PRO-LAB DIAGNOSTICS, Neston, Wirral, U.K.) at -80˚C. After growth on nutrient broth and Kelman media [36], a 1 μl loop of a bacterial cell suspension was streaked onto agar plates to isolate colonies. Single colonies were suspended in 400 μl of sterile HPLC-grade water and used as templates for PCR amplification.
The phylogenetic assignment of Malagasy phylotype I isolates was based on the multiplex phylotype PCR [29] and the partial nucleotide sequences of the endoglucanase (egl) gene [29]. PCR amplification, sequencing and sequevar determination were performed as previously described [37]. The newly generated sequences were deposited in the GenBank database under accession numbers MN310771 to MN310886 and MW014321 to MW014325.
In a second step, we screened the reference GMI1000 genome with the Tandem Repeat Finder v4.09 (https://tandem.bu.edu/trf/trf.html) [44] to detect new VNTR loci. New VNTRs were validated by using the Phobos Tandem Repeat Finder [45]. Then, an alignment with the nine draft genome sequences was conducted to localize the VNTR and confirm whether their characteristics met the requirements described above.

Primer design and assay optimization
PCR primers were designed from the flanking regions of VNTR loci using Primer3 v4.1.0 [47]. Selection was based on the following criteria: primer size from 18 to 23 bp, annealing temperature between 60˚C and 65˚C, percentage of guanine and cytosine from 50% to 70%, and total length of PCR product between 100 bp and 500 bp. The probability of dimerization and hairpin formation was evaluated using the OligoAnalyzer v3.1 software package (https://eu.idtdna.com/calc/analyzer). Oligonucleotides were synthesized by Macrogen, Inc. (Seoul, South Korea).
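The size, GC-content and amplicon-length criteria above can be expressed as a simple filter. The following is a minimal sketch only: the function names and the example primer are hypothetical, and the annealing-temperature and dimer/hairpin checks are deliberately left out because they depend on the thermodynamic model used by Primer3 and OligoAnalyzer.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def passes_criteria(primer: str, product_len: int) -> bool:
    """Check the length, GC and amplicon-size criteria stated in the text.

    Annealing temperature (60-65 C) and secondary structures are NOT
    checked here; they require a melting-temperature model.
    """
    return (
        18 <= len(primer) <= 23
        and 50.0 <= gc_percent(primer) <= 70.0
        and 100 <= product_len <= 500
    )


# Hypothetical 20-mer (not a primer from Table 1), 65% GC, 250 bp product.
example_ok = passes_criteria("GCTGACCGTGCTCGATCAGG", 250)
```

Such a filter would only pre-screen candidates; the final selection in the study also relied on the simplex PCR screening described next.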
To perform a preliminary screening, simplex PCRs were conducted to assess the amplifiability, reproducibility and polymorphism of the selected VNTR loci. We used eight phylotype I isolates (RUN0054, RUN0157, RUN0215, RUN0334, RUN0969, RUN1744, RUN1985 and RUN3014) representing several sequevars (13, 14, 15, 17, 18, 31 and 47) from a worldwide collection (French Guiana, Réunion, China, Thailand, Cameroon, Taiwan, Côte d'Ivoire). PCR amplifications were performed in 15 μl reaction volumes containing 7.5 μl Terra PCR Direct Buffer 2X (Terra™ PCR Direct Polymerase Kit, Clontech Laboratories, Inc.), 0.3 μl Terra PCR Direct Polymerase Mix (1.25 U/μl), 1.5 μl 5X Q-solution (Qiagen®, Hilden, Germany), 2.3 μl of a forward and reverse primer mix (2 μM each), 2.4 μl sterile HPLC-grade water, and 1 μl of bacterial suspension as a template. PCRs were carried out in a GeneAmp PCR System 9700 Thermal Cycler (Foster City, CA 94404, USA) under the following conditions: an initial denaturation step at 98˚C for 2 min, 30 cycles of denaturation at 98˚C for 10 s, annealing at 62˚C for 15 s and extension at 68˚C for 1 min, and a final extension step at 68˚C for 30 min. Then, 6 μl of PCR product was mixed with 1 ml of loading dye solution and loaded into a 1.5% (w/v) SeaKem® LE Agarose (Lonza, Basel, Switzerland) gel for electrophoresis. Ethidium bromide was used to stain the gels and the G:BOX gel imaging system (Syngene, Cambridge, UK) enabled the visualization of the bands under ultraviolet light. The molecular weights were estimated by comparison with a 100 bp DNA ladder (Promega, Madison, Wisconsin, USA). VNTR loci showing poor amplification and/or lacking diversity were removed at this step.

MLVA genotyping
A Multiplex PCR protocol was applied to analyse from 3 to 4 VNTR loci per reaction using the Multiplex PCR kit (Qiagen, Courtaboeuf, France). The primers (Table 1) were pooled in four different mixes according to their annealing temperature and the size of the PCR product.  The VNTR number range, the number of alleles per locus, the Nei's unbiased diversity index (H nb ), and the allelic richness were calculated from the genotyping data derived from the 293 isolates used in this study.

The number of repeats of the VNTR loci was calculated using the following formula:

Each forward primer was labelled with one of the following fluorophores: 6-FAM, NED, PET or VIC (Applied Biosystems, Life Technologies, Saint Aubin, France). The conditions of amplification for the multiplex PCR were the same as for the simplex PCR, except for the number of cycles (25 instead of 30). PCR products were diluted (at least 1:80) to avoid peak saturation. In a 2 ml Eppendorf tube, 1080 μl formamide (Hi-Di™ formamide, Applied Biosystems) was mixed with 20 μl size marker (GeneScan™-500 LIZ® Size Standard, Applied Biosystems). Then, in each well of a 96-well plate, 11 μl of the mix was distributed with 1 μl of diluted PCR product. The samples were denatured at 95˚C for 5 min, cooled immediately on ice and loaded onto an ABI Prism 3130XL Genetic Analyzer for capillary electrophoresis. The reproducibility of the MLVA genotyping was checked by including 3 phylotype I control isolates in each 96-well plate: RUN0054 (GMI1000, sequevar 18), RUN0157 (PSS04, sequevar 15) and RUN3014 (TFB1.08, sequevar 31).

Data analysis
The genotyping results were analysed using Geneious v10.0.9. Fragment sizes were estimated using the third order least squares algorithm and attributed to a bin size, which takes into account small size variation due to experimental variation. To confirm the validity and repetition number for each VNTR locus, PCR products of isolates RUN0320, RUN3012, RUN3216 and RUN3277 were sequenced (Genewiz, Leipzig, Germany) and analysed using Geneious v10.0.9. These sequences were also used to check for patterns in the flanking VNTR sequences and internal repeat variations (i.e. copy homology). When a VNTR array was truncated, the VNTR number was rounded to the nearest bin whole number [48][49][50]. The combination of alleles (i.e. number of repetitions) for the 14 VNTR loci was considered as MLVA type (MT, i.e. haplotype).
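To illustrate how a binned fragment size becomes an allele and how alleles combine into an MLVA type, here is a minimal sketch. It assumes the commonly used convention that the repeat number equals the amplicon length minus the combined flanking length, divided by the repeat unit length; this convention, the function name and the numeric values are assumptions for illustration and are not taken from the authors' Table 1.

```python
def repeat_number(fragment_bp: float, flank_bp: int, unit_bp: int) -> int:
    """Estimate the VNTR copy number from a measured fragment size.

    Assumed convention: (fragment - flanking regions) / repeat unit,
    rounded to the nearest whole number (halves rounded up).
    """
    if unit_bp <= 0:
        raise ValueError("repeat unit length must be positive")
    copies = (fragment_bp - flank_bp) / unit_bp
    if copies < 0:
        raise ValueError("fragment shorter than its flanking regions")
    return int(copies + 0.5)


# Hypothetical two-locus example: (fragment size, flank size, unit size).
measurements = [(187.4, 145, 6), (231.0, 120, 9)]

# An MLVA type (haplotype) is the ordered combination of allele calls.
mlva_type = tuple(repeat_number(f, fl, u) for f, fl, u in measurements)
```

Rounding absorbs the small size variation between capillary runs, which is the role played by the bins in the analysis described above.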
In order to assess the resolution of the RS1-MLVA14 scheme, a genotype accumulation curve (GAC) was built, based on the genotyping of 291 phylotype I isolates (146 from Madagascar and 145 from the worldwide collection). The GAC represents the power of discrimination of our set of loci in the population [51,52]. The discriminatory power of the RS1-MLVA14 scheme was also evaluated by calculating the Hunter Gaston Discrimination Index (HGDI) [53]. It was compared to the recently developed MLST-7 scheme [37], by using 94 worldwide isolates (S1 and S2 Tables) representing 9 sequevars of phylotype I and selected to maximize the phylogenetic diversity, as well as the diversity of country, host and date of isolation.
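For readers unfamiliar with the HGDI, it is the probability that two isolates drawn at random without replacement belong to different types: D = 1 - [1 / (N(N-1))] Σ n_j(n_j - 1), where N is the number of isolates and n_j the number of isolates of type j. A minimal sketch follows; the function name and the toy input are illustrative and are not data from this study.

```python
from collections import Counter


def hgdi(types):
    """Hunter-Gaston discrimination index for a list of type labels."""
    n = len(types)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(types).values()
    # Probability that two isolates sampled without replacement share a type.
    same = sum(c * (c - 1) for c in counts) / (n * (n - 1))
    return 1.0 - same


# Toy example: four isolates, three types.
toy_index = hgdi(["A", "A", "B", "C"])
```

Applying the same function to the MT and ST labels of an identical set of isolates gives directly comparable values, which is how two typing schemes can be contrasted.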
Using Arlequin v3.5.2.2 [54], the genetic diversity of Malagasy isolates was evaluated by calculating several indices: Nei's unbiased estimates of genetic diversity (H nb ), number of haplotypes, number of alleles, number of private alleles. The allelic richness and private allelic richness were evaluated using HP-RARE [55]. The intra-population diversity indices (H nb , number of haplotypes, mean number of alleles, number of private alleles, allelic richness and private allelic richness) provide information on the level of genetic diversity in a population, the population's epidemiological profile (epidemic or endemic), and the mode of dispersion of the inoculum (stochastic dispersion or dissemination by infected plant material, etc.).
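As an illustration of the main diversity index, Nei's unbiased gene diversity for haploid data at one locus is H = [n / (n - 1)] (1 - Σ p_i²), where n is the sample size and p_i the frequency of allele i; the multilocus value is the mean over loci. The sketch below assumes this standard form; it is not a reimplementation of Arlequin, and the toy profiles are invented.

```python
from collections import Counter


def nei_unbiased_diversity(alleles):
    """Nei's unbiased gene diversity at a single locus (haploid data)."""
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two isolates are required")
    freqs = [c / n for c in Counter(alleles).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def mean_diversity(profiles):
    """Average the single-locus diversity over all loci.

    `profiles` is a list of equal-length tuples, one tuple of repeat
    numbers per isolate.
    """
    loci = list(zip(*profiles))
    return sum(nei_unbiased_diversity(locus) for locus in loci) / len(loci)


# Toy population: 4 isolates typed at 2 loci (second locus monomorphic).
toy_profiles = [(5, 2), (5, 2), (6, 2), (6, 2)]
toy_h = mean_diversity(toy_profiles)
```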
For the robustness of analyses, we considered that a minimum number of samples was required (size n ≥ 14) to form a population. A population is defined a priori as a group of individuals that could exchange genes. Different parameters, such as plant host, altitude and collection area, were considered for the analysis of the genetic diversity indices and to define putative populations. The F ST and R ST indices calculated with Arlequin (10 000 permutations) provided information on the similarity or differentiation between populations due to the presence or absence of gene flow, mainly due to migration.
We also determined the genetic structure of populations without a priori assumptions via the R package Bios2mds [56]. This package is used for conducting metric multidimensional scaling (MDS), a method that represents measurements of similarity (or dissimilarity) among pairs of objects as distances between points of a low-dimensional space [57]. Furthermore, the genetic relationships among Malagasy isolates and between Malagasy and worldwide isolates were computed using Phyloviz (http://www.phyloviz.net/) [58]. Different minimum spanning trees (MSTs) were built with the goeBURST full MST algorithm using global optimal eBURST (goeBURST) and Euclidean distances. They revealed the level of differentiation between haplotypes: Single, Double or Triple Locus Variant (SLV, DLV, or TLV), and enabled the identification of clonal complexes (CC). We defined a CC as a group of SLV haplotypes in which the founder(s) is (are) defined as the haplotype(s) with the largest number of SLVs.
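The notions of locus variants and clonal complexes can be made concrete with a short sketch. It counts the loci at which two haplotypes differ and groups haplotypes connected by chains of SLV links. This is a simplified single-linkage grouping, not the goeBURST algorithm with its tie-break rules as implemented in Phyloviz; haplotype names and profiles are hypothetical.

```python
from itertools import combinations


def locus_differences(a, b):
    """Number of loci at which two haplotypes differ (1 = SLV, 2 = DLV...)."""
    return sum(x != y for x, y in zip(a, b))


def clonal_complexes(haplotypes):
    """Group haplotypes linked, directly or indirectly, by SLV relationships."""
    parent = {name: name for name in haplotypes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(haplotypes, 2):
        if locus_differences(haplotypes[a], haplotypes[b]) == 1:
            parent[find(a)] = find(b)

    groups = {}
    for name in haplotypes:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical three-locus haplotypes.
toy = {"H1": (1, 2, 3), "H2": (1, 2, 4), "H3": (1, 5, 4), "H4": (9, 9, 9)}
toy_complexes = clonal_complexes(toy)
```

In this toy set, H1 and H3 are double-locus variants of each other but fall in the same complex because both are SLVs of H2, which would therefore be the founder under the definition given above.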

Development of the RS1-MLVA14 scheme
We developed an optimized MLVA scheme capable of characterizing phylotype I isolates with greater reliability and discriminatory power. Consequently, we were able to overcome certain issues, namely the absence of polymorphism, a low typeability rate and the presence of different VNTRs in some loci [20,59]. These problems were highlighted by our in-laboratory preliminary tests with Malagasy isolates. Based on previously identified VNTR markers [20, 22, 38] and our present genome screening, we selected a final set of 14 VNTR loci to generate the RS1-MLVA14 scheme (Table 1). VNTR markers were distributed on the chromosome (n = 4) or the megaplasmid (n = 10) (S1 Fig), 5 being intergenic and 9 being intragenic. VNTR sizes varied from 5 to 9 nucleotides and were repeated in the genomes from 1 to 25 times. In our collection of 293 phylotype I isolates, all selected VNTR loci were polymorphic. The number of alleles per locus ranged from 2 to 16 alleles and the allelic richness varied from 1.179 to 5.027. The VNTR marker GMIch_1844 showed the highest genetic diversity (H nb = 0.738), whereas the VNTR marker GMImp_1618 had the lowest (H nb = 0.027). Our screening of 114 publicly available RSSC genomes (S3 Table) revealed that the 14 selected VNTR loci were found in all the 48 phylotype I isolates (except in some incomplete genomes, where 3% of loci could not be found, as already observed [20]). In addition, 10 VNTR loci were found in the 3 phylotype III isolates and 3 to 8 VNTR loci were found in the 50 phylotype II and 13 phylotype IV isolates. Four VNTR loci (GMIch_0581, GMIch_3461, GMIch_1844, GMImp_0266) appeared to be specific to phylotype I.

Genotypic resolution of the RS1-MLVA14 scheme
The genotypic resolution of our MLVA scheme was assessed by a genotype accumulation curve (S2 Fig), which revealed that our set of loci is sufficient to discriminate between haplotypes in our collection, given that nearly 100% of the genotypes could be detected with 13 markers.
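The idea behind a genotype accumulation curve can be sketched as follows: for each number k of loci, count how many distinct genotypes are recovered when only k loci are used, averaged over subsets of loci. Such curves are often estimated by random resampling of loci; the sketch below instead enumerates every subset, which is an assumption made here to keep the example deterministic and remains feasible for 14 loci (16 383 subsets). The toy profiles are invented.

```python
from itertools import combinations


def genotype_accumulation(profiles):
    """Mean number of distinct genotypes for every number of loci k.

    `profiles` is a list of equal-length tuples (one per isolate).
    Returns a dict mapping k to the mean count over all k-locus subsets.
    """
    n_loci = len(profiles[0])
    curve = {}
    for k in range(1, n_loci + 1):
        counts = [
            len({tuple(p[i] for i in subset) for p in profiles})
            for subset in combinations(range(n_loci), k)
        ]
        curve[k] = sum(counts) / len(counts)
    return curve


# Toy data: 4 isolates, 3 loci, 3 distinct full genotypes.
toy_curve = genotype_accumulation([(1, 1, 1), (1, 2, 1), (2, 2, 1), (2, 2, 1)])
```

A curve that reaches its plateau before all loci are included, as reported here with 13 of the 14 markers, indicates that the marker set is sufficient to resolve the genotypes present in the collection.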
The comparative study of the discriminatory power of the RS1-MLVA14 and MLST-7 typing schemes using 94 isolates representing 9 sequevars of phylotype I (S4 Table) showed that RS1-MLVA14 reveals almost twice as many haplotypes as MLST-7: 27 MLVA types (MTs) instead of 14 sequence types (STs). The discriminatory power is 1.38 times higher with RS1-MLVA14 than with MLST-7 (HGDI = 0.793 vs 0.574). RS1-MLVA14 was able to split some sequevars more finely than MLST-7 (S4 Table). However, it should be noted that both the MLST-7 and RS1-MLVA14 schemes were unable to subdivide sequevars I-13, I-14, I-16, I-34 and I-46, each of which was represented by only a single ST and a single MT.

RS1-MLVA14 revealed genetic diversity among Malagasy phylotype I populations and was discriminative at the field scale
The RS1-MLVA14 scheme was applied on 148 Malagasy isolates. The 14 VNTR loci were amplified from all isolates except the GMIch_3461 locus, which was not amplified, despite several assays in two isolates (RUN0306 and RUN0307). Nonetheless, the latter was retained for the analysis of diversity (H nb , allelic richness, private allelic richness), genetic structure (F ST , R ST , MDS) and the genotype accumulation curve (GAC). Overall, 31 haplotypes were identified, including two major haplotypes: MT007 and MT004, which represented 30.14% and 16.44% of the isolates, respectively (S5 Table). The frequency of the other haplotypes varied from 0.68% to 5.48%. Regarding the host of isolation (S3 Fig), the majority of haplotypes (70.97%) were isolated from one host: 32% haplotypes were isolated from S. lycopersicum, 32% from C. annuum, 18% from S. aethiopicum, 14% from S. tuberosum and 4% from S. melongena. However, some haplotypes were found on several hosts: the most prevalent haplotype MT007 was found on four hosts (S. The number of haplotypes per field varied from one to six (Fig 2). Interestingly, three haplotypes MT004, MT006 and MT007 were present at both high and low altitudes (Fig 2). The most prevalent haplotype MT007, identified in five fields (Fig 2), was found in the three studied regions (Fig 2). The haplotypes MT004 and MT006, identified in three and two fields, respectively (Fig 2), were found in the regions of Itasy and Atsinanana (Fig 2). Four haplotypes (MT015, MT020, MT023 and MT024) were present in the highland regions of Analamanga and Itasy (Fig 2). The 24 remaining haplotypes were identified in only one region: 9 haplotypes in the Atsinanana region, 8 haplotypes in the Itasy region and 7 haplotypes in the Analamanga region (Fig 2).
The overall genetic diversity of Malagasy phylotype I isolates was H nb = 0.226. However, there were marked differences between the three regions, with a H nb value of 0.109, 0.318, and 0.430 for Itasy, Analamanga and Atsinanana regions, respectively. The average allelic richness over loci and the average private allelic richness over loci also differ between the three regions, 1.54 and 0.05 in Itasy, 2.32 and 0.49 in Analamanga, and 2.98 and 1.28 in Atsinanana (S6 Table).
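Because the three regional samples differ in size (95, 28 and 25 isolates), allelic richness is only comparable after rarefaction to a common sample size. A minimal sketch of the standard rarefaction formula is given below: the expected number of alleles in a random subsample of g isolates is Σ_i [1 - C(N - N_i, g) / C(N, g)], where N is the sample size and N_i the count of allele i. This illustrates the principle and is not the HP-RARE code; the function name and toy data are hypothetical.

```python
from collections import Counter
from math import comb


def rarefied_allelic_richness(alleles, g):
    """Expected number of distinct alleles in a subsample of size g."""
    n = len(alleles)
    if not 0 < g <= n:
        raise ValueError("g must be between 1 and the sample size")
    total = comb(n, g)
    # For each allele: probability that it appears at least once in the subsample.
    return sum(1.0 - comb(n - c, g) / total for c in Counter(alleles).values())


# Toy locus: allele 1 seen three times, allele 2 once; subsample of 2.
toy_richness = rarefied_allelic_richness([1, 1, 1, 2], 2)
```

Private allelic richness follows the same logic, restricted to alleles that are expected to be sampled in one population and in no other.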

RS1-MLVA14 revealed genetic structure among Malagasy phylotype I populations and showed the singularity of the most prevalent genetic cluster
The MDS representation showed that the Malagasy haplotypes could be grouped into two clusters with a probability of 67.5% (Fig 3). Cluster 1 and cluster 2 included 25 and 6 haplotypes, respectively. Haplotypes were assigned to these clusters with high probability (90.9%). Axis 1 and Axis 2 explained 81.7% and 9.2% of the variance, respectively (Fig 3).
Interestingly, the MDS and MST representations were congruent with the phylogenetic assignment (i.e. sequevar) of the isolates based on egl partial sequences. Indeed, cluster 1 (MCC1, MCC2) grouped the isolates belonging to sequevar 18, while cluster 2 gathered isolates belonging to sequevar 33 (MCC4) and sequevar 46 (MCC3). Moreover, MCC1 was the only CC that included isolates from both highlands and lowlands. The remaining CCs comprised isolates from the highlands only (MCC2, MCC4) or from the lowlands only (MCC3) (Fig 4). All CCs included isolates collected from several hosts except MCC3, for which isolates were only collected from C. annuum (S3 Fig). A global MST was built to display the genetic relationships between phylotype I isolates from Madagascar (146 isolates, 31 haplotypes) and worldwide isolates (145 isolates, 76 haplotypes) (Fig 5). Only the seven worldwide haplotypes that had genetic links with the 31 Malagasy haplotypes are shown in Fig 5. The global MST revealed 4 clonal complexes. Strikingly, Malagasy haplotypes in cluster 1 (MCC1, MCC2), corresponding to sequevar 18, appeared genetically distant from worldwide phylotype I isolates. Indeed, the major Malagasy MCC1 differed from international haplotypes by at least ten loci (except one strain from Guadeloupe, MT059, which differed by 2 loci). The Malagasy minor MCC2 appeared closely related to MCC1 and unrelated to other worldwide haplotypes. In contrast, Malagasy haplotypes in cluster 2 (MCC3 and MCC4, which correspond to sequevars 46 and 33, respectively) appeared related to other worldwide haplotypes. Indeed, the three Malagasy haplotypes (MT010, MT011, and MT014) belonging to MCC3 were very closely related to haplotypes from Mayotte and Côte d'Ivoire (single-locus variant). Thus, they were grouped in the same CC. Interestingly, the haplotypes from Madagascar (MT010, MT011 and MT014) and Mayotte (MT081) were isolated from the same host species, C. annuum. The two Malagasy haplotypes, MT021 and MT025, belonging to MCC4 were included in a CC, which gathered isolates from Rodrigues, Mauritius and Réunion. Interestingly, within this CC, the haplotype MT025 isolated from S. lycopersicum was found both in Mauritius and in Madagascar. S. tuberosum was also the host plant shared by the haplotypes collected in Madagascar (MT021) and Mauritius (MT025, MT094).

PLOS ONE

Discussion
A growing interest in the use of MLVA for studying the population biology of plant pathogenic bacteria has been observed in recent years [4,11,13–17]. As far as the RSSC is concerned, MLVA schemes have been applied to the global surveillance of phylotype III [38] and to trace potential sources of contamination by phylotype IIB-1 in England and Ethiopia [21,60]. Very recently, they have been used to study the genetic diversity and structure of phylotype I and IIB-1 populations in Uganda [59]. The study of RSSC populations (phylotypes I, II and III) collected in the central highlands of Madagascar in 2013 was the first project of such magnitude (1224 isolates collected) [19]. Remarkably, this work revealed contrasting epidemiological patterns between phylotype IIB-1 and phylotype III populations. The population biology of Malagasy phylotype I had not yet been studied due to the absence of reliable MLVA markers to amplify phylotype I isolates [19]. The objectives of the present study were to: develop and apply an optimized MLVA scheme to study the population genetics of Malagasy phylotype I; analyse the genetic relationships with worldwide phylotype I; and compare the epidemiological situation of phylotype I with Malagasy phylotypes IIB-1 and III.

RS1-MLVA14 scheme, a genotyping tool adapted to study the population genetics of phylotype I, both at field and global scales
By using previously identified VNTR markers [20,22,38], as well as by screening genomes for new VNTR markers, we developed an optimized MLVA scheme adapted to RSSC phylotype I isolates. The 14 selected VNTR markers were all present in phylotype I, present to a lesser extent in phylotype III and rarely found in phylotypes II and IV. This is consistent with the species delineation of phylotypes I and III (belonging to R. pseudosolanacearum) compared to phylotype II (R. solanacearum) and phylotype IV (R. syzygii) [27,28]. The RS1-MLVA14 scheme showed a very high typeability because only one VNTR locus could not be amplified from 2 out of the 293 phylotype I isolates tested. We assessed whether the RS1-MLVA14 scheme improved genotypic resolution compared to MLST-7 [37]. MLST is the gold standard for epidemiological surveillance and outbreak investigations of bacterial diseases. The discriminatory power of the RS1-MLVA14 scheme (HGDI = 0.793) was higher than that of the MLST-7 scheme (HGDI = 0.574). This revealed greater genetic and haplotypic diversity and highlighted the value of this new MLVA scheme. VNTR loci with low diversity values are useful to establish phylogenetic relationships, whereas those with a high diversity have a strong discriminatory power [61]. VNTRs are thought to have high mutation rates, which increases the likelihood of homoplasy [62]. In addition, lower HGDI values may indicate high levels of homoplasy, although low values may also be due to low mutation rates [62]. The RS1-MLVA14 scheme combines VNTR loci with variable diversity values and retains the phylogenetic signal, since our MDS and MST analyses showed that the isolates were grouped according to their phylogenetic assignment (sequevar). These results suggest that the probability of homoplasy is low. They support previous studies, such as the work on phylotype III of the RSSC [38] or other plant pathogens [3,17].
Lastly, we showed that the RS1-MLVA14 scheme can be used to explore the genetic diversity at the field scale (several haplotypes were disclosed in single fields), as well as at a global scale. Thus, RS1-MLVA14 is well suited to deciphering the epidemiology of phylotype I populations.

RS1-MLVA14 unveiled contrasting genetic diversity and epidemiology among Malagasy phylotype I populations
The analysis of phylotype I populations revealed contrasting genetic diversity depending on the three Malagasy regions. Genetic diversity is greater in the lowlands than in highland areas. In the lowlands, the Atsinanana region imports huge amounts of vegetables from other Malagasy producing areas, such as Ambatondrazaka or wholesale markets in Antananarivo, the Malagasy capital [63]. In addition, this region has the Toamasina seaport, where there is a great deal of trade in plant material with foreign countries, which could be responsible for the introduction of contaminated agricultural products in this lowland region. It might also explain why the level of genetic diversity of phylotype I isolates is greater.
Our findings showed a lower genetic diversity in phylotype I populations in the central Malagasy highlands. Some haplotypes were found in different fields in the same region (for example MT007 was detected in plots 7, 9 and 12 in the Itasy region), which suggests possible field contamination from exchanges of infected plant material. Indeed, producers do not always have access to healthy seeds [64]. Farmers often produce their own seeds or exchange seeds with other farmers, with no sanitary guarantee. Trade in agricultural products might encourage the propagation of haplotypes between regions. Most vegetables from the Itasy region are sold in the capital, Antananarivo (Analamanga region), at Madagascar's wholesale markets, where large volumes of agricultural products are traded [64].
Another interesting point shown by our study is that Malagasy phylotype I populations (sequevars 18, 33 and 46) are genetically differentiated depending on the elevation. Remarkably, a similar situation was observed in China, where the genetic structure of phylotype I isolates (for sequevars 13, 14, 15, 17, 34, 44, 54 and 55) was associated with elevation [65]. Sequevar 18 seems to be adapted to both warm and cool temperatures in Madagascar, since it was isolated in the lowlands at sea level in the Atsinanana region (temperature range is from 17˚C to 30.1˚C), as well as in the highlands at an altitude of up to 1500m in the Analamanga region (from 8.9˚C to 26.6˚C) and the Itasy region (from 9.1˚C to 28˚C) [66]. Interestingly, sequevar 31 isolated in the lowlands and up to altitudes above 1000m in Réunion [37], as well as in Côte d'Ivoire [67], exhibited a similar environmental distribution. So far, sequevar 18 has always been reported from lowlands in the Americas [68][69][70], Africa [67], Asia [71][72][73], Oceania [74] and on islands in the South-West Indian Ocean [37,75]. In contrast, even though a very limited number of isolates was used in our study, sequevar 33 was only isolated at moderate and high elevations (above 900m) and sequevar 46 was only isolated in lowlands (below 300m). Thus, in addition to the studies in China [65], Réunion [37] and Côte d'Ivoire [67], our study in Madagascar shed new light on the ecology of phylotype I isolates, which are often reported as only being composed of tropical and subtropical isolates adapted to lowland areas. Altogether, the ecology of phylotype I isolates differs from that of phylotype IIB-1 and phylotype III isolates, which are considered as cold-tolerant and adapted to highland areas [19,23,24,76,77]. 
Further research and massive sampling of phylotype I isolates would certainly provide a better understanding of the differences in the prevalence of sequevars according to elevation and the factors determining their differential environmental adaptation.
Our study of population structure showed that Malagasy phylotype I isolates were distributed in two genetic clusters, which displayed different epidemiological features. A striking result is the singularity of the most prevalent Malagasy isolates belonging to sequevar 18 (corresponding to cluster 1). Indeed, despite the fact that sequevar 18 is distributed worldwide [37,67–73,75], our study revealed no genetic links between Malagasy isolates and worldwide isolates (South-West Indian Ocean, Africa, Americas, Asia isolates). The closest haplotype (one strain) was a double-locus variant from Guadeloupe. Cluster 1 could derive from an ancient introduction of isolates that subsequently evolved locally. This would explain the differentiation from worldwide isolates, with short-distance dissemination restricted to Madagascar. This work must be continued by integrating more Malagasy and global phylotype I isolates to further our understanding of the singularity of isolates belonging to cluster 1. In contrast, cluster 2, which included the least prevalent isolates in Madagascar (sequevars 33 and 46), showed genetic links with isolates from South-West Indian Ocean islands (Mayotte, Mauritius, Rodrigues and Réunion). So far, apart from the recent introduction of phylotype I via rose cuttings in the Netherlands [78], no long-distance dissemination of phylotype I isolates has been reported. Only the spread of phylotype IIB isolates via bananas, potato tubers and geranium cuttings [79–81] has been clearly documented. Interestingly, based on the egl sequence analysis, the phylotype I isolates introduced in the Netherlands were assigned to sequevar 33 [82], the sequevar that is likely to have been disseminated between Madagascar and other South-West Indian Ocean islands. Moreover, the egl sequences from the Dutch isolates appeared 100% identical to egl sequences from isolates from Madagascar, Mauritius [37], Rodrigues [37,83] and India [84].
This highlights the phylogenetic links between isolates from distant geographical areas. As far as the sequevar 46 is concerned, it has now been reported in Madagascar and Mayotte (South-West Indian Ocean) [75], as well as in more geographically distant areas, such as Côte d'Ivoire [67] and Myanmar [85]. All these data strongly support the theory of the global dissemination of phylotype I, which is probably linked to human activities and the transport of contaminated material [86].

Comparative epidemiology of Malagasy phylotype I, IIB-1 and III populations
Our study showed that the level of genetic diversity of Malagasy phylotype I populations was low overall (H nb = 0.226). This result was surprising since, according to McDonald and Linde [1], phylotype I is considered to have a high evolutionary potential due to its natural ability to transform and recombine [37,86,87]. The level of genetic diversity of Malagasy phylotype I populations was quite similar to that of Malagasy phylotype IIB-1 populations (H nb = 0.19) and clearly lower than that of Malagasy phylotype III populations (H nb = 0.40) [19]. As determined for Malagasy phylotype IIB-1 [19], the Malagasy phylotype I haplotypes appeared to be closely related genetically and gathered into clonal complexes. Some haplotypes were shared between fields in the same agroecological region and even between agroecological areas (at low and high elevations). In contrast to the endemic nature of phylotype III [19], these results suggest that the phylotype I isolates have an epidemic population pattern similar to that of phylotype IIB-1 isolates.
In Madagascar's central highlands above 1000 m elevation, phylotype I populations co-occurred with populations of phylotypes IIB-1 and III [19]. In addition, as for phylotypes IIB-1 and III [19], phylotype I populations were isolated from many solanaceous crops (S. lycopersicum, S. tuberosum, S. melongena, S. gilo and C. annuum) and weeds (S. nigrum, S. scabrum). Weeds play an important role. They often remain in the soil after harvesting and harbour pathogenic populations, which survive in the environment as a result [24,88,89]. Our data support this: although our weed sampling was very limited, RS1-MLVA14 did not reveal specific haplotypes on weeds. However, in the central Malagasy highlands, phylotype I populations were the least prevalent (10%, versus 18% for phylotype III and 72% for phylotype IIB-1) [19]. This observation could be explained by the fact that phylotype IIB-1 populations are reported to be better adapted to low temperatures than phylotype III populations, and even more so than phylotype I populations [76,77,90]. Interestingly, a comparative pathogenicity test of Malagasy phylotype I, IIB-1 and III isolates on eight potato varieties showed that all isolates were strongly pathogenic at tropical lowland temperatures, but that IIB-1 isolates were the most aggressive [18]. Similar comparative pathogenicity tests should now be performed on different hosts at cold temperatures to determine whether temperature influences the capacity of isolates of the three phylotypes to co-occur in the same cropping areas and cause disease.
In conclusion, during this study, we developed an optimized MLVA scheme dedicated to phylotype I populations. We showed that the RS1-MLVA14 scheme is highly resolutive at global, regional and field scales, which makes it suitable for epidemiological studies. This work on Malagasy phylotype I represents a first step. Our research was limited to three vegetable growing regions in Madagascar. A broader study has been launched with a wider spatial scale, including all the vegetable cropping areas in Madagascar. The goal is to further our understanding of the migration routes and population biology of phylotype I in Madagascar and between Madagascar and other countries.
Supporting information
S1 Table. F