Abstract
Bacterial canker, caused by Xanthomonas citri pv. mangiferaeindicae (Xcm), is a disease that has a devastating impact on mango and cashew industries in many regions. Yet, despite its agricultural importance for these Anacardiaceae species, Xcm has been neglected. Little is known about its epidemiology, evolution and molecular interactions with host plants. The most relevant studies reporting its genetic structure were primarily based on amplified fragment length polymorphism (AFLP) data. This technique provides reliable assessments of the genetic relatedness among bacteria, but is limited in terms of interlaboratory comparisons. Alternative genotyping techniques are required to decipher the global epidemiology and geographic expansion of Xcm. Herein, we screened the genome of the Xcm strain CFBP1716 for tandem repeats. We developed and evaluated the performance of an optimized Multi Locus Variable number of tandem repeat Analysis (MLVA), targeting 16 tandem repeat loci primarily with large repeat units, i.e., minisatellites (MLVA-16). To achieve this, we genotyped a comprehensive collection of 152 Xcm strains, representative of the pathogen’s worldwide genetic diversity, together with some reference strains of X. citri pv. anacardii, another genetically-related pathogen of Anacardiaceae. MLVA-16 allowed us to distinguish the two pathovars. Although MLVA-16 was slightly less discriminative than AFLP, the two derived datasets were strongly correlated, suggesting that MLVA-16 provides a good phylogenetic signal. Five clusters with some geographic coherence were delineated, based on discriminant analysis of principal components. The two major clusters grouped strains from multiple geographic origins. In contrast, all strains that have emerged on mango or cashew in West Africa grouped in one cluster, which did not contain any strains of different origin. 
MLVA-16 represents an opportunity to improve our understanding of the structure of Xcm populations through the sharing of genotyping data. The MLVA-16 data generated in this study were deposited in a dedicated online database.
Citation: Boyer K, Zombre C, Payan L, Wonni I, Pruvost O (2025) A new tandem repeat-based genotyping scheme for the global surveillance of Xanthomonas citri pv. mangiferaeindicae, an understudied bacterial pathogen of major importance to mango and cashew production. PLoS One 20(11): e0336768. https://doi.org/10.1371/journal.pone.0336768
Editor: Baochuan Lin, Defense Threat Reduction Agency, UNITED STATES OF AMERICA
Received: April 29, 2025; Accepted: October 30, 2025; Published: November 26, 2025
Copyright: © 2025 Boyer et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All data are available from the MLVAbank database https://webphim.ird.fr/MLVA_Bank/Genotyping/index.php?&largeur=1920.
Funding: KB and OP report the following sources of funding: the European Union (ERDF contracts GURDT I2016‐1731‐0006632 and 2024-1248-005756; https://commission.europa.eu/funding-tenders/find-funding/eu-funding-programmes/european-regional-development-fund-erdf_en); Conseil Régional de La Réunion (https://regionreunion.com); and Centre de coopération internationale en recherche agronomique pour le développement (CIRAD; https://www.cirad.fr).
Competing interests: The authors have declared that no competing interests exist.
Introduction
Protecting crops from pests and diseases is a priority for global food security. Pathogens not only cause significant economic losses, they also directly affect food availability, causing food insecurity [1]. Surveillance programs are essential to help mitigate these potentially serious social and economic consequences. They can provide important data on the early detection of outbreaks and guidance for disease management. Genotyping techniques targeting DNA markers or single nucleotide polymorphisms (SNPs), identified from whole genome sequencing (WGS) data, can be applied to the study of outbreak pathogen populations. Indeed, they can determine important factors to help improve disease management, such as the source populations, transmission pathways and migration distances. The epidemiology of bacterial pathogens has greatly benefited from high throughput WGS [2,3] over the last decade. However, there are still many cases where marker-based genotyping is useful. Among markers with desirable characteristics, tandem repeats (TR), assayed in a Multi Locus Variable Number of Tandem Repeat (VNTR) Analysis (MLVA) format, are simple, discriminative, robust and cheap. They can be used to (i) decipher the population biology of bacterial pathogens, especially in the case of the so-called genetically monomorphic pathogens, or (ii) provide guidance for strain selection in studies on WGS-based population genomics [4,5]. Selecting TR markers with a rate of evolution that fits the evolutionary scale under investigation is key [6]. For studies at a large evolutionary scale, MLVA typing schemes that target a large number of minisatellites (defined here as TRs with a repeat unit length ranging from 10 to 300 bp), can produce datasets with a strong phylogenetic signal [7].
Two Anacardiaceae species, mango (Mangifera indica L.) and cashew (Anacardium occidentale L.), represent major cash crops in tropical and subtropical countries. Mango world production reached approximately 61 M tons in 2023 and ranked 6th in fruit production (https://www.fao.org/faostat/en/#data/QCL). The largest mango producers, India, Indonesia, China, Mexico and Pakistan, represent 65% of global production. Cashew world production reached approximately 4 M tons in 2023, of which 43% came from the top 5 producers in Africa (Ivory Coast, Benin and Tanzania) and Asia (India and Vietnam). With the exception of Mexico and Vietnam, production in the major mango- and cashew-producing countries is threatened by the potentially destructive mango bacterial canker (MBC), also called mango bacterial black spot (MBS) [8–10]. The earliest reports of MBC were in India (herbarium specimens collected in Bihar) in 1881 [11] and South Africa (outbreak description) in 1915 [12]. MBC occurs in many countries in Asia, Oceania, Africa, the Indian Ocean region and North America [8,13,14].
The causal agent of MBC was first described in the early 20th century and classified in the 1970s as Xanthomonas campestris pv. mangiferaeindicae [15]. Then, with the evolution of taxonomy, it was reclassified as Xanthomonas citri pv. mangiferaeindicae (Xcm) [16,17]. Xcm is a Gram-negative bacterium that produces non-pigmented colonies, unlike most Xanthomonas species [8,18]. Xcm was first isolated from the mango tree, but was found to be pathogenic to other Anacardiaceae species by inoculation [11,19]. Xcm strains were subsequently shown to induce symptoms under natural conditions on other members of the Anacardiaceae family, including cashew and Brazilian pepper [8,14,20]. The genetic diversity among Xanthomonas strains isolated from Anacardiaceae worldwide was previously studied using several genotyping techniques, for example, amplified fragment length polymorphism (AFLP) and a derivative of AFLP targeting an insertion sequence (IS-LMPCR). Restriction fragment length polymorphism (RFLP) has also been used to target different genomic regions, i.e., IS1595, the hrp cluster and a transcription activator-like type III effector (TALE) [16,18,21]. AFLP combines a high discriminatory power with a good phylogenetic signal and was most suited to deciphering the global Xcm genetic structure [16,22]. Among strains isolated from Anacardiaceae other than mango, some were confirmed as Xcm, while others were reclassified into distinct pathovars, such as X. citri pv. anacardii (Xca) and X. axonopodis pv. spondiae [14,22]. These studies also suggested that the Xcm population structure has three genetic clusters (referred to as A, B and CD to match the early RFLP-based clustering). Among these genotyping techniques, only TALE-RFLP delineated strains of cluster CD, based on their host of origin (mango vs. Brazilian pepper) [18,22].
However, these techniques are clearly limited in terms of their capacity to analyse large strain numbers (RFLP) or to compare datasets produced in different laboratories (AFLP). Therefore, new typing techniques are required [23].
Clustered regularly interspaced short palindromic repeats (CRISPR) and tandem repeats (TR) were recently proposed as useful targets for deciphering the genetic diversity of some xanthomonads [24]. CRISPR loci are only present in some Xanthomonas pathovars [23,25,26]. In contrast, TRs, referred to as microsatellites and minisatellites according to the size of the repeats, are widely present in Xanthomonas genomes. Microsatellites (TRs < 10 bp in size) are primarily used for delineating the structure of populations at small evolutionary scales (i.e., local molecular epidemiology). However, high mutation rates make them suboptimal for assessing deep genetic relatedness among populations. Minisatellites (TRs ranging from 10 to 250 bp in size) evolve more slowly than microsatellites. Minisatellite typing, which was previously developed for the genetically related X. citri pv. citri, showed a phylogenetic structure congruent with that derived from AFLP and single nucleotide polymorphism (SNP) data [27,28]. Similarly, minisatellites were recently found to be useful for assessing the global genetic diversity of the cassava pathogen, X. phaseoli pv. manihotis [29]. Given the technical simplicity and portability of minisatellite typing (i.e., good interlaboratory reproducibility and the availability of online databases, such as MLVAbank [30]), this technique is suitable for describing the overall structure of pathogen populations (i.e., global molecular epidemiology) [28,31]. In these respects, it offers practical advantages over older typing methods, such as RFLP and AFLP.
The aims of this study were to (i) evaluate CRISPR and TR suitability for subtyping Xcm, and (ii) develop a robust MLVA scheme suitable for global studies on its epidemiology and to build a dedicated online public database. Herein, we report a new MLVA genotyping scheme targeting 14 minisatellites and two microsatellites (MLVA-16). We demonstrate its pertinence for studying the global diversity of Xcm.
Materials and methods
Bacterial cultures and media
In this study, we used 152 Xcm strains, originating from several continents and host species, and representative of its genetic and pathological diversity [14,22] (S1 Table). Strains were cultured on YPGA plates (yeast extract 7 g l⁻¹, peptone 7 g l⁻¹, glucose 7 g l⁻¹ and agar 18 g l⁻¹; propiconazole 20 mg l⁻¹; pH 7.2), at 28°C. For each strain, a single colony was transferred onto YPGA and grown at 28°C overnight. One µl of culture was diluted in 400 µl of 0.01 M Tris buffer, pH 7.2, and stored at −30°C in deepwell microplates prior to PCR amplifications.
In silico VNTR and CRISPR mining from genomic sequences of Xcm
The CFBP1716 Xcm genome sequence (GenBank accession CP156913), herein considered our reference, together with two other high-quality genome sequences produced by our group, was used to detect TR loci using Tandem Repeats Finder (TRF) [32,33]. The total length was set in a range of 50–1000 bp, the length of tandem repeats was set at ≥10 bp, and other parameters were left at their default settings. Loci with TR sequence conservation < 80% and those corresponding to TALE genes were not considered further. Genomic flanking regions (500 bp each side of the TR region) were used to define primer pairs for PCR amplification, with the Oligo 6 software (https://www.oligo.net/). CRISPR mining was performed on the available high-quality Xcm genomes [25,26] using CRISPRCasFinder with default parameter settings [34]. The high-quality genome sequences of strains CFBP9184, CFBP9185 and GXG07 (GenBank accessions CP156909, CP156905 and CP073209, respectively) were screened a posteriori upon availability [25,26].
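The locus selection criteria stated above can be illustrated with a minimal Python sketch. It does not reproduce TRF or its output format; the `TandemRepeat` record, its field names and the example loci are hypothetical and serve only to show how the thresholds (array length of 50–1000 bp, repeat unit ≥ 10 bp, conservation ≥ 80%, exclusion of TALE genes) combine.

```python
from dataclasses import dataclass

@dataclass
class TandemRepeat:
    locus: str            # hypothetical locus label
    unit_length: int      # repeat unit length (bp)
    array_length: int     # total length of the TR array (bp)
    percent_match: float  # sequence conservation among repeats (%)
    in_tale_gene: bool    # True if the locus lies within a TALE gene

def passes_filters(tr: TandemRepeat) -> bool:
    """Apply the selection thresholds described in the text."""
    return (
        50 <= tr.array_length <= 1000
        and tr.unit_length >= 10
        and tr.percent_match >= 80.0
        and not tr.in_tale_gene
    )

# Hypothetical candidates, not loci from this study
candidates = [
    TandemRepeat("locus_a", 12, 96, 95.0, False),     # kept
    TandemRepeat("locus_b", 7, 70, 100.0, False),     # unit shorter than 10 bp
    TandemRepeat("locus_c", 102, 1530, 90.0, True),   # array too long, TALE gene
    TandemRepeat("locus_d", 15, 60, 72.0, False),     # conservation below 80%
]

retained = [tr.locus for tr in candidates if passes_filters(tr)]
```

In this toy example only `locus_a` satisfies all four conditions.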
MLVA genotyping
Genotyping was performed using a comprehensive strain collection representing the currently known genetic diversity worldwide [16] and three reference strains of Xanthomonas citri pv. anacardii, another X. citri pathovar that is pathogenic to Anacardiaceae (S1 Table). Primer pairs targeting single-locus alleles were used in a multiplex PCR format, using the Clontech Terra PCR Direct Polymerase Mix (Takara Bio, CA, USA). For each primer pair, the reverse primer in the PCR mix was 5’-labeled with one of the following fluorescent dyes: 6-FAM, Yakima Yellow, ATTO 565 or Dragon Fly Orange (Eurogentec, Liege, Belgium). PCR mixes contained 7.5 µl of 2x Terra Buffer (containing a hot-start Taq DNA polymerase), 1.5 µl of 5x Q-Solution (Qiagen, Courtaboeuf, France), 0.2 to 0.6 µM of each primer (S2 Table), 1 µl of purified culture and RNase-free water to yield a final volume of 15 µl. PCR amplifications were performed in a Veriti thermocycler (Applied Biosystems, Courtaboeuf, France), under the following conditions: 2 min at 98°C for hot-start activation; followed by 25 cycles of denaturation at 98°C for 10 s, annealing at a temperature ranging from 62 to 69°C, extension at 68°C for 1 min or 1 min 30 s (for long amplicons), and a final extension step at 68°C for 30 min. We mixed 1 µl of diluted amplicons (diluted at a rate determined by test runs, with dilutions ranging from 1:40 to 1:120) with either 10.6 µl of Hi-Di formamide and 0.4 µl of GeneScan 600 LIZ V2 as an internal size standard, or 10.5 µl of Hi-Di formamide and 0.5 µl of GeneScan 1200 LIZ (all from Applied Biosystems, Courtaboeuf, France), depending on the size of amplicons. Then, the mixture was denatured for 5 min at 95°C and placed on ice for at least 5 min.
Capillary electrophoresis was performed in an ABI PRISM 3500xl genetic analyzer (Applied Biosystems, Courtaboeuf, France), using a performance-optimized polymer, POP-7, at 15 kV and 60°C, with an initial injection of 23 s. Reference sample CFBP1716 was used as an internal control in all experiments. Fragment sizes were determined using GeneMapper V6.0 (Applied Biosystems, Courtaboeuf, France). In a few cases, no amplification was obtained by multiplex PCR. Corresponding samples were assayed again using the same primers in a simplex PCR format. If the lack of amplification was confirmed, more external primers were designed and tested with a long-template polymerase. Simplex PCRs were performed using the BD Advantage 2 polymerase mix (Takara Bio, CA, USA). Briefly, 2 µl of boiled bacterial suspension was used as a template in mixes containing 5 µl of 10x Advantage 2 SA PCR Buffer, 0.2 mM of dNTPs mix, 0.2 µM of each primer (S2 Table), 1 µl of Advantage 2 polymerase mix and RNase-free water to yield a final volume of 50 µl. PCRs were performed in a Veriti thermocycler (Applied Biosystems, Courtaboeuf, France), under the following conditions: 3 min at 95°C followed by 40 cycles of denaturation at 95°C for 15 s, annealing at temperatures ranging from 62 to 69°C, extension at 68°C for 1 min and a final extension step at 68°C for 7 min. All simplex amplicons were visualized using a QIAxcel Advanced system (Qiagen, Courtaboeuf, France).
TR sequencing
Rare alleles (n < 5) were checked by Sanger sequencing of amplicons. Amplicons were obtained by simplex PCR with the same primers that were used for genotyping and the BD Advantage 2 polymerase mix (Takara Bio, CA, USA), as described above. Amplicons were then purified using Sera-Mag magnetic beads (Cytiva, Villacoublay, France), as recommended by the manufacturer. Once purified, amplicons were sequenced using the BigDye™ Terminator v.3.1 Cycle Sequencing Kit (Applied Biosystems, Courtaboeuf, France): 5–20 ng of PCR product were used in mixtures containing 2 µl of BigDye Terminator V3.1, 1 µl of 5x sequencing buffer, 1 µl of primer at 5 µM and water to yield a final volume of 10 µl. Sequencing reactions were performed in a Veriti thermal cycler according to the manufacturer’s instructions. Sequencing products were purified using the BigDye XTerminator (Applied Biosystems, Courtaboeuf, France), as recommended by the manufacturer. Capillary electrophoresis was performed in an ABI PRISM 3500xl genetic analyzer (Applied Biosystems, Courtaboeuf, France) with a 50 cm capillary array.
Data scoring and exploration
In a single case (Xcm-4431), TRF predicted different TR unit size alternatives. Genotyping data allowed the correct unit size to be determined unambiguously. Amplicon sizes were converted into tandem repeat numbers using GeneMapper V6.0. Repeat numbers of TR arrays with truncated repeats were rounded up to the nearest higher integer. Nei’s diversity index was calculated for each TR locus using the poppr v.2.9.3 package in R. A minimum-spanning tree was built using PHYLOVIZ 2.0 [35,36], using an algorithm that combines global optimal eBURST (goeBURST) with Euclidean distance and is best suited for MLVA data. The population structure of our strain collection was analyzed using Discriminant Analysis of Principal Components (DAPC), a multivariate method used to identify and describe clusters of genetically related strains, implemented in the adegenet v.2.1.7 R package [37].
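The scoring rule and the per-locus diversity index described above can be sketched in Python as follows. This is an illustration only, not the GeneMapper or poppr workflow used in the study: the function names, the flanking length and the example values are hypothetical, and the estimator shown is the sample-size-corrected form of Nei's gene diversity, which is assumed here to correspond to the index reported.

```python
import math
from collections import Counter

def repeat_number(amplicon_bp: int, flank_bp: int, unit_bp: int) -> int:
    """Convert an amplicon size into a repeat number.

    flank_bp is the combined length of the non-repeated flanking sequence
    included in the amplicon (hypothetical parameter). A truncated final
    repeat is rounded up to the next integer, as stated in the text.
    """
    return math.ceil((amplicon_bp - flank_bp) / unit_bp)

def nei_diversity(alleles) -> float:
    """Unbiased gene diversity for one locus: n/(n-1) * (1 - sum(p_i^2))."""
    n = len(alleles)
    freqs = [count / n for count in Counter(alleles).values()]
    return (n / (n - 1)) * (1 - sum(p * p for p in freqs))

# Hypothetical locus: 12 bp unit, 200 bp of flanking sequence
full_array = repeat_number(260, 200, 12)       # 60 bp of repeats = 5 units
truncated_array = repeat_number(266, 200, 12)  # 5.5 units, scored as 6

# Hypothetical repeat numbers scored at one locus in four strains
diversity = nei_diversity([3, 3, 4, 4])
monomorphic = nei_diversity([5, 5, 5, 5])
```

A monomorphic locus yields a diversity of 0, whereas two alleles at equal frequency in four strains yield 2/3 with this estimator.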
MLVA data were compared to the reference method for Xcm typing, AFLP [16,22]. Our subset comprised 98 strains for which genotyping data were available for both typing techniques. We compared the methods’ discriminatory power by computing the Hunter Gaston discriminatory index (HGDI), using the DescTools package v.0.99.48 in R [38]. The correlation between distance matrices (Dice dissimilarity for AFLP and Manhattan distance for MLVA) was tested by computing Kendall’s coefficient of concordance (W), using a permutation test (9,999 permutations) with the “CADM.global” function of the ape package v. 5.6.2 in R [39]. The non-parametric Kendall’s W statistic evaluates congruence among multiple rating systems, with W ranging from 0 (no agreement) to 1 (complete agreement) [40].
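For readers unfamiliar with these measures, the following minimal Python sketch shows the standard HGDI formula, D = 1 − Σ n_j(n_j − 1) / [N(N − 1)], and the Manhattan distance between two MLVA profiles. It is not the DescTools or ape implementation used in the study; the function names and example data are hypothetical.

```python
from collections import Counter

def hunter_gaston(types) -> float:
    """Hunter-Gaston discriminatory index for a list of type assignments.

    N is the number of strains and n_j the number of strains of type j:
    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)).
    """
    n = len(types)
    counts = Counter(types).values()
    return 1 - sum(c * (c - 1) for c in counts) / (n * (n - 1))

def manhattan(profile_a, profile_b) -> int:
    """Sum of absolute differences in repeat number across loci."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must be scored at the same loci")
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b))

# Hypothetical haplotype assignments for four strains
hgdi = hunter_gaston(["h1", "h1", "h2", "h3"])

# Hypothetical three-locus profiles (repeat numbers)
distance = manhattan((3, 5, 2), (4, 5, 0))
```

The index is the probability that two strains drawn at random without replacement belong to different types, so a value close to 1 indicates a highly discriminative method.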
Results
In silico analyses and TR marker selection
In this study, we identified minisatellites from the genome of the Xcm pathotype strain (CFBP1716), and then subsequently assessed their conservation level in other available high-quality genome sequences (Fig 1). A genotyping method was developed for selected markers, based on multiplex PCR and resolution of the fragments produced by capillary electrophoresis. Finally, a public database including all typing data was created, with the aim that the scientific community could use it for further studies.
Fig 1. Block colors refer to the MAUVE alignment performed in Geneious R10.2.6.
CRISPRCasFinder suggested the absence of cas genes in Xcm genomes. We detected an array composed of five direct repeats and five spacers, which were monomorphic among all queried genomes. This indicates that CRISPR elements are not appropriate for subtyping Xcm (S3 Table). TRF identified 23 TR loci that were evaluated using the Xcm strain collection and three reference Xca strains (S1 Table). No amplification was produced for Xca strains for 10 TR loci (Xcm0497, Xcm0794, Xcm0909, Xcm0943, Xcm2532, Xcm3117, Xcm3948, Xcm4232, Xcm4578 and Xcm4970) (Table 1). Moreover, three other markers made it possible to distinguish Xcm and Xca strains. Xcm4486 was found to be monomorphic among Xcm strains and produced a different sized amplicon among Xca strains. Xcm2555 and Xcm3141 were found to be polymorphic among Xcm, but Xca strains yielded a different sized amplicon. Thirteen TR markers, therefore, have a diagnostic value at the pathovar level.
Six candidate TRs (Xcm2977, Xcm3763, Xcm3942, Xcm4486, Xcm4970 and Xcm5175) were found monomorphic for all Xcm strains assayed and were not considered further. No amplicon was produced from any West African strain for Xcm2532 and Xcm2555 in contrast with the Xcm strains of different origin. The concomitant absence of amplification at loci Xcm2532 and Xcm2555 has some epidemiological value, by allowing us to distinguish these unique strains (i.e., the only ones reported to date causing severe outbreaks on cashew, in addition to mango). However, genotyping data derived from these two markers were not used further in the genetic diversity analysis.
The analysis of two high quality genome sequences of strains from Burkina Faso revealed that these two markers are part of two deletions (23 and 8 kb in size, respectively – Fig 1), which were detected within an integrative and conjugative element (ICE), as compared to CFBP1716 [26]. No amplicon was detected for strain JF955 from the Comoros for Xcm2532 (but one was obtained for Xcm2555). Conventional PCR yielded a 1.5 kb-long amplicon for this strain (therefore, it was undetectable by capillary electrophoresis). This amplicon was sequenced and showed the insertion of a 1,203 bp-long ISXcd1 (IS407-like element of the IS3 family – accession AF263433), with 92% nucleotide identity over 100% length. The produced sequence allowed us to determine the number of repeats for strain JF955.
Fourteen minisatellite markers showed polymorphism and produced amplicons for all assayed Xcm samples. In order to increase its resolutive power, the derived dataset was supplemented with genotyping data previously obtained for two microsatellites displaying a low polymorphism rate, XL6 and XL7 [21].
MLVA genotyping
All strains of Xcm could be genotyped, with TR units per locus ranging from 0 (no amplicon detected for Xcm2532 and Xcm2555) to 12 (Xcm0943) (Table 1). Rare alleles were identified for Xcm0909, Xcm1496, Xcm2532, Xcm3117, Xcm3141, Xcm3755 and Xcm4578. For these samples, Sanger sequencing confirmed that the typing results were correct. We identified 46 haplotypes among the assayed strains and available high-quality genomes. The genotyping data and analysis of TRs in the four available high-quality Xcm genomes yielded identical results. MLVA-16 was slightly less discriminative than AFLP, with Hunter’s D values of 0.948 and 0.911 for AFLP and MLVA-16, respectively. Distance matrices derived from MLVA-16 and AFLP were highly correlated (W = 0.882; p = 0.0001). The mean Nei’s genetic diversity (HT) was 0.300 overall and ranged from 0.013 (Xcm3117 and Xcm3755) to 0.795 (Xcm0497) (Table 1).
BIC values derived from the k-means analyses suggested that five is the most appropriate number of clusters (Fig 2; S1 Fig). All individuals had an assignment probability to a given cluster > 0.99. The different genetic clusters were separated from each other by 3–5 locus-variations (Fig 3). The two major DAPC clusters (clusters 3 and 4) grouped strains from many geographic origins (Pacific, South-West Indian Ocean). Strains identified in cluster 1 were geographically restricted to West Africa, whereas clusters 2 and 5 grouped strains from Asia. Clusters 2 and 4 contained strains previously assigned by AFLP to the B and CD clusters, respectively. Strains previously assigned to the A cluster by AFLP split into clusters 3 and 5.
Fig 2. Numbers and colours represent the five genetic clusters retained from Bayesian information criterion (BIC) values (S1 Fig). A. Scatterplot representing axes 1 and 2 of the DAPC; B. Scatterplot representing axes 1 and 3 of the DAPC.
Fig 3. All strains from distinct networks or singletons differed by ≥ 4 TR loci. Dots represent haplotypes. Dot diameter and colour represent the number of strains per haplotype, country and AFLP clusters, respectively (red: A cluster; blue: B cluster; green: CD cluster; grey: AFLP data not available). Lines linking dots represent the amount of polymorphism among haplotypes (thick line = single-locus variation; thin line = double-locus variation; dashed line = triple-locus variation). Coloured background ellipses and numbers represent DAPC clusters, as shown in Fig 2.
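The notion of single-, double- and triple-locus variation used to link haplotypes in the minimum-spanning tree can be illustrated with a short Python sketch. It only counts the loci at which two haplotypes differ; it does not implement goeBURST or the tie-break rules applied by PHYLOVIZ, and the function names and 16-locus haplotypes below are hypothetical.

```python
def loci_differing(hap_a, hap_b) -> int:
    """Count loci at which two MLVA haplotypes carry different repeat numbers."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must be scored at the same loci")
    return sum(1 for a, b in zip(hap_a, hap_b) if a != b)

# Link categories drawn in the tree; >= 4 differing loci means separate networks
LINK_LABELS = {
    1: "single-locus variation",
    2: "double-locus variation",
    3: "triple-locus variation",
}

def link_label(hap_a, hap_b) -> str:
    """Name the relationship between two haplotypes from their locus differences."""
    d = loci_differing(hap_a, hap_b)
    if d == 0:
        return "identical haplotype"
    return LINK_LABELS.get(d, "no link (>= 4 loci differ)")

# Hypothetical 16-locus haplotypes (repeat numbers), not data from this study
ref = (5, 3, 7, 2, 4, 6, 3, 3, 8, 2, 5, 4, 6, 3, 2, 7)
slv = (5, 3, 7, 2, 4, 6, 3, 3, 8, 2, 5, 4, 6, 3, 2, 8)  # last locus differs
far = (6, 4, 8, 3, 4, 6, 3, 3, 8, 2, 5, 4, 6, 3, 2, 7)  # first four loci differ

close_label = link_label(ref, slv)
distant_label = link_label(ref, far)
```

Counting differing loci, rather than summing repeat-number differences, treats any change at a locus as one event regardless of how many repeat units were gained or lost.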
Discussion
MBC is a serious disease threatening both mango and cashew, two major agricultural industries in tropical and subtropical regions. Herein, we report a new MLVA-based resolutive and robust genotyping method targeting 16 TR markers, which can be applied to the global surveillance of Xcm. The newly developed MLVA-16 genotyping scheme was only slightly less discriminative than AFLP. Data derived from the two datasets were highly correlated, suggesting a fairly good resolution and phylogenetic signal. Moreover, it allowed us to clearly distinguish Xcm from Xca, another pathovar pathogenic to Anacardiaceae in the same bacterial species. MLVA-based genotyping has many advantages over AFLP. It is cost-effective, easy to use, portable and provides a satisfactory degree of resolution. MLVA-based genotyping was proven to be useful for the surveillance of xanthomonads [24]. It represents a cheap frontline analysis in situations where massive whole genome sequencing (WGS) cannot be implemented, a common occurrence in countries where mango and cashew industries are economically important. MLVA can also be applied to the selection of strains for WGS. WGS provides a more robust view of the genetic structure of bacterial pathogens. It also allows access to the accessory genome content, including mobile genetic elements. These are major drivers of bacterial adaptation to selection pressures [41–45] and, therefore, are useful for surveillance.
The data produced herein were made publicly available in the MLVABank database (https://webphim.ird.fr/MLVA_Bank/Genotyping/index.php). This should facilitate comparative analyses of outbreak strains, which can then be considered in the context of the global diversity reported for Xcm [30,46]. Collectively, MLVA-16 and the previously reported MLVA-12 targeting microsatellites [21] have the ability to produce complementary datasets to match the evolutionary scale and to shed light on the epidemiological question under investigation [6,47,48].
Strains that originated from Anacardiaceae species other than mango clustered in two genetic groups. Based on the DAPC analysis, all strains isolated from Brazilian pepper grouped in cluster 4, which is totally in agreement with earlier AFLP data [22]. Herein, we report the first genetic characterization of strains originating from a related Anacardiaceae species, the large-leaved rhus (Searsia longipes), which were also assigned to cluster 4. All strains isolated from cashew grouped in cluster 1. The low diversity of these West African strains, as previously highlighted by microsatellite data [14], suggests that all strains isolated from this host species or mango in this region would cluster in this group. Globally, we revealed that strains grouping in all five delineated clusters were able to cause outbreaks on mango. However, we do not yet know whether they could cause outbreaks on alternative host species. Indeed, strains from mango and Brazilian pepper exhibit some host specificity. The significance of the latter species to act as an efficient inoculum source for mango has not been established [8]. A more thorough understanding of plant-Xcm interactions is required to address this question.
MBC causes major disease outbreaks on mango and cashew, with severe consequences for crop yield and quality in many countries, which directly impacts the industries. Yet, it is clearly a neglected disease, as shown by the limited number of published papers on the subject and the lack of public microbial and genomic resources. No available bacterial strains or genomes could be identified that closely matched the strain cluster that had emerged on mango and cashew in West Africa [10]. Therefore, proposing a genetically-sound hypothesis to putatively identify the source of this emergence was precluded. Nevertheless, our study allowed us to reject an early hypothesis suggesting that Xcm strains that are now established in West Africa may have originated from South Africa as a result of the long-distance movement of mango propagative material (budwood). Indeed, a deeper analysis of Xcm global molecular epidemiology involving new resources is needed to improve our understanding of the pathogen’s large-scale expansion. Despite the first report of the disease in South Africa at the beginning of the 20th century, Xcm may have originated from Asia, the area of origin of mango [49]. This is suggested by (i) herbarium samples collected from India in the second half of the 19th century, which show typical leaf lesions [11], and (ii) recent phylogenomic analyses [50]. The case study of West Africa, where MBC was very likely absent until the early 2000s and now represents a major constraint for the mango and cashew industries, emphasizes the importance of (i) efficient pre-entry control measures, and (ii) timely genotyping-based surveillance of crop pathogens. We advocate the importance of mobilizing research efforts to improve our understanding of this major pathosystem, which causes severe damage to two cash crops of agricultural importance.
Supporting information
S1 Fig. Bayesian information criterion derived from the DAPC k-means analysis performed on Xanthomonas citri pv. mangiferaeindicae tandem repeat dataset.
https://doi.org/10.1371/journal.pone.0336768.s001
(PDF)
S1 Table. Bacterial strains used in this study.
https://doi.org/10.1371/journal.pone.0336768.s002
(XLSX)
S2 Table. Information on the tandem repeat loci selected for subtyping Xanthomonas citri pv. mangiferaeindicae.
https://doi.org/10.1371/journal.pone.0336768.s003
(XLSX)
S3 Table. Repeat and spacer sequences of CRISPR-like elements detected by CRISPRCasFinder in high-quality Xanthomonas citri pv. mangiferaeindicae genomes.
https://doi.org/10.1371/journal.pone.0336768.s004
(XLSX)
Acknowledgments
We would like to thank A. Dereeper and R. Koebnik for their kind support (MLVAbank database). The authors greatly acknowledge the Plant Protection Platform (3P, IBiSA).
References
- 1. Savary S, Willocquet L, Pethybridge SJ, Esker P, McRoberts N, Nelson A. The global burden of pathogens and pests on major food crops. Nat Ecol Evol. 2019;3(3):430–9. pmid:30718852
- 2. Gardy JL, Loman NJ. Towards a genomics-informed, real-time, global pathogen surveillance system. Nat Rev Genet. 2018;19(1):9–20. pmid:29129921
- 3. Stam R, Gladieux P, Vinatzer BA, Goss EM, Potnis N, Candresse T, et al. Population Genomic- and Phylogenomic-Enabled Advances to Increase Insight Into Pathogen Biology and Epidemiology. Phytopathology. 2021;111(1):8–11. pmid:33513042
- 4. Lindstedt B-A. Multiple-locus variable number tandem repeats analysis for genetic fingerprinting of pathogenic bacteria. Electrophoresis. 2005;26(13):2567–82. pmid:15937984
- 5. van Belkum A. Tracing isolates of bacterial species by multilocus variable number of tandem repeat analysis (MLVA). FEMS Immunol Med Microbiol. 2007;49(1):22–7. pmid:17266711
- 6. Keim P, Van Ert MN, Pearson T, Vogler AJ, Huynh LY, Wagner DM. Anthrax molecular epidemiology and forensics: using the appropriate marker for different evolutionary scales. Infect Genet Evol. 2004;4(3):205–13. pmid:15450200
- 7. Supply P, Allix C, Lesjean S, Cardoso-Oelemann M, Rüsch-Gerdes S, Willery E, et al. Proposal for standardization of optimized mycobacterial interspersed repetitive unit-variable-number tandem repeat typing of Mycobacterium tuberculosis. J Clin Microbiol. 2006;44(12):4498–510. pmid:17005759
- 8. Gagnevin L, Pruvost O. Epidemiology and Control of Mango Bacterial Black Spot. Plant Dis. 2001;85(9):928–35. pmid:30823104
- 9. Ploetz R, Freeman S. Foliar, floral and soilborne diseases. In: Mango 2nd Ed Bot Prod Uses. 2009. p. 231–302. https://doi.org/10.1079/9781845934897.0231
- 10. Sossah FL, Aidoo OF, Dofuor AK, Osabutey AF, Obeng J, Abormeti FK, et al. A critical review on bacterial black spot of mango caused by Xanthomonas citri pv. mangiferaeindicae: Current status and direction for future research. Forest Pathology. 2024;54(3):e12860.
- 11. Patel MK, Kulkarni YS, Moniz L. Pseudomonas mangiferae-indicae, pathogenic on Mango. Indian Phytopathol. 1948;1:147–52.
- 12. Doidge EM. A bacterial disease of the mango Bacillus mangiferae n. sp. Ann Appl Biol. 1915;2:1–45.
- 13. Sanahuja G, Ploetz RC, Lopez P, Konkol JL, Palmateer AJ, Pruvost O. Bacterial Canker of Mango, Mangifera indica, Caused by Xanthomonas citri pv. mangiferaeindicae, Confirmed for the First Time in the Americas. Plant Disease. 2016;100(12):2520.
- 14. Zombre C, Sankara P, Ouédraogo SL, Wonni I, Boyer K, Boyer C, et al. Natural Infection of Cashew (Anacardium occidentale) by Xanthomonas citri pv. mangiferaeindicae in Burkina Faso. Plant Dis. 2016;100(4):718–23. pmid:30688624
- 15. Dye DW, Bradbury JF, Goto M, Hayward AC, Lelliott RA, Schroth MN. International standards for naming pathovars of phytopathogenic bacteria and a list of pathovar names and pathotype strains. Rev Plant Pathol. 1980:153–68.
- 16. Ah-You N, Gagnevin L, Grimont PAD, Brisse S, Nesme X, Chiroleu F, et al. Polyphasic characterization of xanthomonads pathogenic to members of the Anacardiaceae and their relatedness to species of Xanthomonas. Int J Syst Evol Microbiol. 2009;59(Pt 2):306–18. pmid:19196770
- 17. Constantin EC, Cleenwerck I, Maes M, Baeyen S, Van Malderghem C, De Vos P, et al. Genetic characterization of strains named as Xanthomonas axonopodis pv. dieffenbachiae leads to a taxonomic revision of the X. axonopodis species complex. Plant Pathology. 2015;65(5):792–806.
- 18. Gagnevin L, Leach JE, Pruvost O. Genomic Variability of the Xanthomonas Pathovar mangiferaeindicae, Agent of Mango Bacterial Black Spot. Appl Environ Microbiol. 1997;63(1):246–53. pmid:16535490
- 19. Patel M, Moniz S, Kulkarni Y. A new bacterial disease of Mangifera indica L. Curr Sci. 1948;6:189–90.
- 20. Programme Régional de Protection des Végétaux-PRPV. Base de données des organismes associés aux végétaux dans l’océan Indien. CIRAD Dataverse; 2023.
- 21. Pruvost O, Vernière C, Vital K, Guérin F, Jouen E, Chiroleu F, et al. Insertion sequence- and tandem repeat-based genotyping techniques for Xanthomonas citri pv. mangiferaeindicae. Phytopathology. 2011;101(7):887–93. pmid:21323466
- 22. Ah-You N, Gagnevin L, Chiroleu F, Jouen E, Neto JR, Pruvost O. Pathological Variations Within Xanthomonas campestris pv. mangiferaeindicae Support Its Separation Into Three Distinct Pathovars that Can Be Distinguished by Amplified Fragment Length Polymorphism. Phytopathology. 2007;97(12):1568–77. pmid:18943717
- 23. van Belkum A, Tassios PT, Dijkshoorn L, Haeggman S, Cookson B, Fry NK, et al. Guidelines for the validation and application of typing methods for use in bacterial epidemiology. Clin Microbiol Infect. 2007;13(Suppl 3):1–46. pmid:17716294
- 24. Catara V, Cubero J, Pothier JF, Bosis E, Bragard C, Đermić E, et al. Trends in Molecular Diagnosis and Diversity Studies for Phytosanitary Regulated Xanthomonas. Microorganisms. 2021;9(4):862. pmid:33923763
- 25. Bie F, Li Y, Liu Z, Qin M, Li S, Dan X, et al. High-Quality Genome Resource of Mango Bacterial Black Spot Pathogen Xanthomonas citri pv. mangiferaeindicae GXG07 Isolated from Guangxi, China. Plant Dis. 2022;106(3):1027–30. pmid:34633234
- 26. Boyer C, Lefeuvre P, Zombre C, Rieux A, Wonni I, Gagnevin L, et al. New, Complete Circularized Genomes of Xanthomonas citri pv. mangiferaeindicae Produced from Short- and Long-Read Co-Assembly Shed Light on Strains that Emerged a Decade Ago on Mango and Cashew in Burkina Faso. Phytopathology. 2025;115(1):14–9. pmid:39387826
- 27. Gordon JL, Lefeuvre P, Escalon A, Barbe V, Cruveiller S, Gagnevin L, et al. Comparative genomics of 43 strains of Xanthomonas citri pv. citri reveals the evolutionary events giving rise to pathotypes with different host ranges. BMC Genomics. 2015;16:1098. pmid:26699528
- 28. Pruvost O, Magne M, Boyer K, Leduc A, Tourterel C, Drevet C, et al. A MLVA genotyping scheme for global surveillance of the citrus pathogen Xanthomonas citri pv. citri suggests a worldwide geographical expansion of a single genetic lineage. PLoS One. 2014;9(6):e98129. pmid:24897119
- 29. Rache L, Blondin L, Diaz Tatis P, Flores C, Camargo A, Kante M, et al. A minisatellite-based MLVA for deciphering the global epidemiology of the bacterial cassava pathogen Xanthomonas phaseoli pv. manihotis. PLoS One. 2023;18(5):e0285491. pmid:37167330
- 30. Costa J, Pothier JF, Bosis E, Boch J, Kölliker R, Koebnik R. A Community-Curated DokuWiki Resource on Diagnostics, Diversity, Pathogenicity, and Genetic Control of Xanthomonads. Mol Plant Microbe Interact. 2024;37(3):347–53. pmid:38114082
- 31. Supply P, Lesjean S, Savine E, Kremer K, van Soolingen D, Locht C. Automated high-throughput genotyping for study of global epidemiology of Mycobacterium tuberculosis based on mycobacterial interspersed repetitive units. J Clin Microbiol. 2001;39(10):3563–71. pmid:11574573
- 32. Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999;27(2):573–80. pmid:9862982
- 33. Denoeud F, Vergnaud G. Identification of polymorphic tandem repeats by direct comparison of genome sequence from different bacterial strains: a web-based resource. BMC Bioinformatics. 2004;5:4. pmid:14715089
- 34. Couvin D, Bernheim A, Toffano-Nioche C, Touchon M, Michalik J, Néron B, et al. CRISPRCasFinder, an update of CRISRFinder, includes a portable version, enhanced performance and integrates search for Cas proteins. Nucleic Acids Res. 2018;46(W1):W246–51. pmid:29790974
- 35. Francisco AP, Vaz C, Monteiro PT, Melo-Cristino J, Ramirez M, Carriço JA. PHYLOViZ: phylogenetic inference and data visualization for sequence based typing methods. BMC Bioinformatics. 2012;13:87. pmid:22568821
- 36. Nascimento M, Sousa A, Ramirez M, Francisco AP, Carriço JA, Vaz C. PHYLOViZ 2.0: providing scalable data integration and visualization for multiple phylogenetic inference methods. Bioinformatics. 2017;33(1):128–9. pmid:27605102
- 37. Jombart T, Devillard S, Balloux F. Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genet. 2010;11:94. pmid:20950446
- 38. Hunter PR, Gaston MA. Numerical index of the discriminatory ability of typing systems: an application of Simpson’s index of diversity. J Clin Microbiol. 1988;26(11):2465–6. pmid:3069867
- 39. Campbell V, Legendre P, Lapointe F-J. Assessing Congruence Among Ultrametric Distance Matrices. J Classif. 2009;26(1):103–17.
- 40. Kendall MG, Smith BB. The problem of m rankings. Ann Math Stat. 1939;10:275–87.
- 41. Hemara LM, Hoyte SM, Arshed S, Schipper MM, Wood PN, Marshall SL, et al. Genomic Biosurveillance of the Kiwifruit Pathogen Pseudomonas syringae pv. actinidiae Biovar 3 Reveals Adaptation to Selective Pressures in New Zealand Orchards. Mol Plant Pathol. 2025;26(2):e70056. pmid:39915983
- 42. Jibrin MO, Sharma A, Mavian CN, Timilsina S, Kaur A, Iruegas-Bocardo F, et al. Phylodynamic Insights into Global Emergence and Diversification of the Tomato Pathogen Xanthomonas hortorum pv. gardneri. Mol Plant Microbe Interact. 2024;37(10):712–20. pmid:38949619
- 43. Richard D, Pruvost O, Balloux F, Boyer C, Rieux A, Lefeuvre P. Time-calibrated genomic evolution of a monomorphic bacterium during its establishment as an endemic crop pathogen. Mol Ecol. 2021;30(8):1823–35. pmid:33305421
- 44. Timilsina S, Kaur A, Sharma A, Ramamoorthy S, Vallad GE, Wang N, et al. Xanthomonas as a Model System for Studying Pathogen Emergence and Evolution. Phytopathology. 2024;114(7):1433–46. pmid:38648116
- 45. Vanhove M, Sicard A, Ezennia J, Leviten N, Almeida RPP. Population structure and adaptation of a bacterial pathogen in California grapevines. Environ Microbiol. 2020;22(7):2625–38. pmid:32114707
- 46. Grissa I, Bouchon P, Pourcel C, Vergnaud G. On-line resources for bacterial micro-evolution studies using MLVA or CRISPR typing. Biochimie. 2008;90(4):660–8. pmid:17822824
- 47. Kremer K, Arnold C, Cataldi A, Gutiérrez MC, Haas WH, Panaiotov S, et al. Discriminatory power and reproducibility of novel DNA typing methods for Mycobacterium tuberculosis complex strains. J Clin Microbiol. 2005;43(11):5628–38. pmid:16272496
- 48. Schlötterer C. The evolution of molecular markers--just a matter of fashion? Nat Rev Genet. 2004;5(1):63–9. pmid:14666112
- 49. Bompard JM. Taxonomy and systematics. In: Mango 2nd Ed Bot Prod Uses. 2009. p. 19–41. https://doi.org/10.1079/9781845934897.0019
- 50. Bansal K, Midha S, Kumar S, Patil PB. Ecological and Evolutionary Insights into Xanthomonas citri Pathovar Diversity. Appl Environ Microbiol. 2017;83(9):e02993-16. pmid:28258140