Genome-driven evaluation and redesign of PCR tools for improving the detection of virulence-associated genes in aeromonads

Many virulence factors have been described for opportunistic pathogens within the genus Aeromonas. Polymerase Chain Reactions (PCRs) are commonly used in population studies of aeromonads to detect virulence-associated genes in order to better understand the epidemiology and emergence of Aeromonas from the environment to host, but their performances have never been thoroughly evaluated. We aimed to determine diagnostic sensitivity and specificity of PCR assays for the detection of virulence-associated genes in a collection of Aeromonas isolates representative for the genetic diversity in the genus. Thirty-nine Aeromonas strains belonging to 27 recognized species were screened by published PCR assays for virulence-associated genes (act, aerA, aexT, alt, ascFG, ascV, ast, lafA, lip, ser, stx1, stx2A). In parallel, homologues of the 12 putative virulence genes were searched from the genomes of the 39 strains. Of the 12 published PCR assays for virulence factors, the comparison of PCR results and genome analysis estimated diagnostic sensitivities ranging from 34% to 100% and diagnostic specificities ranged from 71% to 100% depending upon the gene. To improve the detection of virulence-associated genes in aeromonads, we have designed new primer pairs for aerA/act, ser, lafA, ascFG and ascV, which showed excellent diagnostic sensitivity and specificity. Altogether, the analysis of high quality genomic data, which are more and more easy to obtain, provides significant improvements in the genetic detection of virulence factors in bacterial strains.


Introduction
Aeromonas are common inhabitants of aquatic environments and can be involved in fish and human diseases. They are frequently found in drinking water and food, including meat, fish, vegetables, and, more recently in ready-to-eat foods [1][2][3][4][5][6]. Among other illnesses, aeromonads are responsible for gastrointestinal syndromes following ingestion of contaminated food, and wound infection following water exposure in human [1][2][3][4][5][6], and for furunculosis and septicemia in fish causing major losses in the aquaculture sector [7]. The pathogenesis of Aeromonas infection is only partially elucidated [1,5,6,8] although a wide repertoire of virulence factors contributing to biofilm formation, cell adherence, invasion and cytotoxicity have been described [8]. It has been suggested that only certain subsets of strains within a species, referred as "pathotypes", produce disease in certain individuals [1,9,10]; presumably due to differences in the content of virulence-associated genes. From this assumption, many studies leading to an abundant literature searched for virulent strains from water, fish, food or clinical samples using Polymerase Chain Reaction (PCR) for detecting virulence factors [11][12][13][14], notably for epidemiologic surveys (e. g., [15][16][17]). The accurate inference of virulence in such an approach depends on how accurate the PCR-based methods are for detecting genetic markers of virulence in aeromonads. But these methods have often been developed from a small number of strains, few species and for a specific purpose, and their performances have never been evaluated for the whole genus. Before raising conclusions on pathotype patterns in aeromonads, PCR performances at the genus level deserve to be questioned considering the high level of genetic polymorphism in the genus Aeromonas [18,19].
The recent availability of whole genome sequences (WGS) data from many Aeromonas isolates from various species [20,21] allows the search for virulence-associated genes by sequence comparison and the evaluation of performance of molecular methods. In the present study, we aimed to evaluate the accuracy of several widely used virulence PCRs by matching PCR results against available genomes. When necessary, accuracy was improved by the genome-driven design of new PCR tools.

Bacterial strains and DNA extraction
Thirty-nine Aeromonas spp. strains were chosen to cover the whole genus. They belonged to 27 species (Table 1) among the 30 validated species in the Aeromonas genus at the time of January 2018. The current taxonomic affiliations are indicated for every strain with previously published names of strains indicated inside braces, where appropriate. Strains were grown on Trypticase Soy Agar at 35˚C for 16-24 h, and genomic DNA was extracted using the Master-Pure™ DNA Purification Kit (Epicentre, USA).

Genome sequences
Thirty-seven draft genomes of Aeromonas spp., sequenced during previous works [20][21][22], were provided at the Microbial Analysis, Resources and Services (MARS) facility at the University of Connecticut (Storrs, USA) ( Table 1) with a high quality level of draft genome on the basis of the quality of the sequencing and assembling (e.g., number of average genomes coverage and number of scaffolds) and of the verification of the automated annotation (16 housekeeping genes and 47 ribosomal protein-coding genes checked in every genome). Two other genomes were downloaded from GenBank (Table 1). To check the taxonomical assignation of genomes employed, we performed a ML phylogenetic tree of all the genomes (n = 39) calculated by the program kSNP3.1 [25], a kmer based alignment free method (see S1 Fig).

Genome analysis
Genomic contigs and circular genomes were annotated by using the RAST annotation server to identify RNAs and protein-coding genes [26]. The genomes were queried for genes encoding for a set of virulence factors described in Aeromonas spp. by using reference protein sequences (Table 2), either using translated sequences of the validated subset of UniProtKB-S-wissProt or annotated genes in UniprotKB-TrEMBL database. Sequence comparisons with reference protein sequences were performed with SEEDviewer that uses bidirectional proteinprotein BLAST (BlastP) sequence comparison of translated open reading frames. Proteins with amino acid sequence similarities !65% and E-value 10 −10 were considered to be homologous [27].

Polymerase Chain Reaction (PCR)
Both selected and new primers were used to amplify DNA from aerolysin/enterotoxin cytotoxic (aerA/act), heat-stable cytotonic enterotoxin (ast), heat-labile cytotonic enterotoxin (alt), lipase (lip), serine protease (ser), ADP-ribosylating toxins (aexT and aexU), shiga toxins 1 (stx1) and 2 (stx2a), T3SS needle proteins (ascFG), T3SS inner membrane channel protein (ascV) and lateral flagellin A (lafA) genes. PCR assays were carried out in a 25 μL reaction mixture containing 0.2 μM of each primer (Sigma-Aldrich, Saint-Louis, US), 0.2 mM of each deoxynucleoside triphosphate (dNTP) (Euromedex, Souffelweyersheim, France), 2.5 mM MgCl2 (Promega, Madison, US), and 1.25 U of GoTaq DNA polymerase (Promega, Madison, WI) in the appropriate reaction buffer containing 1.5 mM MgCl2 and 25 ng of genomic DNA as the template. The amplification conditions performed on a GeneAmp PCR System 9700 (Applied Biosystems, Foster city, US) were as follows: initial denaturation for 3 min at 95˚C, followed by (1) main amplification program of the reference source specified in the Table 3 for already published PCR, with length of extension adapted to the efficacy of the DNA polymerase used (1 min/kb) when necessary, and a final extension step at 72˚C for 10 min or (2) 32 amplification cycles as indicated in Table 4 for newly developed PCR. The PCR products and the 50 bp DNA ladder (New England BioLabs, Ipswich, US) were separated in 1.5% agarose gels in 0.5X TBE buffer-ethidium bromide 500 μg/mL and revealed under UV. Positive and negative controls were added in each PCR to assess the validity and specificity of amplification reaction. The positive controls of amplification corresponded to genomic DNA of one of the strains included in the study and for which the presence of gene of interest was checked on the basis of genomic analysis as follows: A. veronii 77C (aer/act, aexT and ser), A. dhakensis CIP 107500 T (alt, ascFG, ascV and lafA), A. hydrophila BVH 25a (ast). Specificity of the assays was confirmed using PCR-grade water and DNA from A. fluvialis LMG 24681 T which harbors in its genome none of the genes of interest. All strains were tested for all primer sets. For all the : Accession numbers correspond to protein sequences in Swiss-prot (S) or TrEMBL databases (E). T : Type strain † Previously published names are indicated inside braces. ‡ ascF and ascG are flanking genes.
https://doi.org/10.1371/journal.pone.0201428.t002 PCR assays and for each strain, the results of absence/presence of amplification was determined by two different readers and in two independent experiments on the basis of the expected size of amplified products and of the positive control.

Primer design strategy
Nucleic sequences were aligned using the Clustal ω2 program within Seaview 4 package [28] (see S1 Files). Primers displaying a length of 18-21 nucleotides were designed from the conserved regions, and AmplifX 1.7.0 (CNRS, Aix-Marseille Université, France, http://crn2m. univ-mrs.fr/pub/amplifx-dist) was used to assess their intrinsic properties such as complexity (polyX and triplet repetitions), 3' stability and self-dimer formation. The specificity of the new primers was checked with NCBI nucleotide-nucleotide BLAST.

Challenging PCR accuracy characteristics
The accuracy of the PCR assays were evaluated by comparing the results of the PCR with those obtained by genome analysis, with WGS results considered as gold standard. Diagnostic sensitivity was defined as the percentage of 'positive PCR results among all the strains for which the gene was detected in WGS', while specificity was defined as the percentage of 'negative PCR results among all the strains for which the gene was not detected in WGS'. Diagnostic sensitivity and specificity values below 90% were considered as insufficient. Exact confidence interval of a frequency, i.e., binomial confidence interval, was determined for each percentage, as described elsewhere [29] by using the web server http://statpages.info/confint.html.

Genome content in virulence-associated genes
We detected homologues for all the studied genes except for stx1 and stx2 (  Advances in detection of Aeromonas virulence genes of common flanking genes (data not shown), led to consider aerA/act and alt/pla as two single virulence-associated genes for all the strains included in the study. The ser gene that is putatively involved in the aerolysin activation by proteolytic cleavage was detected in 14 out of 17 aerA/act positive strains ( Table 2). The genes coding for structural components of type III secretion system (T3SS) were detected simultaneously in the 11 T3SS gene positive strains ( Table 2). The 248 residues from the N-terminal domain of the two ADP-ribosyltransferase toxins, AexT from A. salmonicida subsp. salmonicida ATCC 33658 T (SwissProt Q93Q17; 475 amino-acids) and AexU from A. veronii bv. sobria AeG1 (UniprotKB-TrEMBL D5LUP3, 512 amino-acids) presented a high level of identity (89.5%) and similarity (94.8%) in protein sequence, but they differed significantly in their C-terminal domain (< 20% similarity). All the strains that harbor the T3SS effector genes aexT/aexU, except A. salmonicida subsp. salmonicida CIP 103209 T , also possessed the T3SS structural components. For the lateral flagellin gene, from 1 to 4 copies were detected in the genomes of the 21-lafA positive strains.

PCR performance assessment
Results of the evaluation of PCR accuracy are presented in Table 3. Ability to amplify gene of interest was assessed for all genes but stx1a or stx2a because there were no strains for which the genes were detected in WGS ( Table 2). The ability of the PCRs to amplify the gene of interest (i.e., diagnostic sensitivity, hereafter sensitivity) ranged from 34 to 100%. The PCR screening aerA/act that uses primers "aer", and the PCRs screening the ast and aexT genes displayed high sensitivities with the tested set of strains, with values of 91%, 100% and 100%, respectively. However, the corresponding confidence intervals for ast and aexT PCR were large (range >40%) because the number of positive genomes was low, 6 and 2 out of the 39 genomes studied, respectively. On the opposite, with a sensitivity value of 64%, the PCR usually employed for screening aerA/act genes using primers "AHC" was disappointing and was associated with frequent false negative results. Similarly, low values of sensitivity were observed for the following genes: ser (59%), alt/pla (primers "alt", 34%; primers "lip", 68%), ascFG (45%), ascV (55%) and lafA (55%). Depending on the gene, the lack of sensitivity resulted from either an excessive variability in annealing of forward and/or reverse primers, or even errors in primer sequence transcript (e.g., Serine-r and Laf1; Table 3). The specificities of the PCRs that could be evaluated ranged from 71% to 100% depending on the gene considered (Table 3). Specificity was not determined for alt because only one out of the 39 WGS lacked alt gene (A. fluvialis LMG 24681 T ). PCRs used for screening the following genes were highly specific: aerA/act-using primers "AHC" (100%), ser (92%), ast (91%), aexT (100%) and lafA (100%). The test method was insufficiently specific for aerA/act genes using primers "aer" (82%), ascF-G (74%) and ascV (71%) genes. Therefore, an overestimation of the prevalence of these virulence-associated factors by generation of false positive results was expected. Depending on the gene, the default of specificity likely resulted from PCR primers properties such as high GC content, self-end dimer formation and/or 3'-end unstability (Table 3).

Genome driven design and evaluation of new PCR primers
Considering the low performance of some PCRs, we designed new primer pairs on the basis of multiple alignments (Table 4) with partial and full-length sequences retrieved from Genbank or EMBL databases (S2 Table), and from the WGS provided in this study (Table 1). To design the new primers, we selected low variable areas of sequence alignments and used degenerate bases in case of polymorphism. Newly designed primers were associated with an increased sensitivity of the virulence gene detection (Table 3). Meanwhile, the use of degenerate primers did not impair the specificity (Table 3). For screening of ADP-ribosylating toxins, we have designed primers matching in the homologous part shared by aexT and aexU genes that were associated with a sensitivity of 100%.

Discussion
Many virulence factors are described in the genus Aeromonas, but to date, the pathogenicity of specific strains still cannot be predicted from the genome content. The genus Aeromonas is characterized by intraspecific and interspecific genetic diversities of both ribosomal and housekeeping genes that impair the definition of species and lead to the population structure of the genus in several complexes of species [19,36]. However, using the whole genome sequences and the average nucleotide identity index to compare them, all the species could be clearly delineated [20,21]. The huge variability observed in some housekeeping genes also occurs in virulence-associated genes, as already shown for the "S-layer protein" gene (vapA) that harbored variability up to 15% at the amino-acid level among the strains of different subspecies of A. salmonicida [18].
This study focuses on the performances of molecular techniques used for the detection of virulence-associated genes in Aeromonas strains. Some previous work aiming to detect virulence factors in the genus Aeromonas reported discrepancies between studies due to molecular methods used (e.g., [37]). The choice of strains used for the primer design was particularly questioned. The present study does not question the validity of previously published PCRs, especially when the original study aimed to screen the presence of virulence-associated genes in particular groups or species (e.g., in A. caviae, [38]), but aims to evaluate PCR performances at the genus level. The comparison of PCR results and high-quality WGS with automatic annotation and careful manual checking allows for the first time the proper evaluation of the PCR performances for detecting virulence-associated genes in aeromonads. Most new primers and PCR conditions proposed herein improve PCR performances although the design of degenerate primers was required for some virulence genes, i.e. when sequence polymorphisms are scattered all along the gene. Neither diagnostic sensitivity nor specificity was hampered in these cases.
Similar to most other population studies, our study suffers from some limitations. Studying whether strains are virulent may be challenging for opportunistic pathogens but at first, tools need to be accurate in establishing whether a gene is present or absent from genomes. Our study may have some potential bias in the strain sampling. But the collection of strains included in the study aimed to be as representative of the diversity found in the genus as possible, covering the interspecies diversity in the whole genus, given the studied genes are sought in a wide range of species [39], and the intra-species diversity of species that are the most frequently isolated in human clinical contexts [1,40]. Illustrating the unavoidable sampling bias, the shiga-toxin coding genes stx were not detected in this study by PCR nor by genome analysis. However, the stx1 gene seems to be quite rare in the genus [41]. Another explanation would be stx instability due to the mobility of the stx phage that can lead to the gene loss after subculture [41] or may occur during storage before genome sequencing and PCR. The prevalence of some virulence markers was either too low (e.g., stx) or too high (e.g., lip, alt) so that PCR assay diagnostic sensitivity or specificity could not be evaluated, respectively. Despite the limitation in strain and sequence collections, this study provides a meaningful highlight in accuracy of widely used tools together with virulence gene content in a large collection of strains.
Besides performance assessment and method improvements, this study provides some interesting data on virulence in aeromonads, like the possible matching between several virulence markers identified in different Aeromonas species that could be established or confirmed. For instance, the homology between aerolysin gene aerA from A. hydrophila and cytotoxic enterotoxin gene act from A. dhakensis has previously been considered [42]. In fact, both toxins exhibit hemolytic, cytotoxic and enterotoxic activities and both cause lethality in mice [43][44][45][46][47]. Moreover, a similar mechanism of action, i.e., oligomerization followed by pore-formation, has been reported for these 2 toxins [46,48]. These two toxins are closely related but it is still possible that the aerolysin/cytotonic enterotoxin from A. dhakensis SSU possesses some structural and functional originality [46], and further characterization of the purified homologous proteins is required to clarify this point. The aerA/act PCR based on newly designed primers improved the diagnostic sensitivity for the panel of aeromonads covering the whole genus from 64-91% to 100%, and dramatically improved sensitivity for the species A. dhakensis (from 0-67% to 100%). Given the clinical importance of this species [49], an optimized performance for detecting the well-studied aerA virulence gene is critical, and should provide advances in the patho-epidemiological knowledge of the species. The putative correspondence between phospholipase A gene (pla) from A. piscicola AH-3 and heat-labile cytotonic enterotoxin gene (alt) from A. dhakensis SSU, was also observed by Balsalobre et al. [50]. In this case, the nomenclature of the virulence factor depends on the study [51,52]. The biological features are possibly divergent as mentioned by Merino et al. [52] and could depend on the allele of the gene alt/pla but this requires further studies.
Biological interpretation of the presence of virulence genes needs to be cautious because the presence of a virulence gene does not imply its expression in the host. In addition, we have to consider the pathogenic and non-pathogenic interactions depending on the context. For example, the aeromonads T3SS system is important for both pathogenicity and mutualism as demonstrated inside the leech microbiota [53]. In parallel, post-translational modifications for activation or effector translocation should be examined. The joint presence of the ser gene and aer/act should be considered because aerolysin is activated by a serine protease [54]. The ser gene was absent from the genomic data of three aerA/act positive strains included in this study: A. diversa CECT 4254 T , A. molluscorum CIP 108876 T and A. schubertii CECT 4240 T . Similarly, the presence of T3SS effectors AexT/U should be considered with respect to the presence of T3SS components, e.g., ascFG and ascV. Their presence is required for the delivery of AexT/U [55,56]. The presence of both genes aexT and aexU that we observed with the strains A. veronii 77C and BVH26b has already been mentioned for other A. veronii isolates and appears to be a frequent situation for strains carrying T3SS in the A. veronii group [37]. Therefore, for screening aexT and aexU genes in aeromonads, we propose an aexT/U PCR, followed by a molecular search of the aexT gene alone when the screening assay is positive.
In epidemiologic and pathogenesis studies of Aeromonas, large collections of strains were screened for their content in virulence-associated genes. Given the polymorphisms in virulence-associated genes highlighted in this study and because PCR assays to detect virulence associated genes were widely used without prior evaluation of their performance (diagnostic sensitivity and specificity) at the genus level (S1 Table), the published prevalence of virulence genes should be interpreted with caution. When the PCR performance is poor, the estimated prevalence is blurred with assay errors and does not reflect the true prevalence of the virulence factors. Importantly, assumptions on the pathogenic behavior of aeromonads have been made from virulence patterns obtained from defective/inaccurate tool results, and this may have led to wrong conclusions. This may contribute to the current complexity of virulence patterns in aeromonads, to the poor understanding of aeromonad virulence and to the frequent lack of links between the virulence factor pattern and the pathogenic behavior. Optimization of PCR conditions and primer is crucial to avoid biased data in applied and clinical microbiology, and in epidemiology.
The bacterial genomes, today widespread and easy to obtain, have a very high potential in numerous research areas, including the design of consolidated microbiological diagnostic tools, although they are maybe currently underused for this purpose. In our study, genomes allowed an accuracy evaluation of virulence-associated gene PCRs commonly used in the genus Aeromonas and their improvement by designing several improved oligonucleotide primers. WGS could obviously be used as primary data for searching virulence associated genes but WGS are still expensive when a large panel of strains is studied. Therefore, we propose here an accurate approach based on these genomic data to improve PCR methods aimed to detect virulence-associated genes in environmental bacteria.
Further advances in the knowledge of the genetic diversity and of the evolution of virulence-associated genes should improve the understanding of aeromonad adaptation to pathogenic behavior or patho-adaptation, and thereby the revision of pathotype definition.