MALDI-TOF Mass Spectrometry Is a Fast and Reliable Platform for Identification and Ecological Studies of Species from Family Rhizobiaceae

Family Rhizobiaceae includes fast-growing bacteria currently arranged into three genera, Rhizobium, Ensifer and Shinella, that contain pathogenic, symbiotic and saprophytic species. These species cannot be identified on the basis of physiological or biochemical traits, so identification must rely on sequencing of several genes. Alternative methods are therefore necessary for rapid and reliable identification of members of family Rhizobiaceae. In this work we evaluated the suitability of Matrix-Assisted Laser Desorption Ionization-Time-of-Flight Mass Spectrometry (MALDI-TOF MS) for this purpose. Firstly, we evaluated the capability of this methodology to differentiate among species of family Rhizobiaceae, including closely related ones, and then we extended the MALDI Biotyper 2.0 database with the type strains of 56 species from genera Rhizobium, Ensifer and Shinella. Secondly, we evaluated the identification potential of this methodology using several strains isolated from different sources and previously identified on the basis of their rrs, recA and atpD gene sequences. All of these strains were correctly identified, showing that MALDI-TOF MS is an excellent tool for identification of fast-growing rhizobia applicable to large populations of isolates in ecological and taxonomic studies.


Introduction
The family Rhizobiaceae currently contains fast-growing species of bacteria that may be saprophytic or able to establish beneficial or deleterious plant interactions. These species are currently arranged into three genera, Rhizobium, Ensifer and Shinella [1,2]. The former genera Agrobacterium and Allorhizobium are now included in genus Rhizobium [3] and Sinorhizobium is currently named Ensifer [4]. The identification of members of the family Rhizobiaceae is necessarily based on gene sequencing since there is no phenotypic information that allows the differentiation and identification of rhizobial species [3]. However, although gene sequencing is the most reliable method for identification of rhizobia, it is tedious and time-consuming when applied to large populations. Alternative methods are therefore necessary to identify these bacteria reliably in less time.
Matrix-assisted laser desorption ionization-time-of-flight mass spectrometry (MALDI-TOF MS) has been suggested as a fast and reliable method for bacterial identification, based on the characteristic protein profiles for each microorganism. Using this technology it has been estimated that up to 99% of strains tested are correctly identified when comparing with commercial phenotypic identification panels or rrs gene sequencing [5][6][7][8].
However, MALDI-TOF MS has been applied mainly to the identification of clinical isolates [9][10][11][12][13][14][15][16], so most of the species currently included in available databases are those of clinical interest. For example, in the case of family Rhizobiaceae only the type strains of three species, Rhizobium tropici, Rhizobium radiobacter and Rhizobium rubi, and eight pathogenic non-type strains of R. radiobacter, R. rhizogenes and Agrobacterium tumefaciens (currently R. radiobacter) are included in the Biotyper 2.0 database (Bruker Daltonics) used in this study.
Therefore the objectives of this work were: (i) the evaluation of MALDI-TOF MS technology for species differentiation within family Rhizobiaceae, (ii) the construction of a database that includes the type strains of currently accepted species within family Rhizobiaceae and (iii) the validation of the MALDI-TOF MS technology to identify rhizobial strains isolated from nodules and tumours previously identified by gene sequencing.

Bacterial strains and culture conditions
To build a reference database for MALDI-TOF MS-based rhizobial species identification, the type strains of 56 species belonging to the family Rhizobiaceae were used (Table 1). In addition 35 strains isolated from legume nodules or plant tumours
Table 1. Type strains of family Rhizobiaceae included in the extended database for MALDI-TOF MS-based species identification.

Sample preparation for MALDI-TOF MS
Cells of a whole colony were transferred from the plate to a 1.5 ml tube (Eppendorf, Germany) with a pipette tip and mixed thoroughly in 300 µl of water to resuspend the bacterial cells. Then, 900 µl of absolute ethanol was added, the mixture was centrifuged at 15,500 × g for 2 min, and the supernatant was discarded. The pellet was air-dried at room temperature for 1 hour. Subsequently, 50 µl of formic acid (70% v/v) was added to the pellet and mixed thoroughly before the addition of 50 µl of acetonitrile to the mixture. The mixture was centrifuged again at 15,500 × g for 2 min. One microliter of the supernatant was placed onto a spot of the steel target and air-dried at room temperature. Each sample was overlaid with 1 µl of matrix solution and air-dried.

MALDI-TOF MS
Measurements were performed on an Autoflex III MALDI-TOF/TOF mass spectrometer (Bruker Daltonics, Leipzig, Germany) equipped with a 200-Hz smartbeam laser. Spectra were recorded in the linear, positive mode at a laser frequency of 200 Hz within a mass range from 2,000 to 20,000 Da. The IS1 voltage was 20 kV, the IS2 voltage was maintained at 18.6 kV, the lens voltage was 6 kV, and the extraction delay time was 40 ns.
For each spectrum, 500 laser shots were collected and analyzed (10 × 50 laser shots from different positions of the target spot). The spectra were calibrated externally using the standard calibrant mixture (Escherichia coli extracts including the additional proteins RNase A and myoglobin, Bruker Daltonics

Spectrum generation and data analysis
For automated data analysis, raw spectra were processed using the MALDI Biotyper 2.0 software (Bruker Daltonics, Leipzig, Germany) at default settings. The software performs normalization, smoothing, baseline subtraction, and peak picking, creating a list of the most significant peaks of the spectrum (m/z values with a given intensity, with the threshold set to a minimum of 1% of the highest peak and a maximum of 100 peaks). To identify unknown bacteria, each peak list generated was matched directly against reference libraries (3,476 species) using the integrated pattern-matching algorithm of the Biotyper 2.0 software (Bruker Daltonics GmbH, Germany). The unknown spectra were compared with a library of reference spectra based on a pattern recognition algorithm using peak position, peak intensity distributions and peak frequencies. Once a spectrum had been generated and captured by the software, the whole identification process was performed automatically, without any user intervention. MALDI-TOF MS identifications were classified using modified score values proposed by the manufacturer: a score value ≥2 indicated species identification; a score value between 1.7 and 1.9 indicated genus identification, and a score value <1.7 indicated no identification.
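The peak-list rule and the score thresholds described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Biotyper implementation, whose processing and scoring are proprietary. The function names `pick_peaks` and `classify_score` are hypothetical, and treating every score from 1.7 up to (but not including) 2 as a genus-level result is an assumption, since the text only mentions the range 1.7 to 1.9.

```python
def pick_peaks(peaks, rel_threshold=0.01, max_peaks=100):
    """Filter a peak list given as (m/z, intensity) tuples.

    Keeps peaks whose intensity is at least rel_threshold (1%) of the
    tallest peak, then caps the list at the max_peaks most intense ones.
    """
    if not peaks:
        return []
    tallest = max(intensity for _, intensity in peaks)
    kept = [(mz, i) for mz, i in peaks if i >= rel_threshold * tallest]
    # Keep the most intense peaks, then restore ascending m/z order.
    kept.sort(key=lambda p: p[1], reverse=True)
    return sorted(kept[:max_peaks])


def classify_score(score):
    """Map a match score to an identification level.

    Assumption: scores in [1.7, 2.0) are all treated as genus level.
    """
    if score >= 2.0:
        return "species"
    if score >= 1.7:
        return "genus"
    return "none"
```

For instance, `classify_score(2.219)` yields a species-level call, whereas `classify_score(1.245)` yields no identification.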
For reference library construction, 36 independent spectra were recorded for each bacterial isolate (three independent measurements at twelve different spots each). Manual/visual estimation of the mass spectra was performed using Flex Analysis 3.0 (Bruker Daltonics GmbH, Germany), applying smoothing and baseline subtraction. The spectra were checked for flatlines, outliers or single spectra with remarkable peaks differing from the other spectra, taking into account that mass deviation within the spectra set should not exceed 500 ppm. Finally, 20 spectra were selected, removing questionable spectra from the collection. To create peak lists of the spectra, the BioTyper software was used as described above. The 20 independent peak lists of a strain were used for automated "main spectrum" generation with default settings of the BioTyper software. Thereby, for each library entry a reference peak list (main spectrum) was created, containing information about average masses, average intensities, and relative abundances in the 20 measurements for all characteristic peaks of a given strain, so that a main spectrum displayed the most reproducible peaks typical for a certain bacterial strain.
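The idea of a main spectrum (average mass, average intensity and frequency of each reproducible peak across replicate peak lists) can be illustrated with the following Python sketch. It is not the BioTyper algorithm. The function name `build_main_spectrum`, the greedy grouping of pooled peaks, the use of the 500 ppm figure as the grouping tolerance and the `min_frequency` cut-off are all assumptions made for illustration.

```python
from statistics import mean


def build_main_spectrum(peak_lists, ppm_tol=500.0, min_frequency=0.5):
    """Merge replicate peak lists into one reference peak list.

    peak_lists: one list of (m/z, intensity) tuples per replicate spectrum.
    Peaks from all replicates are pooled, sorted by m/z and grouped while
    they stay within ppm_tol of the first mass in the group (assumption).
    """
    pooled = sorted(
        (mz, intensity, idx)
        for idx, peaks in enumerate(peak_lists)
        for mz, intensity in peaks
    )
    groups, current = [], []
    for peak in pooled:
        anchor = current[0][0] if current else None
        if current and (peak[0] - anchor) / anchor * 1e6 > ppm_tol:
            groups.append(current)
            current = []
        current.append(peak)
    if current:
        groups.append(current)

    n_spectra = len(peak_lists)
    main_spectrum = []
    for group in groups:
        # Frequency = fraction of replicates in which the peak was observed.
        frequency = len({idx for _, _, idx in group}) / n_spectra
        if frequency >= min_frequency:
            main_spectrum.append({
                "mz": mean(p[0] for p in group),
                "intensity": mean(p[1] for p in group),
                "frequency": frequency,
            })
    return main_spectrum
```

With 20 selected spectra per strain, `peak_lists` would hold 20 entries. Peaks seen in only a few replicates fall below `min_frequency` and are dropped, which mirrors the goal of retaining the most reproducible peaks.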
Cluster analysis was performed based on comparison of strain-specific main spectra created as described above. The dendrogram was constructed by the statistical toolbox of Matlab 7.1 (MathWorks Inc., USA) integrated in the MALDI Biotyper 2.0 software. The parameter settings were: 'Distance Measure = Euclidian' and 'Linkage = complete'. The linkage function is normalized according to the distance between 0 (perfect match) and 1000 (no match).
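As an illustration of the clustering settings named above (Euclidean distance, complete linkage, heights on a 0 to 1000 scale), the following pure-Python sketch performs agglomerative clustering on numeric feature vectors. It does not reproduce the Matlab toolbox or the Biotyper software. How main spectra are converted into vectors is not described in the text, so the input format is an assumption, as is the rescaling of heights by the largest pairwise distance. The function name `complete_linkage` is hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(vectors, labels):
    """Agglomerative clustering with complete linkage.

    Returns a list of merges (members_a, members_b, height), with heights
    rescaled so that the largest pairwise distance maps to 1000 (assumption).
    """
    n = len(vectors)
    if n < 2:
        return []
    dist = {
        (i, j): euclidean(vectors[i], vectors[j])
        for i in range(n)
        for j in range(i + 1, n)
    }
    max_d = max(dist.values()) or 1.0
    clusters = {i: (i,) for i in range(n)}  # cluster id -> member indices
    merges = []
    next_id = n
    while len(clusters) > 1:
        ids = sorted(clusters)
        best = None
        for x in range(len(ids)):
            for y in range(x + 1, len(ids)):
                a, b = ids[x], ids[y]
                # Complete linkage: cluster distance is the largest
                # distance between any pair of members.
                d = max(
                    dist[(min(i, j), max(i, j))]
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((
            tuple(labels[i] for i in clusters[a]),
            tuple(labels[i] for i in clusters[b]),
            1000.0 * d / max_d,
        ))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges
```

Each merge corresponds to one node of the dendrogram: merges with low heights join strains with similar profiles, and the final merge sits at the top of the scale.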

Phylogenetic analyses
The results of MALDI-TOF MS analysis were compared with those obtained after rrs, recA, atpD and nodC gene sequence analyses. In this work we obtained some sequences of these genes that are absent in databases according to Rivas   gene, Gaunt et al. [20] for recA and atpD genes and Laguerre et al. [21] for nodC gene. The sequences were aligned using the Clustal W software [22]. The distances were calculated according to Kimura's two-parameter model [23]. Phylogenetic trees were inferred using the neighbour-joining method [24] and the MEGA 4.0 package [25].
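For readers unfamiliar with Kimura's two-parameter model, the distance between two aligned sequences is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of sites showing transitions and transversions, respectively. The short Python sketch below computes this value for a pair of aligned sequences. It is an illustration only and not the MEGA implementation. The function name `k2p_distance` and the choice to skip sites containing gaps or ambiguous bases are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or a non-ACGT symbol are
    skipped (assumption: pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent
    # for the model (argument of the logarithm not positive).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

A matrix of such pairwise distances is the input that a neighbour-joining procedure then uses to infer the tree.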

Database setting
In Biotyper 2.0 database only three species of genus Rhizobium are included and none of genus Ensifer or Shinella. Therefore a database extension in order to include the species currently described in these genera is necessary before applying MALDI-TOF MS to the identification of rhizobial isolates.
Since the Biotyper 2.0 database already includes the type strains of three species of genus Rhizobium, R. tropici DSM 11418 T , R. rubi DSM 6772 T and R. radiobacter DSM 30147 T , we verified the reproducibility of MALDI-TOF MS using the type strains of these species cultivated in two different media (YMA and TY) and incubated for 24 and 48 h.
The results obtained showed that the analysed strains matched with high score values (higher than 2.5) with each corresponding type strain already present in Biotyper 2.0 database when TY medium and 24 h incubation were used (Table 3). Lower score values were found with YMA medium incubated at 24 h and only R. rubi ATCC 13335 T and R. radiobacter ATCC 19358 T were correctly identified with score values higher than 2 (Table 3). This was probably due to the production of higher amounts of exopolysaccharide in YMA medium which makes the sample preparation difficult. After an incubation time of 48 h the score values were lower when both TY and YMA media were used and only R. rubi ATCC 13335 T and R. radiobacter ATCC 19358 T were correctly identified using YMA medium. Therefore best results for rhizobial species were obtained with TY medium and 24 h incubation, in spite of previous studies that have demonstrated high reproducibility of MALDI-TOF MS analysis in different culture media and growth phases [14,26,27].
Before the extension of Biotyper 2.0 database we also checked the suitability of MALDI-TOF MS system to differentiate the spectra of representative species from the three genera currently accepted in Family Rhizobiaceae.
Firstly we compared the spectra of the type strains from the type species of the three genera currently included in family Rhizobiaceae.
The results obtained showed that the spectra of Rhizobium leguminosarum USDA 2370 T , Ensifer adhaerens LMG 20216 T and Shinella granuli DSM 18401 T were clearly distinguishable since there were no common peaks among their spectra (figure 1).
Subsequently, we analyzed the spectra of two phylogenetically close and one phylogenetically divergent species from each genus. We selected from genus Rhizobium the close species R. leguminosarum (type species of genus Rhizobium) and R. pisi as well as the species R. cellulosilyticum, phylogenetically distant from them. From genus Ensifer we chose the close species E. meliloti and E. medicae and the species E. adhaerens, which is the type species of genus Ensifer and it is phylogenetically distant from the other two species. Finally, from genus Shinella we chose the close species S. granuli, type species of genus Shinella, and S. kummerowiae and the phylogenetically distant S. fusca.
All these spectra were quite different, with almost no common peaks among those of species belonging to different genera, as we previously observed for the type species of each genus. Within the same genus the spectra of close species were more similar than those from divergent species. For example, considering the mass tolerance ±2 m/z for each peak as we have previously described [13] 1C). These results showed that the spectra of both phylogenetically close and distant species from the same genus, as well as those of species of different genera within family Rhizobiaceae, can be differentiated by MALDI-TOF MS. Therefore we extended the MALDI BioTyper 2.0 database with 56 type strains of species from genera Rhizobium, Ensifer and Shinella belonging to family Rhizobiaceae (Table 1).
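Counting the peaks shared by two spectra under a fixed mass tolerance can be sketched as follows. This is a minimal Python illustration, not the procedure of reference [13]. The function name `count_shared_peaks`, the one-to-one matching rule and the default tolerance of 2 m/z units are assumptions.

```python
def count_shared_peaks(masses_a, masses_b, tol=2.0):
    """Count peaks present in both spectra within +/- tol m/z units.

    Each peak is matched at most once; both lists are walked in
    ascending m/z order (two-pointer scan).
    """
    a = sorted(masses_a)
    b = sorted(masses_b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        diff = a[i] - b[j]
        if abs(diff) <= tol:
            shared += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1   # peak in a is too light to match the current peak in b
        else:
            j += 1   # peak in b is too light to match the current peak in a
    return shared
```

Two spectra from closely related species would be expected to give a higher count than spectra from different genera, for which the count approaches zero.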

Comparison between MALDI-TOF MS and phylogenetic analyses
To compare the data obtained by MALDI-TOF MS analysis with those based on gene sequence analysis (figures 2, 3 and 4), a cluster analysis was performed based on a correlation matrix using the integrated tools of the MALDI Biotyper 2.0 software package. Figure 5 showed that the genus Rhizobium was divided into several clusters whose distribution basically coincided with that observed after rrs, recA and atpD gene analyses. The results evidenced that some reclassifications performed within genus Rhizobium are correct as occurs in the case of the former species Agrobacterium tumefaciens reclassified into A. radiobacter [28]. MALDI-TOF MS results confirmed that they are the same species since their type strains held in different collections matched with score values higher than 2 (Table 2A and 2B). These results are congruent with those obtained from recA and atpD gene analyses since these strains presented nearly identical sequences (figures 3 and 4). After reclassification of the complete genus Agrobacterium into Rhizobium, the current valid name for these species is Rhizobium radiobacter [3].
MALDI-TOF MS analysis also confirmed the reclassification of R. trifolii ATCC 14480 into R. leguminosarum [29], since the strain ATCC 14480 matched with R. leguminosarum USDA 2370 T with a score value higher than 2 (Table 2B), and that of Blastobacter aggregatus DSM 1111 T into R. aggregatum [30], since strain DSM 1111 T clustered with species of genus Rhizobium (figure 5).
In contrast, some species of genus Rhizobium were erroneously reclassified. For example, the R. phaseoli type strain was reclassified into R. leguminosarum [31]. Later the biovar phaseoli type I of this species was reclassified into R. etli [32], so the placement of the R. phaseoli type strain was not clear. A revision based on rrs, recA and atpD analysis showed that R. phaseoli is a valid species distinguishable from both R. leguminosarum and R. etli [29]. The MALDI-TOF MS results confirmed this conclusion since R. phaseoli ATCC 14482 T matched with R. etli CFN42 T with score values lower than 2 (Table 2C).
Moreover, the MALDI-TOF MS cluster analysis showed, in agreement with rrs, recA and atpD gene analyses, that some current Rhizobium species are indistinguishable (figures 2, 3, 4 and 5). For example, the type strains of R. mongolense, R. loessense and R. yanglingense matched with R. gallicum R602sp T with score values higher than 2 (Table 2D). In addition, R. indigoferae CCBAU 71042 T matched with R. leguminosarum USDA 2370 T with a score value of 2.219 and R. fabae LMG 23997 T matched with R. pisi DSM 30132 T with a score value of 2.258 (Table 2D). Therefore the taxonomic status of all these species should be revised according to the current rules of bacterial taxonomy.
The MALDI-TOF cluster analysis of the genera Shinella and Ensifer was performed together (figure 6) since they are closely related on the basis of recA and atpD gene analyses (see figures 3 and 4). This closeness was confirmed by the MALDI-TOF cluster analysis, although the distribution of Shinella species was slightly different (figure 6). The species S. yambaruensis was the closest related species to S. granuli on the basis of MALDI-TOF MS analysis, whereas these two species were distant according to their rrs gene sequences (figure 2). However, S. yambaruensis DSM 18801 T matched with S. granuli DSM 18401 T with a score value lower than 2, corresponding to different species from the same genus.
The distribution of species in the genus Ensifer was consistent with that found after rrs analysis, with E. medicae and E. meliloti forming the same group, E. morelense close to E. adhaerens and E. americanum (a not yet validated species) close to E. fredii (figure 6).
In the genus Ensifer, MALDI-TOF MS analysis also confirmed some reclassifications, such as that of E. xinjiangense into E. fredii [33], since the former type strains E. xinjiangense LMG 17930 and CECT 4657 matched with E. fredii LMG 6217 T with score values of 2.413 and 2.151, respectively (Table 2B). The reclassification of the strain Rhizobium sp. Br816 as Ensifer sp. [34,35] was also confirmed since it clustered with E. americanum (figure 6). However, in agreement with rrs, recA and atpD gene analyses (figures 2, 3 and 4), strain Br816 does not belong to this species since it matched with E. americanum DSM 15007 T with score values lower than 2 (Table 2G).
However, other reclassifications were not correct as occurs with E. morelense reclassified into E. adhaerens [36] since E. morelense Lc04 T matched with E. adhaerens LMG 20216 T with a score value of only 1.245 (Table 2C) in agreement with rrs, recA and atpD gene analyses (figures 2, 3 and 4).
In the genus Ensifer, also some species were indistinguishable, for example, E. kummerowiae CCBAU 71714 T matched with E. meliloti ATCC 9930 T with a score value of 2.261 suggesting that they belong to the same species (Table 2D). Since this result coincides with the analysis of rrs, recA and atpD genes, the taxonomic status of E. kummerowiae should be revised.
All these findings showed that MALDI-TOF MS results are comparable to those obtained after the phylogenetic analysis of core genes from members of family Rhizobiaceae, including the rrs gene, on which classification within this family is currently based [1]. These results are in agreement with those previously reported for other bacterial groups [37], and therefore we analysed the potential of MALDI-TOF MS for identification of fast-growing rhizobial isolates.

Identification of rhizobial strains by MALDI-TOF MS
To prove the suitability of the extended MALDI Biotyper 2.0 database for routine identification and discrimination of fast-growing rhizobial species we analysed several strains previously identified by rrs and housekeeping gene sequencing belonging to different species and genera of family Rhizobiaceae (Table 2F and 2G).
The species R. leguminosarum contains some strains with identical rrs gene and divergent recA and atpD genes [29,38,39]. For   . Although all these strains clustered with R. leguminosarum USDA 2370 T after MALDI-TOF MS cluster analysis (figure 5), only when the housekeeping genes were almost identical the score values were higher than 2 with respect to R. leguminosarum USDA 2370 T (Table 2F). These results were congruent with those from recA and atpD gene analyses showing that, in spite of the complete identity of rrs gene, R. leguminosarum could contain several subspecies perfectly distinguishable by MALDI-TOF MS analysis as it has already been observed in other bacterial species [26,40,41].
Although housekeeping gene sequences present higher variability than those of rrs genes, the ITS fragment located between the 16S and 23S genes in fast-growing rhizobia is the most hypervariable chromosomal region and has been proposed as a tool for species differentiation [42]. However, MALDI-TOF MS showed that strains with nearly identical housekeeping genes but different ITS sequences belong to the same species. For example, the strains RPA12 and RPA02 shared only 73% identity in their ITS sequences with respect to R. giardinii H152 T , suggesting they could represent different species [43]. However, in agreement with rrs, recA and atpD gene analyses, MALDI-TOF MS showed that they belong to R. giardinii since they matched with the type strain of this species with score values higher than 2 (Table 2F).
The same was found for the genus Ensifer strains RTM17 and GVPV12 that matched with E. meliloti ATCC 9930 T with score values higher than 2 (Table 2F) in spite of the differences in the ITS region (95% identity) [44] and in agreement with the results of the housekeeping gene analyses (figures 3 and 4).
Intraspecific variability in species of family Rhizobiaceae could be also due to the presence of large plasmids codifying for symbiotic or virulence factors. Nodulating species may contain different biovars that carry different nodC genes [21,38,44,45] and pathogenic species contain strains that carry plasmids involved in tumour (pTi) or hairy roots induction (pRi) [46,47]. Therefore we analysed strains with different combinations of chromosomal backgrounds and symbiotic or virulence plasmids by MALDI-TOF MS.
For example, within genus Rhizobium, R. leguminosarum contains three biovars: viciae, trifolii and phaseoli [31,38], perfectly distinguishable on the basis of their nodC gene sequences (figure 7). However, MALDI-TOF MS analysis showed that strains with housekeeping genes close to R. leguminosarum USDA 2370 T (RVS11, RPVF18 and ATCC 14480) [29,38,39] matched with this strain with score values higher than 2 regardless of the biovar to which they belong (figure 7). Likewise, the strains FL27 from biovar gallicum [45] and PhD12 from biovar phaseoli [21], carrying divergent nodC genes (figure 7), matched with R. gallicum R602sp T with score values higher than 2 (Table 2F). The same was found in R. lusitanum, whose strains P1-7 T and P3-13 have phylogenetically distant nodC genes (figure 7) but matched with a score value of 2.314 (Table 2F).
In genus Ensifer, E. meliloti also contains different biovars with divergent nodC genes (figure 7). However, the strains RPA13 and RTM17 from biovar meliloti and the strain GVPV12 from biovar mediterranense [44] were matched with E. meliloti ATCC 9930 T with score values higher than 2 by MALDI-TOF MS (Table 2F).
Conversely, strains from the same biovar but with divergent housekeeping genes were perfectly distinguished by MALDI-TOF MS in genus Rhizobium. For example, the strain CVIII14 matched with a score value lower than 2 with R. leguminosarum USDA 2370 T , although both strains belong to the biovar viciae (Table 2G). R. pisi DSM 30132 T also belongs to this biovar and was correctly distinguished by MALDI-TOF MS from R. leguminosarum USDA 2370 T (figure 7 and Table 2C). Moreover, strains CFN299 and CIAT 899 T , whose rrs and housekeeping genes showed they belong to different species [48], matched with score values lower than 2 (Table 2F) in spite of the complete identity of their nodC genes (figure 7).
Finally, two species of genus Rhizobium, R. rhizogenes and R. radiobacter, contain non-pathogenic strains, tumourigenic strains and hairy-root-inducing strains (Table 2F). In both cases their strains were correctly identified by MALDI-TOF MS in agreement with the rrs and housekeeping gene analyses (figures 2, 3 and 4), irrespective of their plasmid content. Thus the non-pathogenic strain K84 [46], the tumourigenic strain 163C and the root-inducing strain IAM 13571 matched with R. rhizogenes ATCC 11325 T , a root-inducing strain, with high score values (2.185, 2.195 and 2.158, respectively). The tumourigenic strain ATCC 23308 (type strain of the former species A. tumefaciens) and the root-inducing strain ATCC 13332 (erroneously named R. rhizogenes) also matched with the non-pathogenic strain R. radiobacter DSM 30147 T with score values higher than 2 (Table 2F).
Conversely, although the pTi plasmids of the tumourigenic strains 163C and C58 are closely related [47], they belong to different species according to MALDI-TOF MS results (Table 2G) in agreement with the rrs and housekeeping gene analyses (figures 2, 3 and 4).
All these results showed that plasmids carried by fast-growing rhizobial strains do not affect their identification by MALDI-TOF MS, since strains of the same species carrying very different plasmids and strains from different species carrying similar plasmids were correctly identified. In conclusion, the results presented in this work clearly showed that MALDI-TOF MS is a reliable and rapid method for rhizobial identification comparable to housekeeping gene sequence analysis, since it is able to discriminate between strains with identical rrs gene sequences but divergent recA and atpD genes. Compared with gene sequencing, it offers important advantages in speed and cost per sample. With this methodology, provided that the databases include all rhizobial species described at any given time, it will be possible to identify all isolates belonging to species already described as well as to detect new species. Therefore, MALDI-TOF MS opens a new and very useful way for diversity and ecological studies, applicable to the analysis of large populations of isolates and allowing the differentiation of strains, species and genera of fast-growing rhizobia with an effectiveness of 100% in identification at species level.