Matrix-assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry (MALDI-TOF MS) Can Precisely Discriminate the Lineages of Listeria monocytogenes and Species of Listeria

The genetic lineages of Listeria monocytogenes and other species of the genus Listeria are correlated with pathogenesis in humans. Although matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) has become a prevailing tool for rapid and reliable microbial identification, the precise discrimination of Listeria species and lineages remains a crucial issue in clinical settings and for food safety. In this study, we constructed an accurate and reliable MS database to discriminate the lineages of L. monocytogenes and the species of Listeria (L. monocytogenes, L. innocua, L. welshimeri, L. seeligeri, L. ivanovii, L. grayi, and L. rocourtiae) based on the S10-spc-alpha operon gene encoded ribosomal protein mass spectrum (S10-GERMS) proteotyping method, which relies on both genetic information (genomics) and observed MS peaks in MALDI-TOF MS (proteomics). The specific set of eight biomarkers (ribosomal proteins L24, L6, L18, L15, S11, S9, L31 type B, and S16) yielded characteristic MS patterns for the lineages of L. monocytogenes and the different species of Listeria, and led to the construction of an MS database that successfully discriminated between these organisms in MALDI-TOF MS fingerprinting analysis followed by analysis with the advanced proteotyping software Strain Solution. We also validated the constructed database in Strain Solution by using 23 Listeria strains collected from natural sources.


Introduction
The genus Listeria comprises gram-positive bacteria that can grow in saline and cold environments [1]. At present, the bacterial genus Listeria consists of 17 species, including Listeria
Here, we report the construction of an accurate and reliable m/z database of annotated biomarker proteins for Listeria spp. based on the S10-GERMS method. We demonstrate that the biomarker proteins selected from the database can discriminate lineages of Listeria monocytogenes and species of the genus Listeria (L. monocytogenes, L. innocua, L. welshimeri, L. seeligeri, L. ivanovii, L. grayi, and L. rocourtiae) using MALDI-TOF MS analysis.

Bacterial strains
To construct the theoretical MS database, we used publicly available L. monocytogenes strains and Listeria spp. listed in Table 1, obtained from the Japan Collection of Microorganisms, RIKEN BRC (Tsukuba, Japan), the American Type Culture Collection (Manassas, VA, USA), and the National BioResource Project GTC Collection (Gifu, Japan). They were aerobically cultivated in brain heart infusion medium (Becton Dickinson, Franklin Lakes, NJ, USA) at 30°C. L. monocytogenes strains were serotyped using the Agglutinating Sera Listeria Antisera Set (Denka Seiken, Tokyo, Japan), with multiplex PCR as a countercheck [26].
To evaluate the constructed MS database, environmental isolates of Listeria species (L. monocytogenes, L. innocua, L. ivanovii, L. seeligeri, and L. rocourtiae) listed in Table 2 were used. They were screened from river water in Japan and identified by the conventional method using ALOA media (bioMérieux, Lyon, France) and an antisera serotyping kit. Species of Listeria were determined by a physiological biochemical test using the Listeria identification system API (bioMérieux) and by 16S rRNA sequencing [27]. Pathogenicity of L. monocytogenes was confirmed by the CAMP test [28] and multiplex PCR for the inlB, plcA, plcB, and clpE genes [29]. Serotypes of L. innocua were not defined because antisera for serotyping were not available.

Construction of an MS database
The primers for DNA sequence analysis were designed based on the consensus sequence information available from the Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg/) and NCBI (http://www.ncbi.nlm.nih.gov/) (Table 3). DNA fragments for sequence analysis were amplified by PCR using KOD plus DNA polymerase (Toyobo, Osaka, Japan) with extracted genomic DNA as the template, as described previously [24]. We used the Big Dye ver. 3.1 Cycle Sequencing Kit (Applied Biosystems, Foster City, CA, USA) for sequencing. The theoretical m/z values of biomarker proteins were calculated using in-house macro software that converts each DNA sequence to an amino acid sequence, sums the amino acid masses to obtain the protein mass, adds a proton (m/z 1.01), and subtracts the mass of the N-terminal Met (m/z 131.19) when the second amino acid is Ala, Cys, Gly, Pro, Ser, Thr, or Val.
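The calculation workflow described above can be illustrated with a minimal Python sketch. This is not the in-house macro itself: the function names, the standard average residue masses, and the more precise proton mass (1.00728, which the text rounds to 1.01) are assumptions made for illustration. The sketch also ignores post-translational modifications and alternative bacterial start codons.

```python
# Average residue masses in Da (standard values; assumed, not taken from the paper).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528   # one water per intact polypeptide chain
PROTON = 1.00728   # singly protonated ion [M+H]+; the text rounds this to 1.01

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a coding DNA sequence, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def theoretical_mz(protein):
    """Return the [M+H]+ m/z, removing N-terminal Met when residue 2 allows it."""
    if len(protein) > 1 and protein[0] == "M" and protein[1] in "ACGPSTV":
        protein = protein[1:]  # N-terminal methionine excision rule from the text
    mass = sum(RESIDUE_MASS[aa] for aa in protein) + WATER
    return mass + PROTON


if __name__ == "__main__":
    toy_gene = "ATGGCTGGTTAA"  # hypothetical toy sequence, not a Listeria gene
    print(translate(toy_gene), round(theoretical_mz(translate(toy_gene)), 2))
```

In practice, the input would be the sequenced ribosomal protein gene of each strain, and a modification such as the acetylation reported for S9 would need to be added as a separate mass offset.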

MALDI-TOF MS analysis
Bacterial cells grown overnight on an agar plate (three colonies) or in 2 mL liquid medium were collected and suspended in 0.5 mL of 70% (v/v) ethanol. Cells were then pelleted by centrifugation at 10,000 × g for 2 min at 4°C and dehydrated using a centrifugal evaporator (CVE-3100, EYELA, Tokyo, Japan), after which the cells were mixed with a 10 μL aliquot of 35% (v/v) formic acid by pipetting. Then, 1.5 μL of this mixture was mixed with 10 μL matrix reagent containing 20 mg/mL sinapinic acid (SA; Wako Pure Chemical Industries, Osaka, Japan) and 1% trifluoroacetic acid (Wako Pure Chemical Industries) in 50% (v/v) acetonitrile, and 1.5 μL of the resulting mixture was spotted onto the analytical metal plate. Samples were analyzed using an AXIMA Microorganism Identification System (Shimadzu Corporation) with 100 laser shots over a spectrum range of m/z 2000 to 35000 with 500 ppm tolerance. We used α-cyano-4-hydroxycinnamic acid (CHCA) as the matrix for SARAMIS database searching, as described in a previous report [25]. For calibrating the instrument, Escherichia coli DH5α was used according to the manufacturer's instructions.

Analysis with Strain Solution software
The datasets of MS values and peak intensities in ASCII files were imported into the software Strain Solution version 1.0.0 (Shimadzu Corporation) and analyzed according to the manufacturer's instructions. The MS values of the parental database were registered in the software in advance.

Results
Construction of an MS database for discriminating L. monocytogenes lineages
First, we constructed an m/z database of ribosomal proteins encoded in the S10-spc-alpha operon and additional potential biomarkers using publicly available L. monocytogenes strains (Fig 1A). The ribosomal proteins L23, L2, L24, and L6 in the S10-spc-alpha operon and the three additional ribosomal proteins L10, L13, and S9 appear capable of differentiating serotypes of L. monocytogenes because their masses differ among strains. Comparison of the theoretically calculated mass of S9 with its observed MS peaks showed that the observed S9 peak was shifted by +43, indicating an acetylated S9. Among these, L23, L2, and L10 were not detected in the MALDI-TOF MS analysis, although the variety of their theoretical MS values suggested that they would be powerful biomarkers. We observed the MS peaks of L13 with m/z 16200.61 or 16184.61 in all L. monocytogenes samples and m/z 16187.61 in L. seeligeri strains (data not shown); however, this protein was not suitable as a biomarker because the MS differences (shifts) were too small. In contrast, the three ribosomal proteins L24, L6, and S9 were ideal biomarker candidates because they were always detected, regardless of the strain, with the current MS detection tolerance (more than 500 ppm).
These strains (JCM7689 and JCM7682) were deposited as L. monocytogenes in JCM, but we revealed that they are L. seeligeri.
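To make the tolerance argument concrete, the following sketch expresses the L13 mass differences quoted above in ppm and checks them against a 500 ppm matching window. The function names and the symmetric window are illustrative assumptions; how Strain Solution applies its tolerance internally is not described here.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed deviation of an observed peak from a theoretical mass, in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def matches(observed_mz, theoretical_mz, tolerance_ppm=500.0):
    """True if the observed peak falls inside the +/- tolerance window."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tolerance_ppm


# L13 masses reported in the text.
L13_MONOCYTOGENES = (16200.61, 16184.61)
L13_SEELIGERI = 16187.61

if __name__ == "__main__":
    # The L. seeligeri mass lies well inside the window of one L. monocytogenes mass.
    print(ppm_error(L13_SEELIGERI, L13_MONOCYTOGENES[1]))
    print(matches(L13_SEELIGERI, L13_MONOCYTOGENES[1]))
    # A hypothetical peak midway between the two L. monocytogenes masses
    # would fall inside both windows at once.
    print(matches(16192.61, 16184.61), matches(16192.61, 16200.61))
```

Under these assumptions, a 3 Da shift near m/z 16,000 is only a fraction of the window, which agrees with the observation that L13 cannot serve as a reliable biomarker at this tolerance.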

Analysis of L. monocytogenes with Strain Solution software
From the results above, we selected three ribosomal proteins (L24, L6, and S9) as biomarker proteins capable of discriminating L. monocytogenes lineages. Four patterns of MS values of these proteins (listed in Fig 1B), which divided L. monocytogenes into four groups (A to D), were preregistered in the analytical software Strain Solution. First, 14 L. monocytogenes strains were identified as 'L. monocytogenes' using the conventional fingerprinting analysis software, SARAMIS (data not shown). Next, the MS data obtained with the SA matrix were imported into the Strain Solution software. As shown in Fig 1C, the attribution of all biomarkers generated by automated analysis with Strain Solution was correct. The lineages of L. monocytogenes were classified into four groups (A to D) as follows: lineage I [registered group B (serotypes 1/2b, 3b,

Construction of an MS database for Listeria species
As described above, the three promising ribosomal proteins (L24, L6, and S9) were consistently detected by MALDI-TOF MS analysis and were effective for typing of L. monocytogenes.
Table 3. DNA Primers used in this study.

Comparison of fingerprinting analysis and S10-GERMS method
As mentioned above, the conventional fingerprinting analysis software SARAMIS could properly identify all L. monocytogenes strains to the species level. Here, we analyzed L. innocua, L. ivanovii, L. seeligeri, L. welshimeri, L. rocourtiae, and L. grayi by SARAMIS. As shown in Table 4, most of them were identified only as 'Listeria spp.', except for L. grayi, which was correctly identified to the species level. However, L. ivanovii JCM7681 and L. seeligeri JCM 7679 and JCM 7682 were misidentified as L. monocytogenes. Although L. rocourtiae initially had a 0% match to the parental database in SARAMIS, resulting in 'not identified', it was correctly identified to the species level after an additional database was imported according to the manufacturer's instructions for SuperSpectra™ in SARAMIS. This implies that fingerprinting depends greatly on the quality of the reference database. The parental database for L. rocourtiae registered here is shown in S1 Table.
Next, we tried automated discrimination at the species level using the eight biomarkers in the database of Fig 2. First, MS patterns belonging to groups A to K were preregistered in Strain Solution based on Fig 2. The m/z 15797.08, 15797.03, and 15796.09 of L15 were preset as m/z 15797.08 because the MS differences were too small to be detected on a normal MALDI-TOF MS system. Similarly, m/z 19270.08 and 19270.04 (specific to L. seeligeri) of L6 were registered as m/z 19270.08 because their MS differences were too small to be recognized. As a result, all Listeria spp. strains listed in Table 1 were assigned to the proper groups by our system. The spectra of the biomarkers obtained in our MALDI-TOF MS analysis are shown in S1 Fig.
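The registration step for near-identical masses can be sketched as follows. This is an illustrative assumption rather than the procedure actually used: the function name, the 500 ppm grouping threshold, and the rule of keeping the largest value in each group (chosen only because it reproduces the registered values quoted above) are all hypothetical.

```python
def collapse_masses(masses, tolerance_ppm=500.0):
    """Merge theoretical masses that lie within the tolerance of each other.

    Masses are sorted and each one is compared with the smallest member of
    the current group, so groups cannot grow by chaining. The largest member
    of each group is returned as the value to register (assumed rule).
    """
    groups = []
    for mz in sorted(masses):
        if groups and (mz - groups[-1][0]) / groups[-1][0] * 1e6 <= tolerance_ppm:
            groups[-1].append(mz)
        else:
            groups.append([mz])
    return [max(group) for group in groups]


if __name__ == "__main__":
    # L15 and L6 values quoted in the text collapse to a single entry each.
    print(collapse_masses([15797.08, 15797.03, 15796.09]))
    print(collapse_masses([19270.08, 19270.04]))
    # The three L24 values reported in this study stay distinct.
    print(collapse_masses([11180.22, 11194.25, 11254.35]))
```

A sub-dalton difference near m/z 16,000 or 19,000 corresponds to only a few tens of ppm, whereas the L24 values differ by more than 1,000 ppm, which is why the former are merged and the latter remain usable as separate database entries.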

Evaluation of the constructed MS database
Strains isolated from the environment were used in a blind test to evaluate the constructed MS database (Table 2). In fingerprinting analysis with SARAMIS, all L. monocytogenes strains (Nos. 1-6 and 17) and L. rocourtiae strains (Nos. 18-20) were correctly identified at the species level. In contrast, the other 13 strains were identified only as Listeria spp., although the SARAMIS database already includes the m/z information of L. innocua and L. ivanovii.
The Strain Solution software further scanned the m/z data of a total of 20 strains, excluding L. rocourtiae. The analytical results of the traditional method and Strain Solution are summarized in Table 5. Using the eight biomarkers in Fig 2, all L. monocytogenes (seven strains), L. innocua (six strains), and L. ivanovii (three strains) were correctly identified at the lineage or species level in our system, with a 100% match for each group. However, the identification accuracy for L. seeligeri was 3/4 (75%) because strain No. 13 was misidentified as L. innocua.
S10-GERMS method, which combines both genomics and proteomics. First, we constructed a database of ribosomal proteins encoded in the S10-spc-alpha operon for L. monocytogenes and found that the MS values of 15 kinds of proteins varied with serotype (Fig 1). Among these, three potential biomarkers, L24, L6, and S9, whose MS peaks were always detected in MALDI-TOF MS analysis, were selected as biomarkers for typing L. monocytogenes. In addition, five potential biomarkers, L18, L15, S11, L31 type B, and S16, with a specific MS value in L. innocua, could be used to differentiate Listeria species, including L. monocytogenes (Fig 2). CHCA is the recommended matrix reagent in fingerprinting methods; therefore, it is difficult to identify proteins with high m/z. However, when SA was used as the matrix reagent, the novel biomarkers with high m/z were successfully detected in this study as follows:  [16,19] were confirmed as solid values of peaks corresponding to the ribosomal protein L24 with m/z 11180.22, 11194.25, or 11254.35, respectively, by proteotyping for the first time. The differences in MS values between previous reports and this study come from the accuracy of the experimental procedure.
To realize strain- or serotype-level microbial discrimination at a higher resolution than that of conventional fingerprinting analysis, the accuracy of MS values is important because the discrimination relies upon the MS database, which reflects even single amino acid substitutions. In fact, we often observe slight differences in MS peaks derived from the same proteins in closely related microorganisms [23,25]. Fingerprinting analysis may still be greatly influenced by the culture and/or growing conditions of the target bacteria [30]; therefore, proteotyping data backed up by genetic sequences will be of great importance for correct identification. Our results indicate that ribosomal proteins L24, L6, and S9 are especially important to discriminate lineages of L. monocytogenes and to differentiate L. monocytogenes and L. seeligeri (Figs 1 and 2).
In contrast, the MS peaks of m/z 5590, 5594.85, 5601.21, and 6184.39, identified by previous reports as biomarkers for serotyping L. monocytogenes [16,19], were observed in this study as m/z 5590 (lineages II and III) and m/z 5595 (lineage I), although serotype 4d strains did not exhibit the corresponding peaks (data not shown). We did not select this unknown protein as a potential biomarker because the peak intensities were insufficient in our analysis, likely due to the use of SA as the matrix.
L. grayi and L. rocourtiae were correctly identified at the species level by SARAMIS and SuperSpectra (Table 4). L. grayi is known to be evolutionarily distant from other Listeria spp. [31], and L. rocourtiae is a recently described species isolated from lettuce in Austria [32].
The MS values of the eight biomarkers in these two species differed greatly from those of the other Listeria spp. (Fig 2A). This result is consistent with the concept that ribosomal protein evolution is well associated with bacterial species [33]. However, SARAMIS could not identify the other Listeria species correctly at the species level due to their very similar MS profiles, which may be indistinguishable by conventional fingerprinting (Table 4). Even in such a case, the MS database in Fig 2 with the Strain Solution software plays a significant role in the success of discrimination.
We further analyzed environmental strains in a blind test and validated the MS database (Tables 2 and 5). The hemolytic species L. monocytogenes, L. seeligeri, and L. ivanovii are genetically close [34,35] and are sometimes misidentified. In our study, one L. seeligeri strain (No. 13) was identified as L. innocua by Strain Solution. The 16S rRNA sequence analysis indicated that it was most likely L. seeligeri, but its physiological and biochemical profile corresponded to L. innocua or L. welshimeri because of positive signals for arylamidase and D-xylose utilization (data not shown). These results are consistent with our discrimination result, suggesting that Strain Solution could distinguish such minor differences between very similar strains that might be misidentified or overlooked by traditional methods.
The main aim of this study was the construction of a standardized and reliable database for an important pathogen, the genus Listeria. We successfully demonstrated the capability of the constructed database and Strain Solution software to discriminate L. monocytogenes at the serotype level, as well as different species of Listeria that were difficult to identify by conventional fingerprinting methods. Although we assessed this database using naturally isolated strains (Table 2), demonstration using larger-scale samples is still required for validation. Nevertheless, we believe that the proteotyping software Strain Solution, together with the accurate MS database constructed here, can be broadly applied by any laboratory using any MALDI-TOF MS system to perform strain- or serotype-level microbial classification beyond conventional fingerprinting. Therefore, we are willing to evaluate the constructed database in collaboration with institutions possessing Listeria isolates. These investigations will open a new window to discriminate bacteria in clinical and diagnostic laboratories, and also in food-related industries.
S1 Table. MS data of L. rocourtiae registered in SuperSpectra. The registration of entries refers to the manual of SuperSpectra. (DOCX)