Fast DNA Serotyping and Antimicrobial Resistance Gene Determination of Salmonella enterica with an Oligonucleotide Microarray-Based Assay

Salmonellosis, caused by Salmonella (S.), is among the most prevalent food-borne zoonotic diseases throughout the world. Therefore, serotype identification for all culture-confirmed cases of Salmonella infection is important for epidemiological purposes. As a standard, the traditional culture method (ISO 6579:2002) is used to identify Salmonella. Classical serotyping takes 4–5 days to complete, is labor-intensive and expensive, and requires more than 250 non-standardized sera to characterize the more than 2,500 Salmonella serovars currently known. These technical difficulties could be overcome with modern molecular methods. We developed a microarray-based serogenotyping assay for the most prevalent Salmonella serovars in Europe and North America. The current assay version could theoretically discriminate 28 O-antigens and 86 H-antigens. Additionally, we included 77 targets analyzing antimicrobial resistance genes. The Salmonella assay was evaluated with a set of 168 reference strains representing 132 serovars previously serotyped by conventional agglutination through various reference centers. Of the 132 tested serovars, 117 (89%) showed a unique microarray pattern, and 15 generated a pattern that was shared by multiple serovars (e.g., S. ser. Enteritidis and S. ser. Nitra). These shared patterns mainly resulted from the high similarity of the genotypes of serogroups A and D1. Using the patterns of the known reference strains, a database was built which represents the basis of a new PatternMatch software that can serotype unknown Salmonella isolates automatically. After assay verification, the Salmonella serogenotyping assay was used to identify a field panel of 105 Salmonella isolates. All were identified as Salmonella, and 93 of 105 isolates (88.6%) were typed in full concordance with conventional serotyping. This microarray-based assay is a powerful tool for serogenotyping.


Introduction
Salmonellosis caused by salmonellae is among the most prevalent food-borne zoonotic diseases throughout the world [1]. Therefore, serotype identification for all culture-confirmed cases of Salmonella infection is important for epidemiological purposes. The genus Salmonella includes two species: Salmonella (S.) enterica and Salmonella bongori. The species Salmonella enterica is divided into the following six subspecies: S. enterica subsp. enterica (I), S. enterica subsp. salamae (II), S. enterica subsp. arizonae (IIIa), S. enterica subsp. diarizonae (IIIb), S. enterica subsp. houtenae (IV) and S. enterica subsp. indica (VI) [2]. The subspecies Salmonella enterica subsp. enterica (I) includes the most relevant zoonotic pathogens with a global occurrence. A serotyping scheme, proposed by Kauffmann in 1934 [3], divides all subspecies into serovars by immunologic analyses of two surface structures, the O-polysaccharide (O-antigen) and the flagellin protein (H-antigen). The Kauffmann-White scheme was expanded from 44 serovars in 1934 to the 2,587 serovars currently known [2].
Genes required for the biosynthesis of the O-antigen are organized in the rfb cluster [4,5]. Within this cluster, sequences of the sugar transferases are relatively conserved and two genes are responsible for most of the genotypic and phenotypic differences of the 46 Salmonella O-serogroups described in the Kauffmann-White scheme. The genes of the O-antigen flippase (wzx) and polymerase (wzy) are highly variable and specific for their respective serogroup [5,6]. The H-antigen used for serotyping is encoded by two flagellar structure genes: fliC (phase 1 flagellin) and fljB (phase 2 flagellin). Both genes are highly conserved at their 5′ and 3′ ends and variable in their central region [7,8]. Most Salmonella serovars are diphasic, with fliC and fljB expressed alternately. Serovars with only one H-phase are considered to be monophasic. Monophasic Salmonella could theoretically originate in two different ways. They might represent ancestral forms which lack the second flagellar antigen and have not yet evolved the necessary switching mechanism. Alternatively, they could be deletion mutants of biphasic salmonellae that have lost either the switching mechanism or the ability to express the second flagellar antigen [9]. The genetic switching between these two phases is regulated by the hin gene, coding for a DNA-invertase [10]. This approximately 900-base pair (bp) DNA fragment adjacent to fljB, which specifies the synthesis of the H2 flagellar antigen, can exist in either orientation with respect to fljB. The orientation of the inversion region controls the expression of fljB, i.e., in one orientation the adjacent fljB is expressed and in the opposite orientation fljB is not expressed [11].
As a gold standard, the traditional culture method is used to detect Salmonella and, since 2002, ISO 6579 has represented a legislative norm for the detection of Salmonella [12]. This method includes non-selective pre-enrichment in buffered peptone water followed by selective enrichment and plating on two solid selective media. Colonies of interest are confirmed biochemically and serologically by agglutination with specific sera. However, the procedure according to ISO 6579:2002 takes 4–5 days to complete. Additionally, classical serotyping is labor-intensive and expensive, and it requires highly experienced laboratory staff and more than 250 reagents [13] to characterize the more than 2,500 Salmonella serovars currently listed. Besides, commercially available sera are not standardized and their availability is often limited due to a lack of resources and funding. In contrast, genotyping methods use DNA sequence information for identification. Such sequence information is unique, and the techniques can easily be reproduced and standardized between different laboratories. For this reason, there is an increasing need for a simple genotyping method that does not require a stock of different sera, but can be performed automatically in high throughput and for which reagents are available worldwide. Different molecular typing systems have been developed to meet this demand, such as multiplex real-time PCR [14,15], primer extension [16], microarrays [13,17], DNA sequence approaches [18], bead-based suspension arrays [19,20] and ligation-based microarrays [21]. Some recent molecular techniques have the disadvantage that only a small subset of serotypes can be typed, whereas other approaches do not provide an antigenic formula compatible with the Kauffmann-White scheme. Some techniques are too expensive and/or labor-intensive to be implemented in public health or diagnostic laboratories. Ballmer et al. (2007) proposed a genotyping microarray for Escherichia (E.) coli [6].
Using a comparable system, we aimed to develop a high-throughput, economical, array-based system to serotype Salmonella via its genotype. The microarray includes 255 different targets to analyze O- and H-phases and assign the genotype to the antigenic formula according to the Kauffmann-White scheme. Additionally, we included 77 targets related to antimicrobial resistance (AMR). Validation and testing of the array was completed with 132 different Salmonella serovars, including the most prevalent Salmonella serovars from human and non-human sources from North America and Europe ([1], www.cdc.gov/ncezid/dfwed/PDFs/SalmonellaAnnualSummaryTables2009.pdf) to ensure the development of a comprehensive assay with a global scope.

Bacterial strains, growth conditions, and genomic DNA extraction
The microarray-based assay was evaluated with a set of 168 reference strains representing 132 serovars previously serotyped by conventional agglutination through various reference centers, including the Centers for Disease Control and Prevention (CDC, Atlanta, USA), the German Collection of Microorganisms and Cell Cultures (DSMZ, Brunswick, Germany), the Salmonella Genetic Stock Center (SGSC, Calgary, Canada) and the National Reference Laboratory for Salmonellosis in Cattle at the Friedrich-Loeffler-Institute (FLI, Jena, Germany) (Table 1). For the S. serovar (ser.) Typhimurium strain LT2, the complete genome sequence (GenBank NC_003197.1), the antigenic formula and a theoretical prediction of the microarray hybridization pattern (which was generated using a probe-matching matrix; see Table S1) were available. The strains were cultivated on tryptone yeast agar, and genomic DNA was extracted using a Roche High Pure PCR Template Preparation Kit (Roche Diagnostics, Germany) or a Qiagen DNeasy Blood & Tissue Kit (Qiagen GmbH, Hilden, Germany) according to the manufacturer's instructions after treatment with lysis enhancer (Alere Technologies, Germany). If necessary, DNA was concentrated to at least 100 ng/µl using a SpeedVac centrifuge (Eppendorf, Hamburg, Germany) at room temperature for 30 min at 1,400 rpm.

Array design
Discrimination of the 46 described O-serotypes is mainly determined by the genes wzy (polymerase) and wzx (flippase). The 114 known H-antigens are encoded by two genes: fliC (phase 1 flagellin) and fljB (phase 2 flagellin). The 24- to 34-bp primers and oligonucleotide probes for serogenotyping were selected from variable parts of multiple sequence alignments of these determinative genes, aiming to be as discriminating as possible and to have similar G+C contents to ensure very similar hybridization behavior. In total, 255 serotyping probes were designed by analyzing all available annotated GenBank sequences (NCBI, http://www.ncbi.nlm.nih.gov/) related to the genes wzx and wzy as well as fliC and fljB. Additionally, the genes manC (O7, O11, O18, O40, O41), wbuH (O41, O62), weiB (O66), and rfbV (O4) were used to discriminate O-serotypes (Table 2). The serogroups A, B, C1, C2-C3, D1, D2, E1, E4, F, G, H, I, K, M, N, O, P, T and U were tested on the microarray using the 132 reference serovars (Table 1). Due to the high similarity between the serogroups A and D1, additional probes were designed to discriminate S. ser. Nitra and S. ser. Enteritidis. For this purpose, probes located in the genes lygA, lygD, sefA, sefB and sefC were designed to specifically identify S. ser. Enteritidis (Table 2). In order to identify S. ser. Paratyphi A, probes were designed to target the intergenic region SSPAI, a genomic island next to clpA [22]. For the discrimination of S. ser. Dublin from S. ser. Kiel, the genes SeD_A1100, SeD_A1101 and SeD_A1102 were used, as they code for a conserved putative protein specific for serovar Dublin [23].
All 332 probes (255 serotyping and 77 AMR probes, synthesized by Metabion, Martinsried, Germany) were spotted at Alere Technologies, Germany in duplicates to the ArrayStrips as previously described [28]. Biotinylated oligonucleotides with a random sequence were spotted as a staining control and spotting buffer was spotted as a negative control.
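The selection criteria named above (a length of 24 to 34 bp and similar G+C contents) can be illustrated with a minimal Python sketch. The example sequences, the target G+C fraction of 0.5 and the tolerance of 0.1 are assumptions chosen for illustration only; they are not values or probes from this study.

```python
def gc_content(seq: str) -> float:
    """Return the G+C fraction of a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def select_probes(candidates, min_len=24, max_len=34, target_gc=0.5, tolerance=0.1):
    """Keep candidates whose length and G+C content fall inside the window.

    The length window follows the text; target_gc and tolerance are
    hypothetical parameters, not values reported by the authors.
    """
    selected = {}
    for name, seq in candidates.items():
        length_ok = min_len <= len(seq) <= max_len
        gc_ok = abs(gc_content(seq) - target_gc) <= tolerance
        if length_ok and gc_ok:
            selected[name] = seq
    return selected


# Hypothetical sequences for illustration only (not probes from the array).
candidates = {
    "probe_a": "ATGCGTACCGTTAGCGGATCCATGCA",  # balanced G+C, length in range
    "probe_b": "ATATATATTTAATATATAATTATATA",  # length in range, no G or C
    "probe_c": "GCGCATGC",                    # too short
}
selected = select_probes(candidates)
print(sorted(selected))
```

Filtering on length and G+C content is only a crude proxy for uniform hybridization behavior; a melting-temperature model would be a natural refinement.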

Multiplex linear DNA amplification and labeling for hybridization to prepared ArrayStrips
For multiplex linear DNA amplification, a set of 292 primers (220 serotyping primers and 72 AMR primers, synthesized by Metabion, Martinsried, Germany) was used. These primers are located on the complementary strand, downstream of the sequence of the covalently immobilized oligonucleotide detection probes (the numbers of probes and primers do not need to be identical; a primer can target a consensus region, while probes might bind to more variable parts close by, which allows different alleles of one gene to be discerned). The labeling of the genomic DNA was accomplished during the linear amplification step by using biotin-linked dUTP as a marker, thereby allowing site-specific internal labeling of the corresponding target region (Fig. 1a). Using the HybPlus Kit (Alere Technologies, Germany), at least 0.5 µg genomic DNA was labeled according to the manufacturer's instructions. The linear amplification steps included 5 min of initial denaturation at 96°C, followed by 50 cycles with 20 s of annealing at 50°C, 40 s of elongation at 72°C, and 60 s of denaturation at 96°C. This reaction results in a multitude of specifically amplified, single-stranded, biotin-labeled DNA molecules for subsequent hybridization to the corresponding DNA microarray.

Hybridization of the ArrayStrips
For the hybridization procedures, the HybPlus Kit (Alere Technologies, Germany) was used according to the manufacturer's instructions with an adapted protocol. This included hybridization buffer C1, washing buffer C2, peroxidase-streptavidin conjugate C3, conjugation buffer C4, washing buffer C5 and peroxidase substrate D1. First, ArrayStrips were placed in a thermomixer (Quantifoil Instruments, Jena, Germany) and subsequently washed with 200 µl of de-ionized water for 5 min at 55°C/550 rpm and with 100 µl hybridization buffer C1 for 5 min at 55°C/550 rpm. All liquids were always completely removed with a soft plastic pipette to avoid scratching the chip surface. In a separate tube, 10 µl of the labeled, single-stranded DNA were dissolved in 90 µl hybridization buffer C1. The hybridization was carried out at 55°C, shaking at 550 rpm for 1 h. After hybridization, the ArrayStrips were washed two times for 5 min with 200 µl washing buffer C2 at 45°C, shaking at 550 rpm. Peroxidase-streptavidin conjugate C3 was diluted 1:100 in buffer C4. A total of 100 µl of this mixture was added to each slot of the ArrayStrip and subsequently incubated for 10 min at 30°C and 550 rpm. Afterwards, washing was carried out two times at 550 rpm with 200 µl C5 washing buffer at 30°C, with each step performed for 5 min. The visualization was achieved by adding 100 µl of peroxidase substrate D1 to the ArrayStrips, and signals were detected with the ArrayMate device (Alere Technologies, Jena, Germany) (Fig. 1b-c).
The final protocol described above was obtained by optimizing the hybridization temperature (45°C-58°C) and the washing temperature (45°C-58°C), whereas the concentrations of substances and the incubation periods for each step were kept constant. For this procedure, only strains were used for which published genome sequences (NCBI genome database) allowed a theoretical prediction of the hybridization pattern.

Processing data using PatternMatch algorithm
Hybridization signals were processed using the IconoClust software, version 3.2r1 (Fig. 1d). All spots were normalized automatically by the software according to the equation

$$NI = 1 - \frac{M}{BG}$$
where NI is the normalized intensity, M the average intensity of the automatically recognized spot, and BG the intensity of the local background. The output range of the signals was between 0 and 1, with 0 being negative and 1 being the maximal possible signal value. A probe-matching matrix was used to construct the theoretical hybridization pattern of the fully sequenced strains listed in the NCBI database (Table S1). The theoretical signal intensity was defined as 0.9 for a perfect match, 0.6 for 1 mismatch, 0.3 for 2 mismatches, 0.1 and below for 3 mismatches, and no signal for more mismatches. For each of these sequenced strains, at least one reference strain was used to match the expected pattern against the pattern of the real hybridization experiments. For this operation, the PatternMatch algorithm was used [29]. The final numerical output was given as the matching score (MS), which represents the overall sum of all differences between corresponding signal intensities of theoretical and real hybridization experiments. Thus, the MS value is a measure of overall similarity/dissimilarity between two hybridization patterns. An ideal match of two patterns based on the same set of oligonucleotide probes will yield MS = 0, whereas values above MS = 6.5 require critical scrutiny because they may indicate a poor match. The Delta MS value, defined as the arithmetic difference between the best and second-best match, served as a measure of the accuracy of the identification. A Delta MS higher than 1.5 was considered to be sufficient for an unambiguous distinction between two patterns. Calculation of similarities was carried out by comparing signals for all 255 probes between theoretical predictions and real experiments. Signals with intensities higher than 0.3 were considered positive and set to "1". Signals lower than 0.3 were regarded as negative and set to "0". The number of probe differences was summed and the percentage was calculated.
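The scoring steps described above can be sketched in Python as follows. This is an illustrative reading of the text, not the PatternMatch implementation [29]: it assumes that the "sum of all differences" is a sum of absolute differences, uses 0.1 as the theoretical intensity for 3 mismatches (the text gives "0.1 and below"), clips NI to the stated range of 0 to 1, and treats a signal of exactly 0.3 as negative. All function names are hypothetical.

```python
# Theoretical intensities by mismatch count, as given in the text.
MISMATCH_TO_INTENSITY = {0: 0.9, 1: 0.6, 2: 0.3, 3: 0.1}


def normalized_intensity(spot_mean: float, local_background: float) -> float:
    """NI = 1 - M/BG, clipped to [0, 1] (clipping is an assumption)."""
    ni = 1.0 - spot_mean / local_background
    return min(1.0, max(0.0, ni))


def theoretical_intensity(mismatches: int) -> float:
    """Predicted signal for a probe with the given number of mismatches."""
    return MISMATCH_TO_INTENSITY.get(mismatches, 0.0)


def matching_score(pattern_a, pattern_b) -> float:
    """MS as the sum of absolute per-probe differences (assumed metric)."""
    if len(pattern_a) != len(pattern_b):
        raise ValueError("patterns must cover the same probes")
    return sum(abs(a - b) for a, b in zip(pattern_a, pattern_b))


def delta_ms(scores) -> float:
    """Difference between the second-best and best (lowest) MS."""
    best, second = sorted(scores)[:2]
    return second - best


def binary_similarity(pattern_a, pattern_b, threshold: float = 0.3) -> float:
    """Percentage of probes with the same positive/negative call."""
    calls_a = [s > threshold for s in pattern_a]
    calls_b = [s > threshold for s in pattern_b]
    agree = sum(x == y for x, y in zip(calls_a, calls_b))
    return 100.0 * agree / len(calls_a)


# Hypothetical four-probe example.
measured = [0.85, 0.05, 0.55, 0.0]
predicted = [theoretical_intensity(m) for m in [0, 4, 1, 5]]
ms = matching_score(measured, predicted)
similarity = binary_similarity(measured, predicted)
print(round(ms, 2), similarity)
```

With these conventions, a query pattern would be assigned to the database entry with the lowest MS, and the assignment would be treated as unambiguous when `delta_ms` over all database scores exceeds 1.5.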
In order to assess the reproducibility, eight experiments were performed under identical conditions. All experiments were compared to each other using the PatternMatch algorithm, and the mean, maximum and minimum MS were calculated.

Antimicrobial resistance
All isolates in which AMR genes were detected, a total of 34 Salmonella isolates belonging to 18 serovars, were tested for their phenotypic antimicrobial resistance. This was carried out using the VITEK 2 system with the AST-N111 test panel (bioMérieux Deutschland GmbH, Nürtingen, Germany). Additionally, chloramphenicol (30 µg), kanamycin (30 µg) and streptomycin (10 µg) were tested by disk diffusion assay. This assay was performed according to CLSI guidelines.

Field study
A panel of 105 Salmonella isolates was tested in a blinded field study. All isolates were serotyped using the standard procedures [12] at the National Reference Laboratory for Salmonellosis in Cattle at the Friedrich-Loeffler-Institute (Jena, Germany). The serotyped isolates were subsequently genotyped using the automated PatternMatch software installed on the ArrayMate device (IconoClust version 3.2r1). Finally, serotyping and genotyping results were compared. Within this blinded panel, a mistake was defined as "major" if a serovar was identified completely incorrectly (e.g., a Dublin isolate as Naestved). A mistake was defined as "minor" if the serovar was identified correctly but a variant of this serovar was not recognized (e.g., S. ser. Typhimurium var. Copenhagen as S. ser. Typhimurium). By definition, only "major" mistakes were regarded as incorrect hits.
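The scoring rule defined above can be written down as a short Python sketch. The representation of a result as a (serovar, variant) tuple and the function names are assumptions made for illustration; the example pairs are taken from the definitions in the text plus one hypothetical exact match.

```python
def classify_call(reference, predicted):
    """Compare a serotyping result with a genotyping result.

    Both arguments are (serovar, variant) tuples; variant is None when
    no variant is reported. Returns "match", "minor" or "major".
    """
    ref_serovar, ref_variant = reference
    pred_serovar, pred_variant = predicted
    if ref_serovar != pred_serovar:
        return "major"   # serovar itself is wrong
    if ref_variant != pred_variant:
        return "minor"   # serovar correct, variant not recognized
    return "match"


def concordance(pairs) -> float:
    """Percentage of calls counted as correct (everything except "major")."""
    correct = sum(1 for ref, pred in pairs if classify_call(ref, pred) != "major")
    return 100.0 * correct / len(pairs)


pairs = [
    (("Dublin", None), ("Naestved", None)),                   # major
    (("Typhimurium", "Copenhagen"), ("Typhimurium", None)),   # minor
    (("Infantis", None), ("Infantis", None)),                 # hypothetical match
]
labels = [classify_call(ref, pred) for ref, pred in pairs]
print(labels, round(concordance(pairs), 1))
```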

Verification of the assay and database building for PatternMatch
A set of 168 Salmonella strains representing 132 different serovars was used to evaluate the probes printed on the array and the primers in the labeling mixture, and to build a database for identification of the globally most prominent Salmonella serovars. Comparison of predicted and real hybridization results was performed for strains with fully sequenced genomes (see Materials and Methods and Table 4). The similarity between the predicted and real hybridization results of the serogenotyping array was more than 99 percent (Table 4). Because both the full sequence information of the genome and the antigenic formula of S. ser. Typhimurium strain LT2 were available, an exact comparison of the predicted and the actual experimental hybridization pattern was possible (Fig. 2). It showed 100% identity when only positive and negative signals were considered. A more detailed analysis, also considering signal intensities, showed a high degree of similarity between theoretical predictions and actual experiments, with exceptions at probe hp-3221-wzx_O4 (signal intensity about 42% higher than predicted by the theoretical experiment) and hp-3282-Q8ZK10 (signal intensity about 43% lower than predicted by the theoretical experiment). The highest discrepancy was found for S. ser. Paratyphi A and S. ser. Paratyphi B. Analysis of the results of S. ser. Paratyphi B showed that two probes were negative in actual hybridizations, in contrast to the theoretical predictions (Fig. 2). However, the missing probes were redundant for one target gene (e.g., S. ser. Paratyphi B fliC-H1:b), so that this issue did not influence the identification. Because of the high correlation between theoretical predictions and actual experiments, as well as the high similarity of the melting temperatures (Tm) of all 255 serotyping probes, it is assumed that the detection efficiency with other Salmonella serovars will also be comparably precise under the same conditions.
Furthermore, the results of these theoretical experiments were used to find an optimal protocol (data not shown) for the hybridization of the Salmonella array, so that optimal, stringent hybridization and washing temperatures could be defined (see Materials and Methods).
Using this optimized protocol (as described in Materials and Methods), strains of all 132 Salmonella serovars were analyzed. Each serovar was tested at least three times using the Salmonella array to ensure consistent results and the identification of unique and reproducible serovar-specific probe patterns. These unique patterns were used to build a PatternMatch database consisting of data from real experiments instead of theoretical experiments from defined strains. Manual serotyping using the probe-function table (Table 2) was restricted by the resolution of the probes identifying the H2 phase. This phase was mainly represented by a combination of different probes, e.g., H2:1,5 of different serovars was always a combination of different "FL-1+e,n,x" probes (Table 2). Nevertheless, it was possible to estimate the Salmonella serotype at least for the serogroup and in most cases for the phase H1. In the end, the probe-function table served as a control for classically serotyped Salmonella before they were used in the PatternMatch database.

Detection software
Using the described PatternMatch module, a software package was developed to analyze Salmonella serovars at the ArrayMate device directly after scanning and calculating signals of the stained arrays (IconoClust software version 3.2r1) (Fig. 1e). The detection software used the same database comprising 168 reference Salmonella strains (representing 132 Salmonella serovars) which were classically serotyped. Patterns of unknown Salmonella were compared to the whole database and the two best hits were given in a result sheet (Fig. 3). Prior to PatternMatching, all calculated signals were normalized to a range between 0 and 1. Briefly, the mean of valid signals was calculated and subsequently the formula

$$S_n = \frac{S_m - \min}{\max - \min}$$

(S_n = normalized signal, S_m = mean of signal, min = minimum of all signals, max = maximum of all signals) was used to normalize the mean of valid values.
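The normalization step can be illustrated with a brief Python sketch. It assumes that "valid signals" are the replicate spots of one probe that were not flagged as invalid (represented here as `None`) and that the minimum and maximum are taken over the per-probe means; the text does not spell out either detail, and the function name and data are hypothetical.

```python
def normalize_probe_signals(replicates):
    """Min-max normalize the mean of valid replicate signals per probe.

    replicates maps a probe name to a list of spot signals; invalid spots
    are given as None (an assumed convention).
    """
    means = {}
    for probe, values in replicates.items():
        valid = [v for v in values if v is not None]
        if not valid:
            raise ValueError(f"no valid signal for probe {probe}")
        means[probe] = sum(valid) / len(valid)

    lo, hi = min(means.values()), max(means.values())
    if hi == lo:
        # A flat pattern carries no information; map everything to 0.
        return {probe: 0.0 for probe in means}
    return {probe: (m - lo) / (hi - lo) for probe, m in means.items()}


# Hypothetical weak experiment: raw means span only 0.11 to 0.30.
replicates = {
    "p1": [0.10, 0.12],
    "p2": [0.30, None],   # one spot flagged as invalid
    "p3": [0.20, 0.22],
}
normalized = normalize_probe_signals(replicates)
print({probe: round(value, 3) for probe, value in normalized.items()})
```

In this example the strongest probe is rescaled to 1 and the weakest to 0, which shows how an experiment with low overall intensity can still be compared with database patterns on the same 0 to 1 scale.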
Due to the normalization procedure, experiments with very low signal intensity could also be analyzed and subsequently compared with the database. This method guaranteed a correct assignment to the reference pattern within the provided database. Furthermore, different parameters were requested by the software: a) two biotin marker spots as positive staining controls, b) spotting buffer as a negative control and c) a marker for detection of Salmonella. These results were included in the result sheet (Fig. 3). Additionally, the report contains the genotyping results of all AMR genes. The software tool was evaluated using all reference strains included in the database. All 168 reference strains were identified correctly, even if the experiment showed weak signals (data not presented here). A multiple PatternMatch analysis of eight identical hybridization experiments with the same genomic DNA isolated from S. ser. Typhimurium DSM5569 showed a mean matching score (MS) of 2.12 ± 0.65 with a maximum MS of 3.32 and a minimum MS of 1.15. The mean and maximum MS were significantly (t-test, p < 0.05) lower than the MS value for poor matches (MS ≥ 6.5). These results showed the high reproducibility of the assay described in this study.

Antibiotic resistance
In a panel of 34 Salmonella strains, 26 different AMR genes were detected and subsequently compared with the AMR phenotype of these strains (Table 5, detailed view in Table S2). A high correlation was observed for all detected genes relating to the AMR phenotype.
An extended-spectrum beta-lactamase (ESBL) gene, ctxM1, was detected once, in an isolate of S. ser. Anatum AMR07. This strain was resistant to ceftazidime and cefpodoxime, both third-generation beta-lactams.
AMR phenotypes for which no corresponding AMR genotype was detected included streptomycin resistance in two isolates (S. ser. Saintpaul and S. ser. 1,4,[5],12:i:-) and ampicillin resistance in one S. ser. Bredeney isolate. The latter isolate yielded a positive signal in a nitrocefin assay (BBL DrySlide Nitrocefin, Becton Dickinson).
No assessment was possible for the resistance genes sul1 and sul2, which should cause isolated resistance to sulfonamides, because testing was performed for co-trimoxazole only. A gene mediating erythromycin resistance, mphA, was found in S. ser. Anatum strain AMR05 (Table S2), but erythromycin susceptibility was not tested with the panel for Gram-negatives on the VITEK 2 system.

Figure 3. Result sheets of the PatternMatch software as generated by the ArrayMate™ Reader. General information: sample ID, negative/positive staining control and marker detecting genus Salmonella (invA, galf, manC). Serotyping assignment section: two best hits according to the processed sample were given as matching scores (MS) and as percentage. Antibiotic resistance and virulence genotyping section: detectable resistance and virulence genes. The pattern looks slightly different because of positive probes for resistance genes (aadA1, aadA2, cmlA1, dfrA12, sul3, tem1, tetB) that usually are located on mobile genetic elements. doi:10.1371/journal.pone.0046489.g003

Table 5. Comparison of antimicrobial resistance (AMR) genotype and AMR phenotype.

Field study
After assay verification, the Salmonella serogenotyping assay was used to identify a field panel of 105 Salmonella isolates (Table 6) sampled and serotyped by the National Reference Laboratory for Salmonellosis in Cattle at the Friedrich-Loeffler-Institute (FLI, Jena, Germany). All tested isolates were identified as Salmonella and, out of 105 isolates, 93 were typed correctly (88.6%, Table 6). The limitation of the current assay was that certain strains yielded identical patterns on the current array, thus prohibiting further differentiation (Table 6). Such limitations occurred for S. ser. Enteritidis, which currently cannot be discriminated from S. ser. Nitra and S. ser. Blegdam. Furthermore, the pattern of S. ser. Dublin was identical to those of S. serovars Naestved, Moscow and Kiel. A discrimination of S. ser. Dublin and S. ser. Kiel was impossible, as probes representing the genes SeD_A1100, SeD_A1101 and SeD_A1102 were positive for both serovars. Similar limitations were also observed for S. ser. Panama (identical pattern as S. ser. Koessen), S. ser. Indiana (identical pattern as S. ser. Kiambu) and S. ser. Senftenberg (identical pattern as S. ser. Westhampton). A monophasic S. ser. Typhimurium isolate (1,4,[5],12:i:-) was identified correctly. Salmonella ser. Typhimurium var. Copenhagen (1,4,12:i:1,2) was assigned to S. ser. Typhimurium (1,4,[5],12:i:1,2) and a rough form of S. ser. Infantis was assigned to non-rough S. ser. Infantis. These limitations were evaluated as minor mistakes and subsequently regarded as correct hits.

Discussion
The microarray for Salmonella serogenotyping was validated against the gold standard and was evaluated as an economical, fast, accurate and easy-to-use diagnostic tool with a high potential for standardization and automated high-throughput use. For identification of Salmonella using serogenotyping assays, several studies have already been published [13,19,20,21]. The results of these publications showed a high correlation of genotypic and phenotypic characterizations for the genus Salmonella. Similar studies serogenotyping Escherichia coli [6,30,31] or Chlamydia [29,32] also found a direct correlation of geno- and phenotype.
For Salmonella, at least four genes seem to be significant for specification of the genotype: wzx and wzy specify the O-serogroup, and the genes fliC and fljB specify the H antigens. To improve the correlation of geno- and phenotype, we analyzed fully sequenced Salmonella strains (Table 4) using theoretical hybridization with all probes on the microarray. The result was a similarity of over 99% between the phenotype represented by the antigenic formula and the genotype represented by the microarray-based assay. Within the panel of theoretical reference experiments, strain S. ser. Typhimurium LT2 was the only one which was both fully sequenced and classically serotyped. Therefore, it was possible to compare the genotype represented through the NCBI database entry (NC_003197.1) with our theoretical experiments and subsequently with the real experiments using the same strain, S. ser. Typhimurium LT2. Theoretical and real experiments had a concordance of 100%. Even a closer look at the signal-mismatch prediction from theoretical experiments showed a good correlation with the real experiment (Fig. 2). Only two probes showed signal strengths that differed from the results predicted by the theoretical experiment. Such discrepancies may occur due to secondary structure of the amplicon, which decreases the binding strength to the probes. Finally, due to the high correlation between theoretical and real experiments, we conclude that the described serogenotyping assay will also correctly detect other Salmonella serovars.

Table 5 (continued). a, only co-trimoxazole was tested; sulfamethoxazole was not available. b, potential resistance against erythromycin was not tested. c, AMR phenotype without AMR genotype was detected. All AMR genes detected in 34 Salmonella strains are listed and compared with the AMR phenotype. The phenotype was defined using a VITEK 2 system with an AST-N111 panel and a disk diffusion assay using Oxoid discs with chloramphenicol (30 µg), streptomycin (10 µg) and kanamycin (30 µg).

A similar approach to the system described in this study was published by Franklin and colleagues in 2011 [13]. In that study, the labeling procedure is divided into two principal steps. The first step is a multiplex PCR targeting several regions of the Salmonella genome which are important for serogenotyping (i.e., wzy, wzx, fliC, fljB). In the second step, short SSELO primers (sequence-specific end labeling of oligonucleotides, [33]) bind specifically to the target regions in the multiplex PCR products and are elongated with single biotin-labeled dideoxynucleoside triphosphate molecules (biotin-ddCTP, biotin-ddGTP and biotin-ddUTP). The incorporation of these molecules also terminates any further elongation. During the microarray hybridization, the 3′-end-labeled SSELO primers bind specifically to the complementary probes that are covalently attached to the array. Advantages of this method are the ability to recognize SNPs (single nucleotide polymorphisms) and the potential to use DNA samples with low concentrations, i.e., DNA isolated directly from a field sample. However, a requirement for typing field samples is that the sample material contains only a single Salmonella serovar. For field samples containing more than one Salmonella strain, the serogenotype cannot be identified.
Additionally, the labeling procedure is labor-intensive and complex, and a target update would require, besides the inclusion of new multiplex PCR primers, the introduction of new SSELO primers for detection on the microarray. With regard to the permanently increasing number of available sequences, the difficulties of updating and expanding this assay might pose a limitation for this concept.
A more recent method to identify Salmonella is a system using a microsphere-based liquid array [19,20]. This method uses a set of beads which are coupled with probes for one attribute within the antigenic formula of Salmonella serovars. While the method is highly sensitive and specific, a multitude of different beads is required for every attribute within the antigenic formula (e.g., O-antigen). Therefore, at least three reactions have to be performed before obtaining the antigenic formula. A drawback of the method is the multiplex PCR used to amplify short DNA fragments which are then hybridized to the probes on the beads. Due to the inherent disadvantages of any multiplex PCR [34,35], the options are limited for a further expansion of the assay beyond the serovars it currently recognizes.
The described microarray-based serogenotyping assay for Salmonella overcomes most of these bottlenecks. It is easy to use, readily expandable and offers fully automated data analysis, making it an attractive platform for widespread application. The multiplex primer extension reaction used for labeling is highly specific but exhibits low sensitivity due to linear (non-exponential) amplification. However, for typing colony material of a fast-growing organism such as Salmonella, this is not an issue. The use of colony material instead of original field samples makes it possible both to obtain the necessary amount of DNA and to ensure the purity and clonality of the cultures to be genotyped. Moreover, the limited amplification can prove to be an advantage under routine conditions, as the assay becomes less susceptible to contamination. With a classic multiplex PCR, the sensitivity is very high, but contaminants will also be amplified to a detectable level because of the near-exponential kinetics of PCR. This might cause difficulties in high-throughput routine laboratories.
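The contrast between linear and near-exponential kinetics can be illustrated with a small numerical sketch. It assumes idealized reactions (one new strand per template per cycle for primer extension, a doubling per cycle for PCR at full efficiency); the function names, the efficiency parameter and all cycle and copy numbers are illustrative assumptions and are not taken from the assay protocol.

```python
def linear_copies(templates: float, cycles: int, efficiency: float = 1.0) -> float:
    """Labeled strands after linear primer extension.

    Idealized assumption: each template yields `efficiency` new strands per cycle,
    and the products themselves are not templates for further extension.
    """
    return templates * cycles * efficiency


def exponential_copies(templates: float, cycles: int, efficiency: float = 1.0) -> float:
    """Amplicons after idealized PCR.

    Idealized assumption: the product pool grows by a factor (1 + efficiency)
    in every cycle, i.e. it doubles when efficiency is 1.0.
    """
    return templates * (1.0 + efficiency) ** cycles


# Hypothetical example values: 10 contaminating template copies, 40 cycles.
contaminant_linear = linear_copies(10, 40)
contaminant_pcr = exponential_copies(10, 40)
```

Under these assumptions, ten contaminating template copies yield about 400 labeled strands after 40 linear cycles, whereas idealized PCR would amplify the same input roughly 10^12-fold.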
In our approach, primers and their respective probe binding sites are very close to each other. Secondary structures (e.g., hairpins) are less likely to form in short fragments than in long ones, and this may increase signal intensity. Additionally, the use of single-stranded DNA prevents competition between the probe and the antisense strand and increases the probability that the single-stranded amplicon binds to the probe. Labeling methods using biotin attached to primers have often been used [36], but we assumed that false positive signals would occur more often with that approach, because the biotin-labeled primers, which are present at relatively high concentrations, can cross-hybridize. In this study, biotin-labeled dUTP was used for internal labeling of the single-stranded amplicons. This method prevents false positive signals caused by unused primers binding to unoccupied probes. Another significant advantage of the described serogenotyping method is that all components are economical and available ready to use, even at large scale. For DNA isolation, we used standard DNA isolation kits from Roche or Qiagen. Furthermore, it is conceivable that the crude cell extract could be used directly with this assay after heating at 100 °C and RNase treatment (single-stranded RNA may trap primers used in the multiplex linear DNA amplification and thereby decrease assay sensitivity). All substances for the linear multiplex PCR and the labeling process are available as the HybPlus Kit (Alere Technologies, Germany). Due to the standardized availability of all components, this method can immediately be used for routine serogenotyping of Salmonella. Up to 96 samples can be analyzed simultaneously. So far, a limitation of the serogenotyping assay is its inability to discriminate between serogroup A (O:2) and serogroup D1 (O:9). This is due to the high sequence similarity within the rfb region between strains of both serogroups. 
Within the genome of serogroup A strains, the rfb region has been shown to be a minor modification of a serogroup D1 rfb region; it has a frameshift mutation that inactivates tyv, a sugar biosynthesis gene required for the biosynthesis of tyvelose [37]. Serogroup D1 strains have tyvelose as their O-antigen side chain sugar, whereas serogroup A strains have paratose, the substrate for tyvelose, as their side chain sugar. Thus, a small genetic change is responsible for a substantial O-antigen difference. Additional probes, including lygA, lygD, sefA, sefB and sefC, which had only been described for serovar Enteritidis [38,39], also gave positive signals for serovar Nitra. Additionally, S. ser. Blegdam (O9:g,m,q:-) showed an identical pattern on the microarray, but in this case the antigenic formula is highly similar to that of S. ser. Enteritidis (O9:g,m:-). This result shows how closely related these serovars are to each other. A similar observation between serogroups A and D1 was made for the serovars Dublin (O9:g,p:-) and Kiel (O2:g,p:-), where additional probes for SeD_A1100, SeD_A1101 and SeD_A1102 were also positive with serovar Kiel. This observation may indicate a close relationship between these two serovars. Furthermore, we assume a high genome sequence similarity between Panama (O9:l,v:1,5) and Koessen (O2:l,v:1,5), as their microarray patterns were also identical. Paratyphi A could be unambiguously identified due to the probes for the intergenic region SSPAI. Knowledge of the genotype of these serovars raises a question: is there a need to differentiate between serogroup D1 and A, or between g,m and g,q/g,m,q (both mutation variants of g,m)? Or should the Kauffmann-White Scheme be updated based on our knowledge of Salmonella genomics and the current role of serotyping (e.g., as a first-line typing method prior to modern molecular methods)? 
Nevertheless, a future version of the assay aims to include probes to discriminate such important zoonotic pathogens as S. ser. Enteritidis and S. ser. Dublin. For this purpose, genome databases (e.g., NCBI) are regularly screened for new Salmonella sequences.
In addition, all isolates (n = 34) in which resistance genes were found were also tested phenotypically. A clear correlation between the AMR genotype detected by the Salmonella array and the AMR phenotype was observed (Table 5, Table S2) for most genes. However, the genes sul1 and sul2 are known to mediate resistance to sulfamethoxazole, but in this study only co-trimoxazole, a mixture comprising trimethoprim and sulfamethoxazole, was tested. Strains that harbored only sul genes were found to be susceptible when tested with co-trimoxazole. Resistance to co-trimoxazole was associated with the presence of a sul gene (sul1, sul2 or sul3) combined with a dfr gene (dfrA1, dfrA12, dfrA13, dfrA14, dfrA15 or dfrV). Similar correlations between AMR pheno- and genotype were found in other studies using microarrays for E. coli and Salmonella [40,41]. In these studies, all probes and primers used in this study were already evaluated by testing phenotypic antimicrobial resistance in parallel.
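The association between sul/dfr genotype and co-trimoxazole phenotype described above can be written as a simple rule. The following is a minimal sketch: the gene names are those listed in the text, whereas the function name and the input format (an iterable of detected gene names) are assumptions made for illustration and are not part of the assay software.

```python
# Gene names as listed in the text.
SUL_GENES = {"sul1", "sul2", "sul3"}
DFR_GENES = {"dfrA1", "dfrA12", "dfrA13", "dfrA14", "dfrA15", "dfrV"}


def predicts_cotrimoxazole_resistance(detected_genes) -> bool:
    """Return True if at least one sul gene AND at least one dfr gene were detected.

    This encodes only the association reported in the text. Isolates carrying
    sul genes alone were reported as susceptible; isolates carrying a dfr gene
    alone are not discussed there, and returning False for them is an assumption.
    """
    genes = set(detected_genes)
    return bool(genes & SUL_GENES) and bool(genes & DFR_GENES)
```

For example, a hypothetical isolate with `sul1` and `dfrA1` would be predicted resistant, while one carrying only `sul2` would not.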
Another interesting result was the detection of an extended-spectrum beta-lactamase (ESBL) gene, ctxM1, in S. ser. Anatum strain AMR07 (Table S2). This gene has previously been described several times for Salmonella species [42,43]. In one isolate, ampicillin resistance and a positive nitrocefin test were observed despite a negative genotyping result. Two explanations are possible for this observation: (a) the probe/primer combination of our assay did not bind to a known beta-lactamase gene due to random mutation(s) within the binding site of either the primer or the probe, or (b) the isolate carried a beta-lactamase gene which had not yet been described for Salmonella and which was therefore not included in the Salmonella assay. Regarding the unexplained streptomycin resistances, it might be speculated that aminoglycoside-modifying enzymes whose genes were not covered by the array, or efflux systems as known from other bacteria [44], played a role.
The study, with a panel of 105 different Salmonella isolates, demonstrates the high correlation between genotype classification by the serogenotyping assay and phenotype representation by the classical serotyping method. Approximately 90% of the strains were correctly identified (Table 6). The accuracy depends strongly on the underlying database used by the PatternMatch software, but any unknown serovar encountered can easily be included in the database and software upon conventional identification. The main drawback of the assay turned out to be the similarity of patterns between different Salmonella serovars, as described above. With the primer-probe design used here, we were not able to discriminate between these problematic strains, mainly because of the high sequence similarity in the target genes of serogroup A and D1. However, all these ambiguous strains are very rare in a clinical environment, each being reported fewer than 10 times worldwide during the last 10 years ([1], www.cdc.gov/ncezid/dfwed/PDFs/SalmonellaAnnualSummaryTables2009.pdf), and additional probes can easily be introduced should a need arise, or should new sequence information become available.
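How a pattern database handles serovars that share a hybridization pattern can be sketched as follows. This is not the PatternMatch algorithm, which is not described here; it is a minimal illustration assuming exact matching on the set of positive probes. The probe labels and function names are hypothetical, and only the grouping of serovars (Enteritidis and Nitra sharing a pattern) is taken from the text.

```python
from collections import defaultdict


def build_database(reference_strains):
    """Group serovar names by the set of positive probes of each reference strain."""
    database = defaultdict(set)
    for serovar, positive_probes in reference_strains:
        database[frozenset(positive_probes)].add(serovar)
    return database


def match_pattern(database, positive_probes):
    """Return the sorted serovars whose reference pattern equals the query pattern.

    An empty list means the pattern is not in the database; more than one
    name means the pattern is shared and cannot be resolved by the array alone.
    """
    return sorted(database.get(frozenset(positive_probes), set()))


# Hypothetical probe labels, chosen only for illustration.
REFERENCE = [
    ("Enteritidis", {"rfb_A_D1", "H_g_m", "sefA"}),
    ("Nitra", {"rfb_A_D1", "H_g_m", "sefA"}),
    ("Typhimurium", {"O4", "H_i", "H_1_2"}),
]

database = build_database(REFERENCE)
candidates = match_pattern(database, {"sefA", "H_g_m", "rfb_A_D1"})
```

In this sketch a shared pattern returns every candidate serovar rather than a single call, and an isolate with a pattern absent from the database returns no candidate until its pattern is added after conventional identification.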
Due to the absence of a probe which can determine the genetic loci of the O:5 epitope, the isolate S. ser. Typhimurium var. Copenhagen, which is O:5-negative by serotyping, was identified as S. ser. Typhimurium; this minor mistake was regarded as a correct hit.
Another minor limitation is that R-forms (rough forms) cannot be identified using the current array, as observed in one isolate of Infantis. R-forms mainly result from mutations of genes within the lipopolysaccharide core [45]. Mutations within the genes rfa (glycosyltransferase), galE (UDP-galactose epimerase), or galF (UDP-glucose pyrophosphorylase) can interrupt the biosynthesis of the lipopolysaccharides. No probes detecting such mutations were included on the array, and the failure to identify R-forms was regarded as a minor issue.
The described assay for serogenotyping is the basis for a fast method to identify Salmonella serovars. We believe that the use of this assay in a routine laboratory setting is warranted due to the high correlation between serotype and genotype. An advantage of the genotype as the basis for serovar identification is that phenotypic differences (e.g., R-forms that are difficult to analyze by classical serology) play no role. Furthermore, the serogenotyping assay could be used worldwide, including in areas where antisera are not available. In such areas, a Salmonella infection in livestock or Salmonella contamination in food could be identified very quickly. Salmonella outbreaks could consequently be traced back to their origin. This microarray-based assay is a powerful tool for epidemiological studies, as many samples can be analyzed rapidly and in parallel. For such cases, a point-of-care application represents an ideal standard. During an outbreak situation, this assay could be extremely helpful to identify the outbreak isolate, including its AMR genotype, within hours after a clonal culture has been obtained. Finally, an interlaboratory comparison in cooperation with several international reference centers will follow in the near future.

Supporting Information
Table S1 Probe-matching matrix used to construct the theoretical hybridization pattern of the fully sequenced strains listed in the NCBI database. (XLSX)