Diversity of SCCmec Elements in Staphylococcus aureus as Observed in South-Eastern Germany

SCCmec elements are important mobile genetic elements in staphylococci that carry the beta-lactam resistance genes mecA/mecC, recombinase genes and a variety of accessory genes. Twelve main types and several variants have been described so far. In addition, there are other SCC elements harbouring other markers. In order to subtype strains of methicillin-resistant S. aureus (MRSA) based on variations within their SCCmec elements, 86 markers were selected from published SCC sequences for an assay based on multiplexed primer extension reactions followed by hybridisation to specific probes. These included mecA/mecC, fusC, regulatory genes, recombinase genes, genes from ACME and heavy metal resistance loci as well as several genes of unknown function. Hybridisation patterns for published genome or SCC sequences were theoretically predicted. For validation of the microarray-based assay and for optimisation of a stringent hybridisation protocol, real hybridisation experiments with fully sequenced reference strains were performed, and protocols were modified until the results were in concordance with the theoretical predictions. Subsequently, 226 clinical isolates from two hospitals in the city of Dresden, Germany, were characterised in detail. Besides previously described types and subtypes, a wide variety of additional SCC types or subtypes and pseudoSCC elements were observed, as well as numerous composite elements. Within the study collection, 61 different such elements were identified. Since hybridisation cannot recognise the localisation of target genes, gene duplications or inversions, this is a rather conservative estimate. Interestingly, some widespread epidemic strains encompass distinct variants with different SCCmec subtypes. Notable examples are ST239-MRSA-III, CC5-, CC22-, CC30-, and CC45-MRSA-IV or CC398-MRSA-V. Conversely, identical SCC elements were observed in different strains, with SCCmec IVa being spread among the highest number of clonal complexes.
The proposed microarray can help to distinguish isolates that appear similar or identical by other typing methods, and it can be used as a high-throughput screening tool for the detection of putative new SCC types or variants that warrant further investigation and sequencing. The high degree of diversity of SCC elements even within so-called strains could be helpful for epidemiological typing. It also raises the question of the scale and speed of the evolution of SCC elements.


Introduction
Methicillin-resistant Staphylococcus aureus (MRSA) is one of the major pathogens in hospitals and the community. MRSA is resistant not only to methicillin (which serves as an indicator for this phenotype) but to all beta-lactam antibiotics, with two recently developed compounds (ceftobiprole and ceftaroline) being notable exceptions. Resistance is caused by a modified penicillin binding protein, PBP2a, which is encoded by alleles [1] of the gene mecA. In 2011, a second gene, mecC, was discovered that also causes methicillin/beta-lactam resistance [2,3]. Both genes are situated on large, potentially mobile genetic elements, so-called SCCmec elements (staphylococcal cassette chromosome mec). These elements also harbour regulatory genes, recombinase genes and a variety of accessory genes. Twelve different types of SCCmec elements have so far been described ([4,5,6,7,8]; http://www.sccmec.org/Pages/SCC_TypesEN.html). Their nomenclature relies on the identity of the mec complex, i.e., the immediate surroundings of mecA, including its regulatory genes, and on the identity of the recombinase gene (ccr) complex [6]. Furthermore, there are the so-called J-regions ("joining" or "junkyard" regions) that might include a variety of other genes, including additional resistance or virulence determinants. Due to variations within the J-regions, some SCCmec types can be further differentiated into subtypes.
In addition to SCCmec elements, a variety of different SCC elements have been described and/or sequenced that might lack the mecA/mecC genes but that carry a fusidic acid resistance marker, fusC [9], various heavy metal resistance genes or other loci such as the arginine catabolic mobile element (ACME) or a high-affinity ATP-driven potassium transport system catalysing the hydrolysis of ATP coupled with the exchange of hydrogen and potassium ions (kdp locus). Their presence suggests that SCC elements, as a system facilitating horizontal gene transfer between staphylococci, predate the emergence of SCCmec elements, and that mecA/C genes could be regarded as just one "payload" for SCC elements among others.
Since MRSA are associated with high morbidity and mortality, rapid molecular tests without culture would be useful for infection control and timely guidance of treatment. SCC elements and mecA can also be found in other, clinically less relevant staphylococci, so that a mere PCR for the detection of mecA from a patient sample is not sufficient to diagnose the presence of MRSA. Additional markers need to be detected to prove that mecA is present in S. aureus and thus to ensure discrimination of MRSA from possibly colonising methicillin-resistant "coagulase-negatives". Integration sites of SCCmec elements can be targeted for that purpose by designing molecular tests in which one primer detects species-specific sequences within the core genome while the other one targets a primer-binding site within the SCCmec element. However, it is necessary to identify and cover all relevant alleles of a potential primer-binding site in order to avoid false negatives. The constant evolution of MRSA and the emergence and/or geographic spread of new strains that could displace and marginalise previously epidemic strains require close monitoring of these trends and a constant adaptation of molecular tests, because otherwise a decreasing performance of these tests is to be expected.
Another reason for studying the variability of SCCmec elements could be their use for high-resolution typing purposes. Many strains that share more or less the same core genome (and thus yield identical spa and MLST types) differ in their SCCmec elements. Detecting more SCCmec-related markers could allow a higher degree of discrimination and might be helpful especially for subtyping abundant and widespread strains.
For these reasons, a DNA hybridisation array was designed that, in addition to a previously characterised system, facilitates detection of a total of 83 SCCmec-related markers that were previously shown to be situated in SCCmec elements. It was used for validation with reference strains of known genome sequences as well as for characterisation of a collection of clinical isolates collected within 15 years at two hospitals in the city of Dresden, Germany.

Strain collection
The study was performed at a tertiary care hospital in Dresden, Saxony, i.e., in South-Eastern Germany. The hospital has approximately 1,200 beds and treats 57,000 in-patients per year (https://www.uniklinikum-dresden.de/de/das-klinikum/jahresberichte/). Isolates were collected routinely from intensive care units, diabetological or surgical wards, suspected transmissions or because of symptoms suggesting PVL-associated disease [10]. Approximately 1,300 isolates collected between 2000 and 2015/2016 were thus characterised using the previously described [11,12] arrays allowing assignment to clonal complexes, epidemic strains and main SCCmec types (Table 1). Additional isolates were obtained from another, secondary care hospital in the same city (http://www.khdn.de/). Here, no systematic typing was performed. Isolates were collected because of conspicuous susceptibility tests, clinical conditions or travel history, and they were typed using the same methods [11,12].
A subset of 226 isolates from both sites was selected for SCCmec subtyping, aiming at a high diversity of isolates (see S1 Table). For that purpose, epidemiologically linked or consecutive isolates from a single patient were excluded, while isolates from different years and different wards were prioritised, as were isolates that differed in carriage of additional resistance or toxin genes.

Bacteriological procedures
MRSA isolates passed through standard clinical routine diagnostics. After primary culture and subculturing of single colonies, clumping factor was detected utilising the Pastorex StaphPlus kit (Bio-Rad Laboratories GmbH, Munich, Germany). Antibiotic susceptibility tests were performed by VITEK 1 or VITEK 2 systems (BioMerieux, Nuertingen, Germany). Methicillin resistance was confirmed by detection of PBP2a using the Innogenetics MRSA-screen agglutination assay (Innogenetics, Ghent, Belgium). Isolates were stored frozen using cryobank tubes (Microbank, Pro-Lab Diagnostics, Richmond Hill, Canada) at -80°C. Only one isolate per patient was considered.

Linear DNA amplification, labelling and array procedures
An initial characterisation of the isolates was performed using StaphyType DNA microarrays (Alere Technologies GmbH, Jena, Germany). This array covers 333 different targets that correspond to approximately 170 distinct genes and their allelic variants. These genes include species and typing markers, toxin genes and resistance genes. Detailed descriptions have been published previously [11,12,13]. This array also covers several SCC-associated markers such as mecA, mecC, fusC, recombinase genes etc. that are listed in Table 2 as well as in S2 Table.
A further characterisation of SCCmec elements of selected isolates (see above) was performed using probes and primers for new targets also listed in Table 2. Criteria for the selection of these target genes are discussed below, in the Results section. Probes and primers for SCC-related targets are described in S2 Table. The procedures for all array experiments were identical and they have been described previously [11,12]. S. aureus was cultured and cloned on Columbia blood agar plates, harvested and enzymatically lysed. DNA was purified using Qiagen spin columns. A linear amplification was performed using one specific primer per target. Biotin-16-dUTP was randomly incorporated into the amplicons during that step. After incubation with the array and after washing steps, hybridisation to probes immobilised on the array was detected using streptavidin-horseradish peroxidase, which catalyses a local precipitation of a dye. Microarrays were then photographed and analysed using a designated reader and software (Iconoclust, Alere Technologies GmbH, Jena, Germany). This allowed establishing the presence or absence of certain genes or alleles as well as, by automated comparison of resulting patterns to a database, an assignment to clonal complexes, strains and SCCmec types.
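The automated comparison of an observed hybridisation pattern to a database of reference patterns can be pictured as a nearest-profile search. The following is a minimal sketch only and not the algorithm of the reader software, which is not described here. It assumes that each pattern is reduced to the set of markers scored positive and that the best match is the reference with the fewest discordant markers. The function names, reference labels and marker names are hypothetical.

```python
from typing import Dict, Set, Tuple


def discordant_markers(observed: Set[str], reference: Set[str]) -> Set[str]:
    """Markers that are positive in exactly one of the two patterns."""
    return observed ^ reference


def closest_reference(observed: Set[str],
                      database: Dict[str, Set[str]]) -> Tuple[str, int]:
    """Return (label, number of discordant markers) for the best match.

    Ties are broken alphabetically by label (an arbitrary, assumed rule).
    """
    scored = [(len(discordant_markers(observed, ref)), label)
              for label, ref in database.items()]
    n_discordant, label = min(scored)
    return label, n_discordant


# Hypothetical reference patterns and observed pattern, for illustration only.
database = {
    "reference_A": {"marker_1", "marker_2", "marker_3", "marker_4"},
    "reference_B": {"marker_1", "marker_5", "marker_6"},
}
observed = {"marker_1", "marker_2", "marker_3"}
best_label, n_discordant = closest_reference(observed, database)
print(best_label, n_discordant)
```

A pattern with a non-zero number of discordant markers against every reference would, in this picture, correspond to a putative unknown variant that warrants closer inspection.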

Virtual hybridisations
For comparison of real-life experiments with published genome sequences, a computer-based method for predicting DNA array hybridization patterns from full genome sequences was used. Predicted patterns were generated either from fully finished genomic sequences (all gaps closed) or from partially assembled sequences as typically obtained from next generation sequencing (NGS). A large number of partially assembled sequences of staphylococci is available in the WGS section of NCBI GenBank (http://www.ncbi.nlm.nih.gov/Traces/wgs/). The computational method identified the binding sites of the hybridisation probes in the genomic sequences. For simplicity, only the probe binding sites were determined, while the binding sites of the labelling primers were not considered. If more than one binding site was found, only the one with the highest number of matches between probe and target was taken into account. The number of mismatches between probe and target sequence was used to predict the strength of the normalized hybridization signal. Perfect matches (i.e., no mismatches) were set to the maximum signal, while four or more mismatches were set to no signal at all. One mismatch yielded a slightly attenuated maximum signal, two mismatches yielded half of the maximal signal, and three mismatches yielded a weak signal which was set slightly above the noise level. The computational method thus resulted in datasets comparable to those from real experiments. This approach was validated by comparing experimental to predicted data of fully sequenced, well known reference strains (such as MSSA476, GenBank BX571857; N315, GenBank BA000018;  [14]; see S1 Table). Several predicted hybridisation patterns matched experimental results of clinical isolates characterised herein (see Table 3 and S1 Table). However, some isolates were also observed that yielded patterns for which no matching sequence could be identified (see Table 3 and S1 Table).

Table 1 (fragment and footnotes). SCCmec VT+czrC (as in SO385, but ydhK-negative)* Since routine typing of MSSA is not performed, no reliable data on the prevalence of this strain can be provided. However, based on data from other regional studies that included MSSA [32,33,34,35], it appears not to be common locally. ˚ This strain was only identified in sporadic cases from Dresden Neustadt Hospital (where no systematic typing was performed). The absence from Dresden University Hospital indicates that it is either generally very rare in Saxony, and/or that infections might be associated with travel and thus randomly detected. ˚˚ This strain was accidentally detected in one healthy carrier, not in a patient. Thus it is not included in the routine typing figures. ˚˚˚ This strain was found once in an imported case tested for diagnostic purposes. * Unknown variant, no matching sequence identified among published genome or SCC sequences. For details see Table 3.

Table 2 (fragment). arcD-SCC: arginine/ornithine antiporter. Part of ACME 1 and ACME 2 clusters, which occur alone or in combination with SCCmec elements.
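The mismatch-based signal prediction described for the virtual hybridisations can be illustrated with a short sketch. It is not the authors' implementation. It assumes ungapped matching on one strand only, and the numeric values 0.9 (one mismatch) and 0.1 (three mismatches) are placeholders chosen here for "slightly attenuated" and "slightly above noise"; only the maximum, the half-maximal value for two mismatches and the zero signal for four or more mismatches are taken from the text. All function names are hypothetical.

```python
from typing import Optional

# Normalised signal per mismatch count. The ordering and the value for two
# mismatches (half of maximum) follow the text; 0.9 and 0.1 are assumed
# placeholders for "slightly attenuated" and "slightly above noise".
SIGNAL_BY_MISMATCHES = {0: 1.0, 1: 0.9, 2: 0.5, 3: 0.1}


def count_mismatches(probe: str, window: str) -> int:
    """Number of positions at which probe and an equally long window differ."""
    return sum(1 for p, w in zip(probe, window) if p != w)


def best_site_mismatches(probe: str, sequence: str) -> Optional[int]:
    """Lowest mismatch count over all ungapped windows of the sequence.

    Returns None if the probe is empty or longer than the sequence.
    """
    n = len(probe)
    if n == 0 or len(sequence) < n:
        return None
    return min(count_mismatches(probe, sequence[i:i + n])
               for i in range(len(sequence) - n + 1))


def predicted_signal(probe: str, sequence: str) -> float:
    """Predicted normalised signal, using only the best binding site."""
    best = best_site_mismatches(probe, sequence)
    if best is None:
        return 0.0
    # Four or more mismatches are not in the table and yield no signal.
    return SIGNAL_BY_MISMATCHES.get(best, 0.0)
```

Applying `predicted_signal` to every probe of the array against one genome sequence would yield a predicted pattern that can be compared with an experimental one. The brute-force scan is quadratic and only meant to make the rule explicit; a real tool would use an indexed search and consider both strands.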

SCCmec typing markers
Probes including those that were used for clonal complex determination as well as detection of toxin genes and resistance genes have been previously discussed, and these probe and primer sequences were provided elsewhere [11,12]. SCC related markers together with definitions, short explanations and reference sequences are listed in Table 2. S2 Table contains, beyond this information, also the individual primers and probes utilised for this study. These markers have been selected from published genome sequences because of an unambiguous, strict linkage to SCC elements and their variable presence in those elements. Some have been annotated differently, introducing identical names or gene symbols to genes or features which are highly similar in sequence. For several genes, allelic variants were distinguished. Table 2 also shows estimated prevalences for the individual markers. This is based on strain prevalences as shown in Table 1 and [10]. It should be noted that these are projections rather than actual figures because i) not all isolates recovered were fully characterised (especially, it was not possible to include each isolate of abundant strains such as CC22-MRSA-IV "Barnim EMRSA"), ii) no systematic testing was performed in one of the two hospitals (see above) and iii) there were clear changes to the population structure of MRSA over time [10], and this also affects marker prevalences.
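One plausible reading of how marker prevalences can be projected from strain prevalences is a weighted sum: for each marker, the share of each strain among all isolates is multiplied by the fraction of characterised isolates of that strain that carry the marker, and the products are added up. The sketch below assumes exactly this; it is not taken from the source, and the strain labels, marker names and numbers are hypothetical and do not come from Table 1 or Table 2.

```python
from typing import Dict


def projected_marker_prevalence(
        strain_share: Dict[str, float],
        marker_fraction: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Project marker prevalences from strain prevalences.

    strain_share:    share of each strain among all isolates (0..1)
    marker_fraction: per strain, fraction of characterised isolates
                     that are positive for each marker (0..1)
    """
    projected: Dict[str, float] = {}
    for strain, share in strain_share.items():
        for marker, fraction in marker_fraction.get(strain, {}).items():
            projected[marker] = projected.get(marker, 0.0) + share * fraction
    return projected


# Hypothetical input values, for illustration only.
strain_share = {"strain_A": 0.8, "strain_B": 0.2}
marker_fraction = {
    "strain_A": {"marker_1": 1.0},
    "strain_B": {"marker_1": 0.5, "marker_2": 1.0},
}
projection = projected_marker_prevalence(strain_share, marker_fraction)
for marker, value in sorted(projection.items()):
    print(f"{marker}: {value:.2f}")
```

Such a projection inherits the caveats listed above: it is only as good as the strain shares and the within-strain fractions, both of which depend on which isolates were characterised and when.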
Various resistance markers that are typically located on transposons or insertion elements and are known to be occasionally associated with SCCmec elements (such as aadD, erm(A) or tet genes) were covered by the array. However, they were not used for SCCmec subtyping because DNA array hybridisation cannot provide information on whether they are associated with SCC elements or carried on other mobile elements such as plasmids or transposons.
The mercury resistance operon was used for subtyping some strains, but since it can be carried outside of SCC elements, these results should be regarded as preliminary. For other heavy metal resistances, probes were designed that distinguish alleles that are known to be associated with SCC from those that were described from other mobile elements; and only the former ones were analysed for this paper.

SCC termini
SCC termini were investigated because of their relevance for the design of PCRs that span the integration sites of SCC elements into the S. aureus genome, proving that a positive SCCmec/mecA amplification is attributable to the presence of MRSA in a sample rather than to mecA-positive staphylococci of other species. SCC elements insert into the chromosomes of staphylococcal species by site-specific recombination. Insertion is catalysed by the cassette chromosome recombinase, which is encoded on the SCC element by the genes ccrA, ccrB or ccrC. The recognition site is a stretch of 16 nucleotides located at the 3'-end of the coding sequence of orfX (a putative 23S rRNA methyltransferase). The sequence of the insertion site is duplicated upon insertion, giving rise to pairs of direct repeats. Composite SCC elements often have more than two direct repeats. In genomes carrying a SCC element, the terminal region of the SCC element located downstream of orfX has been called the downstream constant segment (dcs). The name reflects that this region was found to be highly conserved in all the SCC sequences that have been known for longer. Nowadays, a much larger and more diverse set of SCC sequences is available. The SCC terminal region is the intergenic region between orfX and the first codons annotated in the SCC element. In an analysis of complete SCC sequences from GenBank, we identified dcs but also 14 other distinct types of the intergenic region between orfX and the first codon annotated in the SCC element.
For dcs and another 13 terminal integration site sequences, primers and probes were developed and used to screen isolates. A 14th sequence was identified in published genome sequences of mecC/SCCmec XI-positive strains, but it was not screened for, being redundant to the other SCCmec XI-associated markers already covered by the array.
Ten terminal integration site sequences were indeed found among the isolates herein. Multiple SCC termini, in 15 different combinations, were identified in strains and isolates that harbour composite or multiple SCC elements (see Table 3). This is in accordance with published genome sequences of strains harbouring composite or multiple SCC elements, where additional SCC termini can also be found at a distance from orfX. An example is GenBank FR753166.1, where SCC terminus 3 (positions 481 to 586), dcs (24011 to 24292) and SCC terminus 5 (13044 to 13465) are found.
The association of SCC termini with types and subtypes of SCC elements is shown in Table 3.
Given the prevalences of strains as shown in Table 1, it can be estimated that dcs is present in about 95% of isolates from the study region. However, especially among the sporadic and/or travel-associated strains, other SCC termini were observed.

Characterisation of SCC elements and subtypes
Sixty-one distinct SCC elements and subtypes were identified using the set of primers and probes described herein. An overview is given in Table 3; full data are provided in S1 Table.

Clonal complexes and strains that were found to harbour SCC elements
Full details on CC/strain assignments, SCC elements and subtypes are provided in Table 1.
CC1-MSSA. One CC1-MSSA isolate was identified that carried a SCCfus element apparently identical to the one in MSSA476, GenBank BX571857.1.
CC1-MRSA. PVL-negative CC1-MRSA-IV were rare and all of them harboured SCCmec IVa elements apparently identical to the one in the sequenced strain MW2 (BA000033.2). SCCmec IV/SCCfus composite elements as described elsewhere [15,16,17] were not identified. One PVL-positive CC1-MRSA was identified that carried a SCCmec V element as also observed in the Bengal Bay clone (ST772, see below), and another one harboured a SCCmec V+SCCfus composite element.
CC5-MRSA. As previously described, CC5/ST228-MRSA-I, the so-called "South German" epidemic MRSA (EMRSA) strain, used to be common in Dresden around the year 2000, but it has nearly disappeared since [10]. With regard to SCCmec elements, two variants were observed. One appeared identical to SCCmec I from the CC8 strain COL (GenBank CP000046.1) as well as to those of CC5-MRSA-I strains from Switzerland (GenBank HE579059.1 to HE579069.1). These genome sequences also suggest that the variable presence of the mer operon in CC5-MRSA-I is related to plasmid carriage rather than to variability of the SCCmec element. A second variant harboured a pseudoSCCmec element, lacking recombinase genes. Nearly all isolates harbouring this variant were cultured within one year, suggesting an epidemiological linkage.
A single isolate of CC5-MRSA-I/fus, "Geraldine Clone" [18], was found in 2012 in a patient with a history of foreign travel. It carried a combined SCCmecI/SCCfus element that also included tirS matching the predicted pattern for the sequence of strain MRSA-7 [15].
CC5-MRSA-II is a common strain known as "New York/Japan clone" or, in Germany, as [...]. CC5-MRSA-IV, "Paediatric clone", was only sporadically found. SCCmec IVa, IVb/d/i and IVc were identified among PVL-negative isolates of this "strain". All PVL-positives yielded SCCmec IVc. One CC5-MRSA-[IV+ccrA/B-4] (as in a strain from Spain, SA_ST125, GenBank ASTH) was detected in a patient with travel history to the Canary Islands.
SCCmec V was only found once in a CC5 MRSA.
CC6-MRSA. Three isolates were found, and two of them originated from patients with Middle Eastern travel history. All harboured SCCmec IVa elements.
CC7-MRSA. Just five isolates were identified. Two carried SCCmec IVb/d/i; SCCmec IVa, VT and VI+fus elements were found once each.
CC8-MSSA. Three different strains of MSSA were identified that harboured SCC elements without mecA/mecC. They yielded signals for different combinations of ACME II, speG, czrC and ccrA/B-4 genes (Table 1).
CC8-MRSA. Only a single isolate of ST247-MRSA-I, "North German/Iberian EMRSA", was found. Its hybridisation pattern, with regard to SCC genes, was in concordance with the predicted pattern for strain PSP1996 (GenBank ANHU) but differed from the SCCmec elements of COL (GenBank CP000046.1) and the local CC5/ST228-MRSA-I strain in the absence of mvaS-SCC.
For CC8-MRSA-IV, several different PVL-negative strains have been described previously that could be distinguished mainly based on enterotoxin gene carriage. All were only sporadically found. "UK-EMRSA-14", without enterotoxin genes, was found twice, harbouring SCCmec IVc or h/j. The "Lyon Clone", i.e., sea-positive CC8-MRSA-IV, yielded SCCmec IVc. "USA500" (seb-positive CC8-MRSA-IV that occasionally also carry sea, sek, seq) were found to carry SCCmec IVa (from a patient with Ethiopian background) or an mvaS-negative variant of SCCmec IVb/d/i (from two cases with infections acquired in Mozambique and Zimbabwe).
USA300-like, PVL-positive CC8-MRSA could be assigned by SCCmec subtyping to four distinct strains, of which the most common one was ACME-positive. This strain harboured SCCmec IVa, an ACME I element and a copper resistance gene as present in the genome sequences of FPR3757, GenBank CP000255.1, and TCH1516, GenBank CP000730.1. A second strain harboured a SCCmec IVc element, two copper resistance genes (copA2-SCC, mco-SCC), and the mercury resistance operon but lacked ACME. It is likely to be identical to the SCCmec element from MRSA177, GenBank AECP. A third strain lacked ccrA/B-2 and B2Y834. A fourth strain, identified once, harboured SCCmec IVa (as also present in the genome sequence of a USA300-like strain, IS-88, GenBank AHLO).
At least three distinct strains could be distinguished that were originally merged under the label "Hannover EMRSA". Two of these strains were common in Dresden around the turn of the century [10]. One harboured a pseudoSCCmec element and the mercury resistance operon. The other one harboured a composite element including SCCmec IV, ccrC and D1GU38 (a marker that accompanies "additional" ccrC, see Table 1) as well as Q2G1R6 suggesting that this composite element derived from a SCCmec IVa t. A third variant of the "Hannover Epidemic Strain" was represented by "Hannover 100-93" from the Harmony Strain Collection and UK-EMRSA-10 (courtesy of G. Coombs). However, they harboured B6VQU0 indicating relationship to SCCmec IVh/j rather than to IVa.
Finally, a single CC8-MRSA-VT strain was found that carried czrC, thus resembling the SCC elements of CC398 livestock-associated MRSA.
CC22-MSSA. CC22-MSSA with two different SCC elements were identified. One element, harbouring arsB and ccrA/B-4, was identified just once. The other one comprised speG, czrC and ccrA/B-4 and it was found in several recent (2014/2015) isolates some of which showed growth on MRSA selective media.
CC22-MRSA. The vast majority of CC22 were assigned to the CC22-MRSA-IV strain known as "UK-EMRSA-15" or locally also as "Barnim EMRSA" [20]. This strain first appeared in 2001 [10], rapidly became more common and accounted for nearly 80% of all genotyped MRSA isolates in 2013. These isolates harboured SCCmec IVh/j elements.
A few CC22-MRSA-IV isolates were identified that yielded positive signals for fnbB (indicating that they could belong to a different lineage within CC22 in which fnbA and fnbB are not fused [21]). Some (4 out of 7) fnbB-positive isolates also carried SCCmec IVh/j elements, one had a SCCmec IVa element, and two were assigned to SCCmec IVc.
Another CC22-MRSA-IV that differs from UK-15/Barnim in being positive for tst1 has been described from the Middle East and Mediterranean regions as "Gaza Strain" [22,23]. A single isolate was identified in 2015 from a patient with a Middle Eastern name, and it harboured SCCmec IVa.
A single CC22 isolate was found that carried a SCCmec IVh/j+ACME2 composite element. An apparently identical element has previously been described in a CC22 strain from Ireland [24].
A few isolates were also identified that harboured composite elements in various constellations (see Tables 1 and 3).
PVL-positive CC30-MRSA-IV are known as "Southwest Pacific Clone", "WSPP" (West Samoa Phage Pattern) clone or "USA1100". Published genome sequences indicate the presence [...] analysis shows a remarkable variability of their SCCmec elements, especially with regard to the presence of heavy metal resistance operons and an additional recombinase gene, ccrC.
This strain was only sporadically found in Dresden, always related to importation. Isolates from an outbreak in 2001, with an index patient repatriated from Greece [10], showed a composite SCCmec III element including ccrC as well as cadmium and mercury resistance operons that was identical to the predicted pattern of strain SK1585 (GenBank AYLT). Another isolate was found in 2015 in a patient with history of Middle Eastern travel. It also harboured a slightly different composite SCCmec III element (see Table 3) that resembled Bmb9393, GenBank CP005288.1. One 2008 isolate from a patient with history of hospitalisation in Turkey [10] yielded a composite SCCmec III element including ccrC as well as the Cd resistance operon. It appeared to be identical to the SCCmec element in CN79, GenBank ANCJ, and 16K, GenBank BABZ. Another 2016 isolate, also from a patient with travel history, showed a similar element that matched no known sequence (see Table 3).
CC398-MRSA. "Livestock-associated" CC398-MRSA have sporadically been detected from 2005 on, with a slight increase in recent years. The majority of isolates carried the SCCmec VT/czrC composite element as in the sequenced strain SO385. However, sporadic isolates with a ydhK-negative variant thereof, a SCCmec IVc, a composite SCCmec VT/heavy metal resistance element and a pseudoSCCmec element have also been observed (see Table 3).
ST617-MRSA. ST617 was described as a putative recombinant of CC8 and CC45 parents sporadically observed in Germany [28]. A single isolate was found to carry a SCCmec IVa element.
ST772-MRSA. ST772 is a lineage that is related to CC1 by MLST. As previously described [29], it differs in several core genomic features. One emerging community-associated MRSA strain, the "Bengal Bay Clone", belongs to this lineage. Five isolates were found; all carried a unique variant of SCCmec V that was also present in several published genome sequences of the "Bengal Bay Clone".

Discussion
Although the characterised strain collection was rather small and confined to a sampling period of a few years and a restricted geographic area, a remarkable variety of SCCmec elements was observed. For all major SCCmec types, distinct subtypes could be identified, and some additional rare or even not yet sequenced SCCmec elements were observed. When expanding the panel of genes used for SCCmec typing/subtyping, a number of additional variants of SCCmec types or subtypes can be discerned. Several common strains showed a remarkable variability of SCCmec types or subtypes. These include ST239-MRSA-III, CC5-MRSA-IV, CC22-MRSA-IV or CC398-MRSA-V/VT.
Theoretically, there are two explanations for this observation. A common and widespread strain might further evolve, after geographic dissemination, by acquiring additional markers of selective advantage (such as additional antibiotic resistance genes or heavy metal resistance genes), or by losing genes that do not actually confer an advantage (such as possibly additional recombinase genes from composite elements). This could be an explanation for the different variants of ST239-MRSA-III that can harbour a number of different SCCmec III/heavy metal resistance composite elements. Another example could be the observation of SCCmec II and IVa subtypes that just differ in the presence or absence of mvaS-SCC. The most parsimonious explanation would just be a random deletion of that gene from SCCmec elements in some specimens.
Another possibility is that a "strain" is in fact polyphyletic. This means that related parental MSSA from one lineage might independently have acquired different (although sometimes similar or related) subtypes of SCCmec elements. This could be the case for CC5-MRSA-IV, CC22-MRSA-IV and PVL-positive CC30-MRSA-IV. These observations also imply that the emergence of novel MRSA strains by acquisition of SCCmec elements might be rather common.
Another issue is strain definition and nomenclature. While naming strains is highly practical for routine use one should be aware that a nomenclature is an artificial convention and depends on resolution power of typing methods. Traditionally, a strain or clone has been defined as a "group of isolates that can be distinguished from other isolates of the same genus and species by phenotypic characteristics or genotypic characteristics or both" [30,31]. This might be feasible when using MLST/spa or PFGE typing but is not practical when applying microarrays or genome sequencing. This might appear to be a somewhat esoteric issue but it has practical consequences, especially when ruling out or confirming identity in outbreak investigations. For practical purposes, clinicians and infection control officers need clear breakpoints indicating how many "differences" (in terms of numbers of single nucleotide polymorphisms or of mobile genes being present or absent) safely rule out a possible transmission or how many might still be considered as consistent with "identity" and thus with a possible transmission.
Regarding SCCmec nomenclature, the main types can easily be categorised using the framework previously provided (http://www.sccmec.org/Pages/SCC_TypesEN.html). However, microarrays and genome sequencing reveal a high degree of variability on a "subtype" level as well as a rather common presence of irregular and/or composite elements. In order to avoid cumbersome, subjective and eventually ambiguous designations, sequencing and referencing of sequence data (i.e., accession numbers or strain designations unambiguously linked to accession numbers) appear to be inevitable.
Since SCCmec elements often contain repetitive and mobile sequences (such as IS431), they are especially prone to being fragmented and split across several contigs when performing NGS. Like array hybridisation, NGS thus cannot always and instantly provide information on gene localisations, and it also cannot reliably recognise duplications or inversions. A set of probes and targets as used herein could be used not only in vitro for typing and for selecting isolates that warrant sequencing, but also for a computerised analysis of NGS sequences allowing a quick assignment to strains and variants. With increasing availability of NGS technology, rapid data analysis and data transfer to non-expert users will become a major challenge. The development of specific sets of markers that can be interrogated in vitro as well as in NGS datasets might help to solve that problem and might also help to find a practical solution to the problem of defining identity or non-identity as discussed above.
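The idea of interrogating the same marker set in fragmented NGS assemblies can be sketched as a presence/absence screen over contigs. This is an illustrative simplification, not a method from this study: it assumes exact matching of probe sequences on either strand, and the probe names and sequences below are made up. A probe whose binding site happens to span a contig break would be missed, and, as noted above, no information on gene order or copy number is obtained.

```python
from typing import Dict, List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def markers_present(probes: Dict[str, str],
                    contigs: List[str]) -> Dict[str, bool]:
    """Report for each probe whether it occurs exactly in any contig,
    on the given strand or as its reverse complement."""
    result: Dict[str, bool] = {}
    for name, probe in probes.items():
        rc = reverse_complement(probe)
        result[name] = any(probe in contig or rc in contig
                           for contig in contigs)
    return result


# Made-up probes and contigs, for illustration only.
probes = {
    "probe_1": "AAACCCGG",   # present on the forward strand of contig 1
    "probe_2": "GATTACAG",   # present as reverse complement in contig 2
    "probe_3": "GGGGGGGG",   # absent from both contigs
}
contigs = ["TTAAACCCGGTT", "AACTGTAATCAA"]
profile = markers_present(probes, contigs)
print(profile)
```

The resulting presence/absence profile has the same form as a binarised array pattern, so in principle it could be compared against the same reference patterns.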
The proposed microarray can help to distinguish isolates that appear similar or identical by other typing methods, and it can be used as a high-throughput screening tool for the detection of novel SCC variants that warrant detailed investigation and sequence analysis. The high degree of heterogeneity of SCC elements even within so-called strains can be utilised for epidemiological typing.

Supporting Information
S1 Table. Hybridisation profiles for tested reference strains (highlighted in red) and clinical isolates as well as predicted hybridisation patterns for reference sequences (highlighted in blue). (PDF)
S2 Table. Target genes, primers and probes. (PDF)