Genomic Heterogeneity of Methicillin Resistant Staphylococcus aureus Associated with Variation in Severity of Illness among Children with Acute Hematogenous Osteomyelitis

Introduction The association between severity of illness of children with osteomyelitis caused by Methicillin-resistant Staphylococcus aureus (MRSA) and genomic variation of the causative organism has not been previously investigated. The purpose of this study is to assess genomic heterogeneity among MRSA isolates from children with osteomyelitis who have diverse severity of illness. Materials and Methods Children with osteomyelitis were prospectively studied between 2010 and 2011. Severity of illness of the affected children was determined from clinical and laboratory parameters. MRSA isolates were analyzed with next generation sequencing (NGS) and optical mapping. Sequence data was used for multi-locus sequence typing (MLST), phylogenetic analysis by maximum likelihood (PAML), and identification of virulence genes and single nucleotide polymorphisms (SNP) relative to reference strains. Results The twelve children studied demonstrated severity of illness scores ranging from 0 (mild) to 9 (severe). All isolates were USA300, ST 8, SCC mec IVa MRSA by MLST. The isolates differed from reference strains by 2 insertions (40 Kb each) and 2 deletions (10 and 25 Kb) but had no rearrangements or copy number variations. There was a higher occurrence of virulence genes among study isolates when compared to the reference strains (p = 0.0124). There were an average of 11 nonsynonymous SNPs per strain. PAML demonstrated heterogeneity of study isolates from each other and from the reference strains. Discussion Genomic heterogeneity exists among MRSA isolates causing osteomyelitis among children in a single community. These variations may play a role in the pathogenesis of variation in clinical severity among these children.


Introduction
Within the past decade there has been an increase in the incidence of community acquired methicillin-resistant Staphyloccocus aureus (MRSA) as the cause of deep infections in children [1][2][3][4][5][6]. Severe forms of infection have been associated with disseminated disease, deep venous thrombosis, septic pulmonary emboli, abscess formation requiring surgical drainage, and multifocal skeletal involvement [7][8][9]. However, some children develop a mild clinical illness which resolves quickly following brief hospitalization and antibiotic treatment without surgery [10]. The pathogenetic mechanisms leading to such variation in clinical manifestations of acute hematogenous osteomyelitis have yet to be clearly established.
It is possible that virulence genes, in isolation or combination, may play a role in the pathogenic behavior of specific strains, but this has been difficult to prove with traditional approaches to bacterial genetics using pulse field gel electrophoresis (PFGE) or multi-locus sequence typing (MLST) because they focus on macromolecular patterns or on core housekeeping genes which are common to all Staphylococcus aureus strains [11,12]. Bacterial isolates from persons with nasal carriage, community-acquired invasive disease, and hospital-acquired invasive disease appear to be evenly distributed among S. aureus clonal complexes, suggesting no significant differences in their propensity to cause invasive disease [11]. Research using polymerase chain reaction (PCR) technology to identify the occurrence of individual virulence factors in bacteria isolated from children with severe disease has not led to definitive conclusions due to the complexity and redundancy of the S. aureus genome [13][14][15].
After technologies emerged to permit rapid whole genome sequencing of clinical isolates [16][17][18][19][20] several reference strains of S. aureus were sequenced to provide details of the specific location and composition of virulence genes [16,17,19]. Next-generation sequencing (NGS) permits the identification of single nucleotide polymorphisms (SNPs) and indels (insertions or deletions) with respect to the sequences of reference isolates [21,22]. The purpose of this study is to utilize NGS to evaluate the genomic heterogeneity of MRSA isolates which have been procured from children with osteomyelitis who have varying severity of illness.

Ethics Statement
This study was conducted according to the principles expressed in the Declaration of Helsinki. The study was approved by the Institutional Review Boards of the University of Texas Southwestern Medical Center and Children's Medical Center of Dallas (IRB #0802-447) and Baylor Institute of Immunology Research (BIIR, IRB # 002-141). Written Informed consent was obtained from the families or legal guardians of all children enrolled in the study. Written Informed Assent was obtained from patients 10 years of age and older prior to any studyrelated procedure. The study was IRB approved.
Previously healthy children between the ages of birth and 18 years who were admitted to the hospital with acute hematogenous osteomyelitis due to MRSA were consecutively enrolled and prospectively studied after institutional review board (IRB) approval. Acute osteomyelitis was defined as an infection involving bone diagnosed within 2 weeks of the onset of symptoms. The infection was acquired by hematogenous dissemination as opposed to direct inoculation of the bone due to trauma or surgery. All infections were confirmed by the isolation of MRSA obtained from the site of infection under sterile conditions in the operating room. The surgical specimen was plated in the microbiology lab on sheep blood agar and incubated at 37°C. Following culture confirmation of MRSA, the bacterial isolates were catalogued and stored frozen at -80°C in the Children's Medical Center microbiology laboratory for subsequent processing. Children were excluded from the investigation if they had any underlying medical disorder which may lead to immune compromise such as: congenital immune deficiency, leukemia, transplant, or treatment with chemotherapy or immune-modulatory agents. Also excluded were children with infection due to any bacterial organism other than MRSA.

Demographic and Clinical Data
The following clinical and laboratory data were gathered: age; gender; ethnicity; infection site; number of surgeries; and total length of hospitalization (including pediatric intensive care unit stay and readmission days). According to our previously described protocol, a severity of illness score [23] was calculated for each child based on the following clinical and laboratory parameters: C-reactive protein (CRP) values at admission, 48 hours, and 96 hours; febrile days on antibiotics; admission respiratory rate (age adjusted); and evidence of disseminated disease (multifocal involvement, deep venous thrombosis, pulmonary involvement, meningitis, or endocarditis) [23].

Specimen Processing
Frozen MRSA specimens were thawed and inoculated onto blood-agar plates at 37°C for 8 hours. Isolated colonies were transferred to enriched Mueller Hinton broth media overnight. DNA extraction was performed with DNeasy blood and tissue kits with lysostaphin replacing lysozyme during the initial lysis. (Qiagen, Germantown, MD).

Microbial Genome Sequencing
2-5 μg of isolated genomic DNA, quantified by Picogreen, was submitted to the UT South western Genomics Core (http://genomics.swmed.edu/) for sequence analysis. Sequencing libraries were produced using TruSeq adaptors (Illumina, Inc., San Diego, CA) with inserts of 300-500 bp in length. DNA quality was analyzed with a Bioanalyzer (Agilent technologies). Sequencing was performed on an Illumina HiSeq 2000 using a paired-end protocol in which 100 bp sequences were obtained from each end of the sequencing library fragments. An average of 6.9 million reads were obtained for the each sample in this study, resulting in an average of 243-fold coverage for the MRSA genome. Sequencing protocols and basic information concerning the sequence generation are available from the UT Southwestern Genomics Core website.

Optical Mapping
Selected samples representative of the full spectrum of clinical severity of illness scores in our cohort were sent for optical mapping through Opgen's Mapit service (Opgen Inc., Gaithersburg, MD). The resulting data was analyzed on Opgen's MapSolver software version 3.2 (Opgen Inc., Gaithersburg, MD). Whole genome maps of each sample, obtained de novo, were aligned against in silico reference strains in order to identify insertion, deletion, re-arrangement, or copy number variation events.

Sequence Assembly and Analysis
All sequencing data were imported and analyzed using CLC Genomics Workbench (Aarhus, Denmark). Sequences were initially assembled to USA300 FPR3757 and USA300 TCH1516 reference genomes obtained from GenBank, National Center for Biotechnology Informations (http://www.ncbi.nlm.nih.gov/nucleotide). FPR 3757 is a community acquired strain from the wrist abscess of a 36 year old male HIV positive, intravenous drug user in California. TCH 1516 (aka USA 300-HOU-MR) is a community acquired strain obtained from an adolescent female with "severe sepsis syndrome" in Houston. Multi Locus Sequence Typing (MLST) was performed using consensus sequences from each sample for seven housekeeping genes (arc, aroE, glp, gmk, pta, tpi, yqiL) using standard programs available at (saureus.mlst.net) to assign allele and sequence types (STs). Similarly, all samples were analyzed for six virulence genes (alt, essC, geh, hlgA, htrA, and sdrC) as previously proposed for MLST [12]. Analysis of the virulence gene alleles in the study samples was performed via neighbor joining analysis with 10,000 bootstrap (the CLC bio Rapid Neighbor-Joining plugin). Single nucleotide variation and other genetic variation present in the samples were detected using the variant detection function on the genomic workbench using both USA300 strains as reference. Additionally, in some samples which showed large deletions with whole genome mapping, we were able to visually confirm the deletions with the sequence data either in the "stand-alone mapping" or the "tracks" format.

Phylogenetic Analysis
USA 300 S. aureus was selected as the clonal complex with the closest phylogenetic similarity to our isolates to permit meaningful comparison with the study strains. In addition to the USA 300 MRSA reference strains (FPR 3757 and TCH 1516), TCH 959 (aka USA 300-HOU-MS) was chosen as a community acquired strain of USA 300 Methicillin-sensitive Staphylococcus aureus (MSSA) which was isolated from a buttock abscess of a 12 year old white female. Using MUMmer (http://www.mummer.sourceforge.net/) the reference genome sequences and SNP calling was performed. The haplotypes for each of the study samples were generated with confident homyzygous single nucleotide variant (SNV) sites for which the read fractions supporting major alleles were greater than 0.9 for every sample. These haplotypes were used as the nucleotide input for MEGA 6.06 (Molecular Evolutionary Genetics Analysis) http://www. megasoftware.net/mega.php during the phylogenetic analysis. Maximum Likelihood (ML) phylogenetic reconstruction was peformed with the Jukes-Cantor model using the ML Heuristic Method of Nearest-Neighbor-Interchange (NNI). The initial tree was assembled using maximum parsimony to estimate the evolutionary distances between sequences by computing the proportion of nucleotide differences between each pair of sequences.

Clinical Severity of Illness of Affected Children
Twelve children with osteomyelitis due to MRSA were evaluated for clinical severity of illness. Their clinical characteristics are shown in Table 1. All children in this study were culture positive for MRSA which was grown from material obtained directly from the site of infection. No other organisms grew within the cultures of these children. Clinical Severity of Illness Scores (SIS) [23] ranged from 0 to 9 with an average score of 6.1 and median score of 7. The scores were calculated from clinical and laboratory parameters identified by hospital day 5 in all children studied ( Table 2). One out of four (25%) children with severity of illness score of 5 or less had a positive blood culture, whereas all eight children (100%) with severity of illness scores greater than 5 had positive blood cultures.

Bacterial Genetic Heterogeneity
Optical mapping of six selected bacterial isolates identified a close similarity (<1% variation) to the USA300 reference strains. This was confirmed for all twelve isolates by NGS. MLST showed that all isolates were sequence type (ST) 8 and aligned strongly to staphylococcal cassette chromosome (SCC) mec type IVa. MVLST revealed differences between the isolates and both USA300 strains in two virulence loci (HlgA and sdrC) and differences among the isolates in the gamma hemolysin (HlgA) locus (Fig 1). There were no differences among isolates at the sdrC locus (the dendrogram is not shown). Optical mapping and NGS differentiated the isolates from the USA300 strains by identifying the presence and location of limited insertions and deletions in various specimens. There were no rearrangements or copy number variations identified between the samples and the USA300 strains. With optical mapping and NGS, 40 Kb insertions were detected in isolates 7 and 10. An additional 10 Kb deletion was detected in isolate 9. Isolates 2, 4, 8 were not distinguishable from the USA300 strains by optical mapping. NGS identified genetic heterogeneity among the study isolates when SNP analysis was performed with respect to the USA300 reference strains. Each isolate showed an average of 99.6 synonymous and 67.3 nonsynonymous SNPs. There was an average of 11 strain specific, nonsynonymous SNPs per isolate (Fig 2). Seventy-eight virulence genes of USA300were identified through literature review (Table 3) [12,13,[15][16][17][18][19][24][25][26][27][28][29][30][31]. The rate of occurrence of these virulence genes was determined for the two reference strains and for our isolates. Fifty-nine and 65 of these genes were present in USA300 FPR3757 and TCH1516, respectively. There was a significantly higher representation of these genes in our clinical isolates when compared to the reference strains, with an average of 76 (median of 78) virulence genes in the study isolates (p = 0.0124).
Univariate analysis of the MRSA isolates did not identify any individual virulence gene or SNP which was significantly associated with low severity (score 5) versus high severity of illness (score > 5) of the affected children. One combination of genes (sek and ear) was noted to be absent in 4 out of 8 strains which caused infection in children with severe illness, but present in all 4 children who did not have severe illness. This association was not statistically significant (p = 0.208), but may be biologically significant.

Phylogenetic Analysis
There was evolutionary distance identified between each of the study isolates. Evolutionary distance was also identified between the aggregate of study isolates and the USA 300 MRSA reference strains (FPR3757 and TCH1516) (Fig 3). The greatest evolutionary distance was noted between the superficial skin and soft tissue MSSA reference strain (TCH959) and the remaining isolates, including both MRSA reference strains and all study isolates. The pathogenetic mechanisms underlying the variation of clinical manifestations of children with osteomyelitis due to MRSA are incompletely understood. We have previously reported that children with invasive MRSA infections display an over-expression of innate immunity and under-expression of adaptive immunity and that specific genes correlate with severity of illness [32]. Although host factors are likely to play a role in clinical variability, Fig 1. Dendrogram of the differences between study isolates (EA1 through EA12) and the USA 300 reference strain at the gamma hemolysin subunit (HlgA) (left) and dendrogram of the polymorphisms of the gamma hemolysin subunit of the study isolates (right). The tree was generated by neighbor joining analysis with a bootstrap of 10,000. The numbers refer to the confidence of related associations. Higher numbers indicate greater confidence.          attention must still be directed to the bacteria in light of the diverse virulence capabilities of MRSA. There is increasing evidence to suggest that a cascade of events may permit Staphylococcus aureus to survive in the bloodstream, avoid destruction by neutrophil phagocytosis, traverse the endothelium to gain access to deep structures, and adhere to collagen in bone so as to permit local skeletal invasion and destruction [24,25,27,29,33]. These events correspond to upregulation and expression of specific bacterial genes related to these activities [34,35]. It is therefore likely that genetic loci variations might act to amplify or attenuate the severity of infection during a complex cascade of events. In this study, we have demonstrated bacterial genomic heterogeneity among MRSA isolates from children with variable clinical manifestations of disease. This bacterial heterogeneity was detected by next generation technologies and would not have been identified using standard methods of PFGE, or MLST. All study isolates were Sequence Type 8 USA 300 MRSA (SCC mec type IVa). However, there was variability of our isolates compared to standard USA 300 reference strains when analyzed by MVLST, optical mapping and NGS. MVLST demonstrated heterogeneity of the gamma hemolysin gene of isolates at the HlgA locus. Such a difference may prove to be important in permitting varying degrees of bacterial survival within the bloodstream so as to enhance dissemination of disease. Without hemolytic capability, S. aureus would be unable to survive in human blood. The up-regulation of iron metabolic pathways of the organism is required because the concentration of free iron is 10 −11 times lower than that necessary for bacterial survival [33]. While several other virulence determinants of S. aureus demonstrate genetic redundancy due to the presence of alternative subunits, the hemolysin subunits do not have this feature [33]. Gamma-hemolysin is a bicomponent pore-forming leukotoxin encoded by three genes, hlgA, hlgB, and hlgC, all within the same operon. A recent study of genetic evolution of colonizing strains of MRSA found a minor rearrangement and accumulation of synonymous and nonsynonymous SNPs in clusters of hlgA and hlgC genes [36]. The polymorphisms of the gamma hemolysin subunit identified in our study may play a role in varying hemolytic capability of the organisms. We found that only 25% of children with mild severity of illness (scores of 5 or less) had positive blood cultures, whereas 100% of children with scores greater than 5 had bacteremia.
A critical barrier to progress in this field has been the inability to prove that specific bacterial virulence determinants are responsible for clinical manifestations of infection. Some evidence suggests an association between bacterial strains which express Panton Valentine leukocidin (PVL) and severe complications, such as deep venous thrombosis and septic pulmonary emboli [5,7,27]. However, there is limited direct evidence of the role of single virulence determinants in the pathogenesis of virulent strains [37]. The redundancy of bacterial genes coding for leukocyte defense and the presence of PVL in non-virulent strains imply a more complex cascade of events than that which is explained by the activity of an isolated virulence gene [34,37]. In support of this, one group of investigators evaluated 33 putative virulence factors in bacterial isolates from individuals with either invasive or non-invasive S. aureus infections and identified a combination of virulence genes (fnbA, cna, sdrE, sej, eta, hlg, and ica) that were significantly associated with invasive disease [25]. No single factor predominated and the effects of the virulence genes appeared to be cumulative [25]. However, the study was done using PCR technology focusing on a limited number of virulence genes and it did not assess the more expansive genome of S. aureus. Recent evidence suggests that the difference in pathogenesis between strains is more likely due to subtle changes rather than to large-scale acquisition of virulence genes [38]. One study found that bacterial heterogeneity can be discovered in closely related bacterial isolates by investigating SNPs [38]. The investigators sequenced ten MRSA USA300 strains from eight different states and found that 80% of the isolates had very few SNPs and appeared closely related [38].
An important finding of this study is the heterogeneity of the bacterial isolates at the SNP level within a single community. The disproportionate occurrence of SNPs in virulence genes among children who demonstrated severe illness in our study requires further investigation to see if any pattern of SNP variation may lead to amplification or attenuation of the pathogenic cascade. It is possible that SNPs could represent variable and transient events within a bacterial species. However, recent literature suggests that genomic differences between closely related bacterial strains are indeed not representative of spontaneous mutations because they have already undergone selection [39]. The inter-isolate SNP variation identified in this study therefore most likely corresponds to an established heterogeneity of the isolates.
An important limitation of this study is the small sample size. This was intentionally designed as an exploratory study with a narrow focus on children within a single community who were infected with a single bacterial isolate (MRSA) within a limited timeframe. Because of the extremely small sample size, there is limited power to detect significant differences of clinical severity due to selected virulence genes, either in isolation or combination. However, our findings suggest that the greater discriminatory power of NGS may succeed in delineating this information when applied to a substantially larger sample size using appropriate controls, including methicillin sensitive strains, isolates from superficial skin and soft tissue infections, and nasal swab isolates from asymptomatic hosts. A second limitation of this study is the lack of information regarding the gene expression of the isolates. Ultimately, it will be necessary to determine the transcriptome of the isolates at a variety of stages during disease pathogenesis. It is possible that nonsynonymous SNPs could result in amplification or attenuation of the cascade of events during the host-pathogen interaction due to altered bacterial proteomics. One other limitation of our study pertains to the selection of reference strains. Within GenBank, there are currently no reference strains with complete genomes that have been isolated from individuals with similar clinical conditions to the children with acute hematogenous osteomyelitis within our community. An isolate from an HIV positive adult residing in California who had a superficial skin and soft tissue infection is clinically dissimilar to that found among population of children with deep infection in Dallas. This is also true regarding the isolate from the adolescent with "severe sepsis syndrome" in Houston in 2007 due the chronological and geographic separation of cases. This limitation points to the need for the submission of new whole genome sequences of new strains to GenBank with relevance to the clinical phenomena being studied.

Conclusions
This exploratory study represents the most extensive effort to date utilizing NGS and molecular epidemiology to identify bacterial genomic heterogeneity among MRSA isolates affecting a well-defined population of children who demonstrated diverse severity of illness. NGS offers sufficient discriminatory power to gradually reveal the underlying pathogenetic mechanisms of acute osteomyelitis in children when carefully considered in context of the disease manifestations. Although causation cannot be derived from our findings, this study offers direction for further investigation including studies of bacterial gene expression and applied cell biology including hemolysis, neutrophil killing, and endothelial cell uptake when exposed to bacteria with genetic differences. The intention of this study was to assess the feasibility of continuing this line of inquiry, using higher statistical power through a much larger sample size. We conclude that next generation sequencing technology confirmed the genomic variation among our isolates. We are unable to establish a relationship of this bacterial heterogeneity and the variation in host response due to the small sample size. However, there is enough support to encourage further investigation with a carefully designed and appropriately powered study. Ultimately, the ability to identify the virulence potential of individual clinical isolates early in the course of illness would impact clinical care and allow devotion of appropriate resources to children with the greatest need. For example, rapid genetic testing of bacteria identified in the initial blood culture obtained in the emergency room from a child infected by a highly virulent strain might lead to urgent acquisition of magnetic resonance imaging and surgical decompression of the foci of infection. It also might prompt investigation for deep venous thrombosis which would be addressed with appropriate anticoagulation therapy or IVC filter placement. The admitting team might choose intensive care monitoring to allow for pulmonary support in anticipation of respiratory compromise after early recognition of septic pulmonary emboli, identified by CT scan, which otherwise might not have been obtained.

Acknowledgments
This study was accomplished with financial support from Texas Scottish Rite Hospital for Children and a UT-STAR Grant from the Translational Pilot Program 2012 Service Package supported in part by grant UL1TR000451 from the National Center for Advancing Translational Sciences, National Institutes of Health (January 31, 2013 to January 31, 2014). The authors have no financial or other relationships that represent a conflict of interest relevant to this study.