Host Immune Transcriptional Profiles Reflect the Variability in Clinical Disease Manifestations in Patients with Staphylococcus aureus Infections

Staphylococcus aureus infections are associated with diverse clinical manifestations leading to significant morbidity and mortality. To define the role of the host response in the clinical manifestations of the disease, we characterized whole blood transcriptional profiles of children hospitalized with community-acquired S. aureus infection and phenotyped the bacterial strains isolated. The overall transcriptional response to S. aureus infection was characterized by over-expression of innate immunity and hematopoiesis related genes and under-expression of genes related to adaptive immunity. We assessed individual profiles using modular fingerprints combined with the molecular distance to health (MDTH), a numerical score of transcriptional perturbation as compared to healthy controls. We observed significant heterogeneity in the host signatures and MDTH, as they were influenced by the type of clinical presentation, the extent of bacterial dissemination, and time of blood sampling in the course of the infection, but not by the bacterial isolate. System analysis approaches provide a new understanding of disease pathogenesis and the relation/interaction between host response and clinical disease manifestations.


Introduction
Staphylococcus aureus has emerged as one of the most frequent causes of community-acquired invasive bacterial infections, with significant morbidity and mortality. In 2005 alone, 18,650 deaths due to methicillin-resistant S. aureus (MRSA) were reported in the United States [1,2,3,4,5,6]. The spectrum of community-associated (CA) S. aureus disease is wide, and patients can present with a variety of clinical illnesses, ranging from mild soft tissue infections to invasive disease such as bacteremia, pneumonia, or osteoarticular infections [1,7,8]. The emergence of multidrug-resistant S. aureus strains worldwide [9,10], combined with limited therapeutic options, demands novel approaches to further elucidate host-pathogen interactions, and especially host responses to S. aureus infection. Circulating leukocytes represent an accessible source of molecular information, which can be studied by analyzing the whole blood genome-wide transcriptome. The value of blood microarray analysis has been illustrated in infection, cancer and autoimmunity [11,12,13,14], leading to diagnostic and therapeutic advances [14].
We have previously used microarray analysis to characterize the differences in host responses to different microbial pathogens [11], including S. aureus (Ardura et al, 2009), by analyzing peripheral blood mononuclear cells (PBMC) from pediatric patients hospitalized with acute infections. The depletion of neutrophils from PBMC may result in the loss of key information in the systemic characterization of bacterial infections [15]. As new research tools became available, we performed whole blood microarray profiling (including neutrophils) in a new cohort of 99 pediatric patients with S. aureus infection. Using this large cohort, we defined: a) the common host response to S. aureus infection; and b) the differences in host response patterns depending on the site of infection and the clinical presentation of the disease. The main objective of this analysis was to determine whether the transcriptional profiles reflect the variation in clinical disease manifestations and to provide new insights into disease pathogenesis by defining the differences in host responses among patients with different clinical presentations.

S. aureus induces a distinct and robust transcriptional signature in whole blood
To define the whole blood biosignature of S. aureus infection in children, 99 patients and 44 healthy controls were assigned to two independent groups of subjects, to serve as "training" and "test" sets. The training set was used to identify the signature of S. aureus infection. This signature was validated in the independent test set by assessing its capacity to separate the patients from healthy controls by hierarchical clustering. The training set included 40 patients with S. aureus infection and 22 healthy controls, matched for age, sex and race (Table 1). Statistical group comparison yielded 1,422 differentially regulated transcripts. Hierarchical clustering of these transcripts grouped them according to similarities in gene expression patterns (Figure 1A). This signature was validated in the independent test set of 59 patients and 22 healthy controls (Figure 1B). Hierarchical clustering of these 1,422 transcripts grouped 52 out of 59 patients from the test set together. The seven patients who clustered with controls presented with either mild or moderate disease, and were closer to recovery and hospital discharge compared with all other patients.

S. aureus infection activates innate immunity and suppresses the adaptive immune response
To better understand the whole blood response to S. aureus infection and the immune pathways activated, we used an analytical framework of 62 transcriptional modules that group together genes with shared expression patterns and similar biological function across independent blood transcriptional datasets [16]. Module maps were derived independently for the training (Figure 1C) and test sets (Figure 1D). These findings were confirmed in the test set, as shown by the significant correlation of module expression between training and test sets (Figure 1F, p < 0.0001, Spearman R = 0.94), demonstrating the consistency of these observations.
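The module-level readout used for these maps is described in the Module-level Analysis section: the percentage of a module's transcripts that differ significantly from healthy controls (Mann-Whitney, p < 0.05), with modules at 15% or less not displayed. The following Python sketch illustrates that arithmetic only. It assumes the per-transcript p-values and directions of change have already been computed; the function name, the input layout, and the choice to report up- and down-regulated percentages separately are illustrative assumptions, not the authors' implementation.

```python
def module_activity(transcripts, alpha=0.05, display_cutoff=15.0):
    """Summarize one module from per-transcript test results.

    transcripts: list of (p_value, direction) pairs, where direction is
    +1 if expression is higher in patients and -1 if lower (hypothetical
    input layout chosen for this sketch).
    """
    n = len(transcripts)
    if n == 0:
        raise ValueError("a module must contain at least one transcript")
    # Count transcripts passing the significance threshold in each direction.
    up = sum(1 for p, d in transcripts if p < alpha and d > 0)
    down = sum(1 for p, d in transcripts if p < alpha and d < 0)
    pct_up = 100.0 * up / n
    pct_down = 100.0 * down / n
    # Assumption: a module is shown if either percentage exceeds the cutoff.
    displayed = max(pct_up, pct_down) > display_cutoff
    return {"pct_up": pct_up, "pct_down": pct_down, "displayed": displayed}


example = [(0.01, 1), (0.20, 1), (0.03, 1), (0.04, -1), (0.50, -1)]
summary = module_activity(example)
```

In this toy module, two of five transcripts are significantly increased and one is significantly decreased, so the module would be drawn on the red scale at 40%.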

The blood signature of S. aureus infection demonstrates significant heterogeneity
As the clinical presentation of individual patients was diverse, once we had defined the global whole blood response to S. aureus infection, we aimed to characterize how variation in anatomical site, dissemination of the infection and bacterial isolate (Table 2) affects the signatures. Because the training and test sets yielded similar modular fingerprints (Figure 1F), they were merged into one dataset for subsequent analysis.
Our initial step was to examine the heterogeneity in the signatures and then determine which factors could be associated with such variation. To this end, we first calculated the molecular distance to health (MDTH) for each individual patient. The MDTH is a score that measures the global transcriptional perturbation in each patient compared to the median of healthy controls. By summarizing the overall transcriptional activity in one score, the MDTH facilitates correlation with clinical parameters [12,17]. Overall, as expected, the median MDTH of patients was significantly higher than that of healthy controls (Figure 2A, p < 0.001). However, 25 patients had MDTH scores that were within the range of healthy controls. These patients were flagged as transcriptionally quiescent (TQ) and separated from transcriptionally active patients (n = 74). Unsupervised hierarchical clustering of the 10,972 transcripts expressed in at least one of the 143 subjects grouped these quiescent patients with healthy controls (Figure 2B). From a clinical perspective, the majority of these patients presented with mild and local infections with low C-reactive protein (CRP), and their samples were obtained late in the course of the infection, as reflected by an elevated mean draw index (0.64) (see Methods).
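To make the idea of a single perturbation score concrete, the following Python sketch computes a simplified distance-to-health value for one patient. The text here states only that the MDTH compares each patient with the median of healthy controls; the full definition is given in references [12,17]. The use of the control standard deviation as the unit of distance, the cutoff of 2, and the summation of absolute deviations are assumptions made for illustration and should not be read as the published algorithm.

```python
from statistics import median, stdev


def distance_to_health(patient, controls, z_cutoff=2.0):
    """Simplified perturbation score for one patient.

    patient:  list of expression values, one per transcript.
    controls: list of healthy-control samples, each a list aligned
              with `patient` (at least two controls are required).
    z_cutoff: assumed threshold; transcripts closer than this to the
              control median do not contribute to the score.
    """
    score = 0.0
    for j, value in enumerate(patient):
        reference = [sample[j] for sample in controls]
        spread = stdev(reference)
        if spread == 0:
            # No variation among controls: distance is undefined, skip.
            continue
        z = (value - median(reference)) / spread
        if abs(z) >= z_cutoff:
            score += abs(z)
    return score


controls = [[10.0, 20.0], [12.0, 20.0], [14.0, 20.0]]
active = distance_to_health([18.0, 50.0], controls)
quiescent = distance_to_health([13.0, 20.0], controls)
```

Under these assumptions, a patient whose transcripts all fall near the control median receives a score of zero, which matches the notion of a transcriptionally quiescent profile.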
The median values for different clinical findings and laboratory parameters were calculated for each cluster (Figure 2E, Table S5). In addition, we assessed whether the differences observed at the modular level were related to the time of sample collection in relation to the length of stay (draw index). Cluster C1 displayed a lower draw index, indicating that blood samples were collected at an earlier stage of hospitalization. Patients from C1 required a longer duration of hospitalization than other patients. Additionally, patients from C1 had significantly higher CRP, WBC, neutrophil and monocyte counts. Thus, these routine laboratory markers of inflammation corroborated the over-expression of myeloid and inflammatory modules.
The C3 patients showed decreased hemoglobin, hematocrit, and MCHC (Figure S6F, Table S5), suggesting an anemic state that might trigger erythropoiesis as indicated by the signature. The C3 erythropoietic signature overlapped with transcripts from CD71+ early erythroid precursors [20] (Figure S6D).
Then, we analyzed how infection site, clinical presentation, and type of bacterial strain were distributed among clusters (Figure 2F, Table S6). All patients in cluster C1 had invasive or disseminated disease, and a higher proportion of patients with pneumonia. Clusters C4 and TQ (transcriptionally quiescent) included most patients with skin and soft tissue abscesses, confirming the association between low MDTH and mild presentation.

Bacterial factors had limited influence on the host expression profiles
Finally, to determine whether heterogeneity in bacterial virulence factors was associated with clustering patterns, the isolates from 63 patients were characterized (Figure S1). The majority (87%) of isolates tested were PVL-positive. Both clusters C1 and C3 displayed a higher percentage of MRSA isolates (100% and 92%) than the overall mean (67%), suggesting that clones of CA-MRSA USA300 might induce a stronger host response than the currently circulating MSSA clones. No difference in other bacterial characteristics, such as agr locus type or genetic relatedness, was observed between the four clusters (Table 3, Table S7).

Elements of the molecular signature correlate with laboratory parameters
We then asked whether patients' molecular profiles correlated with clinical laboratory parameters commonly used to assess clinical disease status (Table S8). The MDTH positively correlated with neutrophil counts, white blood cell counts, C-reactive protein, band neutrophil counts, red blood cell distribution width, and monocyte counts. MDTH inversely correlated with relative lymphocyte counts, red blood cell count, hemoglobin concentration, mean corpuscular hemoglobin (MCH) concentration, and hematocrit. Correlations were also assessed on a module-by-module basis (Figure 3A). Inflammatory modules positively correlated with neutrophil counts, CRP and WBC, while modules linked to adaptive immunity negatively correlated with these parameters. Hematopoiesis (M3.3) and cell cycle (M6.11) modules correlated with absolute band (immature neutrophil) count. Significant correlations between clinical nodes and molecular nodes are summarized as a network (Figure 3B).

Time of blood sampling, infection dissemination, and clinical presentation influence the transcriptional signature
To determine whether the type and dissemination of the disease as well as time of sample collection influenced the genomic fingerprint, a supervised analysis was conducted to compare transcriptional signatures according to: 1) time in the course of the infection as defined by draw index quarters; 2) infection site; and 3) types of invasive clinical presentations. Three myeloid lineage and three inflammation related modules were differentially regulated from quarter to quarter (Figure 4A). The MDTH also displayed a decreasing trend (Figure 4B) that was paralleled by the decrease in CRP (Figure 4C). MDTH increased as the infection became more disseminated (Figure 4D), which was most evident during the first half of the hospitalization period (Figure S7A, Figure S7B), and was paralleled by increased CRP and hospitalization duration (Table S9).
Next, we analyzed patients according to the type of clinical presentation. Patients with pneumonia had a higher median MDTH than patients with osteoarticular infections, as well as higher WBC, neutrophil and monocyte counts (Figure 4E, Figure S8A, and Table S10).

Patients with osteoarticular infections display transcripts linked to activated blood coagulation that are not present in patients with pneumonia
To assess transcriptional differences between patients with distinct clinical presentations, we selected nine patients with osteoarticular infection and compared them to nine patients with pneumonia matched for MDTH (Figure S8B). Eighteen healthy controls (nine each from the training and test sets) were used as reference. Module fingerprints identified over-expression of the coagulation cascade (M1.1) and platelet adhesion (M6.14) in patients with osteoarticular infection but not pneumonia (Figure 5A). From the 385 genes differentially expressed (Figure 5B), PANTHER analysis for pathway enrichment (Figure 5C) identified blood coagulation as the most significant pathway over-expressed in osteoarticular infection, and cholesterol biosynthesis as over-expressed in pneumonia. No significant correlations were observed between bacterial isolates (Table S7) and clinical presentation.

Discussion
This study is the first analysis of whole blood transcriptional profiles in pediatric patients with acute community-associated S. aureus infections. Both conserved and heterogeneous elements of the transcriptional signature were identified, which correlate with time elapsed since hospitalization, bacterial dissemination, and the type of clinical presentation. Supporting our earlier study on PBMC [21], the whole blood signature was characterized by significant over-expression of myeloid lineage and inflammation transcripts, and under-expression of lymphoid lineage transcripts. This may reflect the large expansion of circulating neutrophils and monocytes during the acute phase of infection.
While all patients share a global signature to S. aureus infection, analysis of fingerprints of individual patients revealed heterogeneous elements of the molecular signature. The major transcriptional patterns identified included a pro-inflammatory myeloid signature, linked to sampling early in the course of infection, high neutrophil and monocyte counts, and elevated CRP (C1). Some patients displayed an erythropoiesis fingerprint with limited myeloid components (C3), which likely represents an increased release of hematopoietic precursors from the bone marrow during acute infection [22]. This erythropoiesis signature was observed in systemic onset juvenile idiopathic arthritis (SOJIA) [20] and was proposed to reflect the expansion of immature precursor cells and ineffective erythropoiesis. The latter results in accumulation of iron in tissues [23], which supports bacterial survival. It was suggested that erythropoietin (EPO) inhibits NF-κB and TNF-α-mediated pro-inflammatory pathways [24], which could explain why C3 patients displayed a limited pro-inflammatory signature. This could represent a bacterial survival mechanism, whereby increased erythropoiesis would prevent the development of an adequate pro-inflammatory response and subsequent bacterial clearance. Despite increased numbers of neutrophils, we did not detect consistent changes in transcripts linked to the interferon response, as previously observed in patients with active pulmonary tuberculosis [12]. Fifteen out of 99 patients displayed overexpression of the three IFN modules but tested negative for concomitant viral infections. This may reflect IFN activation to counteract the bacteria-induced pro-inflammatory milieu [25] in the later stages of infection.

Table 3. Distribution of bacterial isolate characteristics by infection localization, presentation and cluster. (Column groups: Infection Dissemination, Clinical Presentation, Cluster; Fisher's Exact Test reported for each.)
Currently, there are no established laboratory markers that either objectively define the extent of clinical disease severity in patients with S. aureus infections or monitor the spread of the infection to multiple organs [8]. Transcriptional profiling highlighted global quantitative differences between patients with local or disseminated disease, supporting the value of microarrays and the quantitative MDTH genomic score to monitor the spread of infection.
Additionally, the signatures provided unique transcriptional information on the pathogenesis of different clinical syndromes. Patients with osteoarticular infections displayed over-expression of transcripts linked to blood coagulation, a finding possibly related to the increased systemic coagulation and deep venous thrombosis observed in patients with musculoskeletal infections [26]. Patients with pneumonia displayed over-expression of genes involved in cholesterol synthesis, which might be involved in neutrophil recruitment to the lung [27]. Importantly, the location and extent of disease dissemination impacted transcriptional profiles to a greater extent than the molecular characteristics of the S. aureus strains. This suggests that: 1) the early innate response is directed towards features of S. aureus conserved across strains; 2) the human response is specifically tailored to the infection site; or 3) staphylococcal gene expression differs based on the site of disease.
Despite the qualitative variability of the host response, the intensity of the signature decreases as patients get closer to discharge, regardless of the localization of the infection or the type of clinical presentation, suggesting that this assay has potential to monitor the clinical course of infection. Future studies should focus on the diagnostic and prognostic value of this approach in identifying patients at risk for infection dissemination and eventually determine its value in guiding therapeutic decisions.

Ethics Statement
This study was conducted according to the principles expressed in the Declaration of Helsinki. The study was approved by the Institutional Review Boards of the University of Texas Southwestern Medical Center and Children's Medical Center of Dallas (IRB #0802-447) and Baylor Institute of Immunology Research (BIIR, IRB # 002-141). Informed written consent was obtained from legal guardians and informed assent was obtained from patients 10 years of age and older prior to any study-related procedure.

Patient characteristics
Blood samples from 99 patients hospitalized with community-acquired S. aureus infection and 44 healthy controls were collected in Tempus tubes (Applied Biosystems, PN 4342792). Patients represented the clinical spectrum of acute S. aureus infection, including skin and soft tissue infection, bacteremia, osteomyelitis, suppurative arthritis, pyomyositis, pneumonia with empyema, and disseminated disease, defined as bacteremia and the involvement of two different anatomical sites. Patients with a diagnosis of toxic shock syndrome, with polymicrobial infections, or treated with corticosteroids in the preceding four weeks were excluded. Viral direct fluorescent antibody testing and/or culture of the nasopharynx was performed in all patients and healthy controls to exclude concomitant viral infections. Patient demographic data and clinical characteristics are summarized in Table S1 and Table S2. The median duration of hospitalization was ten days (range: 1-98 days). The median time from patient hospitalization to blood sample acquisition was five days (range: 1-35 days).

Patient classification
The study cohort of 99 patients and 44 healthy controls was divided into independent training (40 patients, 22 healthy controls) and test sets (59 patients, 22 healthy controls) (Table 1). Patients were categorized according to three schemes (Table 2) based on assessment by an independent clinician who was blinded to the transcriptional data: i) by localization of infection, defined as local (n = 10), invasive (n = 74) or disseminated (n = 13); ii) by clinical presentation, separating patients with skin and soft tissue infection with negative blood culture (n = 10), patients with osteoarticular infections (n = 56), and patients with pneumonia (n = 11).

Sampling time: Draw index and hospitalization quarter
This cross-sectional study included samples drawn on different days during hospitalization. To assess the influence of the time at which the sample was obtained during the course of the infection, we calculated a draw index, a numeric score between 0 and 1 calculated as the ratio of the blood draw day to the duration of hospitalization. Accordingly, samples were classified by hospitalization quarter (0 ≤ Quarter 1 < 0.25 ≤ Quarter 2 < 0.50 ≤ Quarter 3 < 0.75 ≤ Quarter 4).
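The draw index and quarter assignment can be sketched in a few lines of Python. The function names are hypothetical, and the quarter boundaries follow the intervals given above, with each lower bound inclusive and Quarter 4 extending to an index of 1.

```python
def draw_index(draw_day, stay_days):
    """Ratio of the blood draw day to the duration of hospitalization."""
    if stay_days <= 0:
        raise ValueError("duration of hospitalization must be positive")
    if not 0 <= draw_day <= stay_days:
        raise ValueError("draw day must fall within the hospitalization")
    return draw_day / stay_days


def hospitalization_quarter(index):
    """Map a draw index in [0, 1] to hospitalization quarter 1 through 4."""
    if not 0.0 <= index <= 1.0:
        raise ValueError("draw index must lie between 0 and 1")
    if index < 0.25:
        return 1
    if index < 0.50:
        return 2
    if index < 0.75:
        return 3
    return 4


# A sample drawn on day 5 of a 10-day stay has a draw index of 0.5.
idx = draw_index(5, 10)
quarter = hospitalization_quarter(idx)
```

A low index places the sample close to admission and a high index close to discharge, so the mean draw index of 0.64 reported for the quiescent patients falls in Quarter 3.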

Characterization of Bacterial Isolates
Bacterial isolates from 63 patients were recovered from blood culture, synovial fluid, or abscesses. Single colonies were selected and sub-cultured. S. aureus was confirmed by nuclease PCR and isolates were tested for methicillin resistance by mecA PCR. SCCmec typing of MRSA isolates was performed by classifying the ccr and mec complexes [28]. agr locus typing was performed and genetic relatedness was determined by repetitive-element, sequence-based PCR (rep-PCR). Genes encoding toxins were detected by traditional PCR [28,29] (Figure S1, Table S3).

Figure legend. Patients were organized in four quarters, Q1 through Q4, based on blood draw index (ratio draw day/hospitalization duration). A low draw index signifies proximity to hospital admission while a high draw index signifies proximity to discharge. Transcripts differentially expressed between the four draw index quarters were selected by non-parametric ANOVA (Kruskal-Wallis, p < 0.01, Benjamini-Hochberg false discovery rate) and represented as a heatmap (red, yellow, blue). The same statistical filter was applied at the module level (red, white, blue heatmap below). Individual MDTH was represented above as a line chart.
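The Benjamini-Hochberg false discovery rate correction named in the legend above can be illustrated with a short Python sketch. It takes raw p-values (for example, one Kruskal-Wallis p-value per transcript, assumed to be computed elsewhere) and returns adjusted values. Whether the authors applied the p < 0.01 threshold to adjusted or raw values is not stated, so the thresholding step in the usage lines is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
# Assumed usage: keep transcripts whose adjusted p-value is below 0.01.
selected = [i for i, p in enumerate(adj) if p < 0.01]
```

With these four toy p-values the adjusted values are 0.02, 0.04, 0.04 and 0.02, so none would pass a 0.01 cutoff after correction.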

RNA Preparation and Microarray Hybridization
RNA was processed as described elsewhere [12]. The data are deposited in the NCBI Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo, GEO Series accession number GSE30119).

Batch Correction
To prevent batch effects between training and test sets, principal variance component analysis (PVCA) was conducted using JMP Genomics (SAS Institute, Cary, NC) to identify sources of batch effect. Cohort number accounted for 58% of the variability observed (Figure S2A), and scatter plot visualization (Figure S2B) with ellipsoids (Figure S2C) revealed strong segregation of samples based on cohort. The batch correction algorithm CombatR [30] was used on the cohort variable to reduce its contribution to the global variance. The batch effect from the cohort was reduced to approximately 0% (Figure S2D, Figure S2E and Figure S2F).

Module Framework Development
This analysis strategy has been described elsewhere [16]. A set of 62 transcriptional modules derived from 410 whole blood gene expression profiles was applied to the dataset described herein. Modules were annotated with Ingenuity Pathway Analysis (IPA) (Ingenuity Systems, Redwood City, CA), Pubmed, iHOP, and Novartis Gene Atlas (http://biogps.gnf.org) databases. Module transcript content and annotations are available online (http://www.biir.net/public_wikis/module_annotation/V2_Trial_8_Modules).

Module-level Analysis
Gene expression levels were compared between patients and healthy controls on a module-by-module basis. The percentage of transcripts showing significant differences in expression (Mann-Whitney, p < 0.05) was used as an indicator of module activity. Modules containing transcripts with increased expression were represented on a red scale while those containing transcripts with decreased expression were represented on a blue scale. Modules with 15% or fewer transcripts showing a significant change in abundance compared to healthy controls were not displayed.

Figure 5. The osteoarticular infection signature displays increased blood coagulation. We compared the transcriptional signatures from patients with pneumonia and patients with osteoarticular infections. To properly balance osteoarticular and pneumonia groups, patients with pneumonia with a draw index less than 0.75 (nine patients) were selected (active disease). Nine patients with osteoarticular infection were selected with matching MDTH so that the global quantitative signature was equivalent between the two groups. Nine healthy controls were selected from the training set and nine from the test set (18 healthy controls in total) as reference. A. Top left panel: mean module map for the nine patients with osteoarticular infections compared to the 18 healthy controls. Bottom left panel: mean module map for the nine patients with staphylococcal pneumonia compared to the 18 healthy controls. Top right panel: subtraction map of osteoarticular infections minus pneumonia. Only differences greater than 40% are represented. Bottom right panel: annotation legend for modules identified. B. Heatmap representing genes differentially expressed (t-test, p < 0.05, no correction) between osteoarticular infections and pneumonia (hierarchical clustering, Pearson). 190 genes were upregulated 1.5-fold or more in osteoarticular infections versus pneumonia and healthy controls. 185 genes were upregulated 1.5-fold or more in pneumonia versus osteoarticular infections and healthy controls. C. Area chart representing PANTHER comparison for pathway enrichment between the two lists from B.
doi:10.1371/journal.pone.0034390.g005

Figure S3 Determination of the best number of K-means clusters from individual patient module expression. An appropriate K was chosen for clustering this dataset using an information theoretic approach called the "jump method". First, the data were clustered for all K from 1 to 5, inclusive. A. The distortion of each clustering was calculated. In this instance, we chose to approximate the covariance matrix with the identity matrix so the distortion is simply the mean squared error. B. Next, the transformed distortion for each K is calculated by raising the distortion to a power of -(# of dimensions/2). C.
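The transformation described for the jump method can be sketched in Python as follows. The sketch assumes the distortion (mean squared error) for each K has already been obtained from K-means runs, and the numeric values below are invented for illustration. The final selection step, picking the K with the largest increase in transformed distortion and treating the transformed distortion at K = 0 as zero, follows the jump method as generally published and is included here as an assumption, since the legend above is cut off before describing it.

```python
def choose_k_by_jump(distortions, n_dimensions):
    """Pick the number of clusters with the largest jump.

    distortions:  dict mapping K to the positive distortion of that
                  clustering (assumed precomputed).
    n_dimensions: number of dimensions of the clustered data.
    """
    power = -n_dimensions / 2.0
    previous = 0.0  # transformed distortion for K = 0, by convention
    best_k, best_jump = None, float("-inf")
    for k in sorted(distortions):
        transformed = distortions[k] ** power
        jump = transformed - previous
        if jump > best_jump:
            best_k, best_jump = k, jump
        previous = transformed
    return best_k


# Illustrative distortions for K = 1..5 on two-dimensional data.
toy = {1: 10.0, 2: 4.0, 3: 1.0, 4: 0.9, 5: 0.85}
best = choose_k_by_jump(toy, n_dimensions=2)
```

In the toy example the distortion drops sharply between K = 2 and K = 3 and flattens afterwards, so the largest jump occurs at K = 3.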