Promiscuous signaling by a regulatory system unique to the pandemic PMEN1 pneumococcal lineage

Streptococcus pneumoniae (pneumococcus) is a leading cause of death and disease in children and the elderly. Genetic variability among isolates from this species is high. These differences, often the product of gene loss or gene acquisition via horizontal gene transfer, can endow strains with new molecular pathways, diverse phenotypes, and ecological advantages. PMEN1 is a widespread and multidrug-resistant pneumococcal lineage. Using comparative genomics, we have determined that a regulator-peptide signal transduction system, TprA2/PhrA2, was acquired by a PMEN1 ancestor and is encoded by the vast majority of strains in this lineage. We show that TprA2 is a negative regulator of a PMEN1-specific gene encoding a lanthionine-containing peptide (lcpA). The activity of TprA2 is modulated by its cognate peptide, PhrA2. Expression of phrA2 is density-dependent, and the C-terminus of PhrA2 relieves TprA2-mediated inhibition, leading to expression of lcpA. In the pneumococcal mouse model with intranasal inoculation, TprA2 had no effect on nasopharyngeal colonization but was associated with decreased lung disease via its control of lcpA levels. Furthermore, the TprA2/PhrA2 system has integrated into the pneumococcal regulatory circuitry, as PhrA2 activates TprA/PhrA, a second regulator-peptide signal transduction system widespread among pneumococci. Extracellular PhrA2 can release TprA-mediated inhibition, activating expression of TprA-repressed genes in both PMEN1 cells and another pneumococcal lineage. Acquisition of TprA2/PhrA2 has provided PMEN1 isolates with a mechanism to promote commensalism over dissemination and control inter-strain gene regulation.


Introduction
Streptococcus pneumoniae (pneumococcus) is one of the most important community-acquired human pathogens, and is responsible for an estimated 850,000 deaths annually in children under the age of 5 [1]. Pneumococcus colonizes the nasopharynx of young children at very high rates, and is asymptomatic in most cases [2,3]. However, it can also disseminate from the nasopharynx into tissues, leading to diseases such as otitis media, pneumonia, bacteremia, meningitis, and inflammation of the heart [4][5][6]. The pneumococcal molecules responsible for this transition from a commensal to a pathogen are not well understood. Here we characterize a novel quorum sensing (QS) system (TprA2/PhrA2) that limits pneumococcal disease without affecting nasopharyngeal colonization.
At the genomic level, there is extensive diversity among pneumococcal lineages. These genomic variations contribute to the differences in colonization and virulence potential [7]. Only half of the pangenome is shared across all strains (core set), while the other half is unevenly distributed amongst isolates [8,9]. The Pneumococcal Molecular Epidemiology Network (PMEN) has grouped strains of multilocus sequence type (MLST) 81 into the PMEN1 lineage (also known as Spain 23F-1 and SPN23F) [10]. Over the past 30 years, PMEN1 has distinguished itself by its worldwide distribution, multidrug-resistant profile, and emergence of vaccine-escape strains.
Historically, the PMEN1 lineage was responsible for the Spanish epidemic of the 1980s and has since spread to North and South America, Europe, Asia, Africa, and Australia [2,10]. Most PMEN1 isolates are resistant to penicillin, chloramphenicol, and tetracycline, and many isolates have additional resistances to fluoroquinolones and macrolides [11,12]. PMEN1 isolates are predominantly of serotype 23F, but there are also capsular switches to other serotypes, some of which represent vaccine-escape isolates [13]. Further, the PMEN1 lineage has impacted the genome content of the pneumococcal population by virtue of its high frequency of DNA donation, including genes for drug resistance, to other pneumococcal lineages [14]. The PMEN1 genome encodes an integrative conjugative element (ICESp23FST81) [13,15,16]. As described by Croucher and colleagues upon sequencing of the first PMEN1 genome, this ICE encodes drug resistance determinants, a complete lanthionine-peptide gene cluster, and a regulator-peptide pair, which in this study we have identified as the TprA2/PhrA2 QS system.
Quorum sensing systems serve as a critical decision-making process in the response of bacteria to the environment, and in their ability to colonize and/or disseminate to tissues. The best characterized kind of QS machinery is the two-component system, where the signal is sensed by a surface-localized histidine kinase and transferred to a cytosolic response regulator [17]. Streptococci, enterococci, and bacilli have been shown to encode a second kind of QS characterized by the emerging RRNPP (Rgg/Rap/NprR/PlcR/PrgX) superfamily of transcriptional regulators and their cognate peptides [18]. In these systems, the secreted peptide is exported from the producer cell, processed, and imported into the cytosol of producing or neighboring cells, where it interacts with the RRNPP regulator [18]. RRNPP-peptide systems have been shown to regulate virulence, biofilm formation, and the production of bacteriocins [19][20][21].
In pneumococcus, the majority of characterized peptides signal via two-component systems [17]. These peptides regulate competence and class II bacteriocin production [22,23]. The first RRNPP-peptide pair was recently characterized in the pneumococcus strain D39 [24]. It is composed of the TprA regulator and its cognate peptide PhrA. PhrA alleviates gene inhibition, leading to the expression of physiologically important genes [24]. PhrA levels are repressed by glucose and activated by galactose, consistent with activity in the upper respiratory tract, where galactose is a major source of energy [25].
In this study we characterize the TprA2/PhrA2 QS system, a novel pneumococcal RRNPP-peptide pair that is highly expressed in middle ear effusions. TprA2/PhrA2 is present almost exclusively in PMEN1 isolates, where it restrains dissemination. Unlike other lineages, the PMEN1 strains encode both the TprA/PhrA and the TprA2/PhrA2 signaling systems. Extracellular PhrA2 leads to induction of TprA in PMEN1 cells as well as in D39 cells. Thus, horizontal acquisition of TprA2/PhrA2 has provided the PMEN1 lineage with a QS system and associated regulon, as well as the molecular machinery to regulate a widespread cell-cell communication system and, in doing so, influence not only its own gene expression but also that of other strains.

Results
The genes encoding the TprA2/PhrA2 system are enriched in the PMEN1 lineage
Genes enriched in the PMEN1 strains may provide this lineage with exclusive phenotypic properties, explaining its prevalent occurrence and rapid spread. We performed a comparative genomic screen to search for genes that are present in the majority of the PMEN1 isolates, but absent in other pneumococcal lineages. The analysis was performed on 60 pneumococcal genomes, selected to capture the diversity in the pneumococcal population (S1 Table, labeled "To establish PMEN1 enrichment"). We employed RAST [26] to annotate the whole genome sequences (WGS) into 125,612 coding sequences (CDSs), and organized these into 3,571 clusters of homologous sequences as previously described [27]. The screen identified a genomic region present only in the PMEN1 strains. This region encodes a transcriptional regulator (tprA2) on the opposite strand of a small peptide (phrA2) and three ABC transporters. Immediately downstream are three genes: lcpA, lcpM, and lcpT. The lcpA gene encodes a putative 71-aa peptide with a full-length mass of 7.5 kDa, which we predict is a lanthionine-containing peptide. Lanthionine and methyllanthionine are usually formed by the dehydration of threonines or serines, and subsequent cyclization to cysteine (lcpA encodes serine, threonine, and cysteine residues) [28]. Cyclization is performed by lanthipeptide synthetases, of which there are four known classes [29]. The lcpM gene downstream of lcpA is consistent with class II synthetases (CDD score: LanM-like e-value 0e+00 [30]). Finally, lcpT encodes a transporter with a C39 peptidase domain, which we predict is involved in LcpA cleavage and export (Fig
We performed a detailed assessment of the phylogenetic distribution of the QS-Lcp genes in the pneumococcus species and the Streptococcus genus. First, to assess the distribution of TprA2 in the PMEN1 lineage, we searched for this gene in 215 PMEN1 isolates. To this end we used either polymerase chain reaction (PCR) or genomic data assembled by Croucher and colleagues [13]. The tprA2 gene was present in 212 isolates. It was either disrupted or deleted in the genomes of strains 111 (ERS004810), 11933 (ERS005313), and HKP38 (ERS004775) (genome data were confirmed by PCR). Next, we broadened our search to the non-redundant database, which revealed that tprA2 was present in only one strain outside the PMEN1 lineage (GA13494) [15] (Fig 2). Finally, we expanded our search for tprA2 to related streptococcal species, specifically S. pseudopneumoniae, S. mitis, S. oralis, and S. infantis (S1 Table, labeled "Distribution with Streptococcus sp"). We found one occurrence in S. mitis and one in S. infantis, but these species did not encode the downstream lcpAMT locus (Fig 2). These phylogenetic analyses demonstrate that the QS system and lcpAMT are present in >98% of the PMEN1 isolates and are rare outside this lineage. This distribution suggests these genes were acquired via horizontal gene transfer by a PMEN1 ancestral strain.

QS-Lcp genes are induced and highly expressed in vivo
To determine whether QS-Lcp genes are active during infection, we measured their gene expression during middle ear infection. We utilized the nCounter NanoString technology since this allows for an automated, highly sensitive enumeration of the pathogen's mRNA transcripts in the infected host tissue. Our probes capture tprA2, lcpA, lcpM, and lcpT. Further, since we were unable to design a probe for the short coding sequence of phrA2, we used ABCATPase as a proxy since it is present on the same transcript (S1 Fig). For normalization we used probes to gyrB and metG, and normalized to the geometric mean of these housekeeping genes. The PMEN1 strain PN4595-T23 [31] was inoculated transbullarly into the chinchilla. We isolated RNA from effusions of the chinchilla middle ears at 48h post-transbullar inoculation. All five genes were expressed in middle ear effusions (Fig 3). The average counts for ABCATPase and lcpA were comparable to those of psaA (56,036 counts), which has been shown to be highly expressed in vivo [32], consistent with high levels of QS-Lcp in vivo. To evaluate whether these genes were induced in the middle ear relative to growth in rich media, we calculated the ratio of the average number of transcripts between middle ear effusions and in vitro planktonic cultures. The gene expression levels of ABCATPase, lcpA, lcpM, and lcpT were 69-, 108-, 93-, and 45-fold higher in vivo relative to planktonic cultures, respectively. From these in vitro and in vivo measurements we infer that the QS-Lcp system is both induced and highly expressed during infection.

The expression of phrA2 is regulated in a density-dependent manner
The expression of sensory peptides can be cell-density dependent (reviewed in detail in [33]). Using quantitative real time PCR (qRT-PCR) we found that phrA2 is regulated in a density-dependent manner. Expression of phrA2 increases at higher cell density, as observed by measuring gene expression at lag, early-log, mid-log, and stationary phase (Fig 4, solid bars). Further, when a lag phase culture was left to grow for one hour, the levels of phrA2 expression increased 3-fold. When the same culture was exposed to cell-free supernatant from a wild-type high-density culture, the levels of phrA2 expression increased 8-fold. Yet, when it was exposed to the cell-free supernatant from a ΔphrA2-ABC high-density culture, the levels of phrA2 did not increase (Fig 4, striped bars). Thus, the wild-type cells, but not the ΔphrA2-ABC mutant, secrete a molecule that induces expression of phrA2 in the population. These data are consistent with secretion and autoinduction of PhrA2.

TprA2 is a negative regulator of phrA2-ABC and lcpAMT
To identify the TprA2 regulon, we compared the gene expression levels of the wild-type (WT) PMEN1 strain PN4595-T23 and the isogenic tprA2 deletion mutant (ΔtprA2), utilizing a pneumococcal gene array (S5 Table) [34]. The expression of the phrA2-ABC and lcpAMT genes was >30-fold higher in ΔtprA2 relative to the WT strain. These results were verified, using independent biological replicates, by both qRT-PCR and NanoString technology (Table 1). These findings suggest that TprA2 is a negative regulator of these neighboring genes.
To confirm the role of TprA2, we generated a complemented strain (ΔtprA2::tprA2) where tprA2 was inserted into the ΔtprA2 strain at a distant chromosomal location, under the control of the constitutive erythromycin-resistance gene promoter (ermB). We measured gene expression of tprA2, phrA2, ABC transporter ATPase, and lcpA in the WT, ΔtprA2, and ΔtprA2::tprA2 strains (Fig 5). The tprA2 gene was expressed in the ΔtprA2::tprA2 strain, and its expression level was higher than in the WT. Further, low levels of phrA2, ABC transporter ATPase, and lcpA were re-established in the complemented strain. These findings strongly support our conclusion that the gene product of tprA2 is a negative regulator of phrA2 and lcpAMT.

PhrA2 modulates the TprA2 regulon
The TprA2 regulator displays sequence similarity to the Bacillus sp. transcription factor PlcR and to the pneumococcal TprA, which are regulated by extracellular forms of the C-terminal heptapeptides from their cognate peptides [24,35]. Given that TprA2 is part of the PlcR family, we hypothesized that the C-terminal heptapeptide of PhrA2 would encompass a functional peptide capable of influencing TprA2 activity. Thus, we utilized synthetic peptides corresponding to the seven terminal residues of PhrA2 (sequence: VDLGLAD) and a scrambled control (sequence: DAGVLDL). Addition of the PhrA2 peptide, but not the scrambled peptide, to planktonic culture led to a significant increase in expression levels of tprA2, phrA2, ABC transporter ATPase, and the adjacent lcpAMT genes (Fig 6). The PhrA2 peptide up-regulates its own production, demonstrating autoinduction of this density-dependent system. We also observed an increase in the levels of tprA2, suggesting that TprA2 serves as a negative regulator of its own expression.
The induction of gene expression by the synthetic peptide explains the observation that supernatant from a high-density WT culture, but not from a ΔphrA2-ABC culture, can induce gene expression (Fig 4). Further, cell-free supernatant from a PhrA2-overexpressing strain increases levels of phrA2 and lcpA by over 5-fold when compared to media alone (S2 Fig). These findings strongly support a model in which the phrA2 gene product is exported.

TprA2 regulon in the middle ear
We investigated the regulation of the TprA2/PhrA2 system in vivo to verify whether our in vitro findings were relevant to the in vivo environment. We analyzed WT, ΔtprA2, and ΔtprA2::tprA2. Three chinchillas were independently inoculated with each strain, middle ear effusions were extracted 48 hours post-inoculation, and bacterial mRNA for tprA2, ABCATPase, lcpA, and lcpM was quantified using NanoString technology. As observed in vitro, deletion of tprA2 led to increased expression of ABCATPase (on the same transcript as phrA2) and lcpM (Fig 7). The lcpA values were also higher in this mutant, but displayed elevated inter-animal variability such that the change was not statistically significant. The modest fold increase is consistent with our observation that the TprA2 regulon in the WT is highly expressed in vivo, such that complete removal of the negative regulator has a moderate effect. In contrast, overexpression of tprA2 in the complemented strain led to a decrease in the levels of ABCATPase and lcpA. Together, these findings suggest TprA2 is a negative regulator of its neighboring genes in vivo.
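The NanoString counts in this study were normalized to the geometric mean of the gyrB and metG housekeeping probes, and strains or conditions were then compared as ratios. The following is a minimal sketch of that arithmetic only; the function names and all count values are hypothetical placeholders, not data or software from this study.

```python
from statistics import geometric_mean

def normalize_counts(raw_counts, housekeeping=("gyrB", "metG")):
    """Scale each target probe by the geometric mean of the housekeeping probes."""
    scale = geometric_mean([raw_counts[gene] for gene in housekeeping])
    return {
        gene: count / scale
        for gene, count in raw_counts.items()
        if gene not in housekeeping
    }

def fold_change(numerator, denominator):
    """Ratio of normalized counts for every gene present in both samples."""
    return {
        gene: numerator[gene] / denominator[gene]
        for gene in numerator
        if gene in denominator
    }

# Hypothetical raw counts for two samples (illustration only).
wt_raw = {"gyrB": 400.0, "metG": 100.0, "lcpM": 1000.0, "ABCATPase": 2000.0}
mutant_raw = {"gyrB": 200.0, "metG": 50.0, "lcpM": 1500.0, "ABCATPase": 2000.0}

wt_norm = normalize_counts(wt_raw)
mutant_norm = normalize_counts(mutant_raw)
mutant_vs_wt = fold_change(mutant_norm, wt_norm)

for gene, ratio in sorted(mutant_vs_wt.items()):
    print(f"{gene}: {ratio:.2f}-fold (mutant / WT)")
```

Dividing by the geometric mean rather than the arithmetic mean keeps a single highly expressed reference probe from dominating the scaling factor.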

TprA2 promotes commensalism over tissue dissemination
To assess the in vivo role of the QS-Lcp region, we made use of two pneumococcal infection models. To study colonization of the nasopharynx and spread to the lungs, we utilized a murine model where animals are inoculated intranasally and disease progresses, causing pneumonia or sepsis or both [36,37]. To study middle ear disease, we utilized the chinchilla otitis media model.
The murine model revealed that TprA2 protects against lung disease. We did not observe infection in mice inoculated with PN4595-T23 strains; thus, we generated the parallel mutants in another naturally occurring PMEN1 strain with a type 3 capsule (SV36). Cohorts of ten BALB/c mice were infected with SV36, SV36ΔtprA2, or SV36ΔphrA2-ABC and observed over 4 days. The bacterial titers in the nasal lavages were similar for all three strains when tested at 48 hours post-inoculation (Fig 8B). Notably, SV36ΔtprA2 displayed a statistically significant increase in mortality (Fig 8A). TprA2 is a negative regulator of lcpAMT (Fig 5). To test whether overexpression of lcpAMT in SV36ΔtprA2 was associated with the increased virulence of this strain, we tested a double mutant with deletions in tprA2 and lcpAMT and observed that it restored the wild-type phenotype. These results strongly suggest that LcpA is a virulence determinant, and that TprA2 can modulate virulence by controlling levels of lcpAMT.

[Table header: Gene ID; Target Gene; ΔtprA2 (qRT-PCR, Microarray, Nanostring)]
Finally, to study middle ear disease, bacteria were inoculated directly into the middle ear of chinchillas. The overall mortality was the same for all three strains, perhaps reflecting differences in peripheral disease progression from the chinchilla middle ear versus the murine nasopharynx (Fig 8C). Further, we observed a trend toward increased middle ear disease in the ΔtprA2 (Fig 8D), and the ΔtprA2 displayed the highest lung dissemination (S3 Table), consistent with our finding that lcpAMT plays a role in virulence. In conclusion, our findings suggest that TprA2 controls lcpA expression and in doing so can promote commensalism over dissemination.

PMEN1 codes for two related regulator/peptide systems
TprA2 shares moderate homology to TprA, another streptococcal transcription factor that belongs to the recently characterized TprA/PhrA system, where TprA inhibits expression of PhrA and downstream lantibiotic genes [24]. Unlike tprA2, which occurs rarely outside the PMEN1 lineage, tprA has a wide distribution in pneumococci. Using a set of highly curated WGSs, with representatives of the major lineages of S. pneumoniae, we found that tprA was present in over 90% of the isolates in our set (Fig 2, all tprA genes displayed ≥86% similarity). The prominent exception is a set of strains in a basal pneumococcal branch associated with unencapsulated strains and conjunctivitis infections [38,39] (Fig 2). Hoover and colleagues first characterized the TprA/PhrA system, and also reported a wide distribution (approximately 60%) in pneumococcal strains [24].
PMEN1 strains are notable in that they code for both the TprA2/PhrA2 and TprA/PhrA QS systems. In the PMEN1 strain PN4595-T23, the TprA and TprA2 protein sequences share approximately 60% identity. We searched the genomes of 55 streptococcal strains, identified 48 sequences to construct a phylogenetic tree of these regulators using maximum likelihood, and found that the tprA2 and tprA homologues are separated into two distinct branches (Fig 9A). Their cognate peptides in PMEN1, PhrA2 and PhrA, share only 28% identity over the full length, but display very high similarity at their C-termini. To analyze the extent of conservation of the C-terminal residues, we generated a consensus logo from the six PhrA2 sequences and the thirty-six PhrA sequences. The C-terminal residues are either identical or share similar charge in 6/7 residues, but can be distinguished by position -3, which codes for a conserved leucine in PhrA2 and a lysine in PhrA (Fig 9A and 9B). The sequence separation between the QS components suggests that the tprA2/phrA2 genes did not originate from a recent duplication within PMEN1, and is consistent with acquisition of TprA2/PhrA2 by horizontal gene transfer.
Interaction of TprA2/PhrA2 QS system with the TprA/PhrA QS system
The co-occurrence of both QS systems in the PMEN1 strains led us to investigate whether PhrA2 and PhrA peptides can exert regulatory effects on their non-cognate QS systems, TprA/PhrA and TprA2/PhrA2, respectively. To test this, we measured how the addition of synthetic peptides to the extracellular milieu affects gene expression of the non-cognate regulon. Addition of synthetic PhrA2 (VDLGLAD), but not the scrambled peptide, induced gene expression of the TprA regulon (tprA, phrA, and the TprA-associated lanA, lanM, and lanT) at levels similar to those induced by cognate PhrA (LDVGKAD) itself (Fig 10A). In contrast, neither the addition of synthetic PhrA nor the addition of the scrambled peptide had any effect on expression of the tprA2, phrA2, or lcpA genes in the TprA2/PhrA2 regulon (Fig 10B). These findings suggest that PhrA2 regulates gene expression of the TprA regulon, and PhrA has no effect on the TprA2 regulon.
PhrA2 regulates the TprA/PhrA system in non-PMEN1 strains
The unidirectional influence of PhrA2 on gene expression of the TprA/PhrA system led us to investigate whether the PMEN1 peptide could influence gene expression in non-PMEN1 cells. We used strain D39 as a representative of the non-PMEN1 strains since the TprA/PhrA system has been characterized in this strain [24]. The phrA gene is expressed in galactose and repressed in glucose, and the phrA promoter region contains a cre (catabolite response element) site for CcpA catabolite repression [24,40]. In contrast, we have not identified a cre site in the phrA2 promoter region. Therefore, to maximally discern the input through PhrA2 in our experiment, we used a D39-derived strain with a deletion of phrA and grew it in chemically defined medium with galactose as the sole sugar.
We found that exogenous PhrA2 interacts with the TprA regulon in non-PMEN1 strains. Specifically, D39ΔphrA cultures were exposed to treatments with synthetic PhrA2, PhrA, and scrambled peptides for an hour, and gene expression of tprA and lanA was measured relative to no treatment. Treatment with PhrA2 significantly induced expression of tprA and lanA by 11-fold and 2-fold, respectively (Fig 11). Treatment with scrambled peptide showed no induction of gene expression in D39ΔphrA. The extent of lanA induction by PhrA is lower in the D39ΔphrA strain than in experiments with the WT strain (Fig 10A); we presume this difference is due to the absence of phrA autoinduction in the mutant strain. These findings suggest that PhrA2 can be internalized by strains outside the PMEN1 lineage and induce changes in their gene expression.

Discussion
Our findings demonstrate that acquisition of the TprA2/PhrA2 QS system by horizontal gene transfer into the PMEN1 lineage has endowed these strains with a virulence determinant and a mechanism to regulate its expression and thereby control disease. The PMEN1 (ST81) lineage is postulated to have evolved from an ancestor in 1967, and by the end of the 1990s it represented an estimated 40% of penicillin-resistant strains in the US [14,41]. These strains display very high rates of carriage [2,3,41,42]. PMEN1 also displays very high rates of disease [2,3,43]. Is the prevalence of PMEN1 in invasive disease a function of its carriage rates, or does it reflect a propensity to cause disease? Multiple studies have shown that sequence types vary regarding their propensity to cause disease [44][45][46][47], and Sjostrom et al. show that PMEN1 displays a low propensity to cause invasive disease [47]. Thus, high rates of PMEN1 invasive disease in the population likely reflect high carriage rates, and not heightened virulence potential. In this context, it is possible that acquisition of TprA2/PhrA2 by PMEN1 strains contributes to their low proclivity to cause invasive disease.
TprA2/PhrA2 may provide PMEN1 strains with the means to manipulate gene expression in neighboring strains from other lineages in multi-strain infections. We show that synthetic C-terminal PhrA2 can stimulate expression of the TprA/PhrA system as well as its associated lantibiotic biosynthesis cluster in the distantly related strain D39 (Figs 11 and 12). We have observed that the expression of PMEN1-phrA2 is six-fold that of D39-phrA in rich media, thus exemplifying a condition where PMEN1-phrA2 expression is high when D39-phrA is low (S3 Fig). We are currently investigating this interaction in physiologically relevant conditions. The activation of phrA in response to galactose has led to the conclusion that TprA/PhrA may promote colonization in the nasopharynx, where free sugars are rare and pneumococci survive by breaking down host mucins to free complex sugars, most prominently galactose [24]. However, experiments with TprA/PhrA in the murine model demonstrate that this system is a virulence determinant in multiple models of pneumococcal disease (personal communication, Motib and Yesilkaya). In this manner, PhrA2 may trigger a virulence regulon in neighboring strains. We propose that PhrA2 signaling across systems is physiologically relevant in multi-strain infections.
We conclude that the PhrA2 peptide is secreted by PMEN1 cells, since cell-free culture supernatants reproduce the effect of extracellular addition of synthetic PhrA2. We predict that export occurs via the Sec secretion system, consistent with other peptides from the PlcR family of regulator-peptide pairs [48][49][50]. Import must occur via a relatively widespread transporter, given that PhrA2 can influence D39 gene expression. Further, the high sequence similarity between the functional C-termini of PhrA and PhrA2 suggests common import machinery. The oligopeptide permease amiACDEF has been shown to be required for import of processed PhrA, and its homologues are required for import of PlcR-associated peptides in other species [48][49][50]. Thus, amiACDEF is a high-value candidate for a PhrA2 importer.
Sequence comparisons suggest that LcpA is a bacteriocin; however, its function remains unknown. We propose that its effect on virulence is not the result of bactericidal activity, given that mouse experiments were performed with single strains. However, we cannot exclude the possibility that an interaction between LcpA and the natural microbiome of the mouse influences the outcome of the infection. The function of LcpA is under investigation.
We have identified and characterized a new quorum sensing system from the emerging RRNPP family. TprA2/PhrA2 consists of a negative regulator of a lanthionine-containing peptide and a cognate activating peptide. Our findings suggest that this system has provided PMEN1 with the ability to control LcpA virulence and perhaps influence its propensity to cause invasive disease. Finally, to our knowledge this is the first example of a gene transfer event that has integrated with an ancestral regulatory network to control inter-strain gene regulation.

Ethics statement
Laboratory animals were maintained in accordance with the applicable portions of the Animal Welfare Act and the guidelines prescribed in the DHHS publication, Guide for the Care and Use of Laboratory Animals. The Office of Laboratory Animal Welfare (OLAW) Assurance of Compliance number is A3693-01. All chinchilla experiments were conducted with the approval of the Allegheny-Singer Research Institute (ASRI) Institutional Animal Care and Use Committee (IACUC) A3693-01/1000. Research grade young adult chinchillas (Chinchilla lanigera) weighing 400-600 grams were acquired from R and R Chinchilla Inc., Ohio. Animals were maintained in BSL2 facilities and all experiments were done while chinchillas were under subcutaneously injected ketamine-xylazine anaesthesia (1.7mg/kg animal weight for each). For virulence studies, chinchillas (a minimum of 10 in each cohort) were infected with 100 CFUs/ear by transbullar inoculation within each middle ear. During the course of the experiment (10 days), animals with severe acute infection perished; animals showing prolonged signs of discomfort were administered pain relief (Rimadyl, 0.1ml of 50mg/mL). Animals with severe signs of pain and illness were euthanized by administering an intra-cardiac injection of 1mL potassium chloride after regular sedation. All experiments involving mice were performed with prior approval of and in accordance with guidelines of the St. Jude Institutional Animal Care and Use Committee. The St. Jude laboratory animal facilities have been fully accredited by the American Association for Accreditation of Laboratory Animal Care. All mice were maintained in BSL2 facilities and all experiments were done while the mice were under inhaled isoflurane (2.5%) anesthesia. Mice were monitored daily for signs of infection. This work was approved under the IACUC protocol number 538-100013-04/12 R1. Mice were monitored for disease progression and euthanized via CO2 asphyxiation.

Comparative genomics
We performed a comparative genomic analysis of PMEN1 and non-PMEN1 strains to identify genes unique to the PMEN1 lineage [27]. To this end, we used a set of 60 curated pneumococcal whole-genome sequences (WGS), including four from the PMEN1 lineage (S1 Table). The set of 60 genomes includes the 44 genomes used for the first large-scale pneumococcal pangenome study [8], additional genomes from PCV-7 immunized children [51], as well as genomes from non-encapsulated strains [52]. Together these strains reflect a large variety of multilocus sequence types (MLSTs) and serotypes, as well as strains isolated from different disease states and geographic locations.
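The screening criterion (gene clusters present in the majority of PMEN1 genomes and absent from other lineages) can be illustrated with a presence/absence lookup over homologous gene clusters. The sketch below is not the pipeline used in this study [27]; the function name, thresholds, cluster labels, and genome labels are all hypothetical.

```python
def lineage_enriched_clusters(presence, lineage_genomes,
                              min_fraction_in=0.5, max_count_out=0):
    """Return clusters found in more than `min_fraction_in` of the lineage
    genomes and in at most `max_count_out` genomes outside the lineage.

    presence: dict mapping cluster id -> set of genome ids carrying it.
    """
    lineage = set(lineage_genomes)
    hits = []
    for cluster, genomes in presence.items():
        inside = len(genomes & lineage)    # lineage genomes with the cluster
        outside = len(genomes - lineage)   # non-lineage genomes with the cluster
        if inside / len(lineage) > min_fraction_in and outside <= max_count_out:
            hits.append(cluster)
    return sorted(hits)

# Hypothetical toy data: four lineage genomes (P1-P4), two others (N1-N2).
pmen1_like = ["P1", "P2", "P3", "P4"]
toy_presence = {
    "cluster_core": {"P1", "P2", "P3", "P4", "N1", "N2"},  # shared by all
    "cluster_enriched": {"P1", "P2", "P3"},                # majority of lineage only
    "cluster_rare": {"P1"},                                # too rare in lineage
    "cluster_other": {"N1", "N2"},                         # outside lineage only
}

enriched = lineage_enriched_clusters(toy_presence, pmen1_like)
print(enriched)
```

Raising `max_count_out` above zero would tolerate rare occurrences outside the lineage, such as the single non-PMEN1 strain reported to carry tprA2.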
To determine the distribution of tprA2 across pneumococcal strains, we searched for this gene in the genome sequences of 215 PMEN1 isolates [13]. A few genomes displayed disruption in the tprA2 locus, so the sequences were confirmed by PCR. Primers to tprA2 and gapdh (positive control) were used to amplify these respective genes from genomic DNA. The genomes from strains 111 (ERS004810), 11933 (ERS005313), and HKP38 (ERS004775) display substantial differences in the locus encoding TprA2/PhrA2.
To search for cre sites, we inspected the 190 base pairs upstream of phrA2 and before the start of tprA2. We searched for the cre site motif from L. lactis (WGWAARCGYTWWMA), and allowed for up to three discrepancies, as has been observed in a subset of S. pneumoniae cre [40,53].
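A degenerate-motif search of this kind can be sketched as a sliding window scored against the IUPAC nucleotide codes, keeping windows with at most three mismatches. This is an illustrative sketch, not the tool used in the study: the function names are hypothetical, the example sequence is invented, and only the forward strand is scanned.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "M": "AC", "K": "GT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

CRE_MOTIF = "WGWAARCGYTWWMA"  # L. lactis cre motif quoted in the text

def count_mismatches(window, motif=CRE_MOTIF):
    """Number of positions where the base is not allowed by the motif code."""
    return sum(base not in IUPAC[code] for base, code in zip(window, motif))

def find_cre_like_sites(sequence, motif=CRE_MOTIF, max_mismatches=3):
    """Scan the forward strand; return (start, window, mismatches) per hit."""
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        mismatches = count_mismatches(window, motif)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits

# Invented example: a perfect match to the motif embedded in flanking bases.
example = "CCCC" + "TGTAAGCGCTTTCA" + "GGGG"
for start, window, mismatches in find_cre_like_sites(example):
    print(start, window, mismatches)
```

To cover both strands, the same scan could be repeated on the reverse complement of the region.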
For growth on solid media, S. pneumoniae (PN4595-T23) and isogenic mutants were streaked on TSA II plates with 5% sheep blood (BD BBL, New Jersey, USA). For growth in liquid culture, colonies from a frozen stock were grown overnight on TSA plates, inoculated into Columbia broth (Remel Microbiology Products, Thermo Fisher Scientific, USA), and incubated at 37˚C and 5% CO2 without shaking. Columbia broth contains 10mM glucose. Experiments in chemically defined media (CDM) were performed utilizing a previously published recipe [40], and galactose was used at a final concentration of 55mM. Growth in CDM was initiated by growing a pre-culture for 9 hours and back-diluting to OD600 0.1 to initiate a culture.

Generation of deletion mutants and complement strains
All deletion mutant strains were generated by site-directed homologous recombination where the target region was replaced with the spectinomycin-resistance gene (aadR) or kanamycin-resistance gene, as previously described [27,54]. Briefly, ~2kb of flanking region upstream and downstream of the deletion target were amplified from the parental strain by PCR using Q5 2x Master Mix (New England Biolabs, USA), and the spectinomycin-resistance gene was amplified from the plasmid pR412 (provided by Dr. Donald Morrison). Assembly of the transforming cassette was achieved either by sticky-end ligation of restriction enzyme-cut PCR products or by Gibson Assembly using the NEBuilder HiFi DNA Assembly Cloning Kit. The resulting construct was transformed into PN4595-T23 and confirmed using PCR and DNA sequencing.
Complement strains were made by generating a cassette in which ~100bp of the 5'UTR and the CDS of the gene to be complemented were fused to the 3' end of an antibiotic selection cassette lacking a transcription terminator. This cassette was introduced into the genome of the strain at one of two regions: the intergenic region between the orthologues of spr_0515 and spr_0516, an inert genomic region that has been successfully employed in other constructs in the lab, or the bga region, a commonly employed site for complementation [55]. After transformation, qRT-PCR (LightCycler480, Roche Life Sciences, USA) was performed to verify the expression level of the complemented gene. Primers used to generate the constructs are listed in S4 Table.

Bacterial transformations
For all bacterial transformations, about 1μg of transforming DNA was added to a growing culture of the target strain at an OD600 of 0.05, supplemented with 125μg/mL of CSP2 (sequence: EMRISRIILDFLFLRKK; purchased from GenScript, NJ, USA), and incubated at 37˚C. After 4 hours, the treated cultures were plated on Columbia agar containing the appropriate concentration of antibiotic for selection (spectinomycin, 100μg/mL; erythromycin, 2μg/mL; kanamycin, 150μg/mL). Resistant colonies were cultured in media, the region of interest was amplified by PCR, and the amplimer was submitted for Sanger sequencing (Genewiz, Inc., USA) to verify the sequence of the mutants. The strains generated in this study are listed in Table 2.

Treatment with synthetic peptides
Bacterial cultures were treated with synthetic peptides corresponding to the following sequences: 1) C-terminal PhrA2 heptamer (VDLGLAD); 2) C-terminal PhrA heptamer (LDVGKAD); and 3) a scrambled peptide comprising the same residues as the PhrA2 heptamer (DAGVLDL). These were custom ordered from GenScript (NJ, USA) at 99.7% purity. Peptide (1μM) was added in mid-log phase (OD600 of 0.5), and cultures were incubated at 37˚C, 5% CO2 for 1 hour, after which RNAlater (Ambion, Thermo Fisher Scientific, USA) was added to the cultures to preserve RNA, and subsequent RNA extraction and qRT-PCR were performed. For experiments where different peptides were compared in parallel, the original culture was distributed into separate tubes, and each one was treated with the relevant peptide, in addition to a no-peptide control. Using a single parent culture for different peptide additions ensured minimal variation when comparing treatments.

Treatment with cell-free supernatant
To determine whether secreted peptides can stimulate gene expression in a recipient wild-type culture, recipient cultures and supernatant donor cultures were grown in parallel to the selected OD600. To prepare cell-free supernatant, bacterial cells were pelleted and the supernatants were filtered (pore size 0.2 microns). At the desired OD600, the wild-type recipient culture was distributed into separate tubes, and the cultures were centrifuged at 4000g for 7 minutes and resuspended in the same volume of cell-free supernatant or media control. At 1 hour post-treatment, RNAlater (Ambion, Thermo Fisher Scientific, USA) was added to each culture, and samples were prepared for RNA extraction and qRT-PCR.

Preparation of cell lysates and RNA collection, extraction, and quality assessment
For in vitro transcriptional analysis, samples were collected for RNA extraction at an OD600 of 0.5 unless otherwise stated, and RNA was prepared as previously described [34]. For RNA extraction from in vivo experiments, chinchillas were euthanized 48h post-inoculation with PN4595-T23, and a small opening was generated through the bulla to access the middle ear cavity. Effusions were siphoned out from the middle ear and flash frozen in liquid nitrogen to preserve the bacterial RNA. For bacterial cell lysis, the samples were re-suspended in an enzyme cocktail (2mg/mL proteinase K, 10mg/mL lysozyme and 20μg/mL mutanolysin) and subjected to bead beating with acid-washed glass beads (425-600μm; Sigma) and 0.5mm zirconia/silica Disruption Beads in a FastPrep-24 Instrument (MP Biomedicals, USA). These cell lysates were frozen for microarray, qRT-PCR or nanoString analyses. The RNA concentration was measured with a NanoDrop 2000c spectrophotometer (Thermo Fisher Scientific, USA) and its integrity was confirmed by gel electrophoresis.

Microarray analyses of gene expression levels
We utilized the Pneumococcal Supragenome Hybridization Array (SpSGH) to compare gene expression between the wild-type PN4595-T23 strain and the ΔtprA2 mutant [34]. The array provides coverage for ~85% of the PMEN1 open reading frames. Strains were grown to mid-log phase (OD600 0.5) in Columbia broth (note that glucose in the media will inhibit genes under catabolite repression). RNA extraction, cDNA preparation and cDNA labeling were performed as previously described [34]. Cyber T was used for data analysis [56,57]. Genes with at least a 10-fold difference between strains, Bayesian P values < 0.05, Benjamini-Hochberg FDR < 10%, and Bonferroni-corrected P value < 0.05 are displayed in Table 1. The complete dataset is deposited in GEO web storage (under submission).
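The two multiple-testing corrections named in the filtering criteria can be sketched generically as follows. This is not the Cyber T implementation and does not reproduce its Bayesian P values; it only illustrates how Bonferroni and Benjamini-Hochberg adjustments are computed from a list of raw P values. The function names are hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each P value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values (step-up FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A gene would then pass the reported filter if its fold difference is at least 10, its Benjamini-Hochberg value is below 0.10, and its Bonferroni-corrected value is below 0.05.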

qRT-PCR analyses of gene expression levels
High quality RNA (DNA free and A260/280 ~2.1) was used as template for first-strand cDNA synthesis using the SuperScript VILO synthesis kit (Invitrogen). After first-strand cDNA synthesis, the product was used directly for qRT-PCR using LightCycler480 Master Mix SYBR-Green in a LightCycler480 Instrument (Roche Life Sciences, USA). For normalization, we used 16S rRNA, as well as gyrB (DNA gyrase subunit B) and/or gapdh (glyceraldehyde-3-phosphate dehydrogenase). The raw data were converted using the "LC480 Conversion: conversion of raw LC480 data" software (available at http://www.hartfaalcentrum.nl/index.php?main=files&sub=0) and analyzed with LinRegPCR [58,59], where the output expression data are displayed in arbitrary fluorescence units (N0) that represent the starting RNA amount for the test gene in that sample. Statistical significance was determined by Student's t-test (unpaired samples, one-tailed) using GraphPad Prism 6.
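To make the use of N0 values concrete, the sketch below shows one plausible way to turn them into a fold change: each target N0 is divided by the N0 of a reference gene from the same sample, and the treated ratio is divided by the control ratio. This is an assumption about the arithmetic, not a description of LinRegPCR itself; the function names and numbers are hypothetical.

```python
def normalized_expression(target_n0, reference_n0):
    """Target starting amount (N0) expressed relative to a reference gene."""
    if reference_n0 <= 0:
        raise ValueError("reference N0 must be positive")
    return target_n0 / reference_n0


def fold_change(treated, control):
    """Normalized expression in the treated sample over the untreated control.

    Each argument is a (target_n0, reference_n0) pair from one sample.
    """
    return normalized_expression(*treated) / normalized_expression(*control)
```

With a hypothetical treated sample of (6.0, 2.0) and a control of (1.0, 2.0), the fold change is 6.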

NanoString technology for in vivo gene expression
The nCounter Analysis System from nanoString Technologies provides a highly sensitive platform to measure gene expression of a pathogen during host infection [60]. The fully automated barcode technology directly detects mRNA transcripts, thereby eliminating the amplification and enzymatic steps of DNase treatment and cDNA synthesis. The probes used in our study were custom designed by nanoString Technologies and included the housekeeping genes gyrB and metG as normalization controls (S4 Table). NanoString probes were generated for long coding sequences; probes for phrA2 could not be manufactured. Extracted RNA samples (5μL), collected directly from processing of middle ear effusions with the RNeasy Mini Kit, were hybridized onto the nCounter chip following the manufacturer's instructions. RNA concentration ranged from 80-200 ng/μL for in vivo samples, and 50ng total nucleic acid was used for planktonic samples. The manufacturer's software, nSolver, was used for quality assessment of the raw data and normalization. The data were normalized across samples against the geometric mean of the housekeeping genes gyrB and metG [40,61]. 16S rRNA and gapdh were not used as in vivo controls, given the very high abundance of 16S rRNA, which overwhelms the nanoString signal, and the evidence of a role for GAPDH during infection that may lead to higher expression in vivo [62]. Finally, the in vitro and in vivo levels were compared using Student's t-test in GraphPad Prism 6.
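The housekeeping normalization can be illustrated with a short sketch that divides each target count by the geometric mean of the gyrB and metG counts from the same sample. This is a simplified stand-in, not the nSolver algorithm; the function names and the counts in any example are hypothetical.

```python
import math

HOUSEKEEPING = ("gyrB", "metG")  # normalization controls named in the text


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize_sample(counts, housekeeping=HOUSEKEEPING):
    """Express each target count relative to the housekeeping geometric mean.

    `counts` maps gene name to raw count for one sample; housekeeping genes
    are used as the reference and dropped from the returned mapping.
    """
    reference = geometric_mean([counts[gene] for gene in housekeeping])
    return {
        gene: count / reference
        for gene, count in counts.items()
        if gene not in housekeeping
    }
```

The geometric mean is used rather than the arithmetic mean so that a single highly expressed control does not dominate the reference value.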

Virulence studies in the chinchilla OM model
All chinchilla experiments were conducted with the approval of the Allegheny-Singer Research Institute (ASRI) Institutional Animal Care and Use Committee (IACUC) A3693-01/1000. Research grade young adult chinchillas (Chinchilla lanigera) weighing 400-600 grams were acquired from R and R Chinchilla Inc., Ohio. Animals were maintained in BSL2 facilities and all experiments were done while chinchillas were under subcutaneously injected ketamine-xylazine anaesthesia (1.7mg/kg animal weight for each). For virulence studies, chinchillas (a minimum of 10 in each cohort) were infected with 100 CFUs/ear by transbullar inoculation within each middle ear. During the course of the experiment (10 days), animals with severe acute infection perished; animals showing prolonged signs of discomfort were administered pain relief (Rimadyl, 0.1mL of 50mg/mL). Animals with severe signs of pain and illness were euthanized by administering an intra-cardiac injection of 1mL potassium chloride after regular sedation. We evaluated mortality, time to death, and spread of bacteria to the brain and the lungs. Tissue dissemination was tested by plating homogenized tissue on TSA plates with 5% sheep blood to establish pneumococcal presence. Additionally, we assessed local disease using visual otoscopic inspection (VetDock, USA). Otologic disease ranged from no disease to a ruptured tympanic membrane, where a score of '1' is given for animals with mild or no disease, '2' with moderate disease (where pus and air are present), '3' with frank purulence, and '4' with tympanic membrane rupture [7,63].

Virulence studies in the murine lung model
All experiments involving mice were performed with prior approval of and in accordance with guidelines of the St. Jude Institutional Animal Care and Use Committee. The St. Jude laboratory animal facilities have been fully accredited by the American Association for Accreditation of Laboratory Animal Care. Laboratory animals were maintained in accordance with the applicable portions of the Animal Welfare Act and the guidelines prescribed in the DHHS publication, Guide for the Care and Use of Laboratory Animals. All mice were maintained in BSL2 facilities and all experiments were done while the mice were under inhaled isoflurane (2.5%) anesthesia. Mice were monitored daily for signs of infection. This work was approved under the IACUC protocol number 538-100013-04/12 R1. For bacterial burden and survival studies, strains were grown in C+Y media to an OD620 of 0.4 and diluted according to a previously determined standard curve. Bacteria were enumerated to ensure that the proper amount of bacteria was used for infection. Bacteria were introduced into 7-week-old female BALB/c mice (Jackson Laboratory) via intranasal administration of 5 x 10^4 CFU of bacteria in PBS (100 μL). Mice were monitored for disease progression and euthanized via CO2 asphyxiation. Blood for titer determination was collected via tail snip at 24 and 48 hours post-infection, followed by serial dilution and plating. Bacteria colonizing the nasopharynx were collected by insertion and removal of PBS (20 μL) into the nasal cavity. One cohort was used for ΔphrA2-ABC, ΔlcpAMT, and ΔtprA2ΔlcpAMT, while two cohorts were used for WT and ΔtprA2 (Fig 8A and 8B). Survival data were analyzed using the Mann-Whitney U test in Prism 6. Bacterial titers were compared using the nonparametric Mann-Whitney U test in Prism 6.
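For readers unfamiliar with the Mann-Whitney U test, the statistic itself can be computed directly from its definition, as in the sketch below. It returns only the two U statistics and does not compute a P value, which Prism derives from the U distribution. The function name and any example values are hypothetical.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs in which a value from `sample_a` exceeds a value
    from `sample_b`; ties contribute 0.5. U_a + U_b equals len(a) * len(b).
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b
```

Because the statistic depends only on the ordering of values, it is suited to bacterial titers, which often span several orders of magnitude.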

Generation of phylogenetic trees and their analyses
Generation of streptococcal species tree. Fifty-five streptococcal strains were selected for phylogenetic analysis (S1 Table, labeled "Distribution within Streptococcus sp."). The 33 pneumococcal strains were selected to capture the major sequence clusters within this species, including 4 PMEN1 genomes given the focus of this manuscript on this lineage. The S. mitis and S. pseudopneumoniae strains represented the available genomes for these species at the time this study was initiated. The S. tigurinus strains were selected as a potential novel species related to S. mitis [64]. According to our analysis, the S. tigurinus genomes and a subset of the S. mitis genomes cluster with S. oralis. The whole genome sequences (WGS) for all 55 strains were aligned using MAUVE [65,66], and the core region, corresponding to 995,531 total sites and 352,371 informative sites, was extracted from the MAUVE output files. Alignment of the core region was performed using MAFFT (FFT-NS-2) [67] and model selection was performed using MODELTEST [68]. The phylogenetic tree was built with PhyML 3.0 [69], model GTR+I (0.63), using maximum likelihood and 100 bootstrap replicates.
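The text does not state how informative sites were defined. Assuming the common parsimony-informative definition (at least two different bases, each present in at least two sequences), the count could be obtained from an alignment as in the following sketch. The function names are hypothetical, and gaps and ambiguous bases are simply ignored.

```python
from collections import Counter


def is_informative(column, valid="ACGT"):
    """True if at least two bases each occur in at least two sequences."""
    counts = Counter(base.upper() for base in column)
    shared = [base for base, n in counts.items() if base in valid and n >= 2]
    return len(shared) >= 2


def count_informative_sites(alignment):
    """Count parsimony-informative columns in a list of aligned sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return sum(is_informative(column) for column in zip(*alignment))
```

Under this definition, a column that is variable only because of a single divergent sequence is not counted.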
Gene distribution analysis and generation of TprA2/TprA gene tree. To identify genes that are highly enriched within the PMEN1 lineage relative to other pneumococcal lineages, we clustered the coding sequences from 60 highly curated pneumococcal whole genome sequences (WGS) and selected clusters unique to the PMEN1 genomes. The 60 genomes are listed in S1 Table and marked as "To establish PMEN1 enrichment", and the analysis has been previously described in detail [27]. Briefly, it involved CDS prediction by RAST [26], CDS clustering using tfasty36 (FASTA v.3.6 package) [70], parsing the output to assemble genes that share at least 70% identity over 70% of their length into clusters of homologous sequences, and selecting clusters that are present in all PMEN1 genomes while absent in all other lineages.
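The two selection rules in this pipeline (the 70%/70% homology cutoff and the "present in all PMEN1 genomes, absent elsewhere" filter) can be sketched as follows. This is not the published pipeline: measuring coverage against the longer of the two sequences is an assumption, and the function names, cluster identifiers and genome labels are hypothetical.

```python
def meets_homology_cutoff(identity, aligned_length, query_length, subject_length,
                          min_identity=0.70, min_coverage=0.70):
    """Apply the 70% identity over 70% length rule to one pairwise hit.

    Assumption: coverage is measured against the longer sequence.
    """
    coverage = aligned_length / max(query_length, subject_length)
    return identity >= min_identity and coverage >= min_coverage


def lineage_specific_clusters(cluster_genomes, lineage_genomes):
    """Return clusters found in every lineage genome and in no other genome.

    `cluster_genomes` maps a cluster id to the genomes that carry a member.
    """
    lineage = set(lineage_genomes)
    return sorted(
        cluster for cluster, genomes in cluster_genomes.items()
        if set(genomes) == lineage
    )
```

Requiring the genome set of a cluster to equal the lineage set enforces both conditions at once.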
To establish the gene presence/absence profiles within the 215 PMEN1 WGSs, we performed an in silico PCR on the genomes previously published by Croucher and colleagues at the Sanger Center (listed in S1 Table [71]). In cases where the in silico analysis was inconclusive, we performed experimental PCR using forward and reverse primers to tprA2. To establish the gene presence/absence profiles within the 55 streptococcal WGSs (S1 Table, strains labeled as "Distribution within Streptococcus sp."), as displayed in Fig 1B, we employed the basic local alignment search tool (Blastn) using an e-value threshold of 1e-20 [72]. All of the tprA2 CDSs displayed ≥95% similarity. The Lan locus is represented by three CDSs downstream of TprA2/PhrA2, and the Lan* locus is represented by seven CDSs downstream of TprA/PhrA; the genes within Lan and Lan* display exactly the same phylogenetic distribution in the 55 samples (i.e., all present or all absent). In the vast majority of the genomes, the lantibiotic genes were neighboring the associated QS systems; the exceptions are genomes with contig breaks or low sequence coverage in these regions (these are noted in Fig 2).
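The idea behind an in silico PCR can be illustrated with a minimal sketch that looks for exact primer binding sites and returns the predicted amplicons. The tool actually used is not named in the text; exact matching, the single-strand scan and the function names here are all assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def in_silico_pcr(template, forward_primer, reverse_primer, max_product=5000):
    """Return predicted amplicons for exact primer matches on `template`.

    Assumption: the forward primer matches the given strand; to cover both
    orientations, call again with reverse_complement(template).
    """
    template = template.upper()
    fwd = forward_primer.upper()
    rev_site = reverse_complement(reverse_primer)
    products = []
    start = template.find(fwd)
    while start != -1:
        end = template.find(rev_site, start + len(fwd))
        if end != -1:
            product = template[start:end + len(rev_site)]
            if len(product) <= max_product:
                products.append(product)
        start = template.find(fwd, start + 1)
    return products
```

An empty result corresponds to an inconclusive or negative in silico call, which in the study was followed up by experimental PCR.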
The phylogenetic tree of tprA2/tprA was generated from the 48 sequences extracted in the analysis of the 55 streptococcal genomes. The nucleotide sequences were aligned using MAFFT (G-INS-i), and model selection was performed using MODELTEST. The phylogenetic tree was built with PhyML 3.0, model HKY+I (0.39), using maximum likelihood and 100 bootstrap replicates. Logos were generated from the C-terminal heptapeptides of (i) 6 PhrA2 sequences and (ii) 36 PhrA peptides using WebLogo [73] (Fig 9A and 9B).
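The per-position residue frequencies that underlie a sequence logo can be computed as in the sketch below. It does not reproduce WebLogo, which additionally scales each stack by information content; the function name is hypothetical, and the real input would be the 6 PhrA2 or 36 PhrA heptapeptides, which are not listed here.

```python
from collections import Counter


def position_frequencies(peptides):
    """Relative frequency of each residue at each position.

    Returns one dict per position, mapping residue to its frequency.
    """
    if len({len(peptide) for peptide in peptides}) != 1:
        raise ValueError("peptides must be non-empty and of equal length")
    frequencies = []
    for column in zip(*peptides):
        counts = Counter(column)
        frequencies.append({aa: n / len(peptides) for aa, n in counts.items()})
    return frequencies
```

A position where a single residue has frequency 1.0 is fully conserved across the input peptides.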

Fig 2. Intra- and inter-species distribution of the TprA2/PhrA2 and TprA/PhrA genomic regions. Phylogenetic analyses displaying bootstrap values on the branches. Left side: Maximum likelihood tree of streptococcal genomes generated from the core genome. Right side: Gene distribution, where blue columns display the distribution of tprA2, phrA2, and associated lcpAMT, and purple columns display the distribution of tprA, phrA, and downstream lantibiotic genes (seven consecutive genes, including predicted lanA and lanM labeled as Lan). Presence of the gene is marked with the following symbols: '•' gene present in one copy; '' low coverage of region; '□' multiple copies of the gene. Red box indicates isolates from the PMEN1 lineage. https://doi.org/10.1371/journal.ppat.1006339.g002

Fig 3. Gene expression levels of the TprA2/PhrA2 system and associated lcpAMT locus in chinchilla middle ear effusions and planktonic cultures. nCounter nanoString technology was used to quantify mRNA transcripts from planktonic cultures (dotted bars, n = 2) and chinchilla middle ear effusions (black bars, n = 3). Data were normalized to the geometric mean of the expression of gyrB and metG using nSolver software. The X-axis denotes the test genes assayed for gene expression. The Y-axis displays the log10 of the total number of transcripts for each gene averaged over biological replicates. Error bars represent the standard deviation. '*' Significantly higher in vivo expression (P-value < 0.05), as determined by Student's t-test. https://doi.org/10.1371/journal.ppat.1006339.g003

Fig 4. Density-dependent gene expression and extracellular secretion of phrA2 during planktonic growth. qRT-PCR measurements of phrA2 gene expression in PN4595-T23. The Y-axis displays expression levels as a ratio to expression in lag phase culture. The X-axis denotes culture conditions. Black bars display density-dependent gene expression at lag phase (OD600 0.05), early-log phase (OD600 0.2), mid-log phase (OD600 0.6), and stationary phase (OD600 1.0). Striped bars display treatment by cell-free supernatants. The lag phase culture was divided into three tubes and grown for 1h in one of three ways: in original supernatant (lagWT+1hour), in cell-free supernatant from a high density wild type culture (OD600 1.2), or in cell-free supernatant from a high density ΔphrA2-ABC culture (OD600 1.2). 16S rRNA was used as normalization control. Error bars represent standard deviations from biological duplicate experiments. '**', P-value<0.01 and '*', P-value<0.05 as determined by Student's t-test. https://doi.org/10.1371/journal.ppat.1006339.g004

Fig 6. Gene expression measured by qRT-PCR of QS-Lcp genes in WT strain PN4595-T23 upon treatments. Data were normalized to 16S rRNA expression. The Y-axis displays fold change in gene expression upon exposure to a peptide treatment relative to the untreated control. Error bars represent standard deviations for biological replicates (n = 3). On the left, dark bars display expression from cells exposed to the PhrA2 C-terminal heptapeptide (VDLGLAD); on the right, striped bars display expression from cells exposed to the scrambled control peptide (DAGVLDL). "**" Statistically significant difference in gene expression after PhrA2 treatment compared to scrambled peptide (P-value<0.01). https://doi.org/10.1371/journal.ppat.1006339.g006

Fig 7. Gene expression of TprA2 regulon in the middle ear. Bars represent gene expression as measured by the nCounter platform (nanoString Technologies) on RNA extracted from middle ear effusions of chinchilla cohorts (n = 3) infected individually with one of three strains: WT (dotted bars), ΔtprA2 (striped bars), or ΔtprA2::tprA2 (black bars). The data are represented as ratios relative to the geometric mean of housekeeping genes gyrB and metG (Y-axis). Target genes are indicated on the X-axis. Error bars represent standard deviations. Statistical significance was determined by Student's t-test and was calculated with reference to WT for each test gene; '*', P-value<0.05; '**', P-value<0.01. https://doi.org/10.1371/journal.ppat.1006339.g007

Fig 8. In vivo effects of the TprA2/PhrA2 system. (A,B) Analysis of PMEN1 strain SV36 WT and isogenic mutants ΔtprA2, ΔphrA2-ABC, ΔlcpAMT, and ΔtprA2ΔlcpAMT in the murine model with intranasal inoculation. (A) Percentage survival of mice after intranasal inoculation. Cohorts of at least ten mice were assessed for the duration of four days. Statistical significance relative to WT was calculated using the Mann-Whitney U test; '*', P-value<0.05. (B) Bacterial counts from nasal lavages of mice 48h post-inoculation. (C,D) Analysis of PMEN1 strain (4595-T23) WT and isogenic mutants ΔtprA2 and ΔphrA2-ABC in the chinchilla model of otitis media. (C) Percentage survival of chinchillas after transbullar inoculation. Cohorts of at least ten chinchillas were assessed for the duration of ten days. (D) Scatter plots illustrate the maximal otologic score for animals infected with WT (green), ΔphrA2-ABC (red) or ΔtprA2 (blue). Each triangle represents one animal. Otologic disease ranged from no disease to a ruptured tympanic membrane, where a score of '1' is given for animals with mild or no disease, '2' with moderate disease, '3' with frank purulence, and '4' with tympanic membrane rupture. https://doi.org/10.1371/journal.ppat.1006339.g008

Fig 9. Phylogenetic analysis of separation between TprA2 and TprA systems. (A) Gene tree generated from the coding sequences for tprA and tprA2 using maximum likelihood. Each branch displays a sequence logo, derived from the predicted C-terminal heptapeptide of PhrA2 (top) and PhrA (bottom). In the logo, amino acids are represented by their one-letter abbreviations, where the height of each letter within the stack represents its relative frequency at a given position, in the zappo color-coding scheme: blue/positive; red/negative; salmon/hydrophobic; orange/aromatics; purple/glycine or proline; green/hydrophilic. (B) Alignment of the predicted coding sequences of PhrA and PhrA2 in PMEN1 strain PN4595-T23. Representation showing alignment (top) and consensus (bottom). Seven amino acids of the C-termini are highlighted in the red box, indicating the sequence of the synthetic peptides used in this study. https://doi.org/10.1371/journal.ppat.1006339.g009

Fig 12. Model for regulation of gene expression by TprA2-PhrA2. (A) In the OFF state, TprA2 inhibits gene expression. (B) In the ON state, PhrA2 releases TprA2-mediated gene inhibition. This effect of PhrA2 is observed with synthetic peptide added to the extracellular milieu and with cell-free supernatant, suggesting that PhrA2 is exported, activated and re-imported before it modulates TprA2 activity, in both the producer PMEN1 cells and the surrounding PMEN1 population. (C) PhrA2 secreted by PMEN1 cells activates gene expression of tprA and associated lanA, in both PMEN1 and non-PMEN1 cells. Red circular shape/TprA2; purple triangle/PhrA2; blue circular shape/TprA; blue triangle/PhrA. https://doi.org/10.1371/journal.ppat.1006339.g012

Table 2. Strains used in this study.