A Large Family of Antivirulence Regulators Modulates the Effects of Transcriptional Activators in Gram-negative Pathogenic Bacteria

We have reported that transcription of a hypothetical small open reading frame (orf60) in enteroaggregative E. coli (EAEC) strain 042 is impaired after mutation of aggR, which encodes a global virulence activator. We have also reported that the cryptic orf60 locus was linked to protection against EAEC diarrhea in two epidemiologic studies. Here, we report that the orf60 product acts as a negative regulator of aggR itself. The orf60 protein product lacks homology to known repressors, but displays 44–100% similarity to at least fifty previously undescribed small (<10 kDa) hypothetical proteins found in many gram negative pathogen genomes. Expression of orf60 homologs from enterotoxigenic E. coli (ETEC) repressed the expression of the AraC-transcriptional ETEC regulator CfaD/Rns and its regulon in ETEC strain H10407. Complementation in trans of EAEC 042orf60 by orf60 homologs from ETEC and the mouse pathogen Citrobacter rodentium resulted in dramatic suppression of aggR. A C. rodentium orf60 homolog mutant showed increased levels of activator RegA and increased colonization of the adult mouse. We propose the name Aar (AggR-activated regulator) for the clinically and epidemiologically important orf60 product in EAEC, and postulate the existence of a large family of homologs among pathogenic Enterobacteriaceae and Pasteurellaceae. We propose the name ANR (AraC Negative Regulators) for this family.

Virulence gene expression in EAEC is activated in coordinate fashion by a regulator called AggR, a member of the AraC/XylS family of bacterial transcription factors [7][8][9]; AggR is most closely related to the CfaD/Rns activator of enterotoxigenic E. coli (ETEC) [10], RegA in C. rodentium [11] and PerA in enteropathogenic E. coli (EPEC) [12,13]. AggR, Rns/CfaD, RegA and PerA are required for expression of genes mediating the biosynthesis of colonization fimbriae and other virulence-associated factors. We have recently reported that AggR regulates the expression of at least 44 genes, including itself, at the onset of the logarithmic growth phase [14].
Among the most dramatically upregulated genes in the AggR regulon was a very small cryptic open reading frame (orf; designated variously orf60 or orf61, but here only orf60) lacking homology to any known gene, but located on the pAA2 virulence plasmid nearby to the gene encoding AggR itself [14]. We were struck by the high prevalence and conservation of orf60 among EAEC strains. In an effort to understand the highly mosaic character of the EAEC genome and its relationship to endemic diarrhea in developing countries, we scored the presence of AggRdependent and AggR-independent putative virulence genes among collections of EAEC strains from children under 5 years of age with diarrhea in Mali [15] and Brazil [16]. In both studies, we were struck by the strong and consistent negative association between the presence of orf60/61 and diarrhea. Given compelling data favoring some important role for cryptic locus orf60/61 in EAEC epidemiology, we undertook a thorough investigation of its role in pathogenesis.
We report here that EAEC orf60 encodes a novel protein negative regulator, now formally re-named Aar (AggRactivated regulator). Although expression of the Aar-encoding gene in EAEC is indeed activated by AggR, expression of the gene down-regulates expression of AggR itself. Most surprisingly, we show that Aar is a member of a previouly unrecognized large class of regulators in pathogenic Gram negative bacteria.

Altered fimbrial expression in 042orf60
To examine the role of orf60, a 042orf60 mutant was generated, assessed for established EAEC phenotypes and observed by TEM. The orf60 mutant was complemented in trans by expressing the predicted orf downstream of the arabinose promoter in pBAD30, to yield pOrf60. Since the orf generated two possible protein products given translational start from an ATG or alternative start codon, the construct included the most upstream (alternative) potential start site.
Surprisingly, 042orf60 was found to be hyper-fimbriated by negative staining in TEM (Figure 1, B and E), and readily distinguishable from the parent strain ( Figure 1, A and D), or the 042orf60 mutant complemented in trans with the pOrf60 plasmid ( Figure 1, C and F), both of which showed sparse fimbriae. As predicted from the documented role of AAF in aggregation, larger bacterial aggregates were visualized by TEM in the 042orf60 mutant, when compared to the wild type or complemented strains. The hyper-aggregative phenotype of 042orf60 was also apparent in undisturbed broth cultures (Fig. S1 in Text S1, panels A and B). We have previously demonstrated that the AAF/II major pilin protein AafA binds to fibronectin [17]. To evaluate fimbrial function in the absence of orf60, we assessed this phenotype in 042orf60 and complemented constructs. As expected, the hyperfimbriated 042orf60 strain exhibited more abundant binding to fibronectin-coated plates (,200% binding) compared to the parent strain ( Figure 1, panel G), while overexpression of orf60 in the complemented EAEC strain yielded marked reduction of fibronectin binding.
As predicted from these phenotypes, 042orf60 yielded more abundant expression of the major AAF/II fimbrial subunit protein AafA than its wild type parent in immunoblot analysis, and this phenotype was reversed in the complemented construct ( Fig. S1 in Text S1, panel C). We therefore employed qRT-PCR to assess the effect of orf60 on aaf fimbrial gene expression. 042 wild type, 042orf60 and 042orf60(pOrf60) strains were grown in DMEM high glucose and samples from log-phase cultures were used for qRT-PCR analysis (Figure 1, H). We observed higher levels of expression for fimbrial genes aafA (10-to 25-fold), aafB (4-to 11fold), aafC (4-to 6-fold) and aafD (3-to 8-fold) in the 042orf60 mutant compared with the parent strain, while frimbrial gene expression in the complemented strain was nearly undetectable.
AggR and orf60 are cognate partners in a regulatory feedback system Our data suggested that orf60 may act as a negative feedback regulator of fimbrial gene expression, despite the fact that both the fimbriae and orf60 are under AggR activation. We sought to dissect features of this regulatory system in EAEC using qRT-PCR. Levels of transcription of aggR (Figure 2, A) and the aggRregulated genes aap (Figure 2, B) and aatPABCD (Figure 2, C) were compared between 042orf60 and the parent strain by qRT-PCR. We observed two-fold higher aggR mRNA levels in 042orf60 compared to the parent strain ( Figure 2, A). Expression of other AggR-regulated genes was similarly affected by orf60 deletion, including aap, aatP, aatA, aatB, aatC, and aatD ( Figure 2, B and C). The greatest effect was observed for the aatC gene, expression of which was 37-fold higher in the orf60 mutant compared with the wild type parent strain (Figure 2, C). Complementation in trans with the orf60 gene resulted in undetectable levels of aggR and AggR-regulated genes, comparable to the levels seen in EAEC lacking the pAA2 plasmid ( Figure 2A, B and C). We examined similarly aggR and orf60 expression in the 042aggR and 042orf60 mutant strains during the middle-late log phase of growth (5-8 h). aggR expression was consistently high in the absence of orf60 ( Figure 2, E), whereas orf60 levels were drastically reduced in the absence of aggR (Figure 2, F). These data suggest that the effect of orf60 on fimbrial gene expression was via a negative feedback effect on aggR itself, accompanied by the predicted suppression of other genes downstream in the AggR regulon.
We hypothesized that orf60 may represent a delayed negative feedback regulator of aggR. To test this hypothesis, RNA was obtained from strain 042 cultures prepared in DMEM high glucose at various stages of cell growth, spanning the log and early stationary phases (0-8 h of growth), and analyzed for aggR and orf60 transcripts (Figure 2, D). As previously reported [18], we observed a rapid increase in aggR expression in the first 2 h of growth (early log phase), while expression of orf60 was delayed, achieving maximal levels after 4 h of incubation (middle log phase) ( Figure 2, D); at this time point aggR expression was seen to be declining, consistent with activation of orf60 by AggR and repression of aggR by the orf60 product. At 6 h of growth, both aggR and orf60 levels were declining. Interestingly, orf60 gene expression reached its lowest expression level at the end of late log phase (8 h), while aggR expression was again increasing. Both aggR and orf60 were at low levels by 10 hr of growth (Figure 2, D).

orf60 encodes a regulatory protein
We sought to determine whether orf60 encodes a small regulatory RNA molecule or a regulatory protein. Expression of orf60 from pOrf60 was assessed after mutation of three potential start codons in the open reading frame: predicted codons 1, 4 and 5 (illustrated in Figure 3A as M1, M4, and M5). Plasmid derivatives (pOrf60-M1, pOrf60-M4, and pOrf60-M5) were used to complement the 042orf60 strain. Transcripts for orf60, aggR and aap (Figure 3, B, C and D) were determined in all 042orf60(pOrf60) derivatives. In all strains, high levels of transcripts for orf60 were detected by qRT-PCR (Figure 3, B). Replacement of the ATG only in position 5 with a stop codon lead to abrogation of the orf60-induced phenotypes, manifested as expression of aggR and the AggR-regulated gene aap ( Figure 3, panels C, D). The hyperaggregative phenotype of 042orf60 was not reversed when the

Author Summary
We report here the identification and characterization of a new family of negative regulators in Gram-negative bacteria, including many pathotypes of diarrheagenic Enterobacteriaceae and members of the Pasteurellaceae. Members of this regulator family in enteroaggregative (EAEC) and enterotoxigenic E. coli (ETEC) and in Citrobacter rodentium downregulate the expression of positive regulator partners AggR, CfaD/Rns and RegA, respectively, all members of the AraC/XylS family of regulators. Accordingly, we propose the name ANR (AraC Negative Regulators) for this family. ANR members orf60 (termed Aar), orf02851 (Rnr), orf0450 and orf01070 (Cnr) from EAEC, C. rodentium and ETEC respectively were characterized in this study. Deletion of ANR homologs upregulated the expression of AggR and RegA in EAEC strain 042 and C. rodentium respectively; overexpression of orf60, orf02851, orf0450 and orf01070 in EAEC strain 042 down-regulated AggR. C. rodentium harboring a null mutation in orf02851 exhibited a significant increase in expression of the regA and RegAregulated fimbriae. The orf02851 mutant showed higher levels of C. rodentium in feces and colonic contents, and greater weight loss compared to mice inoculated with the wild-type.
strain was transformed with pOrf60-M5, indicating requirement of this predicted start codon in orf60 expression ( Fig. S1 in Text S1, panels A,B).
Our data suggested that orf60 would be transcribed and translated as a small protein of 7.23 kDa. In order to observe and localize the orf60 protein, bacterial samples were fractionated into outer membrane, cytoplasmic/periplasmic fractions, and analyzed by Western immunoblot (Figure 3, E). Localization of the product at the predicted mass confirmed its existence and localized the product to the cytoplasmic/periplasmic fraction of the bacterium.
Using the Bprom algorithm, a strong sigma-70 bacterial promoter region was predicted upstream of the ATG start site; predicted 235 and 210 regions were included in plasmid pOrf60. To confirm that constitutive expression of orf60 was driven by this predicted promoter, the araC gene and P BAD promoter were removed from pOrf60 plasmid to generate pOrf60-2 ( Fig. S2 in Figure 1. Increased expression of AAF in 042orf60. Transmission electron microscopy of negatively stained samples was performed for 042 (Panels A and D), 042orf60 (Panels B and E), and 042orf60(pOrf60) (Panels C and F). Samples were visualized at a magnification of 65,000 (Panels A, B and C), or 625,000 (Panels D, E, and F). Bacterial adherence to fibronectin-coated plates was quantified as described in Materials and Methods (Panel G). Transcript levels for aaf fimbrial genes were measured in 042 (black bars), 042orf60 (gray bars), 042orf60(pOrf60) (open bars) and 042pAA2(-) (negative control; striped bars) by qRT-PCR during the log phase of growth (Panel H). Asterisks indicate significant differences by ANOVA (***, P, 0.0001; *, P,0.05). doi:10.1371/journal.ppat.1004153.g001 Text S1, panel A). pOrf60-2 plasmid was transformed into 042 and its orf60 mutant, and RNA transcripts for orf60 and aggR were quantitated by qRT-PCR. High transcriptional levels of orf60 were detected by qRT-PCR in strains transformed with pOrf60 and pOrf60-2 ( Fig. S2 in Text S1, panel C). 042 derivatives transformed with pOrf60-2 still retained the inhibitory effect of orf60 ( Fig. S2 in Text S1, panel D).
High transcriptional levels of orf60 did not compromise the fitness of the cell, as manifested by growth curves in DMEM medium ( Fig. S3 in Text S1, Panel A). To assure that the effects of orf60 expression in trans were not due to high copy number overexpression artifacts, we cloned the gene into low copy number vector a pACYC177. This lower copy number construct restored the wild type phenotype (Fig. S3 in Text S1, Panels B, C, D).

orf60 belongs to a new family of regulators
Blast analysis of orf60 and its predicted product did not suggest significant relatedness to any known bacterial regulatory protein; however, the analysis did identify nearly 50 hypothetical proteins, present in 7 major genera/species: Escherichia coli, Citrobacter spp., Haemophilus spp., Pasteurella spp., Mannheimia spp., Pantoea spp. and Aggregatibacter spp. (Figure 4). Like orf60, all predicted products of this family of hypothetical proteins exhibited low predicted molecular mass (39 to 80 amino acids, 4.36-9.54 kDa), and exhibited 44-100% similarity to orf60 (Figure 4).
Haemophilus orf60 homologs were mainly found in pathogenic strains including H. influenzae, H. paraphrohaemolyticus, H. parainfluenzae, H. aegypticus, and H. pittmaniae subspecies [19,20]. The predicted protein masses of orf60 homologs in Haemophilus sp. varied from 7.73 to 8.65 kDa and exhibited 49 to 56% amino acid similarity to the EAEC orf60 product (Figure 4, dark gray boxes). orf60 homologs were also found in pathogens Pasteurella multocida and Pasteurella bettyae [21]; these predicted proteins displayed 54% similarity to orf60 with sizes between 6.82 to 8.67 kDa ( Figure 4). Orf60 homologs with 51 to 55% similarity and masses between 8.26 to 9.04 kDa were also identified in Mannheimia haemolytica, the agent of bovine respiratory disease complex [22].
Orf60 homologs in Aggregatibacter actinomycetemcomitans and Aggregatibacter aphrophilus exhibit 50 to 55% similarity to EAEC orf60 and their masses range from of 7.85 to 8.17 kDa. Orf60 homologs are also found in Pantoea spp., a Gram-negative bacterium associated with infections in neonates [23]. The orf60 homolog in Pantoea is a 7 kDa protein that shows 50% similarity to EAEC orf60. Orf60 homologs in the SF1 subfamily share 44 to 58% amino acid identity across the full length of the predicted proteins ( Figure 4, light gray box). SF2 comprised homologs among EAEC (55989), ETEC (H10407, and 1392/75), and Shiga-toxin producing EAEC (HUSE41) strains; the archetype orf60 from 042 was assigned to this SF, all members of which shared 72-100% amino acid similarity (Figure 4).
Among the most noteworthy predicted proteins was ROD_02851 (8.9 kDa) from C. rodentium, sharing 52% similarity to orf60 (Figure 4). C. rodentium is an enteric pathogen which pursues a pathogenic strategy similar to that of EPEC and STEC, and which harbors both an AggR homolog (RegA) and an orf60 homolog.
We examined the genetic organization of the orf60 homologs among the diarrheagenic E. coli strains EAEC, ETEC, Shiga-toxin producing EAEC and C. rodentium. Unexpectively, in all the analyzed EAEC and ETEC genome sequences, four highly conserved features were found: a fimbrial operon, a dispersinlike protein, an Aat translocator system for dispersin, and an AraC family transcriptional regulator. Whereas the precise location of the genes varied, all were found to be located close to and in divergent organization up or downstream of AraC activators (Figure 5, A-G).
Most interesting was the case of archetype ETEC strain H10407 [24]. This strain has been described as having two copies of cfaD/ rns, which encodes the AraC homolog most closely related to AggR; one of the copies exhibits a frameshift mutation and is therefore considered to be inactive [25]. We found orf60 homologs divergently arranged in close proximity to both cfaD genes ( Figure 5, E). Another ETEC strain, 1392/75, harbors two orf60 genes, located in the p746 (p746_052) and p1018 (p1018_093) plasmids (Figure 5, F, G). In contrast to EAEC and ETEC, the orf60 homolog of C. rodentium was located within a predicted phage-encoded region, flanked by flagellar operon genes and the Aat translocator system (Figure 5, H).
Conservation of function would presumably dictate conservation of orf60 protein structure across the family. To test this hypothesis, we assembled an alignment of orf60 family members and applied a variety of prediction algorithms to reveal the presence of conserved predicted secondary structure. PRO-MALS3D algorithm strongly predicted the presence of three consensus alpha helices at conserved locations in the family (Fig.  S4). Several amino acid residues were highly conserved in identity or character across the family; these included a very highly conserved cysteine at position 60 in the EAEC orf60 protein.
We transformed 042orf60 with pOrf0450, pOrf1070, and pOrf02851; strains were grown in DMEM high glucose, then proteins and RNA were analyzed by western blot using anti-AafA antibody and qRT-PCR, respectively. AafA protein expression in 042 was completely abolished when complemented with any of the orf60 homologs (Fig. 6B). Similarly, aggR expression was suppressed in strains complemented with any of the ETEC or C. rodentium orf60 homologs (Fig. 6A). Thus, orf60 homologs from two different subfamilies were capable of complementing orf60 function in 042.

C. rodentium orf60 homolog repress RegA and its regulon
We sought to determine if the orf60 homolog of C. rodentium (ROD_02851) is a negative regulator of the AggR homolog RegA (Fig. 8). C. rodentium was transformed with pOrf02851K, grown and prepared for qRT-PCR analysis as above. In the absence of orf02851, we observed a significant increase in expression of regA (,10-to 12-fold) (Fig. 8, B), as well as the genes regulated by RegA: fimbrial genes kfcC, kfcE and kfcH (4-to 11-fold) (Fig. 8, C), dispersin (,80-fold) and the aat system (20-to 30-fold) (Fig. 8, D). Complementation in trans down-regulated the expression of regA and RegA-regulated genes (Fig. 8, B, C, D). Of note, mutations in the Kfc pilus genes were found to demonstrate early loss of colonization by C. rodentium [11].
Virulence of the C. rodentium orf02851 mutant in the C57BL/6 mouse model To examine potential effects of orf60-homolog orf02851 on virulence, groups of five mice were inoculated with the wild-type C. rodentium strain and the orf02851 mutant. Although both strains were recovered at high levels in stools collected from days 1 to 15 ( Figure 9A), orf02851 mutant showed sustained high levels of bacteria (10 7 to 10 10 ) for more than 11 days (day 3 to 15) ( Figure 9, A), whereas the wild-type strain exhibited a significant decrease in bacterial shedding levels after day 12 (Fig. 9, A). Only two mice out of five were shedding the wild type strain by day 16 and 17 (Table S3 in Text S2), while high levels of bacteria were detected in feces of mice inoculated with the orf02851 mutant (at levels of 10 7 to 10 8 per gram of stool) ( Table S3 in Text S2). Consistent with levels of shedding in the stools, we observed higher levels of challenge strain colonization in the colonic lumen in mice fed the wild type strain compared with those fed the orf02851 mutant (Fig. 9C). Mice inoculated with orf02851 strain also showed greater  weight loss compared to mice inoculated with wild-type (Fig. 9B). Transmission electron microscopic examination of colonic tissues revealed more intense cytoplasmic vacuolization in animals infected with the mutant strain compared with wild type, accompanied by greater amounts of luminal debris (Fig. 10).

Discussion
Many microorganisms utilize members of large families of regulators, presumably tailoring them to their pathogenetic idiosyncrasies. The large AraC/XylS family of gram-negative transcriptional activators is defined by a 100-amino-acid region of sequence similarity that contains two helix-turn-helix (HTH) DNA binding motifs [7,9]; a number of family members are involved in carbon metabolism, responses to environmental stress, and many serve as virulence gene regulators [7,9]. Several virulence-related AraC-family activators, including AggR of EAEC, have been shown to exercise autoactivation, presumably in order to assure very rapid activation of their regulons upon entry into an environment conducive to pathogenic lifestyles. Interestingly, however, for few autoactivators has a very simple engineering question been plausibly addressed: how does an autoactivator check runaway positive feedback? Unrestrained activation could be maladaptive in terms of optimal bacterial metabolism, or could conceivably render the bacterium more virulent than is fitting for optimal persistence in the host population. For these reasons, mutations in negative feedback regulators of autoactivators could engender increased or decreased virulence.
We suggest here the term Aar (AggR-activated regulator) to describe the orf60 gene product, and propose the term AraCfamily negative regulators (ANR) to describe its family. The precise contribution of Aar in pathogenesis remains unknown. It could act by simply preventing excessive action of its activator partner, and/or could assure finer levels of on-demand gene expression at precise times or places in the setting of pathogenesis. Alternatively, Aar could act directly or indirectly as a virulence suppressor, perhaps to modulate virulence as a result of selection towards clinical attenuation. Our finding of increased pathogenicity in aar-negative strains in Mali and Brazil would support this latter hypothesis [15,16]. Although the Aar-homolog mutation in C. rodentium yielded a predicted effect (enhanced intestinal colonization), it is unclear whether all Aar homologs act as virulence modulators.
The mechanism by which orf60 checks aggR expression is not apparent from our analysis. Alignment of the orf60 family revealed conserved motifs: strongly predicted alpha helical structure with several highly conserved residues ( Figure S4). Alpha-helical dominance is found in several DNA binding proteins [26], suggesting that this function could be provided by orf60. Alternatively, orf60 could bind AggR, preventing its interaction with the DNA upstream of its activated promoters. A third possibility entails action of orf60 via an intermediate regulator.  The existence of antivirulence genes is attracting increased attention. Early work on this area suggested the loss of basic metabolic genes whose activities interfered with virulence functions (e.g. cadA of Shigella spp.) [27]. Some pathogens express protein effectors that act to suppress virulence, often by diminishing the inflammatory response (e.g. Pic from Shigella flexneri) [28]. However, to our knowledge, Aar represents the first example of a dedicated pathogen-specific attenuator of virulence genes in gram-negative bacteria. Additional investigations are required to understand the role of Aar and its homologs in the setting of disease.

Bacterial strain and growth conditions
Bacterial strains used in this study are shown in table S1 in Text S2. Strains EAEC 042, 042aggR, and 042aggR(pBADaggR) were previously described [18,29,30]. Bacterial cultures were routinely propagated in Luria Broth (LB) and Dulbecco's modified Eagle's medium with 0.4% glucose (DMEM high glucose) (Gibco, Grand Island, NY) as previously described [14]. Mutagenesis of orf60 in 042 and C. rodentium strain ROD_02851 was accomplished by using lambda red technology [31]. The locus (46,236-46,438, GenBank FN554767.1) in 042 and (319,732-319,967, GenBank NC_013716.1) in C. rodentium were replaced with the kanamycin (km) resistance marker as previously reported [31]. 042orf60 and C. rodentium orf02851 strains were identified by PCR using specific primers for orf60, orf02851 and a km resistance marker (Table S2 in Text S2). We isolated a spontaneous pAA2 plasmid-cured derivative of strain 042 for use as a negative control in several assays.

Complementation of 042orf60 and C. rodentium orf02851
For the complementation of 042orf60, plasmids pOrf60, pOrf0450, pOrf1070 and pOrf2851 were generated in this study (Table S1 in Text S2). Briefly, a 328-kb fragment encompassing the orf60 gene and its predicted promoter (see Fig. S2 in Text S1) was amplified by PCR (spanning nucleotides 46,141-46,433 in GenBank accession FN554767.1), and cloned into the EcoRI and XbaI sites of pBAD30; the resulting plasmid was designated pOrf60. pOrf60-2 was generated by removing araC and the P BAD promoter from the plasmid vector backbone (Fig. S2 in Text S1). The orf60 gene with native promoter was generated by PCR (pACYC primers) ( Table S2 in Text S2) and cloned into the HindIII/SmaI site of vector pACYC177 (GenBank X06402) ( Figure. S3 in Text S1).
To express orf60 alleles from other bacteria, the EAEC orf60 gene was deleted by reverse PCR of pOrf60 so as to preserve the original EAEC orf60 translational start codon; the respective heterologous allele was fused downstream to assure transcription and translation at levels similar to EAEC. orf60 homologs from ETEC (p948_0450, genomic region 43,990-44,184 and Figure 10. Transmission Electron Microscopy of the intestinal tissues of mice infected with C. rodentium or its orf02851 mutant. Groups of 5 mice were inoculated orally with 10 10 CFU of C. rodentium (Panel B) and orf02851 mutant (Panel C) strains. Five mice were inoculated with PBS as control (Panel A). Mice were euthanized on day 9 and colonic tissues were prepared for TEM analysis. White and black asterisks indicate cellular vacuolization and debris observed in orf02851 mutant respectively (Panel C). Samples were visualized at a magnification of 63,000 (Panel A), or 65,000 (Panels B and C). doi:10.1371/journal.ppat.1004153.g010 p948_1070, genomic region 88,912-89,109, GenBank NC_017724.1) [32] and C. rodentium (ROD_02851, genomic region c319966-319733, GenBank NC_013716.1) [33] were employed to generate plasmids pOrf0450, pOrf1070, and pOrf2851 respectively. For the complementation of C. rodentium orf02851, the pOrf2851K plasmid was generated by inserting an encoding km r marker into pOrf2851 backbone.
Site directed mutagenesis of orf60 start codons  Table S2 in Text S2) and 2.5 U/ml of Pfu Turbo DNA polymerase. Samples were treated for 1 cycle/3 minutes at 95uC followed by 18 cycles of 95uC/30 seconds, 55uC/1 minute, 68uC/ 8 minutes. Samples were digested with DpnI for 2 h at 37uC and transformed into XL10-Gold Ultracompetent Cells (Agilent Technologies, CA USA). Plasmid derivatives were designated pOrf60-M1, pOrf60-M4, and pOrf60-M5 respectively. All constructs were verified by nucleotide sequencing at the University of Virginia DNA Science Core.

Real-time quantitative reverse transcription-PCR (qRT-PCR)
Overnight bacterial cultures of EAEC were diluted 1:100 into 13 ml of DMEM high glucose (aggR-inducing conditions), and incubated at 37uC without shaking for the times indicated. For ETEC and C. rodentium, bacterial cultures were diluted 1:100 in 13 ml of LB or DMEM-LB broth respectively. Extraction of RNA, cDNA synthesis and qRT-PCR assays were performed as previously described [14]. Primers for EAEC were previously published [14]. Primers for ETEC and C. rodentium genes are reported in Table S2 in Text S2. Reactions were run in experimental duplicate using two independent cDNA preparations. Expression levels for each queried gene were normalized to the constitutively expressed cat gene in EAEC 042, rpoA in ETEC H10407 and rpoD in C. rodentium as previously described [14,34,35].

Transmission Electron Microscopy (TEM)
Strains were grown in LB overnight with shaking at 37uC, then diluted 1:100 in 5.0 ml of DMEM-high glucose, and incubated with shaking to reach OD 600 of 0.8. Samples were prepared for negative staining with 2% uranyl acetate as previously described [14]. Colonic sections of mice infected with C. rodentium derivatives were analyzed by TEM. Briefly, groups of five C57BL/6 mice were inoculated with 10 10 CFU of C. rodentium derivatives. Mice were euthanized at day 9 post-infection and descending colonic tissue was dissected and prepared for TEM analysis. The TEM samples were viewed on a Jeol JEM1230 Transmission Electron Microscope (80 kV) at the Advanced Microscopy Laboratory at the University of Virginia (AML-UVA).

Detection of AAF/II and orf60 proteins
For detection of EAEC Aggregative Adherence Fimbriae II (AAF/II), strains were grown in 13 ml of DMEM high glucose to reach an OD 600 of 0.8. Bacteria were pelleted, resuspended in 100 ml of 0.5 mM Tris, 75 mM NaCl and heated for 30 min at 65uC. The major pilin subunit of AAF/II (AafA) was analyzed in the supernatant by SDS-PAGE and Western blot analysis. For detection of orf60 protein, outer membrane (OM) and cytoplasmic/periplasmic (C/P) fractions of E. coli 042orf60 and 042orf60(-pOrf60) were prepared as described [36]. Protein samples were separated in acrylamide gels and transferred to Immobilon-P membranes (BioRad, Hercules CA, USA) by using standard protocols. The membranes were incubated overnight with anti-AafA or anti-orf60 antibodies respectively. The next day, the membranes were washed twice in PBS-0.1% tween, and incubated for 1 h with a horseradish peroxidase-conjugated goat anti-rabbit IgG antibody. Membranes were developed by using TMB Membrane peroxidase substrate (KPL, Gaithersburg, MD, USA) following the manufacter's specifications.

AAF phenotype assays
Fibronectin-binding was assessed as previously reported [17]. The autoaggregation assay was performed as previously described (13).

Bioinformatic and statistical analysis
Hypothetical proteins were analyzed by using Clustalw algorithms (http://www.genome.jp/tools/clustalw/). Protein mass, secondary structure and similarity was determined in predicted proteins by using bioinformatic tools (http://www.bioinformatics. org/sms/prot_mw.html, http://www.ch.embnet.org/software/ LALIGN_form.html and http://prodata.swmed.edu/promals3d/ promals3d.php. 235 and 210 regions of the P orf60 promoter were predicted by using Bprom algorithm (http://linux1.softberry. com/cgi-bin/programs/gfindb/bprom.pl). Statistical analysis of the data for fibronectin binding, fecal shedding and gene expression was performed by using the GraphPad Prism 6 (GraphPad Software, Inc., CA, USA). The statistical significance of the differences in the sample means was calculated by using ANOVA with post hoc Tukey's correction. Results were considered significant at P,0.05.

C. rodentium mouse model
All animal work has been conducted according to relevant national and international guidelines. Four to five week old male C57BL/6 mice were inoculated with 200 ml of a bacterial suspension containing 10 10 CFU in PBS using a feeding needle. The mice were weighed daily and fecal pellets were collected aseptically from each mouse. The number of viable bacteria per gram of feces was determined by plating serial dilutions of the samples onto media containing appropriate antibiotics. For bacterial colonization, groups of 8-10 mice were inoculated with containing 10 10 CFU in 200 ml PBS. Mice were euthanized at day 9 post-infection and bacteria in the large intestinal lumen was quantified as described above.

Ethics statement
Animal experiments were performed in accordance with the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health and with the permission of the American Association for the Assessment and Accreditation of Laboratory Animal Care. The protocol was reviewed and approved by the Institutional Animal Care and Use Committee of the University of Virginia (Protocol No. 3894).

Accession numbers
Presented are database accession numbers for genes identified by BLAST search and presented in Fig. 4. Numbers are listed in