Colonization factors among enterotoxigenic Escherichia coli isolates from children with moderate-to-severe diarrhea and from matched controls in the Global Enteric Multicenter Study (GEMS)

Background Enterotoxigenic Escherichia coli (ETEC) encoding heat-stable enterotoxin (ST) alone or with heat-labile enterotoxin (LT) cause moderate-to-severe diarrhea (MSD) in developing country children. The Global Enteric Multicenter Study (GEMS) identified ETEC encoding ST among the top four enteropathogens. Since the GEMS objective was to provide evidence to guide development and implementation of enteric vaccines and other interventions to diminish diarrheal disease morbidity and mortality, we examined colonization factor (CF) prevalence among ETEC isolates from children age <5 years with MSD and from matched controls in four African and three Asian sites. We also assessed strength of association of specific CFs with MSD.
Methodology/Principal findings MSD cases enrolled at healthcare facilities over three years and matched controls were tested in a standardized manner for many enteropathogens. To identify ETEC, three E. coli colonies per child were tested by polymerase chain reaction (PCR) to detect genes encoding LT and ST; confirmed ETEC were examined by PCR for major CFs (Colonization Factor Antigen I [CFA/I] or Coli Surface [CS] antigens CS1-CS6) and minor CFs (CS7, CS12, CS13, CS14, CS17, CS18, CS19, CS20, CS21, CS30). ETEC from 806 cases had a single toxin/CF profile in three tested strains per child. Major CFs, components of multiple ETEC vaccine candidates, were detected in 66.0% of LT/ST and ST-only cases and were associated with MSD versus matched controls by conditional logistic regression (p≤0.006); major CFs detected in only 25.0% of LT-only cases were not associated with MSD. ETEC encoding exclusively CS14, identified among 19.9% of 291 ST-only and 1.5% of 259 LT/ST strains, were associated with MSD (p = 0.0011). No other minor CF exhibited prevalence ≥5% and significant association with MSD.
Conclusions/Significance Major CF-based efficacious ETEC vaccines could potentially prevent up to 66% of pediatric MSD cases due to ST-encoding ETEC in developing countries; adding CS14 extends coverage to ~77%.


Introduction
Enterotoxigenic Escherichia coli (ETEC) cause diarrheal disease in children <5 years of age in developing countries and travelers' diarrhea among persons from industrialized countries who visit developing countries [1,2]. Human ETEC strains can produce a heat-labile enterotoxin (LT) that resembles cholera toxin and one or more heat-stable enterotoxins (ST) including human ST (STh) or porcine ST (STp). Strains can produce both LT and ST (LT/ST) or be ST-only or LT-only. Most ETEC encode colonization factors (CFs) that allow the pathogen to attach to proximal small intestine enterocytes, the critical site of host-parasite interaction, before expressing enterotoxins that decrease villus tip cell absorption and evoke secretion of electrolytes and water by crypt cells [3].
Three main families of Colonization Factor Antigens (CFAs) are encoded by ETEC that cause diarrhea in humans: CFA/I, CFA/II and CFA/IV [3]. CFA/I is the sole member of the first family. CFA/II strains encode coli surface (CS) antigen 3 (CS3) alone or in combination with CS1 or CS2 [3], while CFA/IV strains encode CS6 alone or in conjunction with CS4 or CS5 [3]. CFA/I, CS1, CS2, CS4 and CS5 are rigid fimbriae ~6-7 nm in diameter, CS3 are thin flexible fibrillae 2-3 nm in diameter [4], and CS6 morphology is nondescript.
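The family definitions above can be expressed as a small classification rule. The following Python sketch is illustrative only: the function name `classify_cfa_family` and the input format (a set of detected CF labels) are assumptions rather than part of the GEMS methods, and the "never both" restriction follows the convention stated in the table footnotes of this report.

```python
def classify_cfa_family(detected_cfs):
    """Map a set of detected major CFs to 'CFA/I', 'CFA/II', 'CFA/IV' or None.

    Hypothetical helper for illustration; labels follow the text
    (e.g. "CFA/I", "CS3", "CS6").
    """
    found = set(detected_cfs)
    if found == {"CFA/I"}:
        return "CFA/I"
    # CFA/II: CS3 alone, or CS3 with CS1 or CS2, but never both CS1 and CS2.
    if "CS3" in found and found <= {"CS1", "CS2", "CS3"} and not {"CS1", "CS2"} <= found:
        return "CFA/II"
    # CFA/IV: CS6 alone, or CS6 with CS4 or CS5, but never both CS4 and CS5.
    if "CS6" in found and found <= {"CS4", "CS5", "CS6"} and not {"CS4", "CS5"} <= found:
        return "CFA/IV"
    # Anything else (e.g. CS1 without CS3) falls outside the three families.
    return None
```

Under these assumptions, a strain with CS1 and CS3 is classified as CFA/II, a CS6-only strain as CFA/IV, and a strain with CS1 but no CS3 is left unclassified.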
ETEC vaccines intending to stimulate anti-CF immunity, with or without accompanying antitoxin immunity, are in clinical development. These include purified fimbriae or tip adhesins [5], inactivated fimbriated ETEC [6], attenuated ETEC expressing CFs [7], bacterial live vectors such as Shigella encoding ETEC CFs [8], multiple epitope fusion antigens [9], and ST toxoids [10]. Stimulating intestinal secretory IgA antibodies that bind CFs and prevent ETEC from attaching to human small intestine mucosa is generally considered to be fundamental to a successful ETEC vaccine, although some contend that parenteral vaccine-induced serum IgG antibodies that transude onto intestinal mucosa may also prevent diarrhea in humans caused by bacterial enteropathogens [11]. Most current ETEC vaccines contain a mix of antigens directed against CFA/I, CFA/II and CFA/IV antigens.
DNA-based high-throughput diagnostics have enabled large epidemiologic studies to quantify the ETEC disease burden among young children in developing countries and to assess the prevalence of various CFs. The overall objective of the Global Enteric Multicenter Study (GEMS) was to estimate the population-based burden, microbiologic etiology and adverse clinical consequences of moderate-to-severe diarrhea among children 0-59 months of age in study sites in sub-Saharan Africa and South Asia to guide the development and implementation of vaccines and other interventions [1]. GEMS tested for a large number of diarrheal pathogens, including ETEC, among cases of moderate-to-severe diarrhea (MSD) and matched (by age, gender, neighborhood and time of presentation) control children without diarrhea in three age strata (0-11, 12-23 and 24-59 months) at four sites in sub-Saharan Africa and three in South Asia [1,20], the geographic regions where 80% of global diarrheal deaths occur. The main underlying assumption of GEMS was that a limited number of etiologic agents may be responsible for a disproportionately large fraction of all MSD [21]. ST-producing ETEC, i.e., LT/ST and ST-only strains, were significantly incriminated as pathogens and placed ETEC as one of the top four pathogens associated with MSD across all seven sites and age groups [1].
A secondary aim of GEMS was to elucidate the proportion of ETEC strains, by toxin genotype, that encode the main CFs and selected minor CFs. Herein we present the proportion of GEMS ETEC isolates that encode the main CFs found in most ETEC vaccines under development, and the prevalence of ten other putative attachment factors (CS7, CS12, CS13, CS14, CS17, CS18, CS19, CS20, CS21, CS30) that have been proposed as potential antigens to broaden ETEC vaccine immunoprophylaxis. In addition, based on the GEMS case/control design, we utilized conditional logistic regression to assess the strength of association with MSD of ETEC of the different toxin genotypes encoding the major and minor CFs.

Study design and population
The rationale [20], assumptions, clinical, epidemiological and microbiological methods of GEMS [1,22], a three-year case-control study undertaken among children <5 years of age in Gambia (Basse), Mali (Bamako), Mozambique (Manhiça) and Kenya (Siaya County) in sub-Saharan Africa and India (Kolkata), Bangladesh (Mirzapur) and Pakistan (Karachi-Bin Qasim Town) in South Asia, have been described. MSD was defined as an acute episode of diarrhea (≥3 loose stools during a 24-hour period) that started within the previous seven days, was separated from another episode by ≥7 days, and was accompanied by either signs of dehydration (sunken eyes, slow abdominal "skin pinch" recoil or administration of intravenous fluids), dysentery or admission to hospital based on clinical concern over diarrheal disease severity [1,23].
The current GEMS report includes a descriptive summary of the prevalence of CFs among ETEC isolates from cases and controls by toxin profile and country, followed by analyses that utilize the GEMS matched case-control design to test hypotheses that major or minor CFs might be significantly related to the risk of MSD. Collectively, this information can help guide ETEC vaccine developers.

Ethics
This research involved characterization of isolates of enterotoxigenic Escherichia coli obtained from participants in the Global Enteric Multicenter Study (GEMS). The ethical review methods for this study were described in detail [23], as well as summarized in the overall publication of the results of the clinical study: "The clinical protocol was approved by ethics committees at the University of Maryland, Baltimore, MD, USA, and at every field site [1]. Written informed consent was obtained from the parent or primary caretaker of each participant before initiation of study activities [1]." The clinical protocol included the collection of stool specimens or rectal swabs that were tested for the presence of colonies of enterotoxigenic E. coli and for the presence of other enteric pathogens [1,22].
Some of the minor CFs were selected for testing because epidemiologic data incriminate them as being associated with pediatric diarrhea (e.g., LT-only strains expressing CS7) [13,14]. We tested for other minor CFs because volunteer challenges with well characterized strains encoding them showed that they can elicit diarrhea (e.g., CS17 and CS19) [29]. CS14 was studied because it has been common among ST-only and LT/ST ETEC in various reports [30,31]. CS18 and CS20 were studied because they share high homology. CS12 and CS21 ("longus") were studied because of the long-term interest of some GEMS investigators [32][33][34], their global prevalence [30], and the contention of some advocates that they are virulence attributes [35]. CS30 was studied because it is found in LT/STp isolates and has homology to CS18 and CS20 [19].
We also selected the cited minor CFs to be studied based on their genetic relationships within the usher genomic typing system [36][37][38]. The majority of ETEC CFs are synthesized and transported utilizing a chaperone-usher system that typically contains four genes encoding a periplasmic chaperone, a major fimbrial subunit, an outer membrane usher and a minor subunit tip adhesin. Since there is only a single usher gene among these ETEC CFs, they can be readily classified by their sequences [36-38]. All the major CFs except CS3 and CS6 are found within the α usher sequence group, including CFA/I, CS1, CS2, CS4 and CS5. Minor CFs in this α group include CS7, CS14, CS17 and CS19; these homologies were another reason we tested for these CFs among the GEMS ETEC isolates. The γ2 usher family includes four minor CFs of interest, CS12, CS18, CS20 and CS30, which is partly why we tested for them. CS13 belongs to the κ group [37]. CS3 and CS6 major CFs reside within the γ3 usher group. CS8 (previously called CFA/III), which was not studied, and CS21 are not classifiable within the chaperone-usher system, since they are synthesized as type IVb pili.
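For readers who wish to cross-reference CFs by usher group, the assignments listed in the preceding paragraph can be held in a simple lookup table. This Python sketch is a convenience illustration; the names `USHER_GROUP`, `TYPE_IVB` and `usher_group` are hypothetical, and the table contains only the assignments stated in this report.

```python
# Usher-group membership exactly as listed in the text.
USHER_GROUP = {
    "α": ["CFA/I", "CS1", "CS2", "CS4", "CS5", "CS7", "CS14", "CS17", "CS19"],
    "γ2": ["CS12", "CS18", "CS20", "CS30"],
    "γ3": ["CS3", "CS6"],
    "κ": ["CS13"],
}

# Type IVb pili, which are not classifiable within the chaperone-usher system.
TYPE_IVB = {"CS8", "CS21"}


def usher_group(cf):
    """Return the usher group of a CF, or None for type IVb pili."""
    if cf in TYPE_IVB:
        return None
    for group, members in USHER_GROUP.items():
        if cf in members:
            return group
    raise KeyError(f"{cf} is not listed in this report")
```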
CS18 and CS20 were initially tested using previously described primers that amplify sequences within fotG (which encodes the tip adhesin of CS18) [25], and csnA (which encodes the major subunit of CS20) [27]. With the recent report of CS30, a new minor colonization factor (CF) [19], and revelation of its similarity to CS18 and CS20, new primers were designed to enhance specificity. The new primers to detect CS18 amplify a sequence within fotA (that encodes the major fimbrial subunit) rather than fotG. Alignments of major and minor structural subunit genes of CS18, CS20 and CS30 are shown in Figs 1 & 2. Reference strains served as positive controls [16].
All isolates from the 806 cases and 711 control participants whose cultures yielded ETEC isolates were also tested by polymerase chain reaction (PCR) for genes encoding CFA/I and CS1-CS6, the major colonization factors (CFs). In addition, these isolates were all tested by PCR for several minor CFs including CS7, CS12, CS14, CS17 and CS21, all of which had been proposed to be potential virulence attributes and potential antigens to be included in an ETEC vaccine intending to elicit anti-colonization immunity. ETEC isolates from cases (N = 203) and controls (N = 295) that were negative for the major CFs and for minor CFs CS7, CS12, CS14, CS17 and CS21 were tested for genes encoding several additional minor CFs including CS13, CS18, CS19 and CS20; isolates from nine cases and eight controls could not be tested because they were not recoverable. After completion of testing for CS13, CS18, CS19 and CS20, a new minor CF, CS30, was reported as being found among a proportion of LT/ST isolates [19]. We thereupon re-tested for CS30 the 113 LT/ST isolates that were among the above-mentioned 203 case isolates; the 65 LT/ST isolates among the above-mentioned 295 control isolates were also tested. However, because of sequence homologies among CS30, CS18 and CS20, we also re-tested the 113 case and 65 control isolates for CS18 and CS20 using new primers that were designed to increase specificity (vide supra) (Figs 1 & 2).
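The tiered testing sequence described above can be summarized as a decision rule. The Python sketch below is a simplified illustration under stated assumptions: the function name `panels_for_isolate`, the panel labels and the input format are hypothetical and do not represent laboratory software used in GEMS.

```python
MAJOR_CFS = {"CFA/I", "CS1", "CS2", "CS3", "CS4", "CS5", "CS6"}
FIRST_MINOR_PANEL = {"CS7", "CS12", "CS14", "CS17", "CS21"}
SECOND_MINOR_PANEL = {"CS13", "CS18", "CS19", "CS20"}


def panels_for_isolate(toxin_genotype, first_round_hits):
    """List the PCR panels an isolate would receive under the tiered scheme.

    toxin_genotype: "ST-only", "LT/ST" or "LT-only".
    first_round_hits: CFs detected by the major and first minor panels.
    """
    # Every confirmed ETEC isolate is tested for major CFs and the first minor panel.
    panels = ["major CFs", "first minor panel"]
    hits = set(first_round_hits) & (MAJOR_CFS | FIRST_MINOR_PANEL)
    if not hits:
        # Only isolates negative in the first round go on to the second panel.
        panels.append("second minor panel")
        if toxin_genotype == "LT/ST":
            # LT/ST isolates were additionally tested for CS30 and
            # re-tested for CS18 and CS20 with the new primers.
            panels.append("CS30 with CS18/CS20 re-test")
    return panels
```

For example, an LT/ST isolate with no first-round hits would receive all four panels, whereas an ST-only isolate encoding CS14 would stop after the first two.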
Crude bacterial lysate was obtained by boiling five pooled colonies of each ETEC isolate in 0.1% Triton X-100 for 10 min, followed by centrifugation at 8000×g for five minutes to separate template DNA in the supernatant from cellular debris. PCR was performed with total bacterial DNA in a 25-μL reaction, containing 10 mmol/L deoxyribonucleotide triphosphate mix, 30 mmol/L MgCl2, 1× reaction buffer (10 mmol/L Tris-HCl, 50 mmol/L KCl), one Unit of Taq polymerase (GoTaq; Promega, Madison, WI), and 1 μL of template DNA. Primers were used at concentrations shown in Table 1. To prevent nonspecific amplification, we used the "hot start" technique, which includes preheating reaction mixtures to 94°C for five minutes before adding Taq DNA polymerase. Samples were amplified for 35 cycles, with each cycle comprising 90 seconds at 94°C for denaturation, 30 seconds at the primer-specific annealing temperature and 60 seconds at 68°C for strand elongation, followed by a final extension at 72°C for five minutes. PCR products were electrophoresed in 2.0% agarose, stained with ethidium bromide, and amplicons identified based on expected size of the amplified product compared with amplicons of reference strains.
A subset of ETEC isolates were sent to the WHO Enterotoxigenic Escherichia coli Reference Laboratory, University of Gothenburg, Sweden, where they were tested for STp, STh, major CFs and phenotypic expression of CFs using monoclonal antibodies [27]. Gothenburg primers for STp and CS5 were used in Chile in addition to local primers [16,24,25,27].

Data analysis
Presentation of the descriptive observational data and analyses were restricted to ETEC cases that had a single ETEC toxin/CF genotype pattern.
Descriptive data. Prevalences of ETEC CFs were expressed as percentages in a stratified manner by ETEC toxin profile, site and region.
Matched case-control studies of the associations between ETEC CFs and MSD. Analyses of the strength of association between ETEC toxin and CF genotypes and MSD were performed using conditional logistic regression models in which the outcome was case-control status (MSD) and the independent variable (covariate) was whether the child's ETEC had the specific CF (no/yes). Conditional logistic regression was dictated by the matched case/control design, while the Firth penalized-likelihood approach was indicated because the subset of ETEC cases and ETEC controls that encode CFs is relatively small compared to the total number of children with ETEC infection. Matched odds ratios (ORs) and corresponding 95% confidence intervals (CIs) were obtained from these models. Because pooled as well as site-specific analyses were conducted, we examined for heterogeneity in ORs across sites using a chi-square test for heterogeneity. A p ≤0.05 was considered significant. We did not use a Bonferroni adjustment for these 19 individual conditional logistic regression analyses of the association of individual minor CFs with MSD, as in each instance an individual hypothesis was tested [41][42][43][44]. Data were analyzed using SPSS version 23 (IBM, Inc., Armonk, NY) and SAS statistical software version 9.4 (SAS Institute Inc., Cary, NC, USA).
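To illustrate what a matched odds ratio measures, the following Python sketch treats the simplest special case: 1:1 matched pairs with a single binary exposure, for which the conditional maximum likelihood estimate reduces to the ratio of discordant pairs. This is not the analysis performed in SPSS or SAS for this report; it omits the Firth penalization and does not handle more than one control per case. The function name and the example data are hypothetical.

```python
import math


def matched_pair_odds_ratio(pairs):
    """Matched OR and approximate 95% CI from 1:1 case-control pairs.

    pairs: iterable of (case_exposed, control_exposed) booleans, where
    "exposed" means the child's ETEC encoded the CF of interest.
    """
    pairs = list(pairs)
    # Only discordant pairs carry information in a conditional analysis.
    case_only = sum(1 for case, ctrl in pairs if case and not ctrl)
    ctrl_only = sum(1 for case, ctrl in pairs if ctrl and not case)
    if case_only == 0 or ctrl_only == 0:
        # With an empty discordant cell the unpenalized estimate is undefined.
        raise ValueError("no discordant pairs in one direction")
    odds_ratio = case_only / ctrl_only
    se_log_or = math.sqrt(1 / case_only + 1 / ctrl_only)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, (lower, upper)


# Hypothetical data: 12 pairs with only the case exposed, 4 with only the
# control exposed, plus concordant pairs that do not affect the estimate.
example_pairs = (
    [(True, False)] * 12 + [(False, True)] * 4
    + [(True, True)] * 3 + [(False, False)] * 20
)
example_or, example_ci = matched_pair_odds_ratio(example_pairs)
```

The error raised for an empty discordant cell shows why sparse data are problematic for the unpenalized estimate; penalized approaches such as Firth's are one way to obtain finite estimates in that situation.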

Results
When tested in a standardized manner in GEMS field-site laboratories using a multiplex PCR that included primers to detect genes encoding STh and LT [22], colonies from 1067 of 9439 MSD cases (11.3%) and from 975 of 13,129 matched control subjects (7.4%) tested positive. ETEC isolates were sent to the GEMS Reference Laboratory at the University of Chile to detect CFs [26]. Upon arrival, all isolates were re-tested to detect LT, STh and STp genes, since upon storage, sub-culture or transport, ETEC isolates may lose toxin or CF genes [45][46][47][48]. E. coli isolates from 894 of the 1067 cases were confirmed as ETEC and among the triplets of isolates tested from each case, 806 cases (90.2%) had a single toxin/CF profile observed; 83 others (9.3%) had two profiles and five cases (0.6%) had three different profiles recorded. A single toxin/CF profile was found among triplets of 711 controls.
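The restriction to children with a single toxin/CF profile can be sketched as follows. This Python example is a minimal illustration, assuming each isolate is recorded as a toxin genotype plus a set of detected CFs; the function name `profile_count` and the two example children are hypothetical.

```python
from collections import Counter


def profile_count(isolates):
    """Number of distinct toxin/CF profiles among one child's isolates.

    isolates: iterable of (toxin_genotype, set_of_cfs), one entry per colony.
    """
    return len({(toxin, frozenset(cfs)) for toxin, cfs in isolates})


# Hypothetical triplets: child_A has one profile, child_B has two.
children = {
    "child_A": [("ST-only", {"CFA/I"})] * 3,
    "child_B": [("ST-only", {"CS14"}), ("ST-only", {"CS14"}), ("LT-only", set())],
}

# Tally how many children show one, two or three profiles.
summary = Counter(profile_count(triplet) for triplet in children.values())

# Children retained for analysis under the single-profile restriction.
single_profile_ids = [cid for cid, triplet in children.items()
                      if profile_count(triplet) == 1]
```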

Toxin genotypes among ETEC isolates from MSD cases and controls
Among the 806 single toxin/CF profile cases and 711 controls, the percentages of children at each site who harbored ETEC isolates of the different enterotoxin genotypes are shown in Table 2.

Major CFs among ETEC from MSD cases and controls
The proportion of ETEC strains from MSD cases that carry major CF antigens including CFA/I and CS1-CS6, by toxin genotype, are shown by country (Table 3) and summarized by continent (Fig 3). Overall, 363 (66.0%) of 550 ST-only and LT/ST strains encoded a major CF including 20.4% encoding CFA/I, 14.0% encoding CFA/II (i.e., CS3 alone or with CS1 or CS2) and 31.6% encoding CFA/IV (i.e., CS6 alone or with CS4 or CS5). The only major CF commonly observed among LT-only isolates was CS6-only, recorded in 43 of 256 LT-only strains (16.8%). Only three of 256 LT-only strains (1.2%) encoded CFA/I or CFA/II.
The 975 putative ETEC strains from control subjects that arrived at the GEMS ETEC Reference Laboratory at the University of Chile were re-tested to detect LT, STh and STp genes, of which 748 were confirmed as positive.

Minor CFs among ETEC case isolates lacking major CFs
Recognizing that 34.0% of ST-only and LT/ST strains and 75.0% of LT-only strains do not encode a major CF, we investigated those isolates to detect ones that encode exclusively one of the following characterized minor CF antigens: CS7, CS12, CS13, CS14, CS17, CS18, CS19, CS20, CS21 or CS30. We determined the proportion of ETEC MSD cases that had isolates encoding one of these minor CFs in the absence of a major CF and that accounted for at least 5.0% of the overall case isolates of that toxin genotype (Table 5).
Among MSD cases with ST-only ETEC, only CS14, identified in 58 ST-only cases (19.9%), reached a prevalence of ≥5% (Table 5); four MSD cases with LT/ST isolates lacking major CFs also encoded solely CS14 (1.5%). Case isolates having other minor CS antigens encoded as the sole CS were uncommon (<5%) among ST-only and LT/ST isolates. As an example, we cite recently described CS30 [19]. Among the 83 LT/ST cases whose isolates lacked major CFs, 16, all LT/STp genotype, encoded CS30 but only five cases had CS30 as the sole CS. The 192 MSD cases with LT-only isolates lacking major CFs included strains encoding CS7 (7.8%) or CS17 (6.6%) as sole CS antigens, yielding a cumulative prevalence of 14.4% for LT-only strains encoding one of those two minor CFs (Table 5).

Conditional logistic regression analyses to assess the strength of association between CF-toxin genotypes and MSD
The GEMS case/control design allowed us to assess the strength of association between the various major and minor CFs and MSD among cases versus their matched controls using conditional logistic regression models. To document the validity of this methodology, we first quantified the strength of association with MSD of the major CFs (Table 6), since they are widely regarded as true virulence attributes. ST-only and LT/ST strains encoding CFA/I, CFA/II and CFA/IV were all significantly associated with MSD (p<0.0001, p = 0.006, p<0.0001, respectively). In contrast, LT-only strains encoding only CS6 or CS5 and CS6 were not significantly associated with MSD (p>0.05; Table 6).
Conditional logistic regression modeling was then performed to assess the association between LT/ST and ST-only ETEC expressing one of the 10 minor CFs alone (CS7, CS12, CS13, CS14, CS17, CS18, CS19, CS20, CS21 or CS30) and MSD. Among ST-only and LT/ST ETEC strains encoding exclusively a single minor CF but no major CF, only CS14 was significantly associated with MSD (Table 6).
When conditional logistic regression was performed for LT-only cases and ETEC strains encoding exclusively one of these ten minor CFs, only CS21 exhibited a significant association (p = 0.028). However, LT-only isolates expressing CS21 exclusively were uncommon among cases (N = 4) and matched controls (N = 0). CS7 did not show a significant association (p = 0.071) but the sample sizes of cases (N = 20) and controls (N = 12) were small. We did not use a Bonferroni adjustment for these individual conditional logistic regression analyses of the association of individual minor CFs with MSD, as in each instance an individual hypothesis was being tested [41-44].

Phenotypic expression of CFs
ETEC isolates from 443 cases (338 encoding major CFs and 105 encoding a single minor CF) were tested by dot blot immunoassay with specific anti-CF antibodies to determine the percent that phenotypically expressed the encoded CF antigens on their bacterial surface. Of ETEC encoding CFA/I or CS1-CS5, 73.8-95.1% of isolates tested were dot blot-positive (Table 7); the exceptions were the 65 CS6-only isolates tested that showed only 38.5% positivity. Among ETEC case isolates encoding one of the four minor CFs tested, dot blot immunoassay positivity ranged from 67.2% for CS17 to 94.4% for CS7.

Discussion
The GEMS case/control study demonstrated that ST-only and LT/ST ETEC, the enterotoxin genotypes exhibited by circa two-thirds of ETEC isolates from patients, were strongly associated with MSD [1,49]. Field studies involving small pediatric cohorts prospectively followed under active household surveillance document that these toxin types are also incriminated as causing milder diarrhea [13,50,51]. Most LT-only strains are not associated with diarrhea [1,13,50,52,53], as some descend from LT/ST strains through loss of genes encoding ST and a CF [3]. Nevertheless, evidence from diarrhea outbreaks in industrialized countries [54], experimental challenges in U.S. volunteers [12,55], and epidemiological studies in developing countries indicate that a subset of LT-only strains do appear to be bona fide diarrheal pathogens [13,14], and it would be desirable to prevent diarrhea caused by that subset. The quandary, heretofore, has been how to identify accessory virulence attributes that distinguish that subset.
Table 4 footnotes: a CFA/II strains are defined as encoding CS3 either alone or in combination with either CS1 or CS2 but never both CS1 and CS2. Very rarely, isolates that encode CS1 without CS3 have been reported [26], but the rare CFs of this nature recovered in GEMS are not included in this table. b CFA/IV strains are defined as encoding CS6 either alone or in combination with either CS4 or CS5, but never both CS4 and CS5. Very rarely, isolates that encode CS5 without CS6 have been reported, but the few such isolates recovered in GEMS are not included in this table. https://doi.org/10.1371/journal.pntd.0007037.t004
Four cardinal findings emerged from examining the CF genotypes of the GEMS ETEC isolates. First, GEMS results confirm that ETEC vaccines based on stimulating immune responses to the major CFs (CFA/I and CS1-6), if highly efficacious in blocking CF-mediated attachment to enterocytes, could prevent diarrhea caused by up to 66% of the ST-only and LT/ST strains, the toxin genotypes strongly incriminated as pathogens (Table 5). The fact that ETEC encoding these CFs were observed in a very large study involving multiple representative sites in sub-Saharan Africa and South Asia validates that ETEC vaccine strategy for the geographic regions where 80% of young child diarrheal deaths occur worldwide. In contrast, major CFs were uncommon among LT-only isolates.
Important avenues of ETEC vaccinology research have focused on identifying additional CFs among ST-only and LT/ST isolates that lack CFA/I, CFA/II and CFA/IV and on identifying minor CFs that might be targets for protective immune responses directed against the subset of LT-only strains that are pathogenic. Thus, the second cardinal observation is identification of the proportion of strains in each toxin genotype that lacked a major CF but that exclusively expressed one minor CF, including either CS7, CS12, CS13, CS14, CS17, CS18, CS19, CS20, CS21 or CS30. Collectively these minor CFs raised the percent of ST-only cases having a recognized CF from 64.3% (187/291) to 86.3% (251/291) and raised the percent of LT/ST cases having a recognized CF from 68.0% (176/259) to 79.9% (207/259) (Table 5). The percent of MSD cases with LT-only ETEC having a recognized CF similarly rose from 25.0% (64/256) to 54.7% (140/256). Although minor CFs encoded by the different toxin genotypes of ETEC strains collectively raised the proportion of cases that had a CF target, it was not known if these minor CFs also identified these strains as being pathogenic, i.e., significantly associated with MSD, as were the major CFs. Moreover, most individual minor CFs were uncommon, defined as <5% of strains of a toxin genotype that lacked a major CF. Thus, a third cardinal observation was assessment of the strength of association between MSD and ETEC encoding the various major and minor CFs within the different toxin genotypes. These novel analyses (Table 6) showed that ST-only and LT/ST ETEC encoding the major CFs were significantly associated with MSD (CFA/I and CFA/IV, p<0.0001; CFA/II, p = 0.006). Thus, CFA/I, CFA/II and CFA/IV are not only surface-exposed targets for effector immune responses, but when expressed by LT/ST and ST-only ETEC they are markers indicating that these strains are strongly incriminated as diarrheal pathogens. In contrast, LT-only isolates encoding CS6 alone or CS5 and CS6 were not significantly associated with MSD (Table 6), nor was there a trend.
A notable proportion (19.9%) of ST-only isolates encoded CS14 alone, i.e., with no other minor or major CFs, and these were significantly associated with MSD (p = 0.001) (Table 6). No other individual minor CF was both common and significantly associated with MSD among the cases infected with ST-only and LT/ST isolates.
Whereas LT-only strains encoding exclusively CS21 were significantly associated with MSD (p = 0.021), these strains were distinctly uncommon. LT-only strains encoding exclusively CS7 were prevalent (7.8%, Table 5) but they were not significantly associated with MSD (p = 0.07). However, the sample sizes of cases (N = 20) and controls (N = 12) with LT-only encoding exclusively CS7 were small, so further investigation of this CF should be encouraged to explore the potential role of CS7 for potential inclusion in an ETEC vaccine. Support for this notion comes from two small infant cohort studies in Guinea-Bissau (N = 200) and Egypt (N = 348) that assessed the association of ETEC encoding specific CFs with diarrhea using logistic regression models and reported significant associations of LT-only CS7 with infant diarrhea [13,14].
Since experimental challenge with an LT-only strain encoding CS17 fomented diarrhea in adult volunteers [12], it was somewhat surprising that LT-only/CS17-only strains were not significantly associated with MSD in young children in GEMS. Nevertheless, because CS17 shares epitopes with CFA/I, CS1 and CS2, and CS7 shares epitopes with CS5, the immune responses to these major CFs within a vaccine that also contains a LT toxoid to stimulate anti-LT could collectively confer protection against pathogenic LT-only strains encoding CS17 and CS7 [56].
The fourth key observation is that inclusion of CS14 expands the breadth of vaccine coverage against ST-only pathogens, raising it from 64.3% to 84.2% (Table 6). Obviously, antigenic expansion, particularly if multiple minor CFs beyond CS14 (e.g., CS7, CS17, CS21) were to be added to a major CF-based ETEC vaccine, would increase the vaccine's complexity and cost. Nevertheless, there is precedent for successfully addressing this problem with other bacterial vaccines. Pneumococcal conjugate vaccines were expanded from 7-valent to 13-valent to allow broader global coverage, while multivalent meningococcal conjugate vaccines currently include four separate serogroup conjugates. Some ETEC vaccine strategies, such as attenuated Shigella live vectors encoding two separate CFs per vector strain, can be adapted relatively easily to express additional CFs [57].
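The coverage figures quoted in this Discussion follow directly from the case counts reported above (187 of 291 ST-only and 176 of 259 LT/ST cases with a major CF; 58 ST-only and four LT/ST cases with CS14 as the sole CF). The short Python sketch below reproduces that arithmetic; the variable and function names are illustrative, and "coverage" here means only the share of case isolates encoding a vaccine-targeted CF, not vaccine efficacy.

```python
# Case counts taken from the Results and Discussion of this report.
CASES = {
    "ST-only": {"n": 291, "major_cf": 187, "cs14_only": 58},
    "LT/ST": {"n": 259, "major_cf": 176, "cs14_only": 4},
}


def coverage(groups, include_cs14):
    """Percent of case isolates encoding a targeted CF, to one decimal."""
    covered = sum(
        g["major_cf"] + (g["cs14_only"] if include_cs14 else 0) for g in groups
    )
    total = sum(g["n"] for g in groups)
    return round(100 * covered / total, 1)


st_only = [CASES["ST-only"]]
st_encoding = list(CASES.values())  # ST-only plus LT/ST

print(coverage(st_only, False), coverage(st_only, True))          # 64.3 84.2
print(coverage(st_encoding, False), coverage(st_encoding, True))  # 66.0 77.3
```

The first line corresponds to ST-only strains alone and the second to all ST-encoding strains combined.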
Another approach to broaden coverage of an ETEC vaccine is based on formulating a mix of fimbrial tip adhesin proteins [5]. Fimbrial CFs can be classified based on the amino acid sequence relatedness of their tip adhesin proteins, with several important ETEC CFs falling into Class 5 fimbriae assembled by the alternate chaperone pathway. Whereas the major fimbrial subunit proteins that create the stalks of these fimbriae differ substantially from one another antigenically, their tip adhesin proteins are highly conserved into three sub-classes [58,59]. Antibody against one adhesin of the subclass cross protects against attachment by other members. Thus, protection may also be broadened by this strategy. Selecting which tip adhesins to include in a multivalent vaccine requires knowing the frequency of the CFs among ETEC globally; so the GEMS data inform this vaccine strategy as well. Another strategy to broaden ETEC vaccine coverage would include non-fimbrial surface antigens, e.g., EtpA (a non-fimbrial adhesin) and EatA (a serine protease) [60].
Among ETEC strains encoding a major CF other than CS6 alone or a minor CF, 67.2%-95.1% of isolates reacted with the specific homologous anti-CF antibody by dot blot immunoassay, thereby documenting phenotypic expression. The exceptions were the CS6-only isolates of which only 25/65 (38.5%) were dot blot-positive (Table 7) whether they were LT-only isolates encoding CS6-only (10/32, 31.3%) or ST-encoding CS6-only strains (15/33, 45.5%). The expression of CFs is highly regulated [61][62][63], with temperature, bile, concentrations of glucose, glutamine and iron, and proximity to epithelial cells all influencing expression [64][65][66][67]. Thus, one explanation for lack of expression is that the in vitro growth conditions that we utilized did not induce the regulated biosynthesis of isolates encoding certain CFs. Transcriptional regulators such as CfaD (also called CfaR) and Rns that are members of the AraC family of transcriptional regulators modulate the expression of CFA/I, CS1, CS2, CS4 and CS5 [61,62,66,68]. In contrast, although certain growth conditions such as temperature modify CS6 expression [67], no specific positive regulator has been identified for CS6 [68][69][70]. Alternatively, isolates that are PCR-positive but dot blot-negative may have single nucleotide polymorphisms, minor mutations in structural or chaperone genes or lower copy number plasmids that still allow amplification by PCR but may diminish or abrogate expression of the CF [69,71].
One theoretical limitation of our study is that the PCR primers designed to detect ST at the field sites were optimized for STh; thus some STp-only isolates may have been missed. LT-STp strains from cases were not under-estimated in GEMS because all LT-only strains were retested with PCRs individually optimized for STp and STh in the Chilean Reference Laboratory and upon re-testing only a limited number of LT-only isolates were found to be LT/STp. We believe that few cases and controls with STp-only were missed. A GEMS follow-on study detected STp and STh in genomic DNA extracted from whole stool specimens of a subset of 5304 case/matched control pairs using a TaqMan Card-based quantitative real-time PCR (qPCR) methodology and documented that the ST burden was overwhelmingly attributed to STh [49]. Optimized detection of STp by qPCR increased the overall ETEC disease burden estimate by only 15% versus what was recorded using the gel-based PCR methodology at the field sites [49]. This is similar to the overall difference based on presumed gene loss between primary isolation and results of re-testing strains following storage and transport to the Reference Laboratory. Other studies have found that STp-only isolates are uncommon compared to STh-only when methods sensitive for STp are used [13].
Analyzing the array of CFs among GEMS ETEC isolates has provided important information to guide ETEC vaccine development and future deployment. Since ST-only and LT/ST strains are strongly incriminated as the key ETEC pathogens, a fimbrial-based ETEC vaccine that included CFA/I, CS1-6 and CS14, if highly efficacious, could theoretically confer protection against up to ~77% of such ETEC pathogens.