Recognition and Degradation of Plant Cell Wall Polysaccharides by Two Human Gut Symbionts

Competition for nutrients contained in diverse types of plant cell wall-associated polysaccharides may explain the evolution of substrate-specific catabolic gene modules in common bacterial members of the human gut microbiota.


Introduction
The human distal gut is home to a densely populated microbial community (microbiota) that plays key roles in health and nutrition. The microbial symbionts that occupy this habitat produce an arsenal of enzymes that degrade dietary complex carbohydrates (glycans) that cannot be hydrolyzed by host enzymes [1]. The simple sugars generated are fermented into host-absorbable end products, including short-chain fatty acids, that can contribute as much as ~10% of the calories extracted from the human diet and are thought to play a role in preventing colorectal cancer [2]. Viewed at the broad taxonomic level of bacterial phylum, two groups of bacteria dominate the distal gut microbiota of adult humans and other mammals: the Bacteroidetes and the Firmicutes [3-6]. Studies of cultivated human microbiota species indicate that the Bacteroidetes, composed largely of members of the genus Bacteroides, exhibit broad capacities to metabolize a variety of plant- and animal-derived glycans [7,8].
Bacteroidetes from a variety of environments including the human gut employ a similar strategy for binding and degrading various glycans [9]. These Gram-negative bacteria have amplified and permuted a series of gene clusters termed polysaccharide utilization loci (PULs). Each PUL that has been characterized to date encodes a suite of cell envelope-associated proteins (Sus-like system) that confer the ability to metabolize a single glycan or group of related glycans [8,10-12]. Each Sus-like system contains at least one pair of outer membrane proteins homologous to SusC and SusD, which are essential for the import and degradation of starch by the prototypic starch-utilization system (Sus) for which related systems are named [10]. SusC-like proteins are predicted TonB-dependent receptors that span the outer membrane and transport oligosaccharides in an energy-dependent manner. SusD-like proteins are outer membrane lipoproteins that are oriented towards the external environment; they bind directly to specific glycans and contribute to the capture and delivery of oligosaccharides to the SusC transporter [12,13]. SusC- and SusD-like proteins function in concert with other outer membrane glycan-binding proteins and polysaccharide-degrading enzymes (glycoside hydrolases, polysaccharide lyases, and carbohydrate esterases), which are grouped into sequence-based families in the Carbohydrate Active Enzymes (CAZy) database [14].
PULs frequently contain genes encoding inner membrane sensor-regulator systems that control the expression of genes in their associated and usually adjacent locus [12,15,16]. These sensor-regulators are most commonly extra-cytoplasmic function sigma (ECF-σ)/anti-σ factor pairs or hybrid two-component systems (HTCS) that contain all of the domains of a classical two-component system phosphorelay in a single polypeptide [17]. While glycans are known to activate PUL-associated HTCS, there is a paucity of information about the actual molecular cues recognized, and the mechanism by which these inducers mediate their effect, although a previous study has indicated that large oligosaccharides are involved in activation of an HTCS-controlled xylanase locus from a ruminant Bacteroidetes [18]. In the simplest model, the activating glycan will bind directly to the periplasmic sensor domain, as occurs when fructose activates the B. thetaiotaomicron fructan PUL by binding directly to its HTCS [12]. However, the periplasmic sensor domain of the fructan HTCS is atypical compared to those found in other HTCS (i.e., it is around one-third the size and adopts a periplasmic binding protein fold). It is therefore possible that the sensor domains most commonly associated with other PULs recognize more complex ligands, that the glycans are presented to the cognate HTCS bound to other periplasmic proteins, or that the HTCS interact with part of their cognate SusC porin in the periplasm in a manner analogous to the trans-envelope signaling that occurs in ECF-σ/anti-σ systems [16,19]. Furthermore, the extent to which there is cross-talk between the regulatory systems of different PULs is unclear. Thus, while previous animal feeding experiments have shown which PULs in B. thetaiotaomicron are activated in vivo by diets containing plant polysaccharides [20,21], we do not know the specific components of the diet that activate individual PULs.
A major factor shaping the balance between different human gut bacterial phylogenetic types (phylotypes) is the ability of each group to compete efficiently for the complex glycans that are delivered to the distal intestine. To understand how human gut bacteria have evolved to occupy distinct niches, we have measured the ability of two closely related Bacteroides phylotypes to metabolize complex dietary glycans. B. thetaiotaomicron and B. ovatus share 96.5% nucleotide sequence identity in their 16S rDNA genes. With the exception of cellulose, these species are together capable of using all major glycan classes found in the gut mucosa and in plant cells as sole carbon sources. However, each species alone has evolved to target only a partial set of all possible glycans. We use transcriptional profiling, in conjunction with the characterization of mutants lacking functional PULs, to establish that their ability to metabolize different plant cell wall glycans is contingent on selective expression of PUL-encoded Sus-like systems. Most of the identified plant glycan PULs are linked to an HTCS that is activated by the presence of the PUL's target glycan. Specific activation of each PUL is achieved by direct recognition of signature oligosaccharide cues by the cognate HTCS. B. thetaiotaomicron contains many PULs that are not present in B. ovatus and confer its increased ability to target host mucin O-glycans and expanded capacity to target pectic structures. Conversely, B. ovatus harbors several unique PULs that enable it to use all of the common hemicelluloses, while B. thetaiotaomicron is unable to metabolize this group of plant structural polysaccharides. In both cases, unique species-specific PULs are scattered throughout the bacterial genome rather than being present in one or more large blocks. These data support the concept that adaptation to different glycan niches is driving selective evolution of PULs in these two species. This theme may apply to other bacteria in the human gut and has implications both for the basic ecology of the gut microbiota and for efforts to intentionally manipulate this community to restore health or alter nutrition.

Author Summary
Bacteria inhabiting the human gut are critical for digestion of the plant-derived glycans that compose dietary fiber. Enzymes produced by the human body cannot degrade these abundant dietary components, and without bacterial assistance they would go unused. We investigated the molecular strategies employed by two species belonging to one of the most abundant bacterial groups in the human colon (the Bacteroidetes). Our results show that each species has evolved to degrade a unique subset of glycans; this specialization is reflected in their respective genomes, each of which contains numerous separate gene clusters involved in metabolizing plant fiber polysaccharides or glycans present in secreted mucus. Each glycan-specific gene cluster produces a related series of membrane-associated proteins which together serve to bind and degrade a specific glycan. Expression of each glycan-specific gene cluster is controlled by an environmental sensor that responds to the presence of a unique molecular signature contained in the substrate that it targets. These results provide a view of how related bacterial species have diverged into different carbohydrate niches by evolving genes that sense and degrade unique suites of available polysaccharides, a process that likely applies to disparate bacteria from the gut and other habitats.

B. thetaiotaomicron and B. ovatus Have Distinct But Partially Overlapping Glycan Niches
To investigate the relationship between glycan metabolic phenotypes and underlying genetic architecture in commonly isolated human gut bacteria, we focused on B. thetaiotaomicron ATCC 29148 (also known as VPI-5482) and B. ovatus ATCC 8483. The rationale for our selection is based on a previous survey showing that while members of these two species can access a wide range of glycans [7], they display substantial differences in complex carbohydrate utilization.
To define in detail the range of glycans that B. thetaiotaomicron and B. ovatus are capable of utilizing, we constructed a custom panel of plant, animal, and microbial carbohydrates arrayed in 96-well format (Table 1). These polysaccharides are more extensive than those used in previous studies, and include plant polymers such as rhamnogalacturonan I and II and purified fragments from a number of other highly decorated pectins and hemicelluloses. In addition, we monitored anaerobic growth over a defined time interval, enabling quantitative measurements of both growth rate and final culture density for each glycan. B. thetaiotaomicron and B. ovatus each grew on a subset of the glycans tested (Table 1). Both species grew on plant cell wall pectins except arabinogalactan and arabinan, which were only efficiently utilized by B. thetaiotaomicron. Interestingly, as we had noted during a previous comparison of B. thetaiotaomicron growth on starch and dextran [13], each species grew more rapidly on several different polysaccharides compared to their corresponding monosaccharide components. Although this result is counter-intuitive because each polysaccharide needs to first be de-polymerized prior to metabolism, it suggests that monosaccharide transport and metabolism pathways are optimally triggered by the presence of polymerized sugar molecules. For example, growth of B. thetaiotaomicron on potato pectic galactan was 2.5±0.5 times more rapid than on its monosaccharide constituent galactose (p<0.001, Student's t test). More efficient utilization of polymeric glycans may be attributable to the fact that very little free monosaccharide reaches Bacteroides species in the distal intestine, with selective pressures acting on these bacteria to evolve ways to directly couple downstream catabolic pathways to glycan recognition.
B. ovatus was the only one of the two species capable of growing on the hemicelluloses tested (Table 1); it also grew on the disaccharide cellobiose but not on cellohexaose, suggesting that it is not capable of targeting higher molecular weight forms of cellulose. We conclude that the ability of B. ovatus to grow on cellobiose is a product of its ability to degrade other molecules that contain β1,4-glucosidic linkages, such as barley β-glucan or xyloglucan, a view supported by HTCS specificity data presented below. Additional insight into the sensing and catabolic specificities of B. ovatus for closely related β-glucan substrates was provided by testing a variety of structures that vary with respect to the linkages they contain and their relative ratios in the polysaccharide (see Text S1).
Beyond their utilization of plant cell wall glycans, both species grew well on plant cell storage carbohydrates such as fructans and starch (amylopectin), and on the starch-like molecules pullulan and glycogen. B. thetaiotaomicron and B. ovatus have reciprocal specificities for fructans: B. thetaiotaomicron grows best on the β2,6-linkages that occur in levan, whereas B. ovatus does not grow at all on levan but prefers β2,1-linked inulin [12].
Only B. thetaiotaomicron grew on mucin O-glycans, a trait that we previously demonstrated to be dependent on expression of over a dozen different PULs [8]. B. thetaiotaomicron also grew much more efficiently than B. ovatus on α-mannan, a fungal cell wall glycan that contains similar α-mannosidic linkages to those found in the core regions of N-linked glycans present on secreted mucus and epithelial surfaces. Lastly, B. thetaiotaomicron exhibited better growth (more rapid rate and higher cell density) on the third class of host-derived glycans, glycosaminoglycans (GAGs), which include chondroitin sulfate, heparin, and hyaluronan.
Together, the results indicate that these two human gut-associated Bacteroides have evolved distinct and only partially overlapping glycan niches. B. thetaiotaomicron is more adept at foraging the more soluble, and possibly more accessible, pectic components of plant cell walls. It also exhibits a well-developed capacity to metabolize host mucin O-glycans, a trait that could allow it to preferentially colonize the protective mucus layer that overlies the gut epithelium. Conversely, B. ovatus is metabolically specialized to utilize less soluble plant cell wall components like hemicelluloses, in addition to pectins, and thus is more likely to occupy physical microhabitats located in the gut lumen.

B. thetaiotaomicron Deploys a Subset of Its Sus-Like Systems to Target Cell Wall Pectins
To examine the molecular basis underlying plant cell wall glycan utilization by these two species, we used custom GeneChips representing 99.5% and all of the predicted or known ORFs in the B. thetaiotaomicron and B. ovatus genomes, respectively. Our rationale was that identifying the genes responsible for plant cell wall degradation would allow us to compare the genomic location and organization of these genes between species. Whole-genome transcriptional profiles were generated for each species during exponential growth on individual glycans (substrates upon which each species was profiled are noted in Table 1). The specific transcriptional responses of each species to growth on a particular glycan were then determined by comparison to a reference dataset of that same species grown in minimal medium with glucose as the sole carbon source (MM-G).
We previously demonstrated that genes associated with individual PULs typically exhibit large increases in their transcription when exposed to the substrates they process [8,12]. Therefore, we applied a cutoff of ≥10-fold change in expression in minimal medium containing a given glycan compared to MM-G. A total of 280 B. thetaiotaomicron genes exhibited altered expression in response to growth on one or more of the six pectins tested, or pullulan (a control for expression of the starch utilization system). Expression of 268 genes (96%) was increased, while only 12 exhibited decreased expression (see Table S1 for a list of genes and fold-change values). Of the genes with altered expression during growth on one or more glycans, 155 (56%) were associated with 16 different PULs and all but five of these genes were upregulated. These observations indicate that PULs are a primary component of this symbiont's response to different pectins.
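The ≥10-fold cutoff can be illustrated with a minimal Python sketch. It assumes strictly positive, already normalized signal values; the function name `responsive_genes`, the gene labels, and the numbers are hypothetical and are not data or code from this study.

```python
def responsive_genes(glycan_expr, reference_expr, cutoff=10.0):
    """Return {gene: signed fold change} for genes changing >= cutoff-fold.

    Positive values mean induction on the glycan relative to the glucose
    reference (MM-G); negative values mean repression.
    Assumes all signal values are strictly positive.
    """
    hits = {}
    for gene, value in glycan_expr.items():
        ref = reference_expr[gene]
        # Express repression as a negative fold change (e.g. -15 = 15-fold down).
        fold = value / ref if value >= ref else -(ref / value)
        if abs(fold) >= cutoff:
            hits[gene] = fold
    return hits


# Hypothetical normalized signal values (illustration only).
mm_g = {"geneA": 100.0, "geneB": 200.0, "geneC": 1500.0}
pectin = {"geneA": 2500.0, "geneB": 240.0, "geneC": 100.0}

hits = responsive_genes(pectin, mm_g)
up = [g for g, f in hits.items() if f > 0]
down = [g for g, f in hits.items() if f < 0]
```

In this toy example geneA passes the cutoff as induced, geneC as repressed, and geneB (1.2-fold) is excluded.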
To better visualize responses of entire PULs to the various glycans tested, genes were grouped into putative operons [22] and the average fold-change of each operon re-calculated from normalized GeneChip values [8]. Eleven PULs had one or more operons that still exhibited ≥10-fold induction when cells were exposed to pectin or pullulan (Figure 1). Based on the predicted activities of the enzymes encoded by these PULs (enzymes in CAZy families where pectin degradation is a common feature), it is likely that 10 of them (all except the pullulan-induced starch PUL, BT3698-3704) make a significant contribution to pectin degradation (Table S2; for schematic diagrams of these 10 PULs, see Figure S1). Support for this view is provided by a recent study that showed that the three GH43 enzymes from the arabinan-activated PUL spanning BT0348-69 display arabinan-specific activity [23]. This emerging portrait of the predicted enzyme specificity in PULs that have been empirically matched with specific polysaccharide substrates will provide a valuable template for future functional annotation of the more than 100 Bacteroidetes genomes that are known to harbor similar gene clusters, as well as a starting point for more focused biochemical and enzymatic studies of how these systems each attack their specific substrates.
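One plausible reading of the operon-level recalculation is sketched below in Python: the normalized signal is averaged across the genes of each operon in each condition, and the ratio of the two means is taken as the operon fold change. This ordering (average first, then ratio) is an assumption, as are the function names, operon and PUL labels, and all values; the procedure actually used is the one described in [8].

```python
from statistics import mean


def operon_fold_change(glycan_expr, reference_expr, operons):
    """Map each operon to (mean glycan signal) / (mean reference signal).

    operons: {operon_id: [gene, ...]}; signal values must be positive.
    """
    result = {}
    for operon_id, genes in operons.items():
        glycan_mean = mean(glycan_expr[g] for g in genes)
        reference_mean = mean(reference_expr[g] for g in genes)
        result[operon_id] = glycan_mean / reference_mean
    return result


def induced_puls(operon_fold, operon_to_pul, cutoff=10.0):
    """Return the sorted PULs with at least one operon induced >= cutoff-fold."""
    return sorted({operon_to_pul[op] for op, f in operon_fold.items() if f >= cutoff})


# Hypothetical operons, PUL assignments, and signal values (illustration only).
operons = {"op1": ["g1", "g2"], "op2": ["g3"]}
operon_to_pul = {"op1": "PUL_X", "op2": "PUL_Y"}
reference = {"g1": 100.0, "g2": 300.0, "g3": 50.0}
glycan = {"g1": 3000.0, "g2": 5000.0, "g3": 100.0}

folds = operon_fold_change(glycan, reference, operons)
induced = induced_puls(folds, operon_to_pul)
```

Averaging per-gene fold changes instead of signals would be an equally simple alternative, but it generally gives different numbers, so the choice should follow the original method.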
Comparison of our data to previous in vivo studies using gnotobiotic mice colonized with B. thetaiotaomicron alone ("mono-associated") showed that a subset of the pectin-responsive loci were expressed in the ceca of animals consuming a diet composed of wheat- and soy-based plant material ("plant-rich diet" column in Figure 1), suggesting that these natural dietary substrates contain pectins that are accessible to B. thetaiotaomicron in the absence of other bacteria. When these dietary substrates are withheld, as in mice fed a simple sugar diet [20] or in neonatal mice suckling on mother's milk [21], expression of the "pectin PULs" is reduced. In contrast, PULs targeting host-derived glycans, especially the O-glycans that are abundant in secreted mucus, are highly expressed regardless of diet, suggesting that B. thetaiotaomicron continuously forages on these substrates in vivo.
To better understand the specificity of PUL expression in response to different glycans, we also evaluated the plant glycan-inducible responses of 12 previously identified PULs known to orchestrate the degradation of host-derived glycans [8]. Despite the fact that some sugars like galactose and fucose are common to plant and host glycans, PULs specific for host glycans were not expressed in the presence of plant pectins or vice versa (compare columns A and B in the top and bottom sections of Figure 1), suggesting that individual PULs are activated in response to more complicated and unique oligosaccharide signals. Experiments that probe the specificity of HTCS regulators for oligosaccharide signals, described later in this report, support this view.
A notable feature of the pectin-specific PULs summarized in Figure 1 is activation of some PULs (e.g., BT4145-4183) by multiple substrates. A likely explanation for this phenomenon is that the preparations used for the growth assays were contaminated with trace amounts of other pectic glycans due to the inherent complexity and covalent connections originally present in these molecules prior to purification. This conclusion is supported by compositional analysis data available from the supplier: each preparation contained 5%-18% "contaminating" sugars that are not expected to be part of the purified pectic glycan but are present in other chains attached to a common backbone polysaccharide. Also, consistent with this notion, each PUL typically showed a much stronger transcriptional response on one substrate compared to the others. For example, BT4108-23 showed highest induction on homogalacturonan and lower overall responses to three other pectins: pectic galactan and rhamnogalacturonans I and II. Compositional analysis of the former two substrates indicated the presence of homogalacturonan contamination, whereas homogalacturonan is a backbone component of the rhamnogalacturonan II structure [24]. Thus, it is likely that BT4108-23 primarily targets homogalacturonan and that this substrate is responsible for its activation. This notion is compatible with the observed carbohydrate active enzyme (CAZyme) content of this PUL, which encodes seven enzymes that are members of families known to target homogalacturonan. Biochemical analysis of one of the CAZymes from the PUL (BT4116; a predicted family 1 polysaccharide lyase) supports this view as the enzyme was able to cleave homogalacturonan in an endo-like fashion but displayed no activity against rhamnogalacturonan I (Figure S2 and Table S2).
[Figure 1 legend, partially recovered: ... growth on different forms of host glycans or α-mannan (a proxy for the mannose-rich core region of N-glycans); column C, growth in vivo in adult mono-associated NMRI mice fed a "plant-rich diet" or a "simple sugar" diet, or neonatal mice that are still suckling on mother's milk ("suckling"). A Venn diagram illustration of total gene changes in these conditions is provided in Figure S3.]
A feature shared by 8 of the 10 B. thetaiotaomicron PULs activated by pectin is their association with an HTCS regulator. To further dissect the involvement of specific PULs in pectin degradation, we disrupted the HTCS genes associated with pectin-activated PULs as well as several others associated with additional PULs (Table 2; see Figure S1 for a schematic of HTCS mutant locations and Table S3 for a list of all HTCS mutant strains tested). Six of the mutants with disruptions in HTCS genes linked to pectin-induced PULs exhibited a growth defect on different pectins (Table 2 and Figure S4). In contrast to the previously observed complete loss of growth after disruption of three B. thetaiotaomicron HTCS regulators associated with GAG- or fructan-utilizing PULs [8,12], none of the disrupted HTCS genes linked to pectin-induced PULs resulted in such a drastic phenotype on any of the pectins tested. Thus, although no individual PUL is absolutely essential for full degradation of a particular pectin preparation tested, each of the six individual PULs identified appears to be optimized for degradation of a specific substrate. This observation is supported by data presented below that show that individual PULs are specifically activated by defined oligosaccharides derived from the different pectins.

Several Unique B. ovatus PULs Target Hemicellulose
Our phenotypic analysis indicated that B. ovatus has evolved a capacity to target a series of hemicellulosic polysaccharides that B. thetaiotaomicron cannot access. To determine which B. ovatus genes are involved in hemicellulose metabolism, we performed transcriptional profiling experiments on B. ovatus cells grown on each of the different hemicelluloses (Table 1). This species was also profiled on the pectin homogalacturonan so that we could compare its responses to those of B. thetaiotaomicron on the same substrate. Growth of B. ovatus on individual hemicellulose preparations or homogalacturonan resulted in altered expression of 259 total genes using the same ≥10-fold cutoff threshold used for B. thetaiotaomicron: 229 of these genes were upregulated, while 30 were downregulated. As with B. thetaiotaomicron, most of the B. ovatus genes were uniquely expressed in response to just one or a few of the carbohydrates (Table S4).
We next identified putative PULs in the B. ovatus genome using the same criteria as those used previously for B. thetaiotaomicron [8]. Minimally, a PUL had to contain at least a pair of genes encoding homologs of the B. thetaiotaomicron SusC/D proteins. This effort yielded 112 candidate B. ovatus PULs encompassing 1,129 ORFs (see Table S5 for a list of genes and annotations by PUL). As observed in B. thetaiotaomicron, most annotated PULs also contained one or more genes encoding predicted CAZymes and/or an environmental sensor/transcriptional regulator. In total, 140 (61.1%) of the genes that were upregulated in response to growth on one or more glycans were located in PULs. Six different PULs were activated by hemicellulosic polysaccharides: two by xylans, one by galacto-/glucomannan, one by xyloglucan, and two by barley β-glucan (Figure 2).
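The minimal criterion (a pair of SusC/SusD homologs) can be sketched in Python as a scan over genes in chromosomal order. The sketch assumes that homology has already been assigned and that the two genes of a pair are directly adjacent; both assumptions, along with the function name, locus tags, and annotation labels, are hypothetical simplifications of the criteria in [8].

```python
def find_susCD_pairs(genes):
    """Return adjacent (SusC-like, SusD-like) gene pairs, in either order.

    genes: list of (locus_tag, annotation) tuples in chromosomal order.
    Each returned pair is a candidate seed for a PUL.
    """
    pairs = []
    for (tag_a, ann_a), (tag_b, ann_b) in zip(genes, genes[1:]):
        if {ann_a, ann_b} == {"SusC-like", "SusD-like"}:
            pairs.append((tag_a, tag_b))
    return pairs


# Hypothetical stretch of annotated genes (illustration only).
genes = [
    ("locus_001", "HTCS"),
    ("locus_002", "SusC-like"),
    ("locus_003", "SusD-like"),
    ("locus_004", "GH10"),
    ("locus_005", "SusD-like"),   # lone SusD homolog: no partner, not a seed
    ("locus_006", "transposase"),
]

seeds = find_susCD_pairs(genes)
```

A fuller implementation would extend each seed to neighboring CAZyme and regulator genes to delimit the whole PUL, which is where most of the judgment in annotation lies.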
The predicted CAZyme content of the six PULs activated by hemicelluloses (Table S2 and Figure S6) is consistent with their capacity to orchestrate degradation of the inducing polysaccharides. For example, the xylan-, xyloglucan-, and galacto-/glucomannan-regulated PULs encode GH10, GH9, and GH26 enzymes, respectively, families dominated by xylanases (GH10), endoglucanases (GH9), and β-mannanases (GH26). Indeed, a previous study has shown that two enzymes encoded within the smaller of the two xylan-activated PULs (BACOVA_04387 and BACOVA_04386) display endo-xylanase and β-xylosidase activities, respectively [25]. In addition, a PUL that is homologous to the smaller B. ovatus xylan PUL was recently identified in the rumen Bacteroidete Prevotella bryantii using a transcriptomic approach and wheat arabinoxylan as a test substrate [11]. Finally, several of the enzymes from the xylan-activated PULs contain carbohydrate binding modules from families known to display xylan binding functionality (CBM6, 22, and 35), adding further support for a role of products of these loci in xylan utilization. Interestingly, the large xylan PUL also contains a number of genes encoding enzymes from CAZyme families not previously implicated in xylan deconstruction (e.g., GH31, GH95, GH97, GH98), suggesting that these sequences may exhibit novel specificities or target linkages so far not identified in various xylans (Table S2).
Together, these findings support our conclusion that both B. ovatus and B. thetaiotaomicron rely on similar PUL-based strategies to degrade plant cell wall glycans. They also highlight how broadly Bacteroidetes Sus-like systems have evolved and further define the experimentally demonstrated range of substrates they target to include all major classes of hemicelluloses.
We subsequently measured B. ovatus PUL gene expression in the distal gut (cecum) of mono-associated gnotobiotic mice fed the same plant glycan-rich diet used to examine in vivo expression of B. thetaiotaomicron. A total of 353 B. ovatus genes exhibited altered expression in vivo relative to in vitro growth in MM-G: 126 genes also exhibited altered expression in the presence of one or more of the plant cell wall glycans tested and 50% of all in vivo responsive B. ovatus genes were located within putative PULs (Figure S5 and Table S5). Three B. ovatus PULs that were activated by the hemicelluloses xylan and β-glucan were expressed in vivo, suggesting that these substrates were present in the plant-rich diet fed to mice and could be sensed by this species in the absence of other members of the human gut microbiota (Figure 2). One of two B. ovatus PULs that responded to homogalacturonan in vitro was also expressed in vivo.

Evolution of PULs in the B. thetaiotaomicron and B. ovatus Genomes
Given the partially overlapping sets of carbohydrate degradation traits exhibited by B. thetaiotaomicron and B. ovatus, we wanted to examine the degree to which individual orthologous PULs were maintained between these two Bacteroides. We reasoned that PULs that are unique to either genome would provide evidence of independent acquisition or retention of traits that are not shared between the species. Because the genome sequence of B. thetaiotaomicron VPI-5482 has been assembled into a single circular chromosome and contains fewer PULs than B. ovatus, we performed individual searches, using each of the 88 individual B. thetaiotaomicron PULs as queries, to probe for similar loci in the deep draft assembly of the B. ovatus genome. Our method (described in detail in Materials and Methods) was based on first comparing the core SusC/D amino acid sequences from each B. thetaiotaomicron PUL to the closest set of homologs in B. ovatus and then scoring potentially homologous PULs for both gene homology and synteny within the PUL and in flanking genomic regions. Using this approach, each of the 198 PULs in the two Bacteroides species was scored as "homologous," "probably homologous," or "unique" to a respective species (Table S6). Only 28 PULs met our criteria for being homologous between B. thetaiotaomicron and B. ovatus (i.e., included in the homologous or probably homologous groups), suggesting that differential acquisition or retention of novel PULs is a mechanism underlying the phenotypic differences between these species (Figure 3). Among the group of PULs shared by both species were loci corresponding to each of the glycan metabolic traits that were strongly exhibited by both species: these include PULs for targeting starch, fructans, glycosaminoglycans, and all pectins except arabinan and arabinogalactan (PULs with dark or light green labels in Figure 3).
Evidence of the divergent evolution of these two species becomes apparent through visualization of the non-homologous PULs in their respective genomes. B. thetaiotaomicron contains at least eight unique PULs associated with targeting host mucin O-glycans (see PULs with gold labels in Figure 3 that are noted as "O-glycans"; this designation only includes PULs that have been confirmed to respond to purified neutral mucin O-glycans in vitro [8]). In addition, three previously validated PULs for degrading α-mannan are also unique to B. thetaiotaomicron. Conversely, B. ovatus contains five unique PULs that underlie its ability to target plant cell wall hemicelluloses. An additional PUL (BACOVA_0942-46), which responded to β-glucan more weakly than a second PUL (BACOVA_02741-47), was scored as homologous.
The PULs that encode each species' unique phenotypes are scattered throughout each genome, suggesting that they arose through individual genetic events. This latter observation provides evidence for the idea that these two species are adapting to different carbohydrate niches and that these adaptations could serve to exclude access to others. For example, if B. ovatus had obtained its hemicellulose utilization PULs by lateral gene transfer or from a common ancestor, it might be expected that B. thetaiotaomicron would have been exposed to this same pool of traits during its own evolution. However, the sequenced strain of B. thetaiotaomicron analyzed here has not acquired any of these individual loci, nor do any of several dozen different B. thetaiotaomicron strains tested exhibit growth on hemicelluloses [7] (N. Pudlo and E. C. Martens, unpublished). Thus, B. ovatus appears to have evolved a predilection for hemicellulose degradation that has resulted in its retention of PULs that target a family of glycans occupying similar positions in the plant cell wall. A similar picture emerges for B. thetaiotaomicron, which has evolved a predilection for host mucin O-glycans and possibly N-glycans as evidenced by its robust ability to degrade the linkages in α-mannan.

Bacteroides HTCS Recognize Complex Oligosaccharide Signals
The data presented above indicate that each Bacteroides PUL recognizes a specific molecular cue that is a component of its target polysaccharide. The most common class of regulator associated with PULs that target plant cell wall glycans are HTCS, inner membrane-spanning proteins with predicted periplasmic sensory domains [12,17]. A potential mechanism of signal perception by these regulators is direct binding of an oligosaccharide degradation product to the periplasmic domain of the HTCS. As noted above, this mechanism of signal perception has already been validated for one HTCS (BT1754) from B. thetaiotaomicron's fructan utilization PUL [12]. However, unlike BT1754, which recognizes monomeric fructose and contains a ~300 aa sensory domain that adopts the periplasmic binding protein fold, the majority of Bacteroidetes HTCS contain a much larger putative periplasmic sensor domain of ~700-900 aa [26]. Sequence analysis revealed that these large sensor domains contain multiple short motifs (Reg_prop, Pfam 07494) indicative of an overall β-propeller fold, followed by a domain of ~120 aa termed YYY (Pfam 07495) [27,28]. In addition to the predicted N-terminal periplasmic domain, most HTCS polypeptides possess all of the cytoplasmic domains present in a classical two-component system phosphorelay [26], but in a single polypeptide, including a phosphoacceptor and dimerization domain, histidine kinase, receiver domain, and a DNA binding domain of the HTH_AraC family (Figure S7).
To explore the mechanism of signal perception and identity in the HTCS proteins and further dissect their specificity, we expressed and purified the predicted periplasmic domains of several B. thetaiotaomicron and B. ovatus HTCS identified above as being involved in plant glycan utilization and assessed their ability to bind carbohydrates using isothermal titration calorimetry (ITC). The ITC data reveal that the periplasmic domains of four HTCS bind specifically to oligosaccharides that uniquely define the parent polysaccharide that the cognate PUL is optimized to degrade. Binding data are summarized in Table 3 and Figure S8, molecular illustrations of oligosaccharide signaling molecules that optimally bind to each HTCS sensor domain are shown in Figure 4, and detailed findings for each individual glycan are provided in Text S1. Notably, the binding of oligosaccharides to the HTCS is highly specific with each sensor domain only displaying affinity for glycan fragments derived from a single type of polysaccharide (e.g., the arabinan sensor BT0366 binds only arabino-oligosaccharides and not other oligosaccharides tested) (see footnote to Table 3).
Unlike the previously described fructose-sensing HTCS that binds a simpler monosaccharide signal [12], all of the HTCS sensors described here interacted directly with oligosaccharides, consistent with the idea that the specific recognition of most polysaccharides requires information contained in both the sugar content and glycosidic linkages. By sensing oligosaccharide cues that uniquely define the parent polysaccharide, B. thetaiotaomicron and B. ovatus (and by extension other Bacteroidetes that contain PUL-associated HTCS) are able to differentiate between multiple complex glycans that contain the same sugars, and to respond efficiently by activating only the appropriate PUL for their degradation. Interestingly, the preferred ligands for the HTCS were relatively large glycan fragments ranging from tetra- to octasaccharides and therefore represent early products of the depolymerization process that are likely transported into the periplasm via their specific PUL-encoded SusC-like outer membrane transporter, as has been suggested for maltooligosaccharides for the starch-utilization system [29]. Together, these data suggest that Bacteroides respond rapidly and specifically to the presence of polysaccharides in their environment. This requirement for a rapid and highly defined response may explain the localization of the signal input and output domains of HTCS in a single polypeptide, as this physical constraint will both maximize the speed of activation and minimize cross-talk among these systems. It is also notable that all four HTCS sensors interacted directly with linear oligosaccharides, suggesting that the presence of branches, which can vary between related glycans from different sources, is not a required signaling component for this subset of sensors.
In one case (mannopentaose versus digalactosyl-mannopentaose), the presence of galactose branching interfered completely with detectable binding, suggesting that removal of these branches is a prerequisite for sensing of galactomannans.
To confirm that the oligosaccharides we identified by ITC were able to specifically activate PUL gene expression in bacterial cells, we measured the relative amount of the susC-like gene transcript (a proxy for expression of the whole operon) of each PUL when the bacteria were grown on the cognate HTCS ligand. The data reveal that the oligosaccharide that binds preferentially to the HTCS specifically upregulates the susC-like gene, and by inference the whole PUL, associated with that HTCS (Figure 5). For example, exposure of B. thetaiotaomicron to arabino-octaose (the preferred ligand for BT0366 HTCS) results in a 10-100-fold greater induction of the susC homolog present in the BT0366-associated PUL (BT0348-69) compared to its effects on the other PULs that are activated by arabinan, but whose associated HTCS display no binding to arabino-oligosaccharides (250-400-fold for susC-like genes BT0362 and BT0364 compared to 15- and 2-fold for the susC-like genes, BT3046 and BT4164). Similarly, growth of B. ovatus on xylotetraose, the preferred ligand for BACOVA_04394, upregulates the susC-like gene (BACOVA_04393) associated with this PUL, but not two susC-like genes from the larger xylan PUL (BACOVA_03426 and _03428). These findings support the HTCS binding data and demonstrate that activation of each PUL is by a defined oligosaccharide cue that is specifically recognized by the associated HTCS (i.e., there is little or no cross-reactivity between the PULs). Additional details of the transcriptional response of B. thetaiotaomicron to pectic oligosaccharides are provided in Text S1.

Prospectus
The distal gut of humans is constantly inundated with a dynamic array of carbohydrates. These substrates feed the dense consortium of microbes that compete in this habitat. The gut also presents biochemical gradients that result from the differential rates of digestion of dietary resources and from the presence of a mucus layer overlying the epithelium. Given the gut's intrinsic biochemical heterogeneity, it is perhaps not surprising that different microbial lineages would evolve to fill distinct glycan niches. The data presented here support the notion that two closely related human gut symbionts have taken divergent paths that have left each species with a partially unique repertoire of metabolic traits. Although B. thetaiotaomicron and B. ovatus are both common in the normal adult human gut microbiota, at least in Western societies [5], they represent only two of at least 45 different species of Bacteroidetes that have been cultured to date from human specimens. Complete or deep-draft genomic sequences will soon be available for many more members of these species as culturing efforts [30] and sequencing technology progress rapidly. The availability of additional cultured Bacteroidetes, together with the ability to dissect their carbohydrate active phenotypes using the approaches described here, present an opportunity to reveal basic biological parameters that have catalyzed the niche-specific adaptation of gut bacterial lineages throughout human history. At the same time, identifying the molecular machinery for acquisition, import, and catabolism of specific polysaccharides will help inform efforts to engineer carbohydrate active phenotypes in microbes (via PUL ''transplants'') in order to fulfill important industrial needs, as well as to manipulate human gut microbiome function in ways that restore health or enhance nutrition.

Figure 3. Comparisons of PULs in the sequenced type strain genomes of B. thetaiotaomicron and B. ovatus. The arrangement of PULs in the genome of each species is illustrated as a circular map with genes color coded as follows: ''homologous PULs'' (dark green); ''probably homologous PULs'' (light green); B. thetaiotaomicron-specific PULs (gold); B. ovatus-specific PULs (light blue); B. ovatus-specific hemicellulose PULs (pink); all other genes in each species (gray). Shared PULs are labeled 1 through 28 based on their order in the B. thetaiotaomicron genome (clockwise from the top). Contigs in the deep-draft B. ovatus genome assembly are arranged in order of increasing size (clockwise from the top). Gaps in the genome assembly are illustrated as black tick marks around the interior of the B. ovatus genome. Empirically measured in vitro substrate specificities for some PULs are labeled around each genome schematic and correspond to PULs that were induced ≥10-fold in response to the indicated glycan class in this or previous studies [8,12]. Homologous B. ovatus PULs that correspond to a B. thetaiotaomicron locus with known substrate response are also labeled with that substrate name. CS, chondroitin sulfate; DS, dermatan sulfate. doi:10.1371/journal.pbio.1001221.g003

Table 3 footnotes. Figure S7. e Fit as ligand in cell. f TLTF, too low to fit (i.e., binding event observed, but affinity too low to accurately fit the data. Ka < ~5.0 × 10² M⁻¹). g Megazyme cat. no. OGGM5. Each HTCS sensor domain was also evaluated for binding to the other oligosaccharides tested from non-inducing glycan sources. In all cases, no binding was detected. doi:10.1371/journal.pbio.1001221.t003

Ethics Statement
All experiments involving mice used protocols approved by the Washington University Animal Studies Committee in accordance with guidelines set forth by the American Veterinary Medical Association. Trained veterinarians from the Washington University Division of Comparative Medicine supervised all experiments. The laboratory animal program at Washington University is accredited by the Association for Assessment and Accreditation of Laboratory Animal Care International (AAALAC).
The ability of B. thetaiotaomicron and B. ovatus to grow on pure carbohydrates was measured using a custom carbohydrate array constructed in a 96-well format. Each well of a flat bottom 96-well plate (Costar) was loaded with 100 µl of each sterilized carbohydrate stock (Table S7) at 2× concentration. Each substrate was represented twice on each assay plate in two non-adjacent wells. Two carbohydrate-free water wells were included as negative controls. Cultures for assay inoculations were grown for ~24 h at 37 °C under an atmosphere of 10% H2, 5% CO2, and 85% N2 in MM-G, and a 1 ml aliquot was centrifuged to pellet bacteria, which were then gently resuspended in 2× MM-no carbohydrate (MM-NC) and used to inoculate 50 ml of 2× MM-NC at a ratio of 1:50. Each carbohydrate array was loaded with 100 µl of the inoculated 2× medium to produce 96 individual 200 µl cultures. Assay plates were sealed in an anaerobic chamber (Coy Manufacturing, Grass Lake, MI) under the atmosphere noted above with an optically clear gas-permeable polyurethane membrane (Diversified Biotech, Boston, MA). Plates were then loaded into a Biostack automated plate handling device coupled to a Powerwave HT absorbance reader (both devices from Biotek Instruments, Winooski, VT). Absorbance at 600 nm (A600) was measured for each well at 10-15 min intervals. B. thetaiotaomicron and B. ovatus were each tested in three separate carbohydrate arrays (n = 6 replicate cultures). Data were processed using Gen5 software (Biotek) and Microsoft Excel.
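As an illustration of how replicate absorbance readings from such an array might be summarized, the following is a minimal sketch and not the Gen5/Excel workflow used in this study. It assumes that each replicate well yields an A600 series of equal length; the function names `summarize_replicates` and `max_mean_absorbance` are hypothetical.

```python
from statistics import mean, stdev


def summarize_replicates(curves):
    """Return per-time-point mean and sample standard deviation of A600.

    `curves` is a list of equal-length A600 series, one per replicate well
    (an assumed input layout, not the plate reader's export format).
    """
    if len(curves) < 2:
        raise ValueError("at least two replicate curves are required")
    n_points = len(curves[0])
    if any(len(curve) != n_points for curve in curves):
        raise ValueError("all replicate curves must have the same length")
    # Transpose so that each tuple holds all replicates at one time point.
    by_time_point = list(zip(*curves))
    means = [mean(values) for values in by_time_point]
    sds = [stdev(values) for values in by_time_point]
    return means, sds


def max_mean_absorbance(means):
    """Highest mean A600 reached on the averaged curve."""
    return max(means)
```

Taking the maximum of the averaged curve is only one possible summary statistic; substrates that give polyphasic growth may need a different measure.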
Several glycans yielded complicated polyphasic growth profiles rather than a single exponential growth phase. Thus, we

Cultures for transcriptional profiling were grown in borosilicate test tubes containing 5 ml of the same MM formulations described above, except that rhamnogalacturonan II was used at 15 mg/ml for transcriptional profiling experiments. All cultures were harvested during mid- to late-exponential phase; absorbance values (at 600 nm) of each harvested culture are summarized in Table S8.

Purification of Select Glycan Substrates for Bacterial Growth
Water soluble oat spelt xylan (OSX) was prepared by solubilizing oat spelt xylan (Fluka) in 1 M NaOH followed by centrifugation (8,750 × g for 30 min) to remove insoluble material. The soluble supernatant was then adjusted to pH 7.0 with HCl, centrifuged again to remove insoluble glycans that precipitated at neutral pH, dialyzed exhaustively against ddH2O and finally dried by lyophilization. Rhamnogalacturonan II (a kind gift from Malcolm O'Neil at the University of Georgia Complex Carbohydrate Research Center) was purified from red wine as described previously [33].

Whole-Genome Transcriptional Profiling
Transcriptional profiling was performed using custom Affymetrix GeneChips containing probesets representing >98% of 4,779 predicted B. thetaiotaomicron genes [20], and all of the 5,536 predicted B. ovatus genes. GeneChip targets were prepared from whole bacterial RNA and hybridized to the microarrays according to standard Affymetrix protocols (www.affymetrix.com). Data were normalized using Microarray Suite 5 or Expression Console software (Affymetrix) and processed using GeneSpringGX 7.3.1 software (Agilent) according to a previously described workflow [8]. Further details concerning bacterial growth conditions and experimental parameters are provided in Table S9, along with individual GEO accession numbers and file names for each dataset.

Colonization of Germfree Mice with B. ovatus
All mice were from the NMRI-KI inbred line and were reared in gnotobiotic isolators as previously described [21]. Six-week-old male germfree animals were used for B. ovatus colonization. Each mouse was gavaged with 100 µl of a fresh overnight culture containing ~1 × 10⁸ cfu/ml. B. ovatus colonization levels in the cecum were between 5 × 10¹⁰ and 5 × 10¹¹ cfu/ml for all animals. Animals were sacrificed 14 d after colonization and their cecal contents harvested for RNA extraction.

Comparison of PULs between the B. thetaiotaomicron and B. ovatus Genomes
To locate putative PULs, the publicly available B. ovatus ORF annotation was searched using an iterative BLAST strategy described for other Bacteroidetes species [34]. This process yielded 112 B. ovatus PULs that minimally contained homologs of susC/susD. To compare PUL gene content between species we used a list of reciprocal best BLASTP hits between the B. thetaiotaomicron and B. ovatus genomes (E-value cutoff ≤ −10) and the ''show ortholog neighborhood regions'' in the Department of Energy Integrated Microbial Genome website (img.jgi.doe.gov) to guide analysis of PULs that were shared between these two species. Beginning with the first susC sequence in B. thetaiotaomicron, we searched for an orthologous neighborhood in B. ovatus. If this produced a hit, then we repeated the search with the adjacent B. thetaiotaomicron susD sequence to verify that the same locus was found in B. ovatus. We next compared the genomic regions surrounding each potentially orthologous PUL for conservation of gene content both within the PUL and in neighboring genomic regions.
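The reciprocal best hit step can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used here: hits are assumed to be available as (query, subject, E-value, bit score) tuples, the best hit is taken to be the one with the highest bit score, and the cutoff `max_evalue` and the function names are hypothetical.

```python
def best_hits(rows):
    """Map each query to its single best subject (highest bit score).

    `rows` is an iterable of (query, subject, evalue, bitscore) tuples,
    an assumed simplified layout of tabular BLASTP output.
    """
    best = {}
    for query, subject, _evalue, bitscore in rows:
        current = best.get(query)
        if current is None or bitscore > current[1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _score) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a, max_evalue):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit.

    Hits with an E-value above `max_evalue` (a caller-supplied, hypothetical
    cutoff) are discarded before best hits are chosen.
    """
    forward = best_hits(row for row in a_vs_b if row[2] <= max_evalue)
    reverse = best_hits(row for row in b_vs_a if row[2] <= max_evalue)
    return sorted(
        (gene_a, gene_b)
        for gene_a, gene_b in forward.items()
        if reverse.get(gene_b) == gene_a
    )
```

Pairs returned by such a procedure are only candidates; in this study shared PULs were further checked by inspecting the surrounding gene neighborhood.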
PULs were differentially scored for orthology between species based on the following criteria: (i) loci that had identical numbers of homologous PUL genes in the same orientation between species, and also contained at least three syntenic homologous genes in the region flanking the PUL, were scored as ''orthologous PULs''; (ii) loci that exhibited similar numbers of homologous PUL genes but in different orientation between species and still contained at least three syntenic homologous genes in the region flanking the PUL were scored as ''probably orthologous PULs''; (iii) PULs that exhibited different numbers of genes with little or no apparent homology, poor conservation of functional predictions (e.g., carbohydrate active enzymes), and were not located at syntenic genomic regions were scored as ''non-orthologous PULs.''

Cloning, Expression, and Purification of Recombinant Proteins
DNA encoding the HTCS periplasmic domains was amplified from the appropriate species' genomic DNA using the primers stated in Table S9 and cloned into pET21d or pET28b (Novagen). The locations of signal peptides and internal transmembrane domains in the HTCS proteins were predicted using the web-based programs SignalP 3.0 (http://www.cbs.dtu.dk/services/SignalP/) and TMPred (http://www.ch.embnet.org/software/TMPRED_form.html), respectively. E. coli BL21 or Tuner (Novagen) cells were used to express recombinant proteins, which were purified in a single step using metal affinity chromatography, as described previously [35].

Isothermal Titration Calorimetry
ITC was performed essentially as described previously [34], using a Microcal VP-ITC. Proteins (50-200 µM, in cell) were dialyzed into 20 mM HEPES, pH 8.0, and ligands (0.5-20 mM oligosaccharides, 5-20 mg/ml polysaccharides, in syringe) were dissolved in the dialysis buffer to minimize heats of dilution. Integrated binding heat effects minus heats of dilution were analyzed by non-linear regression using either a single or two-site binding model (Microcal Origin 7.0 software).
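For orientation, the equilibrium that underlies a single-site binding model can be written out explicitly. The sketch below is not the Microcal Origin fitting routine; it assumes simple 1:1 binding (P + L ⇌ PL with Ka = [PL]/([P][L])) and computes the concentration of complex from total protein, total ligand, and Ka. The function name `complex_concentration` is hypothetical.

```python
import math


def complex_concentration(protein_total, ligand_total, ka):
    """Concentration of 1:1 complex [PL] at equilibrium (all values in M).

    Mass balance and Ka = [PL] / ([P][L]) give the quadratic
        [PL]^2 - (Pt + Lt + 1/Ka) [PL] + Pt * Lt = 0,
    whose smaller root is the physically meaningful solution.
    """
    if ka <= 0:
        raise ValueError("ka must be positive")
    if protein_total < 0 or ligand_total < 0:
        raise ValueError("concentrations must be non-negative")
    b = protein_total + ligand_total + 1.0 / ka
    discriminant = b * b - 4.0 * protein_total * ligand_total
    return (b - math.sqrt(discriminant)) / 2.0
```

In a fitting procedure, the heat of each injection would be related to the change in this complex concentration; that step, and the two-site model, are outside the scope of this sketch.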

Quantitative PCR (qPCR)
Additional quantification of transcript expression was performed by qPCR using a Roche Lightcycler 480 and primers listed in Table S9. Bacteria were cultured in 5 ml of MM containing 0.5% carbon source, as described above. Triplicate bacterial cultures were harvested at mid-log phase and placed in RNAprotect (Qiagen) prior to purification with RNeasy kit (Qiagen). cDNA was produced with QuantiTect Reverse Transcription kit (Qiagen). qPCR was performed in a 96-well plate with SYBRgreen 480 I Master (Roche). Data were normalized to 16S rRNA transcript levels.
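The text above states only that data were normalized to 16S rRNA transcript levels; the exact calculation is not given. One common way to express such data is the comparative Ct (2^−ΔΔCt) approach, sketched here as an assumption rather than as the method used in this study. It further assumes 100% amplification efficiency for both amplicons; the function name `relative_fold_change` is hypothetical.

```python
def relative_fold_change(ct_target_test, ct_ref_test,
                         ct_target_control, ct_ref_control):
    """Fold change of a target transcript (test vs. control condition).

    Each target Ct is first normalized to the reference transcript
    (here assumed to be 16S rRNA) measured in the same sample, then the
    test condition is compared with the control condition. Assumes that
    product doubles every cycle for both amplicons.
    """
    delta_test = ct_target_test - ct_ref_test
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_test - delta_control)
```

For example, a target that crosses threshold five cycles earlier in the test condition, with an unchanged reference Ct, corresponds to a 32-fold increase under these assumptions.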
All oligosaccharides and polysaccharides (low viscosity) used for ITC and qPCR studies were from Megazyme (Wicklow, Ireland), except for oat spelt xylan and cellobiose, which were from Fluka. The water-soluble fraction of oat spelt xylan was used and prepared as described above.
CAZyme Analysis of B. thetaiotaomicron and B. ovatus ATCC 8483
Each protein model encoded by the genomes of the two Bacteroides studied here was subjected to a combination of BLAST [36] and HMMer [37] searches against, respectively, sequence libraries built with the individual modules of the proteins found in the CAZy database (www.cazy.org), and HMM models built with each of the families present in CAZy [14]. To avoid missing distant relatives, permissive thresholds were used (E-value < 0.1), and all resulting candidate proteins were manually screened by comparison to the CAZy families (multiple alignments; presence of catalytic residues where known; presence of appended catalytic and non-catalytic modules, etc.).

Supporting Information
Text S1 Additional Results and Discussion of the B. thetaiotaomicron and B. ovatus plant cell wall degrading mechanisms. (DOC)

Figure S1 Schematic of B. thetaiotaomicron PULs involved in plant glycan metabolism. (A) Eight HTCS-associated B. thetaiotaomicron PULs involved in plant pectin degradation. The HTCS genes (pink boxes) that were disrupted by plasmid insertions are labeled with an ''V'' symbol; black symbols indicate no phenotype was observed; red symbols indicate HTCS mutants that resulted in a growth-deficient phenotype. (B) Two B. thetaiotaomicron PULs expressed in response to rhamnogalacturonan II that lack an associated transcriptional regulator. Each gene is drawn to scale as a rectangle with its orientation indicated by the closed triangle. Dashed lines are used to connect linear segments and do not represent actual genomic distance. Genes that were not induced ≥10-fold in the indicated growth condition are shown as partially transparent (note that in all but two cases shown the HTCS regulator does not itself undergo any expression change). Genes encoding known or predicted functionalities are color coded: glycoside hydrolase (dark blue), polysaccharide lyase (light blue), carbohydrate esterase (light green), susC-like gene (purple), susD-like gene (orange), hybrid two-component system (pink), and other or unknown function (white). Genes encoding predicted enzymatic functions are also annotated according to their CAZy family number: glycoside hydrolase (GH), polysaccharide lyase (PL), and carbohydrate esterase (CE).

Figure S3 A Venn diagram of B. thetaiotaomicron genes that respond to plant and host glycans in vitro and in vivo. Comparison of genes from three different classes of growth conditions: in vitro growth on purified plant glycans; in vitro growth on purified host glycans; and in vivo growth in the ceca of mono-associated gnotobiotic mice consuming a diet rich in plant polysaccharides. All genes summarized exhibited ≥10-fold increased (''up'') or decreased (''down'') expression relative to growth on MM-glucose. These criteria are identical to those used in Figure 1 of the main text and in Table S1. Regulated genes from in vitro growth in the presence of the 7 plant glycans and 5 host glycans listed in Figure 1 of the main text were grouped together into the ''in vitro plant glycan set'' and ''in vitro host glycans set,'' respectively. Regions of overlap indicate inclusion of regulated genes from multiple lists. (EPS)

Figure S4 Growth curves of B. thetaiotaomicron HTCS mutants on various pectins. In each panel, the growth profile of one or more HTCS mutants is compared directly to wild-type B. thetaiotaomicron. Six individual replicate cultures (200 µl each, in the same 96-well plate) were averaged to generate each curve. Error bars represent the standard deviation of each averaged value between the replicates. The average growth maximum for each mutant was compared to wild-type on the same substrate to quantify the growth defect in each mutant (see Table 2, main text). (EPS)

Figure S5 A Venn diagram of B. ovatus genes that respond to plant glycans in vitro and in vivo. Comparison of genes from two different classes of growth conditions: in vitro growth on purified plant glycans; in vivo growth in the ceca of mice consuming a plant-rich diet. All genes summarized exhibited ≥10-fold increased (''up'') or decreased (''down'') expression relative to growth on MM-G; these are identical to the criteria used in Figure 2 of the main text and Table S4. Regulated genes from in vitro growth on the six plant cell wall glycans listed in Figure 2 of the main text were grouped together into the in vitro plant glycans sets. Regions of overlap indicate inclusion of regulated genes from multiple lists. (EPS)

Figure S6 Schematic of B. ovatus PULs involved in plant glycan metabolism. Seven B. ovatus PULs involved in hemicellulose or homogalacturonan degradation. Each gene is drawn to scale as a rectangle with its orientation indicated by the closed triangle. Dashed lines are used to connect linear segments and do not represent actual genomic distance. Genes that were not induced ≥10-fold in the indicated growth condition are shown as partially transparent (note that in all but two cases shown the HTCS regulator does not itself undergo any expression change). Genes encoding known or predicted functionalities are color coded: glycoside hydrolase (dark blue), polysaccharide lyase (light blue), carbohydrate esterase (light green), susC-like gene (purple), susD-like gene (orange), hybrid two-component system (pink), and other or unknown function (white). Genes encoding predicted enzymatic functions are also annotated according to their CAZy family number: glycoside hydrolase (GH), polysaccharide lyase (PL), and carbohydrate esterase (CE).

Table S1 B. thetaiotaomicron genes with altered expression in vitro during growth on plant and host glycans or in vivo in mono-associated gnotobiotic mice fed a plant glycan rich diet. Genes are separated according to the Venn diagram sectors delineated in Figure S4. Values shown are the mean fold differences compared to expression during growth in MM-G; only fields with values ≥10-fold are shown. Upregulated genes are shown in green; downregulated genes are shown in red. Empty cells indicate a fold-change value <10. (PDF)

Table S2
Glycan degrading enzymes encoded in B. thetaiotaomicron and B. ovatus PULs. Glycan degrading enzymes are listed as annotated in the CAZy database [14]. Predicted target linkages are based on the known activities within each CAZy family and the linkages known to be present in the substrate presented for growth. (PDF)

Table S4
B. ovatus genes with altered expression in vitro during growth on plant glycans or in mice fed a plant glycan rich diet. Mean fold differences in expression compared to growth on minimal medium plus glucose are noted. ''Sector'' designations for each sub-list refer to the Venn diagram in Figure S6. (PDF)

Table S5
Putative PULs identified in the B. ovatus ATCC 8483 genome. B. ovatus PULs that contain genes expressed in vivo in the ceca of mice fed a plant rich diet are indicated in the final column. Abbreviations and notes: extracytoplasmic function sigma factor, ECF; hybrid two-component system, HTCS; regulators with homology to other families of PUL-associated regulators are indicated (SusR, GntR, AraC); the presence of a potential regulator in a different family is indicated as ''unknown.'' (PDF)