Functional Mapping of Protein-Protein Interactions in an Enzyme Complex by Directed Evolution

The shikimate pathway enzyme chorismate mutase converts chorismate into prephenate, a precursor of Tyr and Phe. The intracellular chorismate mutase (MtCM) of Mycobacterium tuberculosis is poorly active on its own, but becomes >100-fold more efficient upon formation of a complex with the first enzyme of the shikimate pathway, 3-deoxy-d-arabino-heptulosonate-7-phosphate synthase (MtDS). The crystal structure of the enzyme complex revealed involvement of C-terminal MtCM residues with the MtDS interface. Here we employed evolutionary strategies to probe the tolerance to substitution of the C-terminal MtCM residues from positions 84–90. Variants with randomized positions were subjected to stringent selection in vivo requiring productive interactions with MtDS for survival. Sequence patterns identified in active library members coincide with residue conservation in natural chorismate mutases of the AroQδ subclass to which MtCM belongs. An Arg-Gly dyad at positions 85 and 86, invariant in AroQδ sequences, was intolerant to mutation, whereas Leu88 and Gly89 exhibited a preference for small and hydrophobic residues in functional MtCM-MtDS complexes. In the absence of MtDS, selection under relaxed conditions identifies positions 84–86 as MtCM integrity determinants, suggesting that the more C-terminal residues function in the activation by MtDS. Several MtCM variants, purified using a novel plasmid-based T7 RNA polymerase gene expression system, showed that a diminished ability to physically interact with MtDS correlates with reduced activatability and feedback regulatory control by Tyr and Phe. Mapping critical protein-protein interaction sites by evolutionary strategies may pinpoint promising targets for drugs that interfere with the activity of protein complexes.


Introduction
In prokaryotes, fungi, algae, and plants, the aromatic amino acids L-phenylalanine (Phe, F), L-tyrosine (Tyr, Y), and L-tryptophan (Trp, W) are biosynthesized via the shikimate pathway [1][2][3]. The initial step is the condensation of D-erythrose-4-phosphate 1 and phosphoenolpyruvate 2 to 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP; 3) by DAHP synthase (Fig. 1A). Another six enzymatic steps afford chorismate 4, which is the substrate for anthranilate synthase in the branch towards Trp biosynthesis, or which is converted by chorismate mutase (CM) to prephenate 6, the precursor of Phe and Tyr. Feedback inhibition of strategically positioned enzymatic steps is a commonly used strategy in the shikimate pathway but it is implemented in different ways for various organisms [1]. For example, Escherichia coli produces three isoforms of DAHP synthase, each of them being sensitive to Tyr, Phe, or Trp [2]. In addition, the bifunctional CM-prephenate dehydratase [4,5] and CM-prephenate dehydrogenase [6,7] are inhibited by the products Phe and Tyr of the respective metabolic branches in E. coli [2].
In contrast, Mycobacterium tuberculosis possesses just one DAHP synthase (MtDS) and one intracellular, monofunctional chorismate mutase (MtCM). Whereas MtDS alone is strongly inhibited by the simultaneous binding of the three aromatic amino acids [8,9], kinetic investigations have shown that the CM activity of MtCM itself is not subject to feedback control [10]. However, MtCM becomes sensitive to synergistic inhibition by Tyr and Phe upon formation of a non-covalent enzyme complex with MtDS. Structural characterization by X-ray crystallography has revealed that the complex consists of two MtCM dimers, which decorate a tetrameric MtDS core (Fig. 1B) [10]. The CMs from this hetero-octameric complex exhibit structural features that significantly deviate from the prototypic bacterial CM domain, the AroQα fold, as well as from the eukaryotic (AroQβ) and the secreted (AroQγ) CMs, and were thus grouped as the AroQδ subclass [11,12]. MtCM is the first AroQδ representative that was investigated in structural detail [10,13]. It is assumed that the CMs of the shikimate pathway in the bacterial order Actinomycetales, which includes corynebacteria and mycobacteria, generally belong to the AroQδ subclass and are involved in a similar enzyme complex with the corresponding DAHP synthases [10,14,15].
Without MtDS, MtCM activity is reduced by two orders of magnitude. From the crystal structures of free MtCM and the MtCM-MtDS complex it is clear that MtDS residues do not directly participate in the acceleration of the chorismate to prephenate rearrangement [10]. Instead, it was speculated that the interaction with MtDS optimally positions MtCM active site residues for catalysis. Thus, the stimulation of CM activity must be indirect through transmission of conformational changes at the subunit interface to the catalytic center of MtCM (Fig. 2) [10]. However, it is poorly understood how individual protein segments participate in the more than 100-fold enhancement in catalytic activity. Some C-terminal positions of MtCM were previously probed by site-directed mutagenesis for their contribution to the activation mechanism. Whereas MtCM variants Arg87Ala, Leu88Ala, and a variant with a truncation before Leu88 showed catalytic parameters similar to wild-type MtCM in the absence of MtDS, activation upon complex formation was affected over a broad range of between only 2-fold (Arg87Ala) and 70-fold (Leu88 truncation) [10]. This finding suggested that the tested C-terminal residues are not involved in the basic catalytic machinery, but in the activation mediated by MtDS. To more systematically probe the activation mechanism, we implemented here an evolutionary strategy for the identification of patterns of MtCM residues compatible with formation of a productive complex with MtDS.

Rationale for randomizing MtCM residues
We chose the method of directed evolution [12][16][17][18] to investigate the role of the C-terminal residues of MtCM in the formation of the complex with MtDS. To this end, several gene libraries encoding MtCM variants with randomized C-terminal positions were generated and subjected to selection for CM function.
Fig. 1. (A) The shikimate pathway starts out with the condensation of D-erythrose-4-phosphate 1 and phosphoenolpyruvate 2 to form 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP) 3, catalyzed by DAHP synthase. DAHP is processed in six further enzymatic steps to chorismate 4. Chorismate mutase (CM) catalyzes the pericyclic Claisen rearrangement from 4 via the presumed bicyclic transition state 5 to prephenate 6 [46]. A mimic of this transition state, Bartlett's endo-oxabicyclic dicarboxylic acid transition state analog 7, is the best known inhibitor of most CMs [47]. (B) Hetero-octameric complex between MtCM and MtDS. MtDS forms a tetrameric core (shown in surface representation), which is flanked by two MtCM dimers (cartoon mode with α-helices represented as cylinders, featuring 7 as a stick model with grey carbons in the active sites) that clamp the MtDS tetramerization interface (PDB: 2W1A) [10].
From superimpositions of the crystal structures of free MtCM (PDB: 2VKL) [10], MtCM in complex with MtDS (PDB: 2W1A) [10], and EcCM (the CM domain of the CM-prephenate dehydratase from E. coli; PDB: 1ECM) [19], and from an alignment of AroQ sequences, we identified several residues in the C-terminal region of MtCM as candidates for facilitating the boost in catalytic activity upon complex formation. While not contributing functional groups to the active site, the seven most C-terminal residues (positions 84-90) looked promising in this regard as they fulfill the following criteria: (a) They are proximal to MtDS in the MtCM-MtDS complex and are also in contact with active site residues. The six last residues are within 6 Å of MtDS (PDB: 2W1A; Fig. 3A; [10]), contacting ten out of the 20 MtDS residues at the interaction interface between the two enzymes. Arg85 additionally makes van der Waals contacts to ligand 7, which is bound at the active site of MtCM. (b) They show structural and chemical dissimilarities to the prototypical AroQα protein, EcCM, which does not rely on activation by a complex partner. EcCM has two residues in the C-terminal portion of helix 3, Ser84 and Gln88, that are crucial for its activity. Upon Gln88Ala mutation, EcCM activity drops by a factor of 2×10⁴, and Ser84 is considered important for orientation of the substrate molecule in the active site [19,20]. Both residues are missing in MtCM, and there are no chemically plausible substitutes in the C-terminal region (Fig. 3A). Furthermore, instead of an extended helix in EcCM, the C-terminus in MtCM adopts a loop structure. Upon complex formation, this loop rearranges such that the penultimate C-terminal residue changes its position by more than 14 Å [10]. Gly84 and Gly86 of MtCM may thereby serve as helix-breakers, allowing the C-terminus to bend away from the active site (Fig. 2A). (c) They show good conservation in the AroQδ alignment (Fig. 3B). In fact, the multiple sequence alignment of different AroQδ CMs exhibits a fully conserved Arg-Gly dyad at MtCM positions 85-86 and strongly conserved residues next to this pattern that are not found in AroQα proteins like EcCM (Fig. 3A).
Fig. 2. [...] Wall-eyed stereogram of the superimposed active sites, with both liganded malate (sticks with green carbons) and 7. Side chains of some active site residues, which change location upon complex formation, are shown as sticks [10].
Fig. 3. (A) [...] [10]. Catalytic residues are indicated with dots and numbers above or below the primary sequence. Residues that could assume the roles of EcCM's Ser84 and Gln88 are missing in MtCM. MtCM residues within a 6-Å shell of MtDS are highlighted in cyan. (B) Multiple sequence alignment of representative AroQδ CMs from the order of Actinomycetales. The conservation of individual residues is color-coded by text highlighting in black, 100%; red, ≥75%; orange, ≥50%; yellow, ≥33%; white, <33% identity; numbering according to the MtCM (Mtu) sequence. Abbreviations: Mtu, M. tuberculosis; Eco, E. coli; Cdi, Corynebacterium diphtheriae; Rer, Rhodococcus erythropolis; Tpa, Tsukamurella paurometabola; Gbr, Gordonia bronchialis; Sro, Segniliparus rotundus; Ame, Amycolatopsis mediterranei; Svi, Saccharomonospora viridis; Ami, Actinosynnema mirum; Nmu, Nakamurella multipartita; Sna, Stackebrandtia nassauensis; Str, Salinispora tropica; Gob, Geodermatophilus obscurus; Stvi, Streptomyces viridochromogenes; Ace, Acidothermus [...]

Redesign and calibration of the selection system for directed evolution
To select for MtCM variants that are still capable of productive interactions with MtDS, a previously established selection system [21] based on the CM-deficient and thus Phe and Tyr auxotrophic E. coli strain KA12 was adapted. KA12 carries a chromosomal deletion of the genes pheA and tyrA encoding the two bifunctional enzymes CM-prephenate dehydratase and CM-prephenate dehydrogenase, respectively. It can grow on minimal medium devoid of Phe and Tyr (M9c) if provided with the helper plasmid pKIMP-UAUC carrying the genes pheC and *tyrA for monofunctional versions of prephenate dehydratase and prephenate dehydrogenase, respectively [21], and, additionally, with a compatible plasmid containing a sufficiently active CM gene. Fig. 4A shows that the wild-type MtCM gene on plasmid pKTNTET complements the CM deficiency of KA12/pKIMP-UAUC on selective minimal plates. However, growth is only possible if MtCM gene expression is induced with an elevated concentration (500 ng/mL) of tetracycline (Tet), the inducer of the Ptet promoter upstream of the CM gene [17,22]. In the absence of inducer or at lower Tet levels, growth is impossible or severely impaired. As an alternative to varying the Tet concentration for rigorous control of the intracellular enzyme level [17,22], the stringency of the selection system can be tuned by providing Phe in the selective minimal medium (i.e., M9c +F; Fig. 4A), such that the cells only need to biosynthesize Tyr for growth [21]. Fig. 4B illustrates an extended version of the selection system to explore MtCM sequence features important for activation by MtDS. Instead of plasmids pKIMP-UAUC and pKTNTET, KA12 contains, respectively, pKIMP-ACG, which additionally carries aroG encoding MtDS, and the library plasmid pKT-CM, which encodes partially randomized MtCM variants. Since pKT-CM has an otherwise identical structure to pKTNTET, expression of the aroQδ mutant genes is also controlled from Ptet.
A possible concern is that endogenous E. coli DAHP synthases might affect MtCM activity. However, this is highly unlikely, since MtCM is a member of the distinct AroQδ subclass not present in E. coli, and the endogenous DAHP synthase isoenzymes, which belong to a different subtype than MtDS, are not part of a feedback regulatory circuit for controlling CM activity [10]. The lack of regulatory interactions between MtCM and the DAHP synthases of KA12 is also apparent in Fig. 4A, where addition of Phe to M9c promotes, rather than feedback inhibits, growth of KA12/pKIMP-UAUC/pKTNTET.
The new KA12/pKIMP-ACG selection system was calibrated using wild-type MtCM (on pKTNTET) and appropriate negative controls. It was demonstrated (Fig. 4A) that at basal promoter levels (no or only little Tet added), survival of bacteria on selective minimal medium agar plates (M9c, M9c +F) depends on the formation of an active MtCM-MtDS complex (i.e., the presence of both pKIMP-ACG and pKTNTET). These results showed that KA12/pKIMP-ACG can be used as a new selection system to explore the interactions between the two enzymes.
Fig. 4. Selection system for MtCM variants that can be activated by MtDS. The selection system used is based on E. coli strain KA12 lacking the endogenous pheA and tyrA genes, which encode CM-prephenate dehydratase (PDT) and CM-prephenate dehydrogenase (PDH), respectively. Plasmid pKIMP-UAUC has the p15A-derived origin of replication (ori p15A) and carries *tyrA for a monofunctional PDH and pheC for a monofunctional PDT, in addition to cat providing chloramphenicol resistance [21]. Plasmid pKIMP-ACG additionally contains the aroG gene encoding MtDS. (A) Performance of wild-type MtCM (encoded by aroQδ on plasmid pKTNTET) without MtDS (pKIMP-UAUC) or with MtDS (pKIMP-ACG) in comparison to a negative control (pKTCTET-0, lacking aroQδ). Growth was assessed on M9c-based minimal plates in the presence or absence of Phe (F) and Tyr (Y). Colony size was scored either as ++ (good growth), + (moderate growth), - (poor growth), or 0 (no trace of growth) as a function of the added Tet concentration, the inducer of the Ptet promoter upstream of aroQδ. (B) Schematic representation of the redesigned selection system. The aroQδ gene encoding an MtCM library variant is provided on plasmid pKT-CM that has otherwise the same structure as pKTNTET (bla, β-lactamase for ampicillin resistance; its ori pUC is compatible with ori p15A). Under stringent selection conditions (M9c), host cells transformed with both pKIMP-ACG and a pKT-CM library plasmid can only produce enough prephenate and consequently Phe and Tyr needed for growth if the encoded MtCM variant can engage in a productive complex with MtDS (wide green arrow). Transformants having insufficient CM activity (thin, light green arrow) require exogenously added F and Y for growing on M9c minimal plates.

Sequence patterns selected from libraries of C-terminally randomized MtCM
Eight MtCM gene libraries were constructed by randomizing different subsets of the codons for the seven C-terminal residues (Fig. 3C). This was accomplished with partially degenerate oligonucleotides encoding the mutagenized positions in the NNN codon format, deliberately including the three stop codons in the randomizing cassettes. The library plasmid pools were transformed into KA12/pKIMP-ACG and the transformants were plated on minimal medium agar plates. Library sizes determined from control platings on non-selective M9c +FY (supplemented with both Phe and Tyr) ranged from 3×10⁴ to 1.8×10⁶ clones per library. Sequencing of 65 clones from non-selective plates confirmed that the chemical oligonucleotide synthesis and the procedures used in the construction of the libraries did not inappropriately bias the composition of the libraries at the nucleotide level, which would have skewed the ensemble of amino acid sequences available for selection (Fig. 5A). The KA12/pKIMP-ACG libraries were subjected to selection on minimal M9c agar plates. After 3 days of growth at 30 °C, single colonies were picked and the sequence of their aroQδ genes was determined. From the eight combinatorial libraries, a total of 111 members growing on selective plates were sequenced. Fig. 5B summarizes the pattern of residues found in MtCM variants that enable survival under stringent selection conditions in the presence of MtDS.
All selected clones retained a small amino acid at position 84 (Gly, Ala, Thr, Ser). This coincides with the conservation of a small residue (Gly, Ser, Cys) in the alignment of natural AroQδ sequences (Fig. 3B). The crystal structure of the MtCM-MtDS complex (Fig. 6) provides a rationale for this finding. Gly84 is buried in the protein, leaving little space for larger amino acids at this position. The Arg85-Gly86 dyad was fully conserved in all experimentally selected clones (Fig. 5B), as well as in natural AroQδ proteins (Fig. 3B), and thus might be crucial for complex formation. Alternatively, these residues may play a role in the catalytic machinery of MtCM, which was probed in a separate experiment described below. Arg85 makes hydrophobic contacts to the transition state analog 7 via its apolar methylene groups and also to the MtDS surface (Fig. 6) [10]. It is completely engulfed by surrounding residues and by 7, except for the charged head group, which is partially solvent accessible. Inspection of the crystal structure of the complex also provides a rationale for the total conservation of Gly86 found in the experiment. Even though the α-carbon of Gly86 is surface exposed and enough space would potentially be available to accommodate larger residues, Gly86 is probably retained because it is the only residue that can prevent unfavorable interactions of other C-terminal amino acids with the MtCM catalytic core. This conclusion is supported by a Ramachandran analysis where the φ/ψ angles of Gly86 place this residue far away from the allowed regions for all other amino acids. Thus, Gly86 appears to be an ideal choice because it can break the last helix, allowing the C-terminal residues to bend away from the protein core and thereby making them available for interactions with MtDS (Fig. 6) [10].
Fig. 5. [...] Column colors correspond to the randomized positions 84 (blue), 85 (red), 86 (green), 87 (purple), 88 (cyan), 89 (orange), and 90 (light blue). Side chains are ordered according to increasing volume [48]; an asterisk denotes a stop codon. The absolute number of codons compiled at each position is indicated in parentheses next to the wild-type residue. The absolute numbers of individual residues found at every position are, in addition to the graphical representation of the relative frequencies shown here, listed in S1 Table.
Residue Arg87 is only moderately conserved phylogenetically (Fig. 3B) and also exhibits high variability in our selection experiments, including substitutions by smaller and other polar residues (Fig. 5B). Space constraints may disfavor amino acids bulkier than arginine, since clearly fewer Phe, Trp, or Tyr residues were selected. In the crystal structure of the MtCM-MtDS complex, Arg87 makes only few contacts to other residues in MtCM and none to MtDS (Fig. 6), suggesting that it is less important for activity enhancement by MtDS. In contrast, Leu88 is almost completely buried by other residues from MtCM and MtDS (Fig. 6). In this context, it is remarkable that several other amino acids are tolerated at this position. Besides the preferred Leu, other apolar amino acids like Ala, Pro, Val, Ile, and Met were found, suggesting that the size of the residue is less important at this site of the interaction interface than the preservation of hydrophobic contacts.
Similar to Leu88, Gly89 makes close contacts to residues of both MtCM and MtDS (Fig. 6). From the selected amino acid pattern at this position, small residues seem preferred for high CM activity of the complex. Interestingly, besides mostly Gly, Ala, and Ser, many of the selected variants terminated at position 89 (Fig. 5B). Thus, larger residues appear to be more detrimental to complex formation than the absence of the last two C-terminal amino acids. About one third of the natural AroQδ proteins in the multiple sequence alignment (Fig. 3B) also terminate at this position, corroborating our observations. Even though His90 interacts with four MtDS residues in the crystal structure (Glu396, Arg399, Arg461, and Asp462; PDB: 2W1A), it is not conserved among the selected mutants. The properties of the side chains that act as functional substitutes range from very small to very large and from hydrophobic to charged (Fig. 5B).
Overall, the pattern of residues emerging from the selection experiments in the presence of MtDS mirrors the conservation pattern in the multiple sequence alignment of AroQδ proteins (Fig. 3B; visualized in Fig. 6). In general, conservation of MtCM residues could either mean that they are important for the AroQδ-specific activation by interaction with MtDS or that they are required for the intrinsic catalytic machinery of MtCM. To distinguish between these two possibilities, the seven C-terminal positions of MtCM were probed in an independent experiment for their direct involvement in CM catalysis. This was accomplished by surveying the complementation ability of the randomized MtCM variants under less stringent conditions, where formation of a complex with MtDS is not required for survival and growth on minimal plates. Specifically, plating onto the only mildly selective agar plates M9c +F +500 ng/mL Tet allows for good growth of clones with wild-type MtCM in the host KA12/pKIMP-UAUC, even in the absence of MtDS (Fig. 4).
Five representative previously constructed libraries (CT7, GRGR, LGH, GRG, and RLGH; Fig. 3C) were transformed into KA12/pKIMP-UAUC and between 0.16% and >50% of the library members were able to form colonies on M9c +F +500 ng/mL Tet. Sequencing of 106 complementing clones yielded the conservation pattern shown in Fig. 5C. From the high frequency of small residues at position 84, and the almost 100% conservation found for Arg85 and Gly86, we conclude that residues 84 to 86 are not specifically responsible for the AroQδ-typical activation through complex formation. Instead, these residues are required for the basic catalytic machinery in MtCM or for the integrity of its structure. Such a role is conceivable, as Arg85 contacts the ligand directly and Gly86 allows for kinking the polypeptide chain at the C-terminus to maintain an unobstructed active site.
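As a rough aid for judging how thoroughly NNN-randomized libraries of the reported sizes can sample sequence space, the following minimal sketch computes the theoretical nucleotide-level diversity of a randomized cassette, the fraction of cassettes free of stop codons, and the expected coverage for a given number of transformants. It assumes equimolar nucleotides at every randomized position and uniform, independent sampling (a Poisson approximation). The function names and the numbers of randomized codons used in the loop are illustrative assumptions, not values taken from the study.

```python
import math

CODONS = 64       # all possible NNN triplets
STOP_CODONS = 3   # TAA, TAG, TGA are deliberately included in NNN cassettes


def dna_diversity(n_codons: int) -> int:
    """Number of distinct nucleotide sequences for n fully randomized codons."""
    return CODONS ** n_codons


def fraction_without_stop(n_codons: int) -> float:
    """Fraction of cassettes that contain no stop codon (assumes unbiased NNN)."""
    return ((CODONS - STOP_CODONS) / CODONS) ** n_codons


def expected_coverage(library_size: float, n_codons: int) -> float:
    """Expected fraction of all nucleotide variants sampled at least once.

    Poisson approximation: each variant is drawn independently and uniformly.
    """
    return 1.0 - math.exp(-library_size / dna_diversity(n_codons))


if __name__ == "__main__":
    # Hypothetical cassette lengths; library sizes span the reported range.
    for n in (3, 4, 7):
        for size in (3e4, 1.8e6):
            print(
                f"{n} codons, {size:.1e} clones: "
                f"diversity={dna_diversity(n):.2e}, "
                f"no-stop fraction={fraction_without_stop(n):.3f}, "
                f"coverage={expected_coverage(size, n):.3g}"
            )
```

Under these assumptions, a library of 1.8×10⁶ clones would sample nearly all 64³ combinations of three randomized codons, but only a minute fraction of the 64⁷ combinations of seven.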
The positions C-terminal to Gly86 do not seem to be essential for MtCM activity, since they show a rather random distribution of amino acids. A plot relating the frequencies of conserved residues selected in complexed vs. free MtCM (Fig. 5D) reveals that the complexed MtCM shows a preference for Met at position 88 followed by Leu. Furthermore, a Gly (or a stop codon) is strongly favored at position 89 for the complex, whereas free MtCM shows a fully random distribution of residues here. Overall, the intrinsic low CM activity of free MtCM is more tolerant to C-terminal mutations, as apparent from the many columns with small negative values in Fig. 5D. In contrast, the conservation of residues specifically in the presence of MtDS pinpoints critical hinges and contact areas involved in productive transmission of conformational changes from the interface of the enzyme complex to the active site, as already discussed in the previous section in the context of the MtCM-MtDS structure.
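The comparison of residue frequencies between the two selection regimes can be sketched as follows. This is a minimal illustration, assuming each sequenced clone is written as a seven-character string for positions 84-90, with `*` for a stop codon and `-` for positions after a stop. The toy sequences are hypothetical and are not data from the study; the function names are likewise illustrative.

```python
from collections import Counter

POSITIONS = range(84, 91)  # MtCM residues 84-90


def residue_frequencies(sequences):
    """Return {position: {residue: relative frequency}} for aligned 7-mers."""
    freqs = {}
    for offset, pos in enumerate(POSITIONS):
        # Positions after a stop codon ('-') carry no information; skip them.
        counts = Counter(seq[offset] for seq in sequences if seq[offset] != "-")
        total = sum(counts.values())
        freqs[pos] = {aa: n / total for aa, n in counts.items()} if total else {}
    return freqs


def frequency_difference(complexed, free):
    """Per-position difference: frequency with MtDS minus frequency without."""
    diff = {}
    for pos in POSITIONS:
        residues = set(complexed[pos]) | set(free[pos])
        diff[pos] = {
            aa: complexed[pos].get(aa, 0.0) - free[pos].get(aa, 0.0)
            for aa in residues
        }
    return diff


# Hypothetical toy data, for illustration only.
with_mtds = ["GRGRLGH", "ARGKM*-", "SRGTLGR", "GRGAMGD"]
without_mtds = ["GRGRLGH", "GRGWKYF", "ARGEDQ*", "SRGPTNC"]

diff = frequency_difference(
    residue_frequencies(with_mtds), residue_frequencies(without_mtds)
)

if __name__ == "__main__":
    for pos in POSITIONS:
        print(pos, {aa: round(d, 2) for aa, d in sorted(diff[pos].items())})
```

Positive values mark residues enriched when complex formation with MtDS is required; values near zero mark residues that are equally frequent under both regimes.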

Properties of C-terminally randomized MtCM variants
To examine the impact of C-terminal amino acid exchanges on the kinetic properties of MtCM, several selected enzyme variants were overproduced and purified. The library plasmid pKT-CM features, in addition to Ptet required for gene expression during in vivo selection, also the much stronger T7 promoter in tandem configuration (Fig. 7A). Typically, this promoter is used for high-level gene expression in conjunction with an engineered E. coli strain possessing a chromosomally integrated gene for T7 RNA polymerase controlled by the lac promoter [23][24][25]. To exclude a priori any contamination of the purified proteins by CM activity from the two endogenous E. coli enzymes, we developed a new, generally applicable gene expression strategy that relies on a plasmid-borne T7 RNA polymerase gene, allowing for convenient overproduction of the MtCM variants in our CM-deficient mutant strain KA12 (Fig. 7B).
The new plasmid pT7POLTS (ori p15A) is compatible with pKT-CM (ori pUC), and carries a chloramphenicol resistance marker and the tetracycline repressor, which controls expression of an adjacent Ptet-regulated T7 RNA polymerase gene in response to the concentration of the inducer Tet. A problem often encountered with T7 RNA polymerase-driven gene expression is an undesirably high basal activity ("leakiness") prior to addition of inducer [25]. This is typically caused by trace amounts of lactose that contaminate complex media components to varying degrees, resulting in low-level induction of the chromosomal lac promoter-controlled T7 RNA polymerase gene [26]. In our system, any background expression is efficiently reduced with an in-frame translational fusion of the C-terminus of the polymerase to an SsrA peptide tag that directs the T7 RNA polymerase mostly to the ClpXP protease system [27,28]. In fact, even though the modified polymerase gene was placed on the intermediate high copy-number plasmid pT7POLTS, there was no apparent gene expression in the absence of the inducer [29]. Upon induction with 2 μg/mL Tet, T7 RNA polymerase production, now presumably exceeding the degradation capacity of ClpXP, leads to strong expression of the target gene on the library plasmid pKT-CM from its tandem Ptet PT7 promoter system (Fig. 7).
Table 1 lists MtCM library variants competent in MtDS activation that were chosen for further characterization. After production and purification by metal affinity chromatography, the electrophoretically homogeneous proteins were assessed for their structural integrity by circular dichroism (CD) spectroscopy. All clones showed spectra comparable to wild-type MtCM, with a dominant α-helical structure as apparent from the typical relative minima at 208 and 222 nm. Also, the expected molecular masses of the variants were confirmed by electrospray ionization mass spectrometry within the error of the experiment (±5 Da).
Fig. 7. (A) [...] and in vitro overproduction of the MtCM variants (using PT7). The sequence of the Ptet PT7 tandem promoter is given with the binding sites for the Tet-responsive TetR repressor highlighted in bold italics and the start codons of the reading frames in bold roman type [49]; underlined are relevant restriction sites, the ribosomal binding site (RBS tet), −35 and −10 regions of Ptet [49] and the RBS T7 and promoter PT7 from phage T7 [23,50]; start and direction of transcription is marked by an arrowhead [22]. (B) [...] and a Ptet-controlled T7 RNA polymerase gene (T7pol) translationally fused at its 3′ end to the sequence for the SsrA degradation tag. In the absence of Tet, TetR binding to its operator sites (highlighted as in panel A) blocks gene expression from Ptet and any T7 RNA polymerase produced due to low-level leaky transcription is effectively eliminated by SsrA-mediated Clp proteolysis, thereby suppressing basal polymerase activity. Provision of Tet releases TetR from the operator, resulting in intracellular polymerase levels higher than can be degraded efficiently by the Clp proteases [41]. For efficient translation, the alternative RBS alt can be used. The accumulating polymerase then directs massive transcription from PT7-controlled genes, such as aroQδ on pKT-CM. The entire nucleotide sequence of pT7POLTS is provided as S2 Fig.
In the absence of MtDS, all variants exhibited a catalytic efficiency (kcat/Km) within a factor of 2 of the wild-type MtCM (Table 1). They all are activated in vitro by MtDS by more than a factor of 4, which apparently suffices for complementation under the stringent in vivo selection regime. The magnitude of activation correlates with some sequence features found in the variants. It appears that premature termination at position 88 reduces the activation by MtDS by [...] (Table 1).
The apparent activation factor was estimated as described previously [10], as the ratio of CM initial velocities of the MtCM-MtDS complex (v0(MtDS+MtCM)), normalized by MtCM-variant and chorismate concentrations, over kcat/Km for free MtCM.
To survey the physical interaction between MtDS and the different MtCM variants, band-shift experiments with native polyacrylamide gel electrophoresis (PAGE) were performed (Fig. 8). Although MtCM variants do not appear as discrete bands because their pI is above the pH of the gel, we observed a full shift of the MtDS band if wild-type MtCM is present. Like the wild-type control, the highly activated variants 4-5 and 2-4 cause maximum MtDS shifts, but also the poorly activated variant ST-46. Thus, even though physical interaction is necessary, it is not sufficient for high catalytic activatability by MtDS. The extra positive charge (Arg90) at the C-terminus of ST-46 might be responsible for the tight, but functionally modest interaction with MtDS. In fact, an Arg90 modeled into the crystal structure of the MtCM-MtDS complex [10] could form a salt bridge with the C-terminal Asp462 of MtDS, either to its side chain or its free C-terminal carboxylate. Even though all variants are catalytically activated (to some extent) in vitro and in vivo by MtDS, no physical interaction with the partner enzyme was apparent for MtCM versions 2-8 and 2-17 under the prevailing conditions of the native PAGE, suggesting that these variants bind more weakly to MtDS.
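The apparent activation factor described in this section (initial velocity of the complex, normalized by MtCM-variant and chorismate concentrations, divided by kcat/Km of free MtCM) can be written out as a short calculation. The sketch below is only an illustration of that ratio. The function name, the units, and the example numbers are assumptions chosen for the example, not values from Table 1.

```python
def apparent_activation_factor(v0_complex, mtcm_conc, chorismate_conc, kcat_km_free):
    """Ratio of normalized initial velocity in the complex to kcat/Km of free MtCM.

    Assumed units (illustrative): v0 in M/s, concentrations in M,
    kcat/Km in 1/(M*s). The result is dimensionless.
    """
    if min(mtcm_conc, chorismate_conc, kcat_km_free) <= 0:
        raise ValueError("concentrations and kcat/Km must be positive")
    # Under sub-saturating substrate, v0 / ([E][S]) approximates kcat/Km.
    normalized_velocity = v0_complex / (mtcm_conc * chorismate_conc)
    return normalized_velocity / kcat_km_free


if __name__ == "__main__":
    # Hypothetical numbers: 10 nM enzyme, 10 uM chorismate.
    factor = apparent_activation_factor(
        v0_complex=1e-8, mtcm_conc=1e-8, chorismate_conc=1e-5, kcat_km_free=1e3
    )
    print(f"apparent activation factor: {factor:.1f}")
```

Normalizing by both concentrations makes the numerator directly comparable to kcat/Km, so the ratio reports the fold change in catalytic efficiency caused by MtDS.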
In summary, the selection experiments with C-terminally randomized MtCM libraries resulted in a set of distinct regulatory variants with roughly similar intrinsic basal CM activities but varying widely in their potential to become activated by MtDS. Simultaneously, the activation potential correlates roughly with the degree of the physical interaction with MtDS observed by native PAGE and the sensitivity to feedback inhibition of the complex, as illustrated in Fig. 9.

Conclusions
We have applied the strategies of directed evolution (i.e., randomizing mutagenesis, selection in vivo, and analysis of surviving library members) to quickly survey essential protein-protein interactions in the MtCM-MtDS enzyme complex. The pattern of sequence conservation for MtCM's seven C-terminal residues that we observed in our laboratory evolution experiments essentially coincides with the pattern that emerged during natural evolution of homologous CMs. From two complementary selection experiments, carried out in the absence and in the presence of MtDS, we were able to discriminate between residues needed mainly for the basic integrity of AroQδ CMs and those required for catalytically productive complex formation, respectively.
Because the biosynthesis of the aromatic amino acids Phe and Tyr is energetically very costly for organisms [30], it needs to be tightly regulated, particularly at metabolic branch point reactions and at irreversible steps, such as the one catalyzed by CM [31]. It was hypothesized that the level of CM activity in an M. tuberculosis cell is regulated by rapid dynamic control of the ratio between the poorly active free MtCM and the over 100-fold more active MtCM when in complex with MtDS [10]. Specifically, if intracellular Phe and Tyr are abundant, these shikimate pathway end products could influence this equilibrium by binding to allosteric effector sites in the MtCM-MtDS assembly, resulting in a further increase of the already high (140 nM) apparent dissociation constant Kd,app of the hetero-octameric complex [10]. The examination of the MtCM variants and their interactions with MtDS described here supports a regulatory mechanism that involves shifting the equilibrium between free and complexed MtCM. In general, a poor ability of an MtCM variant to bind to MtDS on a native gel correlates with both a rather modest MtDS-mediated CM activity increase and a poor response to feedback inhibition.
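How a shift in the apparent dissociation constant could translate into CM activity can be illustrated with a deliberately simplified calculation. The sketch below assumes a simple hyperbolic (1:1, partner in excess) binding model and a 100-fold activity ratio between complexed and free MtCM. Both are simplifying assumptions made for illustration: the real assembly is a hetero-octamer, and the text states only that activation exceeds 100-fold. The function names and the raised Kd value in the usage example are hypothetical.

```python
def fraction_complexed(mtds_conc_nM, kd_app_nM=140.0):
    """Fraction of MtCM bound to MtDS under a simple hyperbolic binding model.

    Assumes MtDS in excess over MtCM, so free MtDS ~ total MtDS.
    """
    return mtds_conc_nM / (kd_app_nM + mtds_conc_nM)


def relative_cm_activity(fraction, activation_fold=100.0):
    """Population-averaged CM activity relative to fully free MtCM (= 1)."""
    return fraction * activation_fold + (1.0 - fraction)


if __name__ == "__main__":
    mtds_nM = 500.0  # hypothetical intracellular MtDS concentration
    # 140 nM is the reported Kd,app; 1400 nM is a hypothetical raised value
    # standing in for the effect of bound Phe and Tyr.
    for kd in (140.0, 1400.0):
        f = fraction_complexed(mtds_nM, kd)
        print(f"Kd,app={kd:.0f} nM: fraction complexed={f:.2f}, "
              f"relative activity={relative_cm_activity(f):.1f}")
```

In this simplified picture, raising Kd,app lowers the complexed fraction and with it the average CM activity, which is the qualitative behavior expected for feedback control by equilibrium shifting.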
Finally, high-resolution structures of some of the strongly deregulated variants selected here, such as clones 1-6 or 2-17, could inform the rational design of molecules that interfere with protein-protein interactions [32][33][34][35]. Drugs that target Actinomycetales-specific CM-DS complexes [10,15] may cause an activity decrease by blocking the protein-protein interface through direct competitive binding, or by stabilizing a non-productive conformation of one partner protein, as may be assumed by some of the MtCM variants found in this work. In principle, our evolutionary strategy is applicable for characterizing druggable regions in any protein complex for which a selectable function depends on protein-protein interactions.

Reengineering of the Chorismate Mutase Selection System
Tight control of MtCM gene expression in vivo was achieved by reengineering the selection system towards a tetracycline-inducible promoter system. pKTCM-HCDXhoI was constructed from pKTCM-HC [10] by PCR amplification of a 270 bp fragment using oligonucleotides 293-DXhoI-S (5′-TGGAAATCGTGGAGTCCCAACCTGT) and 204-MTCM2N (5′-CGATAACTCGAGGTGACCGAGGCGGCCACGGCCCAAT, employed restriction site underlined). The obtained PCR fragment was subsequently used as a megaprimer together with oligonucleotide 206-MTCM2S (5′-ACCGATGTCATATGCGTCCAGAACCCCCACATCA) on pKTCM-HC as a template. The NdeI/XhoI-digested fragment (318 bp) was ligated to the 4561 bp NdeI-XhoI fragment from pKTCM-HC, yielding the 4879 bp pKTCM-HCDXhoI plasmid. This plasmid was used to construct pMG242, containing a shortened but functional version of the aroQδ gene, which encodes an MtCM variant starting at Met5. Ligation of the NdeI/XhoI-digested product (261 bp) of the PCR with oligonucleotides 297-SHO-S (5′-CAACATATGCTGGAGTCCCAACCT) and 204-MTCM2N on template pKTCM-HCDXhoI to the 4561 bp fragment of the NdeI/XhoI-digested pKTCM-HC plasmid yielded the 4822 bp pMG242. Plasmid pKSS-TM4 was used as the template for library construction. pKSS-TM4 contains a non-functional portion of the aroQδ gene on a pKSS backbone [37], lacks the restriction sites used later for library cloning, and was assembled from a PCR fragment generated using oligonucleotides 341-MCMN-SacI-S (5′-CTCACGAGCTCACCATCATCATCACCACTTCTTCTGGTATGCTCGAGTCCCAACCTGT) and 342-MCM-KpnI-N (5′-TACTTGGTACCTTAGCGGCCCAAGCGCAAAAGCAGGATGGCCAGA) on template pMG242. The 278 bp KpnI-SacI fragment was ligated into the correspondingly cut pKSS fragment of 2855 bp, yielding pKSS-TM4 (3133 bp). Plasmid pKTCMtet2-HC (3058 bp) contains the aroQδ gene driven by a tandem Ptet PT7 promoter system, as well as the tetR gene for controlled repression of the tet operator within Ptet.
pKTCMtet2-HC was obtained by ligating the 261 bp NdeI-XhoI fragment of pMG242 to the 2797 bp NdeI-XhoI fragment of plasmid pKTH-400-5 (3100 bp) [17,29]. To generate an aroQδ-negative control and an acceptor vector for the libraries, plasmid pKTCTET-0 (4096 bp) was constructed by ligating the 2797 bp NdeI-XhoI fragment of pKTCMtet2-HC with the 1299 bp NdeI-XhoI stuffer fragment from pMG242-0. Plasmid pMG242-0 (5860 bp) is derived from pMG242 by inserting a Tet resistance stuffer fragment. It was constructed in two steps, first by introducing a silent AscI site into the MtCM gene of pMG242, yielding pMG242-sil, and subsequently by cloning an AscI/XhoI-digested Tet resistance determinant fragment into the AscI/XhoI-digested pMG242-sil. For the construction of pMG242-sil, a 193 bp fragment generated with primers 299-Ascsil-S (5′-TTAGTCAAGCGGCGCGCCGAGGTTTCCAAGGCCAT-3′) and 204-MTCM2N on template pMG242 was used as a megaprimer for a second PCR on the same template together with 297-SHO-S, giving a 278 bp fragment. This fragment was cut with NdeI and XhoI (261 bp) and inserted into the NdeI/XhoI-digested pKTCM-HC vector backbone (4561 bp) to give pMG242-sil (4822 bp). The fragment carrying the Tet resistance determinant was amplified from a DNA sequence of pBR322 [39] with primers 302-STUFF (5′-CAAACTCGAGCCGTGTATGAAATCTAA-3′) and 162-STS (5′-CAAAGGCGCGCCCATTCAGGTCGAGGT-3′). The resulting 1222 bp PCR fragment was digested with XhoI and AscI to yield a 1207 bp stuffer fragment (encoding the Tet resistance determinant) that was then ligated with the 4653 bp XhoI/AscI-digested pMG242-sil to give plasmid pMG242-0 (5860 bp). Plasmid pKTNTET was constructed by restriction digestion of plasmids pMG244 and pKTCTET-0 with NdeI and SpeI. The respective 296 bp and 2765 bp fragments were ligated and yielded plasmid pKTNTET (3061 bp).
pKTNTET encodes the His6-tagged MtCM sequence defined in this work as wild-type (wt) MtCM (i.e., MtCM carrying an N-terminal Met-His6-Ser-Ser-Gly sequence fused to Met5 of entry Mtu in Fig. 3C) but otherwise has the same structure as the library plasmids "pKT-CM" and was therefore used as the positive control in in vivo complementation tests; its entire nucleotide sequence is provided as S1 Fig. pMG244 was constructed from ligation of the 4529 bp NdeI-SpeI fragment of pMG243 to the 296 bp NdeI/SpeI-digested PCR fragment obtained from amplification using oligonucleotides 201-MTCM2HS (5′-ACCGATGTCATATGCACCATCATCATCATCATTCTTCTGGTATGCTCGAGTCCCAACCT) and 203-MTCM2CN (5′-CGATACACTAGTTATTAGTGACCGAGGCGGCCACGGCCCAAT) on template pKTCM-HC. pMG243 (4879 bp) was constructed from the 4561 bp NdeI/XhoI fragment of pKTCM-HC and a 318 bp NdeI/XhoI-digested PCR fragment obtained from amplification of pKTCM-HCDXhoI using oligonucleotides 294-RTOK (5′-GCAAGGCCAAAATGGCGTCCGGT) and 204-MTCM2N, where the 155 bp PCR product was used as a primer together with oligonucleotide 206-MTCM2S, also on template pKTCM-HCDXhoI (crude PCR product length, 338 bp; NdeI/XhoI-digested fragment, 318 bp).
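The plasmid sizes quoted above can be cross-checked with simple bookkeeping. The sketch below is not part of the original methods; it assumes that each listed plasmid results from joining exactly two digested fragments whose stated lengths add up to the stated plasmid size. All numbers and plasmid names are taken from the text; the dictionary layout and function name are illustrative.

```python
# Fragment sizes (bp) and stated plasmid sizes (bp) as quoted in the text.
LIGATIONS = {
    "pKTCM-HCDXhoI": ([318, 4561], 4879),
    "pMG242":        ([261, 4561], 4822),
    "pKSS-TM4":      ([278, 2855], 3133),
    "pKTCMtet2-HC":  ([261, 2797], 3058),
    "pKTCTET-0":     ([2797, 1299], 4096),
    "pMG242-sil":    ([261, 4561], 4822),
    "pMG242-0":      ([1207, 4653], 5860),
    "pKTNTET":       ([296, 2765], 3061),
    "pMG243":        ([4561, 318], 4879),
}

def check_ligations(ligations):
    """Return {plasmid: (sum_of_fragments, stated_size)} for every mismatch."""
    mismatches = {}
    for name, (fragments, stated) in ligations.items():
        total = sum(fragments)
        if total != stated:
            mismatches[name] = (total, stated)
    return mismatches

if __name__ == "__main__":
    problems = check_ligations(LIGATIONS)
    print(problems if problems else "all stated sizes are consistent")
```

An empty result means that every stated plasmid size equals the sum of its two fragments; the check says nothing about sequence correctness.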
To carry out in vivo selection experiments for MtCM variants able to interact with MtDS, the gene for the DAHP synthase was provided on the separate, compatible plasmid called pKIMP-ACG. pKIMP-ACG is based on pKIMP-UAUC [21] and carries the tandem Psal PT7-controlled aroG from M. tuberculosis encoding MtDS, in addition to the helper functions tyrA* and pheC, and also a copy of nahR encoding the transcriptional activator of the sal promoter [40]. For its construction, plasmid pKIMP-UAUC was digested with SpeI and subsequently treated with calf intestinal phosphatase. The product was completely digested with NarI. pKTDS-H [10] was cut to completion with SpeI and partially digested with SphI. Since the resulting desired 2592 bp and undesired 2515 bp fragments were not separable by agarose gel electrophoresis, they were used as a mixture for subsequent ligation (in equimolar ratio) with both the 4993 bp SpeI-NarI fragment of pKIMP-UAUC and the self-annealing oligonucleotide pair (334-trpAfw 5′-CAGCTTAGCCCGCCTAATGAGCGGGCTTTTTTTGG and 335-trpArv 5′-CGCCAAAAAAAGCCCGCTCATTAGGCGGGCTAAGCTGCATG), which forms a trpA transcriptional terminator (with NarI and SphI-compatible ends) and which was included to circumvent transcriptional coupling of the pheC and nahR genes. The ligation product was used to transform competent KA12 cells and a clone (pKIMP-ACG; 7623 bp) containing the correct (2592 bp) fragment from pKTDS-H was identified by sequencing.

A Generally Applicable Plasmid-borne T7 RNA Polymerase System
Plasmid pT7POLTS (5901 bp) was used as a new and convenient plasmid-based T7 RNA polymerase gene expression system. It contains a Ptet-controlled T7 RNA polymerase gene genetically fused in-frame to the DNA encoding a C-terminal SsrA degradation tag. The SsrA tag targets the polymerase to cellular degradation machineries, such as the ClpXP or ClpAP protease complexes [27,28]. Thereby, intracellular T7 RNA polymerase concentrations are kept very low in the uninduced state, which prevents toxicity caused by the gene to be overexpressed during the growth phase of the production culture. After induction with Tet, SsrA-tagged T7 RNA polymerase accumulates, presumably because its level exceeds the degradation capacity of the protease systems [41], leading to massive transcription from the PT7-controlled gene of interest that is present on a second plasmid.
In E. coli strains that do not carry an endogenous Tet resistance, gene expression from Ptet was shown to be homogeneous in each cell, and it responded over 2 to 3 orders of magnitude with little cooperativity (in the case of the moderate-copy number pT7POLTS) to increasing Tet concentrations up to 100 ng/mL, before the antibiotic became toxic [22]. For Tet-resistant E. coli strains, such as strain KA12 used in this work, which harbors the tetR-tetA regulatory system from Tn10, the dose-response profile is fully linear and shifted to higher Tet concentrations up to 5 μg/mL [22]. This shift occurs because the TetA resistance determinant is an antiporter that lowers the intracellular inducer concentration by coupling Tet efflux to the H+ gradient across the membrane.
Additionally, pT7POLTS carries a p15A origin of replication, a chloramphenicol resistance gene, and the tetR repressor gene responsible for tight control of T7 RNA polymerase transcription. The assembly of pT7POLTS (previously also referred to as pAC-Ptet-T7pol-S) is described in detail elsewhere [29]; its entire nucleotide sequence is listed as S2 Fig. We expect our pT7POLTS system, which circumvents the need for a chromosomally integrated T7 RNA polymerase gene, to be of general use for T7-promoted gene expression also beyond this project.

Gene Expression and Protein Purification
Production of N-terminally His6-tagged MtCM variants (Met-His6-Ser-Ser-Gly tag fused to Met5 of the Mtu sequence in Fig. 3C) was carried out using KA12/pT7POLTS cells transformed with pKTNTET (for wild type) or pKT-CM plasmids (for library members). Cultures were grown in 500 mL LB medium containing 150 μg/mL Na-ampicillin and 30 μg/mL chloramphenicol at 30 °C, and gene expression was induced with 2 μg/mL Tet at an OD600 of 0.3-0.5. The crude lysate was obtained following published protocols [42], except for omitting the RNase A and DNase I treatment. The crude lysate was supplemented with 10 mM imidazole and loaded onto a column packed with an equilibrated His-Select Nickel Affinity Gel (Sigma). The AroQδ variant was eluted with 250 mM imidazole and dialyzed against 20 mM potassium phosphate, pH 7.5.
MtDS protein needed for kinetic assays and native gels was produced in the His-tagged format (containing a Met-His6-Ser-Ser-Gly sequence appended to the natural N-terminus) from KA13/pKTDS-HN as described earlier [10,43].
The concentration of the purified enzymes was determined with the Micro BCA Protein Assay Reagent Kit (ThermoFisher, formerly Pierce) using BSA as a standard, or with a calibrated Bradford assay with BSA values corrected for MtCM and MtDS-specific absorption [10].

Structural Investigation of Purified Proteins
Protein structural integrity was assessed by SDS-PAGE using the PhastSystem (20% homogeneous gels, GE Healthcare) and by electrospray ionization mass spectrometry (ESI-MS) as detailed previously [10]. CD spectroscopy [44] and native PAGE [10] were carried out as described before.

Chorismate Mutase Assays
Individual Michaelis-Menten steady-state kinetic parameters were obtained for free MtCM using a continuous assay on a PerkinElmer Lambda 20 spectrophotometer by monitoring initial rates (v0) of chorismate disappearance at 310 nm (ε310 = 370 M⁻¹ cm⁻¹). The substrate (-)-chorismate was prepared from KA12 as previously described [45]. Typically, five chorismate concentrations ([S]) between 100 and ~2000 μM were tested at 30 °C in 50 mM potassium phosphate buffer, pH 7.5. The data were corrected for the background reaction in the absence of a CM at the same temperature and then iteratively fitted to the Michaelis-Menten equation. The kcat/Km values listed in Table 1 were derived from the ratio of individual kcat and Km parameters, each obtained from the fitting of four to six kinetic measurements and stemming from a single set of all MtCM variants and the wild type, prepared and characterized under identical conditions. Independent isolation of the MtCM variants and kinetic assays under slightly variable conditions resulted in non-systematic data fluctuation with an average standard deviation (σn-1) relative to the values of the homogeneous dataset of typically less than 15%. For optimal comparability within the MtCM dataset, we chose to list in Table 1 the calculated kcat/Km values of the homogeneous set only, and to add a 15% standard deviation to give a conservative estimate of the variation.
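The data reduction described above can be sketched as follows. This is a minimal illustration, not the authors' actual software: it converts an absorbance slope at 310 nm into a rate using the extinction coefficient from the text, then fits v0 against [S] by Gauss-Newton iteration started from a Hanes-Woolf linearization. The 1 cm path length, the function names, and the choice of fitting routine are assumptions.

```python
EPSILON_310 = 370.0  # M^-1 cm^-1, chorismate at 310 nm (value from the text)

def rate_from_slope(dA_per_s, path_cm=1.0):
    """Beer-Lambert: absorbance change per second -> rate in M/s.
    The 1 cm path length is an assumption."""
    return dA_per_s / (EPSILON_310 * path_cm)

def fit_michaelis_menten(s, v, iterations=20):
    """Iteratively fit v = Vmax*S/(Km + S); returns (Vmax, Km)."""
    n = len(s)
    # Starting values from the Hanes-Woolf line: S/v = S/Vmax + Km/Vmax
    y = [si / vi for si, vi in zip(s, v)]
    mean_s, mean_y = sum(s) / n, sum(y) / n
    slope = (sum((si - mean_s) * (yi - mean_y) for si, yi in zip(s, y))
             / sum((si - mean_s) ** 2 for si in s))
    vmax, km = 1.0 / slope, (mean_y - slope * mean_s) / slope
    for _ in range(iterations):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for si, vi in zip(s, v):
            d_vmax = si / (km + si)               # dv/dVmax
            d_km = -vmax * si / (km + si) ** 2    # dv/dKm
            r = vi - vmax * si / (km + si)        # residual
            a11 += d_vmax * d_vmax
            a12 += d_vmax * d_km
            a22 += d_km * d_km
            b1 += d_vmax * r
            b2 += d_km * r
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            break
        # Solve the 2x2 normal equations for the Gauss-Newton step
        vmax += (a22 * b1 - a12 * b2) / det
        km += (a11 * b2 - a12 * b1) / det
    return vmax, km

def kcat_over_km(vmax, km, enzyme_conc):
    """kcat/Km, with Vmax and enzyme_conc in the same concentration unit."""
    return (vmax / enzyme_conc) / km
```

With [S] in μM and v0 in μM/s, the fit returns Vmax in μM/s and Km in μM; dividing Vmax by the enzyme concentration in μM gives kcat in s⁻¹.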
kcat/Km estimates for the CM activity of the MtCM variants in the presence of an excess of MtDS (to ensure complex formation) were obtained as described previously [10]. Initial velocities (v0(MtDS+MtCM)) of a mixture of 30 [...] Table 1 are the means of two independent determinations, with the standard deviations (σn-1) for the ratios of the individual parameters calculated by error propagation of the experimental σn-1 values.
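Error propagation for a ratio such as kcat/Km can be sketched as below. The sketch assumes first-order (Gaussian) propagation with uncorrelated errors in kcat and Km; the text does not state which formula was used, and the function name and example numbers are hypothetical.

```python
import math

def ratio_with_error(kcat, sd_kcat, km, sd_km):
    """Return (kcat/Km, propagated standard deviation).
    Assumes uncorrelated errors and first-order propagation."""
    ratio = kcat / km
    relative_sd = math.sqrt((sd_kcat / kcat) ** 2 + (sd_km / km) ** 2)
    return ratio, ratio * relative_sd

if __name__ == "__main__":
    # Hypothetical values: kcat = 10 +/- 1 s^-1, Km = 100 +/- 10 uM
    ratio, sd = ratio_with_error(10.0, 1.0, 100.0, 10.0)
    print(f"kcat/Km = {ratio:.3f} +/- {sd:.3f} uM^-1 s^-1")
```

The relative errors add in quadrature, so two parameters with 10% uncertainty each give a ratio with roughly 14% uncertainty.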