Directed Evolution of Human Heavy Chain Variable Domain (VH) Using In Vivo Protein Fitness Filter

Human immunoglobulin heavy chain variable domains (VH) are promising scaffolds for antigen binding. However, VH is an unstable and aggregation-prone protein, which hinders its use for therapeutic purposes. To evolve the VH domain, we performed in vivo protein solubility selection that linked antibiotic resistance to the protein folding quality control mechanism of the twin-arginine translocation pathway of E. coli. After screening a human germ-line VH library, 95% of the VH proteins obtained were identified as VH3 family members; one VH protein, MG2x1, stood out among separate clones expressing individual VH variants. Further screening of a combinatorial framework mutation library of MG2x1 revealed a consistent bias toward substitution with tryptophan at positions 50 and 58 of VH. Comparison of the crystal structures of the VH variants revealed that these substitutions with bulky side chains filled the cavity in the VH interface between the heavy and light chains of the Fab arrangement, and were accompanied by an increased number of hydrogen bonds, decreased solvation energy, and increased negative charge. Accordingly, the engineered VH acquired increased thermodynamic stability, reversible folding, and soluble expression. A library built with the VH variant as a scaffold was validated: most randomly selected VH clones were expressed in soluble form in E. coli regardless of the length of the combinatorial CDR. Furthermore, the non-aggregating nature of the selected VH conferred an absence of humoral response in mice, even when the protein was administered together with adjuvant. This selection therefore provides an alternative directed evolution pathway for unstable proteins, distinct from conventional methods based on phage display.


Introduction
The variable domain of the heavy or light chain (VH or VL) of a human immunoglobulin G (IgG) molecule is the smallest part of the antibody that preserves the original binding activity. Although variable domains have short serum half-lives and lack effector function, their format flexibility allows these defects to be ameliorated by adopting an immune-cell-engaging strategy or introducing a long-acting module [1][2][3]. Furthermore, their ability to access occluded or hidden epitopes, superior bio-distribution, and cost-effective production make variable domains potentially useful in therapeutic applications for which full IgG molecules are not appropriate [4][5][6].
The instability of VH and VL of human IgG when they are not assembled with each other has been a major concern for biotechnological applications since Ward et al. reported that such VH domains are relatively sticky and tend to aggregate [7]. This aggregation is primarily due to interactions between hydrophobic patches residing at the interface between VH and VL. Direct replacement of the interfacial hydrophobic residues of VH or VL with hydrophilic amino acids has been partially successful in improving protein stability. Three hydrophilic substitutions (G44E/L45R/W47G) improve the solubility of VH [8][9][10], but these changes also decrease expression yield and thermal stability because they deform the β-sheet structure [11][12][13].
In addition to rational mutation strategies, several groups have adopted combinatorial approaches to engineer human VH or VL. Jespers et al. screened a combinatorial CDR library for aggregation-resistant VH by phage display panning on protein A under heat-denaturing conditions [14]. They found that mutations in the CDRs of human VH can increase solubility and promote reversible folding. Without the use of heat denaturation in phage display, Barthelemy et al. isolated various mutant VH domains with increased stability and solubility [15]. To eliminate the complicated step of in vitro protein A panning, To et al. selected monomeric human VH domains directly from bacterial lawns by plaque size [16]. Despite these variations in technique, most screenings of engineered VH domains have been conducted using phage display and protein A-binding activity.
On the other hand, in vivo genetic selection methods distinct from in vitro phage display have been applied in efforts to improve protein solubility [17,18]. In one such in vivo method, the twin-arginine translocation (Tat) pathway was exploited as an in vivo protein fitness filter for fast folding and solubility of proteins of interest, including single-chain Fv [19][20][21]. However, such approaches have not been attempted for VH or VL alone. In the current study, we applied this system to evolve human VH toward greater stability and characterized the structural hallmarks of greater stability and solubility.

Ethics Statement
All animal experiments were performed in accordance with the guidelines for the care and use of laboratory animals recommended by the Ministry of Food and Drug Safety of the Republic of Korea. The experimental procedures were approved by the Mogam Animal Care and Use Committee. This committee has since been renamed the Green Cross Central Research Laboratory Animal Care and Use Committee, which reviewed and approved the closing report of the animal experiments.

Construction of the Tat-based genetic selection vector
The vector system for screening of stable VH domains was modified from that described previously [20,22]. Briefly, TEM-1 β-lactamase (BLA) was ligated with the Tat signal sequence of trimethylamine N-oxide reductase (ssTorA) of E. coli in pET9a, yielding pET-TAPE (Figure 1A). Next, a fusion gene of ssTorA with the representative human immunoglobulin heavy chain variable domain VH family type 2 (VH2) was synthesized (GenScript, USA). VH2 was used as a template for PCR using a 5′ primer (Table S1, primer 1) including an NdeI restriction site and a 3′ primer (Table S1, primer 2) including a NotI site, a 6×His tag, and a BamHI site, to yield the NdeI-ssTorA-VH2-NotI-6×His-BamHI gene. This gene was inserted between the NdeI and BamHI sites in the multi-cloning site of pET9a to yield pET9a-ssTorA-VH2. The NotI-BLA-BamHI segment was generated by PCR (Table S1: primer 3 as sense, primer 4 as antisense) using BLA as a template. This gene was inserted between the NotI and BamHI sites of pET9a-ssTorA-VH2, yielding pET9a-ssTorA-VH2-BLA, which was named pET-TAPE. A synthetic or human germ-line VH library was constructed by replacing the VH2 gene in pET-TAPE.

Library design and construction
cDNA for the human VH library was obtained by reverse transcription of mRNAs from the liver, peripheral blood mononuclear cells, spleen, and thyroid (Clontech, Madison, WI, US) using various primers (Table S1: primers 5-12 as sense, and primers 13-15 as antisense). Each cloned human VH gene family (VH1, VH3, and VH5) was inserted between the NdeI and BamHI sites of pET-TAPE, yielding a pET-TAPE-VH library with approximately 10^9 distinct clones. For the frame-mutation library, mutations were introduced by PCR using MG2x1 as the template and primers that introduced mutations in the first fragment (Table S1: primers 16 and 17) and the second fragment (Table S1: primers 18 and 19). Next, MG2x1 variant genes were synthesized by overlapping PCR of the two gene fragments using primers 16 and 19 (Table S1). After digestion of the MG2x1 variants with NcoI and NotI, the inserts were cloned into pET-TAPE, yielding the frame-mutation VH library with approximately 10^8 distinct clones.

Setup for Tat-associated protein engineering (TAPE) system
In addition to constructing pET-TAPE, we established a protocol that combines liquid culture with recovery of correctly sized genes of interest, so that protein solubility can be screened in a high-throughput manner. The antibiotic resistance of E. coli is correlated with the translocation of soluble VH-BLA fusion protein into the periplasm via the Tat pathway. The TAPE system differs from previously described systems [20] in that soluble proteins are enriched in consecutive rounds of liquid culture with increasing concentrations of antibiotic. E. coli T7 Express LysY/Iq was transformed with the pET-TAPE-VH library by electroporation. Transformants were cultured in SOC (20 g/l Bacto tryptone, 5 g/l Bacto yeast extract, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4, and 20 mM glucose) at 37°C for 1 h, and then inoculated and cultured in liquid LB media containing 50 mg/ml ampicillin. When the OD (600 nm) reached 0.6, cells were collected by centrifugation and plasmid DNA was isolated. To prevent enrichment of false positives in subsequent rounds of selection, isolated plasmids were digested with NcoI and BamHI, and the digests were subjected to gel electrophoresis to allow size selection of full-length VH-BLA genes. The size-selected VH-BLA genes were cloned between the NcoI and BamHI sites of pET-TAPE, and the resultant plasmids were transformed into E. coli. Subsequently, liquid culture was performed in repeated rounds with stepwise increases in the concentration of ampicillin up to 500 mg/ml (Figure 2). After 3-5 consecutive cycles of liquid culture, clones were separated on an LB agar plate containing ampicillin and 50 mg/ml kanamycin.

Host strains and plasmids
E. coli T7 Express LysY/Iq (New England BioLabs, MA, USA) was used as the host for the expression of the VH domains and their fusion proteins. pET9a (New England BioLabs, MA, USA) was used to construct the TAPE system, i.e., for expression of fusion proteins of various VH domains and BLA. pET22b (New England BioLabs, MA, USA) was used to express the VH domain alone. All other DNA manipulations were conducted according to common methods.

Fractionation of soluble and insoluble VH
To determine the degree of soluble expression, individual VH domains alone (i.e., without the BLA fusion) were expressed in E. coli. The soluble and insoluble fractions were separated after induction of VH expression, followed by SDS-PAGE. Soluble and insoluble proteins were fractionated in lysis buffer (B-PER Reagent, Thermo Scientific, USA). The pellet was washed with PBS, and then resuspended in solubilization buffer (pH 7.4, 50 mM NaH2PO4, 6 M urea, 0.5 M NaCl, and 4 mM DTT) to obtain the insoluble fraction. Each fraction was prepared from the same quantity of cells to allow band intensities to be compared after gels were stained with Coomassie blue.

Circular Dichroism
Purified VH domains were diluted to 0.2 mg/ml. The purity of the VH domains used for CD measurement was demonstrated by SDS-PAGE (Figure S1). CD was measured using a spectropolarimeter (Jasco J-715 model, Jasco Inc, Easton, MD, US). Tm was defined as the temperature at which a 50% reduction in the soluble protein fraction was observed. The profile was recorded at a wavelength of 235 nm as the temperature gradually increased from 25 to 85°C at a rate of 1°C/min. All CD measurements were repeated 3 times for each VH domain. The p-value (paired t-test) between two VH domains was less than 0.005 for all possible pairs of the tested VH domains.
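As an illustration of how a midpoint temperature can be read from a thermal melting profile such as the one described above, the following minimal sketch normalizes a signal between its first and last points and linearly interpolates the temperature at which the normalized value crosses 0.5. The function name `melting_temperature` and the synthetic two-state curve (midpoint 65°C) are assumptions made for illustration only; they are not data or analysis code from this study, and the actual fitting procedure used with the spectropolarimeter may differ.

```python
import math


def melting_temperature(temps, signal):
    """Estimate the midpoint of a melting curve by linear interpolation.

    Assumes the signal decreases from a folded baseline (first point)
    to an unfolded baseline (last point).
    """
    s_start, s_end = signal[0], signal[-1]
    span = s_start - s_end
    if span == 0:
        raise ValueError("flat curve: no transition to analyse")
    # Normalize so that the folded baseline is 1 and the unfolded baseline is 0.
    frac = [(s - s_end) / span for s in signal]
    for i in range(len(frac) - 1):
        f0, f1 = frac[i], frac[i + 1]
        if f0 >= 0.5 > f1:
            t0, t1 = temps[i], temps[i + 1]
            # Linear interpolation between the two bracketing points.
            return t0 + (f0 - 0.5) * (t1 - t0) / (f0 - f1)
    raise ValueError("normalized signal never crosses 0.5")


# Synthetic (hypothetical) curve: 25 to 85 C in 1 C steps, midpoint at 65 C.
temps = list(range(25, 86))
signal = [1.0 / (1.0 + math.exp((t - 65.0) / 2.0)) for t in temps]

tm = melting_temperature(temps, signal)
print(f"Estimated Tm: {tm:.1f} C")
```

With real data, baseline drift before and after the transition would usually need to be corrected before the midpoint is read off.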

Recovery yield
The recovery yield was defined as the level of soluble VH after heat denaturation. After aggregates were removed by centrifugation, the concentration of soluble VH was determined according to the equation c = A/(ε × b), where A is the absorbance at 280 nm, ε is the molar extinction coefficient (M⁻¹ cm⁻¹), b is the path length (cm), and c is the molar concentration (mol/l). The extinction coefficient was calculated using the amino acid composition, assuming that all pairs of cysteine residues were involved in disulfide bonds (web.expasy.org/protparam). Protein quality was confirmed by size-exclusion chromatography.

Figure 1. Verification of the selection system, TAPE. (A) Plasmid map of pET-TAPE. (B) Average number of ampicillin-resistant colonies from cultures harboring constructs for expression of a negative control (no Tat signal sequence, VH3-BLA [-]), positive control (Tat signal sequence and reporter gene only, ssTorA-BLA [+]), and published VH domains (HuCal VH2, HuCal VH3, Dp47d, and HEL4). Each construct was expressed in LB medium containing 50 mg/ml ampicillin. Cultures were induced by the addition of 1 mM isopropyl β-D-1-thiogalactopyranoside for 3 h after inoculation. After the induction, cultures were spread onto agar plates containing 50 mg/ml ampicillin for colony counting. Data points are means and standard deviation for three independent experiments. doi:10.1371/journal.pone.0098178.g001

Humoral immune response of mice to the screened VH
BALB/c mice (six per group) were intravenously injected with 10 mg MG2x1, MG8-14, or VHH on 3 consecutive days. The injections were repeated at weeks 1, 4, and 8. Samples of immune sera were obtained every week, and mice were sacrificed at day 65. For intramuscular and subcutaneous injections, BALB/c mice (six per group) were injected with 1 or 10 mg of MG8-14 or VHH. VHH is identical to VHH #3E, which binds to tumor necrosis factor-α [23]. The injection was repeated every 2 weeks for a total of five injections. The mice were sacrificed 2 weeks after the final injection. Samples of immune sera were obtained every 2 weeks, 1 day before the next injection. To measure antibody titers, enzyme-linked immunosorbent assays were performed using 96-well plates coated with MG2x1, MG8-14, or VHH, and HRP-labeled goat anti-mouse antibody as a secondary antibody, followed by the addition of 3,3′,5,5′-tetramethylbenzidine and measurement of OD (490 nm).
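The concentration calculation in the Recovery yield section (c = A divided by the product of the extinction coefficient and the path length) can be sketched as follows. The extinction coefficient uses the composition-based formula applied by ProtParam at 280 nm (5500 per tryptophan, 1490 per tyrosine, 125 per cystine, in M⁻¹ cm⁻¹). The function names, the toy sequence, and the expression of recovery yield as the ratio of concentrations after and before heating are assumptions for illustration, not code or data from this study.

```python
def extinction_coefficient(sequence, all_cys_paired=True):
    """Molar extinction coefficient at 280 nm (M^-1 cm^-1) from composition.

    Uses 5500 per Trp, 1490 per Tyr and 125 per cystine (disulfide).
    """
    n_trp = sequence.count("W")
    n_tyr = sequence.count("Y")
    # Every pair of cysteines is assumed to form one disulfide bond.
    n_cystine = sequence.count("C") // 2 if all_cys_paired else 0
    return 5500 * n_trp + 1490 * n_tyr + 125 * n_cystine


def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * b), in mol/l."""
    return a280 / (epsilon * path_cm)


def recovery_yield(conc_after, conc_before):
    """Fraction of soluble protein remaining after heat treatment (assumed definition)."""
    return conc_after / conc_before


# Hypothetical toy sequence, not a real VH domain.
toy_sequence = "WWYCC"
eps = extinction_coefficient(toy_sequence)

# Hypothetical absorbance readings before and after heating.
c_before = molar_concentration(0.80, eps)
c_after = molar_concentration(0.76, eps)
print(f"epsilon = {eps} M^-1 cm^-1")
print(f"recovery yield = {recovery_yield(c_after, c_before):.2%}")
```

Because the same extinction coefficient and path length apply before and after heating, the yield in this sketch reduces to the ratio of the two absorbance readings.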

Verifying TAPE
To verify whether the TAPE system can discriminate between proteins of different solubilities, we applied this system to various published VH domains whose soluble expression levels are well known. The VH domains were cloned into the pET-TAPE vector (Figure 1A), allowing them to be expressed in E. coli as fusions with BLA and the Tat signal sequence of ssTorA. The antibiotic resistance of strains carrying each construct was measured by counting the cell number in cultures containing 50 mg/ml ampicillin. Cells expressing BLA alone (ssTorA-BLA [+], positive control) exhibited the highest resistance, and cells expressing HEL4 were approximately as resistant as the positive control (Figure 1B) [22]. Cells expressing the other representative VH3 family genes, Dp47d and VH3 (HuCAL), exhibited resistances intermediate between those of the positive and negative controls [24,25]. The resistance of cells expressing the antibiotic resistance gene with no Tat signal sequence (VH3-BLA [-], negative control) was lower than that of cells expressing any other construct, with the notable exception of the VH2 (HuCAL) construct (ssTorA-VH2-BLA). Cells expressing VH2 exhibited the lowest ampicillin resistance, even lower than that of the negative control. Since VH2 was expressed almost exclusively as inclusion bodies (Figure 3A), the biostatic effect of VH2 aggregate formation in E. coli might have further slowed cell growth beyond the bactericidal effect of the antibiotic.

Screening of human germ-line VH library via TAPE
In the Tat-associated screening system using ampicillin-containing agar plates, false-positive clones containing small VH peptide fragments were often enriched because such fragments are highly compatible with the Tat pathway. To overcome this problem, previous screens have included a step to exclude clones with excessively high antibiotic resistance (i.e., counter-selection) [20]. In this study, to perform VH solubility screening in a high-throughput manner, we enriched antibiotic-resistant clones in liquid cultures ('liquid screen') containing various concentrations of ampicillin (50-500 mg/ml) (Figure 2). Furthermore, to avoid enrichment of short VH gene fragments that might yield false-positive results, full-size VH-BLA fusion genes were recovered by gel purification. In contrast to the limited library size of the plate-based method, the liquid screen with a culture larger than 100 ml can cover library sizes greater than 10^9 because 1 ml of an overnight culture of E. coli in LB with ampicillin normally contains about 10^9 cells. The size of the human germ-line VH library for TAPE was about 2.17 × 10^9.
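The coverage argument above is simple arithmetic: the number of cells in the culture divided by the library size gives the average number of cells per distinct clone. A minimal sketch, using the figures quoted in the text (100 ml culture, about 10^9 cells per ml, library of 2.17 × 10^9 clones); the function name `fold_coverage` is hypothetical, and the calculation ignores uneven clone representation and transformation efficiency.

```python
def fold_coverage(volume_ml, cells_per_ml, library_size):
    """Average number of cells per distinct library clone in a culture."""
    return volume_ml * cells_per_ml / library_size


# Figures quoted in the text; real cultures will vary.
coverage = fold_coverage(volume_ml=100, cells_per_ml=1e9, library_size=2.17e9)
print(f"Approximate coverage: {coverage:.0f}-fold")
```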
After the third round of TAPE through selection of antibiotic resistance, 154 VH sequences were selected from the human germ-line VH library that had been constructed using primers specific for the VH1, VH3, and VH5 families. These 154 VH sequences were classified into 19 different VH family types. Of the 154 total VH hits, 146 (94.8%) were identified as members of the VH3 family; this frequency is significantly higher than the VH3 family frequency in the library prior to TAPE (101 VH3 family members out of 144 sequences: 70.1%). Among the VH3 family genes isolated from the germ-line VH library, the VH3-30 and VH3-23 genes were predominant. On the other hand, the frequencies of the VH1 and VH5 families decreased by 0.1-fold and 0.3-fold, respectively. Overall, as a result of TAPE, the VH3 family was enriched 1.4-fold (i.e., from 70.1% to 94.8%), whereas the other families became less abundant (Table 1).
To determine the degree of soluble expression of isolated individual VH domains lacking the BLA fusion, the soluble and insoluble fractions were separated after expression of the corresponding genes, and their expression patterns were compared with those of various VH domains published previously [24,25]. VH domains randomly selected from the germ-line VH library were expressed predominantly as inclusion bodies (Figure 3B, RD1-3), whereas the soluble expression levels of VH domains selected by TAPE, e.g., MG4x4-44, MG4x4-25, MG10-10, and MG2x1, were significantly increased (Figure 3B). Moreover, the VH domains selected by TAPE exhibited a higher ratio of soluble to insoluble protein than the previously characterized VH domains described above, i.e., VH2 (HuCAL), VH3 (HuCAL), VH6 (HuCAL), VH3 (DP47d), and HEL4 (Figure 3A).
An artificial library comprising 25 individual VH domains, either selected from the germ-line library or previously characterized (HEL4, DP47d, HuCal VH3, and HuCal VH2), was subjected to TAPE. Only one clone, MG2x1, grew out at the third round of TAPE. This clone was used as the backbone for the frame-mutation library with selected mutation sites, described below.
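The VH3 enrichment factor quoted above follows directly from the sequence counts before and after selection. The short sketch below reproduces that arithmetic from the counts given in the text; the helper name `frequency` is hypothetical.

```python
def frequency(hits, total):
    """Fraction of sequences belonging to a given family."""
    return hits / total


# Counts taken from the text: VH3 members before and after TAPE.
before = frequency(101, 144)
after = frequency(146, 154)
fold = after / before

print(f"VH3 before TAPE: {before:.1%}")
print(f"VH3 after TAPE:  {after:.1%}")
print(f"Enrichment:      {fold:.1f}-fold")
```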

Screening of the frame-mutation library of MG2x1 via TAPE
To confer additional solubility and stability on MG2x1, combinatorial mutations were introduced at seven specific sites of MG2x1 to generate the MG2x1 frame-mutation library. The number of distinct clones in the library was 1.4 × 10^8, which covers all the possible combinations of mutations with the NNK degenerate codon (theoretically, 6.4 × 10^7 combinations). The selected mutation sites are distributed over the CDRH1 (S35), frame 2 (Q39, L45, and W47), and the CDRH2 (A50, Y58, and A60) according to the Kabat numbering system (Figure 4A, residues in red). These sites were selected by referring to the crystal structure of MG2x1 (PDB

Soluble expression level and thermodynamic stability are correlated in VH domains selected by TAPE
Among the hits obtained from the combinatorial frame-mutation library of MG2x1, 23 unique sequences were selected from the final round of TAPE. Most of the selected VH domains were expressed as soluble proteins. In particular, MG8-14, MG2-55, MG4-5, MG4-13, MG8-4, and MG8-6 were expressed exclusively in their soluble forms (Figure 5). A previous study using the Tat pathway to express a protein fused to an antibiotic resistance marker showed that the ability to confer growth was correlated with both the solubility profile and the molecular weight of the protein [26]. The thermodynamic stabilities of the VH domains selected from the naïve human VH library by TAPE were higher than those of wild-type VH3 domains. The melting temperatures (Tm) of the selected germ-line VH domains were 55.6-65.2°C, whereas the Tm values of the randomly chosen VH domains from the germ-line library were generally below 50°C, e.g., 46.5°C for VH3-15 (Figure 6A). Among the selected germ-line VH domains, MG2x1 had the highest Tm. Furthermore, the Tm values of VH domains selected from the combinatorial frame-mutation library of MG2x1 (65.2-77.5°C) were significantly higher than that of the parental VH (MG2x1) (Figure 6B). The thermodynamic stabilities of the engineered VH domains identified in this study were generally higher than that of HEL4, which was selected from a combinatorial CDR3 library based on Dp47d by heat-resistant phage display selection [24].

Selected VH domains fold autonomously after denaturation
Proteins exist in thermodynamic equilibrium between their folded and unfolded states. Hence, unstable proteins are much more vulnerable to heat and pH disturbance because exposure of their hydrophobic core while they occupy the unfolded state promotes aggregation. Many VH3 family domains are soluble and aggregation-resistant. However, once these proteins are denatured, they do not refold into their native conformation. This was the case for all VH domains selected from the germ-line library in this study, including MG2x1. However, some of the VH domains selected from the frame-mutation library of MG2x1 by TAPE folded reversibly after denaturation. Far-UV circular dichroism (CD) spectra suggested that MG8-14 could be reversibly folded after heat denaturation at 85°C (Figure 7C), whereas the parental VH domain, MG2x1, could not (Figure 7A). Furthermore, the modified MG8-14 [L50W] had a perfect renaturation profile (Figure 7D). MG8-6 had the highest Tm, but could not refold after denaturation (Figure 7B). The recovery yield for the selected VH after heat denaturation reached 95% (Table 3), in contrast to that of the parental sequence (MG2x1), which was below 5%.

Structural features underlying the superior biophysical properties of selected VH domains
Superimposition of crystal structures of the parental VH, MG2x1 (PDB ID: 3ZHK), and the modified VH domains MG8-4 (PDB ID: 3ZHD) and MG8-14 (PDB ID: 3ZHL) revealed that these proteins have the same overall topology: two β-sheets connected by a disulfide bond between C22 and C96, yielding a typical β-sandwich lectin fold structure (Figure 8A and Table S2). The random amino acid changes introduced in the combinatorial frame-mutation library of MG2x1 are positioned on the β-strand that forms the sandwich scaffolds; in particular, they are located on the side of the sandwich corresponding to the hydrophobic interface region between heavy and light chains in the typical Fab complex arrangement (Figure 8B). Mutations in MG8-4 and MG8-14 altered the conformation of the flexible CDRH3 loop, whereas the CDRH1 and CDRH2 loops remained in their original conformations (Figure 8C and Table S2).
Surface electrostatic calculations revealed that MG8-4 and MG8-14 exhibited increased partial negative charge next to the hydrophobic patch, possibly due to the introduction of a charged group such as aspartate (D) at position 60, whereas substantial positive charge was detected next to the exposed surface of the heavy chain in all three structures (MG2x1, MG8-4, and MG8-14) (Figure 9A and 9B). The solvation energies of MG8-4 and MG8-14 (−1166.8 kcal/mol and −1153.4 kcal/mol, respectively) were significantly lower than that of MG2x1 (−1047.5 kcal/mol), suggesting that the charged residues on the surface contribute to the solvation energy, and hence the solubility, of the protein. Analysis of surface features revealed a significantly increased number of hydrogen bonds between side chains of the residues of MG8-4 and MG8-14 (26 and 39, respectively), whereas only 19 hydrogen bonds were observed in MG2x1, indicating that the architecture of MG8-14 is more stable than that of MG2x1. In addition, the structures of MG8-4 and MG8-14 contained more charge-charge interactions (8 and 9, respectively) than the structure of MG2x1 (5) (Table 4).
MG2x1 contains a prominent pocket comprising residues W47, A50, and Y58, with a cavity area of 32 Å² and a volume of 19.5 Å³, centered at residue A50 (Figure 8B and Figure 9C). Sequence analysis of VH domains selected by TAPE revealed that two positions in the framework, A50 and Y58, were consistently biased toward W. Residue A50 was also replaced by leucine (L) or W in representative selected VH domains such as MG8-4, MG8-14, MG8-6, and MG4-13, suggesting that replacement of this residue with a bulky side chain is related to the stability of the molecule. The structural model of the modified MG8-14 [L50W] suggests that the cavity is filled with a triad of bulky side chains consisting of W50, W47, and W58 (Figure 9D). Accordingly, the modified MG8-14 [L50W] exhibited high thermodynamic stability as well as reversible folding after heat denaturation (Figure 7D).

Validation of the combinatorial CDRH synthetic library built on MG8-14 scaffold
To confirm the effects of CDR variation on the stability of the VH scaffold, we examined the soluble expression level of VH domains containing CDRH3 regions of various lengths (7-13 amino acids), using a combinatorial CDRH synthetic library based on MG8-14. Eight or nine different sequences of each length were randomly selected and expressed in E. coli; 64 of 73 (88%) VH clones were expressed in soluble form. In addition, 11 different sequences from a rational mutation library (CDRH3 length fixed and seven positions of CDRH1, 2 and 3 of MG8-14 randomized) were randomly tested; all of the test sequences were expressed in soluble form in the cytoplasm of E. coli under reducing conditions (Figure 10). Thus, aggregation occurred infrequently regardless of CDR alteration in a combinatorial CDRH library that used the MG8-14 framework as a scaffold.

Table 3. The recovery yields of selected VH after thermal stress. Data are means and standard deviation for three independent heat denaturation treatments within the same sample. The recovery yield was defined as the fraction of soluble VH remaining after heating at the denaturation temperature (85°C). doi:10.1371/journal.pone.0098178.t003

Humoral response to MG2x1 and MG8-14 in mouse
To test the humoral immune response to the selected VH domains, BALB/c mice were subjected to repeated immunization with selected VH domains, administered by various routes. Antibody against MG2x1 was undetectable after nine intravenous injections of 10 mg protein over 9 weeks (Figure 11A). Furthermore, there was no antibody-boosting response, even when injections included Freund's Complete Adjuvant (CFA), in four of six mice at week 9 (Figure 11A). In the case of MG8-14, there was no detectable anti-MG8-14 antibody until week 6, although a mild antibody response was present in half of the tested mice at week 9 (Figure 11A). On the other hand, a camel single-domain antibody, VHH [23], was more immunogenic than MG2x1 and MG8-14, as shown by the high titer after the first injection (with CFA) at week 3 (Figure 11B). Intramuscular and subcutaneous injection of 1 mg MG8-14 resulted in no antibody response against MG8-14 throughout a 10-week course of immunization (Figure 11D), whereas VHH injection caused an increase in antibody titer starting at week 6 (Figure 11E). When mice were injected intramuscularly with 10 mg MG8-14, a moderate anti-MG8-14 antibody response was elicited at week 10 in only one of six mice. Subcutaneous injection of 10 mg MG8-14 elicited no antibody response until the fourth injection at week 6; moderate levels of anti-MG8-14 antibody were detectable after this time point (Figure 11D). Among mice subjected to intramuscular and subcutaneous injection of VHH, most animals exhibited an anti-VHH antibody response at week 4, immediately after the second injection (Figure 11E).

Discussion
The external diameter of the TatABC complex is around 160 Å, but its pore is relatively small [27]. Variations in complex size may result in variations in pore size, influencing the compatibility of each complex with differently sized Tat substrate proteins [28]. The capacity of the Tat system to export proteins via membrane-bound TatABC complexes varies among species of Gram-negative bacteria. For example, the A. tumefaciens TatABC complex is capable of exporting large (>80 kDa) proteins [29], whereas in E. coli, the correlation between protein folding and export to the periplasm via the Tat pathway is poorer for proteins larger than 30 kDa than for proteins of lower molecular weight [26]. The molecular weight of the VH domain is around 14 kDa; therefore, this group of proteins was predicted to be compatible with the Tat pathway of E. coli. Consistent with this expectation, in this study, the export of VH in vivo corresponded well with properties related to protein stability in vitro. Accordingly, because the VH3 family is the most soluble of the seven VH families (VH1-7), the VH3 family was enriched via TAPE (Table 1) in a screen of a human germ-line library. This suggests that selection was driven by the function of the Tat pathway, which serves as a 'molecular sieve' in vivo, as discussed in many previous works [28,30,31].
We attempted to compare the ampicillin resistance of VH variants by visual measurement. However, spot analysis of serial dilutions of cultures containing ampicillin [22], for example, was not sensitive enough to allow direct comparison of their resistance in this study (data not shown). To overcome this limitation, we performed a head-to-head competition for ampicillin resistance among the 25 germ-line VH domains (the artificial library) through the third round of selection in liquid culture. MG2x1, a VH3 family member (VH3-23), emerged as the sole survivor of this experiment and was used as the backbone of a frame-mutation library. This library was then subjected to another round of TAPE, with the goal of improving the physicochemical properties of this protein. Considering that MG2x1 is already relatively soluble and stable, one might expect only a marginal improvement from directed evolution via TAPE. However, subjecting the frame-mutation library to selection resulted in a significant improvement in folding-related properties.
Studies of the protein folding quality control mechanism of the E. coli Tat pathway have primarily focused on the tendency of proteins to be expressed in soluble form [19,20]. However, the correlation between selection via Tat-mediated protein folding and increases in the thermodynamic stabilities of proteins of interest has not been clearly demonstrated. In this study, we showed that both protein expression in soluble form and properties related to thermodynamic stability were clearly improved by Tat-associated screening. Foit et al. also demonstrated that antibiotic resistance bestowed by the tripartite fusion protein is correlated with stability in vivo and thermodynamic stability in vitro [32]. Although both methods use the same reporter gene, i.e., BLA, protein folding occurs in a different environment, i.e., the periplasm for the tripartite system and the cytoplasm for TAPE. Under the reducing conditions in which folding occurs in TAPE, some of the evolved VH domains were capable of autonomous refolding over repeated cycles of heating and cooling. More reversible refolding and a higher recovery yield should increase resistance to mechanical or thermal stresses during the purification process, as well as improve long-term storage due to the low exposure rate of hydrophobic patches [33]. Christ et al. demonstrated that the frequency of aggregation-resistant domains was about 80% in the repertoire after heat-cooling selection and about 71% in the large aggregation-resistant repertoire generated by combinatorial ligation of CDR-encoding regions [34]. In this study, the frequency of aggregation-resistant VH domains in combinatorial CDRH3 repertoires with a fixed scaffold (MG8-14) screened by TAPE was 88%, regardless of the length of the CDRH3 region (Figure 10). With the exception of the CDRH3 region, the crystal structures of MG8-4 and MG8-14 superimposed closely with that of the parental VH, MG2x1, despite containing mutations in the frame region (Figure 8C). In addition, residue L50 of MG8-14 had the lowest observed B-factor (32), whereas the average B-factor was 43.3, indicating low atomic mobility at this position. These observations suggest that the core of this region is very rigid, but is still capable of accommodating various structures of CDRH3. Because the framework and CDR regions of a scaffold are conformationally linked, a stability-function trade-off is to be expected when stability-enhancing mutations are introduced into a given functional protein, for example, scFv [20]. In contrast, we first screened for a stable VH scaffold and then generated the combinatorial CDRH synthetic library to add functionality afterward. As the affinities of the VH domains we screened from the library against several antigens, including HER3, TNF-α, and albumin, were all in the sub-nanomolar range, we expect the stability-function trade-off to be minimal when functional VH domains are screened from a library of this quality (data not shown).
The modified MG8-14 [L50W] contains three W residues that fill a large cavity of MG2x1 near the V H /V L interface. Van der Waals interactions in this region would stabilize the architecture, allowing reversible folding of the antibody during refolding after denaturation. Within a cavity structure, high temperature leads to thermal destabilization as a result of water permeation [35,36]. Therefore, water molecules in the hydrophobic cavity of MG2x1 may directly affect thermal resilience and promote structural perturbation. Taken together, these data demonstrate that surface properties are important factors in the selection of single-domain antibodies with high solubility and thermodynamic stability.
V H domains that had been selected by heat-denatured phage display from a combinatorial CDR repertoire exhibited an enrichment of certain amino acids at several positions within the CDR regions, including glycine at position 35 and glutamate at position 32 [37]. Our distinct in vivo selection strategy, using the Tat pathway in E. coli, resulted in a unique preference for tryptophan at positions 50 and 58, introducing a bulky ring structure. We believe that this preference helps V H to acquire a stable conformation, preventing structural perturbation during folding and refolding.
MG2x1 contains a negatively charged amino acid, aspartic acid (D), at position 61, which was previously identified as a determinant of protein aggregation and solubility [38]. In MG8-4 and MG8-14, which were selected from the MG2x1 frame-mutation library, D residues were incorporated at the adjacent positions 60 and 61, significantly increasing the net negative charge. This preference for adjacent D residues has also been observed in other protein stability screens of combinatorial CDR repertoires. For example, positions 32 and 33 of V H and positions 52 and 53 of V L are determinants of aggregation resistance [37].
One important safety issue in protein therapeutics is immunogenicity. Many previous studies suggest that the formation of sub-visible aggregates exerts a major influence on the humoral immune response [39,40]. In this work, the antibody titer represents both the quantity and the quality (affinity) of IgG specific to a given V H domain. Although we cannot determine which factor affects the titer more, it is clear that the mouse immune system barely responded to the selected V H domains, even when they were administered with CFA, compared with V HH (Figure 11). This may be attributed to the favorable folding properties of the selected V H domains, which prevent aggregation, because we employed a Tat-associated protein folding fitness filter.

Database access codes
The atomic coordinates and structure factors have been deposited in the Protein Data Bank, www.pdb.org (PDB IDs: 3ZHL, 3ZHK, and 3ZHD).

Figure 2. Schematic procedure for screening protein solubility using TAPE ('liquid screen'). (A) Construction of the pET-TAPE V H library (either germ-line or mutated) and transformation of the library into E. coli. (B) Liquid culture of the library with stepwise increases in antibiotic concentration. (C) Collection of plasmids and purification of the intact V H -BLA coding region. (D) Re-cloning of the V H -BLA gene into pET-TAPE between the NcoI and BamHI sites, and transformation into E. coli. Steps (B), (C), and (D) were repeated four times for each ampicillin concentration (50, 100, 250, and 500 μg/ml). doi:10.1371/journal.pone.0098178.g002
V H domains chosen randomly from the human V H germ-line library (RD1-3) or selected from the human germ-line library using TAPE (MG4x4-44, MG4x4-25, MG10-10, and MG2x1). Cultures expressing each V H domain were harvested after induction at 25°C for 3.5 h, and soluble (S) and insoluble (I) fractions were prepared. Lane 'MW' contains a protein size marker; the size of each marker is indicated (in kD) to the left of each panel. In both panels, the mobilities of V H domains correspond to the 15-kD protein size marker. Different parts of the separating gels are grouped to align the expression patterns of the soluble and insoluble fractions of each V H domain. doi:10.1371/journal.pone.0098178.g003

ID: 3ZHK) to identify amino acids that stretch their side chains outward from the surface. Also, all of these sites are located in the β-sheet structure, away from the flexible loops of the CDRs. The frame-mutation library of MG2x1 was screened by TAPE, with the concentration of ampicillin increased (50, 100, 250, and 500 μg/ml) in successive rounds. After the final round of TAPE, 41 clones were randomly selected for sequencing of their V H domains. Changes at positions 50 and 58 (Kabat scheme) were biased toward tryptophan (W): alanine (A) at position 50 was replaced by W in 39% (16/41) of the clones, and tyrosine (Y) at position 58 was replaced by W in 58% (24/41) of the clones (Table 2). The other mutation sites were not particularly biased. Sequence alignment of the selected V H domains after TAPE revealed the biased amino acids at positions 50 and 58 (Figure 4B, dashed box). Based on the biased mutation frequencies at positions 50 and 58, we generated an MG8-14 mutant in which leucine (L) at position 50 was replaced with W (MG8-14 [L50W]) for further analyses of its physicochemical properties.
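
As a quick arithmetic check, the substitution frequencies can be recomputed from the clone counts given in the text (16 and 24 of 41 sequenced clones). The sketch below is illustrative only; the function name and the dictionary labels are assumptions and are not part of the original analysis. Note that 24/41 evaluates to 58.5%, which the text reports as 58%.

```python
def percent(count: int, total: int) -> float:
    """Return count/total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Counts reported in the text: 41 clones sequenced after the final TAPE round.
TOTAL_CLONES = 41
# Labels are illustrative shorthand for the substitutions described in the text.
w_counts = {"A50W": 16, "Y58W": 24}

frequencies = {name: percent(n, TOTAL_CLONES) for name, n in w_counts.items()}

for name, pct in frequencies.items():
    print(f"{name}: {pct:.1f}% of {TOTAL_CLONES} clones")
```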

Figure 4. Rationale for the design of the combinatorial frame-mutation library. (A) Positions chosen for randomization based on the crystal structure of MG2x1. Residues are numbered according to the Kabat scheme for the V H sequence. (B) Representative sequences (MG8-4, MG8-14, MG4-13, and MG8-6) selected from the MG2x1 frame-mutation library by TAPE were aligned with the original MG2x1 sequence. Mutation sites in the sequence of MG2x1 are shown as bold dots. All mutations were introduced using degenerate codons (NNK), except that serine (S) 35 was replaced by glycine (G). X represents all amino acids. At positions 50 and 58, the mutations converged primarily onto tryptophan, as indicated by dashed boxes. doi:10.1371/journal.pone.0098178.g004
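
To put the tryptophan bias in context, it can help to ask how often an NNK codon would encode tryptophan by chance. The sketch below enumerates the 32 NNK codons (N = A/C/G/T, K = G/T) against the standard genetic code. It assumes equal nucleotide incorporation at each degenerate position and no selection, which is an idealization rather than something measured in this study; all variable and function names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), with codons ordered
# TTT, TTC, TTA, TTG, TCT, ... (first, second, third base each cycling T, C, A, G).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnk_codons() -> list[str]:
    """Enumerate all NNK codons: any base, any base, then G or T."""
    return ["".join(c) for c in product(BASES, BASES, "GT")]


codons = nnk_codons()
# Number of NNK codons encoding each amino acid ('*' marks a stop codon).
aa_counts = Counter(CODON_TABLE[c] for c in codons)

# Expected Trp frequency under the idealized, unselected assumption.
expected_trp = aa_counts["W"] / len(codons)
print(f"NNK codons: {len(codons)}, Trp codons: {aa_counts['W']}, "
      f"expected Trp frequency: {expected_trp:.3%}")
```

Under these assumptions tryptophan is encoded by 1 of 32 codons (about 3.1%), well below the 39% and 58% observed at positions 50 and 58 after TAPE.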

Figure 10. Validation of the combinatorial CDRH synthetic library built on the MG8-14 scaffold. SDS-PAGE of soluble and insoluble fractions of E. coli expressing V H domains selected randomly from the combinatorial CDRH3 synthetic libraries. Coomassie-stained gels are aligned by lane numbers (columns) and amino acid lengths of CDRH3 (rows). Images depict the region of the gel corresponding to the size of V H. Some images were combined from separate gels for the purpose of alignment (indicated by a dividing bar between gels). 'MW' indicates the protein size marker corresponding to a molecular weight of 15 kD. doi:10.1371/journal.pone.0098178.g010

Table 1. Isolated germ-line V H genes after the third round of TAPE.
a Proportion of each identified V H gene among the 154 sequences selected after TAPE.
b Proportion of each identified V H gene among 144 sequences randomly selected from the library.
c Ratio of '% of V H gene after TAPE' to '% of V H gene before TAPE'.
doi:10.1371/journal.pone.0098178.t001
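
The ratio defined in footnote c can be sketched as follows. The sample sizes (154 sequences after TAPE, 144 before) are taken from the footnotes, but the per-gene clone counts and the gene label used here are purely hypothetical placeholders, because the table values are not reproduced in this section. The function name is also an assumption.

```python
from typing import Optional

TOTAL_AFTER_TAPE = 154   # sequences analysed after TAPE (footnote a)
TOTAL_BEFORE_TAPE = 144  # sequences sampled randomly from the library (footnote b)


def enrichment_ratio(count_after: int, count_before: int) -> Optional[float]:
    """Ratio of '% after TAPE' to '% before TAPE' (footnote c).

    Returns None when the gene was not observed before selection,
    because the ratio is undefined in that case.
    """
    if count_before == 0:
        return None
    pct_after = 100.0 * count_after / TOTAL_AFTER_TAPE
    pct_before = 100.0 * count_before / TOTAL_BEFORE_TAPE
    return pct_after / pct_before


# Hypothetical counts for an unnamed example gene; NOT values from Table 1.
example_after, example_before = 30, 10
ratio = enrichment_ratio(example_after, example_before)
print(f"hypothetical gene: enrichment ratio = {ratio:.2f}")
```

A ratio above 1 would indicate that the gene is over-represented after selection relative to the unselected library.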