## Figures

## Abstract

Immunogenicity is a major problem during the development of biotherapeutics since it can lead to rapid clearance of the drug and adverse reactions. The challenge for biotherapeutic design is therefore to identify mutants of the protein sequence that minimize immunogenicity in a target population whilst retaining pharmaceutical activity and protein function. Current approaches are moderately successful in designing sequences with reduced immunogenicity, but do not account for the varying frequencies of different human leucocyte antigen alleles in a specific population and in addition, since many designs are non-functional, require costly experimental post-screening. Here, we report a new method for de-immunization design using multi-objective combinatorial optimization. The method simultaneously optimizes the likelihood of a functional protein sequence at the same time as minimizing its immunogenicity tailored to a target population. We bypass the need for three-dimensional protein structure or molecular simulations to identify functional designs by automatically generating sequences using probabilistic models that have been used previously for mutation effect prediction and structure prediction. As proof-of-principle we designed sequences of the C2 domain of Factor VIII and tested them experimentally, resulting in a good correlation with the predicted immunogenicity of our model.

## Author summary

Therapeutic proteins have become an important area of pharmaceutical research and have been successfully applied to treat many diseases in the last decades. However, biotherapeutics suffer from the formation of anti-drug antibodies, which can reduce the efficacy of the drug or even result in severe adverse effects. A main contributor to the antibody formation is a T-cell mediated immune reaction caused by presentation of small immunogenic peptides derived from the biotherapeutic. Targeting these peptides via sequence alterations reduces the immunogenicity of the biotherapeutic but inevitably will have effects on structure and function. Experimentally determining optimal mutations is not feasible due to the sheer number of possible sequence alterations. Therefore, computational approaches are needed that can effectively cover the complete search space. Here, we present a computational method that finds provable optimal designs that simultaneously optimize immunogenicity and structural integrity of the biotherapeutic. It relies solely on sequence information by utilizing recent advances in protein *ab initio* prediction and incorporates immunogenicity prediction methods. Thus, the approach presents a valuable tool for bioengineers to explore the design space to find viable candidate designs that can be experimentally tested and further refined.

**Citation: **Schubert B, Schärfe C, Dönnes P, Hopf T, Marks D, Kohlbacher O (2018) Population-specific design of de-immunized protein biotherapeutics. PLoS Comput Biol 14(3):
e1005983.
https://doi.org/10.1371/journal.pcbi.1005983

**Editor: **Roland L. Dunbrack Jr., Fox Chase Cancer Center, UNITED STATES

**Received: **August 31, 2017; **Accepted: **January 15, 2018; **Published: ** March 2, 2018

**Copyright: ** © 2018 Schubert et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability: **All relevant data are within the paper and its Supporting Information files. The solver implementation, the ILP model in AMPL format, and the instances solved are available at https://github.com/FRED-2/EVdeimmunization released under a 3-claus BSD license.

**Funding: **This study was partially funded by Deutsche Forschungsgemeinschaft (www.dfg.de) (BS 685/B1, OK 2313/6-1) and Bundesministerium für Bildung und Forschung (www.bmbf.de) (01GU1106). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests: ** The authors have declared that no competing interests exist.

## Introduction

Protein-based drugs (biotherapeutics) are increasingly used to treat a wide variety of diseases[1, 2]. Although biotherapeutics show high activity and specificity at the initiation of treatment, the gradual build-up of a patient immune response is a bottleneck for even wider usage[3]. The immunogenicity of the biotherapeutic is influenced by multiple factors that can be roughly divided into extrinsic—such as dosage, rout of administration, duration and production impurities—and intrinsic properties like the protein sequence or post-translational modifications [3]. This immune response involves the formation of anti-drug antibodies (ADAs) that target the biotherapeutic itself and cause loss of effect or adverse reactions[3–5]. A prominent example of this adverse effect is in the treatment of hemophilia A (HA) with coagulation Factor VIII, where ADAs develop in 10–15% of all HA patients and as much as 30% of those patients with the most severe form of HA[6]. Patients with the highest need for therapy are thus least likely to benefit. This correlation between severity of the disease and lack of efficacy follows from the fact that the immune system is more likely to recognize the therapeutic Factor VIII as foreign the more severe the natural mutation is, where mutations that cause a total loss of Factor VIII production are most strongly associated with ADA development[7, 8].

The reduction of the immunogenicity has thus become a major step in a the development of a biotherapeutic[5]. The primary focus of reducing immunogenicity has been on humanized monoclonal antibodies (mAbs) that are comprised of foreign complementarity-determining regions in the variable regions, with the remainder of human origin, and, more recently, on fully human mAbs using bioengineering techniques[9, 10]. However, these approaches are not generally applicable to other classes of biotherapeutics and even humanized and full human mAbs can still induce a clinically relevant anti-drug immune response, likely through the CD4^{+} T-cell mediated adaptive immune system[11, 12]. The CD4^{+} T-cell activation is induced by the recognition of linear sequential peptides (called epitopes) derived from the therapeutic protein, which are presented on human leucocyte antigen (HLA) class II molecules of antigen presenting cells. Therefore, the systematic removal of these epitopes by sequence alteration (termed de-immunization) has been successfully used as an alternative approach to reduce the immunogenicity of mAbs and other therapeutic proteins [12–16].

In recent years, computational screening approaches have been developed to suggest protein sequences with reduced overall immunogenicity. The simplest approaches focused solely on introducing point mutations to reduce the amount of CD4^{+} T-cell epitopes by applying well-established epitope prediction methods [17–19]. However, the suggested mutations can have a significant impact on the stability and function of the protein. Naïve approaches not considering the structural impact on the protein will inevitably produce inactive designs. More advanced methods therefore try to exclude potentially harmful mutations by predicting their impact with various metrics [20]. The most recent approaches simultaneous optimize the number of deleted epitopes as well as the stability of the protein approximated either using structural or simple sequence information [21–23].

Recent advances in statistical protein modeling now allow to accurately infer the tertiary structure [24–28] and mutational effects [29–31] of proteins using evolutionary information contained in an multiple sequence alignment of a protein family. The statistical global pairwise entropy model used for protein inference accurately captures co-evolving sites within a protein which can be utilized to identify structural and functional important position using evolutionary couplings (EC) analysis, infer the protein structure, and predicted the effects of mutational changes.

In this work, we present a novel formulation of the de-immunization problem that uses, for the first time to our knowledge, the maximum entropy model for protein design. Incorporating the maximum entropy model, as opposed to force-field based approaches such as FoldX[32] that have been previously used for protein de-immunization, has four distinct advantages: (i) The statistical model does not require a known structure or depend on the conditions in which the structure was measured. (ii) It implicitly considers constraints on residues from interactions with ligands and other proteins, and (iii) models interactions between mutations rather than early filtering of deleterious singles. (iv) A de-immunization approach using the maximum entropy model is likely to generate more viable structures as it minimizes potential damage to protein function at the same time as minimizing the immunogenicity of the biotherapeutic design. The latter can also be achieved by incorporating a force field, such as AMBER[33], into the optimization process [21, 22], which however complicates the de-immunization formulation.

As the frequencies of HLA alleles differ drastically between populations, the immunogenicity of the biotherapeutic differs as well. It is thus imperative to design a biotherapeutic for a specific target population considering their HLA allele frequencies, as opposed to treating each HLA allele equally important during the design process, as all previous methods have done. We therefore developed a new quantitative immunogenicity objective that builds on HLA affinity prediction methods for immunogenicity approximation, as their exists a strong correlation between immunogenicity and HLA binding affinity[34], and considers the HLA allele distributions within different populations. The resulting de-immunization model does not require known structural information about the protein, summarizes functional and structural information that might not be captured by a structure-based model, and considers the varying HLA frequencies in different populations. We also demonstrate how the resulting bi-objective combinatorial optimization problem can be formulated in a concise manner and solved efficiently for relevant problem sizes with a newly developed distributed solving strategy. An experimental validation of the resulting designs confirms that the algorithm can indeed lead to significantly reduced immunogenicity.

## Results

### Formulation of the de-immunization problem

The problem of protein de-immunization can be described as identifying amino acid substitutions that reduce immunogenicity by removing T-cell epitopes while at the same time keeping the structure and function of the protein intact. We therefore define the problem of protein de-immunization as a bi-objective optimization problem. The first objective characterizes the immunogenicity of the target protein with respect to a set of HLA alleles. The immunogenicity objective *I*(*S*|*H*,*P*_{H}) combines the immunogenicity of each predicted epitope over a certain binding threshold weighted by the HLA allele frequencies *P*_{H} of a specific target population represented by their prevalent HLA alleles *H* [35]. The second objective *E*(*S*) approximates the protein fitness via the statistical energy of the protein sequence, computed by the pairwise maximum entropy model inferred using a multiple sequence alignment (MSA) of the target protein family[24–28].

More formally, we define the protein de-immunization problem as follows: Given a protein sequence S of length n and a set M_{i} of possible alterations per position 1 ≤ i ≤ n. We seek a mutant S′ of S with k alterations for which S′[i] ∈ M_{i} ∀ 1 ≤ i ≤ n holds and that minimizes:

The model therefore optimizes the tradeoff between these two objectives and produces a set of Pareto-optimal designs of the protein sequence.

### Immunogenicity objective function definition

The first objective of the de-immunization model is an adaptation of the immunogenicity score introduced by Toussaint *et al*. for epitope selection in the context of *in silico* vaccine design and is defined as follows [35]:
with *A* being a set of epitopes, *i*_{e,h} the immunogenicity of epitope *e* ∈ *A* bound to HLA allele *h* ∈ *H*. It assumes that each epitope independently influences the immune response with respect to all considered HLA alleles. The contribution of an HLA allele *h* ∈ *H* is directly proportional to its probability *p*_{h} of occurring within the target population *H*.

### Protein fitness objective function definition

The second objective is an evolutionary statistical energy of sequences computed by a pairwise maximum entropy model of protein families. Under these family-specific models, the probability for a protein sequence (X_{1},…, X_{n}) of length *n* is defined as
where the pair coupling parameters *J*_{ij}*(X*_{i}, *X*_{j}*)* describe evolutionary co-constraints on the amino acid configuration of residue pairs *i* and *j* for all amino acids and the parameters *h*_{i}*(X*_{i}*)* corresponds to single-site amino acid constraints. The partition function *Z* is a global normalization factor summing over all possible amino acid sequences of length *n* [24, 25]. The parameters J_{ij} and h_{i} are inferred from a protein family sequence alignment using an iterative approximate maximum likelihood inference scheme (pseudo-likelihood maximization) under l_{2}-regularization to prevent overfitting. Given an inferred probability model for a family, the statistical energy −*E*(*X*) can be used to quantify the fitness of specific sequences. Recent work has demonstrated that changes of E(X) quantitatively correspond to the experimental phenotypic consequences of mutations, including effects on protein stability and organismal growth [31]. To maintain protein function while minimizing immunogenicity, the second objective function is defined as the minimization of −*E*(*X*) given inferred parameters *h*_{i} and *J*_{ij} from a multiple sequence alignment of the biotherapeutic.

Evolutionary coupling (EC) strength between pairs of positions *i* and *j* is computed using the Frobenius norms of the matrices J_{ij} with subsequent correction for finite sampling and phylogenetic effects (average product correction)[27]. The evolutionary couplings are predictive of residue proximity in many protein families, and the cumulative score of one position to all others (EC enrichment score) is indicative of functionally and structurally important positions [27].

### Derivation of the integer linear program representation of the de-immunization model

We solve the stated de-immunization problem as a bi-objective mixed integer linear program (BOMILP). Solving a BOMILP finds all Pareto-optimal solutions to linear objectives with affine constraints and additional integrality constraints on a subset of the variables. The model is based on Kingsford *et al*.’s ILP formulation of the side-chain placement problem [36]. But instead of selecting energetically favorable rotamers, we encode each state of the model as a possible amino acid substitution at each position. A binary decision variable *x*_{i,a} for each position *i* ∈ {1..*n*} and each possible variation *a* ∈ *M*_{i} is introduced with *x*_{i,a} = 1 if this variant will be part of the final mutant *S*′. An additional binary variable is introduced for each pair of variants and positions notated *w*_{i,j,a,b} with *w*_{i,j,a,b} = 1 if variant *a* at position *i* and variant *b* at position *j* have been selected as part of the solution *S*′. These variables are associated with their inferred fitness terms *h*_{i,a} and *J*_{i,j,a,b} to form the second objective function (Table 1 O2).

The immunogenicity objective, in contrast to the problem formulation of Toussaint *et al*., in which the immunogenicity of each candidate epitope *e* ∈ *A* could be pre-calculated, does not have an easy ILP representation. Prediction methods must be directly incorporate into the ILP to approximate the immunogenicity of the current mutant *S*′. Therefore, we use TEPITOPEpan [37], a linear HLA-epitope affinity prediction method, as internal prediction engine since it has been demonstrated to have good predictive power and can be easily integrated into the ILP framework. More advanced, potentially non-convex non-linear prediction models such as artificial neural networks cannot readily be integrated into the problem formulation as the resulting optimization problem would be discreet, non-convex, and non-linear. This class of optimization problems is known to be hard to solve to optimality even for small instances [38] and thus out of reach for design problems of relevant size.

With linear (matrix-based) methods, the integration is possible by scoring each peptide generated with a sliding window of width *e*_{n} for each allele *h* ∈ *H* independently by summing over TEPITOPEpan’s position specific scoring matrix for amino acid *a* ∈ *M*_{i+j} at position *i* + *j* with *i* ∈ {1..(*n* – *e*_{n})} and *j* ∈ {0..(*e*_{n} − 1)}. To only consider predicted binding epitopes, the binding threshold *τ*_{h} of each HLA allele *h* ∈ *H* is subtracted from the sum score of an epitope and embedded into a hinge loss function. The summarized contribution of an allele *h* ∈ *H* is than weighted by its population probability *p*_{h}. To make the prediction scores comparable across HLA alleles, the position specific scoring matrices of TEPITOPEpan were z-score normalized and the binding thresholds adjusted accordingly. The final immunogenicity score consists of the sum of all allele-wise weighted sums (Table 1, O1).

To construct a consistent model, three constraints have to be introduced guaranteeing that only one amino acid per position is selected (Table 1, C1) and that only pairwise interactions are considered for selected variants (i.e., *J*_{i,j,a,b}, = 1 ↔ *x*_{i,a} = 1 ⋀ *x*_{j,b} = 1, see Table 1 C2 and C3). Constraints C2 and C3 can be further relaxed by dividing the pairwise fitness values into positive and negative sets [36], which is done in practice but disregarded here for ease of presentation. To be able to restrict the mutant to a specific number of introduced variations, constraint C4 limits the number of deviating amino acids to the wild type sequence *W*. A detailed formulation of the complete optimization problem can be found in S2 Material.

### Computational and experimental evaluation

As proof-of-principle, we tested the ability of the model to find low immunogenic constructs of the C2 domain of Factor VIII as the domain is highly immunogenic and involved in the ADA development in hemophilia A patients when used therapeutically [39, 40].

### Evolutionary couplings accurately predict structure and mutational effects

Evolutionary couplings computed from sequence alignments have been used successfully to predict the phenotypic effects of mutations [29–31, 41], as well as the 3D structure shown in earlier work [24–28]. The approach assigns an evolutionary statistical energy to any protein sequence that is hypothesized to correspond to the fitness of the molecule. The computation of the statistical energy of the protein and any changes to it after mutation is automatic and does not depend on computing or knowing the 3D structure. Therefore, we reasoned that we could use this statistical model in a generative mode for design within the algorithmic de-immunization process.

Previous work on predicting the effect of mutations suggested that the model accuracy depends on the diversity of the sequence alignment and the ability to predict the 3D structure accurately[31]. We used the precision of the total epistatic constraints between residue pairs as an approximation of the model validity. Overall, 70 long-range evolutionary coupled residue pairs (ECs) have a probability of at least 90% of being significant and 65 of these (93% precision) are close in space (less than 5Å; Fig 1A, Supplementary S1 Material) in a 3D structure of Factor VIII’s C2 domain (pdb: 3hny[42]; Fig 1B).

(A) Contact map of Factor VIII’s C2 domain. The gray circles represent the known crystal structure (pdb: 3hny), while the orange dots represent predicted evolutionary couplings. (B) The tertiary structure of Factor VIII’s C2 domain (pdb: 3hny). (C) Statistical energy/fitness change density separated by HA severity status (red and yellow) and the complete maximum entropy landscape of single point mutations (blue). As expected, all known single point mutations with known HA severity status reside in the lower percentile of the energy landscape. The severe and moderate HA cases are clearly separable from the mild cases using the maximum entropy model prediction. In contrast, the two distributions cannot be clearly separated using FoldX predictions (D).

To assess how well maximum entropy model can predict the effects of specific mutations compared to force-field methods, we used our maximum entropy model and FoldX predictions in a multinomial and logistic regression to predict hemophilia A severity (severe, moderate, and mild) based on patient data collected from the Factor VIII variant database (http://www.factorviii-db.org, Supplementary S1 Table). Since the severity of HA is directly correlated with instability and malfunctioning of Factor VIII, the prediction of disease severity can be seen as a proxy for functional and structural effect prediction. The multinomial regression model, using the change in statistical energy between mutant and wild type as independent variable, shows a moderate ability to predict the clinical outcome (F1-micro of 0.65 ± 0.09, F1-macro of 0.47 ± 0.07).

The performance of a FoldX-based multinomial regression model however was significantly worse (one-sided Wilcoxon signed rank test, V = 13558, p-value < 2.2e-16; F1-micro of 0.49 ± 0.11, F1-macro of 0.35 ± 0.09). We combined the severe and moderate clinical classes and performed a logistic regression, which improved the prediction performance of our maximum entropy model (weighted AUC of 0.72 ± 0.11, weighted F1-score of 0.73 ± 0.11; Supplementary S1 Table; Fig 1C). The FoldX-based logistic regression model was again outperformed by our maximum entropy model (one-sided Wilcoxon signed rank test, V = 15633, p-value < 2.2e-16; Fig 1D), but also yielded higher predictive power compared to its multinomial model (weighted AUC: 0.62 ± 0.11, weighted F1 0.58 ± 0.13).

### Strong immunogenic region contains functional sites

We first *in silico* identified a narrow region of high immunogenic potential for the three most prevalent HLA alleles in the European population (DRB1*15:01, DRB1*03:01 and DRB1*07:01; accounting for 70% of the patients in Western Europe) to facility experimental evaluation.

We screened the C2 domain of Factor VIII using TEPITOPEpan [37] for peptides binding to the three HLA alleles. Each peptide that fell into the 95% percentile of TEPITOPEpan’s score distribution of an HLA molecule was considered an epitope. We predicted 16 epitopes in 6 regions of the C2-domain of Factor VIII (UniProt: FA8_HUMAN). The region with the highest scoring immunogenicity (residues 2,312–2,340) had nine of the 16 predicted strong binding epitopes (Fig 2A and 2B), making it a prime candidate region for de-immunization design. However, there is evidence that this very region might be of high functional importance for the protein; The region is enriched for conserved co-variation of residues and contains a known membrane-binding motif [42]. Eight of the top ten evolutionary couplings involve residues in the high immunogenic region (Fig 2C). In general, the region is enriched for strong evolutionary couplings (sign test, s = 124, n = 130, p-value < 2.2e-16, CI95 = [0.90, 0.98]; Fig 2A). Hence there is a risk that mutations designed to minimize immunogenicity could be detrimental to protein function and the method we have developed here is specifically designed to minimize the risk of both.

(A) Immunogenicity screening for three DRB1 alleles (DRB1*15:01, DRB1*03:01 and DRB1*07:01) with TEPITOPEpan. All peptides who’s predicted binding affinity fell into the 95% percentile of TEPITOPEpan’s score distributions have been considered epitopes. The blue regions depict the cumulative number of predicted epitopes per position and the orange regions depict the EC cumulative summary scores of top 70 ECs. Six immunogenic regions can be identified based on the *in silico* screening, with region 2,321 to 2,340 having the highest number of overlapping epitopes (9 out of 16 predicted epitopes) and was thus chosen as de-immunization target. It is comprised of the highest evolutionary coupling pairs. (B) The tertiary structure of Factor VIII’s C2 domain of with highlight immunogenic region selected for de-immunization redesign in blue. (C) The tertiary structure of Factor VIII’s C2 domain with marked top eight EC (red spheres) that coincide with the identified immunogenic region (blue).

### De-immunization of Factor VIII’s C2 domain

We next solved our bi-objective mixed integer de-immunization model to design sequences of the identified highly immunogenic region resulting in 21 Pareto-optimal sequences with up to three simultaneous point mutations (Table 2, Fig 3). Although the model was set up to constrain sequence substitutions solely to the identified immunogenic region, the resulting fitness change was optimized based on the interactions with all sites in that protein domain.

Pareto front of de-immunized designs in percent change compared to the wild type with k = 1,2,3 mutations. Each design is a tradeoff between the immunogenicity and the protein fitness function and represents a new sequence (here represented as tertiary structures). Immunogenicity is approximated by the HLA binding affinity predictions for a set of HLA molecules weighted by their HLA allele frequency in a specific population. The red spheres within the tertiary structures mark the mutated residues.

Even though none of the 21 designed sequences were predicted “fitter” than the wild–type, they were all close to the wild-type fitness. The computed fitness of 20 out of 21 designs resided in 95% percentile or higher when compared to the whole distribution of single, double, and triple mutations, suggesting that the protein would remain stable and functional. The sequence with the highest difference to wild-type fitness prediction (Design-11; V2313M, Y2324L, V2333E) was in the 90% percentile and still close to WT fitness (reduction of 1.7%). It exhibited also the maximal predicted reduction of immunogenicity (immunogenicity reduction of 45%) deleting eight out of nine epitopes of the identified region. The next-best triple mutant (Design-12; L2321T, I2327L, V2333E) resulted in the deletion of eight epitopes with an immunogenicity reduction of 42% and a fitness reduction of 1.28%.

Previous work that aims to increase the likelihood of a functional protein after mutation design, has used force-field based modeling, such as FoldX [32]. As to distinguish the differences between FoldX and the employed maximum entropy model we predicted the mutation effects of the 21 designs with the EV model and FoldX (Table 2). The predictions of the maximum entropy model only moderately correlate to those using FoldX (r = 0.44, CI95 = [0.02, 0.73], t = 2.173, df = 20, p-value = 4.2e-2; Fig 4). The two most deviating mutations between the two prediction methods were Design-11 (V2313M, Y2324L, V2333E) and Design-3 (Y2324L, V2333E), both of which introduced a mutation at a membrane-binding site[42]. FoldX predicted these designs comparatively less deleterious than the predictions of the maximum entropy model. One explanation for this discrepancy is that it would be harder for force-field based methods to capture the membrane binding constraints unless they were in the structure used.

The red line is a fitted linear regression, and the red tube represents its 95-confidence interval. The orange-circled dots are the two mutational designs with the highest discrepancy. FoldX predicted these two mutations less deleterious compared maximum entropy model, although both designs introduced a mutation at a membrane-binding site.

### Experimental validation

To test the designs, we synthesized twenty overlapping 15-mer peptides containing the introduced mutations and their wild-type counterparts. The peptides maximally covered the predicted epitopes around the mutations of all designed constructs that contained one and two mutations (Supplementary S2 Table). The affinity of these peptides to the three HLA alleles was measured at time zero and after 24 hours (Methods, Supplementary S3 and S4 Tables). We linearly combined the measured relative affinity scores across HLA alleles weighted by their allele frequencies for each peptide respectively to produce a score that approximates the overall immunogenicity (Fig 5A). Results are presented for measurements taken at time point zero. Measurements made after 24 hours were very similar (r = 0.94, CI95 = [0.86, 0.98], t = 11.81, df = 17, p-value = 1.3e-09) and thus resulted in similar correlations. Overall, the measured and predicted immunogenicity of the tested peptides correlated well with r = 0.76 (CI95 = [0.48, 0.91], t = 4.885, df = 17, p-value = 1.4e-4; Fig 5A). Next, we compared the predicted and measured gain or loss in immunogenicity for the whole region by reconstructing the targeted region using overlapping peptides. The measured relative scores were linearly combined (as previously) and then normalized to the number of overlapping peptides used in the reconstruction (Fig 5B). The difference in the measured scores between the wild type and mutant regions can be thought of as a proxy for gain or loss in immunogenicity and correlated well to the predicted changes (r = 0.86, CI95 = [0.40, 0.97], t = 4.136, df = 6, p-value = 6.1e-3; Fig 5B).

(A) Correlation of experimental and predicted immunogenicity of each peptide. The experimental immunogenicity score of a peptide is defined as the linear combination of the individual experimentally determined relative HLA binding affinity of each HLA allele h ∈ H weighted by the HLA allele frequency. (B) Correlation of experimental and predicted immunogenicity of the whole redesigned region. The summarized immunogenicity score of the whole region is the linear combination of the overlapping peptides used to reconstruct the region, normalized by total number of peptides used. The predicted immunogenicity scores per peptide were computed according to our immunogenicity objective function. The red lines are a fitted linear regression and the red tubes represent their 95-confidence interval.

## Discussion

This work introduced a novel method to reduce a protein’s immunogenicity while maintaining its structural integrity requiring only sequence information of the target protein. The method uses a different immunogenicity objective compared to all previous approaches, accounting for both relative epitope strength and HLA allele frequency information of a target population. The HLA distribution can differ tremendously between populations influencing the immunogenicity of a protein and hence the design process should account for the difference by prioritizing different T-cell epitopes. We further combined these objective functions in a bi-objective mixed-integer linear program and introduced a novel solving strategy that guarantees to find the full and exact Pareto front of our de-immunization model. While guaranteeing global optimality, an integer linear program imposes constraints on the functional form the immunogenicity and protein fitness objectives can take. Only linear or simple convex functions can be integrated into an integer linear program, thus prohibiting the use of non-linear, non-convex prediction models. However, integrating such complex, non-convex methods would lead to a highly complex optimization problem that is effectively impossible to solve to optimality for design problems of relevant size.

The fact that the highly immunogenic region, *a priori* identified during an *in silico* screening and independently described by others [19], coincides with a highly evolutionary connected as well as functionally important region underlines the need for methods that are capable of incorporating functional and structural integrity prediction in the de-immunization process. The de-immunization model introduced demonstrates the power of such approaches. The predicted immunogenicity of the complete domain could be reduced by 45% without disrupting the fitness landscape extensively. Moreover, the observed highly significant correlations between measured and predicted immunogenicity both on individual peptide and (reconstructed) segment level affirmed that the underlying assumptions made by the model are sufficient to predict the influence of mutation in terms of immunogenicity.

In the case of this Factor VIII domain, we found no advantage to structure and force-field base approaches to assessing the effect of clinical mutation classification; structure-based approaches may even be a disadvantage when structure information is incomplete (e.g., binding partners not present). This suggests that sequence information may be sufficient for de-immunization design, and is consistent with the previous observation that sequence alignments can be used to identify constrained interacting residues across biomolecules as well as the effect of mutations [31, 43]. However, high-quality diverse sequence alignments are not always available, especially for chimeric or synthetic proteins.

In summary, we proposed a novel de-immunization model that integrates quantitative immunogenicity optimization with sequence-based fitness optimization and used the approach to design novel C2 domains of Factor VIII that can be further validated for clinical application using mouse models or T-cell proliferation assays based on PBMCs of HA patients. The approach will allow bioengineers to reliably explore the design space of the target protein to select promising candidates for experimental evaluation.

## Methods

### Pre-processing

To reduce the search space, a filtering approach based on position specific amino acid frequency *f*_{i}(*a*) (i.e., conservation) can be applied. Only amino acids at position *i* ∈ {1..*n*} exceeding a certain frequency threshold *ζ* are considered as possible substitution at a site. Hence, the set of possible substitutions per position is defined as *M*_{i}: = {*a* ∈ *Σ*|*f*_{i}(*a*) ≥ *ζ*}. The wild type amino acid is additionally added if it does not exceed the frequency threshold. This filtering assumes that variants that are not or infrequently observed are harmful due to either destabilizing effects, reduction of function, or intervening effects with interaction partners.

### Solving a bi-objective integer program

Special strategies must be applied to solve a BOMILP. Popular methods to solve discrete multi-objective problems include the *ε*-constraint [44], perpendicular search [45], and the augmented weighted Chebychev method [46]. All have their own limitations. The recently published rectangle splitting approach tries to overcome these [47, 48]. For solving the de-immunization problem, we developed a parallel two-phase version of the rectangle-splitting approach that can exploit the parallel nature of the algorithm and can effectively utilize modern distributed computing resources. In the following we sketch the newly developed two-phase approach.

First, we introduce necessary notations and concepts (adopted from Boland *et*. *al*.). Let and be two points in solution space with and . Further we define *R*(*z*^{1},*z*^{2}) to be the rectangle spanned by *z*^{1} and *z*^{2}. A nondominated point within *R*(*z*^{1},*z*^{2}) can be found with the following sequential operation (see proof in[47]):

These operations will be denoted as

As a first step of the two-phase parallel rectangle-splitting approach the boundaries of the Pareto front are calculated by solving
and
in parallel (Fig 6A). Then, the search space within *R*(*z*^{T},*z*^{B}) is evenly constraint based on boundary conditions enforced w.l.o.g. on *z*_{1} (Fig 6B). The boundaries are calculated for a predefined number of constraints *m* with:

(A) First, the boundaries of the Pareto front are identified. (B) Then, the space between the boundaries is evenly divided and searched in parallel for nondominated points using the ε-constraint method. (C) The identified nondominated points are used to initiate rectangle search spaces which can be processed in parallel using the standard rectangle-splitting approach, by splitting the rectangle in half and searching independently the bottom and top half (D). If the corner points of the rectangles are found during the search, it is proofs, that no further nondominated point resides within the search space and all points have been identified.

Each section of the separated search space can be independently searched by solving and the resulting new nondominated points can be used as initial approximation of the Pareto front. The found nondominated points might contain duplicates and might not resemble the complete Pareto front. Therefore, it is necessary to perform a refinement of the Pareto front to find the remaining nondominated points. To this end, the nondominated points are sorted in nondecreasing order such that . Each consecutive pair of points spans a search rectangle *R*(*z*^{i},*z*^{j}) with *i* ≤ *j*. These rectangles can now be searched in parallel by the rectangle-splitting algorithm (Fig 6C). The search rectangles are split in half. First, the bottom half *R*^{B} is searched by solving:

If a nondominated point is found, the upper half *R*^{T} is further restricted and spans now in which is searched. Each newly found point spans a new independent search rectangle and with its adjacent point. These rectangles are searched in parallel with the described procedure (Fig 6D). If the search operation yielded the known point *z*^{j} for and *z*^{i} for accordingly it proofs that the area does not contain further nondominated points. The search procedure is carried out until the complete search space has been explored.

### Inference of the maximum entropy sequence model

Multiple sequence alignments (MSA), created by JackHMMER[49], were used for the inference of the maximum entropy models of the Factor VIII C2-domain (residues 2,188–2,345 of FA8_HUMAN, Supplementary S1 Material). To optimize residue coverage and MSA diversity, the alignment was created using five search iterations at an E-value threshold of 10^{−20}. Sequences with 70% or more gaps and columns with over 50% gaps were excluded from subsequent statistical inference. To reduce the influence of sampling bias in the inference step, sequences were clustered at a 90% identity threshold (theta 0.9), and reweighted by the inverse of the number of cluster members resulting in M_{eff} = 1656 effective sequences. Generally, at least a M_{eff}/L ≥ 1 is considered necessary to predict the tertiary structure of a protein [31, 50]. Here we achieve a M_{eff}/L ≥ 10 (with L = 157 aa), which should be sufficient to guarantee a high-quality model. The parameters of the pairwise maximum entropy model and evolutionary couplings were then inferred using EVfold^{PLM} [51] with pseudo-likelihood maximization [52]. Substitution effects were derived by calculating the difference between the wild-type and the mutant statistical energy [31]. The validity of the maximum entropy model was verified by using the precision of the inferred top evolutionary couplings (ECs) between residue pairs compared to an existing 3D structure. To identify the top ECs a Gaussian-lognormal mixture model was inferred based on the overall score distribution [28] and the ECs within the tail of the distribution (ECs with a probability ≥ 0.90 of belonging to the lognormal) were used for model quality assessment [28].

### Hemophilia a severity data and regression analysis

Single point mutation data with known patient severity status was extracted from the Factor VIII variant database (http://www.factorviii-db.org). The data was filtered for mutations residing within the C2 domain, which resulted in 40 data points in total (Supplementary S4 Table). The severity status of each patient was determined based on a one-stage Factor VIII:C and categorized into three classes–severe (<1%), moderate (1–5%), and mild (>5%). The data points were unevenly distributed across the classes with 15 severe, 8 moderate, and 17 mild cases.

To train and validate the multinomial and logistic regression models, the data was randomly divided into training and test set (70:30%-split) in a stratified manner. This process was repeated two hundred times and the prediction performance averaged over the runs.

### Experimental design

In order to experimentally verify our *in silico* predictions for Factor VIII, we utilized the commercial REVEAL HLA-Peptide binding assay of ProImmune (www.proimmune.com). Peptides were synthesized using the PEPscreen custom library synthesis method, yielding high purity peptides for experimental analysis. HLA-peptide binding was assessed for the three HLA-DRB1 alleles used in this study (DRB1*15:01, DRB1*03:01 and DRB1*07:01). In short, the method compares the affinity of the studied peptide to the affinity of a high-affinity control peptide. Each peptide is then scored for binding to a certain HLA molecule relative to the score of the control peptide and reported as the percentage of the signal generated by the control peptide.

### Implementation

The two-phase rectangle-splitting solver was implemented in Python 2.7 using the CPLEX package, Numpy 1.4, and Polygon 2.0.6 package. CPLEX 12.6 was used as backend to solve the BOMILP models.

Structure-based fitness prediction for validation purposes of the de-immunized Factor VIII C2 domain constructs were performed with FoldX [32] using default settings for the obtained mutations. TEPITOPEpan 1.0 was used for epitope prediction.

### Statistical analysis

The multinomial and logistic models were fit and evaluated using Scikit-learn 0.18 [53]. The statistical analysis was conducted with R 3.0.2. Statistical significance was considered at *α* = 0.05. The specific statistical tests used are indicated in the figures or in the results section.

## Supporting information

### S1 Material. The long-range evolutionary couplings of Factor VIII C2 domain, and the multiple sequence alignments used for model inference.

https://doi.org/10.1371/journal.pcbi.1005983.s001

(GZ)

### S2 Material. Detailed description of the de-immunization BOMILP formulation.

https://doi.org/10.1371/journal.pcbi.1005983.s002

(DOCX)

### S1 Table. Known Factor VIII C2 domain single point mutation with severity status and predicted statistical fitness change with the maximum entropy model and FoldX.

https://doi.org/10.1371/journal.pcbi.1005983.s003

(XLSX)

### S3 Table. Experimental immunogenicity score measured per peptide and HLA allele.

https://doi.org/10.1371/journal.pcbi.1005983.s005

(XLSX)

### S4 Table. Summarized experimental immunogenicity scores of the reconstructed immunogenic region.

https://doi.org/10.1371/journal.pcbi.1005983.s006

(XLSX)

## References

- 1. Leader B, Baca QJ, Golan DE. Protein therapeutics: a summary and pharmacological classification. Nature Reviews Drug Discovery. 2008;7(1):21–39. pmid:18097458
- 2. Jiang K. Near-record number of approvals signals drug development shift. Nature medicine. 2013;19(2):114–. pmid:23389589
- 3. Schellekens H. Bioequivalence and the immunogenicity of biopharmaceuticals. Nature reviews Drug discovery. 2002;1(6):457–62. pmid:12119747
- 4. Krieckaert C, Rispens T, Wolbink G. Immunogenicity of biological therapeutics: from assay to patient. Current opinion in rheumatology. 2012;24(3):306–11. pmid:22388644
- 5. Ryu JK, Kim HS, Nam DH. Current status and perspectives of biopharmaceutical drugs. Biotechnology and Bioprocess Engineering. 2012;17(5):900–11.
- 6. Astermark J. Inhibitor development: patient‐determined risk factors. Haemophilia. 2010;16(102):66–70. pmid:19298384
- 7. Lacroix‐Desmazes S, Misra N, Bayry J, Artaud C, Drayton B, Kaveri S, et al. Pathophysiology of inhibitors to factor VIII in patients with haemophilia A. Haemophilia. 2002;8(3):273–9. pmid:12010423
- 8. Astermark J, Oldenburg J, Escobar M, White GC, Berntorp E. The Malmo International Brother Study (MIBS). Genetic defects and inhibitor development in siblings with severe hemophilia A. haematologica. 2005;90(7):924–31. pmid:15996930
- 9. Hwang WYK, Foote J. Immunogenicity of engineered antibodies. Methods. 2005;36(1):3–10. pmid:15848070
- 10. Nelson AL, Dhimolea E, Reichert JM. Development trends for human monoclonal antibody therapeutics. Nature reviews drug discovery. 2010;9(10):767–74. pmid:20811384
- 11. Baker MP, Jones TD. Identification and removal of immunogenicity in therapeutic proteins. Current opinion in drug discovery & development. 2007;10(2):219.
- 12. Harding FA, Stickler MM, Razo J, DuBridge RB. The immunogenicity of humanized and fully human antibodies. Residual immunogenicity resides in the CDR regions mAbs. 2010;2(3):256–65. pmid:20400861
- 13. Warmerdam PA, Plaisance S, Vanderlick K, Vandervoort P, Brepoels K, Collen D, et al. Elimination of a human T-cell region in staphylokinase by T-cell screening and computer modeling. THROMBOSIS AND HAEMOSTASIS-STUTTGART-. 2002;87(4):666–73.
- 14. Tangri S, Mothé BR, Eisenbraun J, Sidney J, Southwood S, Briggs K, et al. Rationally engineered therapeutic proteins with reduced immunogenicity. The Journal of Immunology. 2005;174(6):3187–96. pmid:15749848
- 15. Onda M, Beers R, Xiang L, Nagata S, Wang Q-c, Pastan I. An immunotoxin with greatly reduced immunogenicity by identification and removal of B cell epitopes. Proceedings of the National Academy of Sciences. 2008;105(32):11311–6.
- 16. Mazor R, Vassall AN, Eberle JA, Beers R, Weldon JE, Venzon DJ, et al. Identification and elimination of an immunodominant T-cell epitope in recombinant immunotoxins based on Pseudomonas exotoxin A. Proceedings of the National Academy of Sciences. 2012;109(51):E3597–E603.
- 17. De Groot A, Knopp P, Martin W. De-immunization of therapeutic proteins by T-cell epitope modification. Developments in biologicals. 2004;122:171–94.
- 18. Cantor JR, Yoo TH, Dixit A, Iverson BL, Forsthuber TG, Georgiou G. Therapeutic enzyme deimmunization by combinatorial T-cell epitope removal using neutral drift. Proceedings of the National Academy of Sciences. 2011;108(4):1272–7.
- 19.
Moise L, Song C, Martin WD, Tassone R, De Groot AS, Scott DW. Effect of HLA DR epitope de-immunization of Factor VIII
*in vitro*and*in vivo*. Clinical Immunology. 2012;142(3):320–31. pmid:22222093 - 20. Parker AS, Zheng W, Griswold KE, Bailey-Kellogg C. Optimization algorithms for functional deimmunization of therapeutic proteins. BMC bioinformatics. 2010;11(1):180.
- 21. Parker AS, Griswold KE, Bailey-Kellogg C. Optimization of therapeutic proteins to delete T-cell epitopes while maintaining beneficial residue interactions. Journal of bioinformatics and computational biology. 2011;9(02):207–29.
- 22. Parker AS, Choi Y, Griswold KE, Bailey-Kellogg C. Structure-guided deimmunization of therapeutic proteins. Journal of Computational Biology. 2013;20(2):152–65. pmid:23384000
- 23. King C, Garza EN, Mazor R, Linehan JL, Pastan I, Pepper M, et al. Removing T-cell epitopes with computational protein design. Proceedings of the National Academy of Sciences. 2014;111(23):8577–82.
- 24. Marks DS, Colwell LJ, Sheridan R, Hopf TA, Pagnani A, Zecchina R, et al. Protein 3D structure computed from evolutionary sequence variation. PloS one. 2011;6(12):e28766. pmid:22163331
- 25. Morcos F, Pagnani A, Lunt B, Bertolino A, Marks DS, Sander C, et al. Direct-coupling analysis of residue coevolution captures native contacts across many protein families. Proceedings of the National Academy of Sciences of the United States of America. 2011;108(49):E1293–301. pmid:22106262; PubMed Central PMCID: PMC3241805.
- 26. Balakrishnan S, Kamisetty H, Carbonell JG, Lee SI, Langmead CJ. Learning generative models for protein fold families. Proteins: Structure, Function, and Bioinformatics. 2011;79(4):1061–78.
- 27. Marks DS, Hopf TA, Sander C. Protein structure prediction from sequence variation. Nature biotechnology. 2012;30(11):1072–80. pmid:23138306
- 28. Toth-Petroczy A, Palmedo P, Ingraham J, Hopf TA, Berger B, Sander C, et al. Structured states of disordered proteins from genomic sequences. Cell. 2016;167(1):158–70. e12. pmid:27662088
- 29. Shekhar K, Ruberman CF, Ferguson AL, Barton JP, Kardar M, Chakraborty AK. Spin models inferred from patient-derived viral sequence data faithfully describe HIV fitness landscapes. Physical review E. 2013;88(6):062705.
- 30. Figliuzzi M, Jacquier H, Schug A, Tenaillon O, Weigt M. Coevolutionary landscape inference and the context-dependence of mutations in beta-lactamase TEM-1. Molecular biology and evolution. 2015:msv211.
- 31. Hopf TA, Ingraham JB, Poelwijk FJ, Scharfe CPI, Springer M, Sander C, et al. Mutation effects predicted from sequence co-variation. Nat Biotech. 2017;advance online publication. http://www.nature.com/nbt/journal/vaop/ncurrent/abs/nbt.3769.html—supplementary-information. pmid:28092658
- 32. Schymkowitz J, Borg J, Stricher F, Nys R, Rousseau F, Serrano L. The FoldX web server: an online force field. Nucleic acids research. 2005;33(suppl 2):W382–W8.
- 33. Cornell WD, Cieplak P, Bayly CI, Gould IR, Merz KM, Ferguson DM, et al. A second generation force field for the simulation of proteins, nucleic acids, and organic molecules. Journal of the American Chemical Society. 1995;117(19):5179–97.
- 34. Sette A, Vitiello A, Reherman B, Fowler P, Nayersina R, Kast WM, et al. The relationship between class I binding affinity and immunogenicity of potential cytotoxic T cell epitopes. The Journal of Immunology. 1994;153(12):5586–92. pmid:7527444
- 35. Toussaint NC, Dönnes P, Kohlbacher O. A mathematical framework for the selection of an optimal set of peptides for epitope-based vaccines. PLoS computational biology. 2008;4(12):e1000246. pmid:19112482
- 36. Kingsford CL, Chazelle B, Singh M. Solving and analyzing side-chain positioning problems using linear and integer programming. Bioinformatics. 2005;21(7):1028–39. pmid:15546935
- 37. Zhang L, Chen Y, Wong H-S, Zhou S, Mamitsuka H, Zhu S. TEPITOPEpan: extending TEPITOPE for peptide binding prediction covering over 700 HLA-DR molecules. PloS one. 2012;7(2):e30483. pmid:22383964
- 38.
Jünger M, Liebling TM, Naddef D, Nemhauser GL, Pulleyblank WR, Reinelt G, et al. 50 years of integer programming 1958–2008: From the early years to the state-of-the-art: Springer Science & Business Media; 2009.
- 39. Meeks SL, Cox CL, Healey JF, Parker ET, Doshi BS, Gangadharan B, et al. A major determinant of the immunogenicity of factor VIII in a murine model is independent of its procoagulant function. Blood. 2012;120(12):2512–20. pmid:22855607
- 40. Walter JD, Werther RA, Brison CM, Cragerud RK, Healey JF, Meeks SL, et al. Structure of the factor VIII C2 domain in a ternary complex with 2 inhibitor antibodies reveals classical and nonclassical epitopes. Blood. 2013;122(26):4270–8. pmid:24085769
- 41.
Hopf TA, Ingraham JB, Poelwijk FJ, Springer M, Sander C, Marks DS. Quantification of the effect of mutations using a global probability model of natural sequence variation. arXiv preprint arXiv:151004612. 2015.
- 42. Liu Z, Lin L, Yuan C, Nicolaes GA, Chen L, Meehan EJ, et al. Trp2313-His2315 of Factor VIII C2 Domain Is Involved in Membrane Binding: Structure of a complex between the C2 domain and an inhibitor of membrane binding. Journal of Biological Chemistry. 2010;285(12):8824–9. pmid:20089867
- 43. Hopf TA, Schärfe CP, Rodrigues JP, Green AG, Kohlbacher O, Sander C, et al. Sequence co-evolution gives 3D contacts and structures of protein complexes. Elife. 2014;3:e03430.
- 44.
Vira C, Haimes YY. Multiobjective decision making: theory and methodology: North-Holland; 1983.
- 45. Chalmet L, Lemonidis L, Elzinga D. An algorithm for the bi-criterion integer programming problem. European Journal of Operational Research. 1986;25(2):292–300.
- 46. Steuer RE, Choo E-U. An interactive weighted Tchebycheff procedure for multiple objective programming. Mathematical programming. 1983;26(3):326–44.
- 47. Boland N, Charkhgard H, Savelsbergh M. Criterion space search algorithms for biobjective mixed integer programming part I: Integer Programs. Optimization-Online2013.
- 48. Boland N, Charkhgard H, Savelsbergh M. Criterion space search algorithms for biobjective mixed integer programming part II: Mixed integer programs. Optimization-Online2013.
- 49. Johnson LS, Eddy SR, Portugaly E. Hidden Markov model speed heuristic and iterative HMM search procedure. BMC bioinformatics. 2010;11(1):431.
- 50. Ovchinnikov S, Park H, Varghese N, Huang P-S, Pavlopoulos GA, Kim DE, et al. Protein structure determination using metagenome sequence data. Science. 2017;355(6322):294–8. pmid:28104891
- 51. Hopf TA, Morinaga S, Ihara S, Touhara K, Marks DS, Benton R. Amino acid coevolution reveals three-dimensional structure and functional domains of insect odorant receptors. Nature communications. 2015;6.
- 52. Ekeberg M, Lovkvist C, Lan Y, Weigt M, Aurell E. Improved contact prediction in proteins: using pseudolikelihoods to infer Potts models. Physical review E, Statistical, nonlinear, and soft matter physics. 2013;87(1):012707. pmid:23410359.
- 53. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research. 2011;12(Oct):2825–30.