How we responded to reviewers’ comments: Novel, Provable Algorithms for Efficient Ensemble-Based Computational Protein Design and Their Application to the Redesign of the c-Raf-RBD:KRas Protein-Protein Interface

We found the reviewers’ comments very helpful, and we thank them for their suggestions, which have greatly improved our manuscript. Below, we address each comment individually and summarize the resulting changes to the manuscript. While changes have been made throughout the manuscript, the major ones are indicated in red. References (e.g., [1]) can be found at the end of this response. Added bibliographic references are not in red in the manuscript due to limitations of LaTeX. Reviewer 1 raised questions about the significance of our designed c-Raf-RBD(RKY) variant, and we attempt to answer them in this response. However, we agree to tone down our claims of significance, as she/he suggested, and have done so in the revised manuscript. We agree it is best to present our data, suggest the criteria to evaluate it, and let the reader decide. For this reason, our answers herein to Reviewer 1 are designed to explain our excitement about and interpretation of our data. But, conceding to the reviewer’s preference, the arguably over-enthusiastic language of multiple published papers in the Ras field (for example, characterizing previous 5- to 7-fold binding improvements as “superbinders”, their term, a phenomenon we discuss below on pages 10-11) is not propagated to the revised manuscript. We do feel that the intense previous search for mutations to improve Ras:Raf binding was motivated by the biomedical significance of understanding this PPI. Moreover, it is surprising to us that this search, spanning multiple papers and techniques (ranging from experimental to computational), did not discover our designed variant. This suggests that it was not an easy mutant to find, despite a great deal of previous research. Altogether, we feel this background literature strengthens the case for publication of our manuscript, since it documents that affinity


Reviewer 1
Lowegard et al. present a development of the Donald lab Osprey design methodology to improve efficiency in the design of large sequence ensembles. If I understood correctly, Osprey carries out a combined sequence-design and backbone relaxation step and therefore requires enumerating sequence space. To reduce the enumerated space to a size that can be practically computed, they now compute the stability of each component of the protein system (for instance, receptor and ligand) to ensure that no mutant is too destabilising before considering all combined mutations. The authors tested their approach on a natural (and engineered) PPI involving KRas and retrospectively tested the ranking of mutants relative to previous mutational analyses. They also used the method to design a single-point mutant and found that it improved affinity five-fold relative to the starting point.

Answer:
1. While osprey is capable of designing with backbone movements [21,22], it does not carry out combined sequence-design and backbone relaxation steps. Any provable protein design algorithm must search (but not necessarily enumerate) the entire sequence space for a design. This is not a consequence of backbone flexibility but rather a requirement for a provable algorithm. In other words, the power of these algorithms is that every sequence is either enumerated or pruned, and a combinatorial number of sequences can be pruned without being enumerated. The result is that no sequence is missed, and the optimal sequences (relative to the input model) are returned. Please see the response to reviewer 2 on pages 19-20 for more detail on this point.
2. We tested this work not only on the KRas:c-Raf-RBD PPI both prospectively and retrospectively, but also on designs selected from 40 different protein-protein interfaces (see the sections labeled "Computational experiments" and "Computational results" along with the list of systems in Table S2). fries alone was tested on 2,662 designs and fries/EWAK* in combination were tested on 167 design examples. These experiments show that fries/EWAK* are more efficient than the previous state-of-the-art approach, BBK*. We have emphasized this more throughout the revised manuscript and, in particular, we added the following to the abstract: "This combined approach led to significant speed-ups compared to the previous state-of-the-art multi-sequence algorithm, BBK*, while maintaining its efficiency and accuracy, which we show across 40 different protein systems and a total of 2,826 protein design problems."
3. Finally, the resulting variant c-Raf-RBD binds 36 times more tightly to KRas than the design starting point and 5 times more tightly than the previous tightest-known binder. We have revised the manuscript to more clearly highlight these points. To emphasize this we added the following to the abstract: "This new variant binds roughly five times more tightly than the previous best known binder and roughly 36 times more tightly than the design starting point."
The main message of the paper is rather difficult to distill. I wrote above how I understood the motivation for the current paper and how the authors addressed the computational problem, but this is not stated in the abstract and it was only by going through the methods that I could understand this (and I'm still not sure that I understood correctly). The methods are, as expected for such a paper, quite dense with formulae so the big picture is lost on a reader. My first suggestion is to clarify already in the abstract what is the main contribution of this paper, not just by stating that the method is more efficient, but why it is more efficient and for which problems. In this connection, the provability of the method is of far less significance to most users and developers of computational design methodology than its practical usefulness (accuracy, speed, scope). Nevertheless, the point about provability is repeated from the second word of the title to the end of the paper in the excess of 30 times. My second suggestion is therefore to substantially reduce the emphasis from this point and highlight applicability and accuracy.

Answer:
We appreciate this feedback, and have clarified the main message of the paper both in the abstract and more generally. Additionally, we have reduced the emphasis on provability to only where we feel it is of crucial importance and have highlighted the applicability of the methods presented in the manuscript. In particular, the reviewer's point about the abstract has been addressed by adding the following: "fries pre-processes the sequence space to limit a design to only the most stable, energetically favorable sequence possibilities. EWAK* then takes this pruned sequence space as input and, using a user-specified energy window, calculates K* scores using the lowest energy conformations. We expect fries/EWAK* to be most useful in cases where there are many unstable sequences in the design sequence space and when users are satisfied with enumerating the low-energy ensemble of conformations. In combination, these algorithms provably retain calculational accuracy while limiting the input sequence space and the conformations included in each partition function calculation to only the most energetically favorable, effectively reducing runtime while still enriching for desirable sequences. This combined approach led to significant speed-ups compared to the previous state-of-the-art multi-sequence algorithm, BBK*, while maintaining its efficiency and accuracy, which we show across 40 different protein systems and a total of 2,826 protein design problems."
To further emphasize this point we have also added the following to the manuscript in the Section labeled "Background": "These previous methods, while efficient, suffer from two practical drawbacks. First, some returned sequences exhibit a large K* score due to a decrease in stability of the unbound states. These sequences are rarely desirable in practice, since decreasing protein stability can result in poor folding and aggregation. Second, the approximation error for some sequences is slow to approach epsilon which can lead to prohibitively slow designs." We also added the following to the Section labeled "Energy window approximation to K* (EWAK*)": "Limiting the partition functions to only these energetically favorable conformations can effectively reduce runtime while still enriching for desirable sequences."
The authors chose the KRas system apparently because it is a drug target. They mention several times that KRas is undruggable and that this is therefore a biomedically significant problem. First, as the authors mention at one point in the text, KRas is actually not undruggable and there are now small molecules in clinical trials (or in the clinic; I'm not sure). Second, whether or not KRas is a drug target is beside the point for this paper, since KRas is an intracellular target and no matter what affinities the authors achieved, the designs could not be used in any conceivable way in treatments or even in drug discovery. I therefore recommend that they drop all reference to druggability and concentrate on the truly important aspect, which is that KRas has been studied extensively as a model PPI, thus providing an excellent data set for retrospective analysis.

Answer:
We have removed the phrase "undruggable" from the manuscript and replaced it with more precise language. Overall, as suggested by reviewer 1, we have shifted focus toward the c-Raf-RBD:KRas protein-protein interface as a model system, rather than a drug target. In particular, we made large changes to the Section labeled "Computational redesign of the c-Raf-RBD:KRas" regarding the motivation behind selecting this system. However, we believe that any insight into the PPI mechanism of Ras:Raf recognition gained through improved mutations is potentially relevant to biomedical understanding, given the intense previous search for these improved binders [2,4-6,8,10-12,18,19,24,26,27,31-33,35,42,45-47]. Moreover, all previous improved binders supercharged RBD, limiting their significance, whereas ours make aromatic substitutions. Finally, while the particular c-Raf-RBD variant presented herein is not being considered as a drug candidate, simply attaching this design to a cell-penetrating peptide (e.g., 8R) could allow disruption of effector signalling in cell culture, and is therefore a potential reagent for further characterizing the in vitro signalling of KRas. We submit that it is indeed appropriate to describe such a tool as biomedically relevant, since such a reagent could enable new experiments and better characterization of the interaction between KRas and its effectors, a system that has been extensively studied for purposes of drug discovery [2,4-6,8,10-12,18,19,24,26,27,31-33,35,42,45-47]. We have added sentences to this effect in the Discussion.
The narrative of this study is confusing: the authors developed new methods to enable design within large sequence spaces, but in the end, they validated their method by testing it retrospectively against a predetermined set of mutants and in the prospective design of a single-point mutation. It seems that any molecular threading approach and the simplest mutational scanning method (FOLDX?) would be just as useful for this analysis as the newly developed method. This is, in my opinion, a critical point. I don't think that the validation presented here is sufficient and instead, the authors should show that in the design of large ensembles, they recapitulate known sequence signatures (for instance, natural sequence alignments), and they should do this for more than one protein. Such a benchmark would show that the method is indeed fast enough to be practically useful for large sequence spaces and that it yields more than anecdotal successes.

Answer:
The reviewer raises two main points in this comment that we will address separately:
1. The reviewer argues that validation of the accuracy of this method has not been performed adequately.
2. The reviewer argues that validation of the speed of this method has not been performed adequately.
Our response follows:
1. One of the advantages of provable design algorithms is that they are guaranteed to return the optimal result (within user-specified error) for a given input model. This is in sharp contrast to stochastic methods, which frequently return different results depending on a variety of factors, including initial state. The accuracy of the K* design algorithm has been validated extensively against experimental data on a variety of systems, including HIV bNAbs, inhibitors for cystic fibrosis, and dihydrofolate reductase, among others [7,9,13,16,37-39,43], using experimental measurements of binding, crystal structures, NMR structures, and in-cell measurements.
For this reason, in addition to testing the accuracy of EWAK* and fries both retrospectively and prospectively on the KRas:RBD interface, we performed 2,826 designs across 40 different protein-protein interfaces and compared the results to those returned by BBK*, which also enjoys the provable guarantees of K*. In addition to achieving excellent prediction accuracy for the KRas:RBD system (see the section entitled "fries/EWAK* retrospectively predicted the effect of mutations in c-Raf-RBD"), fries and EWAK* achieved excellent concordance with BBK* on 40 protein-protein interfaces. Therefore, we argue that our new algorithms are of comparable accuracy with previous provable methods. We have highlighted this point by adding the following to the section titled "EWAK* limits the number of minimized conformations when approximating partition functions while maintaining accurate K* scores" in the revised manuscript: "This indicates that EWAK* retains accuracy when compared to previous provable algorithms, which have been extensively validated using experimental measurements of binding, crystal structures, and NMR structures on a variety of systems [7,9,13,16,37-39,43]. The accuracy of EWAK* is explored further in Section 5.1, where we perform additional retrospective validation against experimental measurements."
Furthermore, simple mutational scanning methods like ORBIT [8] appear to be less useful than our method in at least one case, evidenced by their inability to predict the improved RBD variant presented in this manuscript. It has also been shown in [41] that Rosetta, a widely-used heuristic sampling method, is unlikely (and possibly unable) to efficiently find the optimal solution of a design problem. Additionally, although FOLDX has been shown to correlate well with experimental measures of binding for point mutations [3,28], that correlation can vanish when considering more than one mutation site [48]. In contrast, our method maintains accuracy when considering double and triple mutations (see Figure 5). Furthermore, FOLDX has previously been applied to the KRas/RBD interface [28] but did not discover the c-Raf-RBD(RKY) variant reported herein, suggesting that our methods do in fact provide utility relative to FOLDX. Finally, the ability to perform native sequence recovery does not guarantee the ability to predict mutations that improve binding. For design of kCAL01 [38] and VRC07-523LS [39], and resistance prediction in DHFR [9,37], non-native sequences were required and were found by osprey. These predictions were validated not only biochemically and structurally, but also at an organismal level, in animals, and in humans.
In sum, we believe not only that the most useful validation of algorithm accuracy should be a comparison based on experimental measurements and first principles, but also that this validation has already been done.
2. To test the ability of fries and EWAK* to design on large sequence spaces we compared the runtime of our new algorithms against the runtime of BBK* for 2,826 designs across 40 different protein-protein interfaces, with a maximum sequence space size of 9261. We showed that fries and EWAK* ran up to two orders of magnitude faster, and reduced the sequence space by more than two orders of magnitude in the best case. We submit that this validation is sufficient to show that our new algorithms are able to perform more efficiently than the previous state of the art on large sequence spaces. We have emphasized this more throughout the revised manuscript and, in particular, we added the following to the abstract: "This combined approach led to significant speed-ups compared to the previous state-of-the-art multi-sequence algorithm, BBK*, while maintaining its efficiency and accuracy, which we show across 40 different protein systems and a total of 2,826 protein design problems."
Our algorithms have been extensively validated against experimental data [7,9,13,16,37-39,43]. Because our methods are provable, this validation also applies to fries and EWAK*. This is because provable algorithms are guaranteed to return the same result. Moreover, we validated the accuracy of fries and EWAK* on 2,826 designs. Finally, we believe that our results are sufficient to show that fries and EWAK* are more efficient than the previous state of the art on large sequence spaces. To further emphasize this and the real-world application of the algorithms we have added the following to the Section labeled "fries/EWAK* retrospectively predicted the effect mutations in c-Raf-RBD have on binding to KRas" in the manuscript: "These cases in particular serve as examples of large designs where EWAK* outperforms BBK* and highlight the utility of fries/EWAK* when considering larger designs."
Regarding the single-point mutant that the method prospectively designed. This mutant showed at most five-fold improvement in affinity over the starting point engineered variant. This is quite modest to say the least, though the authors trumpet this result as "a discovery of some significance." The tone relating to this mutant should be reduced throughout the paper.

Answer:
We have moderated the tone regarding the designed KRas mutant throughout the paper. However, we believe that it would be overly cautious to describe a 5-fold improvement in affinity as "quite modest to say the least." In a biological system, protein binding and specificity result from complex interactions involving multiple partners, and are therefore context-dependent [40]. Therefore we do not find it obvious that the 5-fold improvement in affinity of a c-Raf-RBD variant previously described as a "superbinder" [8] would result in only modest effects. We have refrained from using the term "superbinder" to describe our designed construct, but note that experts in the Ras:Raf field have used this term for comparable improvements in affinity. We submit that the improvement in affinity of 5X is potentially significant, especially since this represents a 36X improvement in affinity over wild-type.
Finally, we present two cases [25,38,39] in which improvements in binding of 3-7X were biologically and physiologically significant. Both of these designs used osprey/K*, and here we believe the comparison will be useful. We are authors on these previous studies. Consider the case of the anti-HIV broadly-neutralizing antibody (bNAb) VRC07. Improvements to VRC07 reported in [39] resulted in the antibody VRC07-523LS, which exhibited only a 3.4-fold improvement over the original VRC07 antibody (calculated using geometric mean of IC50 over a panel of 179 HIV-1 Envelope pseudoviruses). Similarly, the Kd was improved by approximately 5X. First, these improvements in affinity resulted in similar (5-8X) improvement in in vivo potency in monkeys [39] for VRC07-523LS. Second, only one clinical trial of VRC07 has been reported (ClinicalTrials.gov identifier: NCT03374202), but 13 clinical trials are listed for VRC07-523LS at the time of writing. These trials include 3 phase II trials, and very promising results have already been reported in The Lancet [1]. Thus, an apparently modest improvement in binding resulted in significantly greater clinical impact. Similarly, with kCAL01 [38], a drug lead for CFTR, osprey/K* improved binding by roughly 7-fold. This improvement was in part due to successful modeling of entropic effects by osprey [25], and resulted in a significant improvement in chloride ion transport in human cell-based models of cystic fibrosis. We showed that, since they lacked the 7X improvement in Kd obtained by the osprey/K* designs, previous inhibitors lacked the in vivo efficacy of kCAL01.
The BLI experimental methods are described far too briefly to understand what was actually done and what fits are reported. It is not clear what model was used to fit the data. It looks to me like the authors may have fitted a kinetic model but I'm not sure whether they fitted each curve to a separate kinetic model or all curves at once (the latter is the correct way of analysing these data). The fits to the RK variant (Fig 10) are quite poor. Considering that they have 6 independent measurements, and 4 of them show poor fits, I think that this experiment needs to be redone and since this serves as the baseline measurement to judge the impact of the V->Y designed mutation, this is quite important. I also suggest to draw the fits in black because some of the traces are quite close in tone to the red used to show the fits.
Answer: We thank the reviewers for these comments, and have added the new SI figures (S2 and S3) to address them. In addition, the BLI methods sections have been re-written to reflect the concerns of the reviewer. In the revised manuscript we clarify that experiments have been done in triplicate for c-Raf-RBD(RK) and c-Raf-RBD(RKY) (see newly added Figs S2 and S3). Moreover, the model used for fitting has been explained, as has the fact that curves were globally fit (we agree with the reviewer that this is the proper fitting). More detail has been added to the Section labeled "Experimental validation of mutations in the c-Raf-RBD:KRas protein-protein interface" to describe the experiments more fully. Additionally, the fit is well within what is recommended by the manufacturer for reporting [44]. We would also like to note that the Kd we determined and reported for the c-Raf-RBD(RK) variant is nearly identical to that reported in the literature [8], which gives us even more confidence in the accuracy of the Kd values we report in the manuscript.
Also, typically when such modest improvements in affinity are reported (five-fold), and given that the fits are rather poor in the data that were presented, it's important to provide experimental replicates. It is difficult to say whether the five-fold effect is real or within the noise of the experimental setup.

Answer:
We do have experimental replicates that we give data for in the SI, but we did not previously provide plots. We have updated this and added the requested plots to the SI for the important c-Raf-RBD variants, c-Raf-RBD(RKY) and c-Raf-RBD(RK). These two variants are our new designed variant and the previous tightest-known binding variant, respectively. Please see Figs S2 and S3.
Eq. 2: define C,P,L
Answer: C, P, and L are defined, but not right next to Equation 2. This has been updated and made more clear in the revised manuscript by adding the following just above Equation 2: "...where C, P, and L refer to the protein-ligand complex, the unbound protein, and the unbound ligand, respectively"
2. On pg. 17, the authors use Spearman correlations to check the rank order correlation in 38 mutations. Since the computational method reports energies and the binding experiments report KDs, why not use Pearson correlations?

Answer:
Briefly, our current designs likely underestimate entropic contributions to binding, as they only model a subset of biologically available flexibility and do not model explicit solvent molecules. Furthermore, most physical effective energy functions (including the default used in osprey) are based on small-molecule energetics, which can overestimate van der Waals terms and thereby overestimate internal energy. To clarify this we have added the following to the manuscript at the end of the Section labeled "fries/EWAK* retrospectively predicted the effect mutations in c-Raf-RBD have on binding to KRas": "We use Spearman's ρ here as opposed to a Pearson's correlation since our current designs likely underestimate entropic contributions to binding due to various limitations in biological modeling. Despite these model limitations, in [23,37] large changes in K* score corresponded to significant changes in energy, and rankings correlated well with experimental binding measurements. The Spearman's ρ for the study presented here is comparable to the values for other PPI systems when using osprey [23,37]. Furthermore, an accurate ranking can guide an experimental lab in choosing the rank order in which to test computational predictions [7, 9, 13-17, 30, 34, 37-39, 43]"
Line 403: the authors state that they filtered mutations based on "promising K* scores and structure examination." It's important that they provide some guidelines about how they selected the mutation. This explanation cannot be reproduced by anyone.

Answer:
We added an explanation of mutation filtering to the end of the section titled: "Prospective redesign of the c-Raf-RBD:KRas protein-protein interface toward improved binding." The following has been added to the manuscript to clarify: "Of the mutations selected, T57M was selected to act as a variant that we computationally predicted to be comparable to wild-type. This variant was included to further verify the accuracy of osprey's predictions. On the other hand, some of osprey's top predictions were excluded, for instance, T57R (included in S3 Table) was not selected for experimental testing because it has an unsatisfied hydrogen bond as evidenced in the structures calculated by osprey. Another example is position V69 where 3 different mutations are predicted to improve binding, however, this position was included in our retrospective study (see Section labeled "fries/EWAK* retrospectively predicted the effect mutations in c-Raf-RBD have on binding to KRas" and Fig 5) and was 1 of only 3 positions where osprey incorrectly predicted the effect of the mutation. Therefore, we do not believe that the scores accurately represent the effect the mutations will have in these few cases. Other excluded top predictions (see S3 Table) displayed unsatisfied hydrogen bonds or have been reported and tested previously [8,10,27]. One special case that is not shown in our experimental validation below is V88W which caused poor expression of c-Raf-RBD so we were unable to test it."
The single-concentration BLI measurements are very unusual (Fig 8). I'm not sure that they are meaningful at all since at a single concentration it's impossible to determine Kd and it's not very obvious what is the signal one measures. I recommend to drop this analysis as it is misleading, especially since the authors draw conclusions on the correlation between these experimental results and the computational analysis.

Answer:
We thank the reviewers for their insight. The concerns are well taken and, to accommodate the reviewer, we no longer approximate binding affinity using data from this assay. We have recreated the figure to focus on response and dissociation rates. Because the c-Raf-RBD variants are of similar size and are all screened at the same concentration, the response rate and dissociation rate can be used for screening purposes, as has been well established previously [29,36,49]. It is important to note that this method is used merely as a screen to filter and pick which mutations to subject to further experimental validation. It is also worth noting that the results of the subsequent titration experiments align excellently with the results of this screen (see Figs 8 and 10). Additionally, we have modified sections "Prospective redesign of the c-Raf-RBD:KRas protein-protein interface toward improved binding" and "Experimental validation of mutations in the c-Raf-RBD:KRas protein-protein interface" to reflect this.
Lines 433-437: The authors make a lot of the five-fold improvement that the single point mutation exhibited. It's a quite modest improvement and seems almost anecdotal given that no other mutants were presented.

Answer:
We have moderated the language in this passage. Please see pages 10-11 above for a discussion on the relevance of a five-fold improvement in binding affinity.
We believe a strength of our method is that only a few mutants need to be created and tested to discover a potentially significant increase in binding.Competing methods might require screening for more mutants, and the benefit of computation is that it can reduce the need for extensive screening when an improvement in binding is desired.

Answer:
A section heading entitled "Experimental materials and methods" has been added to clarify and delineate the methods from the other sections.
475: "we applied these algorithms to a biomedically significant design problem": this is not a biomedically significant design problem, because the design cannot be used in a biomedical context. KRas is biomedically significant drug target but the design is in no way a starting point for a drug. Again, the authors can highlight the datasets that are available for KRas but not the druggability aspects as they have no relevance to the work.
502: "the discovery that such a mutation can improve the binding...is of considerable significance...eventually developing successful therapeutics". This is quite spurious. They tested a single mutation which exhibited a very modest effect and has no relevance to drug discovery.

Answer:
We have moderated the tone in the indicated passages, and highlighted the available datasets for KRas.Please see pages 6-7 above for a discussion of the biomedical relevance of the RBD construct, and pages 10-11 above for a discussion of the predicted effects of a 5-fold improvement in binding affinity.

Reviewer 2
The authors present two algorithmic improvements for the K* algorithm for predicting binding affinities at protein/protein interfaces, and then show that their improved algorithm is capable of accurately predicting binding energies with a retrospective analysis of mutations at the cRaf-RBD / KRas interface and by predicting several point mutations that improve binding relative to wild-type, including one mutant, V88Y, that when paired with two previously-reported mutations, creates the tightest-yet-known interface between these two proteins. The paper spans from theory to the bench and is an impressive piece of work.
Answer: We thank the reviewer for the interest and positive evaluation of our work.
The two algorithmic improvements the authors describe focus on the two levels at which the K* algorithm broaches an exponential amount of work, 1) at the conformation level, and 2) at the sequence level.
At the conformation level, the K* algorithm is designed to approximate K = q_pl / (q_p q_l) by enumerating all conformations of a single sequence in order of increasing energy until the conformations remaining are at a high-enough energy that their contribution to the partition function is small, small enough that the resulting K* approximation is within a user-provided constant, epsilon, of the full K. To improve performance here, the authors introduce the EWAK* algorithm. With EWAK*, instead of enumerating conformations until the epsilon-error threshold is reached, they stop when the bound on the energy of the next conformation is larger than some user-provided deltaE of the best conformation. The authors contrast EWAK* with the previous algorithm, BBK*, and note that it produces significant performance improvements.
I found this section a little difficult to understand: A) The algorithmic improvement presented in the BBK* paper from 2017 focuses entirely on sequence-space pruning and not conformation-space pruning, and so EWAK* seems like it ought to be compared to K* (or that the text should say that BBK* and K* are the same for the sake of this comparison). B) It would be nice to understand how the user providing an energy threshold "1 kcal/mol" differs from the user changing epsilon to ".05" from "0.01". Is there a simple mathematical relationship between these numbers or are they related but not comparable?

Answer:
A) We have added language clarifying the relationship between the K* and BBK* algorithms to the introduction and a more detailed discussion to the SI. Briefly, BBK* calls a partition function approximation algorithm as a subroutine. In the submitted manuscript, we compare the previous partition function approximation algorithm (K*) with the new one (EWAK*).
B) The reviewer is correct in noting that the energy threshold and epsilon values are related, since increasing the energy threshold must decrease the value of epsilon, and vice versa. Unfortunately, the precise function relating these two values is not simple to calculate. We have added a discussion of this to the SI, and have added a sentence to the end of the section "Energy Window Approximation to K* (EWAK*)" pointing the interested reader to this discussion.
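As an illustration only (this is not the derivation given in the SI), the following Python sketch shows one generic way an energy window implies an upper bound on epsilon. It assumes that every unenumerated conformation has energy at least deltaE above the best conformation, that the number of unenumerated conformations is known, and that RT is about 0.592 kcal/mol. The function name, its parameters, and the toy energies are hypothetical.

```python
import math

RT = 0.592  # kcal/mol near 298 K; an assumed value for this sketch


def implied_epsilon(window_energies, n_remaining, delta_e, rt=RT):
    """Upper bound on epsilon implied by stopping at an energy window.

    window_energies: energies (kcal/mol) of the conformations enumerated so far.
    n_remaining: number of conformations not yet enumerated (assumed known).
    delta_e: energy window; each remaining conformation is assumed to lie
             at least delta_e above the best enumerated energy.
    """
    e_best = min(window_energies)
    # Boltzmann weights are taken relative to e_best to avoid overflow;
    # the ratio computed below is unchanged by this shift.
    q_star = sum(math.exp(-(e - e_best) / rt) for e in window_energies)
    # Each remaining conformation contributes at most exp(-delta_e / rt).
    rest_upper = n_remaining * math.exp(-delta_e / rt)
    # q_star / q >= q_star / (q_star + rest_upper) = 1 - epsilon_bound
    return rest_upper / (q_star + rest_upper)


if __name__ == "__main__":
    toy_energies = [-10.0, -9.6, -9.3]  # toy values, not data from the paper
    for delta_e in (1.0, 2.0, 4.0):
        print(delta_e, implied_epsilon(toy_energies, n_remaining=1000, delta_e=delta_e))
```

Under these assumptions the bound shrinks as the window grows, but it also depends on how many conformations remain and on the energies already enumerated, which is one reason a single closed-form conversion between the two parameters is not available.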
At the sequence level, the authors present an algorithm, fries, for removing sequences that are higher in energy than the wild-type sequence. Here both fries and BBK* enumerate sequences in order of decreasing bounds on their energies. With fries, the intuition on when to stop is that, after the wild-type sequence is encountered at an interface where one is looking for tighter binding, there is not much point in continuing.
I found this section slightly confusing because it's not clear whether the energies are for the complex structure, the unbound structure, or both. It's also unclear why fries searches the multi-sequence tree until the wild type sequence is hit instead of going directly to the (clearly known!) wild type sequence. I believe what the authors mean to describe is that fries searches the multi-sequence tree and descends into the single-sequence conformation tree for each sequence it encounters until it hits the wild type sequence, after which point it begins looking to stop sequence enumeration.

Answer:
To clarify, BBK* does not have a comparable sequence filtering process. BBK* performs its search on the set of all conformations (which contains all possible sequences) to approximate K* scores. On the other hand, fries serves as a pre-processing filter that limits the number of sequences that go on to be included in the K* score approximations performed by EWAK*. fries performs three separate searches: one for the complex, one for the protein, and one for the ligand. The sequences that remain in the intersection of all three filtered sets are the ones that are kept and further processed by EWAK*. This has been clarified in the Section labeled "Fast Removal of Inadequately Energied Sequences (fries)" with the following additional text: "The following algorithm is applied to each of the three states (protein, ligand, and protein-ligand complex) independently. The resulting, filtered sequence space is determined by taking the intersection of the output from the algorithm for the three states."
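The per-state filtering followed by an intersection can be sketched in Python as below. This is a minimal illustration, not the fries implementation: the per-state rule used here (keep every sequence whose energy lower bound does not exceed the wild type's, plus an optional cushion) is a placeholder assumption standing in for the provable halting criterion of the manuscript, and the function names, sequence labels, and energies are all hypothetical toy values.

```python
def fries_like_filter(lower_bounds, wild_type, cushion=0.0):
    """Keep sequences for one state, enumerating by increasing lower bound.

    lower_bounds: dict mapping sequence label -> energy lower bound (kcal/mol).
    The stopping rule is a placeholder assumption: stop once a bound exceeds
    the wild-type bound plus the cushion.
    """
    cutoff = lower_bounds[wild_type] + cushion
    kept = set()
    for seq, bound in sorted(lower_bounds.items(), key=lambda kv: kv[1]):
        if bound > cutoff:
            break  # every later sequence has an even higher bound
        kept.add(seq)
    return kept


def filter_all_states(bounds_by_state, wild_type, cushion=0.0):
    """Filter each state independently, then intersect the kept sets."""
    kept_sets = [
        fries_like_filter(bounds, wild_type, cushion)
        for bounds in bounds_by_state.values()
    ]
    return set.intersection(*kept_sets)


# Toy lower bounds for three states; not data from the paper.
toy_bounds = {
    "complex": {"WT": -10.0, "A": -12.0, "B": -11.0, "C": -8.0},
    "protein": {"WT": -5.0, "A": -6.0, "B": -4.0, "C": -7.0},
    "ligand": {"WT": -3.0, "A": -3.5, "B": -3.2, "C": -1.0},
}
surviving = filter_all_states(toy_bounds, wild_type="WT")
```

In this toy input, sequence B is dropped by the protein-state filter and sequence C by the complex and ligand filters, so only the wild type and sequence A would be passed on for partition function approximation.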
The purpose of searching until finding the wild-type sequence is that it allows fries to collect all of the sequences that have potentially better, lower energies. The energy that fries enumerates by is the lower bound on the minimized energy for each sequence, as described in [20]. Hence, fries enumerates up to the wild-type sequence and, as it does so, enumerates and keeps low-energy sequences. Every sequence is either enumerated or provably pruned. If we merely calculated the energy of the wild-type sequence, then we would not know the identity of the other low-energy sequences in the tree. Additionally, fries only descends into the single-sequence conformation tree for the wild-type sequence (it does not search this conformation tree for any other sequence). It does so in order to find the energy of one specific, low-energy conformation of the wild-type sequence, which is used to provably bound the energies of the remaining sequences to be enumerated, as described in the Section labeled "Fast Removal of Inadequately Energied Sequences (fries)." To clarify this, we have added the following text to that Section: "It is worth noting that fries only descends into and searches the single-sequence conformation tree for the wild-type sequence in order to calculate the provable halting criteria for Eq (3)."
I would also be curious to understand why the authors chose to approximate q_wt using only a single conformation of the wild-type sequence instead of trying to estimate q_wt accurately; the wild type sequence is just one more sequence among the very many sequences that fries and BBK* would enumerate.
Answer: We thank the reviewer for their curiosity and interest in the algorithms. For the purposes of fries, which is meant to be a very fast way of identifying low-energy sequences, it is not necessary to approximate the partition function q_wt more accurately, since it is used merely as a lower bound to establish our halting criteria for the sequence search. Estimating the partition function using a single low-energy conformation allows us to put a lower bound on the partition function for the wild-type sequence so we can set a cut-off for the partition functions of the other low-energy sequences (see Equations 5 and 6 and accompanying text in the Section labeled "Fast Removal of Inadequately Energied Sequences (fries)"). Once these sequences are identified, EWAK* is used to more accurately predict the values of the partition functions for each sequence, including the wild-type sequence.
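The reason a single conformation suffices for a lower bound can be written out in one line. The notation below is an assumption made for illustration (C is the set of conformations of the wild-type sequence, E_c the energy of conformation c, and c_0 the single low-energy conformation found) and is not necessarily the notation of Equations 5 and 6 in the manuscript.

```latex
% Every Boltzmann weight is positive, so dropping all terms but one
% can only decrease the sum.
q_{wt} \;=\; \sum_{c \in C} \exp\!\left(-\frac{E_c}{RT}\right)
\;\ge\; \exp\!\left(-\frac{E_{c_0}}{RT}\right)
\qquad \text{for any } c_0 \in C .
```

The lower the energy of c_0, the tighter this bound becomes, which is consistent with using a low-energy conformation of the wild type rather than an arbitrary one.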
Minor point: on page 5 the phrase "by up to more than 2 orders of magnitude" confuses me; is it "by up to 2 orders of magnitude" or "more than 2 orders of magnitude"?
Answer: We agree with the reviewer that the language is a bit confusing and have changed the line to say "by up to 2 orders of magnitude," although the improvement is slightly larger than 2 orders of magnitude in the best case. We feel this point is emphasized enough in the Section labeled "Computational results," so we are happy to change the language here to be clearer.