Peptides derived from envelope proteins have been shown to inhibit the protein-protein interactions in the virus membrane fusion process and thus have great potential to be developed into effective antiviral therapies. There are three classes of envelope proteins, each exhibiting a distinct structural fold. Although the exact fusion mechanism remains elusive, it has been suggested that the three classes of viral fusion proteins share a similar mechanism of membrane fusion. This common mechanism of action makes it possible to correlate the properties of self-derived peptide inhibitors with their activities. Here we developed a support vector machine model that uses sequence-based statistical scores of self-derived peptide inhibitors as input features to correlate with their activities. The model displayed 92% prediction accuracy with a Matthews correlation coefficient of 0.84, clearly superior to models using physicochemical properties or amino acid composition as input. The predictive support vector machine model for self-derived peptides of envelope proteins would be useful in the development of antiviral peptide inhibitors targeting the virus fusion process.
Citation: Xu Y, Yu S, Zou J-W, Hu G, Rahman NABD, Othman RB, et al. (2015) Identification of Peptide Inhibitors of Enveloped Viruses Using Support Vector Machine. PLoS ONE 10(11): e0144171. https://doi.org/10.1371/journal.pone.0144171
Editor: Massimiliano Galdiero, Second University of Naples, ITALY
Received: April 24, 2015; Accepted: November 13, 2015; Published: December 4, 2015
Copyright: © 2015 Xu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: These authors have no support or funding to report.
Competing interests: The authors have declared that no competing interests exist.
The fusion process is the initial step of viral infection; targeting it therefore represents a promising strategy in the design of antiviral therapy. The entry step involves fusion of the viral and host cell membranes, which is mediated by the viral envelope (E) proteins. There are three classes of envelope proteins: Class I E proteins include influenza virus (IFV) hemagglutinin and the retrovirus Human Immunodeficiency Virus 1 (HIV-1) gp41; Class II E proteins include those of a number of important human flavivirus pathogens such as Dengue virus (DENV), Japanese encephalitis virus (JEV), Yellow fever virus (YFV), West Nile virus (WNV) and hepatitis C virus (HCV), and of Togaviridae viruses such as the alphavirus Semliki Forest virus (SFV); Class III E proteins include those of vesicular stomatitis virus (VSV), Herpes Simplex virus-1 (HSV-1) and Human cytomegalovirus (HCMV). Although the exact fusion mechanism remains elusive and the three classes of viral fusion proteins exhibit distinct structural folds, they may share a similar mechanism of membrane fusion.
A peptide derived from a protein-protein interface can inhibit the formation of that interface by mimicking the interactions with its partner proteins, and may therefore serve as a promising lead in drug discovery. Enfuvirtide (T20), a peptide that mimics the HR2 region of Class I HIV-1 gp41, is the first FDA-approved HIV-1 fusion drug that inhibits the entry process of virus infection [5–7]. Subsequently, peptides mimicking extended regions of HIV-1 gp41 were also demonstrated to be effective entry inhibitors [8, 9]. Furthermore, peptides derived from a distinct region of the GB virus C E2 protein were found to interfere with the very early events of the HIV-1 replication cycle. Other successful examples of Class I peptide inhibitors include those derived from the SARS-CoV spike glycoprotein [11–13] and from the Pichinde virus (PICV) envelope protein. Recently, Flufirvitide-3 (FF-3), a peptide derived from the fusion initiation region of the glycoprotein hemagglutinin (HA) in IFV, has progressed into clinical trials.
The success in developing Class I peptide inhibitors for clinical use has stimulated interest in the design of inhibitors of the Class II and Class III E proteins. For example, several hydrophobic peptides derived from the Class II DENV and WNV E proteins exhibited potent inhibitory activities [16–20]. In addition, a potent peptide inhibitor derived from domain III of the JEV glycoprotein and a peptide inhibitor derived from the stem region of the Rift Valley fever virus (RVFV) glycoprotein were reported [21, 22]. Examples of Class II peptide inhibitors of enveloped viruses also include those derived from the HCV E2 protein [23, 24] and from Claudin-1, a critical host factor in HCV entry. Moreover, peptides derived from the Class III HSV-1 gB also exhibited antiviral activities [26–31], as did those derived from HCMV gB.
Computational informatics plays an important role in predicting the activities of peptides generated from combinatorial libraries. In silico methods such as data mining, genetic algorithms and vector-like analysis have been reported to predict the antimicrobial activities of peptides [33–35]. In addition, quantitative structure-activity relationships (QSAR) [36–40] and artificial neural networks (ANN) have been applied to predict the activities of peptides [41, 42]. Recently, a support vector machine (SVM) algorithm was employed to predict antiviral activities using the physicochemical properties of general antiviral peptides. However, the mechanism of action of antiviral peptides differs from that of antimicrobial peptides; in fact, various protein targets are involved in virus infection. For example, HIV-1 infection involves virus fusion, integration, reverse transcription and maturation. It is therefore difficult to extract common features from general antiviral peptides that represent their antiviral activities. Virus fusion is mediated by E proteins. Although E proteins are highly divergent in sequence and structure, they share a common pathway of membrane fusion dynamics: E proteins undergo a significant conformational change to form a trimer of hairpins, which drives the fusion of the viral and host membranes. The antiviral peptides derived from envelope proteins function by binding in situ to their respective accessory proteins, disrupting the formation of the trimer of hairpins and membrane fusion, and thereby inhibiting virus infection. In view of the important role of E proteins in the virus fusion process and the common mechanism of action of self-derived peptides, we developed an SVM model to predict the antiviral activities of self-derived peptides using sequence-based statistical scores as input features.
The sequence-based properties were calculated by a conditional probability discriminatory function, which indicates the propensity of each amino acid to occur in active peptides. Our model exhibited markedly higher accuracy in predicting the activities of self-derived peptides than the previous models developed for general antiviral peptides using classical physicochemical properties as descriptors. The method would be useful in the identification of entry inhibitors as a new generation of antiviral therapies.
A total of 202 peptide entry inhibitors of enveloped viruses were collected, of which 101 are active peptides and 101 are non-active peptides. Of these, 75 active and 75 non-active peptides (75p+75n) comprised the training set of the SVM models. The remaining 26 active and 26 non-active peptides (26p+26n) were used as the test set.
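The 75p+75n / 26p+26n partition can be sketched as follows. This is a minimal illustration that assumes a random, class-balanced split with a fixed seed; the source does not state how individual peptides were assigned to each set, and `balanced_split` and the peptide identifiers are placeholders introduced here.

```python
import random


def balanced_split(actives, inactives, n_train_per_class, seed=0):
    """Randomly split each class into a training and a test portion.

    Hypothetical helper: the random assignment and the seed are assumptions,
    not the procedure reported in the paper.
    """
    rng = random.Random(seed)
    train, test = [], []
    for label, peptides in ((1, actives), (0, inactives)):
        shuffled = list(peptides)
        rng.shuffle(shuffled)
        train += [(p, label) for p in shuffled[:n_train_per_class]]
        test += [(p, label) for p in shuffled[n_train_per_class:]]
    return train, test


# Placeholder identifiers standing in for the 101 active and 101 non-active peptides.
actives = [f"active_{i}" for i in range(101)]
inactives = [f"inactive_{i}" for i in range(101)]

train, test = balanced_split(actives, inactives, n_train_per_class=75)
print(len(train), len(test))
```

Keeping the two classes balanced in both sets means that accuracy on the test set is not inflated by a majority class.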
Five physicochemical properties were used in the SVM models. Isoelectric point (PI), molecular weight (MW) and grand average of hydropathicity (GRAVY) were calculated using the ProtParam tool on the ExPASy web server. Solvent accessibility and secondary structure features were calculated using the SSpro and ACCpro packages implemented in the SCRATCH protein predictor server.
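For reference, GRAVY is the mean Kyte-Doolittle hydropathy value over all residues of a sequence. The sketch below only illustrates that definition; it is not the ProtParam implementation, and the function name and example sequence are arbitrary choices made here.

```python
# Kyte-Doolittle hydropathy values for the twenty natural amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the grand average of hydropathicity of a peptide sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Arbitrary four-residue example: (1.8 + 3.8 - 3.9 - 3.5) / 4
example_value = gravy("ALKD")
print(example_value)
```

A positive value indicates an overall hydrophobic peptide and a negative value an overall hydrophilic one.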
Sequence-based statistical scoring function.
The knowledge-based statistical function is developed from the concept of the residue-specific all-atom probability discriminatory function (RAPDF). RAPDF is a structure-based statistical scoring function. It is based on the assumption that averaging over different atom types in experimental conformations is an adequate representation of the random arrangements of these atom types in any compact conformation. Here we developed a sequence-based statistical scoring function, in which we presume that averaging over different amino acid sequences with experimentally validated inhibitory activities is an adequate representation of random amino acid sequences with any inhibitory activity. The basis of this assumption is that the peptides share a common mechanism of action, i.e. the peptides derived from E proteins bind competitively to their partner proteins, disrupt the formation of the trimer of hairpins, and thereby inhibit virus membrane fusion.
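The statistical score of amino acid i for active peptides is defined as

$$S(i) = -\log\frac{P(i \mid a)}{P(i)} = -\log\frac{N_{obs}(i,a)/N_{obs}(a)}{N_{obs}(i)/N_{total}} \qquad (1)$$

where P(i|a) is the probability of observing amino acid i in an active peptide sequence and P(i) is the probability of observing amino acid i in any peptide sequence;
Nobs(i,a): the number of occurrences of amino acid i within active peptides.
Nobs(i): the number of occurrences of amino acid i within active and non-active peptides.
Nobs(a): the total number of amino acids of all types observed within active peptides.
Ntotal: the total number of amino acids of all types observed within active and non-active peptides.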
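Similarly, we employed a dataset of experimentally verified non-active peptides in developing the corresponding statistical function, in which the counts over active peptides are replaced by the counts over non-active peptides.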
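A minimal sketch of how such a score could be computed from two sets of sequences is given below. It assumes the RAPDF-style form score(i) = −ln[(Nobs(i,a)/Nobs(a)) / (Nobs(i)/Ntotal)], reads Nobs(a) and Ntotal as total residue counts, and uses the natural logarithm; these readings, the function name `active_scores`, the amino acid ordering and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter

# Alphabetical one-letter codes; the ordering is an assumption of this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def active_scores(active_seqs, inactive_seqs):
    """Return {amino acid: -ln(P(i|active) / P(i))} for residues seen in active peptides."""
    n_ia = Counter("".join(active_seqs))          # Nobs(i, a)
    n_i = n_ia + Counter("".join(inactive_seqs))  # Nobs(i)
    n_a = sum(n_ia.values())                      # Nobs(a), read as a total residue count
    n_total = sum(n_i.values())                   # Ntotal, read as a total residue count
    scores = {}
    for aa in AMINO_ACIDS:
        if n_ia[aa] == 0:
            continue  # the log ratio is undefined if the residue never occurs in active peptides
        p_given_active = n_ia[aa] / n_a
        p_overall = n_i[aa] / n_total
        scores[aa] = -math.log(p_given_active / p_overall)
    return scores


# Toy sequences for illustration only; they are not peptides from the dataset.
scores = active_scores(["LLWF", "LWPR"], ["KKGA", "KGLA"])
for aa, value in sorted(scores.items(), key=lambda item: item[1]):
    print(aa, round(value, 3))
```

Under this sign convention a negative score marks a residue that is enriched in active peptides relative to the whole dataset, and a positive score marks a depleted one.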
For a given amino acid sequence, 20 columns of input are generated, one for each of the twenty natural amino acids. Each column is assigned a value of N * (−log-likelihood), where N is the number of occurrences of that amino acid in the sequence and the −log-likelihood is its statistical function score. Each feature thus combines the propensity of the amino acid for being active or non-active with the corresponding amino acid composition.
Below is an example of calculating the statistical scores for a given peptide sequence:
The amino acid order for SVM input features is set as:
If the amino acid sequence of an active peptide inhibitor is:
the statistical N values of the sequence would be:
The scores in the statistical function library based on the active peptide inhibitors are given by Eq (1): -0.0856, 0.5057, 0.4740, 0.4133, -0.0856, -0.0856, 0.6439, 0.2508, 0.9440, -0.4670, 1.8603, 0.1330, 0.2261, -0.0115, 0.2761, 0.3288, 0.0479, -0.1207, 0.0079, 0.6816.
Therefore, the 20 SVM input features for the sequence would be: -0.1712, 1.0114, 0.4740, 0, -0.2568, -0.2568, 0, 0, 0, 0, 0, 0.1330, 0.6783, -0.0115, 0, 0, 0, -0.3621, 0.0237, 0.
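The arithmetic of this example can be checked with a short sketch. The scores are copied from the list above; the per-residue counts N are an assumption, inferred here by dividing each listed feature by its score, and `svm_features` is a name introduced for illustration.

```python
# Scores from the active-peptide statistical function library, in the SVM feature order.
SCORES = [-0.0856, 0.5057, 0.4740, 0.4133, -0.0856, -0.0856, 0.6439,
          0.2508, 0.9440, -0.4670, 1.8603, 0.1330, 0.2261, -0.0115,
          0.2761, 0.3288, 0.0479, -0.1207, 0.0079, 0.6816]

# Occurrence counts N per amino acid. ASSUMPTION: inferred as feature / score
# from the values listed in the text, not quoted from it.
COUNTS = [2, 2, 1, 0, 3, 3, 0, 0, 0, 0, 0, 1, 3, 1, 0, 0, 0, 3, 3, 0]


def svm_features(counts, scores):
    """Return the 20 input features, each equal to N * score."""
    if len(counts) != len(scores):
        raise ValueError("counts and scores must have the same length")
    # Adding 0.0 turns a possible -0.0 into 0.0 for cleaner output.
    return [round(n * s, 4) + 0.0 for n, s in zip(counts, scores)]


features = svm_features(COUNTS, SCORES)
print(features)
print("peptide length implied by the counts:", sum(COUNTS))
```

With these inferred counts every feature reproduces the listed value, and an amino acid that is absent from the sequence contributes a feature of zero regardless of its score.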
SVM Parameter Optimization
SVM models with a radial basis function (RBF) kernel were developed using the C-SVC module in LIBSVM (version 3.1) [48, 49] and executed under the Matlab interface. The performance of an SVM depends on two parameters, gamma (-g) and cost (-c). The default value is 1 for -c and 1/k for -g, where k is the number of input features. Pairs of (c, g) values were expressed in exponential form (i.e. 2^x, 2^y) and optimized using cross-validation, and the pair with the best cross-validation accuracy was selected.
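The exponential parameter grid can be sketched as below. The exponent ranges and step size are assumptions, since they are not listed here, and `evaluate` is a hypothetical callback that should return the cross-validation accuracy for a given (c, g) pair, for instance by running LIBSVM in its n-fold cross-validation mode. A toy stand-in is used so that the sketch runs on its own.

```python
import math
from itertools import product


def exponential_grid(c_exponents, g_exponents):
    """Return all (c, g) pairs of the form (2**x, 2**y)."""
    return [(2.0 ** x, 2.0 ** y) for x, y in product(c_exponents, g_exponents)]


def grid_search(evaluate, c_exponents, g_exponents):
    """Return the (c, g) pair with the highest score reported by `evaluate`."""
    best_pair, best_score = None, float("-inf")
    for c, g in exponential_grid(c_exponents, g_exponents):
        score = evaluate(c, g)
        if score > best_score:
            best_pair, best_score = (c, g), score
    return best_pair, best_score


# Assumed exponent ranges, for illustration only.
C_EXPONENTS = range(-5, 16, 2)
G_EXPONENTS = range(-15, 4, 2)


def toy_evaluate(c, g):
    """Stand-in for a real cross-validation run; peaks at c = 2**3, g = 2**-5."""
    return -abs(math.log2(c) - 3) - abs(math.log2(g) + 5)


best_pair, best_score = grid_search(toy_evaluate, C_EXPONENTS, G_EXPONENTS)
print(best_pair)
```

Searching over exponents rather than raw values covers several orders of magnitude of c and g with a small number of evaluations.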
Five-fold cross-validation was performed to evaluate the performance of the SVM models. In the evaluation process, the dataset was partitioned randomly into five equally sized subsets. Training and testing were carried out five times, each time using four distinct subsets as the training set and the remaining subset as the test set. The results were averaged over all five rounds of validation. The following equations were used to evaluate the prediction quality of the SVM models [48, 51]:
In the above equations, TP is the number of true positives, TN is the number of true negatives, FP is the number of false positives and FN is the number of false negatives. The Matthews correlation coefficient (MCC) reflects the overall performance of the model. It ranges from -1 to 1, and a larger MCC value indicates a better prediction.
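As an illustration, the standard definitions of these quantities can be written in terms of TP, TN, FP and FN as below. Accuracy and MCC are the measures reported in the text; including sensitivity and specificity is an assumption of this sketch, and the confusion matrix is hypothetical (the per-class counts are not given here).

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    Assumes both classes are present, i.e. tp + fn > 0 and tn + fp > 0.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal sum is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for a 26p+26n test set, chosen for illustration only.
metrics = classification_metrics(tp=24, tn=24, fp=2, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when one class is predicted much better than the other.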
Results and Discussion
The SVM learning algorithm is a powerful machine learning method that has been widely used in pattern recognition and classification. An SVM is trained on a dataset of experimentally validated positive and negative samples and generates a classifier that assigns unknown samples to one of two distinct categories (positive or negative).
Collection of dataset
We performed an exhaustive literature search on self-derived peptide inhibitors of envelope proteins and collected experimentally validated peptides derived from the three classes of E proteins. For peptides with overlapping segments, only one peptide sequence was kept. In total, 202 peptides were found, of which 101 are active peptides and 101 are non-active peptides (Table 1). 75 active peptide inhibitors and 75 non-active peptides (75p+75n) of E proteins were used as the training dataset in SVM learning; the remaining 26 active and 26 non-active peptides (26p+26n) were used as the test set.
SVM input features.
Three SVM models were developed using different features as input descriptors, namely physicochemical properties (denoted as EAPphysico), amino acid composition (EAPcompo) and the statistical scoring function combined with amino acid composition (EAPscoring).
Knowledge-based statistical functions are rooted in the Bayesian (conditional) probability formalism and are derived directly from properties observed in known folded proteins [52–54]. In a knowledge-based scoring function, it is presumed that averaging over different atom types in experimental conformations is an adequate representation of the random arrangements of these atom types in any compact conformation. Because the three classes of E proteins have different structural folds, it is difficult to extract a structure-based feature that is relevant to their antiviral activities. Generally speaking, any property associated with folded proteins can be converted into an energy function. Since the amino acid sequence determines the structural folds and properties of proteins and peptides, we presumed that a sequence-based statistical scoring function averaging over different amino acid sequences exhibiting inhibitory activities is an adequate representation of the random combinations of all twenty amino acids exhibiting any activity. In this approach, a peptide sequence derived from an E protein is represented by twenty features, each corresponding to the propensity of one of the twenty natural amino acids to occur in active or non-active peptides. A vector of twenty sequence-based statistical scores was used as the EAPscoring input in the SVM learning.
We also built an SVM model using physicochemical properties as input features. Given the nature of the membrane fusion process, it has been suggested that functional regions in glycoproteins need to be solvent accessible, hydrophobic and flexible. Indeed, the majority of known peptide entry inhibitors share the physicochemical property of being hydrophobic and amphipathic, with a propensity for binding to lipid membranes. Therefore, the properties of E peptide inhibitors were described here by five physicochemical parameters: PI, MW, GRAVY index (positive and negative GRAVY values indicate hydrophobic and hydrophilic peptides, respectively), solvent accessibility (exposed or buried) and secondary structure features (propensity for adopting α-helix, β-sheet or turn structure). These physicochemical features were calculated for each of the peptides and used as the EAPphysico input in the SVM learning. A third SVM model, EAPcompo, was also built, in which the fractions of amino acids in a peptide were used as input features in the machine learning process.
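The composition features used for an EAPcompo-style model can be sketched as the fraction of each of the twenty amino acids in a peptide. The alphabetical feature order, the function name and the example sequence are assumptions made for illustration.

```python
from collections import Counter

# Alphabetical one-letter codes; the actual feature order is an assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence):
    """Return the fraction of each of the twenty amino acids in the sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


# Arbitrary example sequence, not a peptide from the dataset.
features = composition_features("LLWFPR")
print(dict(zip(AMINO_ACIDS, features)))
```

Because the fractions are normalized by peptide length, peptides of different lengths map to feature vectors on the same scale.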
The SVM models were trained using the experimentally validated 75p+75n dataset. During 5-fold cross-validation, the training set was randomly partitioned into five subsets of equal size (15p+15n each), four of which were used for training and the remaining one for validation in each round. Three SVM models were built using sequence-based statistical scores, physicochemical properties and amino acid composition, respectively. The performances of the three models are shown in Table 2. The EAPscoring model performed best among the three models during 5-fold cross-validation. A grid search combined with cross-validation was adopted to find the optimal parameters -c and -g of the SVM models. The result of the grid search is shown in the Supporting Information (S1 File). The performances of the three EAP models during 5-fold cross-validation improved significantly with the optimized parameters (Table 2).
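The partition of the 75p+75n training set into five folds of 15p+15n can be sketched as follows. The shuffling procedure, the seed and the helper names are assumptions; the sketch shows only the index bookkeeping, not the SVM training itself.

```python
import random


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign sample indices to folds so that each fold holds an equal share of each class."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


def train_test_rounds(folds):
    """Yield (train_indices, test_indices), using each fold once as the held-out set."""
    for k, held_out in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, held_out


labels = [1] * 75 + [0] * 75  # 75 active (1) and 75 non-active (0) peptides
folds = stratified_folds(labels)
rounds = list(train_test_rounds(folds))
print([len(fold) for fold in folds])
```

Each round trains on 120 peptides and validates on the remaining 30, and the reported cross-validation result is the average over the five rounds.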
Evaluation of the predictive efficiency of SVM models on independent test set
The performance of the SVM models was evaluated using an independent dataset of experimentally validated peptides that were not contained in the learning dataset (Table 1). In the EAPphysico model, where physicochemical properties of peptides were used as input features, an accuracy of 65% with an MCC value of 0.31 was observed (Table 3). In the EAPcompo model, where amino acid composition features were used, the predictive accuracy and the MCC value were slightly higher. When the sequence-based statistical function scores were used as input in the EAPscoring model, an accuracy of 92% was achieved with an MCC value of 0.84. Thus the sequence-based statistical scores developed in the present research are clearly superior to the conventional physicochemical properties or amino acid composition features in identifying active peptides derived from envelope proteins.
Comparison of the predictive efficiency of the AVP and EAP Models
AVPpred is a web server for predicting the activities of general antiviral peptides (AVPs) based on a number of experimentally validated positive and negative datasets. The peptide inhibitors employed in AVPpred act on a variety of biological targets involved in virus infection. In contrast, the self-derived peptides of envelope proteins studied in the present research bind competitively to E proteins and thereby disrupt the virus fusion process. Because the self-derived peptides share a similar mechanism of action, it is feasible to extract common features from them to build predictive SVM models. To evaluate performance in predicting peptide inhibitors of enveloped viruses, we compared the AVPpred models with our EAPpred models using the independent 26p+26n dataset as the test set. The results are shown in Table 3.
Four different features were employed in the AVPpred models, namely conserved motif search using MEME/MAST, amino acid composition, sequence alignment using BLAST, and physicochemical parameters including secondary structure, charge, size, hydrophobicity and amphiphilic character. When the AVPmotif model was used to predict the activities of the self-derived peptide inhibitors, it performed rather poorly, with an accuracy of 52% and an MCC of 0.14. This is not surprising, because AVPmotif was developed based on 20 general antiviral peptide motifs, and the self-derived peptide inhibitors may not share a conserved motif with the general antiviral peptides, since the latter interact with various biological targets through different mechanisms of action. In the AVPalign model, the peptide sequences were classified into active and non-active databases, and the query peptide sequences were matched against both databases using the BLAST program. Compared with AVPcompo and AVPphysico, AVPalign performed better, with a predictive accuracy of 73% and an MCC value of 0.52. The fusion mechanism is highly conserved among related viruses, and entry of viruses into host cells has been inhibited by peptides derived from various regions of envelope glycoproteins. Self-derived peptides would inhibit the interactions of their original domain by mimicking its mode of binding to partner proteins. Because similar sequences are often associated with similar structure and function, the sequence-based property used in AVPalign would account for the activities of the self-derived peptide inhibitors, which regulate virus fusion by mimicking the binding to E proteins.
In the AVPphysico model, the 25 best-performing physicochemical properties were selected from 544 properties to build the SVM model. Antiviral peptide inhibitors are generally amphiphilic, and the activities of peptide entry inhibitors depend on their interfacial hydrophobicity. We therefore employed only five physicochemical properties, reflecting hydrophobicity, solvent accessibility and secondary structure, as SVM input features. The accuracy and MCC of the EAPphysico model are comparable to those of the AVPphysico model, indicating that the five properties used in the current model building are critical for the activities of the peptides.
The MCC value of the AVPcompo model is 0.20, indicating that the antiviral activities of the peptides are related to amino acid composition. When the amino acid composition was used as input, the predictive accuracy of the EAPcompo model was higher than that of the AVPcompo model, indicating that the peptide inhibitors of E proteins in the training set are sufficient to represent the contribution of amino acid composition to their inhibitory activities. In the EAPcompo model, the preference of the amino acid composition was ranked as: P, R, Q, D, F, W, E, L, T, I, N, H, Y, C, A, S, M, V, K, G (Fig 1). The role of arginine-arginine pairing and its contribution to protein-protein interactions has been investigated by computational approaches. The higher abundance of R compared to K at protein-protein interfaces may be attributed to the formation of cation-π interactions and the greater capacity of the guanidinium group in R to form hydrogen bonds [62–64]. Furthermore, it has been suggested that interface regions are enriched in aliphatic (L, V, I, M) and aromatic (H, F, Y, W) residues and depleted in charged residues (D, E, K), with the exception of arginine [62, 65–69]. This is in agreement with our amino acid composition analysis, in which a higher population of the aliphatic residue Leu and of the aromatic residues Trp and Phe was observed, whereas the positively charged Lys was hardly observed. The predominant occurrence of proline and glutamine residues is characteristic of the protein-protein interactions specific to E proteins. For example, a conserved proline-rich motif was suggested to be engaged in monomer-monomer interactions in Dengue E proteins, and a conserved glutamine-rich layer is involved in the extensive H-bond network in HIV-1 gp41 E proteins.
Thus the preference of the amino acid composition identified from the EAPcompo model is generally in accordance with the predominant residues involved in protein-protein interactions, indicating that the amino acid composition of the self-derived peptide inhibitors is closely related to their potential activities in mediating the protein-protein interactions in the virus fusion process.
X-axis is the type of amino acid, Y-axis is W * W.
Because the antiviral activities of peptides depend on amino acid composition, we presume that amino acid composition weighted by the propensity for activity would be an intrinsic feature of the self-derived peptide inhibitors, which share a common mechanism of action. When statistical function scores were employed in the SVM model (EAPscoring), a predictive accuracy of 92% with an MCC value of 0.84 was achieved, significantly better than any of the AVP models. The logarithmic form of the discriminatory function (Eq 1) can be regarded as the pseudo-energy of the system. In our previous study, we suggested that the stability of proteins is related to their in situ binding potential to the partner regions. The strong performance of the EAPscoring model indicates that the sequence-based stability feature of self-derived peptides may reflect their potential for binding to E proteins so as to regulate the virus entry process.
We developed three SVM models using physicochemical properties, amino acid composition and a statistical discriminatory function as input features. The prediction accuracy and the MCC value of the EAPphysico model, in which five physicochemical properties were employed, are comparable with those of the previous AVPphysico model, in which 25 physicochemical properties were used. The AVPcompo and EAPcompo models demonstrated that the activities of antiviral peptides depend on amino acid composition. A sequence-based scoring function was developed for the self-derived peptide inhibitors of E proteins. The superior performance of the EAPscoring model supports our hypothesis that an intrinsic feature, represented by the propensity of each amino acid for being active in self-derived peptides, is responsible for the ability of the peptides to regulate virus fusion by mimicking the binding to their accessory proteins. The sequence-based statistical scoring function would be useful in the development of novel antiviral therapies targeting the initial step of viral infection.
S1 File. Parameter optimization by grid search combined with 5-fold cross-validation.
The x-axis is log2(g), the y-axis is log2(c) and the z-axis represents accuracy (%). (Figure A) Parameter optimization for the EAPphysico model. (Figure B) Parameter optimization for the EAPcompo model. (Figure C) Parameter optimization for the EAPscoring model.
The authors are grateful for the computing resources from QUB high performance computing Centre. The authors declare no conflict of interest.
Conceived and designed the experiments: YX MH. Performed the experiments: YX MH JZ GH SY. Analyzed the data: YX MH JZ GH SY NR RO XT. Wrote the paper: MH.
- 1. Teissier E, Penin F, Pécheur EI. Targeting cell entry of enveloped viruses as an antiviral strategy. Molecules. 2011; 16: 221–250.
- 2. Backovic M, Jardetzky TS. Class III viral membrane fusion proteins. Curr Opin Struct Biol. 2009; 19: 189–196. pmid:19356922
- 3. Kielian M, Rey FA. Virus membrane-fusion proteins: more than one way to make a hairpin. Nat Rev Microbiol. 2006; 4: 67–76. pmid:16357862
- 4. London N, Raveh B, Movshovitz-Attias D, Schueler-Furman O. Can self-inhibitory peptides be derived from the interfaces of globular protein-protein interactions? Proteins. 2010; 78: 3140–3149. pmid:20607702
- 5. Qureshi NM, Coy DH, Garry RF, Henderson LA. Characterization of a putative cellular receptor for HIV-1 transmembrane glycoprotein using synthetic peptides. AIDS. 1990; 4: 553–558. pmid:1974767
- 6. Wild C, Dubay JW, Greenwell T, Baird T Jr, Oas TG, McDanal C, et al. Propensity for a leucine zipper-like domain of human immunodeficiency virus type 1 gp41 to form oligomers correlates with a role in virus-induced fusion rather than assembly of the glycoprotein complex. Proc Natl Acad Sci USA. 1994; 91: 12676–12680. pmid:7809100
- 7. Wild C, Greenwell T, Matthews T. A synthetic peptide from HIV-1 gp41 is a potent inhibitor of virus-mediated cell-cell fusion. AIDS Res Hum Retroviruses. 1993; 9: 1051–1053. pmid:8312047
- 8. Liu S, Jing W, Cheung B, Lu H, Sun J, Yan X, et al. HIV gp41 C-terminal heptad repeat contains multifunctional domains. Relation to mechanisms of action of anti-HIV peptides. J Biol Chem. 2007; 282: 9612–20. pmid:17276993
- 9. Egelhofer M, Brandenburg G, Martinius H, Schult-Dietrich P, Melikyan G, Kunert R, et al. Inhibition of human immunodeficiency virus type 1 entry in cells expressing gp41-derived peptides. J Virol. 2004; 78: 568–75. pmid:14694088
- 10. Koedel Y, Eissmann K, Wend H, Fleckenstein B, Reil H. Peptides derived from a distinct region of GB virus C glycoprotein E2 mediate strain-specific HIV-1 entry inhibition. J Virol. 2011; 85: 7037–7047. pmid:21543477
- 11. Sainz B Jr, Mossel EC, Gallaher WR, Wimley WC, Peters CJ, Wilson RB, et al. Inhibition of severe acute respiratory syndrome-associated coronavirus (SARS-CoV) infectivity by peptides analogous to the viral spike protein. Virus Res. 2006; 120: 146–55. pmid:16616792
- 12. Zheng BJ, Guan Y, Hez ML, Sun H, Du L, Zheng Y, et al. Synthetic peptides outside the spike protein heptad repeat regions as potent inhibitors of SARS-associated coronavirus. Antivir Ther. 2005; 10: 393–403. pmid:15918330
- 13. Yuan K, Yi L, Chen J, Qu X, Qing T, Rao X, et al. Suppression of SARS-CoV entry by peptides corresponding to heptad regions on spike glycoprotein. Biochem Biophys Res Commun. 2004; 319: 746–52. pmid:15184046
- 14. Spence J. Design and Characterization of Glycoprotein-derived Peptide Inhibitors of Arena Virus Infection. PhD. Dissertation, Tulane University. 2013.
- 15. Autoimmune Technologies, Safety, Tolerability, and PK of Escalating Doses of Flufirvitide-3 Dry Powder for Inhalation in Healthy Subjects. 18 Nov 2013. Available: http://clinicaltrials.gov/ct2/show/NCT01990846?term=Flufirvitide&rank=2%202013.
- 16. Hrobowski YM, Garry RF, Michael SF. Peptide inhibitors of dengue virus and West Nile virus infectivity. Virol J. 2005; 2: 1–10.
- 17. Costin JM, Jenwitheesuk E, Lok SM, Hunsperger E, Conrads KA, Fontaine KA, et al. Structural optimization and De Novo design of dengue virus entry inhibitory peptides. PLOS Neglected Tropical Diseases. 2010; 4: 1–11.
- 18. Bai F, Town T, Pradhan D, Cox J, Ashish, Ledizet M, Anderson JF, et al. Antiviral Peptides Targeting the West Nile Virus Envelope Protein. J Virol. 2007; 81: 2047–2055. pmid:17151121
- 19. Schmidt AG, Yang PL, Harrison SC. Peptide inhibitors of dengue virus entry target a late-stage fusion intermediate. PLoS Pathog. 2010; 6: e1000851. pmid:20386713
- 20. Alhoot M, Rathinam A, Wang S, Manikam R, Sekaran S. Inhibition of Dengue Virus Entry into Target Cells Using Synthetic Antiviral Peptides. International Journal of Medical Sciences. 2013; 10: 719–729. pmid:23630436
- 21. Li C, Zhang L, Sun M, Li P, Huang L, Wei J, et al. Inhibition of Japanese encephalitis virus entry into the cells by the envelope glycoprotein domain III (EDIII) and the loop3 peptide derived from EDIII. Antiviral Research. 2012; 94: 179–183. pmid:22465300
- 22. Koehler J, Smith J, Ripoll D, Spik K, Taylor S, Badger C, et al. A Fusion-Inhibiting Peptide against Rift Valley Fever Virus Inhibits Multiple, Diverse Viruses. PLOS Neglected Tropical Diseases. 2013; 7: 1–11.
- 23. Liu R, Tewari M, Kong R, Zhang R, Ingravallo P, Ralston R. A peptide derived from hepatitis C virus E2 envelope protein inhibits a post-binding step in HCV entry. Antiviral Res. 2010; 86: 172–179. pmid:20156485
- 24. Sabahi A. Early Events in Hepatitis C Virus Infection: An Interplay of Viral Entry, Decay and Density. PhD. Dissertation, Tulane University. 2008.
- 25. Si Y, Liu S, Liu X, Jacobs J, Cheng M, Niu Y, et al. Human Claudin-1–Derived Peptide Inhibits Hepatitis C Virus Entry. Hepatology. 2012; 56: 507–515. pmid:22378192
- 26. Galdiero S, Falanga A, Vitiello M, D’Isanto M, Cantisani M, Kampanaraki A, et al. Peptides containing membrane-interacting motifs inhibit herpes simplex virus type 1 infectivity. Peptides. 2008; 29: 1461–1471. pmid:18572274
- 27. Galdiero S, Falanga A, Vitiello M, D’Isanto M, Collins C, Orrei V, et al. Evidence for a role of the membrane-proximal region of herpes simplex virus type 1 glycoprotein H in membrane fusion and virus inhibition. Chembiochem. 2007; 8: 885–895. pmid:17458915
- 28. Galdiero S, Vitiello M, D'Isanto M, Falanga A, Cantisani M, Browne H, et al. The identification and characterization of fusogenic domains in herpes virus glycoprotein B molecules. Chembiochem. 2008; 9: 758–67. pmid:18311743
- 29. Galdiero S, Vitiello M, D'Isanto M, Falanga A, Collins C, Raieta K, et al. Analysis of synthetic peptides from heptad-repeat domains of herpes simplex virus type 1 glycoproteins H and B. J Gen Virol. 2006; 87: 1085–97. pmid:16603508
- 30. Cantisani M, Falanga A, Incoronato N, Russo L, De Simone A, Morelli G, et al. Conformational modifications of gB from herpes simplex virus type 1 analyzed by synthetic peptides. J Med Chem. 2013; 56: 8366–76. pmid:24160917
- 31. Akkarawongsa R, Pocaro NE, Case G, Kolb AW, Brandt CR. Multiple peptides homologous to herpes simplex virus type 1 glycoprotein B inhibit viral infection. Antimicrobial Agents and Chemotherapy. 2009; 53: 987–996. pmid:19104014
- 32. Melnik LI, Garry RF, Morris CA. Peptide inhibition of human cytomegalovirus infection. Virol J. 2011; 8: 1–11.
- 33. Torrent M, Nogues VM, Boix E. A theoretical approach to spot active regions in antimicrobial proteins. BMC Bioinformatics. 2009; 10: 1–9.
- 34. Fjell CD, Jenssen H, Cheung WA, Hancock RE, Cherkasov A. Optimization of antibacterial peptides by genetic algorithms and cheminformatics. Chem Biol Drug Des. 2011; 77: 48–56. pmid:20942839
- 35. Lata S, Mishra NK, Raghava GP. AntiBP2: improved version of antibacterial peptide prediction. BMC Bioinformatics. 2010; 11(Suppl 1): S19. pmid:20122190
- 36. Cherkasov A, Jankovic B. Application of ‘inductive’ QSAR descriptors for quantification of antibacterial activity of cationic polypeptides. Molecules. 2004; 9: 1034–1052. pmid:18007503
- 37. Frecer V. QSAR analysis of antimicrobial and haemolytic effects of cyclic cationic antimicrobial peptides derived from protegrin-1. Bioorg Med Chem. 2006; 14: 6065–6074. pmid:16714114
- 38. Taboureau O, Olsen OH, Nielsen JD, Raventos D, Mygind PH, Kristensen HH. Design of novispirin antimicrobial peptides by quantitative structure-activity relationship. Chem Biol Drug Des. 2006; 68: 48–57. pmid:16923026
- 39. Jenssen H, Fjell CD, Cherkasov A, Hancock RE. QSAR modeling and computer-aided design of antimicrobial peptides. J Pept Sci. 2008; 14: 110–114. pmid:17847019
- 40. Fjell CD, Jenssen H, Hilpert K, Cheung WA, Pante N, Hancock RE, et al. Identification of novel antibacterial peptides by chemoinformatics and machine learning. J Med Chem. 2009; 52: 2006–2015. pmid:19296598
- 41. Frecer V, Ho B, Ding JL. De novo design of potent antimicrobial peptides. Antimicrob Agents Chemother. 2004; 48: 3349–3357. pmid:15328096
- 42. Jenssen H, Lejon T, Hilpert K, Fjell CD, Cherkasov A, Hancock RE. Evaluating different descriptors for model design of antimicrobial peptides with enhanced activity toward P. aeruginosa. Chem Biol Drug Des. 2007; 70: 134–142. pmid:17683374
- 43. Thakur N, Qureshi A, Kumar M. AVPpred: collection and prediction of highly effective antiviral peptides. Nucleic Acids Res. 2012; 40 (Web Server issue): W199–204. pmid:22638580
- 44. White JM, Delos SE, Brecher M, Schornberg K. Structures and mechanisms of viral membrane fusion proteins: multiple variations on a common theme. Crit Rev Biochem Mol Biol. 2008; 43: 189–219. pmid:18568847
- 45. Kyte J, Doolittle RF. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 1982; 157: 105–132. pmid:7108955
- 46. Cheng J, Randall A, Sweredoski M, Baldi P. SCRATCH: a Protein Structure and Structural Feature Prediction Server. Nucleic Acids Research. 2005; 33 (Web Server issue): W72–76. pmid:15980571
- 47. Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. J Mol Biol. 1998; 275: 893–914.
- 48. Cortes C, Vapnik V. Support-Vector Networks. Machine Learning. 1995; 20: 1–31.
- 49. Fan R-E, Chen P-H, Lin C-J. Working set selection using second order information for training SVM. Journal of Machine Learning Research. 2005; 6: 1889–1918.
- 50. Hsu C-W, Chang C-C, Lin C-J. A Practical Guide to Support Vector Classification. Initial version: 2003; last updated: 15 April 2010; 1–16. Available: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
- 51. Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim. Biophys. Acta. 1975; 405: 442–451. pmid:1180967
- 52. Zhou H, Zhou Y. Distance-scaled, finite ideal-gas reference state improves structure-derived potentials of mean force for structure selection and stability prediction. Protein Sci. 2002; 11: 2714–2726. pmid:12381853
- 53. Lu H, Skolnick J. A distance-dependent atomic knowledge-based potential for improved protein structure selection. Proteins. 2001; 44: 223–32. pmid:11455595
- 54. Shen MY, Sali A. Statistical potential for assessment and prediction of protein structures. Protein Sci. 2006; 15: 2507–2524. pmid:17075131
- 55. Moult J. Comparison of database potentials and molecular mechanics force fields. Curr Opin Struct Biol. 1997; 7: 194–199. pmid:9094335
- 56. Ngan S-C, Hung L-H, Liu T, Samudrala R. Scoring functions for de novo protein structure prediction revisited. Methods in Molecular Biology. 2007; 413: 243–282.
- 57. Galdiero S, Falanga A, Tarallo R, Russo L, Galdiero E, Cantisani M, et al. Peptide inhibitors against herpes simplex virus infections. J Pept Sci. 2013; 19: 148–158.
- 58. Badani H, Garry RF, Wimley WC. Peptide entry inhibitors of enveloped viruses: the importance of interfacial hydrophobicity. Biochim Biophys Acta. 2014; 1838: 2180–2197. pmid:24780375
- 59. Pelay-Gimeno M, Glas A, Koch O, Grossmann TN. Structure-Based Design of Inhibitors of Protein–Protein Interactions: Mimicking Peptide Binding Epitopes. Angew Chem Int Ed Engl. 2015; 54: 8896–8927. pmid:26119925
- 60. Vigant F, Santos NC, Lee B. Broad-spectrum antivirals against viral fusion. Nat Rev Microbiol. 2015; 13: 426–37. pmid:26075364
- 61. Vondrášek J, Mason PE, Heyda J, Collins KD, Jungwirth P. The molecular origin of like-charge arginine-arginine pairing in water. Journal of Physical Chemistry B. 2009; 113: 9041–9045.
- 62. Glaser F, Steinberg DM, Vakser IA, Ben-Tal N. Residue frequencies and pairing preferences at protein-protein interfaces. Proteins. 2001; 43: 89–102. pmid:11276079
- 63. Bahadur RP, Chakrabarti P, Rodier F, Janin J. A Dissection of Specific and Non-specific Protein-Protein Interfaces. Journal of Molecular Biology. 2004; 336: 943–955. pmid:15095871
- 64. Chakrabarti P, Janin J. Dissecting protein-protein recognition sites. Proteins: Structure, Function and Genetics. 2002; 47: 334–343.
- 65. Jones S, Thornton JM. Principles of protein-protein interactions. Proc Natl Acad Sci USA. 1996; 93: 13–20. pmid:8552589
- 66. Tsai C-J, Lin SL, Wolfson HJ, Nussinov R. Studies of protein-protein interfaces: A statistical analysis of the hydrophobic effect. Protein Science. 1997; 6: 53–64. pmid:9007976
- 67. Conte LL, Chothia C, Janin J. The atomic structure of protein-protein recognition sites. Journal of Molecular Biology. 1999; 285: 2177–2198. pmid:9925793
- 68. Janin J, Séraphin B. Genome-wide studies of protein-protein interaction. Current Opinion in Structural Biology. 2003; 13: 383–388. pmid:12831891
- 69. Bahadur RP, Zacharias M. The interface of protein-protein complexes: Analysis of contacts and prediction of interactions. Cellular and Molecular Life Sciences. 2008; 65: 1059–1072. pmid:18080088
- 70. Gadkari RA, Srinivasan N. Prediction of protein-protein interactions in dengue virus coat proteins guided by low resolution cryoEM structures. BMC Struct Biol. 2010; 10: 17. pmid:20550721
- 71. Suntoke TR, Chan DC. The fusion activity of HIV-1 gp41 depends on interhelical interactions. J Biol Chem. 2005; 280: 19852–19857. pmid:15772068
- 72. Xu Y, Rahman NABD, Othman RB, Hu P, Huang M. Computational Identification of Self-inhibitory Peptides from Envelope Proteins. Proteins: Structure, Function and Bioinformatics. 2012; 80: 2154–2168.