Fig 1.
Schematic representation of the overall framework.
(A) Training phase: 1. Pre-training the conditional VAE. The symbol represents the concatenation operation. 2. Pre-training the hidden diffusion model. 3. Fine-tuning the conditional diffusion model. 4. Training the MIC predictor. (B) Generating phase: AMP generation and MIC prediction.
Fig 2.
The detailed workflow for designing AMPs targeting specific bacteria.
Fig 3.
The overall architecture of the MIC predictor.
Note that ESM (Evolutionary Scale Modeling) [39] is a pretrained protein language model for sequence embedding, while MLP (Multilayer Perceptron) is a feed-forward neural network for prediction.
Fig 4.
Results of performance comparison with baseline models.
(A) Predicted MIC against E. coli for AMPs generated by six different models. (B) Predicted MIC against S. aureus for AMPs generated by six different models. (C) MSE values for different MIC predictors. (D) MAE values for different MIC predictors.
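For readers unfamiliar with the error metrics in panels (C) and (D), the following is a minimal sketch of how MSE and MAE are conventionally computed for a regression-style predictor. The function names and the example values are hypothetical and chosen only for illustration; they are not taken from the study, and the scale on which MIC values are compared (for example raw or log-transformed) is an assumption left to the caller.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of squared differences between true and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of absolute differences between true and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical MIC values (illustrative only, not data from the paper).
true_mic = [1.0, 2.0, 3.0]
pred_mic = [1.5, 2.0, 2.0]

mse = mean_squared_error(true_mic, pred_mic)
mae = mean_absolute_error(true_mic, pred_mic)
print(f"MSE = {mse:.4f}, MAE = {mae:.4f}")
```

Because MSE squares each error, it penalizes a few large mispredictions more heavily than MAE does, which is one reason both metrics are often reported side by side.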
Table 1.
Results of performance comparison on MIC ratio metrics.a
Table 2.
Comparative evaluation for different MIC predictors.
Table 3.
Comparison results on novelty and diversity of sequences.
Fig 5.
Evaluation of the conditional generative ability.
(A) Conditional Generation. The y-axis represents normalized property values. (B) Unconditional Generation. (C) Comparison of MIC distributions. This panel compares the predicted MIC distributions of model-generated peptides with those of the positive and negative samples from the training set. (D) Comparison of the resulting physicochemical properties of generated peptide sequences under two different target property specifications. The red boxes highlight regions where differences exist between the two property groups. (E) Adjusting the properties of Cecropin and comparing the distributional differences in the generated peptides. (F) Enhancement of MIC for Cecropin. The left panel shows results against E. coli, and the right panel shows results against S. aureus.
Table 4.
Comparison of evaluation indicators between Cecropin and Cecropin-Improve.a
Fig 6.
Visualization of hidden representations.
(A) Visualization of the hidden representations produced by the CVAE encoder for positive and negative data. (B) Contour plots of generated positive and negative samples. (C) t-SNE embedding of generated positive and negative samples (n_components = 2, perplexity = 30, n_iter = 1000, random_state = 43).
Table 5.
Top 20 AMPs targeting E. coli and S. aureus.
Fig 7.
In silico test of biological properties for identified AMPs.
(A) MIC Prediction Targeting E. coli. (B) MIC Prediction Targeting S. aureus. (C)(D) Hemolytic Evaluation of AMPs. (E)(F) Toxin Evaluation of AMPs. (G) Molecular Docking of AMP with E. coli Cell Membrane. (H) Molecular Docking of AMP with S. aureus Cell Membrane.
Table 6.
Analysis of the antibacterial activity of the top 20 AMPs.
Table 7.
Analysis of the hemolytic and toxic properties of the top 20 AMPs.