
TastepepAI: An artificial intelligence platform for taste peptide de novo design

Fig 3

Architecture and workflow of LA-VAE.

(A) Schematic illustration of the loss-supervised adaptive data generation framework. The training process is divided into three phases: (1) the initial exploration phase (first half of total epochs, blue) monitors and records the global minimum loss while maintaining the model’s generative capability; (2) the convergence optimization phase (second half of total epochs, purple) generates sequences and terminates upon discovering a lower loss value, and otherwise continues training; (3) the extension phase (additional epochs, dark purple) activates when no new optimal loss is found during phase (2), enabling further optimization. The lower panel shows the core components of the variational autoencoder architecture: the encoder for latent space mapping, the latent space sampler, and the decoder for sequence reconstruction. Yellow and purple dots represent generated and training data points, respectively, illustrating the progressive refinement of the model’s generative distribution. (B) Contrastive learning-based taste property control mechanism. Left panel: workflow of selective taste removal, in which user-specified taste peptides are split into a positive training set and a negative set, each processed through a variational autoencoder to establish contrasting latent spaces. Middle panel (Step 1): visualization of the latent space distribution, showing positive training data (pink), negative data (green), and generated data points (orange). Right panel (Step 2): quality assessment of generated peptides based on Euclidean distances to their k-nearest neighbors (k = 5). Upper plots show high-quality generated peptides (GP 1-3) with significant distance differences between positive and negative samples (*p < 0.05, **p < 0.01); lower plots show low-quality peptides (GP 4-6) with non-significant differences (ns). Scatter plots illustrate the spatial distribution of high-quality (upper) and low-quality (lower) generated peptides (gray) relative to positive training data (yellow) and negative data (green) in the latent space.
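The three-phase training schedule in panel A can be sketched roughly as follows. This is a minimal illustration, not the authors’ implementation: `train_epoch` (one VAE epoch returning its loss), `generate` (latent-space sampling of sequences), and the parameter names are all hypothetical stand-ins.

```python
def loss_supervised_training(train_epoch, generate, total_epochs, extra_epochs):
    """Sketch of the three-phase loss-supervised schedule (hypothetical API)."""
    best_loss = float("inf")
    sequences = None

    # Phase 1: initial exploration -- only monitor and record the global
    # minimum loss over the first half of the total epochs.
    for _ in range(total_epochs // 2):
        best_loss = min(best_loss, train_epoch())

    # Phase 2: convergence optimization -- terminate and generate sequences
    # as soon as a loss below the recorded optimum appears.
    improved = False
    for _ in range(total_epochs - total_epochs // 2):
        loss = train_epoch()
        if loss < best_loss:
            best_loss, sequences, improved = loss, generate(), True
            break

    # Phase 3: extension -- extra epochs activate only if phase 2 never
    # found a new optimal loss.
    if not improved:
        for _ in range(extra_epochs):
            loss = train_epoch()
            if loss < best_loss:
                best_loss, sequences = loss, generate()
                break

    return best_loss, sequences
```

The key design point the caption describes is that generation is gated on loss improvement: sequences are only sampled once the model demonstrably surpasses its exploration-phase optimum.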
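The Step 2 quality check in panel B can be sketched along these lines: for each generated latent point, compare its Euclidean distances to the k = 5 nearest positive-set neighbors against those to the k = 5 nearest negative-set neighbors. The function names are illustrative, and the use of a Mann-Whitney U test is an assumption — the caption marks significance (*p < 0.05) but does not name the test used.

```python
import numpy as np
from scipy.stats import mannwhitneyu  # assumed significance test (not named in the caption)

def knn_distances(query, points, k=5):
    """Euclidean distances from one generated latent vector to its k
    nearest neighbors in a reference set of latent vectors."""
    d = np.linalg.norm(points - query, axis=1)
    return np.sort(d)[:k]

def assess_peptide(query, positive, negative, k=5, alpha=0.05):
    """High quality = significantly closer to the positive training cloud
    than to the negative cloud in latent space."""
    d_pos = knn_distances(query, positive, k)
    d_neg = knn_distances(query, negative, k)
    _, p = mannwhitneyu(d_pos, d_neg, alternative="less")
    return p < alpha, p
```

A generated peptide like GP 1-3 would pass this check (significantly smaller distances to the positive set), while GP 4-6, sitting between or inside both clouds, would yield a non-significant difference and be rejected.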


doi: https://doi.org/10.1371/journal.pcbi.1013602.g003