Fig 1.
The overall workflow of ComMat.
At each cycle, a community of size N, represented by single, pair, and structure features, is updated through communication via cross-over and pair feature updates.
Fig 2.
Performance comparison between the ‘With cross-over’ and ‘No cross-over’ prediction setups on the IgFold set.
(a) Structure prediction errors measured by loop RMSD for best sampled and top scored models with and without cross-over (N = 32). Box plots display the median, interquartile range (IQR) bounds, whisker length of 1.5 × IQR, and outliers beyond the 1.5 × IQR range. (b) Success rates of best sampled and top scored structures with and without cross-over. (c) Best RMSD obtained by the two algorithms for individual targets. (d) Example case (PDB ID 7WPH) where best sampled and top scored models with cross-over (RMSD = 1.34 Å and 0.93 Å, respectively) outperform those without cross-over (3.06 Å and 2.75 Å, respectively).
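The box-plot convention described in panel (a) can be illustrated with a minimal sketch. The function name `box_plot_stats`, the quartile interpolation method, and the RMSD values are assumptions made for illustration and are not taken from the study. The whiskers here end at the most extreme data points lying within 1.5 × IQR of the quartiles, which is one common plotting convention.

```python
import statistics

def box_plot_stats(values, whisker=1.5):
    """Summarize values with the usual box-plot quantities (illustrative helper)."""
    data = sorted(values)
    # Quartiles by linear interpolation between data points (assumed convention).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Whiskers end at the most extreme points that still lie inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# Hypothetical loop RMSD values in Å, not data from the paper.
rmsds = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 20.0]
stats = box_plot_stats(rmsds)
print(stats)
```

With these made-up values, the single large RMSD falls outside the upper fence and would be drawn as an outlier rather than stretching the whisker.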
Table 1.
Median CDR H3 loop RMSD of the best sampled structures for the test complexes in the IgFold set for different methods.
Table 2.
Median CDR H3 loop RMSD of the top-ranking structures for the test complexes in the IgFold set for different methods.
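The difference between the two tables can be sketched as follows, assuming that "best sampled" means the lowest-RMSD model among those sampled for a target and that "top-ranking" means the model ranked first by a scoring function. The function names, the assumption that a higher score is better, and all numbers below are hypothetical.

```python
import statistics

def best_sampled_rmsd(models):
    """Lowest RMSD among all sampled models (requires the reference structure)."""
    return min(rmsd for rmsd, _score in models)

def top_scored_rmsd(models):
    """RMSD of the model with the highest score (assumes higher score = better)."""
    return max(models, key=lambda m: m[1])[0]

# Hypothetical targets: each entry is a list of (loop RMSD in Å, score) pairs.
targets = {
    "target_a": [(1.2, 0.80), (0.9, 0.75), (2.5, 0.60)],
    "target_b": [(3.1, 0.75), (2.0, 0.70), (2.4, 0.65)],
    "target_c": [(1.5, 0.80), (1.8, 0.85), (4.0, 0.40)],
}

median_best = statistics.median(best_sampled_rmsd(m) for m in targets.values())
median_top = statistics.median(top_scored_rmsd(m) for m in targets.values())
print(median_best, median_top)
```

For any single target, the best sampled RMSD can never exceed the top scored RMSD, so the gap between the two medians reflects how well the scoring function picks out near-native models.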
Fig 3.
Comparison of predictions by ComMat with AF2Rank (N = 32) against (a) ImmuneBuilder and (b) AlphaFold-Multimer 2.2 for individual prediction targets, illustrating that different methods excel on different targets. (c) An example case (PDB ID 7S0B) in which ComMat achieved higher prediction accuracy than both ImmuneBuilder and AlphaFold-Multimer 2.2.
Fig 4.
t-SNE visualization of the loop structure sampling trajectory for the antibody binding the glucosyltransferase domain of Clostridium difficile toxin B (PDB ID 7SO5, H3 loop length = 13, maximum sequence identity to the training set = 38.5%) by (a) a single-inference sampling with ComMat trained with cross-over (N = 32) and (b) 32 independent sampling runs with ComMat without cross-over.
The structures obtained through the eight cycles of ComMat are indicated with round dots. Crystal structures, the best sampled and scored models by ComMat, and predictions by IgFold and AF-Multimer are denoted with different symbols. Comparison of the H3 loop structures is presented in (c), (d), and (e).
Fig 5.
t-SNE visualization of the loop structure sampling trajectory for the SARS-CoV-2 binding antibody (PDB ID 7SN1, H3 loop length = 14, maximum sequence identity to the training set = 50.0%) by (a) a single-inference sampling with ComMat trained with cross-over (N = 32) and (b) 32 independent sampling runs with ComMat without cross-over.
The structures obtained through the eight cycles of ComMat are indicated with round dots. Crystal structures, the best sampled and scored models by ComMat, and predictions by IgFold and AF-Multimer are denoted with different symbols. Comparison of the H3 loop structures is presented in (c), (d), and (e).
Fig 6.
Dependence of the RMSD of the best sampled loops on the community size for different loop length ranges.