Systematic benchmarking of deep-learning methods for tertiary RNA structure prediction

doi:10.1371/journal.pcbi.1012715

Fig 5

RMSD and TMscore comparison for the RNA targets in the RNA-puzzles dataset.

Models predicted by machine-learning-based (ML-based) methods are coloured blue, those predicted by fragment-assembly-based (FA-based) methods are green, and the average RMSD of all models for each target is red. Point shape indicates the RNA type: a circle denotes a natural RNA, + a synthetic RNA, and a square an RNA-protein complex. On average, most methods perform much better on this dataset than on CASP15 or the New dataset, possibly because many targets may have been part of the training sets of the ML-based methods and because many homologous structures for these targets are available in the PDB. For most targets in this dataset, the ML-based methods produce the best-quality models (low RMSD and high TMscore) and the FA-based methods the lowest-quality models. a) RMSD values in Å for the targets in the RNA-puzzles dataset. For most targets, the ML-based methods (blue) have a much lower RMSD than the average (red) and the FA-based methods (green). b) TMscores of the predicted models for each target. For almost all targets, the TMscore of the ML-based methods (blue) is higher than that of the average (red) and the FA-based methods (green). For every target, the model with the best TMscore is one predicted by an ML-based method.

doi: https://doi.org/10.1371/journal.pcbi.1012715.g005
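
The per-target quantities described in the caption (the average RMSD over all models, and the method class that yields the best TMscore) could be computed along the following lines. This is only a minimal sketch and not the authors' pipeline: the record layout, the function name `summarise`, the target names and all numeric values are hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, NOT values from the paper:
# (target, method_class, rmsd_in_angstrom, tm_score)
records = [
    ("target_A", "ML", 4.0, 0.60),
    ("target_A", "FA", 12.0, 0.25),
    ("target_B", "ML", 6.0, 0.45),
    ("target_B", "FA", 15.0, 0.20),
]


def summarise(records):
    """Return, per target, the mean RMSD and the class of the best-TMscore model."""
    by_target = defaultdict(list)
    for target, method_class, rmsd, tm_score in records:
        by_target[target].append((method_class, rmsd, tm_score))

    summary = {}
    for target, models in by_target.items():
        summary[target] = {
            # Average RMSD over all models of this target (the red points).
            "mean_rmsd": mean(rmsd for _, rmsd, _ in models),
            # Method class of the model with the highest TMscore.
            "best_tm_class": max(models, key=lambda m: m[2])[0],
        }
    return summary


summary = summarise(records)
```

With real data, each record would correspond to one predicted model, and `best_tm_class` being "ML" for every target would reflect the observation stated for panel b.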