
Fig 1.

Current coverage of the human proteome.

a-b) Barplot showing the absolute (a) or relative (b) number of PDB coordinate files mapping to the human proteome at >95%, 95–50% and 50–20% thresholds of sequence identity. Legends in barplots a and b are the same. c) Evolution of the coverage of the human proteome by three-dimensional coordinate files in the Protein Data Bank (y-axis) according to the minimum percent identity of the BLAST hits (x-axis). Each line represents the coverage using only the coordinate files available in PDB in a given year. d) Barplot showing the coverage of the human proteome by different types of structural features, both linear (PFAM domains and IDRs) and three-dimensional (PDB) (y-axis is the same as in c). e) Coverage of the proteome by different AlphaFold pLDDT score thresholds (y-axis is the same as in c). f) Coverage (y-axis) of different types of regions (x-axis) depending on AlphaFold confidence levels. g) Current coverage (y-axis) of the human proteome.


Fig 2.

Changes in the structural coverage at the protein level after AlphaFold.

a) Histogram showing the number of proteins (y-axis) according to their structural coverage (x-axis) before (left) and after (right) the release of AlphaFold models. b) Histogram showing the number of proteins for which we previously had less than 1% structural coverage (y-axis) according to their current structural coverage after AlphaFold (x-axis). c) Same as b but now including only high-confidence (pLDDT > 90) AlphaFold predictions. d) Histogram showing how much AlphaFold high-confidence predictions contribute (x-axis) to our coverage of proteins with >95% structural coverage. e-g) AlphaFold models for the previously structureless AGMO, DEGS1 and PEMT proteins. Models are colored on a blue-red scale by per-residue pLDDT score, with red representing low pLDDT and blue high pLDDT.


Fig 3.

Changes in structural coverage of biomedical proteins due to AlphaFold models.

a) Current structural coverage (y-axis) of different subsets of proteins (x-axis). Bars are colored according to the source of the structural coverage. b) Same as a but focusing on ClinVar mutations classified by their pathogenicity (x-axis). c) Same as a but focusing on somatic mutations from TCGA, classified by their likely oncogenicity (x-axis). d) Same as a but focusing on oncogenic mutations from BoostDM. e) AlphaFold model for B3GALT6. Residues are colored according to their pLDDT from red (lower values) to blue (higher values). Pathogenic mutations from ClinVar are highlighted in yellow. f) AlphaFold model for MED12. Coloring is the same as for e, but yellow residues indicate oncogenic mutations.


Fig 4.

Changes in protein structural coverage in other organisms.

a) Comparison of the structural coverage (y-axis) of the five different organisms (x-axis) based on PDB sequence identity. b) Additional structural coverage provided by AlphaFold models in the different species, split by pLDDT score. c) Current high quality structural coverage of the five organisms combining PDB and AlphaFold data.
