Accurate model and ensemble refinement using cryo-electron microscopy maps and Bayesian inference

Samuel E. Hoff; F. Emil Thomasen; Kresten Lindorff-Larsen; Massimiliano Bonomi

doi:10.1371/journal.pcbi.1012180

Abstract

Converting cryo-electron microscopy (cryo-EM) data into high-quality structural models is a challenging problem of outstanding importance. Current refinement methods often generate unbalanced models in which physico-chemical quality is sacrificed for excellent fit to the data. Furthermore, these techniques struggle to represent the conformational heterogeneity averaged out in low-resolution regions of density maps. Here we introduce EMMIVox, a Bayesian inference approach to determine single-structure models as well as structural ensembles from cryo-EM maps. EMMIVox automatically balances experimental information with accurate physico-chemical models of the system and the surrounding environment, including waters, lipids, and ions. Explicit treatment of data correlation and noise as well as inference of accurate B-factors enable determination of structural models and ensembles with both excellent fit to the data and high stereochemical quality, thus outperforming state-of-the-art refinement techniques. EMMIVox represents a flexible approach to determine high-quality structural models that will contribute to advancing our understanding of the molecular mechanisms underlying biological functions.

Author summary

EMMIVox introduces an innovative Bayesian inference method designed to generate precise structural models from cryo-electron microscopy data. Unlike many existing techniques that often compromise structural quality to better fit the data, EMMIVox harmoniously balances experimental data with accurate physico-chemical models. This unique approach also allows EMMIVox to describe structural heterogeneity that is frequently overlooked by conventional methods, providing researchers with both single-structure models and structural ensembles. By explicitly addressing challenges such as data correlation and noise, and by incorporating precise B-factor determination, EMMIVox achieves remarkable improvements in data fitting and stereochemical quality compared to current refinement techniques. Its flexibility and accuracy not only enhance the quality of cryo-EM structural models but also promise to significantly advance our understanding of the complex molecular mechanisms underlying biological functions. In doing so, EMMIVox effectively bridges the gap between complex experimental data and insightful structural models.

Citation: Hoff SE, Thomasen FE, Lindorff-Larsen K, Bonomi M (2024) Accurate model and ensemble refinement using cryo-electron microscopy maps and Bayesian inference. PLoS Comput Biol 20(7): e1012180. https://doi.org/10.1371/journal.pcbi.1012180

Editor: Dina Schneidman, Hebrew University of Jerusalem, ISRAEL

Received: December 16, 2023; Accepted: May 20, 2024; Published: July 15, 2024

Copyright: © 2024 Hoff et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: EMMIVox is implemented as a part of the PLUMED-ISDB module in the development version (GitHub master branch) of PLUMED (https://github.com/plumed/plumed.github.io). The GROMACS topologies and PLUMED input files used in our benchmark as well as the EMMIVox refined single-structure models are available in PLUMED-NEST (www.plumed-nest.org), the public repository of the PLUMED consortium, as plumID:23.041. Scripts to prepare and analyze EMMIVox simulations as well as complete tutorials for single-structure and ensemble refinement are available on GitHub (https://github.com/COSBlab/EMMIVox).

Funding: S.E.H. is funded by the French Agence Nationale de la Recherche (ANR), ANR-20-CE45-0002 (project EMMI). S.E.H. is funded by a Roux-Cantarini fellowship from the Institut Pasteur (Paris, France). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Cryo-electron microscopy (cryo-EM) has become a powerful tool to determine the structures of complex biological systems. Recent advances in physical instrumentation and image processing algorithms have pushed the boundaries of what can be achieved with this technique in terms of resolution and coverage across a wide spectrum of system size and complexity [1]. While most single-particle cryo-EM density maps still have a resolution between 3 Å and 4 Å (Panel a in Fig A in S1 Text), the number of systems determined at atomic resolution is steadily increasing, with the current record set by Apoferritin, resolved in 2020 at ~1.2 Å [2,3]. At the same time, tremendous progress in cryo-electron tomography (cryo-ET) is enabling the determination of the structure of complex biological systems in situ at sub-nanometer resolution [4] (Panel b in Fig A in S1 Text). It is also becoming increasingly clear that single-structure models are not always a faithful representation of the three-dimensional (3D) cryo-EM density maps [5]. Small-scale continuous dynamics of functionally important flexible regions within biomolecules are typically averaged out during the reconstruction of 3D maps resulting in fuzzy density regions with reduced resolution. To accurately interpret these regions, we should therefore move away from single-structure models towards ensembles of conformations [6,7]. In all these situations, it is of paramount importance to convert cryo-EM data into high-quality structural models, for example for in silico structure-based drug design [8] or training deep learning approaches on accurate structural data [9].

Over the years, a variety of different metrics have been proposed to evaluate the quality of structural models obtained from cryo-EM maps [10–13]. Generally speaking, these metrics fall into two categories: how well a model (or set of models) explains the observed map or directly the single-particle images (fit to the data) and how good the model is in terms of basic stereochemical parameters, such as length of chemical bonds, backbone and sidechain dihedral distributions, and clashes between close atoms. To evaluate the fit to the data, the most common approach is to compare the experimental map with the map predicted from a model, for example by calculating the cross correlation between the two maps over the entire 3D space or in the proximity of the structural model. In low-resolution areas, the density can be predicted either as an average over multiple conformations or from a single-structure model by introducing temperature factors (B-factors), which implicitly account for intrinsic dynamics and other reasons behind the observed fuzzy density, such as errors in the image alignment or local damage caused by the electron beam. The main challenge in determining a high-quality structural model is to obtain a good balance between the fit to the observed density map and the overall stereochemical quality of the model.

Several modelling approaches have been developed to refine structural models into cryo-EM maps [14]. These techniques rely on various approaches including homology modelling [15], rigid-body fitting [16–19], flexible refinement [20–28], integrative modelling [29,30], and machine learning to automate model fitting to 3D density maps [31]. Most of these methods do not optimize B-factors, and therefore, while they are extremely useful to create structural models that occupy the space defined by the cryo-EM density, a quantitative evaluation of the model fit to the experimental map with the metrics commonly used for PDB validation is challenging. On the other hand, a handful of methods have been proposed to model structural ensembles from either 2D single-particle images [32–34] or 3D maps [22,35,36]. As of today, one of the most popular refinement software commonly used prior to depositing models in the PDB database is PHENIX [23], which enables real-space refinement, optimization of B-factors, and modelling residues in alternative conformations. The approach implemented in PHENIX relies on an empirical scoring function to maximize the correlation between the cryo-EM map predicted from a model and the experimental map while trying to preserve stereochemical properties. This approach typically leads to an excellent fit to the data (Panels a and b in Fig B in S1 Text) but often at the expense of physico-chemical properties, especially in terms of clashes between atoms (Panels c and d in Fig B in S1 Text).

Here we present EMMIVox, a computational approach to determine single-structure models as well as conformational ensembles using cryo-EM maps. EMMIVox is based on a Bayesian inference framework [37] to balance automatically the experimental information with state-of-the-art physico-chemical models of the system and the surrounding environment. Explicit treatment of data correlation and uncertainty as well as accurate inference of B-factors contribute to the determination of structural models with excellent fit to the data without sacrificing the stereochemical quality of the models. We benchmarked our approach on nine complex biological systems and demonstrated that EMMIVox models outperformed those obtained with state-of-the-art refinement techniques and deposited in the PDB database in terms of several quality metrics. We also illustrate how EMMIVox can be used in combination with medium-low resolution cryo-EM maps to refine coarse-grained models of large protein complexes and to determine conformational ensembles describing the structural heterogeneity hidden in low-resolution areas of atomistic cryo-EM maps. EMMIVox is implemented in the open-source, freely available PLUMED library [38,39] (www.plumed.org) and aims at setting a new standard for single-structure and ensemble refinement by optimally integrating cryo-EM maps with accurate atomistic and coarse-grained physico-chemical models of the system as well as other experimental data, when available.

Results

This section is organized as follows. We first provide a general overview of the EMMIVox approach and illustrate its accuracy in refining both atomistic and coarse-grained single-structure models using cryo-EM maps at various resolutions. We then focus on ensemble refinement and present the results of our benchmark as well as one case study: the GPT type 1a tau filament from progressive supranuclear palsy neurodegenerative disease [40].

Overview of the EMMIVox approach

EMMIVox is based on a Bayesian inference framework to generate a hybrid energy function that combines the molecular mechanics force fields used in classical Molecular Dynamics (MD) simulations with spatial restraints to enforce the agreement of a structural model with the density observed in the voxels of a cryo-EM map. To balance automatically stereochemical quality of the models and fit to the data, EMMIVox: i) pre-filters the voxels of a cryo-EM map to reduce the correlation between experimental data points and therefore help avoid data overweighting; ii) uses prior models of uncertainty in the experimental data obtained from independent 3D reconstructions (half maps) while allowing for the presence of random and systematic errors in the map; and iii) builds spatial restraints weighted by the estimated accuracy in each voxel. Furthermore, EMMIVox exploits modern atomistic and coarse-grained force fields to accurately describe the physico-chemical properties of a biological system as well as its environment, including water molecules, ions, lipids, and small molecules. A Monte Carlo optimization of residue-level B-factors coupled with structural refinement guided by the EMMIVox hybrid energy function enables the determination of single-structure models that fit the observed cryo-EM map while preserving a high stereochemical quality. Instead of implicitly modelling local dynamics with B-factors, EMMIVox can be combined with metainference [41] to obtain structural ensembles that explicitly represent the continuous dynamics of highly flexible regions of the system. A detailed description of EMMIVox is provided in Materials and Methods.

Accurate atomistic single-structure refinement

We first benchmarked the accuracy of EMMIVox in refining single-structure models using as test systems the GPT type 1a tau filament (1.90 Å, PDB 7p6a) [40], the ChRmine channelrhodopsin (2.02 Å, PDB 7w9w) [42], the Anaplastic lymphoma kinase extracellular domain fragment in complex with an activating ligand (2.27 Å, PDB 7n00) [43], a dimeric unphosporylated Pediculus humanus protein kinase (2.35 Å, PDB 7t4n) [44], the human CDK-activating kinase bound to the inhibitor ICEC0942 (2.50 Å, PDB 7b5o) [45], the major facilitator superfamily domain containing 2A in complex with LPC-18:3 (3.03 Å, PDB 7mjs) [46], the Escherichia coli PBP1b (3.28 Å, PDB 7lq6) [47], the mammalian peptide transporter PepT2 (3.50 Å, PDB 7nqk) [48], and the SPP1 bacteriophage tail tube (4.00 Å, PDB 6yeg) [49]. These examples include soluble and membrane proteins, protein complexes, as well as systems containing ordered waters, small molecules, and lipids.

The quality of the EMMIVox models was evaluated and compared to the deposited PDBs using four key metrics. The physico-chemical quality was quantified by the clashscore [50] and MolProbity score [50], while the fit to the data using the correlation coefficient between predicted and observed maps in the proximity of the model (CC_mask) [51] and the EMRinger score [52]. EMMIVox models generally showed improved quality in all four metrics (Fig 1). The improvements are particularly pronounced for clashscore that indicates the absence of serious clashes between atoms in the EMMIVox models, and the MolProbity score, suggesting a superior overall stereochemical quality. At the same time, EMMIVox models often improved the CC_mask indicating a better fit to the experimental cryo-EM map, and the EMRinger scores, particularly in maps with resolution better than 4 Å. Overall, the four metrics indicate that EMMIVox generates single-structure models that improve upon current models deposited in the PDB.

Download:

Fig 1. EMMIVox single-structure refinement benchmark.

a-i) Assessment of the quality of the EMMIVox single-structure model (colored bars) and deposited PDB (grey bars) for each of the nine systems in our benchmark set. Models are evaluated based on their stereochemical quality and fit with the experimental cryo-EM map. The stereochemical quality is measured by two metrics: i) clashscore, which corresponds to the number of serious clashes per 1000 atoms, and ii) MolProbity score, which is a global measure of quality that combines clashscore, percentage of Ramachandran dihedrals in non-favored regions and percentage of bad sidechain rotamers. For both metrics, lower values correspond to higher quality models. The model fit to the data is measure by i) CC_mask, which is the cross-correlation between experimental map and map predicted from the model in the proximity of the structural model, and ii) EMRinger, which measures the precise fitting of an atomic model into the map. For both metrics, higher values correspond to models that better fit the data.

https://doi.org/10.1371/journal.pcbi.1012180.g001

In the following sections, we will focus on the different EMMIVox components that contribute to the observed high quality of the models. Three key aspects affect the balance between stereochemical quality and fit to the data: i) the number of experimental data points (voxels) that are fit using spatial restraints; ii) the strength of these spatial restraints; iii) the accuracy of the cryo-EM map predictor from a model.

Removal of correlated data leads to balanced model refinement. Neighboring voxels of a cryo-EM map contain correlated information, with the strength of correlation depending on the voxel size and the map resolution. Ignoring such correlation leads to overcounting the number of (independent) data points available and ultimately biasing the refinement towards (over)fitting the data at the expense of the stereochemical quality of the models. To address this point, we developed a pre-filtering procedure to subsample the set of cryo-EM voxels and reduce data correlation (Materials and Methods). We first tested our procedure using over 2400 cryo-EM density maps deposited in the EMDB with resolution ranging from 1.78 Å to 4 Å. Upon removal of correlated data, the median number of voxels per model atom decreased as the resolution of the cryo-EM map worsened and became independent of the voxel size (Fig C in S1 Text). This indicates, as expected, that the amount of spatial information provided by high-resolution maps is greater than in medium-low resolution maps.

We then benchmarked the quality of the EMMIVox single-structure refinement as a function of the amount of data removed. Removing a large portion of correlated data led to better stereochemical models compared to utilizing all available voxels (Figs D and E in S1 Text) at the expense of the quality of the fit to the entire cryo-EM map (Figs F and G in S1 Text). As fewer points were removed, the balance shifted towards the fit to the data. We therefore identified a correlation coefficient equal to 0.8 as the optimal threshold for data removal in EMMIVox refinement to guarantee a good balance between stereochemical quality of the model and fit to the data. In our benchmark set, this threshold corresponded on average to removing 45% (± 30%) of the voxels in the proximity of the deposited PDB, resulting in 13 ± 5 voxels per atom.

Bayesian noise models reduce data overfitting. In EMMIVox, the strength of the spatial restraints used to enforce the model agreement with the cryo-EM map reflects the estimated accuracy (noise level) of the density in each voxel. Our Bayesian inference framework enables us to estimate the noise level on the fly based on the consistency between cryo-EM data, physico-chemical prior and potentially additional experimental data (Materials and Methods), and ultimately to downweigh voxels that are considered outlier data points during refinement. To guide noise inference, we developed priors based on the density variability in each voxel calculated from two independent 3D reconstructions, or half maps. In regions of the map where large variations are observed, our approach helps to avoid overfitting the data.

To exemplify this point, we examined the relation between per-residue fit to the map (local CC_mask) and median noise level of the voxels around a given residue. This analysis illustrates that in the EMMIVox models residues in low-noise regions sometimes fit the map even better than in the deposited PDB, while in regions with high noise level the local fit to the map is often much worse (Fig 2A and Fig H in S1 Text). In these regions, during refinement EMMIVox correctly reduced the weight of the experimental data in favor of the molecular mechanics forcefield, ultimately increasing the overall stereochemical quality of the model. To determine which specific physico-chemical properties of the system were better represented in our models, we computed the total number of atom clashes, hydrogen bonds, and salt bridges in the EMMIVox models and deposited PDBs across our entire benchmark (Fig 2B). EMMIVox dramatically reduced the number of serious clashes observed in the deposited PDBs and optimized the positions of backbone and sidechain atoms so as to better describe hydrogen bond and salt bridge geometries (Fig 2C). The improvement in these three physico-chemical fingerprints is more accentuated for residues in regions of the cryo-EM map with high noise, supporting the ability of our noise models to shift, when needed, the balance of the refinement towards the accurate molecular mechanics forcefield used by EMMIVox.

Download:

Fig 2. EMMIVox noise models reduce data overfitting and improve stereochemical quality.

a) Relation between individual residue fit to the experimental cryo-EM map (local CC_mask) and local error in the map in the proximity of each residue for the single-structure EMMIVox model (blue) and the deposited PDB (red) in the case of ChRmine channelrhodopsin (2 Å, PDB 7w9w). The local error is calculated from the difference between the two cryo-EM half maps as the median value across all the voxels associated to a residue by Voronoi tessellation. The same analysis for all the other systems of our benchmark set is reported in Fig H in S1 Text. b) Total number of clashes, hydrogen bonds, and salt bridges in the EMMIVox single-structure models (blue) and deposited PDBs (red) across all the nine systems of our benchmark set. Dark and light bars indicate clashes, hydrogen bonds, and salt bridges that involve residues of the EMMIVox model in low and high CC_mask regions, respectively. Relative variations of these three physico-chemical fingerprints between PDB and EMMIVox models are indicated on top of the blue bars, separately for low and high CC_mask regions. c) Examples of salt bridges formed upon EMMIVox refinement of the deposited PDBs (grey).

https://doi.org/10.1371/journal.pcbi.1012180.g002

Accurate cryo-EM map prediction with Bayesian inference of B-factors. During refinement and validation, a predictor of cryo-EM density from a 3D model (forward model) is required to measure the agreement of a model with the experimental map. Per-residue B-factors need to be determined to smoothen the prediction obtained with classical forward models derived from Gaussian fits of electron scattering factors and thus to explain fuzzy densities with an individual conformation. To reduce data overfitting, PHENIX adds restraints to avoid B-factors of residues close in space to deviate too much from each other. In presence of sharp transitions between ordered and disordered regions, these additional restraints lead to under (over) estimating the B-factors corresponding to flexible (rigid) residues, effectively compressing the space that B-factors can sample. Inspired by PHENIX, we developed a Bayesian inference approach to determine B-factors that allows these restraints to be violated, especially in regions corresponding to order-disorder transitions (Materials and Methods; Fig I in S1 Text). The B-factors inferred by EMMIVox contribute to a more accurate density prediction across the whole system and ultimately to increase the model fit to the data.

Coarse-grained single-structure refinement

For prospective use in single-structure refinement using medium-low resolution cryo-EM and cryo-ET data, we developed a forward model to predict density maps from coarse-grained Martini 3 models, in which each amino acid is represented by a few beads [53] (Materials and Methods). To evaluate the accuracy of our coarse-grained forward model, we calculated density maps from the Martini representations of a set of 1909 cryo-EM structures and quantified the agreement between the predicted and experimental cryo-EM data. For comparison, we also calculated density maps from the all-atom structures using the atomistic forward model. Comparing the CC_mask with the experimental maps given by the coarse-grained and atomistic forward models as a function of experimental resolution revealed that they perform equally well at lower resolutions (>4.0 Å), while the atomistic forward model is more accurate at higher resolutions (<4.0 Å) (Fig 3). These results confirm the accuracy of our coarse-grained forward model and suggest that Martini models may be useful for single-structure refinement of medium-low resolution cryo-EM and cryo-ET data. Additionally, our new forward model will be useful for validating and biasing Martini simulations using cryo-EM and cryo-ET data. However, it should be noted that it is unlikely that Martini coarse-grained models could be directly used for model refinement without modification to the elastic network that Martini uses to limit the flexibility of such models. The use of coarse-grained models during single-structure and ensemble refinement may therefore require informed modifications to the elastic network or the use of other structure-based models (i.e., Go-models) to allow flexibility during sampling.

Download:

Fig 3. EMMIVox coarse-grained refinement.

a) Difference between atomistic and coarse-grained (Martini 3) forward model in the agreement with 1909 experimental cryo-EM density maps. The plot shows the average difference in CC_mask (ΔCC) given by the two forward models as a function of the experimental resolution. Error bars show the standard deviation. b-c) Example structures and cryo-EM density maps at medium (b: PDB 7nnh [94], 4.0 Å) and high (c: PDB 7o6q [95], 1.88 Å) resolution. From left to right: experimental density maps and density maps predicted by the atomistic and coarse-grained forward models. The predicted maps are colored by local cross correlation to the experimental map (low: red, high: blue) and the global CC_mask with the experimental map is shown.

https://doi.org/10.1371/journal.pcbi.1012180.g003

Ensemble refinement

When coupled with metainference [41], EMMIVox can be used to model structural ensembles by interpreting low resolution areas of a cryo-EM map in terms of a mixture of conformational heterogeneity and noise (Materials and Methods). To determine the accuracy of this combined approach, we determined structural ensembles for all the systems in our single-structure refinement benchmark set and compared their quality of fit to the experimental map to the EMMIVox single-structure model (Fig 4A and Fig J in S1 Text). The CC_mask scores for structural ensembles had a median increase of 13% compared to single-structure models, reaching up to a CC_mask of 0.95 for the ensemble of Escherichia coli PBP1b (3.28 Å, PDB 7lq6). The largest improvement was observed for the Bacteriophage SPP1 (4 Å, PDB 6yeg), whose CC_mask score increased by 28.4% from 0.60 observed in the single-structure model to 0.77 for the ensemble.

Download:

Fig 4. EMMIVox ensemble refinement benchmark.

a) Model fit to experimental data (CC_mask) of the nine systems in our benchmark set for the EMMIVox single-structure models (light blue) and EMMIVox ensemble (dark blue). The percentage increase in ensemble CC_mask is reported on top of the dark blue bars. b) Percentage of residues across the entire benchmark set that in the EMMIVox ensembles display multimodal conformational distributions according to the Folding Test of Multimodality. Each bar is decomposed into percentage of residues with only backbone (dark blue), only sidechain (light blue), and both backbone and sidechain (light red) multimodal distributions. Percentages calculated on all residues (left bar) and only on residues displaying RMSFs greater than 6 Å (right bar) are displayed separately. c) Percentage of residues that display multimodal conformational distributions as a function of residue RMSF. Colors as in panel b). d) Relation between per residue B-factor in the single-structure EMMIVox model and residue MSF within the EMMIVox ensemble in the case of ChRmine channelrhodopsin (2 Å, PDB 7w9w). The red line indicates a Bayesian linear fit between these two quantities. The same analysis for all the other systems of our benchmark set is reported in Fig K in S1 Text1. e) Cryo-EM density map of ChRmine channelrhodopsin (EMD-32377) overlaid to the EMMIVox single-structure model. Residues with large B-factors highlighted in the dashed box of panel d) are colored in violet. f) EMMIVox structural ensemble of ChRmine channelrhodopsin. Colors as in panel e).

https://doi.org/10.1371/journal.pcbi.1012180.g004

The superior model-data fit is not necessarily due only to the larger number of parameters used in ensemble refinement, i.e. multiple conformations, because single-structure models can implicitly account for dynamic effects using residue-level B-factors to describe fluctuations around an average conformation. The improved ensemble CC_mask suggests therefore that the dynamics of flexible regions might not be well represented by such Gaussian fluctuations. To investigate this point further, we examined in more detail the nature of the structural diversity observed in our EMMIVox ensembles. Across our entire benchmark set, 48% of residues showed a dynamic behavior that cannot be described by a unimodal distribution (Fig 4B), neither at the backbone nor side-chain level. When restricting our analysis to medium-to-large scale dynamics, defined by residue Root Mean Square Fluctuations (RMSF) greater than 6 Å, the majority of residues (61%) sampled multimodal conformational distributions impacting a larger portion of backbone atoms as dynamics increases (Fig 4C). These results, combined with the higher ensemble CC_mask, indicate that unimodal, Gaussian-like distributions centered around a single-structure model with fluctuations proportional to the B-factors cannot accurately describe the conformational heterogeneity of the backbone and side chains that are averaged out in cryo-EM maps.

To determine whether the EMMIVox ensembles overinterpret experimental noise as conformational heterogeneity, we compared for each system the per-residue B-factors of the single-structure model to the residue MSF within the EMMIVox ensemble (Fig K in S1 Text). For all systems we observed a correlation between these two quantities indicating that the majority of the (multi-modal) heterogeneity observed in our ensembles corresponds to an increased B-factor in the single-structure model. However, most systems contained several residues with large B-factors that were not matched by proportionately large MSFs. This suggests that fuzzy densities around these residues correspond to experimental noise rather than structural dynamics. One intriguing example is ChRmine channelrhodopsin (2 Å, PDB 7w9w). In the EMMIVox single-structure model we observed disproportionately high B-factors for residues 191–214 and 269–279 (Fig 4D, dashed box), which are located on the intracellular side (Fig 4E). Density around these residues is extremely fuzzy, most likely due to the residual density from the antibody introduced to facilitate image alignment and/or radiation damage as this region is unprotected by the micelle during data collection (Fig 4E). EMMIVox correctly did not overinterpret this fuzzy density as an extremely dynamic region but generated an ensemble with reduced conformational heterogeneity compared to the single-structure B-factors (Fig 4F). This analysis, along with the previous observation that structural variety is often described by multimodal distributions, show that it is possible to get additional information about conformational heterogeneity beyond simply representing the continuous dynamics averaged in cryo-EM maps as Gaussian-like fluctuations proportional to the B-factors.

Structural dynamics from atomistic cryo-EM maps: The GPT type 1a tau filament

As cryo-EM approaches atomistic resolution the question arises as to whether ensemble descriptions of density maps are still necessary [54]. We argue that ensemble representations are essential to extract all the information contained in these maps, such as the presence and location of semi-ordered water and lipids around proteins or local side chain flexibility, which cannot be captured by single-structure models nor by 2D particle classification. We demonstrate this concept using as example the GPT type 1a tau filament, which was recently determined at 1.9 Å resolution (PDB 7p6a, EMDB 13223) [40].

Tau 1a proteins, which are thought to play a role in various neurodegenerative diseases, organize into tens-of-nm-long filaments with fold-dependent repeating structures. With EMMIVox, we determined the structural ensemble of the tau 1a filament from an atomistic cryo-EM map and observed several interesting features. First, the location of ordered water molecules found in the deposited PDB was correctly identified by the water density within the EMMIVox ensemble without the use of symmetry constraints (Fig 5A and 5B). Notably, we also observed additional water density close to the ordered waters, which was not present in the deposited PDB. This density corresponds to a second hydration shell composed of semi-ordered waters in exchange with bulk solution. Second, it is known that the aromatic residues at the surfaces of fibrils may be important to fibril growth [55]. Recent NMR studies revealed that the dynamics of these aromatic rings vary significantly depending on the amino acid location, with surface exposed amino acids populating a larger number of conformations in rapid exchange compared to aromatics within the fibril core [56]. This behavior was also observed in our EMMIVox ensemble (Fig 5C and 5D). This raises the question as to whether other non-aromatic surface exposed amino acids occupy multiple conformations, and if this information can be extracted from atomistic cryo-EM maps. We calculated the population of the conformers of two surface-exposed residues from the EMMIVox ensemble: K343, which in the deposited PDB was modeled in two alternative conformations with equal occupancy, and K347. We confirm that K343 occupies two distinct conformations with equal probability of 50% (Fig 5E). Interestingly, we also observed a previously unmodeled minor conformation for K347 with an occupancy of 30% compared to 70% for the major conformation present in the deposited PDB (Fig 5F).

Download:

Fig 5. Case study: The GPT type 1a tau filament.

a) Ordered waters in proximity of H299 resolved in the deposited structure (PDB 7p6a, 1.9 Å resolution, EMDB 13223). b) Water density within the EMMIVox ensemble showing both ordered and semi-ordered molecules in the proximity of H299. cd) Distribution of the χ1 sidechain dihedral angles of residues F346 (c) and F378 (d) computed from the EMMIVox ensembles, along with representative conformations. These distributions indicate that exposed aromatic residues populate a larger number of conformations compared to buried aromatics, as previously observed in NMR studies of other fibrils. ef) Structural ensembles of residues K343 (e) and K347 (f) indicating the presence of multiple sidechain conformations. In the EMMIVox ensemble, K343 occupies two distinct states with equal populations, as illustrated by the distribution of the lysine sidechain nitrogen positions projected on a plane parallel to the fibril surface. Both states were modelled in the deposited PDB with equal occupancy. K347 instead was modelled in a single conformation in PDB 7p6a, while in the EMMIVox ensemble it occupies a second minor conformation, populated only by 27%.

https://doi.org/10.1371/journal.pcbi.1012180.g005

Discussion

Here we presented EMMIVox, a tool to determine accurate single-structure models as well as structural ensembles by combining MD simulations driven by state-of-the-art molecular mechanics force fields and cryo-EM density maps. EMMIVox incorporates cryo-EM data in the form of voxels using a Bayesian framework that explicitly accounts for data correlation and noise. The method automatically balances model fit to experimental data with stereochemical quality and produces single-structure models that outperform the structures deposited in the PDB in terms of multiple key metrics. EMMIVox systematically reduced the number of serious clashes observed in deposited structures and optimized the positions of backbone and side-chain atoms to improve the description of physico-chemical interactions fundamental for protein stability, such as hydrogen bonds and salt bridges. These single-structure models can be leveraged for all those applications that require high-quality data, such as in silico structure-based drug design or training machine learning approaches.

Even more importantly, EMMIVox can be used to determine structural ensembles that reveal the conformational heterogeneity hidden in low-resolution areas of cryo-EM maps. While 2D classification, manifold embedding and other advanced image processing techniques [57–64] have the potential to identify distinct conformational states directly from the single-particle data, we have shown here that even atomistic maps still present regions in which dynamics of flexible regions is averaged out. EMMIVox relies on both the cryo-EM data and physico-chemical knowledge to determine structural ensembles that capture key information that is lost in single-structure models, such as the presence and population of minor sidechain conformational states, semi-ordered waters, lipids and ligands. Our analysis indicated that this structural variability often corresponds to multimodal conformational distributions that cannot be well described by Gaussian-like fluctuations around an average model and proportional to single-structure B-factors. While these might appear as minor details, we have shown in the past that dynamics of flexible regions extracted from cryo-EM maps often play a crucial role in biological function, for example for ligand recognition in the ASCT2 transporter [65] and to understand the effect of post-translational modifications in microtubules [66].

EMMIVox presents multiple advantages with respect to our previous approach for structure determination from cryo-EM maps (EMMI) [35]. EMMIVox directly utilizes the density voxels of 3D cryo-EM density maps, replacing the previous representation method using Gaussian Mixture Models (GMM). While representing the cryo-EM density as voxels requires careful removal of correlated datapoints, it significantly increases usability, especially with atomistic cryo-EM density maps, for which determining accurate GMMs is prohibitively expensive. The direct use of voxels also enables estimation of experimental errors from deposited half-maps and the use of noise models for each density voxel, ensuring that various sources of error are not interpreted as structural dynamics when modeling structural ensembles. Additionally, our novel Bayesian inference approach to determine B-factors, which is unique to EMMIVox, enables accurate single-structure refinement and quantitative comparison of the models to the deposited PDBs using standard metrics of quality.

In spite of significant improvements, EMMIVox still presents some limitations. First, the computational cost of EMMIVox is higher compared to the real-space refinement approach implemented in PHENIX. The main reason is the use of accurate models of the environment, i.e. explicit water molecules, lipid bilayer, and solution ions, and of the intermolecular forces, especially long-range interactions. Furthermore, while single-structure refinement can be performed on a GPU-enabled Desktop computer in 10 to 24 hours, the determination of structural ensembles requires simultaneous use of multiple computer nodes, which may be out of reach for some users. Second, EMMIVox was designed to capture the underlying conformational distribution of the selected single-particle images used to reconstruct the 3D density map. As this data is a subset of the entire particle set, it is important to note that the resulting ensemble may represent local dynamics around a stable macrostate and not the entire conformational landscape. Furthermore, it should be kept in mind that the cryo-EM cooling process might perturb the room temperature ensemble by reducing thermal motion and enabling equilibration into lower free-energy conformations [67]. In EMMIVox, these issues are partially mitigated by using the room temperature ensemble provided by the molecular mechanics force field as a prior. Third, while EMMIVox has been shown to accurately and rapidly capture conformations present in available cryo-EM density maps, including those which are difficult to observe on short time scales in standard MD simulations (Fig L in S1 Text), EMMIVox may have trouble sampling conformations separated by large free-energy barriers. In this case, sampling issues can be alleviated by combining EMMIVox with various enhanced sampling techniques, such as metadynamics as recently proposed in the MEMMI approach [68]. Finally, because of these sampling limitations, an initial model that already fits the experimental density to a good extent is required as starting point. AlphaFold2 [9], ab initio machine learning and other automated model building techniques [31,69], or flexible fitting methods such as MDFF [21], TEMPy-ReFF [22], and the novel maximum likelihood approach implemented in GROMACS [24], can provide excellent starting models for accurate single-structure and/or ensemble refinement with EMMIVox.

Despite these limitations, EMMIVox enables determining accurate structural and dynamic models using in vitro data, and holds promise as a valuable tool for future efforts focused on structure determination in situ. As the resolution of cryo-ET improves (Panel b in Fig A in S1 Text), high-quality in cell data will become readily available and integrative approaches that combine various sources of experimental and computational data will be required to obtain accurate structural models. The coarse-grained forward model implemented in EMMIVox will enable refining models of large macromolecular architectures from sub-nanometer subtomogram averaging data. Additionally, the Bayesian framework on which EMMIVox is built enables automatic weighting of multiple sources of in silico and experimental data, making it a perfect framework for integrative structure and dynamic determination in cell. This integration of a diverse set of data is further facilitated by the implementation of EMMIVox in the PLUMED-ISDB [70] module of PLUMED [38], which makes it possible to combine a wide range of different types of experimental data.

In summary, EMMIVox represents a flexible instrument to convert cryo-EM data into high-quality structural models. Our approach can be used to reveal dynamic properties of proteins, lipids, ligands, waters, and ions by extracting information from cryo-EM density maps that would otherwise be lost. These models can advance our understanding of the molecular mechanisms that drive biological functions, and provide valuable information for structure-based drug design and training of novel machine learning approaches. Looking ahead, we envision that EMMIVox will be an indispensable part of integrative structural biology pipelines and will contribute to obtaining a more complete picture of highly intricate and dynamic systems in biologically relevant environments.

Materials and methods

Theory of EMMIVox for single-structure refinement

General overview.

EMMIVox is based on a Bayesian inference framework [37] that estimates the probability of a model M given the information available about the system, including prior physico-chemical knowledge and newly acquired experimental data. The posterior probability p(M|D) of model M, which is defined in terms of its structure X and other parameters, given data D and prior knowledge is: (1) where the likelihood function p(D|M) is the probability of observing data D given M and the prior p(M) is the probability of model M given the prior information. At variance with the previous EMMI approach [29,35], here we define the experimental data as a set of density values D = {d_i} observed in the voxels of a cryo-EM three-dimensional (3D) map. To define the likelihood function, one needs i) a forward model f_i (X) to predict the density that would be observed in voxel i for structure X in absence of noise, and ii) a noise model that defines the distribution of deviations between observed and predicted data.

Atomistic forward model.

To predict the density in voxel i of a cryo-EM map from an atomistic model, we used the fast Fourier transform of a 5-Gaussian fit of the electron scattering factors [71]: (2) where X_j and V_i are the coordinates of atom j and the center of the voxel i, respectively. The Gaussians parameters A_k and B_k depend on the atom type (Table A in S1 Text). The B-factors , which are defined for each atom j but are identical within individual residues, enable smoothing the forward model prediction to describe low-resolution regions with fuzzy density using a single-structure model. The external sum runs over all the non-hydrogen atoms, with exclusion of the carboxylate oxygens of glutamic and aspartic acid, as these groups are often damaged by the electron beam. To reduce the computational cost, we further restricted this summation to a neighbor list of atoms with distance cutoff from the voxel center of 1.0 nm.

Noise model.

We use a Gaussian noise model for the density d_i observed in voxel i.: (3) where the uncertainty parameter quantifies the level of experimental noise as well as errors in the forward model. These variable parameters allow to dynamically adjust the intensity of the cryo-EM spatial restraints during model refinement based on the accuracy in the data, which is inferred by the consistency between cryo-EM data, physico-chemical prior and potentially additional experimental data. High-noise voxels (outliers) are therefore automatically identified and downweighed in the refinement of the structural model in favor of the prior information. α is a constant scaling factor between the entire set of voxels of the predicted and experimental maps, and it is optimized prior to production runs (Scale factor optimization).

Data correlation and likelihood function.

The data likelihood p(D|M) is typically expressed as a product of individual likelihoods p(d_i |M), one for each experimental data point, under the assumption that these points are independent. In case of cryo-EM data, this assumption does not hold because neighboring voxels of a 3D map contain correlated information. Ignoring such correlation would lead to overcounting the number of independent data points and ultimately biasing the refinement towards overfitting the data at the expenses of the stereochemical quality of the models. To reduce data correlation, we developed the following pre-filtering procedure:

We first selected all the voxels of the 3D cryo-EM map that are within 3.5 Å of the atoms in the PDB structure. Density voxels with negative values were discarded. The selected atoms include protein atoms as well as ordered waters, ions, small-molecules and lipids. We sorted the selected voxels in descending order based on the value of the density: this constitutes our initial pool of data points D = {d_i};
We started from the first voxel d₁ (highest density) in pool D and calculated the density autocorrelation function for displacements up to a few voxels in the x, y, and z direction starting from this location. For each displacement, the autocorrelation function is calculated by averaging over multiple starting voxels in a cubic minibox of side equal to 4 Å and centered on d₁. Equivalently, the autocorrelation can be expressed as the Pearson correlation coefficient of a series of density values and its space-lagged version;
We removed from pool D all the voxels within the cubic minibox with Pearson correlation coefficient greater than a predefined threshold, with the exception of d₁.
We moved to the next voxel d₂ in pool D in descending order of density and re-applied the filtering procedure of step 2 and 3;
We iterated until reaching the voxel in pool D with lowest density.

It should be noted that sub-sampling procedures are commonly used in structural modelling with experimental data, for example to remove correlation between points in Small-Angle X-ray scattering profiles [72]. PHENIX itself utilizes only the density in the positions occupied by the atoms during each step of refinement, which is obtained by interpolating the density in the voxels surrounding each atom [23]. After pre-filtering the map with the procedure describe above, we assume that the N_V selected voxels can be considered as independent data points and write the total likelihood function as: (4)

Priors.

Experimental cryo-EM density maps may contain errors from various sources including systematic errors during 3D reconstruction, noise from the use of a limited number of images taken at various sample orientations, and radiation damage from exposure to the electron beam. Furthermore, the forward model used to predict a map from a structural model is intrinsically inaccurate. While these errors might be difficult to measure, it is critically important that they are not misinterpreted as structural dynamics during the generation of structural ensembles. To guide the inference of the uncertainty parameters σ_i that quantify the noise level in the map, we defined a lower bound for each voxel by calculating the voxel-by-voxel variation between two independent reconstructions, or half maps. While this quantity does not describe all the sources of error at play, we expect that the total error in a given voxel cannot be lower than . We imposed this lower bound using the following prior for the uncertainty parameters σ_i [73]: (5) where 1/σ_i is a typical Jeffreys prior.

Inspired by PHENIX, we also added a restraint to guide B-factors inference and avoid that residues close in space have B-factors significantly different from each other. These restraints are enforced by: (6) where the product is over all pairs of B-factors corresponding to residues for which at least a pair of atoms is closer than 5 Å. At variance with the B-factor restraint implemented in PHENIX, our Bayesian approach tolerates outliers, i.e. pair of close residues with significantly different B-factors, via the introduction of the uncertainty parameters σ^f = {σ_jk}. This approach allows us to better model during single-structure refinement those regions of the system that undergo a sudden transition between order and disorder.

Finally, as structural prior p(X) we used state-of-the-art atomistic force fields E_FF (X), which enabled us to accurately model the system as well as the environment, including explicit water molecules, ions, small-molecules, and lipids: (7) where k_B is the Boltzmann constant and T the temperature of the system.

Marginalization.

To avoid sampling the uncertainty parameters σ_i and σ^f, we marginalized the corresponding distributions. The resulting marginal data likelihood is: (8) while the B-factors prior becomes upon introduction of a Jeffreys prior 1/σ_jk: (9) where is a lower bound for the B-factor uncertainty parameters set equal to 0.1 nm².

EMMIVox hybrid energy function.

After defining all the component of our approach and marginalizing the uncertainty parameters, we obtain the final EMMIVox posterior distribution: (10) where the marginal data likelihood is given by Eq 8, the B-factors prior by Eq 9, and the structural prior by Eq 7. To sample the posterior, we define the associated hybrid energy function as: (11)

The EMMIVox hybrid energy is therefore decomposed into: i) the molecular mechanics force field E_FF (Eq 7), ii) the spatial restraints E_(cryo−EM) to enforce the model agreement with the cryo-EM map (Eq 8), and iii) the restraints E_bf to guide B-factors determination (Eq 9). A Gibbs sampling scheme is used to sample model coordinates with Molecular Dynamics (MD) and B-factors parameters with Monte Carlo (MC).

Development of the coarse-grained forward model

Parameterization.

To enable the use of EMMIVox with coarse-grained representations, we developed a forward model to calculate density maps from Martini 3 protein structures [53]. We parameterized the model by fitting to predictions of the atomistic forward model. In the atomistic forward model, the atomic electron scattering factors are given by the Fourier transform of a 5-gaussian mixture (see section Atomistic forward model and Eq 2). We used the same framework for the Martini forward model, but with bead electron scattering factors given by a single Gaussian, i.e. only two parameters A_k and B_k for each Martini bead type k. We determined an individual set of A_k and B_k for each bead position in each type of amino acid for a total of 52 different bead types using the following procedure. To capture the conformational variation in the atoms mapping to each bead type, we used a set of 2906 structures to fit the parameters. These were all the structures obtained with single-particle cryo-EM in the period 2020–2023, with resolution ranging from 1.2 Å to 11 Å. We divided the structures into groups of atoms, each corresponding to one Martini bead type, skipping any beads with missing atoms. For each group, we defined a cubic box of voxels centered on each atom and with side equal to 6 Å (for a total of 6859 evenly spaced voxels). In this set of voxels, we calculated a density map from the group of atoms using the atomistic forward model and from the corresponding Martini bead using the coarse-grained forward model with a grid scan of A_k and B_k. We scanned values of A_k from 0 to 8.0 in steps of 0.1 and B_k from 0 Å² to 40.0 Å² in steps of 0.5 Å². We evaluated the agreement between the two forward models as a function of A_k and B_k using the squared error summed over all the voxels: (12) where N_V is the number of voxels, and are the voxel densities predicted by the Martini and atomistic forward models, respectively. For each of the 52 Martini bead types, we averaged the MSE grid scan over the bead type’s occurrences in all 2906 structures and selected the set of A_k and B_k that minimized the average MSE (Table B in S1 Text).

Validation.

To validate the Martini forward model and compare its accuracy with that of the atomistic forward model, we evaluated the agreement between predicted and experimental density maps for 1909 cryo-EM structures. These were selected from our initial set of 2906 structures as those that could be easily mapped to the Martini 3 representation using the martinize2 python script without any additional modifications to the structures [74]. We calculated density maps from the atomistic and Martini representations using the respective forward models and calculated the CC_mask with the experimental density maps. We used the same protocol described below (in Details of the single-structure refinement benchmark) to preprocess the experimental maps and to optimize the B-factors and scale factors of the predicted maps to maximize CC_mask. The B-factors were sampled for 1000 MC steps and the scale factor was scanned from 0.7 to 1.3 in steps of 0.05. No data correlation filtering was used for the experimental maps. To evaluate the difference between the forward models as a function of the experimental resolution, we binned the structures by resolution (bin size equal to 1.0 Å) and calculated for each bin the average difference in CC_mask between the atomistic and Martini forward models: (13) where N_S is the number of structures in the bin.

Theory of EMMIVox for ensemble refinement

To model structural ensembles fitting a cryo-EM density map, we coupled EMMIVox with metainference [41]. Metainference is a general Bayesian inference approach that enables modelling structural ensembles using any kind of ensemble-averaged experimental data as well as prior physico-chemical information. Inspired by the maximum entropy/replica-averaged approach [75], the metainference posterior is expressed in terms of several copies of the system (or replicas), which represent the conformational heterogeneity of the system. With a Gaussian data likelihood, as in the EMMIVox case, the metainference posterior is: (14) where {X_r} are the structures of the N_R replicas of the system and {σ_r,i} the error parameters, one per replica r and experimental data point i. The predicted experimental observable f_i ({X_r}) is calculated as the average of f_i over the N_R replicas. Sampling of the metainference posterior is performed by a multi-replica MD/MC simulation driven by the hybrid energy function associated to Eq 14. This approach has been extensively used to determine structural ensembles of highly dynamic and disordered systems using NMR spectroscopy [76,77] as well as SAXS/SANS [78,79] data. Metainference was also used to determine structural ensembles from cryo-EM data in our previous EMMI approach based on a Gaussian Mixture Models representation of the cryo-EM map [29,35]. In combination with EMMIVox and the more accurate noise models developed here, metainference enables ensemble-refinement by interpreting low resolution areas of a cryo-EM map in terms of a mixture of conformational heterogeneity and noise.

Details of the single-structure refinement benchmark

Details of the systems.

Nine systems determined at a resolution ranging from 1.9 Å to 4.0 Å were chosen to benchmark the EMMIVox single-structure refinement (Table C in S1 Text): the GPT type 1a tau filament, the ChRmine channelrhodopsin, the Anaplastic lymphoma kinase extracellular domain fragment in complex with an activating ligand, a mutant of dimeric unphosporylated Pediculus humanus protein kinase, the human CDK-activating kinase bound to the clinical inhibitor ICEC0942, the major facilitator superfamily domain containing 2A in complex with LPC-18:3, the Escherichia coli PBP1b, the mammalian peptide transporter PepT2, and the SPP1 bacteriophage tail tube.

Setup and general MD details.

Missing residues in the deposited PDB were modelled using GalaxyFill [80] or, in case missing residues could not be placed, with Modeller [15] v. 10.1. The resulting model was then processed using the CHARMM-GUI [81] server. Membrane proteins (PDB ids 7w9w, 7mjs, and 7nqk) were inserted in a homogeneous POPC lipid bilayer. Each system was solvated in a triclinic box with dimensions chosen in such a way that the edge of the box was 1.0 nm away from the closest model atom. K+ and Cl- were added to ensure charge neutrality at concentration equal to 0.15 M. The CHARMM36m [82] forcefield was used for proteins and lipids, CgenFF for small molecules [83], and the mTIP3P [84] model for water molecules. In all simulations the equations of motion were integrated by a leap-frog algorithm with timestep equal to 2 fs. The smooth particle mesh Ewald [85] method was used to calculate electrostatic interactions with a cutoff equal to 1.2 nm. Van der Waals interactions were gradually switched off at 1.0 nm and cut off at 1.2 nm. All simulations were carried out using GROMACS [86] v. 2020.5 equipped with the development version of PLUMED [38]. To optimize performances, EMMIVox is implemented in PLUMED with libtorch [87] to efficiently calculate the cryo-EM forward model and the hybrid energy function on the GPU.

Cryo-EM map preprocessing.

For each system, we downloaded the cryo-EM map as well as the two half maps from the EMDB database. We applied our pre-processing procedure to select the voxels within 3.5 Å of the model atoms and filter them to reduce data correlation, with a threshold of 0.7, 0.8, 0.9, and 1.0 (no filtering). The two experimental half maps were used to calculate the lower bound for the density uncertainty parameters. Since CHARMM-GUI processing of the deposited PDB often translates and rotates the input conformation, we calculated the transformation that aligns initial and final models and applied it to all the voxels selected for refinement. Finally, a single map data file was created with the list of voxels to be used by PLUMED for refinement and, for each voxel, the following information: (transformed) coordinates of the voxel center, density value, and uncertainty lower bound.

Equilibration.

Energy minimization was performed on each system using the steepest decent approach. A 1 ns-long NPT equilibration was then performed with the Bussi-Donadio-Parrinello thermostat [88] and the Berendsen barostat [89], set at 300 K and 1 atm respectively. In systems containing a lipid bilayer, the pressure coupling type was semi-isotropic to allow deformations in the xy plane independent from the z-axis. Next, NVT simulations were carried out for 2 ns using the Bussi-Donadio-Parrinello thermostat at 300 K. During the last two steps of equilibration, positional restraints were applied to all the heavy atoms of the protein and any other components contributing to the predicted cryo-EM density map.

Modelling of ordered waters.

In presence of ordered waters in the deposited PDB, a special treatment was required. First, resolved waters as well as all the water molecules within 3.5 Å in the initial CHARMM-GUI solvated model (buffer waters) contributed to the map prediction via our forward model. Second, the positions of ordered waters were restrained during equilibration; the buffer waters were restrained to stay within 8 Å from a reference protein atom, defined as the closest to each water molecule in the initial model.

Scale factor optimization.

To reduce the number of free parameters to sample during the production run, we determined an optimal scaling factor between predicted and observed cryo-EM maps. We analyzed with the PLUMED driver tool the trajectory obtained with positional restraints during the NVT equilibration for different values of the scaling factors in the range from 0.5 to 1.5 at intervals of 0.05. For each value of the scaling factor, we sampled with MC the B-factors while re-reading the trajectory, with a maximum MC move per B-factor equal to 0.05 nm². To optimize sampling, B-factors were initialized using an empirical relationship between map resolution (res, in nm) and average B-factor values (in nm²) determined from 8000 deposited PDBs obtained from cryo-EM maps with resolution less than 5 Å: (15)

The scaling factor that resulted in the lowest EMMIVox hybrid energy along the entire trajectory was selected to be used in the production simulations described in the following section.

Production.

All single-structure refinement production simulations were performed in the NVT ensemble with Bussi-Donadio-Parrinello thermostat at 300 K for 20 ns (NPT ensemble with Parrinello-Rahman barostat [90] at 1 atm for membrane proteins). Before calculating the EMMIVox hybrid energy, structural discontinuities due to Periodic Boundary Conditions (PBC) were fixed on-the-fly by PLUMED. Furthermore, to optimize performances, we: i) updated the forward model neighbor list every 50 MD steps; ii) sampled the B-factors every 500 MD steps with maximum MC move equal to equal to 0.05 nm²; iii) used a multiple-time step algorithm for PBC reconstruction and EMMIVox hybrid energy calculation with stride equal to 4 MD steps [91]. The system trajectory was saved to file every 10 ps for subsequent analysis. Ordered and buffer waters that contributed to the map prediction were restrained to stay within 8 Å from a reference protein atom, defined as the closest to each water molecule in the initial model. These restraints allowed: i) exchanges between the ordered molecules resolved in the PDB and their surrounding buffer waters; ii) to identify additional sites of (semi)ordered molecules in the proximity of the resolved waters, for example a second coordination shell.

Analysis.

To generate the final single-structure refined model, we first extracted the conformation with lowest EMMIVox hybrid energy from the production run as well as the associated B-factors. We then performed an (EMMIVox hybrid) energy minimization using the steepest decent approach. During minimization, B-factors were sampled starting from those found in the initial model using a modified MC sampler in which only downhill moves were accepted. B-factors were sampled every 100 steps with a maximum MC move equal to 0.1 nm². The forward model neighbor list was updated at every step. Next, we selected the conformation at the end of the minimization and fixed discontinuities in the structure due to PBC with the PLUMED driver tool. We then generated a PDB file containing only the heavy atoms of the systems that were used to predict the cryo-EM density map during production. This model was re-aligned to the original cryo-EM map downloaded from the EMDB using the inverse transformation computed in the section Cryo-EM map preprocessing. Finally, we added to the PDB the B-factors obtained at the end of the final minimization. MolProbity was used to compute the clashscore and MolProbity score, while PHENIX v. 1.15.2 was used to evaluate the fit to the experimental map with the EMRinger score. We implemented the calculation of CC_mask [51] in a GPU-enabled python script and generalized it to compute this score from a structural ensemble. At variance with PHENIX, we did not optimize an isotropic B-factor before the CC_mask calculation. Despite this difference, the CC_mask calculated by PHENIX and by our python script are strongly correlated (Fig M in S1 Text). For all the fit-to-data calculations, the original map as downloaded from the EMDB was used, regardless of the data correlation cutoff and the total number of voxels used in the refinement. For the analysis of physico-chemical fingerprints (Fig 2), we used MDAnalysis [92] v. 2.0 to calculate the number of hydrogen bonds from a single-structure model with donor-acceptor distance and angle cutoff equal to 3.0 Å and 150 degrees, respectively. Hydrogen bonds between anionic carboxylate (RCOO⁻) of either aspartic acid or glutamic acid and the cationic ammonium (RNH₃⁺) from lysine or arginine were classified as salt bridges. Clashes between atoms with overlap between van der Waals radii greater than 0.4 Å were calculated with PHENIX.clashscore.

Details of the ensemble refinement benchmark

Setup.

We determined EMMIVox structural ensembles fitting the cryo-EM map for all the systems selected for the single-structure refinement benchmark. To prepare the ensemble simulations, we first extracted the conformation with lowest EMMIVox hybrid energy from the single-structure refinement production run and identified the minimum B-factor across all residues of this conformation. We then extracted 16 frames from the second half of the single-structure refinement production runs equally distributed in time. These conformations will be used as starting points of the production run described in the following section.

Production.

EMMIVox ensemble simulations were performed with the same settings used in single-structure refinement, except for the fact that during ensemble refinement B-factors were not sampled but kept constant and all equal to the identified in the previous step. This value of B-factors can be considered as either a baseline dynamic of the most rigid residue of the system or a minimum noise level of the best resolved residue. In both cases, using such constant B-factor during ensemble refinement contributes to avoiding overinterpreting noise in terms of conformational heterogeneity. 16 metainference replicas were used, each one simulated for 20 ns, resulting in 320 ns of ensemble trajectory for each system.

Analysis.

After the EMMIVox ensemble simulations were completed, the trajectories of all the metainference replicas were concatenated, resulting in the complete structural ensemble of the system. After fixing with PLUMED the structural discontinuities due to PBC, we re-aligned the EMMIVox ensemble to the original cryo-EM map downloaded from the EMDB using the inverse transformation computed in the section Cryo-EM map preprocessing. We then evaluated the fit to the data by calculating the CC_mask of the average cryo-EM map from the EMMIVox ensemble and the experimental map. To make a fair comparison, we selected the voxels for the CC_mask calculation based only on the single-structure model using the standard convention [51]. In this way, the calculation of CC_mask for single-structure and ensemble models was performed on the same set of voxels. The analysis of multimodality of the conformational distribution of each individual residue (Fig 4) was performed using the Folding Test of Unimodality implemented in libfolding [93] (https://github.com/asiffer/python3-libfolding).

Supporting information

S1 Text. Supplementary Tables and Figures.

Table A. Parameters of the forward model to predict a density map from an atomistic model. These parameters are obtained from a 5-Gaussians fit of the electron atomic scattering factors for s up to 6.0 Å^-1. B_! parameters are expressed in Å². Table B. Parameters of the forward model to predict a density map from a Martini 3 coarse grained model. B_! parameters are expressed in Å². Table C. Details of the systems used in the single-structure and ensemble refinement benchmark. For each system, the first 8 columns report details about the deposited structure: PDB and EMDB ids, resolution, number of protein residues, number of protein chains, number of resolved water molecules, number of resolved lipids, number of ligands. The last 5 columns contain details of the system as prepared for EMMIVox simulations: box size, number of water molecules, number of lipids, number of buffer ions, and total number of atoms. Fig A. The cryo-EM and cryo-ET revolutions. Number of (a) single-particle cryo-EM and (b) subtomogram averaging maps released per year and classified based on resolution. Data was obtained from the Electron Microscopy Data Bank (https://www.ebi.ac.uk/emdb/) on October 12^th 2023. Fig B. Assessment of the quality of the cryo-EM structures deposited in the PDB database. Violin plots of (a) CC_"#$!, (b) EMRinger score, (c) clashscore, and (d) MolProbity score as a function of the resolution of the cryo-EM structures. For this analysis, 2476 structures determined by single-particle cryo- EM in the resolution range 1.78 Å to 4.00 Å and released between 2015 and 2022 were used. In the violin plots, the white circle corresponds to the median value, the black rectangle extends from the first to the third quantiles, and the thin black line represents the 95% confidence intervals. Fig C. Benchmark of the approach to remove correlated voxels. Analysis of the number of voxels per atom for 2477 structures determined by single-particle cryo-EM between 2015 and 2022 in the resolution range 1.78 Å to 4.00 Å as a function of the map resolution and voxel correlation cutoff value (CCcut). Maps were separated into 5 resolution bins of size equal to 0.5 Å and ranging from 1.75 Å to 4 Å. The median values within the 5 bins are reported as blue points and the standard deviations as error bars. The reduction of the error bar at fixed resolution bin observed as CCcut is decreased indicates that upon removing correlated data the number of voxels per atom across different maps of similar resolution depends less and less on the voxel size. Fig D. Stereochemical quality of single-structure EMMIVox models as a function of data correlation cutoff. Comparison of clashscore for EMMIVox refined single-structure models as a function of correlation coefficient cutoff for the 9 systems in our benchmark set. Fig E. Stereochemical quality of single-structure EMMIVox models as a function of data correlation cutoff. Comparison of MolProbity scores for EMMIVox refined single-structure models as a function of correlation coefficient cutoff for the 9 systems in our benchmark set. Fig F. Fit-to-data of single-structure EMMIVox models as a function of data correlation cutoff. Comparison of CC_"#$! scores for EMMIVox refined single-structure models as a function of correlation coefficient cutoff for the 9 systems in our benchmark set. Fig G. Fit-to-data of single-structure EMMIVox models as a function of data correlation cutoff. Comparison of EMRinger scores for EMMIVox refined single-structure models as a function of correlation coefficient cutoff for the 9 systems in our benchmark set. Fig H. Effect of EMMIVox noise models on model fit-to-data. Relation between individual residue fit to the experimental cryo-EM map (local CC_"#$!) and local error in the map in the proximity of each residue for the single-structure EMMIVox model (blue) and the deposited PDB (red) for the 9 systems in our benchmark set. The local error is calculated from the difference between the two cryo-EM half maps as the median value across all the voxels associated to a residue by Voronoi tessellation. Fig I. Bayesian inference of B-factors in EMMIVox single-structure refinement. Comparison between the B-factor values of the deposited PDBs (red) and the EMMIVox single-structure models (blue) for the 9 systems in our benchmark set. Each panel compares the B-factor values for a single chain within the system and is representative of the system as a whole. Fig J. Fit-to-data of deposited PDBs, EMMIVox single-structure models and ensembles. Comparison of the CC_"#$! scores for the deposited PDB (light color), EMMIVox single-structure model (medium color) and EMMIVox ensemble (dark color). Fig K. Assessment of overfitting of data noise with conformational heterogeneity in EMMIVox structural ensembles. Relation between per residue B-factor in the single-structure EMMIVox model and residue mean square fluctuations (MSF) within the EMMIVox ensemble for the 9 systems in our benchmark set. The red line indicates a Bayesian linear fit between the two quantities. Fig L. EMMIVox sampling efficiency for the case of the GPT type 1a tau filament. Structural ensembles of residues K343 (top row) and K347 (bottom row) from standard MD simulation (left column) and EMMIVox ensembles (right column). Total simulation time for both ensembles was equal to 320 ns. In the EMMIVox ensemble, K343 occupies two distinct states with equal populations, as illustrated by the distribution of the lysine sidechain nitrogen positions projected on a plane parallel to the fibril surface. The MD ensemble for K343 shows a 2–3 split of chains occupying state 1 and state 2. Limited sampling is seen between the two states for each individual chain. K347 in the EMMIVox ensemble occupies two conformations, the major conformation found in the deposited PDB with an occupancy of 73%, and a second minor conformation, populated only by 27%. In contrast, K347 for the standard MD simulation only occupies the major conformation found in the deposited PDB and does not sample the minor conformation in the same simulation time of the EMMIVox ensemble. Fig M. Correlation between PLUMED and Phenix CC_mask. Analysis of the relation between CC_"#$! calculated with Phenix and with a custom python script for 2477 structures determined by single-particle cryo-EM between 2015 and 2022 in the resolution range 1.78 Å to 4.00 Å. Our implementation of CC_"#$! does not optimize an isotropic B-factor, unlike Phenix. Despite this difference, a strong correlation between the two CC_"#$! scores is observed across the analyzed systems. Points are colored based on local density, with yellow and blue indicating high and low densities, respectively. Fig N. Fit-to-data of single-structure EMMIVox models as a function of data correlation cutoff. Comparison of RMS(Bonds) values for EMMIVox refined single-structure models as a function of correlation coefficient cutoff for the 9 systems in our benchmark set. Most values fall under the 0.02 Å threshold used to define high-resolution models[1,2]. The black line indicates the reference value in the deposited PDB. Fig O. Fit-to-data of single-structure EMMIVox models as a function of data correlation cutoff. Comparison of RMS(Angles) values for EMMIVox refined single-structure models as a function of correlation coefficient cutoff for the 9 systems in our benchmark set. Most values are within the common range of RMS(angle) values [1–3]. The black line indicates the reference value in the deposited PDB. Fig P. Fit-to-data of single-structure EMMIVox models as a function of data correlation cutoff. Comparison of percentage of residues with dihedral values that fall in the favored Ramachandran region for EMMIVox refined single-structure models as a function of correlation coefficient cutoff for the 9 systems in our benchmark set. The black line indicates the reference value in the deposited PDB. Fig Q. Fit-to-data of single-structure EMMIVox models as a function of data correlation cutoff. Comparison of percentage of residues with dihedral values that fall in the non-favored Ramachandran region for EMMIVox refined single-structure models as a function of correlation coefficient cutoff for the 9 systems in our benchmark set. The black line indicates the reference value in the deposited PDB.

https://doi.org/10.1371/journal.pcbi.1012180.s001

(PDF)

Acknowledgments

M.B. would like to acknowledge PRACE for awarding access to Piz Daint at CSCS, Switzerland.

References

1. Nogales E. The development of cryo-EM into a mainstream structural biology technique. Nat Methods 13, 24 (2016). pmid:27110629
- View Article
- PubMed/NCBI
- Google Scholar
2. Yip K.M., Fischer N., Paknia E., Chari A. & Stark H. Atomic-resolution protein structure determination by cryo-EM. Nature 587, 157 (2020). pmid:33087927
- View Article
- PubMed/NCBI
- Google Scholar
3. Lazic I. et al. Single-particle cryo-EM structures from iDPC-STEM at near-atomic resolution. Nat Methods 19, 1126 (2022). pmid:36064775
- View Article
- PubMed/NCBI
- Google Scholar
4. Turk M. & Baumeister W. The promise and the challenges of cryo-electron tomography. Febs Lett 594, 3243 (2020). pmid:33020915
- View Article
- PubMed/NCBI
- Google Scholar
5. Bonomi M. & Vendruscolo M. Determination of protein structural ensembles using cryo-electron microscopy. Curr Opin Struc Biol 56, 37 (2019). pmid:30502729
- View Article
- PubMed/NCBI
- Google Scholar
6. Tang W.S., Zhong E.D., Hanson S.M., Thiede E.H. & Cossio P. Conformational heterogeneity and probability distributions from single-particle cryo-electron microscopy. Curr Opin Struct Biol 81, 102626 (2023). pmid:37311334
- View Article
- PubMed/NCBI
- Google Scholar
7. Doerr A. A dynamic direction for cryo-EM. Nat Methods 19, 29 (2022). pmid:35017731
- View Article
- PubMed/NCBI
- Google Scholar
8. Scapin G., Potter C.S. & Carragher B. Cryo-EM for Small Molecules Discovery, Design, Understanding, and Application. Cell Chem Biol 25, 1318 (2018). pmid:30100349
- View Article
- PubMed/NCBI
- Google Scholar
9. Jumper J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583 (2021). pmid:34265844
- View Article
- PubMed/NCBI
- Google Scholar
10. Lawson C.L. et al. Cryo-EM model validation recommendations based on outcomes of the 2019 EMDataResource challenge. Nat Methods 18, 156 (2021). pmid:33542514
- View Article
- PubMed/NCBI
- Google Scholar
11. Farabella I. et al. TEMPy: a Python library for assessment of three-dimensional electron microscopy density fits. J Appl Crystallogr 48, 1314 (2015). pmid:26306092
- View Article
- PubMed/NCBI
- Google Scholar
12. Pintilie G. et al. Measurement of atom resolvability in cryo-EM maps with Q-scores. Nat Methods 17, 328 (2020). pmid:32042190
- View Article
- PubMed/NCBI
- Google Scholar
13. Terashi G., Wang X., Subramaniya S.R.M.V., Tesmer J.J.G. & Kihara D. Residue-wise local quality estimation for protein models from cryo-EM maps. Nat Methods 19, 1116 (2022). pmid:35953671
- View Article
- PubMed/NCBI
- Google Scholar
14. Malhotra S., Trager S., Dal Peraro M. & Topf M. Modelling structures in cryo-EM maps. Curr Opin Struc Biol 58, 105 (2019). pmid:31394387
- View Article
- PubMed/NCBI
- Google Scholar
15. Sali A. & Blundell T.L. Comparative Protein Modeling by Satisfaction of Spatial Restraints. J Mol Biol 234, 779 (1993).
- View Article
- Google Scholar
16. Wang R.Y. et al. Automated structure refinement of macromolecular assemblies from cryo-EM maps using Rosetta. Elife 5, e17219 (2016). pmid:27669148
- View Article
- PubMed/NCBI
- Google Scholar
17. Pettersen E.F. et al. UCSF chimera—A visualization system for exploratory research and analysis. J Comput Chem 25, 1605 (2004). pmid:15264254
- View Article
- PubMed/NCBI
- Google Scholar
18. Joseph A.P., Lagerstedt I., Patwardhan A., Topf M. & Winn M. Improved metrics for comparing structures of macromolecular assemblies determined by 3D electron-microscopy. J Struct Biol 199, 12 (2017). pmid:28552721
- View Article
- PubMed/NCBI
- Google Scholar
19. Kawabata T. Multiple Subunit Fitting into a Low-Resolution Density Map of a Macromolecular Complex Using a Gaussian Mixture Model. Biophys J 95, 4643 (2008). pmid:18708469
- View Article
- PubMed/NCBI
- Google Scholar
20. Igaev M., Kutzner C., Bock L.V., Vaiana A.C. & Grubmuller H. Automated cryo-EM structure refinement using correlation-driven molecular dynamics. Elife 8, e43542 (2019). pmid:30829573
- View Article
- PubMed/NCBI
- Google Scholar
21. Trabuco L.G., Villa E., Mitra K., Frank J. & Schulten K. Flexible fitting of atomic structures into electron microscopy maps using molecular dynamics. Structure 16, 673 (2008). pmid:18462672
- View Article
- PubMed/NCBI
- Google Scholar
22. Cragnolini T., Beton J. & Topf M. Cryo-EM structure and B-factor refinement with ensemble representation. Nat Commun 15, 444 (2024). pmid:38200043
- View Article
- PubMed/NCBI
- Google Scholar
23. Afonine P.V. et al. Real-space refinement in PHENIX for cryo-EM and crystallography. Acta Crystallogr D 74, 531 (2018). pmid:29872004
- View Article
- PubMed/NCBI
- Google Scholar
24. Blau C., Yvonnesdotter L. & Lindahl E. Gentle and fast all-atom model refinement to cryo-EM densities via a maximum likelihood approach. PLoS Comput Biol 19, e1011255 (2023). pmid:37523411
- View Article
- PubMed/NCBI
- Google Scholar
25. Topf M. et al. Protein structure fitting and refinement guided by cryo-EM density. Structure 16, 295 (2008). pmid:18275820
- View Article
- PubMed/NCBI
- Google Scholar
26. Croll T.I. ISOLDE: a physically realistic environment for model building into low-resolution electron-density maps. Acta Crystallogr D 74, 519 (2018). pmid:29872003
- View Article
- PubMed/NCBI
- Google Scholar
27. Emsley P., Lohkamp B., Scott W.G. & Cowtan K. Features and development of Coot. Acta Crystallogr D 66, 486 (2010). pmid:20383002
- View Article
- PubMed/NCBI
- Google Scholar
28. Scheres S.H.W. RELION: Implementation of a Bayesian approach to cryo-EM structure determination. J Struct Biol 180, 519 (2012). pmid:23000701
- View Article
- PubMed/NCBI
- Google Scholar
29. Bonomi M. et al. Bayesian Weighing of Electron Cryo-Microscopy Data for Integrative Structural Modeling. Structure 27, 175 (2019). pmid:30393052
- View Article
- PubMed/NCBI
- Google Scholar
30. van Zundert G.C.P., Melquiond A.S.J. & Bonvin A.M.J.J. Integrative Modeling of Biomolecular Complexes: HADDOCKing with Cryo-Electron Microscopy Data. Structure 23, 949 (2015). pmid:25914056
- View Article
- PubMed/NCBI
- Google Scholar
31. Giri N., Roy R.S. & Cheng J.L. Deep learning for reconstructing protein structures from cryo-EM density maps: Recent advances and future directions. Curr Opin Struc Biol 79, 102536 (2023). pmid:36773336
- View Article
- PubMed/NCBI
- Google Scholar
32. Cossio P. & Hummer G. Bayesian analysis of individual electron microscopy images: Towards structures of dynamic and heterogeneous biomolecular assemblies. J Struct Biol 184, 427 (2013). pmid:24161733
- View Article
- PubMed/NCBI
- Google Scholar
33. Velazquez-Muriel J. et al. Assembly of macromolecular complexes by satisfaction of spatial restraints from electron microscopy images. Proc Natl Acad Sci USA 109, 18821 (2012). pmid:23112201
- View Article
- PubMed/NCBI
- Google Scholar
34. Tang W.S. et al. Ensemble Reweighting Using Cryo-EM Particle Images. J Phys Chem B 127, 5410 (2023). pmid:37293763
- View Article
- PubMed/NCBI
- Google Scholar
35. Bonomi M., Pellarin R. & Vendruscolo M. Simultaneous Determination of Protein Structure and Dynamics Using Cryo-Electron Microscopy. Biophys J 114, 1604 (2018). pmid:29642030
- View Article
- PubMed/NCBI
- Google Scholar
36. Riley B.T. et al. qFit 3: Protein and ligand multiconformer modeling for X-ray crystallographic and single-particle cryo-EM density maps. Protein Sci 30, 270 (2021). pmid:33210433
- View Article
- PubMed/NCBI
- Google Scholar
37. Rieping W., Habeck M. & Nilges M. Inferential structure determination. Science 309, 303 (2005). pmid:16002620
- View Article
- PubMed/NCBI
- Google Scholar
38. Tribello G.A., Bonomi M., Branduardi D., Camilloni C. & Bussi G. PLUMED 2: New feathers for an old bird. Comput Phys Commun 185, 604 (2014).
- View Article
- Google Scholar
39. Bonomi M. et al. Promoting transparency and reproducibility in enhanced molecular simulations. Nat Methods 16, 670 (2019). pmid:31363226
- View Article
- PubMed/NCBI
- Google Scholar
40. Shi Y. et al. Structure-based classification of tauopathies. Nature 598, 359 (2021). pmid:34588692
- View Article
- PubMed/NCBI
- Google Scholar
41. Bonomi M., Camilloni C., Cavalli A. & Vendruscolo M. Metainference: A Bayesian inference method for heterogeneous systems. Sci Adv 2, e1501177 (2016). pmid:26844300
- View Article
- PubMed/NCBI
- Google Scholar
42. Kishi K.E. et al. Structural basis for channel conduction in the pump-like channelrhodopsin ChRmine. Cell 185, 672 (2022). pmid:35114111
- View Article
- PubMed/NCBI
- Google Scholar
43. Reshetnyak A.V. et al. Mechanism for the activation of the anaplastic lymphoma kinase receptor. Nature 600, 153 (2021). pmid:34819673
- View Article
- PubMed/NCBI
- Google Scholar
44. Gan Z.Y. et al. Activation mechanism of PINK1. Nature 602, 328 (2022). pmid:34933320
- View Article
- PubMed/NCBI
- Google Scholar
45. Greber B.J., Remis J., Ali S. & Nogales E. 2.5 A-resolution structure of human CDK-activating kinase bound to the clinical inhibitor ICEC0942. Biophys J 120, 677 (2021).
- View Article
- Google Scholar
46. Cater R.J. et al. Structural basis of omega-3 fatty acid transport across the blood-brain barrier. Nature 595, 315 (2021). pmid:34135507
- View Article
- PubMed/NCBI
- Google Scholar
47. Caveney N.A. et al. CryoEM structure of the antibacterial target PBP1b at 3.3 A resolution. Nat Commun 12, 2775 (2021).
- View Article
- Google Scholar
48. Parker J.L. et al. Cryo-EM structure of PepT2 reveals structural basis for proton-coupled peptide and prodrug transport in mammals. Sci Adv 7, eabh3355 (2021). pmid:34433568
- View Article
- PubMed/NCBI
- Google Scholar
49. Zinke M. et al. Architecture of the flexible tail tube of bacteriophage SPP1. Nat Commun 11, 5759 (2020). pmid:33188213
- View Article
- PubMed/NCBI
- Google Scholar
50. Chen V.B. et al. MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr D 66, 12 (2010). pmid:20057044
- View Article
- PubMed/NCBI
- Google Scholar
51. Afonine P.V. et al. New tools for the analysis and validation of cryo-EM maps and atomic models. Acta Crystallogr D 74, 814 (2018). pmid:30198894
- View Article
- PubMed/NCBI
- Google Scholar
52. Barad B.A. et al. EMRinger: side chain-directed model and map validation for 3D cryo-electron microscopy. Nat Methods 12, 943 (2015). pmid:26280328
- View Article
- PubMed/NCBI
- Google Scholar
53. Souza P.C.T. et al. Martini 3: a general purpose force field for coarse-grained molecular dynamics. Nat Methods 18, 382- (2021). pmid:33782607
- View Article
- PubMed/NCBI
- Google Scholar
54. Fraser J.S., Lindorff-Larsen K. & Bonomi M. What Will Computational Modeling Approaches Have to Say in the Era of Atomistic Cryo-EM Data? J Chem Inf Model 60, 2410 (2020). pmid:32090567
- View Article
- PubMed/NCBI
- Google Scholar
55. Daskalov A. et al. Contribution of Specific Residues of the beta-Solenoid Fold to HET-s Prion Function, Amyloid Structure and Stability. Plos Pathog 10, e1004158 (2014).
- View Article
- Google Scholar
56. Becker L.M. et al. The Rigid Core and Flexible Surface of Amyloid Fibrils Probed by Magic-Angle-Spinning NMR Spectroscopy of Aromatic Residues. Angew Chem Int Edit 62, e202219314 (2023). pmid:36738230
- View Article
- PubMed/NCBI
- Google Scholar
57. Scheres S.H.W. Processing of Structurally Heterogeneous Cryo-EM Data in RELION. Method Enzymol 579, 125 (2016). pmid:27572726
- View Article
- PubMed/NCBI
- Google Scholar
58. Punjani A., Rubinstein J.L., Fleet D.J. & Brubaker M.A. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat Methods 14, 290 (2017). pmid:28165473
- View Article
- PubMed/NCBI
- Google Scholar
59. Punjani A. & Fleet D.J. 3DFlex: determining structure and motion of flexible proteins from cryo-EM. Nat Methods 20, 860 (2023). pmid:37169929
- View Article
- PubMed/NCBI
- Google Scholar
60. Herreros D. et al. Estimating conformational landscapes from Cryo-EM particles by 3D Zernike polynomials. Nat Commun 14, 154 (2023). pmid:36631472
- View Article
- PubMed/NCBI
- Google Scholar
61. Kinman L.F., Powell B.M., Zhong E.D., Berger B. & Davis J.H. Uncovering structural ensembles from single-particle cryo-EM data using cryoDRGN. Nat Protoc 18, 319 (2023). pmid:36376590
- View Article
- PubMed/NCBI
- Google Scholar
62. Chen M. & Ludtke S.J. Deep learning-based mixed-dimensional Gaussian mixture model for characterizing variability in cryo-EM. Nat Methods 18, 930–936 (2021). pmid:34326541
- View Article
- PubMed/NCBI
- Google Scholar
63. Frank J. & Ourmazd A. Continuous changes in structure mapped by manifold embedding of single-particle data in cryo-EM. Methods 100, 61 (2016). pmid:26884261
- View Article
- PubMed/NCBI
- Google Scholar
64. Schwab, J., Kimanius, D., Burt, A., Dendooven, T. & Scheres, S. DynaMight: estimating molecular motions with improved reconstruction from cryo-EM images. bioRxiv, 2023.10.18.562877 (2023).
65. Garibsingh R.A.A. et al. Rational design of ASCT2 inhibitors using an integrated experimental-computational approach. Proc Natl Acad Sci USA 118, e2104093118 (2021). pmid:34507995
- View Article
- PubMed/NCBI
- Google Scholar
66. Eshun-Wilson L. et al. Effects of alpha-tubulin acetylation on microtubule structure and stability. Proc Natl Acad Sci USA 116, 10366 (2019).
- View Article
- Google Scholar
67. Bock L.V. & Grubmuller H. Effects of cryo-EM cooling on structural ensembles. Nat Commun 13, 1709 (2022). pmid:35361752
- View Article
- PubMed/NCBI
- Google Scholar
68. Brotzakis Z.F. et al. Determination of the Structure and Dynamics of the Fuzzy Coat of an Amyloid Fibril of IAPP Using Cryo-Electron Microscopy. Biochemistry-Us 62, 2407 (2023).
- View Article
- Google Scholar
69. Jamali, K. et al. Automated model building and protein identification in cryo-EM maps. bioRxiv, 2023.05.16.541002 (2023).
70. Bonomi M. & Camilloni C. Integrative structural and dynamical biology with PLUMED-ISDB. Bioinformatics 33, 3999 (2017). pmid:28961689
- View Article
- PubMed/NCBI
- Google Scholar
71. Peng L.M., Ren G., Dudarev S.L. & Whelan M.J. Robust parameterization of elastic and absorptive electron atomic scattering factors. Acta Crystallogr A 52, 257 (1996).
- View Article
- Google Scholar
72. Paissoni C., Jussupow A. & Camilloni C. Determination of Protein Structural Ensembles by Hybrid-Resolution SAXS Restrained Molecular Dynamics. J Chem Theory Comput 16, 2825 (2020). pmid:32119546
- View Article
- PubMed/NCBI
- Google Scholar
73. Sivia D. & Skilling J. Data Analysis: A Bayesian Tutorial. (Oxford University Press, Oxford; 2006).
74. Kroon P.C. et al. Martinize2 and Vermouth: Unified Framework for Topology Generation. Elife 12, RP90627 (2023).
- View Article
- Google Scholar
75. Cavalli A., Camilloni C. & Vendruscolo M. Molecular dynamics simulations with replica-averaged structural restraints generate structural ensembles according to the maximum entropy principle. J Chem Phys 139, 094112 (2013).
- View Article
- Google Scholar
76. Heller G.T. et al. Small-molecule sequestration of amyloid-beta as a drug discovery strategy for Alzheimer’s disease. Sci Adv 6, eabb5924 (2020).
- View Article
- Google Scholar
77. Heller G.T. et al. Sequence Specificity in the Entropy-Driven Binding of a Small Molecule and a Disordered Peptide. J Mol Biol 429, 2772 (2017). pmid:28743590
- View Article
- PubMed/NCBI
- Google Scholar
78. Jussupow A. et al. The dynamics of linear polyubiquitin. Sci Adv 6, eabc3786 (2020). pmid:33055165
- View Article
- PubMed/NCBI
- Google Scholar
79. Cezar H.M. & Cascella M. SANS Spectra with PLUMED: Implementation and Application to Metainference. J Chem Inf Model 63, 4979 (2023). pmid:37552250
- View Article
- PubMed/NCBI
- Google Scholar
80. Coutsias E.A., Seok C., Jacobson M.P. & Dill K.A. A kinematic view of loop closure. J Comput Chem 25, 510 (2004). pmid:14735570
- View Article
- PubMed/NCBI
- Google Scholar
81. Jo S., Kim T., Iyer V.G. & Im W. CHARMM-GUI: A web-based graphical user interface for CHARMM. J Comput Chem 29, 1859 (2008). pmid:18351591
- View Article
- PubMed/NCBI
- Google Scholar
82. Huang J. et al. CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nat Methods 14, 71 (2017). pmid:27819658
- View Article
- PubMed/NCBI
- Google Scholar
83. Vanommeslaeghe K. et al. CHARMM General Force Field: A Force Field for Drug-Like Molecules Compatible with the CHARMM All-Atom Additive Biological Force Fields. J Comput Chem 31, 671 (2010). pmid:19575467
- View Article
- PubMed/NCBI
- Google Scholar
84. MacKerell A.D. et al. All-atom empirical potential for molecular modeling and dynamics studies of proteins. J Phys Chem B 102, 3586 (1998). pmid:24889800
- View Article
- PubMed/NCBI
- Google Scholar
85. Essmann U. et al. A Smooth Particle Mesh Ewald Method. J Chem Phys 103, 8577 (1995).
- View Article
- Google Scholar
86. Abraham M.J. et al. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1–2, 19 (2015).
- View Article
- Google Scholar
87. Paszke A. et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Adv Neur In 32 (2019).
- View Article
- Google Scholar
88. Bussi G., Donadio D. & Parrinello M. Canonical sampling through velocity rescaling. J Chem Phys 126, 014101 (2007). pmid:17212484
- View Article
- PubMed/NCBI
- Google Scholar
89. Berendsen H.J.C., Postma J.P.M., Vangunsteren W.F., Dinola A. & Haak J.R. Molecular-Dynamics with Coupling to an External Bath. J Chem Phys 81, 3684 (1984).
- View Article
- Google Scholar
90. Parrinello M. & Rahman A. Polymorphic Transitions in Single-Crystals—a New Molecular-Dynamics Method. J Appl Phys 52, 7182 (1981).
- View Article
- Google Scholar
91. Ferrarotti M.J., Bottaro S., Perez-Villa A. & Bussi G. Accurate Multiple Time Step in Biased Molecular Simulations. J Chem Theory Comput 11, 139 (2015). pmid:26574212
- View Article
- PubMed/NCBI
- Google Scholar
92. Michaud-Agrawal N., Denning E.J., Woolf T.B. & Beckstein O. MDAnalysis: A Toolkit for the Analysis of Molecular Dynamics Simulations. J Comput Chem 32, 2319 (2011). pmid:21500218
- View Article
- PubMed/NCBI
- Google Scholar
93. Siffer, A., Fouque, P.A., Termier, A. & Largouet, C. Are your data gathered? The Folding Test of Unimodality. Proceedings of the 24th Acm Sigkdd International Conference on Knowledge Discovery & Data Mining, 2210 (2018).
94. Wang K.T. et al. Cryo-EM reveals the architecture of placental malaria VAR2CSA and provides molecular insight into chondroitin sulfate binding. Nat Commun 12, 2956 (2021). pmid:34011972
- View Article
- PubMed/NCBI
- Google Scholar
95. Dimos N. et al. CryoEM analysis of small plant biocatalysts at sub-2 angstrom resolution. Acta Crystallogr D 78, 113 (2022).
- View Article
- Google Scholar

[ref1] 1. Nogales E. The development of cryo-EM into a mainstream structural biology technique. Nat Methods 13, 24 (2016). pmid:27110629
View Article
PubMed/NCBI
Google Scholar

[2] View Article

[3] PubMed/NCBI

[4] Google Scholar

[ref2] 2. Yip K.M., Fischer N., Paknia E., Chari A. & Stark H. Atomic-resolution protein structure determination by cryo-EM. Nature 587, 157 (2020). pmid:33087927
View Article
PubMed/NCBI
Google Scholar

[6] View Article

[7] PubMed/NCBI

[8] Google Scholar

[ref3] 3. Lazic I. et al. Single-particle cryo-EM structures from iDPC-STEM at near-atomic resolution. Nat Methods 19, 1126 (2022). pmid:36064775
View Article
PubMed/NCBI
Google Scholar

[10] View Article

[11] PubMed/NCBI

[12] Google Scholar

[ref4] 4. Turk M. & Baumeister W. The promise and the challenges of cryo-electron tomography. Febs Lett 594, 3243 (2020). pmid:33020915
View Article
PubMed/NCBI
Google Scholar

[14] View Article

[15] PubMed/NCBI

[16] Google Scholar

[ref5] 5. Bonomi M. & Vendruscolo M. Determination of protein structural ensembles using cryo-electron microscopy. Curr Opin Struc Biol 56, 37 (2019). pmid:30502729
View Article
PubMed/NCBI
Google Scholar

[18] View Article

[19] PubMed/NCBI

[20] Google Scholar

[ref6] 6. Tang W.S., Zhong E.D., Hanson S.M., Thiede E.H. & Cossio P. Conformational heterogeneity and probability distributions from single-particle cryo-electron microscopy. Curr Opin Struct Biol 81, 102626 (2023). pmid:37311334
View Article
PubMed/NCBI
Google Scholar

[22] View Article

[23] PubMed/NCBI

[24] Google Scholar

[ref7] 7. Doerr A. A dynamic direction for cryo-EM. Nat Methods 19, 29 (2022). pmid:35017731
View Article
PubMed/NCBI
Google Scholar

[26] View Article

[27] PubMed/NCBI

[28] Google Scholar

[ref8] 8. Scapin G., Potter C.S. & Carragher B. Cryo-EM for Small Molecules Discovery, Design, Understanding, and Application. Cell Chem Biol 25, 1318 (2018). pmid:30100349
View Article
PubMed/NCBI
Google Scholar

[30] View Article

[31] PubMed/NCBI

[32] Google Scholar

[ref9] 9. Jumper J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583 (2021). pmid:34265844
View Article
PubMed/NCBI
Google Scholar

[34] View Article

[35] PubMed/NCBI

[36] Google Scholar

[ref10] 10. Lawson C.L. et al. Cryo-EM model validation recommendations based on outcomes of the 2019 EMDataResource challenge. Nat Methods 18, 156 (2021). pmid:33542514
View Article
PubMed/NCBI
Google Scholar

[38] View Article

[39] PubMed/NCBI

[40] Google Scholar

[ref11] 11. Farabella I. et al. TEMPy: a Python library for assessment of three-dimensional electron microscopy density fits. J Appl Crystallogr 48, 1314 (2015). pmid:26306092
View Article
PubMed/NCBI
Google Scholar

[42] View Article

[43] PubMed/NCBI

[44] Google Scholar

[ref12] 12. Pintilie G. et al. Measurement of atom resolvability in cryo-EM maps with Q-scores. Nat Methods 17, 328 (2020). pmid:32042190
View Article
PubMed/NCBI
Google Scholar

[46] View Article

[47] PubMed/NCBI

[48] Google Scholar

[ref13] 13. Terashi G., Wang X., Subramaniya S.R.M.V., Tesmer J.J.G. & Kihara D. Residue-wise local quality estimation for protein models from cryo-EM maps. Nat Methods 19, 1116 (2022). pmid:35953671
View Article
PubMed/NCBI
Google Scholar

[50] View Article

[51] PubMed/NCBI

[52] Google Scholar

[ref14] 14. Malhotra S., Trager S., Dal Peraro M. & Topf M. Modelling structures in cryo-EM maps. Curr Opin Struc Biol 58, 105 (2019). pmid:31394387
View Article
PubMed/NCBI
Google Scholar

[54] View Article

[55] PubMed/NCBI

[56] Google Scholar

[ref15] 15. Sali A. & Blundell T.L. Comparative Protein Modeling by Satisfaction of Spatial Restraints. J Mol Biol 234, 779 (1993).
View Article
Google Scholar

[58] View Article

[59] Google Scholar

[ref16] 16. Wang R.Y. et al. Automated structure refinement of macromolecular assemblies from cryo-EM maps using Rosetta. Elife 5, e17219 (2016). pmid:27669148
View Article
PubMed/NCBI
Google Scholar

[61] View Article

[62] PubMed/NCBI

[63] Google Scholar

[ref17] 17. Pettersen E.F. et al. UCSF chimera—A visualization system for exploratory research and analysis. J Comput Chem 25, 1605 (2004). pmid:15264254
View Article
PubMed/NCBI
Google Scholar

[65] View Article

[66] PubMed/NCBI

[67] Google Scholar

[ref18] 18. Joseph A.P., Lagerstedt I., Patwardhan A., Topf M. & Winn M. Improved metrics for comparing structures of macromolecular assemblies determined by 3D electron-microscopy. J Struct Biol 199, 12 (2017). pmid:28552721
View Article
PubMed/NCBI
Google Scholar

[69] View Article

[70] PubMed/NCBI

[71] Google Scholar

[ref19] 19. Kawabata T. Multiple Subunit Fitting into a Low-Resolution Density Map of a Macromolecular Complex Using a Gaussian Mixture Model. Biophys J 95, 4643 (2008). pmid:18708469
View Article
PubMed/NCBI
Google Scholar

[73] View Article

[74] PubMed/NCBI

[75] Google Scholar

[ref20] 20. Igaev M., Kutzner C., Bock L.V., Vaiana A.C. & Grubmuller H. Automated cryo-EM structure refinement using correlation-driven molecular dynamics. Elife 8, e43542 (2019). pmid:30829573
View Article
PubMed/NCBI
Google Scholar

[77] View Article

[78] PubMed/NCBI

[79] Google Scholar

[ref21] 21. Trabuco L.G., Villa E., Mitra K., Frank J. & Schulten K. Flexible fitting of atomic structures into electron microscopy maps using molecular dynamics. Structure 16, 673 (2008). pmid:18462672
View Article
PubMed/NCBI
Google Scholar

[81] View Article

[82] PubMed/NCBI

[83] Google Scholar

[ref22] 22. Cragnolini T., Beton J. & Topf M. Cryo-EM structure and B-factor refinement with ensemble representation. Nat Commun 15, 444 (2024). pmid:38200043
View Article
PubMed/NCBI
Google Scholar

[85] View Article

[86] PubMed/NCBI

[87] Google Scholar

[ref23] 23. Afonine P.V. et al. Real-space refinement in PHENIX for cryo-EM and crystallography. Acta Crystallogr D 74, 531 (2018). pmid:29872004
View Article
PubMed/NCBI
Google Scholar

[89] View Article

[90] PubMed/NCBI

[91] Google Scholar

[ref24] 24. Blau C., Yvonnesdotter L. & Lindahl E. Gentle and fast all-atom model refinement to cryo-EM densities via a maximum likelihood approach. PLoS Comput Biol 19, e1011255 (2023). pmid:37523411
View Article
PubMed/NCBI
Google Scholar

[93] View Article

[94] PubMed/NCBI

[95] Google Scholar

[ref25] 25. Topf M. et al. Protein structure fitting and refinement guided by cryo-EM density. Structure 16, 295 (2008). pmid:18275820
View Article
PubMed/NCBI
Google Scholar

[97] View Article

[98] PubMed/NCBI

[99] Google Scholar

[ref26] 26. Croll T.I. ISOLDE: a physically realistic environment for model building into low-resolution electron-density maps. Acta Crystallogr D 74, 519 (2018). pmid:29872003
View Article
PubMed/NCBI
Google Scholar

[101] View Article

[102] PubMed/NCBI

[103] Google Scholar

[ref27] 27. Emsley P., Lohkamp B., Scott W.G. & Cowtan K. Features and development of Coot. Acta Crystallogr D 66, 486 (2010). pmid:20383002
View Article
PubMed/NCBI
Google Scholar

[105] View Article

[106] PubMed/NCBI

[107] Google Scholar

[ref28] 28. Scheres S.H.W. RELION: Implementation of a Bayesian approach to cryo-EM structure determination. J Struct Biol 180, 519 (2012). pmid:23000701
View Article
PubMed/NCBI
Google Scholar

[109] View Article

[110] PubMed/NCBI

[111] Google Scholar

[ref29] 29. Bonomi M. et al. Bayesian Weighing of Electron Cryo-Microscopy Data for Integrative Structural Modeling. Structure 27, 175 (2019). pmid:30393052
View Article
PubMed/NCBI
Google Scholar

[113] View Article

[114] PubMed/NCBI

[115] Google Scholar

[ref30] 30. van Zundert G.C.P., Melquiond A.S.J. & Bonvin A.M.J.J. Integrative Modeling of Biomolecular Complexes: HADDOCKing with Cryo-Electron Microscopy Data. Structure 23, 949 (2015). pmid:25914056
View Article
PubMed/NCBI
Google Scholar

[117] View Article

[118] PubMed/NCBI

[119] Google Scholar

[ref31] 31. Giri N., Roy R.S. & Cheng J.L. Deep learning for reconstructing protein structures from cryo-EM density maps: Recent advances and future directions. Curr Opin Struc Biol 79, 102536 (2023). pmid:36773336
View Article
PubMed/NCBI
Google Scholar

[121] View Article

[122] PubMed/NCBI

[123] Google Scholar

[ref32] 32. Cossio P. & Hummer G. Bayesian analysis of individual electron microscopy images: Towards structures of dynamic and heterogeneous biomolecular assemblies. J Struct Biol 184, 427 (2013). pmid:24161733
View Article
PubMed/NCBI
Google Scholar

[125] View Article

[126] PubMed/NCBI

[127] Google Scholar

[ref33] 33. Velazquez-Muriel J. et al. Assembly of macromolecular complexes by satisfaction of spatial restraints from electron microscopy images. Proc Natl Acad Sci USA 109, 18821 (2012). pmid:23112201
View Article
PubMed/NCBI
Google Scholar

[129] View Article

[130] PubMed/NCBI

[131] Google Scholar

[ref34] 34. Tang W.S. et al. Ensemble Reweighting Using Cryo-EM Particle Images. J Phys Chem B 127, 5410 (2023). pmid:37293763
View Article
PubMed/NCBI
Google Scholar

[133] View Article

[134] PubMed/NCBI

[135] Google Scholar

[ref35] 35. Bonomi M., Pellarin R. & Vendruscolo M. Simultaneous Determination of Protein Structure and Dynamics Using Cryo-Electron Microscopy. Biophys J 114, 1604 (2018). pmid:29642030
View Article
PubMed/NCBI
Google Scholar

[137] View Article

[138] PubMed/NCBI

[139] Google Scholar

[ref36] 36. Riley B.T. et al. qFit 3: Protein and ligand multiconformer modeling for X-ray crystallographic and single-particle cryo-EM density maps. Protein Sci 30, 270 (2021). pmid:33210433
View Article
PubMed/NCBI
Google Scholar

[141] View Article

[142] PubMed/NCBI

[143] Google Scholar

[ref37] 37. Rieping W., Habeck M. & Nilges M. Inferential structure determination. Science 309, 303 (2005). pmid:16002620
View Article
PubMed/NCBI
Google Scholar

[145] View Article

[146] PubMed/NCBI

[147] Google Scholar

[ref38] 38. Tribello G.A., Bonomi M., Branduardi D., Camilloni C. & Bussi G. PLUMED 2: New feathers for an old bird. Comput Phys Commun 185, 604 (2014).
View Article
Google Scholar

[149] View Article

[150] Google Scholar

[ref39] 39. Bonomi M. et al. Promoting transparency and reproducibility in enhanced molecular simulations. Nat Methods 16, 670 (2019). pmid:31363226
View Article
PubMed/NCBI
Google Scholar

[152] View Article

[153] PubMed/NCBI

[154] Google Scholar

[ref40] 40. Shi Y. et al. Structure-based classification of tauopathies. Nature 598, 359 (2021). pmid:34588692
View Article
PubMed/NCBI
Google Scholar

[156] View Article

[157] PubMed/NCBI

[158] Google Scholar

[ref41] 41. Bonomi M., Camilloni C., Cavalli A. & Vendruscolo M. Metainference: A Bayesian inference method for heterogeneous systems. Sci Adv 2, e1501177 (2016). pmid:26844300
View Article
PubMed/NCBI
Google Scholar

[160] View Article

[161] PubMed/NCBI

[162] Google Scholar

[ref42] 42. Kishi K.E. et al. Structural basis for channel conduction in the pump-like channelrhodopsin ChRmine. Cell 185, 672 (2022). pmid:35114111
View Article
PubMed/NCBI
Google Scholar

[164] View Article

[165] PubMed/NCBI

[166] Google Scholar

[ref43] 43. Reshetnyak A.V. et al. Mechanism for the activation of the anaplastic lymphoma kinase receptor. Nature 600, 153 (2021). pmid:34819673
View Article
PubMed/NCBI
Google Scholar

[168] View Article

[169] PubMed/NCBI

[170] Google Scholar

[ref44] 44. Gan Z.Y. et al. Activation mechanism of PINK1. Nature 602, 328 (2022). pmid:34933320
View Article
PubMed/NCBI
Google Scholar

[172] View Article

[173] PubMed/NCBI

[174] Google Scholar

[ref45] 45. Greber B.J., Remis J., Ali S. & Nogales E. 2.5 A-resolution structure of human CDK-activating kinase bound to the clinical inhibitor ICEC0942. Biophys J 120, 677 (2021).
View Article
Google Scholar

[176] View Article

[177] Google Scholar

[ref46] 46. Cater R.J. et al. Structural basis of omega-3 fatty acid transport across the blood-brain barrier. Nature 595, 315 (2021). pmid:34135507
View Article
PubMed/NCBI
Google Scholar

[179] View Article

[180] PubMed/NCBI

[181] Google Scholar

[ref47] 47. Caveney N.A. et al. CryoEM structure of the antibacterial target PBP1b at 3.3 A resolution. Nat Commun 12, 2775 (2021).
View Article
Google Scholar

[183] View Article

[184] Google Scholar

[ref48] 48. Parker J.L. et al. Cryo-EM structure of PepT2 reveals structural basis for proton-coupled peptide and prodrug transport in mammals. Sci Adv 7, eabh3355 (2021). pmid:34433568
View Article
PubMed/NCBI
Google Scholar

[186] View Article

[187] PubMed/NCBI

[188] Google Scholar

[ref49] 49. Zinke M. et al. Architecture of the flexible tail tube of bacteriophage SPP1. Nat Commun 11, 5759 (2020). pmid:33188213
View Article
PubMed/NCBI
Google Scholar

[190] View Article

[191] PubMed/NCBI

[192] Google Scholar

[ref50] 50. Chen V.B. et al. MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr D 66, 12 (2010). pmid:20057044
View Article
PubMed/NCBI
Google Scholar

[194] View Article

[195] PubMed/NCBI

[196] Google Scholar

[ref51] 51. Afonine P.V. et al. New tools for the analysis and validation of cryo-EM maps and atomic models. Acta Crystallogr D 74, 814 (2018). pmid:30198894
View Article
PubMed/NCBI
Google Scholar

[198] View Article

[199] PubMed/NCBI

[200] Google Scholar

[ref52] 52. Barad B.A. et al. EMRinger: side chain-directed model and map validation for 3D cryo-electron microscopy. Nat Methods 12, 943 (2015). pmid:26280328
View Article
PubMed/NCBI
Google Scholar

[202] View Article

[203] PubMed/NCBI

[204] Google Scholar

[ref53] 53. Souza P.C.T. et al. Martini 3: a general purpose force field for coarse-grained molecular dynamics. Nat Methods 18, 382- (2021). pmid:33782607
View Article
PubMed/NCBI
Google Scholar

[206] View Article

[207] PubMed/NCBI

[208] Google Scholar

[ref54] 54. Fraser J.S., Lindorff-Larsen K. & Bonomi M. What Will Computational Modeling Approaches Have to Say in the Era of Atomistic Cryo-EM Data? J Chem Inf Model 60, 2410 (2020). pmid:32090567
View Article
PubMed/NCBI
Google Scholar

[210] View Article

[211] PubMed/NCBI

[212] Google Scholar

[ref55] 55. Daskalov A. et al. Contribution of Specific Residues of the beta-Solenoid Fold to HET-s Prion Function, Amyloid Structure and Stability. Plos Pathog 10, e1004158 (2014).
View Article
Google Scholar

[214] View Article

[215] Google Scholar

[ref56] 56. Becker L.M. et al. The Rigid Core and Flexible Surface of Amyloid Fibrils Probed by Magic-Angle-Spinning NMR Spectroscopy of Aromatic Residues. Angew Chem Int Edit 62, e202219314 (2023). pmid:36738230
View Article
PubMed/NCBI
Google Scholar

[217] View Article

[218] PubMed/NCBI

[219] Google Scholar

[ref57] 57. Scheres S.H.W. Processing of Structurally Heterogeneous Cryo-EM Data in RELION. Method Enzymol 579, 125 (2016). pmid:27572726
View Article
PubMed/NCBI
Google Scholar

[221] View Article

[222] PubMed/NCBI

[223] Google Scholar

[ref58] 58. Punjani A., Rubinstein J.L., Fleet D.J. & Brubaker M.A. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat Methods 14, 290 (2017). pmid:28165473
View Article
PubMed/NCBI
Google Scholar

[225] View Article

[226] PubMed/NCBI

[227] Google Scholar

[ref59] 59. Punjani A. & Fleet D.J. 3DFlex: determining structure and motion of flexible proteins from cryo-EM. Nat Methods 20, 860 (2023). pmid:37169929
View Article
PubMed/NCBI
Google Scholar

[229] View Article

[230] PubMed/NCBI

[231] Google Scholar

[ref60] 60. Herreros D. et al. Estimating conformational landscapes from Cryo-EM particles by 3D Zernike polynomials. Nat Commun 14, 154 (2023). pmid:36631472
View Article
PubMed/NCBI
Google Scholar

[233] View Article

[234] PubMed/NCBI

[235] Google Scholar

[ref61] 61. Kinman L.F., Powell B.M., Zhong E.D., Berger B. & Davis J.H. Uncovering structural ensembles from single-particle cryo-EM data using cryoDRGN. Nat Protoc 18, 319 (2023). pmid:36376590
View Article
PubMed/NCBI
Google Scholar

[237] View Article

[238] PubMed/NCBI

[239] Google Scholar

[ref62] 62. Chen M. & Ludtke S.J. Deep learning-based mixed-dimensional Gaussian mixture model for characterizing variability in cryo-EM. Nat Methods 18, 930–936 (2021). pmid:34326541
View Article
PubMed/NCBI
Google Scholar

[241] View Article

[242] PubMed/NCBI

[243] Google Scholar

[ref63] 63. Frank J. & Ourmazd A. Continuous changes in structure mapped by manifold embedding of single-particle data in cryo-EM. Methods 100, 61 (2016). pmid:26884261
View Article
PubMed/NCBI
Google Scholar

[245] View Article

[246] PubMed/NCBI

[247] Google Scholar

[ref64] 64. Schwab, J., Kimanius, D., Burt, A., Dendooven, T. & Scheres, S. DynaMight: estimating molecular motions with improved reconstruction from cryo-EM images. bioRxiv, 2023.10.18.562877 (2023).

[ref65] 65. Garibsingh R.A.A. et al. Rational design of ASCT2 inhibitors using an integrated experimental-computational approach. Proc Natl Acad Sci USA 118, e2104093118 (2021). pmid:34507995
View Article
PubMed/NCBI
Google Scholar

[250] View Article

[251] PubMed/NCBI

[252] Google Scholar

[ref66] 66. Eshun-Wilson L. et al. Effects of alpha-tubulin acetylation on microtubule structure and stability. Proc Natl Acad Sci USA 116, 10366 (2019).
View Article
Google Scholar

[254] View Article

[255] Google Scholar

[ref67] 67. Bock L.V. & Grubmuller H. Effects of cryo-EM cooling on structural ensembles. Nat Commun 13, 1709 (2022). pmid:35361752
View Article
PubMed/NCBI
Google Scholar

[257] View Article

[258] PubMed/NCBI

[259] Google Scholar

[ref68] 68. Brotzakis Z.F. et al. Determination of the Structure and Dynamics of the Fuzzy Coat of an Amyloid Fibril of IAPP Using Cryo-Electron Microscopy. Biochemistry-Us 62, 2407 (2023).
View Article
Google Scholar

[261] View Article

[262] Google Scholar

[ref69] 69. Jamali, K. et al. Automated model building and protein identification in cryo-EM maps. bioRxiv, 2023.05.16.541002 (2023).

[ref70] 70. Bonomi M. & Camilloni C. Integrative structural and dynamical biology with PLUMED-ISDB. Bioinformatics 33, 3999 (2017). pmid:28961689
View Article
PubMed/NCBI
Google Scholar

[265] View Article

[266] PubMed/NCBI

[267] Google Scholar

[ref71] 71. Peng L.M., Ren G., Dudarev S.L. & Whelan M.J. Robust parameterization of elastic and absorptive electron atomic scattering factors. Acta Crystallogr A 52, 257 (1996).
View Article
Google Scholar

[269] View Article

[270] Google Scholar

[ref72] 72. Paissoni C., Jussupow A. & Camilloni C. Determination of Protein Structural Ensembles by Hybrid-Resolution SAXS Restrained Molecular Dynamics. J Chem Theory Comput 16, 2825 (2020). pmid:32119546
View Article
PubMed/NCBI
Google Scholar

[272] View Article

[273] PubMed/NCBI

[274] Google Scholar

[ref73] 73. Sivia D. & Skilling J. Data Analysis: A Bayesian Tutorial. (Oxford University Press, Oxford; 2006).

[ref74] 74. Kroon P.C. et al. Martinize2 and Vermouth: Unified Framework for Topology Generation. Elife 12, RP90627 (2023).
View Article
Google Scholar

[277] View Article

[278] Google Scholar

[ref75] 75. Cavalli A., Camilloni C. & Vendruscolo M. Molecular dynamics simulations with replica-averaged structural restraints generate structural ensembles according to the maximum entropy principle. J Chem Phys 139, 094112 (2013).
View Article
Google Scholar

[280] View Article

[281] Google Scholar

[ref76] 76. Heller G.T. et al. Small-molecule sequestration of amyloid-beta as a drug discovery strategy for Alzheimer’s disease. Sci Adv 6, eabb5924 (2020).
View Article
Google Scholar

[283] View Article

[284] Google Scholar

[ref77] 77. Heller G.T. et al. Sequence Specificity in the Entropy-Driven Binding of a Small Molecule and a Disordered Peptide. J Mol Biol 429, 2772 (2017). pmid:28743590
View Article
PubMed/NCBI
Google Scholar

[286] View Article

[287] PubMed/NCBI

[288] Google Scholar

[ref78] 78. Jussupow A. et al. The dynamics of linear polyubiquitin. Sci Adv 6, eabc3786 (2020). pmid:33055165
View Article
PubMed/NCBI
Google Scholar

[290] View Article

[291] PubMed/NCBI

[292] Google Scholar

[ref79] 79. Cezar H.M. & Cascella M. SANS Spectra with PLUMED: Implementation and Application to Metainference. J Chem Inf Model 63, 4979 (2023). pmid:37552250
View Article
PubMed/NCBI
Google Scholar

[294] View Article

[295] PubMed/NCBI

[296] Google Scholar

[ref80] 80. Coutsias E.A., Seok C., Jacobson M.P. & Dill K.A. A kinematic view of loop closure. J Comput Chem 25, 510 (2004). pmid:14735570
View Article
PubMed/NCBI
Google Scholar

[298] View Article

[299] PubMed/NCBI

[300] Google Scholar

[ref81] 81. Jo S., Kim T., Iyer V.G. & Im W. CHARMM-GUI: A web-based graphical user interface for CHARMM. J Comput Chem 29, 1859 (2008). pmid:18351591
View Article
PubMed/NCBI
Google Scholar

[302] View Article

[303] PubMed/NCBI

[304] Google Scholar

[ref82] 82. Huang J. et al. CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nat Methods 14, 71 (2017). pmid:27819658
View Article
PubMed/NCBI
Google Scholar

[306] View Article

[307] PubMed/NCBI

[308] Google Scholar

[ref83] 83. Vanommeslaeghe K. et al. CHARMM General Force Field: A Force Field for Drug-Like Molecules Compatible with the CHARMM All-Atom Additive Biological Force Fields. J Comput Chem 31, 671 (2010). pmid:19575467
View Article
PubMed/NCBI
Google Scholar

[310] View Article

[311] PubMed/NCBI

[312] Google Scholar

[ref84] 84. MacKerell A.D. et al. All-atom empirical potential for molecular modeling and dynamics studies of proteins. J Phys Chem B 102, 3586 (1998). pmid:24889800
View Article
PubMed/NCBI
Google Scholar

[314] View Article

[315] PubMed/NCBI

[316] Google Scholar

[ref85] 85. Essmann U. et al. A Smooth Particle Mesh Ewald Method. J Chem Phys 103, 8577 (1995).
View Article
Google Scholar

[318] View Article

[319] Google Scholar

[ref86] 86. Abraham M.J. et al. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1–2, 19 (2015).
View Article
Google Scholar

[321] View Article

[322] Google Scholar

[ref87] 87. Paszke A. et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Adv Neur In 32 (2019).
View Article
Google Scholar

[324] View Article

[325] Google Scholar

[ref88] 88. Bussi G., Donadio D. & Parrinello M. Canonical sampling through velocity rescaling. J Chem Phys 126, 014101 (2007). pmid:17212484
View Article
PubMed/NCBI
Google Scholar

[327] View Article

[328] PubMed/NCBI

[329] Google Scholar

[ref89] 89. Berendsen H.J.C., Postma J.P.M., Vangunsteren W.F., Dinola A. & Haak J.R. Molecular-Dynamics with Coupling to an External Bath. J Chem Phys 81, 3684 (1984).
View Article
Google Scholar

[331] View Article

[332] Google Scholar

[ref90] 90. Parrinello M. & Rahman A. Polymorphic Transitions in Single-Crystals—a New Molecular-Dynamics Method. J Appl Phys 52, 7182 (1981).
View Article
Google Scholar

[334] View Article

[335] Google Scholar

[ref91] 91. Ferrarotti M.J., Bottaro S., Perez-Villa A. & Bussi G. Accurate Multiple Time Step in Biased Molecular Simulations. J Chem Theory Comput 11, 139 (2015). pmid:26574212
View Article
PubMed/NCBI
Google Scholar

[337] View Article

[338] PubMed/NCBI

[339] Google Scholar

[ref92] 92. Michaud-Agrawal N., Denning E.J., Woolf T.B. & Beckstein O. MDAnalysis: A Toolkit for the Analysis of Molecular Dynamics Simulations. J Comput Chem 32, 2319 (2011). pmid:21500218
View Article
PubMed/NCBI
Google Scholar

[341] View Article

[342] PubMed/NCBI

[343] Google Scholar

[ref93] 93. Siffer, A., Fouque, P.A., Termier, A. & Largouet, C. Are your data gathered? The Folding Test of Unimodality. Proceedings of the 24th Acm Sigkdd International Conference on Knowledge Discovery & Data Mining, 2210 (2018).

[ref94] 94. Wang K.T. et al. Cryo-EM reveals the architecture of placental malaria VAR2CSA and provides molecular insight into chondroitin sulfate binding. Nat Commun 12, 2956 (2021). pmid:34011972
View Article
PubMed/NCBI
Google Scholar

[346] View Article

[347] PubMed/NCBI

[348] Google Scholar

[ref95] 95. Dimos N. et al. CryoEM analysis of small plant biocatalysts at sub-2 angstrom resolution. Acta Crystallogr D 78, 113 (2022).
View Article
Google Scholar

[350] View Article

[351] Google Scholar