Fig 1.
Local roughness of the patches.
A) Discretized representation of the molecular surface of the TDP-43 fragment 209-269 (PDB id: 4BS2). Each point of the surface is coloured according to its local roughness value. B) Distribution of the roughness values found for each point i of the considered surface.
Fig 2.
Sampled points fractions for varying parameters.
Percentage of surface points that are sampled as α, β, γ and δ are varied. These plots were obtained by sampling the surface of A1.
Table 1.
Best sampling for each range of selected points (nS).
Fig 3.
Analysis of the limit cases of the sampling function.
A) Results obtained when there is no dependency on . B) Results obtained when there is no dependency on the distance from the patch center. C) Results obtained with the steepest sampling function. The first column shows the percentage of surface points that are sampled for varying parameters. The second column shows d for varying parameters; these plots are obtained by interpolating the values actually computed, which, when not set to zero or one, are all the combinations of α = [0.1, 0.2, 0.4, 0.6, 0.8, 1], β = [0, 0.2, 0.4, 0.6, 0.8, 1], γ = [0, 2, 4, 6, 8, 10] and δ = [0, 2, 4, 6, 8, 10]. The last column shows, for the best parameter combination in each limit case, the box plot for different mean
of
(in red) and
(in blue). These plots refer again to the application of this sampling method to the surface of A1.
Fig 4.
Selection of the absolute best sampling parameters.
A) d as a function of γ and δ. Each plot is obtained with a fixed value of β, and the same fixed value α = 1.0 is used for all plots. The colouring is given by the d value. The plotted surfaces are obtained by interpolating the values actually computed, which cover, for each β value, all possible combinations of γ = [0, 2, 4, 6, 8, 10] and δ = [0, 2, 4, 6, 8, 10]. B) For each plot on the left, the point corresponding to the highest d is selected. The colouring is given by the β value (as described by the color bar). The plotted surface is obtained by interpolating these points and shows, for all the δ-γ combinations, the value of β that results in the best sampling. The maximum of this surface (red points) corresponds to the best sampling parameters.
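The selection described in this caption can be sketched as a grid search: for each β, scan the γ-δ grid at fixed α = 1.0 and keep the point with the highest d, then take the best of those. The sketch below is illustrative only. `toy_score` is a hypothetical stand-in for d (its definition is not given in this caption) and is built to peak at β = 0, γ = 6, δ = 0 by construction, so its output is not a result. The function names are assumptions as well.

```python
import numpy as np

ALPHA = 1.0
BETAS = [0, 0.2, 0.4, 0.6, 0.8, 1]
GAMMAS = [0, 2, 4, 6, 8, 10]
DELTAS = [0, 2, 4, 6, 8, 10]


def toy_score(alpha, beta, gamma, delta):
    """Hypothetical placeholder for d; NOT the quantity used in the paper."""
    return -(beta ** 2) - ((gamma - 6) / 10) ** 2 - (delta / 10) ** 2


def best_per_beta(score):
    """For each beta, return the (gamma, delta, value) with the highest score."""
    best = {}
    for beta in BETAS:
        # Rows index gamma, columns index delta.
        grid = np.array(
            [[score(ALPHA, beta, g, d) for d in DELTAS] for g in GAMMAS]
        )
        i, j = np.unravel_index(np.argmax(grid), grid.shape)
        best[beta] = (GAMMAS[i], DELTAS[j], float(grid[i, j]))
    return best


def overall_best(per_beta):
    """Pick the beta whose best grid point has the highest score overall."""
    beta = max(per_beta, key=lambda b: per_beta[b][2])
    gamma, delta, value = per_beta[beta]
    return beta, gamma, delta, value


per_beta = best_per_beta(toy_score)
best_beta, best_gamma, best_delta, best_value = overall_best(per_beta)
```

Replacing `toy_score` with a function that actually runs the sampling and evaluates d would reproduce the search; the interpolation used for plotting is not needed to locate the maximum on the computed grid.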
Fig 5.
Comparison between the best sampling and the random selection of points for patches with different roughness.
A) On the left, an example of how a rough patch is represented with all the points of the surface (whole patch), with only the points resulting from the sampling (sampled points) and with randomly extracted points (random points). On the right, the same three representations for a flat patch. B) In red, the box plot for different ranges of patches’ roughness of . In blue, the box plot for different ranges of patches’ roughness of
. This is in the case of a sampling with parameters α = 1, β = 0, γ = 6 and δ = 0, whose combination results in the highest value of d for the A1 surface.
Fig 6.
Visualization of the 3D surfaces reconstruction.
A) 3D reconstruction of the A1 surface from all its surface points. B) The three columns depict the reconstruction of the same surface with increasing sampling density. In each column, the first row shows the reconstruction from the subset of the original points selected by the sampling, whereas the second row shows the reconstruction from a randomly extracted subset containing the same number of points.
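The random baseline in panel B is size-matched: it contains exactly as many points as the sampled subset. A minimal sketch of drawing such a baseline, assuming the surface is stored as an indexable set of `n_points` points and the sampling returns a list of indices (both assumptions; the function name is hypothetical):

```python
import numpy as np


def random_subset_like(n_points, sampled_idx, seed=0):
    """Draw as many distinct random point indices as the sampling selected."""
    rng = np.random.default_rng(seed)
    # replace=False guarantees that no point is picked twice.
    return rng.choice(n_points, size=len(sampled_idx), replace=False)


# Example: 1000 surface points, 50 of them selected by the sampling.
sampled_idx = list(range(0, 1000, 20))
random_idx = random_subset_like(1000, sampled_idx)
```

Fixing the seed keeps the baseline reproducible; repeating the draw with several seeds would show how much the random reconstruction varies.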
Table 2.
Comparison between the best and the average set of parameters for 70 proteins.
Fig 7.
Principal component analysis of the Zernike vectors of the total and sampled points for four binding regions.
For each of the four considered proteins, the PCA of the Zernike vectors describing the interacting region is performed. The first column shows the projection on the first two PCs of the Zernike vectors of each point in the patch (in blue) and of the points selected with the optimal sampling (in red), with the respective marginal density distributions. The percentage of sampled over total points is reported in the legend. The bar plots in the second column show the percentage of points sampled with different sets of parameters; the bar in the box corresponds to the optimal set. The third column shows the percentage of PC space spanned as a function of the number of sampled points for different sets of parameters. The last column reports the points in the PC space, colored according to the roughness of the patch centered on each point.
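The projection shown in the first column can be sketched with a plain SVD-based PCA: the components are fitted on the vectors of all points, and the sampled points are then read off in the same coordinates. This is a minimal sketch under assumptions, not the authors' pipeline: the random matrix stands in for real Zernike vectors, and the function name and array shapes are hypothetical.

```python
import numpy as np


def pca_project(vectors, n_components=2):
    """Project row vectors onto their first principal components.

    Returns the projected coordinates and the fraction of variance
    explained by each retained component.
    """
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


rng = np.random.default_rng(0)
all_vectors = rng.normal(size=(50, 10))   # placeholder for per-point descriptors
sampled_idx = np.arange(0, 50, 5)         # placeholder for the sampled points

scores, explained = pca_project(all_vectors)
sampled_scores = scores[sampled_idx]      # sampled points in the same PC space
```

Fitting the components once on all points and indexing into the result keeps both point sets in a common coordinate system, which is what makes the blue and red clouds comparable.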