Fig 1.
(A) Beginning with the collection of ALPL mutations associated with different HPP phenotypes and the computational modeling of the TNSALP protein structure, molecular signatures of mutational hotspots were calculated. In addition to commonly used signatures, three levels of parameters for describing mutations were analyzed: conservation and coevolution analysis at the sequence level, PSN-based network matrices at the structural level, and ENM-related features at the dynamics level. (B) Classification and prediction of the pathogenicity of ALPL mutations based on statistical analysis of the molecular signatures and the construction of machine learning models. (C) Allosteric effect analysis of predicted mutations by single-residue perturbation, molecular dynamics, and long-range pathway analysis.
Fig 2.
Sequence and structural analysis of ALPL mutations in the TNSALP protein.
(A) The composition and statistics of ALPL mutations based on three categories: mild HPP, severe HPP, and the control group. (B) The conservation distribution of mutations related to different phenotypes. (C) The structure of the TNSALP protein and the distribution of mutation sites in the active site (red), calcium site (purple), crown domain (brown), dimeric interface (blue), and N-terminal domain (green). Missense mutations in the three groups are shown as spheres colored according to the domain to which they belong. (D) Distribution of mutations with different clinical phenotypes along the length of the TNSALP protein.
Fig 3.
The (A) entropy (S(i)), (B) coevolutionary (MI), and (C) relative solvent accessible area (RASA) profiles for each residue in TNSALP; mild, severe, and control mutations are highlighted as yellow, red, and blue diamonds, respectively. Comparison of (D) S(i), (E) MI, and (F) RASA between the control group mutations and the mild and severe mutations. Statistical significance was determined by the Wilcoxon signed-rank test, with P values < 0.01.
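As an illustration of the per-residue entropy profile S(i) in panel (A), the following is a minimal sketch that assumes S(i) is the Shannon entropy of column i of a multiple sequence alignment. The caption does not state the exact definition or gap handling used, so the log base, the gap rule, the function names, and the toy alignment are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column.

    Assumption: gap characters ('-') are ignored; the caption does not
    say how gaps were treated.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(alignment):
    """Per-position entropy for a list of equal-length aligned sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy([seq[i] for seq in alignment]) for i in range(length)]


# Hypothetical toy alignment (not TNSALP data): position 0 is fully
# conserved, position 1 is the most variable.
toy_alignment = ["MKA", "MRA", "MKG", "MHA"]
profile = entropy_profile(toy_alignment)
```

Under this definition, low values mark conserved positions and high values mark variable ones.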
Fig 4.
(A) The folding free energy changes induced by single point mutations. The profiles of mild, severe, and control mutations are shown as yellow, red, and blue bars, respectively. (B) Differences in the predicted folding free energy change of ALPL single-residue variations among the control, mild, and severe groups were assessed by the Wilcoxon test, with P values < 0.01. (C) The structural distribution of disease-causing mutations with low ΔΔG.
Fig 5.
Network topological and dynamic analysis of mutations in ALPL.
(A) ΔDC, (B) ΔBC, (C) ΔCC, (D) ΔC, (E) MSF, (F) effectiveness, (G) sensitivity, (H) MBS, and (I) stiffness profiles for the three types of mutations. Control, mild, and severe mutations are represented as blue, orange, and red bars, respectively.
Fig 6.
Comparison of the changes in four network centralities and five dynamics-based parameters, including (A) ΔDC, (B) ΔBC, (C) ΔCC, (D) ΔC, (E) MSF, (F) effectiveness, (G) sensitivity, (H) MBS, and (I) stiffness, among mutations in the control, mild, and severe groups by the Wilcoxon signed-rank test.
Fig 7.
(A) The distribution of ΔΔG and ΔBC for ALPL mutations, shown as a scatterplot of ΔBC versus ΔΔG for the different mutation types. Severe, mild, and control mutations are depicted in red, yellow, and blue, respectively. N47I, L289F, and M355I are three severe mutations with low ΔΔG and high ΔBC. Among the three significant mutations predicted by the scatterplot, two (E452K and R391K) were not originally included in the severe mutation group but were validated as severe mutations in the newly collected clinical samples. (B) Mean values over three replicas of the differential RMSF (ΔRMSF) of N47I (blue), L289F (red), and M355I (green) with respect to WT. For each system, one 500 ns replica was selected to compare the BC values of the two networks for (C) TNSALP WT and (D) the N47I mutant. The green and grey lines show the BC values of residues in DRN and AACEN. Mild and severe mutations are highlighted as yellow and red diamonds, respectively.
Fig 8.
Allosteric paths originating at three mutational sites and terminating at S258 in the WT (A) and mutant (B) states, and terminating at N190 in the WT (C) and mutant (D) states. The TNSALP structure is depicted as a semitransparent colored cartoon, and the starting and ending residues of all paths are represented as green and cyan spheres, respectively. The alpha-carbons of the residues along each path are shown as silver (chain A) and orange (chain B) spheres, and active sites are represented by red spheres.
Table 1.
The constituent residues of the shortest pathways from the three severe mutational sites (N47, L289, and M355) to N190/S258.
Residues in the two chains are denoted by different colors.
Fig 9.
Allosteric effects of three studied severe mutations (N47I, L289F and M355I) and two predicted severe mutations (R391C and E452K) calculated by AlloSigMA.
Cartoon structures of the TNSALP protein colored according to the free energy values obtained for (A) N47I, (B) L289F, (C) M355I, (D) R391C, and (E) E452K. Blue indicates negative allosteric free energy and red indicates positive modulation. (F) The corresponding free energy profiles, with the residue index (chain A: 1–524; chain B: 525–1048) on the x-axis and the Δg value on the y-axis. Blue, yellow, green, pink, and red profiles represent the results for N47I, L289F, M355I, R391C, and E452K, respectively.
Fig 10.
Performance evaluation of feature classification.
(A) Heatmap of pairwise Spearman’s rank correlation coefficients between different features. (B) Importance of all features used, ranked by mean decrease in accuracy in the RF classification. ROC curves, plotted as a function of 1 − specificity, and the corresponding AUCs for the 14 features, evaluating each feature in classifying ALPL mutations between (C) the control and severe groups, (D) the control and mild groups, and (E) the mild and severe groups.
Table 2.
The AUCs of 14 characteristics or parameters in each comparison group.
The characteristics that performed best in each group are highlighted in red.
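For readers who want to reproduce a per-feature AUC of the kind tabulated here, the following is a minimal sketch that uses the rank-based (Mann-Whitney) formulation of the AUC for a single feature and two groups. It is not taken from the paper's pipeline; the function name and the score values are hypothetical, and it assumes that larger feature values point toward the positive group.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC of a single feature used directly as a classifier score.

    Equals the probability that a randomly chosen positive example has a
    higher score than a randomly chosen negative one, counting ties as 0.5.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical feature values for two groups (made-up numbers, not study data).
severe_scores = [2.1, 3.4, 0.9, 2.8]
control_scores = [0.3, 1.0, 0.5]
auc = auc_from_scores(severe_scores, control_scores)
```

An AUC of 0.5 corresponds to no separation between the two groups and 1.0 to perfect separation. If a feature is lower in the positive group, its AUC under this convention falls below 0.5 and can be read as 1 − AUC.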