Comparative Study of Structural Changes Caused by Different Substitutions at the Same Residue on α-Galactosidase A

Missense mutations in the α-galactosidase A (GLA) gene comprising the majority of mutations responsible for Fabry disease result in heterogeneous phenotypes ranging from the early onset severe “classic” form to the “later-onset” milder form. To elucidate the molecular basis of Fabry disease from the viewpoint of structural biology, we comprehensively examined the effects of different substitutions at the same residue in the amino acid sequence of GLA on the structural change in the enzyme molecule and the clinical phenotype by calculating the number of atoms affected and the root-mean-square-distance value, and by coloring of the atoms influenced by the amino acid replacements. The results revealed that the severity of the structural change influences the disease progression, i.e., a small structural change tends to lead to the later-onset form and a large one to the classic form. Furthermore, the study revealed the residues important for expression of the GLA activity, i.e., residues involved in construction of the active site, a disulfide bond or a dimer. Structural study from such a viewpoint is useful for elucidating the basis of Fabry disease.


Introduction
Fabry disease (MIM 301500) is an X-linked genetic disorder resulting from a deficiency of α-galactosidase A (GLA; EC 3.2.1.22) activity [1]. GLA deficiency causes the progressive accumulation of glycolipids, predominantly globotriaosylceramide, in lysosomes of cells. The disease exhibits a wide range of clinical phenotypes, from the early-onset severe “classic” form to the “later-onset” milder one [2]. Generally, male patients with the classic form of Fabry disease, who have little or no GLA activity, develop pain in the peripheral extremities, hypohidrosis, angiokeratomas and corneal opacities in childhood or adolescence, and manifest renal, cardiac, and cerebrovascular complications in the fourth to fifth decade of life [3]. On the other hand, male patients with the later-onset form, who have residual GLA activity, develop heart and kidney disorders without the childhood symptoms [4]. Owing to random X-chromosome inactivation, heterozygous Fabry females exhibit a wide spectrum of disease severity, ranging from asymptomatic to the classic presentation [5].
The GLA gene is localized to Xq22.1 and encodes a precursor GLA comprising a 429-amino acid polypeptide. The enzyme is glycosylated and then processed to the mature form comprising 398 amino acids, and it exists as a homodimer in lysosomes [1].
Each monomer contains a (β/α)8 barrel domain containing the active site and an anti-parallel β-sheet domain [6]. So far, more than 600 genetic mutations causing Fabry disease have been identified, and it is known that gross alterations, nonsense mutations, and most of the splicing mutations of the GLA gene lead to the classic form. However, missense mutations, which comprise the majority of mutations, result in heterogeneous phenotypes ranging from the classic form to the later-onset one.
Previously, Garman and his research group determined the GLA structure by means of X-ray crystallography and analyzed the locations of missense and nonsense mutations in the three-dimensional structure [6,7]. Our research group studied structural changes caused by missense mutations responsible for Fabry disease by calculating the numbers of affected atoms and the root-mean-square distance (RMSD) values [8], and proposed a phenotype prediction model based on sequential and structural information [9].
In this study, we comprehensively examined different substitutions at the same residue in the amino acid sequence of GLA, focusing on their effects on the structural change in the enzyme protein and the clinical phenotype, as such investigation will provide us with information about the relationship between the enzyme structure and the disease.

GLA missense mutations
We collected GLA missense mutations and polymorphisms registered on the Human Gene Mutation Database (http://www.hgmd.cf.ac.uk/) and Fabry database (http://fabry-database.org/). From them, we selected cases in which two or more substitutions at the same residue in the amino acid sequence of GLA have been reported. Finally, we analyzed 157 amino acid substitutions at 67 residues in this study.
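The selection step can be illustrated with a minimal sketch. It assumes the mutations are available as one-letter labels such as "M72I"; the function name `group_by_residue`, the `min_substitutions` parameter, and the short input list are hypothetical and are not the dataset or the procedure used in the study.

```python
import re
from collections import defaultdict

# One-letter missense notation: wild-type residue, position, replacement.
_MISSENSE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def group_by_residue(mutations, min_substitutions=2):
    """Group substitutions by residue and keep residues that have at least
    `min_substitutions` distinct replacements (threshold is an assumption)."""
    groups = defaultdict(set)
    for name in mutations:
        match = _MISSENSE.match(name.strip())
        if match is None:
            raise ValueError(f"not a missense label: {name!r}")
        wild, position, mutant = match.groups()
        groups[(wild, int(position))].add(mutant)
    # Order the output by sequence position for readability.
    ordered = sorted(groups.items(), key=lambda item: item[0][1])
    return {
        f"{wild}{position}": sorted(mutants)
        for (wild, position), mutants in ordered
        if len(mutants) >= min_substitutions
    }


# Illustrative input only; R112H is dropped because it appears once here.
example = ["M72I", "M72R", "M72V", "E66G", "E66K", "E66Q", "R112H"]
selected = group_by_residue(example)
print(selected)
```

Keeping the threshold as a parameter makes the selection criterion explicit and easy to change.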
Calculation of the number of atoms influenced by an amino acid substitution and the RMSD values between the wild type GLA and mutant GLAs
Each mutant model was superimposed on the wild type GLA structure based on the Cα atoms by a least-squares fitting algorithm, in which the optimal rotations and translations are found by minimizing the sum of the squared distances among all structures in the superposition [15][16][17][18][19]. We defined an atom as affected by an amino acid substitution when its position in a mutant differed from that in the wild type structure by more than 0.15 Å. We calculated the numbers of atoms affected in the main chain and in the side chain of the enzyme, and in the active site (D170 and D231). Then, we calculated the RMSD values between the wild type GLA and mutant GLAs [15][16][17][18][19].
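The counting and RMSD steps can be sketched as follows. This is a minimal illustration, not the software used in the study: it assumes the two structures are already superimposed, that atoms are keyed by a hypothetical (chain, residue number, atom name) tuple, that the main chain consists of the N, CA, C, and O atoms, and that the RMSD is taken over all atoms present in both structures. Only the 0.15 Å cutoff comes from the text.

```python
import math

THRESHOLD = 0.15                         # Å, cutoff stated in the text
MAIN_CHAIN = {"N", "CA", "C", "O"}       # assumed backbone atom names


def compare_structures(wild, mutant, threshold=THRESHOLD):
    """Count affected atoms and compute the RMSD between two superimposed
    structures given as dicts: (chain, resnum, atom_name) -> (x, y, z)."""
    shared = sorted(wild.keys() & mutant.keys())
    if not shared:
        raise ValueError("no atoms in common")
    distances = {key: math.dist(wild[key], mutant[key]) for key in shared}
    affected = [key for key in shared if distances[key] > threshold]
    main = sum(1 for key in affected if key[2] in MAIN_CHAIN)
    rmsd = math.sqrt(sum(d * d for d in distances.values()) / len(shared))
    return {
        "main_chain": main,
        "side_chain": len(affected) - main,
        "rmsd": rmsd,
        "distances": distances,
    }


# Toy coordinates (hypothetical), three atoms of one residue.
wild = {("A", 72, "N"): (0.0, 0.0, 0.0),
        ("A", 72, "CA"): (1.5, 0.0, 0.0),
        ("A", 72, "CB"): (2.0, 1.2, 0.0)}
mutant = {("A", 72, "N"): (0.0, 0.0, 0.0),
          ("A", 72, "CA"): (1.5, 0.1, 0.0),
          ("A", 72, "CB"): (2.0, 1.2, 0.3)}
result = compare_structures(wild, mutant)
print(result["main_chain"], result["side_chain"], round(result["rmsd"], 3))
```

Restricting the comparison to atoms present in both structures avoids comparing side-chain atoms that exist only in the wild type or only in the mutant residue.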
Determination of the solvent-accessible surface area (ASA) value
The ASA value of an amino acid residue in the wild type GLA was calculated using Stride (http://webclu.bio.wzw.tum.de/stride/) to evaluate the location of the residue in the GLA molecule.

Coloring of the atoms influenced by an amino acid substitution
To determine the influence of the amino acid substitutions spatially and semi-quantitatively, coloring of the influenced atoms in the three-dimensional structure of the enzyme molecule was performed for 12 mutants (M72I, M72R, M72V, E66G, E66K, E66Q, C56G, C56F, C56Y, W236C, W236L, and W236R) at four positions (M72, E66, C56, and W236) in the GLA structure. Affected atoms were colored according to the distance between their positions in the wild type and mutant structures.
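Distance-based coloring of this kind can be sketched as a simple binning of per-atom displacements. The bin edges and color names below are hypothetical, since the text does not state the scale that was used; only the 0.15 Å lower cutoff for an affected atom is taken from the Methods.

```python
import bisect

# Hypothetical upper bin edges (Å); 0.15 Å is the cutoff for "affected".
BIN_EDGES = [0.15, 0.30, 0.45, 0.60]
# One colour per bin, from unaffected to most displaced (assumed palette).
COLORS = ["grey", "cyan", "green", "yellow", "red"]


def color_for(distance):
    """Return the colour for an atom displaced by `distance` Å.
    Values equal to an edge fall in the lower bin, so exactly 0.15 Å
    still counts as unaffected ("more than 0.15 Å" is required)."""
    return COLORS[bisect.bisect_left(BIN_EDGES, distance)]


for d in (0.10, 0.15, 0.20, 0.70):
    print(d, color_for(d))
```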

Statistical analysis
To determine the differences in the number of affected atoms and the RMSD value between the classic Fabry group and the later-onset one, statistical analysis was performed using Excel 2013 (Microsoft, Redmond, WA) by means of a one-sided Welch's t test, a difference being considered significant if p < 0.05. Then, power analysis (http://www.statmethods.net/stats/power.html) was performed using G*POWER3 to evaluate the statistical power of this Welch's t test [20]. In the power analysis calculation, the sample sizes of the two groups and the significance level were set to 134, 11, and 0.05, respectively.
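For readers unfamiliar with Welch's t test, the statistic and its Welch-Satterthwaite degrees of freedom can be computed as in the sketch below. It uses only the Python standard library and hypothetical toy values; it is not the Excel or G*POWER3 workflow described above. The one-sided p value would then be read from a t distribution with the returned degrees of freedom, which requires a statistics package and is omitted here.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t test.
    t > 0 means the mean of sample_a exceeds that of sample_b."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Per-group contribution to the squared standard error of the difference.
    term_a = statistics.variance(sample_a) / n_a
    term_b = statistics.variance(sample_b) / n_b
    diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    t = diff / math.sqrt(term_a + term_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (term_a + term_b) ** 2 / (
        term_a ** 2 / (n_a - 1) + term_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical toy values, not data from the study.
group_a = [2, 4, 6, 8, 10]
group_b = [1, 2, 3, 4, 5]
t_stat, dof = welch_t(group_a, group_b)
print(round(t_stat, 3), round(dof, 2))
```

Welch's form is the natural choice when group sizes and variances differ markedly, as with groups of 134 and 11 cases.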

Results
Different substitutions at the same residue in the amino acid sequence of GLA
We examined the numbers of affected atoms for the whole enzyme protein and for the active site, and the RMSD and ASA values. The results are shown in Table 1. The numbers of atoms affected in the main chain and in the side chain, and the RMSD values in the classic Fabry group were 107 ± 129 (134), 131 ± 152 (134), and 0.089 ± 0.074 Å (134), respectively. The values are expressed as average ± standard deviation (number of cases). On the other hand, in the later-onset Fabry group, they were 23 ± 36 (11), 28 ± 50 (11), and 0.033 ± 0.038 Å (11), respectively. The statistical analysis showed significant differences between the classic Fabry group and the later-onset Fabry group in the numbers of affected atoms in the main chain (P < 0.001, Welch's t test) and in the side chain (P < 0.001, Welch's t test), and in the RMSD (P < 0.001, Welch's t test). The results of the power analysis revealed that the estimated values of power were 0.70, 0.72, and 0.80 for the numbers of affected atoms in the main chain and in the side chain, and the RMSD, respectively. This suggests that the structural change resulting from the amino acid substitutions leading to the classic phenotype is essentially greater than that in the later-onset one, although there are some exceptional cases, i.e., in R112H and R301Q, the numbers of affected atoms and the RMSD values are apparently large, although the patients with these mutations exhibited the later-onset phenotype (Table 1). Furthermore, the results revealed that there were no later-onset Fabry cases in which the structure of the active site was affected, although there were 57 affected cases among the 134 classic Fabry ones. This suggests that a defect of the active site tends to lead to the classic phenotype.

Structural analysis of representative amino acid substitutions
We examined different amino acid substitutions at M72, E66, C56, and W236, because they are expected to provide us with useful information for elucidating the mechanism by which structural changes caused by them influence the severity of the disease and for identifying residues essential for the maintenance of proper folding. The localization of these residues in the dimer is shown in Fig. 1. The residues are widely distributed over the GLA molecule and are distant from the catalytic residues (D170 and D231).
M72 (M72I, M72R, and M72V). M72 is located on the α-helix (66-84) of the (β/α)8 barrel domain. The ASA value of this residue is 0 Å², suggesting that it is fully buried. The numbers of atoms influenced by M72I in the main chain, side chain and active site are 38, 46 and 0, respectively, the RMSD value being 0.054 Å. The numbers of atoms influenced by M72R in the main chain, side chain and active site are 145, 198, and 1, respectively, the RMSD value being 0.119 Å. Considering the results, the structural changes in GLA caused by these amino acid substitutions are thought to be large. The patients with these mutations exhibited the classic form of Fabry disease. On the other hand, as to M72V, the numbers of atoms influenced in the main chain, side chain and active site are 7, 6 and 0, respectively, the RMSD value being 0.026 Å. This suggests that the structural change caused by M72V is small, and that it does not affect the active site. The patients with M72V exhibited the later-onset Fabry disease. Coloring of the influenced atoms allowed clear visualization of the differences in the structural changes between these cases (Fig. 2a).
E66 (E66G, E66K and E66Q). E66 is located on the α-helix (66-84) of the (β/α)8 barrel domain. The ASA value is 29.2 Å², suggesting that the residue is half-exposed to the solvent. For the E66G substitution, the numbers of atoms influenced in the main chain, side chain and active site are 45, 74, and 0, respectively, the RMSD value being 0.062 Å. For the E66K substitution, the numbers of atoms affected in the main chain, side chain and active site are 422, 503, and 7, respectively, the RMSD value being 0.361 Å. The patients with such large structural changes exhibited the classic form of Fabry disease. On the other hand, as to the E66Q substitution, which has been reported to be a functional polymorphism [21], the numbers of atoms affected in the main chain, side chain, and active site are 23, 32, and 0, respectively, the RMSD value being 0.048 Å. These results suggest that the structural change is moderate and that it does not affect the active site. Fig. 2b clearly shows that the structural change caused by E66Q is restricted to a small region on the molecular surface, although those caused by E66G and E66K extend over a broad area around the substituted residue.
C56 (C56G, C56F, and C56Y). C56 is located between two α-helices (47-50 and 66-84). The ASA value of the residue is 38.4 Å², suggesting that it is exposed to the solvent. The C56 residue forms a disulfide bond with C63 (Fig. 3), and it plays an important role in maintaining the conformation of the enzyme molecule. Fig. 2c shows the structural changes caused by the C56G, C56F, and C56Y amino acid substitutions. These amino acid substitutions at the C56 position are predicted to disturb the formation of the disulfide bond between C56 and C63, and thus the mutant proteins would be excessively degraded before they are transported to the lysosomes. All of the patients with these mutations presented the classic form of Fabry disease.
W236 (W236C, W236L, and W236R). W236 is located on the α-helix (236-247) of the (β/α)8 barrel domain, the ASA value being 40.6 Å², suggesting that the residue is exposed to the solvent. As Fig. 2d shows, the structural changes caused by W236C, W236L, and W236R are small: the numbers of atoms in the main chain affected by W236C, W236L, and W236R are 2, 0, and 6, respectively, and those in the side chain are 7, 2, and 23, respectively, the RMSD values being 0.012 Å, 0.005 Å, and 0.025 Å, respectively. None of them affects the active site. However, as W236 is located on the dimer interface of GLA (Fig. 1), and the side chain of W236 forms a hydrogen bond with E358 (Fig. 4), the amino acid substitution is thought to affect the conformation of the GLA molecule.

Discussion
Recently, the results of newborn screening revealed a high incidence (1 in approximately 1,250-9,000) of Fabry disease [22][23][24]. As Fabry disease can be treated with recombinant human GLAs [25][26][27], it is very important to understand the basis of the disease and to predict the outcome for patients found on screening. For this purpose, a structural study will provide us with valuable information. Garman and Garboczi reported that there are at least two classes of mutations in GLA that lead to disease progression: those near the active site and those of buried residues distant from the active site that adversely affect the folded state of the molecule, and that the residues mutated in the mild phenotype tend to be more solvent-accessible than those mutated in the severe one [6]. Our research group obtained essentially the same results as those of Garman and Garboczi. Our previous study revealed that structural changes in the classic Fabry group were generally large and tended to be localized to the core region or located in the functionally important region including the active site, and that those in the later-onset group were small and localized on the surface of the molecule [8].
As further structural study, we focused on different substitutions at the same residue in the amino acid sequence of GLA, because such specific cases are useful for examining the influence of the severity of the structural changes on the disease progression and for identifying the residues important for the expression of GLA activity.
In this study, we selected 157 amino acid substitutions at 67 residues from two databases, and examined the correlation between the structural changes in GLA and the clinical phenotype. The results revealed that the structural changes leading to later-onset Fabry disease tend to be smaller than those leading to classic Fabry disease. For example, M72 is buried and E66 is exposed to the solvent, yet at both residues, amino acid substitutions causing a small structural change (M72V and E66Q) lead to later-onset Fabry disease or a functional polymorphism, whereas those causing a large structural change (M72I, M72R, E66G, and E66K) result in classic Fabry disease. This study also revealed residues important for expression of the GLA activity. A structural change affecting the active site tends to lead to the classic form. C56 and W236 are thought to be involved in the formation of a disulfide bond and the dimer, respectively. Substitutions at these residues should affect proper folding and lead to classic Fabry disease, even if the structural change is small.
In conclusion, we investigated the effects of different substitutions at the same residue in the amino acid sequence of GLA on structural changes in the enzyme molecule and the clinical phenotype. The results revealed that structural changes influence the disease progression. Structural study from such a unique viewpoint is useful for elucidation of the basis of Fabry disease.