Structural Basis of Mucopolysaccharidosis Type II and Construction of a Database of Mutant Iduronate 2-Sulfatases

Mucopolysaccharidosis type II (MPS II, Hunter syndrome) is an X-linked genetic disorder caused by a deficiency of iduronate 2-sulfatase (IDS), and missense mutations comprising about 30% of the mutations responsible for MPS II result in heterogeneous phenotypes ranging from the severe to the attenuated form. To elucidate the basis of MPS II from the structural viewpoint, we built structural models of the wild type and mutant IDS proteins resulting from 131 missense mutations (phenotypes: 67 severe and 64 attenuated), and analyzed the influence of each amino acid substitution on the IDS structure by calculating the accessible surface area, the number of atoms affected and the root-mean-square distance. The results revealed that the amino acid substitutions causing MPS II were widely spread over the enzyme molecule and that the structural changes of the enzyme protein were generally larger in the severe group than in the attenuated one. Coloring of the atoms influenced by different amino acid substitutions at the same residue showed that the structural changes influenced the disease progression. Based on these data, we constructed a database of IDS mutations as to the structures of mutant IDS proteins.


Introduction
Iduronate 2-sulfatase (IDS, EC 3.1.6.13) is a lysosomal enzyme that catalyses the hydrolysis of sulfate esters at the non-reducing-terminal iduronic acids in the glycosaminoglycans (GAGs) heparan sulfate and dermatan sulfate. A deficiency of IDS activity results in systemic accumulation of GAGs, leading to a rare metabolic disease, mucopolysaccharidosis type II (MPS II, OMIM 309900), which is also known as Hunter syndrome [1]. Patients with MPS II typically exhibit systemic manifestations including short stature, a specific facial appearance, dysostosis multiplex, thick skin, inguinal hernia, hepatosplenomegaly, hearing difficulty, ophthalmic problems, respiratory defects, heart disease, and occasional neurologic involvement. However, this disease exhibits a wide range of clinical phenotypes, from the "severe form" with progressive clinical deterioration to the "attenuated form" without mental retardation.
Enzyme replacement therapy (ERT) involving recombinant human IDS is approved in many countries for the treatment of MPS II patients. As recent studies demonstrated that ERT improved the manifestations of MPS II, especially when it was started at an early age [2][3][4][5], early diagnosis and clinical phenotype determination are becoming increasingly important for predicting the prognoses of patients and preparing a proper therapeutic plan.
The IDS gene is located at the Xq27/28 boundary [6], and contains nine exons spread over 24 kb [7,8]. It has been reported that the 2.3 kb IDS cDNA encodes a polypeptide of 550 amino acids showing high homology with the sulfatase protein family [8,9]. The synthesized IDS requires post-translational modification including removal of the signal sequence peptide, glycosylation, phosphorylation, proteolysis, and conversion of C84 to the catalytic formylglycine: the 76 kDa precursor is processed through intermediates to the 55 kDa and 45 kDa mature forms [10]. So far, at least 530 genetic mutations responsible for MPS II have been identified, and it is known that gross alterations lead to the severe form. However, missense mutations, which comprise about 30% of the MPS II mutations, result in heterogeneous phenotypes ranging from the severe to the attenuated form.
No crystal structure of human IDS has been reported, and only a few structural models have been constructed by means of homology modeling using the crystal structures of other sulfatases, including arylsulfatase A and arylsulfatase B, as templates [11][12][13][14]. In those studies, some missense mutations were mapped onto the predicted IDS structure and the effects of the amino acid substitutions were discussed. However, the number of mutations discussed was very small.
In this study, we built a new structural model of human IDS by means of the homology modeling and molecular dynamics methods, and predicted structural changes in IDS caused by 131 amino acid substitutions responsible for MPS II, and examined the relationship between the mutant IDS structures and the respective clinical phenotypes.

Missense mutations in the IDS gene
In this study, we analyzed 131 missense mutations in the IDS gene for which the MPS II phenotypes have been clearly described (67 severe and 64 attenuated). These missense mutations, the respective phenotypes, and the references are summarized in Table 1, and other missense mutations in the IDS gene that were excluded from this analysis are summarized in S1 Table.

Structural modeling of the human wild type IDS protein
Because no IDS crystal structure has been reported, we built a model of human IDS using the homology modeling server I-TASSER (zhanglab.ccmb.med.umich.edu/I-TASSER/) [15]. To take protein structure fluctuations into consideration, we conducted molecular dynamics (MD) calculations using Gromacs (version 5.0.5) [16]. In the simulation, we used the AMBER99sb force field and the SPC/E water model. First, we performed structure optimization using the steepest-descent method. Then, we carried out equilibration in two phases (NVT and NPT ensembles). As the production run, we performed a 10 ns MD simulation using Parrinello-Rahman pressure coupling and V-rescale temperature coupling. To obtain a representative structure, we used the simulated annealing method.
To build the structural models of the mutant IDS proteins, we used the molecular modeling software TINKER [17], and energy minimization was performed with the root-mean-square gradient value set at 0.05 kcal/mol·Å. Then, each mutant model was superimposed on the wild type IDS structure based on the Cα atoms by the least-square-mean fitting method. We defined an atom as influenced by an amino acid substitution when its position in a mutant IDS protein differed from that in the wild type one by more than the cut-off distance (0.15 Å) based on the total root-mean-square distance (RMSD), as described previously [18].
We calculated the numbers of influenced atoms in the main chain (the protein backbone: alpha carbon atoms linked to the amino group, the carboxyl one, and hydrogen atoms of the molecule) and the side chain (variable components linked to the alpha carbons of the molecule), and in the active site (D45, D46, C84, K135, and D334). Then, average numbers of influenced atoms were calculated for the severe MPS II group and the attenuated one.
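The counting step described above can be illustrated with a minimal sketch. It assumes the mutant model has already been superimposed on the wild type structure, and that each model is available as a mapping from (residue number, atom name) to coordinates in Å. The function name, the data layout, and the set of backbone atom names are illustrative assumptions and are not taken from the original analysis.

```python
import math

# Assumption: PDB-style atom names; backbone hydrogens ("H", "HA") are counted
# as main chain, following the main-chain definition given in the text.
MAIN_CHAIN_ATOMS = {"N", "CA", "C", "O", "H", "HA"}
CUTOFF = 0.15  # cut-off distance in Å, as stated in the text


def count_influenced_atoms(wild_type, mutant, cutoff=CUTOFF):
    """Count atoms displaced by more than `cutoff` between superimposed models.

    Both arguments map (residue_number, atom_name) -> (x, y, z) in Å.
    Only atoms present in both models are compared.
    """
    counts = {"main_chain": 0, "side_chain": 0}
    for key, wt_xyz in wild_type.items():
        mut_xyz = mutant.get(key)
        if mut_xyz is None:
            continue  # atom has no counterpart in the mutant model
        if math.dist(wt_xyz, mut_xyz) > cutoff:
            group = "main_chain" if key[1] in MAIN_CHAIN_ATOMS else "side_chain"
            counts[group] += 1
    return counts


# Toy coordinates (hypothetical, for illustration only).
wt = {(84, "CA"): (0.0, 0.0, 0.0), (84, "CB"): (1.5, 0.0, 0.0), (85, "N"): (0.0, 1.3, 0.0)}
mut = {(84, "CA"): (0.1, 0.0, 0.0), (84, "CB"): (1.5, 0.3, 0.0), (85, "N"): (0.0, 1.3, 0.2)}
print(count_influenced_atoms(wt, mut))
```

Restricting the count to residues D45, D46, C84, K135, and D334 would give the corresponding active-site figure under the same assumptions.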

Determination of the RMSD values of all atoms in the mutant IDS proteins
To determine the influence of each amino acid substitution on the total conformational change in IDS, the RMSD values of all atoms in the mutant IDS proteins were calculated according to the standard method [19]. Then, average RMSD values were calculated for the severe group and the attenuated one.
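As a minimal sketch of this calculation, the function below uses the usual definition of RMSD: the square root of the mean squared distance between paired atoms of two superimposed structures. It is assumed here that this is the definition meant by the standard method [19]. The function name and the list-of-tuples input format are illustrative assumptions.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square distance (Å) between two equally ordered atom lists.

    Each argument is a sequence of (x, y, z) tuples; atom i in `coords_a`
    must correspond to atom i in `coords_b`, and the structures are assumed
    to be superimposed already.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared_sum / len(coords_a))


# Toy example (hypothetical coordinates): each atom is shifted by 0.1 Å.
wild_type = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
mutant = [(0.0, 0.0, 0.1), (1.0, 0.1, 0.0)]
print(round(rmsd(wild_type, mutant), 3))
```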

Determination of the accessible surface area (ASA) of the amino acid residues of the IDS protein
The ASA of each amino acid residue in the structure of the wild type IDS was calculated using Stride [20] to determine the location of the residue in the IDS molecule. Then, the average ASA values of the residues for which substitutions had been identified in the severe group and the attenuated one were calculated.

Statistical analysis
Differences in the numbers of influenced atoms, the RMSD values, and the ASA values between the severe MPS II group and the attenuated one were assessed by means of Welch's t-test, and a difference was considered significant at p < 0.05.

Coloring of the atoms influenced by different amino acid substitutions at the same residue on IDS
Coloring of the influenced atoms in the three-dimensional structure of IDS was performed for amino acid substitutions including P480R (phenotype: severe), P480L (phenotype: attenuated), D334G (phenotype: severe), and D334N (phenotype: attenuated) as representatives, based on the distance between the wild type and mutant to determine the influence of the amino acid substitutions geographically and semi-quantitatively according to the method previously described [18].

Construction of a database including genotypes, clinical phenotypes, references and mutant IDS structures in MPS II
In order to help researchers and clinicians who study MPS II, we built a database including the genotypes, clinical phenotypes, references, and structures of mutant IDSs, according to the method described previously [21].

Structural model of the human IDS protein and locations of the amino acid residues associated with MPS II
A homology modeling server was used to construct a structural model of the wild type IDS protein. For this calculation, several known structures were used as templates (PDB IDs: 2w8s, 2vqr, 3b5q, and 2quz). The results of the homology modeling revealed that IDS consists of alpha/beta folds, and that it contains two antiparallel beta-sheets (each sheet has 4 beta strands) with alpha helices around them. The active site is located in the loop region near the N-terminal antiparallel beta-strand, and the D45, D46, C84, K135 and D334 residues comprising the putative active site [14] are indicated (Fig 1A). Then, to elucidate the relationship between the locations of the amino acid substitutions associated with MPS II and the respective phenotypes, we identified the positions of the residues in the wild type IDS protein structure (Fig 1B). The locations of the amino acid residues involved in the substitutions were widely spread over the enzyme molecule.

Numbers of atoms influenced by amino acid substitutions associated with MPS II
Then, we constructed structural models of the mutant IDS proteins and calculated the number of atoms influenced by the amino acid substitution for each mutant model. The results are summarized in Table 1. In the severe phenotypic group, the average values (± standard deviation, SD) for the influenced atoms in the main chain and the side chain were 106 (± 113) and 124 (± 125), respectively. In particular, 42 of the 67 severe cases (63%) had 50 atoms or more influenced in the main chain. On the other hand, in the attenuated group, the averages (± SD) of the influenced atoms in the main chain and the side chain were 63 (± 78) and 71 (± 84), respectively. Notably, regarding the main chain atoms, 39 of the 64 attenuated cases (61%) had 49 atoms or less affected. Fig 2 shows the means ± SD and boxplots of the influenced atoms of the main chain (Fig 2A) and the side chain (Fig 2B) in the severe MPS II group and the attenuated one. Welch's t-test revealed that there was a significant difference in the numbers of influenced atoms in both the main chain and the side chain (p < 0.05) between the severe and attenuated groups. These results indicate that the structural changes caused by the amino acid substitutions responsible for severe MPS II were generally larger than those for attenuated MPS II. As to the influence of amino acid substitutions on the active site, the atoms of the residues comprising the putative active site were affected in 31 of the 67 severe cases (46%) and 17 of the 64 attenuated ones (27%).

RMSD values between mutant IDS proteins and the wild type
To clarify the total structural change caused by each amino acid substitution, the RMSD value between the mutant IDS protein and the wild type was calculated, and the results are summarized in Table 1. The means ± SD and boxplots of the RMSD values for the severe MPS II group and the attenuated one are shown in Fig 2C. The average RMSD values (± SD) for the severe and attenuated groups were 0.074 (± 0.050) and 0.056 (± 0.043) Å, respectively. Welch's t-test showed that there was a significant difference in RMSD between the severe MPS II group and the attenuated one (p < 0.05). These results indicate that structural changes in the severe MPS II group were larger than those in the attenuated one.
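The group comparisons reported so far can be approximately re-derived from the summary statistics alone. The sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom from the rounded means, SDs, and group sizes quoted in the text. The function name is an illustrative assumption, and because the inputs are rounded, the values only approximate an analysis of the per-mutation data in Table 1.

```python
import math


def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# (severe mean, severe SD, severe n, attenuated mean, attenuated SD, attenuated n),
# taken from the rounded values reported in the text.
summaries = {
    "main chain atoms": (106, 113, 67, 63, 78, 64),
    "side chain atoms": (124, 125, 67, 71, 84, 64),
    "RMSD (Å)": (0.074, 0.050, 67, 0.056, 0.043, 64),
}

for label, stats in summaries.items():
    t, df = welch_from_summary(*stats)
    print(f"{label}: t = {t:.2f}, df = {df:.0f}")
```

With roughly 120 degrees of freedom, the two-sided 5% critical value of the t distribution is about 1.98, so t statistics above that threshold are consistent with the reported p < 0.05. An exact p-value would need the t distribution itself, for example from a statistics package.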

ASA of the amino acid residues associated with MPS II mutations
To identify the locations of the residues associated with MPS II mutations in the IDS protein, the ASA values of the residues in the wild type structure were calculated and the results are summarized in Table 1. Fig 2D shows the means ± SD and boxplots of ASA for the severe group and the attenuated one. In the severe MPS II group, the average ASA value (± SD) was 25.2 (± 40.8) Å². On the other hand, in the attenuated MPS II group, it was 27.9 (± 38.7) Å². Welch's t-test showed that there was no significant difference between the two groups (p > 0.05).
For 12 residues, different amino acid substitutions at the same residue were associated with different phenotypes (Table 1). For seven of those residues (N63, S71, G134, K227, N265, D334, and P480), the RMSD value and the number of affected atoms for the severe group were larger than those for the attenuated one. On the other hand, for 3 residues (H138, D198, and C432), the RMSD value and the number of affected atoms for the attenuated group were larger than those for the severe one. As to the other 2 residues (P228 and W345), it could not be determined for which phenotype the RMSD value and the number of affected atoms were larger. Then, we examined the structural changes in IDS caused by P480R (phenotype: severe), P480L (phenotype: attenuated), D334G (phenotype: severe), and D334N (phenotype: attenuated) by means of coloring of the atoms affected. P480 is located near the molecular surface, and far from the active site. P480R is thought to cause a large structural change around the substituted residue, although it does not affect the active site, leading to the severe phenotype. The structural change caused by P480L is small and the influenced atoms are limited, leading to the attenuated phenotype (Fig 3A). On the other hand, D334 is a residue comprising the active site, and the D334G and D334N substitutions are both thought to directly affect the structure of the active site. The structural change in the former is larger than that in the latter, leading to the difference in phenotype (Fig 3B).

Database of the genotypes, clinical phenotypes, references and mutant IDS structures in MPS II
We developed a database including the genotypes, clinical phenotypes, references and mutant IDS structures responsible for MPS II (mps2-database.org) (Fig 4A). The database provides readers with information on 530 MPS II mutations, which include 161 missense ones. The database is equipped with several tools. A text search tool is provided for searching the given text in selected fields of the database. Furthermore, using the control table option, users can search for MPS II mutations connected with specific phenotypes. A structural viewer is included in the database, which allows users to display the three-dimensional structures of the molecules using Jmol (Fig 4B). Users can also choose among many options for visualizing the structures of mutant IDS proteins.

Discussion
Structural information on defective IDS proteins is important for elucidating the pathogenesis of MPS II, and is also useful for understanding the basis of the disease in each patient and for preparing a proper therapeutic plan for him or her. So far, several structural models of the wild type IDS have been reported [11][12][13][14]. Unfortunately, they have not been registered in the Protein Data Bank (PDB), and thus we could not precisely compare our new model with them. However, judging from the figures in those reports, there appear to be no large differences in the overall structure between their models and ours, although there are small differences, namely in the loop regions of the molecule.
Regarding the locations of residues involved in amino acid substitutions, Kato et al. examined eight cases (phenotypes: 4 severe and 4 attenuated), and reported that the residues mutated in the severe phenotype probably interact directly with the active site residues or disrupt the hydrophobic core region of IDS, whereas the residues mutated in the attenuated phenotype were located in the peripheral region [13]. However, our study, in which 131 missense mutations (phenotypes: 67 severe and 64 attenuated) were analyzed, revealed that there was essentially no difference in the localization of amino acid substitutions between the severe group and the attenuated one (Figs 1B and 2D).
In order to examine structural changes caused by the amino acid substitutions, we calculated the numbers of atoms affected and the RMSD values. The results revealed that structural changes in the severe group were generally larger than those in the attenuated one (Fig 2A, 2B and 2C).
We examined structural changes caused by different substitutions at the same residue in the amino acid sequence of IDS and the phenotypes. Furthermore, structural changes caused by P480R, P480L, D334G, and D334N were examined as representative cases by coloring of the atoms affected in IDS. The results revealed that the structural changes influenced the disease progression.
Many expression studies have been performed, and the results revealed that the "attenuated" mutants expressed the precursor and small amounts of the mature IDS, resulting in residual enzyme activity, whereas the "severe" mutants expressed the precursor but no mature IDS [14][22][23][24][25]. Considering these findings, the majority of the "severe" missense mutants are thought to cause large structural changes and defects of the molecular folding, leading to rapid degradation and/or insufficient processing. On the other hand, the "attenuated" missense mutants generally cause small structural changes and moderate folding defects, leading to partial degradation of the enzyme protein and residual activity. Some mutants may directly affect the active site, leading to a decrease in enzyme activity according to the degree of the structural changes.
Finally, we built a database of IDS gene mutations responsible for MPS II, clinical phenotypes, references, and predicted structures of mutant IDSs. This user-friendly database can be accessed via the Internet.
In conclusion, we constructed structural models of the wild type and mutant IDS proteins by means of the latest homology modeling techniques, and showed the correlation between the structural changes and clinical phenotypes. This database based on the results of a structural study will help investigators and clinicians who study MPS II.