Prediction of Structure of Human WNT-CRD (FZD) Complex for Computational Drug Repurposing

The observed genetic alterations of various extracellular and intracellular WNT (Wingless, Int-1 proto-oncogene) signaling components can result in an increase or decrease in gene expression, and hence can be obstructed proficiently. These genetics target sites may include the prevention of WNT-FZD (Frizzled) binding, destruction of β-catenin and formation of Axin, APC and GSK-3β complex. Hence, the localized targeting of these interacting partners can help in devising novel inhibitors against WNT signaling. Our present study is an extension of our previous work, in which we proposed the co-regulated expression pattern of the WNT gene cluster (WNT-1, WNT-6, WNT-10A and WNT-10B) in human breast carcinoma. We present here the computationally modeled three dimensional structure of human WNT-1 in complex with the FZD-1 CRD (Cysteine Rich Domain) receptor. The dimeric cysteine-rich domain was found to fit into the evolutionarily conserved U-shaped groove of WNT protein. The two ends of the U- shaped cleft contain N-terminal and C-terminal hydrophobic residues, thus providing a strong hydrophobic moiety for the frizzled receptor and serving as the largest binding pocket for WNT-FZD interaction. Detailed structural analysis of this cleft revealed a maximum atomic distance of ∼28 Å at the surface, narrowing down to ∼17 Å and again increasing up to ∼27 Å at the bottom. Altogether, structural prediction analysis of WNT proteins was performed to reveal newer details about post-translational modification sites and to map the novel pharmacophore models for potent WNT inhibitors.


Introduction
The large family of WNT ligands manipulates many diverse functions in humans, for example: embryonic induction, generation of cell polarity, and specification of cell fate [1]. At sequencelevel, amino acid similarity within 19 WNT homologues ranges from 27% to 83% [2]. Approximately 43 kDa glycoprotein is encoded by WNTs [3]. The Wnt signal-transduction pathway has been widely conserved during animal evolution including mouse, Caenorhabditis elegans, and Drosophila [4,5]. The conserved cysteine motifs at the C-terminus help WNT ligands to bind with Frizzled (FZD) receptors and initiate the WNT signaling cascade [1]. In summary, basic shared features of all WNTs comprise a signal sequence for secretion, the WNT family signature, a number of highly charged amino acid residues, numerous glycosylation sites, trans-membrane helices and conserved cysteines (Fig. 1).
The WNT signaling pathway is intricately linked with different types of cancers including colon cancer, breast cancer, gastric cancer, pancreatic and heptocellular carcinoma etc [6][7][8]. Tumor genesis can be caused by genetic alterations in Axin, Adenomatous Polyposis Coli (APC), b-catenin; loss of DKK, SFRP & WIF; and mutations in b-catenin-Axin-adenomatous polyposis coli (APC)glycogen synthase kinase (GSK)-3b multi-protein complex [9]. Combination of small secreted signaling proteins of the WNT family and their receptors, makes up the most complex relationship in this signaling pathway. WNT ligands bind at the Cysteine Rich Domain (CRD) of Frizzled receptors and together with FZD and LRP form a signaling complex to initiate the WNT pathway [10]. A number of routes have been observed to interrupt the WNT signaling pathway in numerous loss-of-function and gain-of-function mutations. These include the binding of WNT ligands with their receptors to form ligand-receptor complexes, extracellular inhibition of small WNT ligands by inhibiting their accurate processing, inactivation of the intracellular complex of Axin and GSK-3b, destruction of the destructor (b-catenin) and hyper-activation of naturally occurring WNT inhibitors (WIF, DKK and SFRP) [11]. Post-translational modifications are essential for accurate processing of WNT ligands. As WNTs are the secretary proteins, possessing a signal sequence which is necessary for proper targeting, this signal sequence is recognized by resident kinases of endoplasmic reticulum and hence glycosylate wingless proteins, before further processing [12]. WNT proteins are Nlinked glycosylated, which may not be important for their activation but is necessary for their secretion and function. However, this glycosylation is always in competition with the disulphide bond formation. For the canonical WNT signaling pathway to become activated, palmitoylation of WNT is necessary [13]; on the other hand this palmitoylation also helps WIF to inhibit WNT signaling [14]. Inside the endoplasmic reticulum (ER), accurate processing of WNT requires porcupine, which also causes its palmitoylation thus interfering with disulphide bond formation and finishing the process of glycosylation.
The most intricate and least studied route of WNT signaling inhibition includes the targeting of small WNT ligands and the study of the ligand-receptor complex. The reason for this is that the primary amino acid sequence of WNT implies that they are soluble in water; however, the secreted WNTs are surprisingly hydrophobic and are associated with membranes. The hydrophobicity of WNTs is one of the reasons why no crystal structure of WNT has yet been identified [12]. Complex dimerization of FZD CRD is also important for signaling pathway activation [15,16].The purposes of our study is to computationally model the tertiary structure of human WNT and FZD CRD proteins and suggest the important interacting residues of the receptor and the ligand involved in the activation of this pathway which could possibly be targeted to inhibit the interactions and hence stop the abnormal signaling pathway. The computationally modeled threedimensional structures of WNT and CRD yielded good quality values when critically analyzed through various evaluation softwares. The Root Mean Square Deviation (RMSD) values calculated for Ca and the side chain also verified the quality assurance of modeled structures. The binding of a dimerized CRD domain into the potential binding pocket of WNT demonstrates two important facts: firstly, the CRD dimerization is necessary for initiation of WNT pathway; second, the structural details of WNT can be exploited to locate the significant residues of palmitoylation and glycosylation for targeted drug therapies. Our present study is an extension of Ain et al., 2011 [17], in which we proposed the coregulated expression pattern of WNT gene cluster (WNT-1, WNT-6, WNT-10A and WNT-10B) in human breast carcinoma. The present study aims to computationally model the structures of human WNT-1 and its receptor protein CRD of Frizzled. Along with this, structures of WNT-6, WNT-10A, and WNT-10B are predicted and refined. Lipid modified Xenopus WNT-8 in complex with Mouse FZD-8 was crystallized by Janda et al., 2012 [18]. However, in our study we modeled the binding interactions of WNT-1 complex with FZD-1 in Homo sapiens.

Dataset collection
The sequence similarities of the two closest homolog clusters, (WNT-1 and WNT-6) and (WNT-10A and WNT-10B) is 40% and 60% respectively. ClustalX [19] was used for sequence alignment and further editing was performed using GenDoc. Sequences of all WNT proteins were extracted from Ensemble genome browser [20], and subjected to PSI-BLAST with a protein structure database [21] to find the best template. As no crystal structure of WNT was available, the closest homolog retrieved was the ligand-binding face of the semaphorins revealed by the high resolution crystal structure of SEMA4D protein (pdb id: 1OLZ), having 39% sequence homology with WNT-2B.

Homology Modeling
The three-dimensional structure prediction of WNT-2B (PM0078113) was done by comparative homology modeling of target sequences using Modeller 9v8 [22]. Using WNT-2B as template, structures of WNT-1 (PM0078114), WNT-6 (PM0078115), WNT-10A (PM0078128) and WNT-10B (PM0078116) were predicted. The molecular model of these proteins was subjected to optimization. The overall geometry optimization was followed by energy minimization of retrieved structure using Chimera1.5.2 [23]. This took a total of 100 steps (step size 0.02 A) employing conjugate gradient method followed by protonation of wild type histidines using AMBER ff98 method. The generated models were ranked on the basis of the Modeler objective function, obtained after more than ,2000 average iterations of a template structure with simulated annealing algorithm using Modeller9V8. Later each generated model was processed for quality assurance. One of the quality assurance parameters includes DOPE plot. The discrete optimized potential energy (DOPE) profile plot of each target structure was generated through Modeller 9v8 with reference to the template, and the stability of generated target model was tested.

Structure Validation and Analysis
Predicted 3D models were verified by PROCHECK [24] and validation of stereo-chemical quality of a models was performed through WHATIF, ERRAT, PROCHECK, PDBSUM, Ramachandran Plot2.0 [25] and Verify3D. A Ramachandran plot was generated for each computationally predicted structure to testify the steric hindrances of protein residues, and the outlier errors were corrected accordingly using WinCoot [26]. Most of the structural mis-alignment errors in rotamers occurred due to side chain packaging and folding. These problematic rotamers were refined using Dunbrack rotamer library. The values for RMSD (Root mean square deviation), SDM (Structural deviation measure), B-factor and Q (quality)-factor were used to verify further the quality of the resulting structures of WNT proteins as well as receptor CRD using Chimera 1.5.2 (Table S1). The Qscore assessment was given by Krissinel E and Henrick K, 2004 [27] as, Where, N 1 and N 2 represent the number of residues of aligned structures. Q-finder and pocket-finder were used to determine the probable binding pockets of WNT protein. One of the important measures to calculate the geometry accuracy is RMSD, obtained after superimposition of the target to its native structure. RMSD depends on the number of equivalent atom pairs of both proteins (target and template) that are compared, which in turn depends upon the maximum allowed distance between atom pairs.

Molecular Docking
In order to predict the exact amino acid residues involved in CRD dimmer interaction, docking studies were performed using AUTODOCK 4.2. Polar hydrogen atoms were added to both ligand and receptor molecules. A population size of 150 with 10 million energy evaluations was used. The number of total docking runs was set to 100. The grid size was to cover the whole receptor to find a potential binding pocket for the WNT-FZD complex. A  19 WNT proteins are marked with yellow. These cysteine residues help in the formation of disulphide bridges and improve folding process. The palmitoylation site is also shown, displaying the conserved sequence for post-translational modification of WNTs. All WNTs have a common conserved family signature, as indicated. The signal peptide sequences were obtained from SignalP web server. TMHMM web server was used to determine helices inside, outside and between membranes. Motif enrichment analysis was performed using PRINTS, PRODOM, Blocks, Pfam and InterPro. The MSA was generated using ClustalX 2.0. doi:10.1371/journal.pone.0054630.g001 hybrid Lamarckian genetic algorithm was used for flexible docking. Detailed docking and grid parameters are shown in Table 1. The other parameters were set to default values. The docking results were visualized and analyzed by using Chime-ra1.5.2 and VMD [28] tools. Protein-protein interactions were further verified using another web server called Patch dock [29]. Patchdock is a molecular docking server based on the principle of shape complementarity and its related server FireDock [30] and SymmDock were used for fine refinement of docked complex. Based on the atomic energy scores, the predicted protein binding was confirmed, refined and then used for results and analysis.

Homology Modeling
Among 19 human WNT proteins, WNT-2B was taken initially to be modeled because of its greatest query coverage with the template (pdb ID: 1OLZ). Later, this structure was used as a template for other proteins. Each generated model was selected on the basis of modeler objective function. The lower the value of this objective function, the more preferred the model would be. Among 10 generated models of each protein, only one model (with least modeler objective function) was selected for further analysis.

DOPE Plot
DOPE profiles of WNT predicted protein models were plotted against their residue numbers in Fig. 2A and Fig. 2B. Energy profiles of individual residues help in understanding the critical assessment areas, for example loop regions. Newly generated structures of proteins have shown low energy values as compared to their template residue energies.

Q-Score
This scoring function tends to achieve a higher objective value as it is based solely on sequence similarity of target and template. Values of Q-score range from zero to one, where 1 represents the identical structures and 0 represent the dissimilar structures. The predicted models of CRD, WNT-1, WNT-6, WNT-10A and WNT-10B showed high Q-score.

Structural Deviation Measure
The SDM of each protein should be lower when compared with its template protein. This confirms the good quality of the predicted structure. Some of the disturbing noise values in SDM were observed because of the sequence diversity between target and template proteins. However, the important functional domains and motifs showed 100% conservation. The quality parameters of modeled structures are defined in Table S1. The predicted 3D structure of the CRD complex after geometry optimization was also evaluated by RamachanranPlot2.0 which confirmed that all of the amino acid residues lay within the allowed region (except Ala 18 of beta chain as shown in Figure S1).

Root Mean Square Deviation
The overall RMSD values for the predicted models of WNT backbone range from 0.3 to 0.6 Å . Evolutionarily highly conserved backbone structure were observed in all models with few fluctuating residues and rotamers. In Fig. 3, residues between 57 and 80 showed the conserved region, whereas small divergence was observed in case of Val-81-Gly-88 and Val-60-Ser-68. Comparing the sequences of CRD across various species, it was inferred that the domain's sequence is ,99% conserved among vertebrates (human, macaque, mouse, rat and zebra-fish) (Fig. 4A). The secondary structure of CRD predicted by PSIPRED [31] showed extensive disulphide bridges, formed due to the presence of a high number of conserved cysteine residues. Along with this, many residues of CRD were predicted for ligand contacts which were further tested for interaction with WNT protein ligands during protein-protein interaction studies. However, the important predicted helices, strands and coils were also found in the threedimensional modeled structure, using mouse CRD (1IJY) as a template (Fig. 4B & 4C). Possible topologies of secondary structures ( Figure S2, S3) in the folded protein were predicted by pdbsum [32]. The overall RMSD value calculated for human CRD-1 against mouse CRD-8 was 0.62 Å . Dann et al., 2001 [33] reported that the dimeric feature of CRD was important for WNT binding and may initiate the signaling pathway. Amino acid residues involved in dimerization of CRD can give us a strong structural insight about its mechanism of action and also enable us to trace important target residues for designing drug inhibitors against the WNT pathway. The minimized free energy CRD dimmer complex generated from protein-protein interaction was subjected to detailed analysis to study the electrostatics and van der waals interaction between residues of complex. The amino acid residues ranging from Pro-25 to Glu-39 showed strong hydrophobic interactions. The peptide fragment of both CRD (A & B), consisting of Cys-67, Thr-68, Val-69, Leu-70, Glu-71, Gln-72, Ala-73, Pro-75, were found to have strong hydrogen bonding, hydrophobic interactions and disulphide bridge formation (Fig. 5). In total, three strong hydrogen bonding interactions were observed in dimmer formation. One of the hydrogens of the amide group of Gln-72 was found to bind with the oxygen of the carboxylic group of His-3, and the other interacted with the carboxylic group of Tyr-5. The third hydrogen bonded interaction was observed between Thr-102 (A) and Thr-102 (B). Both polar residues were found to have a strong bonded interaction with each other. Further electrostatic interactions were found within the loops of helices; H2, H5 and H4. The interaction was with residues Pro-100, Glu-106 and Lys-107 of H4 and Phe-106, Asp-101 and Trp-99 of H5 suppressed by H2 as shown in Fig. 5 in detail.

Binding interaction of WNT and CRD
More than 30 amino acid residues were found to be engaged in binding of the WNT ligand with the CRD domain. The binding cleft forms a U-shaped contour into which the receptor fits and initiates the signaling cascade. The hydrophobic residues present at both arms of U-shaped pocket grasp the CRD domain by generating a strong hydrophobic moiety around it. Here we suggest two active binding sites of the WNT protein. Site 1 constitutes the right arm of the U-shaped pocket, consisting of Nterminal hydrophobic residues; the left C-terminal arm is site 2. A few residues at the base of the U-shaped groove also participate in binding interactions with Frizzled CRD. Measuring the atomic  distances revealed that the opening face of the U-shaped groove is wide enough (starting at ,28 Å then widening to ,31 Å ), to allow the receptor's CRD to fit in. The groove narrowed to ,17 Å in the middle; however the base of the groove was found to be wide (Fig. 6).

Discussion
The chromosomal location of WNT proteins revealed that WNT gene clusters originated through block duplication events. WNT-1 and WNT-6 exist in the form of a cluster at chromosome 12 and 2q35 respectively and both are the closest homolog of each other. Therefore, in order to predict the structure of WNT-6, the WNT-1 modeled structure was used as a template. The sequence similarity between the two proteins is ,42%; however, the model obtained for WNT-6 has low structural deviation and good quality factor values. The structures of the other homologous cluster of WNT-10A and WNT-10B were also generated computationally. The quality parameters of all modeled structures were verified including Q-score, Z-score, SDM, RMSD and DOPE plots.
The RMSD curve generally correlates with the percentage sequence identity, suggesting lower deviation values and strong structural homology between WNT proteins and their templates. However, some notable differences between the structures of template and target proteins were observed due to the insertions and deletions in loop regions during alignment. Most of the alignment errors occurred due to side-chain packaging, loop modeling and occasional alignment errors [34]. DOPE plots clearly supported the computationally modeled structures while measuring the residue by residue energy values. These energy plots proved the stability of modeled protein structures and can be processed for further analysis.
Molecular docking studies of CRD dimmer interaction and the WNT-CRD complex enabled us to locate the critical residues for targeted drug discovery. One of the approaches to inhibit the WNT pathway could be to target the amino acid residues involved in accurate processing of WNT ligands (mostly cysteines, Nglycosylated asparagines and O-glycosylated Serine and Threonine). After careful examination of all types of interactions (CRD-A with CRD-B and CRD with WNT), an interface summary of three proteins was drawn ( Figure S4).
Two chains of CRD dimmer interacted with one another with the help of three alpha helices, labeled as H4 (B), H5 (A) and H2 (A) and loop region in between. Most of the amino acid residues present at the N-terminal end are involved in dimmer hydrophobic interactions. This dimmer then further fits itself into a Ushaped large groove serving as the WNT active binding pocket.
Here we suggest Gln-72 as a very strong candidate to be targeted among dimmer interacting residues. Due to its strong and dual hydrogen binding behavior, it could serve as a very active target for pharmacophore drug modeling. Along with this, we also suggest a peptide fragment of Trp-99, Pro-100, Asp-101, Thr-102, Glu-106, Lys-107, Phe-108, Pro-109 and Leu-116, which forms a strong contour and hydrophobic pocket involved in dimmer interactions. Either by designing an inhibitor against a single residue or a whole peptide, we could stop the CRD dimmer Similarly, a U-shaped binding pocket was found to be conserved in all modeled WNT proteins. This conserved site is proposed to be one of the most important sites in WNT protein structure and is involved in the binding interaction with CRD. Our docking analysis suggested that this conserved binding groove could be targeted in order to design the inhibitors of the WNT signaling pathway. Interface summary of the WNT-CRD protein complex and binding interactions of WNT ligands with CRD allowed us to confirm the binding residues involved in both protein interactions. Most of the residues involved in N-glycosylation and cysteine rich region of WNT-1 (Table S2) were found to be actively participating in the binding interactions between CRD dimmer and WNT protein.

Conclusion
The present structural study proposes a U-shaped binding contour in WNT protein structure for receiving the receptor's cysteine rich domain. This is strongly corroborated by Janda et al., 2012 [18] and Beinz, 2012 [35], who described the U-shaped groove as index finger and thumb grasping the CRD.
Along with this, our study proposes the evolutionarily conserved superimposed amino acid residues which could possibly be utilized in the targeted therapy of the WNT signaling pathway. The computationally predicted structures of the WNT protein give us a strong structural insight into initiation of WNT signaling cascade. Our study also addresses the problem of targeted therapy by locating the critical amino acid residues. These residues could help us to establish a pharmacophore model of novel drugs against the WNT pathway. Furthermore, our study supports the possible peptides suggested by Voronkov, 2008 [16] and predicts new interacting peptides in human WNT-1 and FZD-1 complex which could further be exploited for drug discovery. The residues involved in hydrogen bonding are shown in blue; residues connected to each other with disulphide bridge formation are shown in yellow, whereas the electro statically interacting residues are shown in orange red. This dimmer surface has the greatest interactions and has binding potential to bind with WNT ligands and hence could be targeted for inhibition. H-2 and H-4 helices and a loop region between these helices are involved in binding interactions. doi:10.1371/journal.pone.0054630.g005 Figure S1 Ramachandran Plot of CRD dimmer protein computationally modeled by homology modeling approach. All residues of the model lie within fully allowed, additionally allowed and generously allowed region except Ala 18-A. The Phi and Psi positions of the disallowed residue have been mentioned in the figure.

Supporting Information
(TIF) Figure S2 Secondary structure topology human CRD protein (chain A). Topology diagram depicts the residues involved in the formation of secondary structures elements (alpha helices and beta-sheets). These structures were created by PdbSum database. (TIF) Figure S3 Topology of human WNT protein. Topology diagram depicts the residues involved in the formation of secondary structures elements (alpha helices and beta-sheets). These structures were created by PdbSum database. (TIF) Figure S4 Interface summary of CRD dimmer and WNT interactions. The trimmer formed by the CRD dimmer protein and the WNT ligand contains three different types of interaction (hydrogen bonding: blue; electrostatic van der waal: orange; disulphide bridges: orange-red). The chains of CRD are labeled as A and B; WNT is labeled as C. The interface summary was created by PdbSum. (TIF) Table S1 Quality parameters of homology modeled three dimensional structures of proteins. The sequence similarity was calculated by ClustalX. Rotamer analysis, Ramachandran outliers and quality factor was measured by Molprobity server and NIH web server. For Q-score, SDM and RMSD values Chimera 1.5.3 was used. (DOC)

Table S2
Functionally annotated sites of predicted models of WNT-1, WNT-6, WNT-10A and WNT-10B. N-glycosylation site is the most important site for post-translational modification of WNTs. Cysteine rich region at N-terminal end of the WNTs is important for cell signaling and lipid modification of WNTs. (DOC) Figure 6. WNT U-shaped binding pocket with two hydrophobic arms grasping Frizzled CRD. N-terminal binding Site 1 contains most hydrophobic amino acid residues including Leu-3, Ala-5, and Pro-8. Binding site 2 is present at C-terminal end. This binding site was found to be most conserved in all WNT proteins, ranging between amino acid residues 300-325. The opening of the U-groove is 28 Å wide. The groove then widens to 31 Å , before narrowing to 17 Å at its centre and finally widening again to 27 Å at the base. doi:10.1371/journal.pone.0054630.g006 Human WNT-CRD (FZD) Complex PLOS ONE | www.plosone.org