A Computational Approach to Evaluate the Androgenic Affinity of Iprodione, Procymidone, Vinclozolin and Their Metabolites

Our research is aimed at devising and assessing a computational approach to evaluate the affinity of endocrine active substances (EASs) and their metabolites towards the ligand binding domain (LBD) of the androgen receptor (AR) in three distantly related species: human, rat, and zebrafish. We computed the affinity for all the selected molecules following a computational approach based on molecular modelling and docking. Three different classes of molecules with well-known endocrine activity (iprodione, procymidone, vinclozolin, and a selection of their metabolites) were evaluated. Our approach was demonstrated useful as the first step of chemical safety evaluation since ligand-target interaction is a necessary condition for exerting any biological effect. Moreover, a different sensitivity concerning AR LBD was computed for the tested species (rat being the least sensitive of the three). This evidence suggests that, in order not to over−/under-estimate the risks connected with the use of a chemical entity, further in vitro and/or in vivo tests should be carried out only after an accurate evaluation of the most suitable cellular system or animal species. The introduction of in silico approaches to evaluate hazard can accelerate discovery and innovation with a lower economic effort than with a fully wet strategy.


Introduction
During the last years, following some evidence suggesting that exposure to environmental chemicals can lead to disruption of endocrine function in a number of wildlife species (molluscs, crustacean, fish, and birds), concern has been expressed also for human health. Even if many EU regulations contain specific provisions on chemicals that can affect the endocrine system, (e.g. REACH [1], Plant Protection Products Regulation [2], Biocides Regulation [3], Regulation on cosmetics [4], Water Framework Directive [5]), warning was raised by Bars et al. [6], who stated that recent European legislation has created a hazard-based approval criterion, which supports marketing and use of chemicals only on the basis that they do not induce endocrine activation in humans or wildlife species.
The in silico approaches have become relevant to these legislations as far as they can help reduce the number of animals used (by pre-screening and prioritising chemicals for more intensive testing). Moreover, they are in line with the vision of the 21 st century toxicity paradigm: chemicals will be subjected to a multiplicity of high-throughput screening tests to detect cellular response to an array of ''pathways of toxicity'', and results will feed into computational systems biology tools that model dose-response effects and inform new risk assessment approaches [7]. Several computational approaches may be useful for evaluating interactions between a receptor and its putative ligands: some of them are based on molecular docking, which has been reliably used for decades in pharmacological research and development [8,9].
The present research is aimed at devising and assessing a computational approach to evaluate the affinity towards the androgen receptor (AR) of hormonally active substances and of their metabolites. To build the model, the ligand binding domain (LBD) structures of ARs of three distantly related species (human, rat, and zebrafish) were used. The use of three reference species was also meant to evaluate whether their sensitivities to the test chemicals do differ. Three fungicides (vinclozolin, iprodione and procymidone) and their rat metabolites [10][11][12][13], all with a wellestablished androgenic activity [14][15][16][17][18], were tested. The proposed model can anticipate, very early in the hazard identification procedure, the ability of a chemical to bind the AR LBD. This information is very useful to set up a priority list during the screening of a large chemical database, and may be exploited also to design new chemicals for use in different fields [19]. To date, considering the available tests, evidence coming just from in silico assays cannot be considered sufficient for the identification of an endocrine-active substance, and the assessment of its possible relevance to humans. However, the information about qualitative and quantitative hormonal response to a chemical might be useful to identify its 'potential' for interaction with the endocrine system, and therefore to better design and carry out further testing steps.

Comparative modeling
The human and rat AR LBD crystal structures were downloaded from the RCSB Protein Data Bank [PDB entry: 3L3X (chain A), and PDB entry: 1I37, respectively]. The crystallographic structures of the human and rat receptors were then submitted to a preparation step, based on energy minimization (EM) with the Amber12:EHT force field [20] and the reaction field solvation model. Refinement was carried out down to a Root Mean Square (RMS) gradient of 0.05 kcal/mol/Å 2 . All the computational procedures were carried out with the Molecular Operating Environment (MOE).
The zebrafish AR sequence was downloaded from the UniProt Protein Knowledgbase database [entry: B9P3Q7]. 1T7R, corresponding to the chimpanzee AR LBD, was set as template in order to compute a 3D structural model [21]. The alignment produced by the MOE Align program with default parameters was manually adjusted. Comparative model building was carried out with the MOE Homology Model program. Ten independent models were built and refined, then the highest scoring intermediate modelaccording to the electrostatic solvation energy calculated using a Generalized Born/Volume Integral (GB/VI) methodology [22] was submitted to a further round of EM. Both for the intermediate and the final structures, the refinement procedures consisted in EM runs based on the Amber12:EHT force field with the reaction field solvation model. The quality of the final model was carefully checked with the MOE Protein Geometry module, in order to make sure that the Ramachandran plot, the side chain packing, and the stereochemical quality of the generated structure were acceptable.

Binding site analysis
The binding site of each receptor was identified through the MOE Site Finder program, which uses a geometric approach to calculate putative binding sites in a protein, starting from its tridimensional structure. This method is not based on energy models, but only on alpha spheres, which are a generalization of convex hulls [23]. The prediction of the binding sites, performed by the MOE Site Finder module, confirmed the binding sites defined by the co-crystallized ligands in the holo-forms of the investigated proteins.

Molecular database preparation
The database was prepared by building with the MOE Builder the molecular structures of the three fungicides and of their major rat metabolites, as well as the molecular structures of the endogenous hormones in each species (human, rat and zebrafish). Each structure was converted into a tridimensional structure, and energy was minimized, with the MOE Energy Minimize program and the Amber12:EHT force field, down to a RMS gradient of 0.05 kcal/mol/Å 2 . Since some of these molecules contain stereogenic centres (see Figure 1, atoms marked by an asterisk), all the possible enantiomers/diastereomers were built and added to the database. Moreover, 20,000 conformations were generated for each ligand by sampling all their rotatable bonds.

Molecular docking
The in silico screening was carried out with the MOE Dock program, part of the MOE Simulation module. The whole procedure was carried out for each of the three AR LBD -human, rat and zebrafish. The AR LBD was set as 'Receptor'. The selected placement methodology was 'Triangle Matcher', which is the best method for standard and well-defined binding sites. With Triangle Matcher the poses are generated by superposing triplets of ligand atoms and triplets of receptor site points. The receptor site points are alpha spheres centres that represent locations of tight packing. Thirty complexes were generated for each tested ligand. Duplicate complexes were then removed: poses are considered as duplicates if the same set of ligand-receptor atom pairs are involved in hydrogen bond interactions and the same set of ligand atom receptor residue pairs are involved in hydrophobic interactions. The accepted poses were scored according to the London dG scoring function, which estimates the binding free energy of the ligand from a given pose.
where A and B are the protein and/or ligand volumes with atom i belonging to volume B; R i is the solvation radius of atom i (taken as the OPLS-AA van der Waals sigma parameter plus 0.5 Å ); and c i is the desolvation coefficient of atom i. The coefficients (c, c HB , c M , c i ) have been fitted from approx. 400 x-ray crystal structures of protein-ligand complexes with available experimental pK i data.
Atoms are categorized into about a dozen types for the assignment of the c i coefficients. The triple integrals are approximated using Generalized Born integral formulas. All the saved solutions were submitted to a further refinement step, based on molecular mechanics (MM). In order to speed up the calculation, residues over 6 Å cutoff distance away from the pre-refined pose were ignored, both during the refinement and in the final energy evaluation. All receptor atoms were held fixed during the refinement. During the course of the refinement, solvation effects were calculated using the reaction field functional form for the electrostatic energy term. The final energy, docking score, was evaluated using the GBVI/WSA dG scoring function with the Generalized Born solvation model (GBVI) [24]. The GBVI/WSA dG is a forcefield-based scoring function, which estimates the free energy of binding of the ligand from a given pose. It has been trained using the MMFF94x and AMBER99 forcefields on the 99 protein-ligand complexes of the Solvated Interaction Energy (SIE) training set [25]. The functional form is a sum of terms: where c represents the average gain/loss of rotational and translational entropy. a and b are constants, which were determined during training (along with c) and are forcefielddependent. E coul is the coulombic electrostatic term, which is calculated using currently loaded charges, using a constant dielectric of 1. E sol is the solvation electrostatic term, which is calculated using the GB/VI solvation model. E vdw is the van der Waals contribution to binding. SA weighted is the surface area weighted by exposure. This weighting scheme penalizes exposed surface area. All the ligands of the molecular database were tested according to the above procedure. The Amber12:EHT force field was used for all the computational procedures. Docking accuracy was evaluated using the present procedure for reproducing 10 ligand-receptor crystallographic complexes. Ligand RMSD values between crystallographic vs computational complexes were measured.
As a negative dataset, we randomly selected 1,000 compounds from the Asinex Platinum Database (http://www.asinex.com) and docked them on the human AR LBD, using the above procedure.

Low-mode molecular dynamics simulations
For studying the flexibility of AR LBD helix 12, we applied the low-mode molecular dynamics approach, aimed at focusing a MD trajectory along the low-mode vibrations and featuring a very efficient way vs classical MD for searching for minima troughs on the potential energy surface. To run these computations, we used the MOE Conformational Search program of the Conformations module. This program uses an efficient implicit method for estimating the low-frequency modes and is based on the attenuation of high-range velocities as described in detail in [26].
The human AR LBD bound to: i) dihydrotestosterone (DHT), an agonist, ii) cyproterone acetate, an antagonist, and iii) in its apo form was simulated after preparation. The complex with DHT was obtained from RCSB PDB (3L3X); the apo form was obtained by in silico removing DHT, and the complex with cyproterone acetate was obtained through molecular docking on the same crystal structure.
Both helix 12 (set as a rigid body) and the loop joining helix 12 to the preceding helix were left free to move during the low-mode molecular dynamics, whereas the residues more than 4.5 Å away were fixed (not free to move, but used for the energy calculations); the other residues were defined as inert (fixed and not used for energy calculations). The simulation was carried out with default parameters, except for strain energy cutoff, which was set at 100 kcal/mol. One thousand conformations were generated and analysed.
The same computational approach was used also to produce ensembles of natural ligand-receptor complexes, in order to estimate with greater accuracy their binding free energies. To this purpose, we started from the top scoring poses obtained from the molecular docking procedure. The ligand and the residues within 4.5 Å from it were left free to move during the low-mode molecular dynamics, whereas the residues more than 4.5 Å away were fixed (see above); the other residues were defined as inert (see above). The simulation was carried out with default parameters, except for strain energy cutoff, which was set at 50 kcal/mol. Four hundred conformations were generated and the one with the lowest energy was used to compute the complex dissociation constant value. The Amber12:EHT force field was used for all the computational procedures.

Dissociation constant calculation
The estimated binding affinity of the top-scoring solution for each complex (receptor-ligand) was not directly computed from the GBVI/ WSA dG value, but the complexes were further refined through the use of a set of specific MOE procedures, named LigX, aimed at the minimization of ligands in the receptor binding site. The dissociation constant (K i ) was computed through the binding free energy estimated with the GBVI/WSA dG scoring function, after complex optimization with LigX, according to the following equation: where R represents the gas constant and T the temperature. The K i was computed starting from the binding free energy values at a fixed temperature (300 K).

Comparative modeling
The homology model of the zebrafish AR LBD was built using as template 1T7R, the crystal structure of chimpanzee AR LBD (66% sequence identity). Figure S1 in File S1 shows the alignment used for carrying out the modelling procedure. Ten independent models were built and refined, and the one top scoring according to the electrostatic solvation energy was selected. The presence of a well-defined binding site, shown in Figure 2, was probed through the MOE Site Finder program. The same approach was applied to human and rat AR LBD crystals. Table 1 reports the binding site scores for the three receptor structures and lists the residues lining each of them. Figure 3 shows the global alignment of the investigated AR LBD; the residues in the binding sites are highlighted. Finally, after a structural superposition, the global and the binding site RMSD values were computed both for a-carbons and for whole residues of the three AR LBD; data are summarized in Table 2.

Validation of the docking protocol
The accuracy of the docking protocol detailed under Methods was extensively validated by reproducing the ligand-receptor complexes for 10 different AR LBDs deposited in the RCSB PDB. Table 3 reports the selected structures and the RMSD values between the co-crystallized and the docked ligands: the latter range between 0.13 and 0.35 Å .
In order to assess the correlation between docking scores and biological data, we docked to the rat AR LBD seven ligands from a published dataset, whose relative affinities for the rat AR binding site had been experimentally determined [27]. The computed relative affinities (dissociation constants, K i ) showed the same ranking as the experimental ones; Table 4 reports experimental (literature data) vs in silico (our computations) data.
Furthermore, from an Asinex combinatorial chemistry dataset, we randomly selected 1,000 compounds, which have never been predicted/demonstrated to bind AR LBDs, and evaluated their docking score on the human AR. All of their docking scores were positive (.0, as plotted in Figure S2 in File S1), which substantiates the ability of the proposed procedure to correctly identify negative (non-interacting) compounds.
Results of docking the test compounds to zebrafish, rat and human AR LBD Figure 1 reports the structures of the three parent test fungicides and of their selected metabolites. The molecular database also contains the main endogenous androgenic hormones for all the three species, namely testosterone and DHT for human and rat [28], testosterone and 11-ketotestosterone for zebrafish [29]. Table 5 reports the dissociation constants (K i ) computed complexes through equation (4) (see under Methods) for all AR LBD. From these data, endogenous hormones show the highest affinity for their AR in rat; instead, both in zebrafish and humans, the affinity of endogenous hormones for their AR is lower than with some xenobiotics. Table 6 Table 6 were used to build the box plot reported in Figure 4.
In rat, iprodione and its metabolites have the lowest affinity In zebrafish, procymidone metabolites vary extensively in energy: (R,R)-procymidone_3 (O) has the best affinity (2 11.06 kcal/mol), but procymidone_4 (P) has the third worst energy (28.52 kcal/mol), and the parent compound has a low affinity (28.49 kcal/mol) for AR.
Only for exemplification purposes, Figure 5 reports the top scoring poses for iprodione complexed with human (A), rat (B), and zebrafish (C) AR, respectively; iprodione orientation appears similar in the three complexes, with a RMSD value of 0.8 Å between human and zebrafish, and 1.9 Å between rat and zebrafish.

Low-mode molecular dynamics simulations
Differences between experimental and computational values of K i are consistently expected for nuclear steroid receptors and are specifically connected with the dynamic changes in their structure. After the initial molecular recognition step, nuclear steroid receptors deeply rearrange [32][33][34], partially blocking the ligand into their binding site through a displacement of helix 12; as a result, apparent K i measured through experimental approaches  are very low. Our in silico approach does not take into account the receptor rearrangement and this results in higher values for computed K i s. Low-mode molecular dynamics simulations of AR LBDs were then run, under different computational setups, with a twofold aim: i) sampling the conformational space of helix 12 in the human AR LBD, when bound to an agonist and to an antagonist, and in its apo form); ii) estimating binding affinities of the natural hormones for the three AR LBDs at a higher accuracy level than with molecular docking.
During the molecular dynamics simulations, helix 12 keeps a closed conformation when the human AR binds an agonist (DHT), whereas it opens when the LBD is empty or bound to an antagonist (cyproterone acetate). Starting from a common closed conformation of helix 12 in all the three setups, only the apo and the antagonist-bound structures rapidly evolve towards helix 12 opening. Figure 6 shows the three closed starting conformations (helix 12), superposed to the most energetically favoured open conformation for the apo (Figure 6, panel B) and the antagonistbound LBD ( Figure 6, panel C). On the contrary, helix 12 does not open (1,000 generated and analysed conformations) when LBD is bound to an AR agonist, such as DHT ( Figure 6, panel A). In spite of the ability of molecular dynamics to correctly sample the reported helix 12 conformational transition, the differences between experimental and computational dissociation constant values for natural agonists cannot yet be compensated for. Indeed, as shown in Table 5, the binding affinities of the natural hormones for the three investigated LBDs, computed applying LigX to the lowest energy complex out of 400 obtained from the low-mode molecular dynamics simulations, are very close to the affinities obtained from our rapid docking procedure. The experimental issue is connected with the definition of K i as the ratio k off /k on , where k off is the dissociation rate constant in min 21 and k on is the association rate constant in M 21 min 21 . The closed conformation of helix 12 induced by the binding of an agonist produces a decrease in k off values, thus reducing the apparent dissociation constant. The discrepancy between experimental and computational K i for the tested natural agonists is strictly connected with this phenomenon. The analysis of the interactions between helix 12 and DHT, carried out on a crystallographic complex (RCSB PDB code: 3L3X), shows only one weak and non-specific interaction between the ligand and the side chain of Met 895, in helix 12 (see Figure S3 in File S1), corroborating a kinetic more than a thermodynamic effect as the reason for the discrepancy between computed and experimental K i s.

Discussion
Traditional in vitro and in vivo toxicity-testing strategies, which are expensive and time consuming and involve a large number of test animals, have been evolving over the last few decades, in order to address increasing concerns about a wider variety of toxic responses, such as subtle neurotoxic effects, adverse immunologic changes, and endocrine activity. Moreover, toxicity testing is under increasing pressure, and the most utilized approach, which  relies primarily on in vivo mammalian toxicity testing, is unable to adequately meet the competing demands [35].
Although the current knowledge may not yet allow to fully eliminating the need for in vivo testing, our proposed computational approach, associated with suitable in vitro assays, can provide an effective tool to identify, at a very early stage, the potency of any EAS, through the measurement of its affinity (the binding free energy/complex K i ) for the AR. This kind of information can be useful at the very beginning of the pipeline of hazard identification for compounds with putative EAS activity. The direct interaction between the putative EAS and the AR is a prerequisite to biological activity and should be carefully kept into account, as done in our model. Interaction depends principally upon affinity, which is a term referring to the strength of interaction between two molecules. The affinity of a ligand for a specific receptor determines its residence time of association, a parameter often quantified by the dissociation constant [36]. Generally, the higher is the affinity the longer is the residence time. Low affinity ligands do not need any evaluation of intrinsic activity (a, which is the relative ability of a drug-receptor complex to produce a maximum functional response), since they do not spend enough time in the receptor binding site to exert any effect either in in vitro or in in vivo tests.
Although interaction between ligand and receptor is essential in order to cause any effect, the intrinsic activity is the key to the ability of a molecule to induce a specific response. For the molecules with in silico high affinity for ARs, in vitro selected tests should first confirm the computed affinity. Next, only for the molecules that show high in vitro affinity, the intrinsic activity should be evaluated in order to discriminate between agonist (a = 1), partial agonist (1, a ,0), or antagonist (a = 0).
Our in silico approach can thus compute the affinity of the simulated complexes, whereas reliable values for intrinsic activity (a) can be obtained only by in vitro and/or in vivo tests [37,38].
Nowadays, several in vitro tests aim at evaluating the affinity of chemicals for ARs, such as the AR Binding Assay described in OCSPP Guideline 890.1150 and proposed by the EPA [39] as part of the Tier 1 of the Endocrine Disruptor Screening Program (EDSP). This test consists of a radioligand binding assay that identifies compounds able to compete for AR binding in vitro, and is not meant to measure the molecular intrinsic activity (a) and to classifying them as (partial) agonists or antagonists.
Tests aimed at the evaluation of the intrinsic activity (a) of putative EASs are mainly based on two end points: the measurement of cell proliferation or the use of an androgenresponsive reporter gene. The A-SCREEN assay [40] measures the proliferation of sensitive cells to screen for androgen activity. The MDA-kb-2 cell line has been developed by scientists at the EPA [41] through a stable transfection of AR and the insertion of an MMTV-driven luciferase reporter gene into the human mammary cancer MDA-MB-453 cell line. Both androgen and glucocorticoid agonists can activate the MMTV luciferase gene, and antagonists can be tested with respect to a fixed reference concentration of the agonist.  Table 4. Experimental (from [27]) and in silico (computed) dissociation constants for the selected compounds with respect to rat AR binding site.
Letter (Fig. 1 We carried out a validation of the molecular modelling and docking procedures we had devised by comparing experimental and computed dissociation constants (K i ) for the human, rat and zebrafish endogenous hormones with respect to their ARs. A high in silico affinity between the endogenous ligands and their specific  ARs is a conditio sine qua non for applying our selected approach to the investigated problem. Experimental data available from scientific reports are associated with very high standard error of the mean. Computed binding free energies are evaluated from the analysis of specific ligand-receptor non-covalent interactions; their occurrence is associated with a score that can be interpolated on an experimental curve [42]. We have already reported that the use of empirical scoring functions for estimating dissociation constant values has accuracy in the range of one order of magnitude [43]. Furthermore, the specific ligand-induced activation mechanism of nuclear steroid receptors is based on a conformational transition, leading to the rearrangement of helix 12 [32,33], which seemingly traps the ligand inside the binding site. Accordingly, the experimentally measured K i have very low apparent values (corresponding to most favourable affinities). Conversely, the K i values obtained through our in silico approach do not take into account such a displacement. This structural rearrangement is peculiar of nuclear steroid receptors, and should not be confused with the ligand-induced-fit process, which characterizes all the ligand-receptor interaction events. No induced-fit protocols [44] have been implemented in this investigation, since the ligand binding sites of the selected receptors were already well defined in the available crystallographic structures.
In this paper, our model was used to study the affinity of three fungicides (vinclozolin, iprodione and procymidone), as well as of their metabolites, with respect to the AR LBD of three different species (human, rat and zebrafish). These molecules were selected because of their classification in the same chemical group (dicarboximides) as well as of their classification as EASs. The mechanism behind the endocrine effects of both vinclozolin [10] and procymidone [16] is well-documented. They compete with the endogenous hormones for the binding to the AR, but they  cannot activate it, because of their low intrinsic activity (a . 0), and thus exert antiandrogenic effects. The toxic mechanism of iprodione has not been fully clarified yet and this compound is classified sometimes as antiandrogenic [16] sometimes as androgenic agent [14]. From our results, it is clear that all the tested fungicides and their metabolites can bind AR and compete Step 1: database production; step 2: in silico binding assay; step 3 in vitro binding assay for the selected dataset; step 4: in vitro activity assays only for the high affinity molecules (positive hits); and identification of agonist (a = 1), partial agonist (1,a,0) and antagonist (a = 0) activity. doi:10.1371/journal.pone.0104822.g007 with the endogenous hormones in all the tested species, exerting antiandrogenic effects. In detail, in human and zebrafish, the tested compounds and metabolites can bind AR LBD with affinities comparable to the endogenous hormones: this suggests that there is a strong competition to occupy the binding site. On the contrary, the rat AR shows a lower affinity for the tested compounds, and -as a single assay -it is thus not a suitable molecular model to assess the toxicity of EASs and of their derivatives. However, identifying a toxic molecule in rat is an important alert signal, because this compound is likely to have an even stronger impact in humans.

Conclusions
Our results shed a new light on the selection of the in vitro tests used for EAS hazard identification. Actually, rat seems to be less sensitive than human to the tested putative EASs. In vitro tests based on rat preparations could underestimate the sensitivity to these classes of molecules, differently from the human AR. On the other hand, zebrafish could be a more reliable model than rat, especially for environmental effects. For these reasons, the human AR LBD seems the most reliable target to be considered for estimating the EAS hazard in humans, whereas the zebrafish AR LBD should be considered, when environmental effects of EAS have to be investigated.
Our in silico approach emerges as a computational methodology for the evaluation of the AR affinity (prioritization of assessment) of a large number of molecules. During the design of new chemicals, the molecules that show highest affinities should in principle be disregarded and the main efforts should be focused on the molecules that show the lowest affinities. On the contrary, during a safety evaluation process, attention should be focused on the molecules that show the highest affinities.
While the in silico screening cannot be used as a stand-alone procedure, it can be successfully used as a first prioritizing step in a tier approach (Figure 7). The second mandatory check for the in silico positive hits should be an in vitro evaluation procedure, in which the affinity of the positive hits are measured through a reference cellular assay. From our results, the choice of the species to use in competitive binding assay should be carried out carefully, because it may lead to hazard over-under-estimation.

Supporting Information
File S1 The Supporting Information file contains: Figure S1. Global alignment between primary structures of zebrafish and chimpanzee AR LBDs. Figure S2. Docking Score plot for the 1,000 non-interacting randomly selected compounds.