Insight into the Interactions between Novel Isoquinolin-1,3-Dione Derivatives and Cyclin-Dependent Kinase 4 Combining QSAR and Molecular Docking

Several small-molecule CDK inhibitors have been identified, but none have been approved for clinical use in the past few years. A new series of 4-[(3-hydroxybenzylamino)-methylene]-4H-isoquinoline-1,3-diones were reported as highly potent and selective CDK4 inhibitors. In order to find more potent CDK4 inhibitors, the interactions between these novel isoquinoline-1,3-diones and cyclin-dependent kinase 4 was explored via in silico methodologies such as 3D-QSAR and docking on eighty-one compounds displaying potent selective activities against cyclin-dependent kinase 4. Internal and external cross-validation techniques were investigated as well as region focusing, bootstraping and leave-group-out. A training set of 66 compounds gave the satisfactory CoMFA model (q 2 = 0.695, r 2 = 0.947) and CoMSIA model (q 2 = 0.641, r 2 = 0.933). The remaining 15 compounds as a test set also gave good external predictive abilities with r 2 pred values of 0.875 and 0.769 for CoMFA and CoMSIA, respectively. The 3D-QSAR models generated here predicted that all five parameters are important for activity toward CDK4. Surflex-dock results, coincident with CoMFA/CoMSIA contour maps, gave the path for binding mode exploration between the inhibitors and CDK4 protein. Based on the QSAR and docking models, twenty new potent molecules have been designed and predicted better than the most active compound 12 in the literatures. The QSAR, docking and interactions analysis expand the structure-activity relationships of constrained isoquinoline-1,3-diones and contribute towards the development of more active CDK4 subtype-selective inhibitors.


Introduction
Cyclin-dependent kinases (CDKs), a family of serine/threonine protein kinases, play a central role in the growth, development, proliferation and death of eukaryotic cells [1,2]. There are more than 13 CDKs of which 12 different cyclin families have been identified up to now, and different CDK/cyclin combinations are active during each phase of the cell cycle [3][4][5]. Among these CDK/cyclin complexes, the D/CDK4 and E/CDK2 complexes have been greatly concerned [6]. In the G1-S phase transition, the retinoblastoma susceptibility gene family of proteins (Rb) was phosphorylated by the D-type cyclins (D1, D2 or D3) in combination with CDK4 and cyclin E/CDK2 complexes. Phosphorylation of the Rb activated the E2F transcription factors and resulted in the transcription of genes required for DNA synthesis. This kind of function exerted by D/CDK4 and E/ CDK2 complexes is positively regulated by the mitogenic signaling pathways and negatively regulated by the cyclin-dependent kinase inhibitors (CKIs) [7][8][9][10][11][12][13]. Inhibition of cyclin-dependent kinases (CDKs) with small molecules has been suggested as a strategy for treatment of cancer, based on deregulation of CDKs commonly found in many types of human tumors. Selective CDK inhibitors such as CYC-202 [14] and BMS-387032 [15] targeting CDK2, and PD0332991 [16] targeting CDK4 have been under clinical evaluations. Recently, a series of novel isoquinoline-1, 3-(2H, 4H)diones have been found to possess excellent selective inhibitory activity against the CDK4 [7,8].
The three-dimensional quantitative structure-activity relationship (3D-QSAR) models derived from the most widely used computational methods, CoMFA (comparative molecular field analysis) and CoMSIA (comparative molecular similarity indices analysis), could be used to guide rational synthesis of potent novel inhibitors and now aimed to elucidate the structural features required for CDK4 inhibitors. The best developed models have been duly validated by a systemic external validation, on the basis of which a set of twenty new potent molecules have been designed and predicted stronger activity than 12, the most active compound reported in the literatures [7,8].
Molecular docking techniques have been extensively used as an important tool in the discovery of new small-molecule drugs for targeting proteins. [17][18][19][20][21] Based on an idealized representation that a ligand makes every potential interaction with the binding sites, docking uses an incremental construction algorithm to place flexible ligands into a fully specified active site. Surflex-Dock is particularly successful at eliminating false positive results and therefore used to narrow down the screening pool significantly, while retaining a large number of active compounds. The binding interactions of the isoquinolinedione derivatives within the CDK4 active sites were discussed. The step-wise description of methodology used for 3D-QSAR analysis, molecular docking and designing of new CDK4 inhibitors is as shown in Figure 1.

Data Set
All isoquinoline-1,3-(2H,4H)-dione derivatives and their biological activities were collected from literatures [7,8] (Figure 2). A total set of 81 molecules were randomly segregated into training and test sets comprising 66 and 15 molecules, respectively, based on the following rules: (i) Diversity of the molecules was very necessary to assess the statistical significance. (ii) To avoid any redundancy or bias in terms of structure features and activity range, the information of the selected compounds must be clear and concise. (iii) The most active and least active compounds should be included in the training set. The activities of the CDK4 inhibitors were reported in IC50 and converted to pIC50 by taking Log (1/IC50) for the convenience. The activity range from 4.6 to 8.6 log units of these compounds provided a broad and homogenous data set for 3D-QSAR study. In general, the spread of activity should cover at least 3 log units for a reliable 3D-QSAR model [22].

Molecular Modeling and Alignment
The structures of the derivatives were sketched in SYBYL 8.1 (Tripos, Inc., St. Louis, MO, USA) molecular modeling package and Gasteiger-Hückel charges were assigned to the atoms of all the compounds. A good alignment is the most essential for the quality and the predictive ability of CoMFA and CoMSIA models [23], and common substructure, pharmacophore or docking overlaps can be available to align molecules [24,25]. The isoquinoline-1, 3-(2H, 4H)-dione ring with structural rigidity was selected as the common substructure and the compound 12 with the strongest inhibitory activity as the template molecule ( Figure 3). It can be seen that all the compounds studied have similar active conformations.

CoMFA and CoMSIA Setup
Three-dimensional grid spacing was set at 2 Å in the x, y, and z directions. The steric and electrostatic field energies were calculated using the Lennard-Jones and the Coulomb potentials [26]. For CoMFA method, a sp 3 hybridized carbon atom with a+1 charge was identified as the probe atom to determine the magnitude of the steric and electrostatic field values, whose truncation was set at 30 kcal/mol [27][28][29].
The CoMSIA method, similar to CoMFA in terms of fields around the molecule, was based on the assumption that changes of ligands in binding affinities are associated with changes of molecular properties. Besides steric and electrostatic fields, three other different fields of hydrophobic, hydrogen bond donor and hydrogen bond acceptor are also calculated [30]. Moreover, a Gaussian function introduced in similarity indices makes it be calculated at all grid points, inside and outside different molecular surfaces. Equation used to calculate the similarity indices is as follows: Where, A is the similarity index at grid point q, summed over all atoms i of the molecule j under investigation. W probe, k is the probe atom with radius 1 Å , charge +1, hydrophobicity +1, hydrogen bond donating +1 and hydrogen bond accepting +1. W ik is the actual value of the physicochemical property k of atom i. r iq is the mutual distance between the probe atom at grid point q and atom i of the test molecule. a is the attenuation factor whose optimal value is normally between 0.2 and 0.4, with a default value of 0.3 [31].

Partial Least Squares (PLS) Analysis
For Partial Least Squares (PLS) analysis [32], the ''leave-oneout'' cross-validation method was first carried out to generate a cross-validated r 2 (q 2 ) value and the optimal number of components (ONC), based on the lowest standard error of prediction (SEP) which usually corresponds to the highest cross-validated squared coefficient (q 2 ). To avoid over-fitting the models, a higher component was accepted only when the q 2 differences between two components was larger than 10% [24]. Region focusing was performed to maximize q 2 value by rotating the extracted principal components [33]. The q 2 is a good indicator of the accuracy of actual predictions and a q 2 value of 0.5 means halfway between no model and a perfect model [34]. Non-cross-validation was then executed to establish the final 3D-QSAR model after the optimal External Validation q 2 is often a useful but not sufficient criterion for model validation. In many cases, a model with high r 2 cv and r 2 values were proved to be inaccurate. Even though a model may exhibit a good predictive ability based on the statistics for the test set, it is not always sure that the model will perform well on a new set of data [35]. Therefore, an external test sets (r 2 pred ) [36] was recommended for the estimation of predictive ability. Predictive values r 2 pred were calculated as follows: Therein, SD is the sum of squared differences between the measured activities of the test set and the average measured activity of the training set.
Several other statistics such as r 2 m , r 2 0 , R and k were calculated using the following equations, and 3D-QSAR models were considered acceptable only if they satisfy the following conditions: 15 and r 2 m .0.5 [36,37].
Where, y i and y y p are the actual and predicted activities, respectively. y y o and y y p are the average values of the observed and predicted pIC 50 values of the test set molecules, respectively. r 2 is the non-cross-validated correlation coefficient from PLS process.

Molecular Docking
Surflex-Dock in SYBYL 8.1, using a patented search engine and an empirical scoring function to dock ligands into a protein's binding site [19], was applied to study molecular docking in the     present paper. The crystal structure of CDK4 with ligand 1GIH was retrieved from the RCSB Protein Data Bank [38]. A protomol, a computational representation of the receptor's binding cavity to which putative ligands are aligned, was generated automatically with a threshold parameter of 0.31 and a bloat parameter of 1 Å , and composed of a collection of fragments or probe molecules such as CH4, N-H, and C = O that characterize steric effects in the binding pocket, hydrogen bond donor and acceptor groups, respectively. [39,40] All the water molecules and sulfate salt in CDK4 1GIH (receptor) were deleted, and hydrogen atoms and Gasteiger charges were added [41,42]. All of the eightyone ligands were docked sequentially into the binding pocket of CDK4 using the parameters previously optimized. Surflex-Dock total scores are expressed in log 10 (K d ) to represent binding affinities. The scores of 10 docked conformers of each isoquinolinedione derivatives were ranked in a molecular spreadsheet, and the highest total score was taken into consideration for ligandreceptor interactions. To visualize the binding mode between the protein and ligands, the MOLCAD (Molecular Computer Aided Design) program was applied to calculate and display the surfaces of channels and cavities, as well as the separating surface between protein subunits. MOLCAD program provides several types to create a molecular surface, in which the Robbin surfaces illustrating the secondary structure elements of the binding structure was applied to build the MOLCAD Robbin and Multi-Channel surfaces displayed with several potentials. Other parameters were established in default.

CoMFA and CoMSIA Analysis
The results taken from the PLS analysis were summarized in Table 1. The actual versus the predicted pIC 50 values for the training and the test set molecules were listed in Table 2 and depicted graphically in

External Validation Results
The calculated results of the external validation were listed in Table 3.     green contours represent regions of high steric tolerance (80% contribution) and the yellow contours (20% contribution) for unfavorable steric effect. The electrostatic field defined blue contours (80%) and red contours (20%) for electron-donating andwithdrawing substituents, respectively. In Figure 5a, one huge green contour around the R 1 position revealed that bulky substituents at this site would benefit the activity, and two huge yellow contours near the R 2 and R 3 positions suggested bulky groups at these sites unfavorable. This may explain the facts that derivatives 10-12 with relative bulkier groups (e.g. N-pyrrolyl, 3-furyl and 3-thienyl) at R 1 displayed the strongest activity, while derivatives 25-28, 31-32, 38-42, 44-45, 47, 55, 57-60, 74 and 80 bearing a relative bulkier substituents (e.g. 4-methylpiperazinyl, -CH 2 -(1-piperidinyl)) at R 3 position showed a weak activity. Especially, derivatives 5, 7 and 8 with the corresponding substituent of chloro, bromo and iodo showed their activities in the following order of 5,7,8.

CoMFA Contour Maps
One red contour near the R 1 and one red around the R 2 and R 3 in Figure 5b indicated an electron-withdrawing group favorable.

CoMSIA Contour Maps
The CoMSIA steric, electrostatic, hydrophobic, hydrogen bond donor and acceptor contours plots for the compound 12 were shown in Figure 6. The CoMSIA steric and electrostatic contour maps (Figs. 6a and 6b) were similar as the CoMFA steric and electrostatic contour maps ( Figure 5). For hydrophobic field, white (20% contribution) and yellow (80% contribution) contours highlighted hydrophilic and hydrophobic properties, respectively. Hydrogen bond donor and acceptor fields take the cyan (80%) and purple (20%) contours for hydrogen bond donor, and the magenta (80%) and red (20%) contours as favor and unfavor for hydrogen bond acceptor, correspondingly.
In Figure 6c, one yellow contour near the R 1 region demonstrated that a hydrophobic substituent at this site would benefit the activity. Most of the active derivatives involved in present study possessed a hydrophobic group (e.g. 3-furyl, Npyrrolyl, 3-thienyl, bromo-, chloro-, iodo-, phenyl) at R 1 , while those with only a hydrogen atom (e.g. 2, 3, 6 and 25-27) exhibited significantly decreased potencies. Three pieces of white contour around the R 2 position highlighted the hydrophilic properties of compounds 1-25 and 62-69. One purple contour near the R 1 position in Figure 6d suggested a hydrogen bond donor group unfavorable. Therefore, the compounds 10-12, 49, 50 and 52-53 with hydrogen bond acceptor oxygen or nitrogen atoms at R 1 site exhibited better potencies. Both one red and two magenta contours located near the R 1 position in Figure 6e revealed that the hydrogen bond acceptor field was not very important for this site. The magenta contour near the R 2 and R 3 sites indicated hydrogen bond acceptor properties favorable. The hydroxyl groups at R 2 could act as hydrogen bond acceptor at the same time. Therefore, the magenta contour confirmed the importance of the hydroxyl group at this region. Compounds 7-12 and 23 bearing a hydrogen bond acceptor substituent (methoxyl, 4pyridyl) at R 3 showed the most activities.    . illustrated the binding modes between compound 12 and the ATP pocket. The carbonyl group at C-3 position acted as a hydrogen bond acceptor by forming a H-bond with the -NH group of Leu83 residue; the imino group at N-10 position served as a hydrogen bond donor and formed H-bond with the carbonyl group of Gln131 residue; the hydroxyl group at R 2 site acted as both hydrogen bond donor and acceptor and formed two H-bonds with the carbonyl group of Asp145 and the -NH group of Asn132 residues, respectively. The observations taken from Figure 7 satisfactorily matched the corresponding CoMSIA hydrogen bond donor and acceptor contour maps.

Summary of Structure-activity Relationship
The structure-activity relationship revealed by 3D-QSAR and docking studies was illustrated in Figure 8. In short, the bulky, electron-withdrawing, hydrophobic and hydrogen bond acceptor groups at R 1 position are favorable; the minor, electronwithdrawing, hydrophilic, hydrogen bond donor and acceptor groups at R 2 position may benefit the potency; the minor, electron-withdrawing and hydrogen bond acceptor substituent at R 3 position would increase the activity. The carbonyl group at C-3 and the imino group at N-10 site were essential for binding to the ATP pocket of CDK4.

Designing of Potent Derivatives
Based on the structure-activity relationship revealed by the present study, twenty novel isoquinoline-1, 3-(2H,4H)-dione derivatives were designed. These molecules were aligned to the database and their activities were predicted better than compound 12 by the best CoMFA and CoMSIA models established previously, especially D4, D11 and D12 showed 10 folds more active than compound 12. The chemical structures and predicted pIC 50 values of these compounds were shown in Figure 9. The results validated the structure-activity relationship in this present work.

Conclusions
In this frame-work, a combined docking and 3D-QSAR analysis was performed to explore the interactions between isoquinoline-1,3-diones and CDK4 protein. The satisfactory CoMFA model (q 2 = 0.695, r 2 = 0.947) and CoMSIA model (q 2 = 0.641, r 2 = 0.933) showing good correlative and predictive abilities were obtained via internal and external cross-validation techniques, region focusing, bootstraping and leave-group-out. Our analyses found that all five parameters (steric, electrostatic, hydrophobic, hydrogen bond donor and acceptor properties) are highly desirable for potent inhibitory activity. The contour maps and the docking binding structures showed that the carbonyl group at C-3 and the imino group at N-10 site were necessary for binding to the ATP pocket of CDK4. Based on the interactions, twenty new designed molecules predicted higher activities than compound 12, confirming that the models could provide a valuable clue for the development of more active CDK4 subtype-selective inhibitors.