Mapping Hydrophobicity on the Protein Molecular Surface at Atom-Level Resolution

A precise representation of the spatial distribution of hydrophobicity, hydrophilicity and charges on the molecular surface of proteins is critical for understanding their interactions with small molecules and larger systems. Hydrophobicity is rarely represented at atom level, as this property is generally assigned to residues. A new methodology for the derivation of atomic hydrophobicity from any amino acid-based hydrophobicity scale was used to derive 8 sets of atomic hydrophobicities, one of which was used to generate the molecular surfaces for 35 proteins with convex structures, 5 of which, i.e., lysozyme, ribonuclease, hemoglobin, albumin and IgG, have been analyzed in more detail. Sets of the molecular surfaces of the model proteins have been constructed using spherical probes with increasingly large radii, from 1.4 to 20 Å, followed by the quantification of (i) the surface hydrophobicity; (ii) their respective molecular surface areas, i.e., total, hydrophilic and hydrophobic area; and (iii) their relative densities, i.e., divided by the total molecular area; or specific densities, i.e., divided by property-specific area. Compared with the amino acid-based formalism, the atom-level description reveals molecular surfaces which (i) present approximately two times more hydrophilic areas; with (ii) less extended, but 2 to 5 times more intense, hydrophilic patches; and (iii) 3 to 20 times more extended hydrophobic areas. The hydrophobic areas are also approximately 2 times more hydrophobicity-intense. This more pronounced “leopard skin”-like design of the protein molecular surface has been confirmed by comparing the results for a restricted set of homologous proteins, i.e., hemoglobins differing by only one residue (Trp37).
These results suggest that the representation of hydrophobicity on the protein molecular surfaces at atom-level resolution, coupled with the probing of the molecular surface at different geometric resolutions, can capture processes that are otherwise obscured in the amino acid-based formalism.


Introduction
The shape of, and the physico-chemical properties on, the protein molecular surfaces govern the specific molecular interactions in protein-ligand complexes [1]. Therefore, studies as diverse as those on protein folding [2], protein conformational stability [3], inter- and intra-protein interactions [4], molecular recognition [5] and docking [6], as well as application-oriented ones, such as drug design [7,8], protein and peptide solubility [9], crystal packing [10], and enzyme catalysis [11], benefit from an accurate and precise representation of the molecular surfaces. Furthermore, for large, intricate protein complexes, such as ion channels [12], mechano-sensitive channels [13], or molecular chaperones [14], the biomolecular functionality occurs on the inner molecular surface of the complex, which makes the precision of the representation of molecular surfaces even more imperative.
A relatively under-studied aspect of the construction of molecular surfaces is the resolution at which the hydrophobicity is represented. Because biomolecular recognition is a geometrically localized and charge- and hydrophobicity-specific event, its accurate description requires the representation of molecular surfaces with the finest resolution possible. However, while the charges are atom-localized and therefore their representation at high spatial resolution is immediate, the assignment of hydrophobicity based on residues inherently translates into its representation at a much lower resolution than that for electrical properties. Several studies [15][16][17][18][19][20] developed "atomic hydrophobicities", proposing different sets of atom types, but a sensitivity analysis regarding the number of atom types, as well as a study comparing the protein molecular surfaces obtained using atom- or amino acid-level hydrophobicity, is lacking.
Separate from the physical resolution of hydrophobicity, i.e., at atom or amino acid level, the impact of using different geometrical resolutions for the construction of the molecular surface has also been relatively under-studied. Indeed, the representation of the molecular surface, which relies on procedures [21][22][23][24][25][26][27] that use the protein structures deposited in databases, such as the Protein Data Bank, PDB [28], usually uses a geometrical resolution between 1.4 and 5 Å, which represents the size of the small molecular species the proteins interact with. However, as discussed before [29], there are many situations that justify the use of larger probes because the protein interacts with larger objects, e.g., membrane lipid rafts [30], cytoskeleton proteins [31], amyloid plaques [32], biomaterial surfaces [33], biomedical micro-devices [34,35] and chromatographic media [36]. Also, from the methodology point of view, the probing of the molecular surfaces at different geometrical resolutions, i.e., using different probe radii, can reveal structural features of the proteins, e.g., shielding of the hydrophobic core [29].
To this end, the present study proposes a methodology for the derivation of atomic hydrophobicity from any hydrophobicity scale, runs a sensitivity analysis to assess the suitability of alternative atom types, and compares the results obtained with atom- and amino acid-level representation of hydrophobicity on molecular surfaces.

Terminology and definitions
Usually, hydrophobicity defines the property of a physico-chemical unit, i.e., a material, a surface, a molecule, or a chemical group, which reflects a particular density and geometrical distribution of water molecules around that unit. When this property, measured by various methods, reflects the repelling of water molecules, its value, usually negative, is also denominated as hydrophobicity. Conversely, when the property reflects an increased density of water molecules around the unit, the measured property, with values usually positive, is denominated as hydrophilicity. A physico-chemical unit, in particular a molecule or a chemical group, could contain various sub-units, e.g., chemical groups, or atoms, respectively, which have distinct and different hydrophobicities and/or hydrophilicities. If at least two non-contiguous sub-units present a hydrophobic and a hydrophilic character, respectively, the unit is deemed amphiphilic. To avoid confusion resulting from the overlap of terms for different parameters, and for the purposes of the analysis of the characterization of protein molecular surfaces, the following terminology will be used:
(i) hydrophobicity is the measured hydrophobicity of a unit, i.e., atom, or amino acid, which is hydrophobic and which does not have an amphiphilic character, i.e., an atom, or which is assumed, or assigned, not to have an amphiphilic character, i.e., amino acids;
(ii) total hydrophobicity is the sum of the hydrophobicities of the units, i.e., atoms, or amino acids, which are exposed on the protein molecular surface, weighted with their respective exposed areas;
(iii) hydrophilicity is the measured hydrophobicity of a unit, e.g., amino acid or atom, which is hydrophilic and which does not have an amphiphilic character, i.e., an atom, or which is assumed, or assigned, not to have an amphiphilic character, i.e., amino acids;
(iv) total hydrophilicity is the sum of the hydrophilicities of the units, i.e., atoms, or amino acids, which are exposed on the protein molecular surface, weighted with their respective exposed areas;
(v) overall hydrophobicity is the hydrophobicity of the amphiphilic protein (previously [29] denominated as amphiphilicity), calculated as the algebraic sum of the total hydrophobicity and total hydrophilicity of the units exposed on the molecular surface, using either amino acid- or atom-based methodologies.

Proteins
A set of 35 proteins (Table 1) from the Protein Data Bank [28], comprising several representative types, i.e., lactalbumins, lysozymes, ribonucleases, hemoglobins and related proteins, albumins and antibodies, has been selected for the comparison of amino acid- and atom-level representation of amphiphilicity. For the purposes of this contribution the chosen proteins need to have a convex shape. Indeed, the probing of proteins that exhibit concave shapes, most notably channel proteins, by probes with increasing radii will produce unreliable results, because much of their interior molecular surface will be inaccessible to larger probes. Finally, to ensure a representative comparison between the atom- and amino acid-based hydrophobicity, the selected set of proteins is identical with the one used in a previous contribution [29], which reports on the probing of protein molecular surfaces with probes of different sizes. The selected proteins have various molecular weights (14 to 148 kDa), numbers of residues (123 to 1344), isoelectric points (4.5 to 11) and shapes (globular, Y-shaped). Five representative proteins, i.e., lysozyme, ribonuclease, hemoglobin, albumin and IgG (Table 1, in bold), have been selected for an in-depth comparison of the atom-level and amino acid-level representation of hydrophobicities. The full results are presented in the Supporting Information section.
A subset of the hemoglobin class has been selected to test the fine differences between the hydrophobicity represented at atom- and amino acid-level resolution. Briefly, the subset comprises eight mutant structures of the deoxy forms of the protein, with the same number of residues (574), but with (i) the Trp37 residue, i.e., 1A0U and 1A0Z, for crystal forms 1 and 2, respectively; and with residues replacing the Trp37 residue by (ii) Tyr37, i.e., structures 1Y46 and 1A00, for crystal forms 1 and 2, respectively; (iii) Ala37, i.e., 1Y4F and 1A01, for crystal forms 1 and 3, respectively; (iv) Glu37, i.e., 1Y4P, for crystal form 1; and (v) Gly37, i.e., 1Y4G, for crystal form 1. A full description of these single residue mutations has been reported elsewhere [37].

Derivation of atomic hydrophobicities
The atomic hydrophobicities have been calculated as the unknowns of the following system of linear equations, written for j = 1 to 20, i.e., one equation for each j-th amino acid AA_j:

$$\sum_{i=1}^{m} \mathrm{hypho\_at}_i \cdot n_{ij} = \mathrm{hypho\_aa}_j \qquad \text{(Eq. 1)}$$

where j = amino acid index; i = atom type index; AA_j = the j-th amino acid; hypho_at_i = atomic hydrophobicity for atom type i; n_ij = number of atoms of type i in amino acid j; hypho_aa_j = hydrophobicity of the amino acid j. This system of equations has been solved using several sets of atom types, proposed according to their chemical nature and charge. The number of atom types tested, m, was from 8 to 13. The proposed atom type matrices, [n_ij], are presented in File S1. The system of equations (Eq. 1) does not have a unique exact solution because the number of equations is not equal to the number of variables. For each number of atom types, the solution retained was the one that presented the best fit, i.e., the smallest standard deviation between the target values (amino acid hydrophobicities) and the estimated ones (atomic hydrophobicities multiplied by the n_ij respective to amino acid j). The solution that represents the best fit, and consequently the one that has been retained for further calculations, contains 12 atom types. The respective matrix, [n_ij] for m = 12, is presented in Table 2.
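As an illustration of how a best-fit solution of Eq. 1 could be obtained, the following minimal sketch solves the system in the ordinary least-squares sense and reports the standard deviation of the residuals. The text does not name the numerical method used, so least squares via the normal equations is an assumption here; the function name `solve_least_squares` and the toy 4 x 3 matrix are hypothetical and are not the atom-type matrices of File S1 or Table 2.

```python
from math import sqrt

def solve_least_squares(n, hypho_aa):
    """Least-squares solution of sum_i hypho_at[i] * n[j][i] = hypho_aa[j].

    n[j][i] is the number of atoms of type i in amino acid j (rows = amino acids).
    Returns (hypho_at, sd): fitted atomic values and the standard deviation
    of the residuals between target and estimated amino acid values.
    """
    rows, m = len(n), len(n[0])
    # Normal equations (N^T N) x = N^T b, stored as an augmented matrix.
    aug = [[sum(n[j][a] * n[j][b] for j in range(rows)) for b in range(m)]
           + [sum(n[j][a] * hypho_aa[j] for j in range(rows))]
           for a in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("atom-type matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    x = [0.0] * m
    for r in reversed(range(m)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, m))
        x[r] = (aug[r][m] - s) / aug[r][r]
    residuals = [hypho_aa[j] - sum(x[i] * n[j][i] for i in range(m))
                 for j in range(rows)]
    sd = sqrt(sum(res * res for res in residuals) / rows)
    return x, sd

# Toy data (hypothetical): 4 "amino acids" built from 3 atom types.
toy_n = [[1, 0, 0], [0, 1, 0], [1, 1, 1], [2, 0, 1]]
toy_aa = [-0.5, 0.25, 0.75, 0.0]
atom_values, fit_sd = solve_least_squares(toy_n, toy_aa)
```

With the real data, `n` would have 20 rows (one per amino acid) and m columns, and the returned standard deviation would be the quantity compared across the candidate sets of atom types.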

Atom-based hydrophobicities
The initial test of fitness versus the number of atom types, from 8 to 13 atom types, used two hydrophobicity scales, i.e., the hydrophobicity of an amino acid embedded in a penta-peptide [38] as a measure of the enthalpy for its transfer (i) through a lipid membrane (ΔGwif); and (ii) from water to octanol (ΔGoct). The results of these calculations are presented in File S2. For the best fit of the atom types (m = 12), additional sets of atomic hydrophobicities have been calculated from other hydrophobicity scales, namely (i) Kyte-Doolittle, KD [39]; (ii) Hopp-Woods, HW [40]; (iii) logP [41] (cf. its implementation in HyperChem); (iv) two "estimated hydrophobic effects", for "residue burial", RB; and for "side chain burial", SCB [42]; (v) two measurements of HPLC retention, i.e., retn21 and retn74 [43]; (vi) position-specific apparent free energy of membrane insertion, ΔG_app(i), at position 0, ΔGapp_0 [44]; (vii) water-to-bilayer transfer free energy scale, ΔG^sc_wbi [45]; and (viii) unified hydrophobicity scale (UHS) for the water-membrane transfer free energy [46]. The results of these calculations are presented in File S3.

Molecular surfaces
The molecular surfaces of the selected proteins have been constructed using Connolly's algorithm [22,23], which records the position of the points of contact (or at a distance equivalent to the van der Waals radius of the respective atoms) between a virtual rolling probing ball with a set radius and the atoms on the surface of the protein. For amino acid-based overall hydrophobicity, total hydrophobicity and hydrophilicity, their spatial distribution was determined through the allocation, at the point of contact, of the hydrophobicity of the amino acid, weighted by the ratio of the probed area to the total area of the amino acid. A similar procedure was used for mapping the spatial distribution of the atom-based hydrophobicities: the specific atomic hydrophobicity was allocated with a weight equal to the ratio between the probed atomic area and the total atomic area. The results of the calculations regarding the exposed area vs. probe radii are presented in File S4.
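The area-weighted allocation described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `surface_hydrophobicity` and the input tuples are hypothetical, the probed and total areas are taken as given (they would come from the surface calculation), and the sign convention (negative for hydrophobic, positive for hydrophilic) follows the one used in this paper for overall hydrophobicity.

```python
def surface_hydrophobicity(units):
    """Sum area-weighted hydrophobicities of units exposed on a molecular surface.

    units: iterable of (value, probed_area, total_area), one tuple per atom
    or amino acid; value < 0 is hydrophobic, value > 0 is hydrophilic.
    Returns (total_hydrophobicity, total_hydrophilicity, overall_hydrophobicity).
    """
    total_hydrophobicity = 0.0
    total_hydrophilicity = 0.0
    for value, probed_area, total_area in units:
        if total_area <= 0:
            raise ValueError("total_area must be positive")
        # Weight the unit's value by the fraction of its area that is probed.
        contribution = value * probed_area / total_area
        if contribution < 0:
            total_hydrophobicity += contribution
        else:
            total_hydrophilicity += contribution
    # Overall hydrophobicity is the algebraic sum of the two totals.
    overall = total_hydrophobicity + total_hydrophilicity
    return total_hydrophobicity, total_hydrophilicity, overall

# Hypothetical example: two exposed units and one fully buried unit.
example_units = [(-2.0, 5.0, 10.0), (1.5, 4.0, 8.0), (-0.5, 0.0, 6.0)]
phob, phil, overall = surface_hydrophobicity(example_units)
```

The same function applies to both formalisms; only the meaning of a "unit" (atom or amino acid) and the source of its hydrophobicity value change.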
The calculations used an in-house program [29], which is an upgrade of Connolly's original software code [23,47], embedded in a Windows interface. The program has been run on a personal computer with a 64-bit operating system, an Intel Core i7-3630QM CPU @ 2.40 GHz, and an installed memory of 8 GB. The 4D points (x, y, z coordinates and molecular property) have been visualized using DS Viewer Pro (from Accelrys Inc.). The molecular surfaces have been constructed for all 35 proteins in the dataset (Table 1), for probe radii ranging from 1.4 Å to 20 Å. Beyond probe radii of 20 Å it was found [29] that the change of the properties on the molecular surface is negligible. Consequently, the calculations stopped at this threshold.

Protein properties on the molecular surface
Three types of geometrical and physico-chemical properties have been calculated on the molecular surface of the selected proteins: (i) global properties (i.e., total surface; overall hydrophobicity, total hydrophobicity and hydrophilicity, for amino acid- and atom-based calculations); (ii) property relative densities (i.e., overall and total hydrophobic and hydrophilic relative density, calculated by dividing the property value by the total molecular area); and (iii) property specific densities (calculated by dividing the respective property, e.g., total hydrophobicity, by the area on which that property occurs, e.g., hydrophobic area). For comparison purposes, the overall hydrophobicity, i.e., the algebraic sum of hydrophobicity, expressed in negative numbers, and hydrophilicity, expressed in positive numbers, has been calculated for both amino acid-based and atom-based hydrophobicity scales. This methodology, applied here to atom-based properties, was used before [29] but only for amino acid-based properties. The full results are presented in File S5.
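The relative and specific densities defined above reduce to simple ratios, as in the following minimal sketch. The function name `property_densities` and the numerical inputs are hypothetical, and returning `None` when a property-specific area is zero is a choice made here, not something stated in the text.

```python
def property_densities(total_area, hydrophobic_area, hydrophilic_area,
                       total_hydrophobicity, total_hydrophilicity):
    """Relative and specific densities of surface properties for one probe radius."""
    if total_area <= 0:
        raise ValueError("total_area must be positive")

    def ratio(numerator, denominator):
        # A property-specific area can vanish; report None in that case.
        return numerator / denominator if denominator > 0 else None

    overall = total_hydrophobicity + total_hydrophilicity
    return {
        # Relative densities: property divided by the total molecular area.
        "overall_relative_density": overall / total_area,
        "hydrophobic_relative_density": total_hydrophobicity / total_area,
        "hydrophilic_relative_density": total_hydrophilicity / total_area,
        # Specific densities: property divided by the area it occupies.
        "hydrophobic_specific_density": ratio(total_hydrophobicity, hydrophobic_area),
        "hydrophilic_specific_density": ratio(total_hydrophilicity, hydrophilic_area),
    }

# Hypothetical values for a single probe radius (areas in square angstroms).
d = property_densities(100.0, 20.0, 80.0, -10.0, 40.0)
```

Repeating the calculation for each probe radius, and for both the atom- and amino acid-based values, would give the quantities that are compared between the two formalisms.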
The hemoglobin subset has been analyzed separately using the same procedures. To compare the molecular surface properties with the hydrophobicity of the single residue replacement, the values for the proteins that present two crystallographic forms, i.e., 1A0U and 1A0Z for Trp; 1Y46 and 1A00 for Tyr; and 1Y4F and 1A01 for Ala, have been averaged, but those with a single

Results and discussion

Atomic hydrophobicity
Because the charge is an atom-based property, the spatial representation of charges on the protein molecular surface can be inherently performed at atom-level resolution. In contrast, the spatial distribution of hydrophobicity usually cannot be represented at high resolution, for two reasons. First, as the hydrophobicity is usually assigned to amino acids, not to atoms, its spatial representation on the protein molecular surface is constructed at several-atoms resolution, i.e., from patches comprising several atoms belonging to a parent amino acid, which is probed by the molecular surface probing ball. Intuitively, an atom-level representation of hydrophobicity would allow a more precise quantification of the properties manifested on the molecular surface and inference of the molecular recognition between protein and small molecular species. For instance, the role that arginine, which comprises chemical groups with various hydrophobicities along the molecule, plays in protein-protein interactions is difficult to understand within the framework of an evenly distributed hydrophobicity. A schematic of the differences between a molecular surface represented at amino acid level and at atomic level is presented in Figure 1, a and b, respectively. Furthermore, constructing the molecular surface using larger probes, which could be relevant to the analysis of the interaction of proteins with larger objects [29], e.g., nanoparticles or flat surfaces, will result in a more uncertain quantification of the hydrophobicity, as the molecular surface is represented by a collection of atoms which represent an increasingly smaller fraction of their parent amino acids. This situation is presented schematically in Figure 1, c and d, respectively. Second, in contrast with the charges, hydrophobicities are not represented in a standardized manner, with more than 100 hydrophobicity scales being presently proposed.
Although "hydrophobic potentials" have been proposed [48][49][50][51], including some for atomic level representations [41][50][51][52][53], the non-standardized hydrophobicity, in particular at atom level, precludes their universal use.
There are several possible avenues for the derivation of atomic hydrophobicities, either independent of the solvent-accessible surface area (ASA), as used primarily in this work, or accounting for ASA when solving the system of equations (Eq. 1). Probing the molecular surface at different geometrical resolutions, a central methodological tool for assessing the structuring of the molecular surface, will result in different ASAs for different probe radii (see File S7). Consequently, if ASAs are used as weighting factors for the calculation of atomic hydrophobicities, then the solution of the system of equations (Eq. 1) will be dependent on the radius of the probe used for the construction of the molecular surface. Equally important, the equivalence between atomic hydrophobicities and the amino acid ones from which they are derived will cease, thus making the comparison between the two methods of constructing the distribution of hydrophobicity and hydrophilicity on the protein molecular surface inoperable. Furthermore, if ASAs are used for the calculation of atomic hydrophobicities, their formal equivalence with atomic charges also ceases to exist, making their possible use for the development of hydrophobic potentials also inoperable. A full treatment of the modes of calculation of atomic hydrophobicities is presented in File S8. For all these reasons, and although we report results obtained both with and without accounting for ASAs (see Files S3 and S4), the further analysis will mainly use the atomic hydrophobicities obtained from Eq. 1.

Derivation and use of atomic hydrophobicities
While several atomic hydrophobicity scales have been proposed in the last decades, they present several limitations. For example, they (i) are estimated from large QSAR databases where amino acids represent a small fraction of the archived molecules [54,55], thus skewing the results away from the residues of interest for the analysis of proteins; or (ii) propose a small number of atom types, e.g., m = 5 [15,16,56], m = 6, 7 [18,20], m = 8 [17], thus potentially not being able to describe the molecular surface with sufficient atom-specificity; or (iii) use proprietary parameters [41,57]; (iv) use "hydrophobic potentials" (the analogue to electrostatic potentials), usually embedded in proprietary software [41][58][59][60]; or (v) result from the compilation of several different sources [61,62]. Most importantly, none of these atom-based hydrophobicity scales is derived from amino acid-based ones, which makes the comparison of molecular surfaces constructed using amino acid- or atom-based hydrophobicities difficult. The methodology for the derivation of atom-based hydrophobicity proposed here attempts to address many of these limitations.
Figure 1 […] Bottom row, the same representation for molecular surfaces probed with larger probes. Scheme upgraded from [29], which reports the mapping at low, amino acid-based hydrophobicity (i.e., a and c).
Several sets of atomic hydrophobicities are proposed, each calculated for a number of representative atom types, varying from 8 atom types, i.e., starting with the set proposed by Efremov et al. [17], to 14. The selection of the atom types was based on the chemical structure and environment of the respective amino acid. For m = 8 the atom types are: Cl: aliphatic carbon; Cr: aromatic carbon; Cx: carbon linked to a heteroatom; N: uncharged nitrogen; O: uncharged oxygen; S: sulphur; Np: positively charged nitrogen; and On: negatively charged oxygen. For m = 12 this set was expanded by splitting the C atom types according to their charge, i.e., in conformity with the charges assigned by the Amber force field [63]; and by creating a new atom type for the N atom in lysine. The representative atom types for m = 12 are presented in Table 2.
The criterion for choosing the optimum number of atom types has been the overall (i.e., for all 20 amino acids) best fit of the estimated atom-based hydrophobicities compared to the actual amino acid-based ones used for calculations. Two hydrophobicity scales, i.e., the hydrophobicity of an amino acid embedded in a penta-peptide [38], derived from the thermodynamic measurements of the enthalpy of the transfer of the respective peptide through a lipid membrane (ΔGwif) and from water to octanol (ΔGoct), respectively, have been used to calculate the best fit between atom-based and amino acid-based hydrophobicities. The quality of the fit improved moderately, but steadily, with the increase of the number of atom types, m, from 8 to 12. For m = 13 the improvement of the fit ceased, and for m = 14 the system could no longer be solved. The detailed discussion of these results is presented in File S8 and a full description of the data is presented in Files S1-S5. The evolution of the fit with the number of atom types is presented in Table 3.
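The selection criterion is described only verbally; one possible way to write it, using the symbols of Eq. 1, is given below. The normalization by the number of amino acids (20) and the symbols \sigma_m and m^{*} are assumptions introduced here for illustration.

$$\sigma_m = \sqrt{\frac{1}{20}\sum_{j=1}^{20}\left(\mathrm{hypho\_aa}_j - \sum_{i=1}^{m} \mathrm{hypho\_at}_i \cdot n_{ij}\right)^{2}}, \qquad m^{*} = \underset{m \in \{8,\dots,13\}}{\arg\min}\; \sigma_m$$

Under this reading, the retained set of atom types is the one with the smallest \sigma_m among the values of m for which the system could be solved.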

Protein overall hydrophobicity on the molecular surface
Once the optimum set of atom-based hydrophobicity, i.e., atom types and the values of the atomic hydrophobicities, has been established, one can quantify the protein overall hydrophobicity manifested on its molecular surface, and compare it with the one calculated with the classical amino acid-based hydrophobicity. The following discussion will focus on five representative proteins, i.e., lysozyme, ribonuclease, hemoglobin, albumin and IgG, which have vastly different sizes, i.e., from 129 to 1344 residues (Table 1, in bold); and shapes, i.e., globular, ellipsoidal and Y-shaped. While the following results are discussed for the ΔGwif-derived hydrophobicity only, similar results are obtained for all other hydrophobicity scales. The full results for all 35 model proteins are presented in File S5.
The comparison of the molecular surfaces (Figure 2, for ribonuclease) allows a qualitative discrimination between properties calculated at atom-level resolution, but of a different nature, i.e., charges and atomic hydrophobicity (Figure 2, left and middle columns, respectively); as well as between those of the same nature, i.e., hydrophobicity, but calculated at atom and amino acid level (Figure 2, middle and right columns, respectively). A preliminary inspection shows that the distribution of atomic hydrophobicity, despite being physico-chemically similar to the amino acid hydrophobicity from which it is actually derived, resembles far more the distribution of charges on the molecular surface. Indeed, the molecular surface represented by amino acid hydrophobicity remains largely, and evenly, hydrophilic, regardless of the geometrical resolution at which it is probed. Conversely, the molecular surface represented by atomic hydrophobicity offers a far more varied landscape. For instance, several hydrophobic 'fingers', not detected by the amino acid hydrophobicity molecular surface, but visible as near-zero charges on the charge molecular surface (Figure 3, left column), remain apparent, regardless of the probe radii. A more detailed graphical representation of the evolution of the property-molecular surface is presented in File S9.

Atomic and amino acid based hydrophobicities
This qualitative analysis is also supported by quantitative data, which could also provide a more detailed physical insight. The variation with the probe radius of the atomic physico-chemical properties, i.e., overall hydrophobicity, total hydrophobicity and hydrophilicity, as well as of their derived measures, i.e., relative area (hydrophilic or hydrophobic area divided by total molecular surface area), relative density (overall hydrophobicity, total hydrophobicity or hydrophilicity divided by the total molecular surface area) and specific density (hydrophilicity or hydrophobicity divided by their respective area), is presented in Figures 3-9 (top panels); and a synthetic overview of these parameters is presented in Table 4. Table 4 also presents the comparison between the atomic properties and their amino acid homologues (also presented in Figures 3-9, bottom panels). The qualitative (Figure 2) and quantitative data (Figures 3-9 and Table 4) allow for the construction of the following framework regarding the structuring of the protein molecular surfaces:
• Overall hydrophobicity. The slight, or, for albumin, considerable, increase of the density of overall hydrophobicity with the probe radius (Figure 3, top) indicates that protein molecular surfaces are more hydrophilic towards their outer edges, which is consistent with the "hydrophobic core" model. Moreover, the considerably (approximately two times) higher values obtained for atom-level density of overall hydrophobicity compared with amino acid ones […] bottom) and higher hydrophilic relative densities (Figure 5, bottom) than the homologue values obtained by amino acid-based calculations. This apparent contradiction can be reconciled if we assume that the hydrophilic areas are more "hydrophilicity intense" than predicted by amino acid calculations. The much higher atomic hydrophilic specific density than its amino acid counterpart (Figure 6, bottom) also supports this interpretation.
• Total hydrophobicity. The above conclusion is also supported by hydrophobicity calculations. Indeed, the slight-to-considerable decrease of the hydrophobic relative area with the probe radius (Figure 7, top) and the considerable decrease of the hydrophobic relative density (Figure 8, top) support the "hydrophobic core" model. However, the much larger prediction of the hydrophobic areas by atom-based calculations compared with amino acid ones (approximately 5 times even for the smallest radius considered, but above 10-15 times for some […]; Figure 7, bottom) suggests a much larger extent of the hydrophobic molecular surface predicted by atom-based calculations than by amino acid ones. Apparently, the "hydrophobic intensity" of these extended hydrophobic areas is also considerably higher (Figure 8, bottom) than that calculated from amino acid properties. The higher, but decreasing with the probe radius, ratio between the atomic hydrophobic specific density and its amino acid counterpart (Figure 9, bottom) results from the coupling of the decrease of the former (Figure 9, top) and the constant values for the latter [29].
• Atom-based description of protein molecular surfaces. The observation that the atom-based representation of the molecular surfaces has considerably higher resolution, coupled with the fact that the respective atomic hydrophobicities have been derived directly from a chosen amino acid hydrophobicity scale, leads to the description of the protein molecular surface with better accuracy and precision than that using amino acid hydrophobicities. While both atom- and amino acid-based calculations describe the protein molecular surface as hydrophilic, and more so with the increase of the probe radius, the atom-level description reveals a "leopard skin" design, with more intense hydrophobic and hydrophilic patches than the rather uniform-hydrophilic surface predicted by the amino acid calculations. Moreover, considering the specific hydrophobic density, the validity of the hydrophobic core concept appears not to be fully supported by amino acid-based calculations, especially for large proteins (where it should be the most apparent [64,65]), but it is valid if atom-based hydrophobicity is used. These observations lead to the conclusion that atom-based hydrophobicities offer a better representation of the protein molecular surface.

Figure 6. Evolution of the ratio between the atomic hydrophilicity and the hydrophilic area (hydrophilic specific density); and of the ratio between the atomic and the amino acid hydrophilicity specific densities; vs. probe dimensions for 5 model proteins: lysozyme (1LYZ); ribonuclease-A (1AFU); human hemoglobin (1Y4F); human serum albumin (1AO6); human IgG (1HZH).
Figure 7. Evolution of the ratio between the atomic hydrophobic area and the total molecular surface area; and of the ratio between the atomic and the amino acid hydrophobic relative areas; vs. probe dimensions for 5 model proteins: lysozyme (1LYZ); ribonuclease-A (1AFU); human hemoglobin (1Y4F); human serum albumin (1AO6); human IgG (1HZH). doi:10.1371/journal.pone.0114042.g007
Figure 8. Evolution of the ratio between the atomic hydrophobicity and the total molecular surface area (hydrophobic relative density); and of the ratio between the atomic and the amino acid hydrophobicity relative densities; vs. probe dimensions for 5 model proteins: lysozyme (1LYZ); ribonuclease-A (1AFU); human hemoglobin (1Y4F); human serum albumin (1AO6); human IgG (1HZH). doi:10.1371/journal.pone.0114042.g008
Figure 9. Evolution of the ratio between the atomic hydrophobicity and the hydrophobic area (hydrophobic specific density); and of the ratio between the atomic and the amino acid hydrophobicity specific densities; vs. probe dimensions for 5 model proteins: lysozyme (1LYZ); ribonuclease-A (1AFU); human hemoglobin (1Y4F); human serum albumin (1AO6); human IgG (1HZH).

Analysis of a homologous set of proteins
A more precise comparison of the differences between the atom- and amino acid-based hydrophobicity quantified on the protein molecular surfaces is afforded by the analysis of a sub-set of hemoglobin single-residue mutants. Because the proteins in this sub-dataset are much more similar to each other than the rest of the proteins in the overall, larger data set, as only one residue (Trp37) is different, and because this replacement, with Ala, Gly, Glu and Tyr, did not lead to substantial changes in the tertiary structure of the hemoglobins [37], the evolution of the molecular surface parameters with the probe radius is expected to be much closer than that for very different proteins. While this assumption is true only with qualifications, all conclusions drawn from the analysis of very different proteins, as described in the above section, are validated by the analysis on the hemoglobin dataset (see File S6). For example, the evolution of the density of the overall hydrophobicity with the radius of the probe (Figure 10) reveals an increasingly hydrophilic surface with the decrease of the probing resolution, and a higher hydrophilicity (approximately two times) of the molecular surface than that predicted by the amino acid calculations.
Working with a very similar set of proteins could lead to important conclusions following the removal of the "noise" caused by too large variations. For instance, the amino acid-based overall hydrophobicity density is essentially identical for all hemoglobins, for both the finest and the coarsest probe, i.e., 1.4 Å and 20 Å, respectively (Figure 11, top and bottom, respectively). However, while the density of the atomic hydrophobicity for the finest probing resolution is also identical for all hemoglobins (albeit larger than its amino acid homologue), the calculations for the coarsest probing show an overall hydrophobicity density that seems to be protein-specific and correlated with the hydrophobicity of the amino acid that replaced the Trp37 in the natural hemoglobin structure.

Table 4. General comparison of the evolution of molecular surface properties with the probe radius, calculated at atom and amino acid level.
Property (definition) | Atomic property relationship vs. probe radius (R) increase | Atomic property ratio to amino acid homologue
[…] | […] | Generally larger, i.e., 1-2.5x; decrease with R [Figure 9, bottom]
Notes: 1. Overall hydrophobicity is the algebraic sum of hydrophilicity (positive sign) and hydrophobicity (negative sign). Consequently, the increase of the overall hydrophobicity means that it is more hydrophilic.

Figure 10. Evolution of the ratio between atom-based overall hydrophobicity and total molecular surface area (relative density of the atomic overall hydrophobicity); and of the ratio of the atomic and the amino acid overall hydrophobicities; vs. probe dimensions for hemoglobin subset.

Computing time
For the computing system used in this study, the run time ranges from 2 s for a small protein (lysozyme, 1LYZ, 1001 atoms) with the smallest probe radius (1.4 Å) to nearly 5000 s for a large protein (IgG, 1HZH, 10196 atoms) with the largest probe radius considered (20 Å), as presented in Table 5. No difference in run time has been noted between the calculations using amino acid hydrophobicities and those using atomic ones.

Perspectives and future directions of research
The present study has demonstrated the benefits of using a finer-scale, atom-level description of hydrophobicity. These benefits could be further amplified by pursuing several possible future directions of research:

• Molecular surface databases. A recent comprehensive review of the present understanding of hydrophobicity [66] suggested that it would be beneficial to archive the data regarding the distribution of hydrophobicity and hydrophilicity on the molecular surface of proteins, in particular those whose structures are deposited in the PDB. It was also suggested that this desideratum can be achieved through molecular simulations from which the fluctuations of the density of water molecules can be calculated. While this research avenue is certainly desirable, the calculations could be expensive and time consuming, even with the emergence of more powerful supercomputers. An interim solution could be the mapping of protein surfaces using atomic hydrophobicities, either the ones reported here or others calculated using similar methodologies. Furthermore, once the atomic hydrophobicities of interest are derived, one can attempt to cluster the molecular surfaces of whole proteins, or of parts of proteins, through the comparison of atomic neighborhoods, as proposed recently [67].
• Universality of atomic hydrophobicities. The present study described how atomic hydrophobicities can be derived from amino acid ones. While different niche applications would find a particular hydrophobicity scale more relevant than another, e.g., chromatography vs. lipid membranes, a standardization of atomic hydrophobicity would greatly help the transfer of knowledge from one application to another. This desideratum can be achieved via two approaches. First, one approach could consist of assigning atom types in accordance with a widely used force field, e.g., AMBER. This approach would have the benefit of creating "hydrophobic charges", which can then be easily used in molecular surface representations, including the calculation of "hydrophobic potentials", such as those previously proposed [41]. Second, a more thorough, albeit computationally intensive, approach would be to derive the atomic hydrophobicities from molecular dynamics simulations, e.g., distribution of water molecules around particular atoms, quantification of the fluctuations of the distribution of water molecules, as alluded to above, etc. Aside from the large effort required, this approach would have the benefit of creating truly universal atomic hydrophobicities, as the procedure could be applied to any molecule, e.g., DNA, ligands, glycopeptides, etc., thus opening new avenues for fundamental studies in molecular biology or for applied research, such as drug discovery.

Figure 11. Density of the overall hydrophobicity on the molecular surface of the hemoglobin subset, at small (top) and large (bottom) probe radius vs. the hydrophobicity of the residue 37. The hydrophobicities of the residue 37 are, from left to right, Gly, Ala, Glu, Tyr and Trp [38].
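As an illustration of how per-atom "hydrophobic charges" could feed a "hydrophobic potential", the sketch below sums the atomic values around a probe point with a simple 1/(1 + d) distance weighting. The weighting function, the cutoff and the function name `hydrophobic_potential` are assumptions chosen for illustration; they are not necessarily the form proposed in [41], and the coordinates and values in any usage are hypothetical.

```python
import math


def hydrophobic_potential(point, atoms, cutoff=10.0):
    """Distance-weighted sum of atomic hydrophobicities at a probe point.

    point  : (x, y, z) coordinates of the probe point, in Å
    atoms  : iterable of ((x, y, z), hydrophobicity) pairs
    cutoff : atoms farther than this distance (Å) are ignored

    Assumption (illustrative only): each atom contributes h / (1 + d),
    where d is its distance to the probe point.
    """
    total = 0.0
    for position, hydrophobicity in atoms:
        d = math.dist(point, position)  # Euclidean distance, Python 3.8+
        if d <= cutoff:
            total += hydrophobicity / (1.0 + d)
    return total
```

Evaluating such a function at points sampled on the molecular surface would give a smoothed map in which the sign convention of the atomic values (positive for hydrophilic, negative for hydrophobic) is preserved.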

Conclusion
The mapping and quantification of the physico-chemical properties on the molecular surfaces of proteins, using atomic hydrophobicities derived from the corresponding amino acid hydrophobicity scales, offers insights into the structuring of the protein molecular surfaces. The demonstration of the finer representation of protein molecular surfaces at atom level justifies the derivation of sets of these hydrophobicities for any chosen hydrophobicity scale that is appropriate for a specific application, thus opening the opportunity for the engineering of optimum protein-small ligand interactions, as well as protein-solid surface interactions. Furthermore, the results are expected to benefit both fundamental studies of protein function and drug discovery by providing a pathway for high-resolution mapping of hydrophobicities on the molecular surface.

Supporting Information
File S1.
File S5. Complete set of data regarding the calculation of physico-chemical properties on the molecular surface of the proteins in the total set (Table 1), for atomic hydrophobicities, amino acid hydrophobicities and charges, the latter two from [29].

Author Contributions