Prediction of the Damage-Associated Non-Synonymous Single Nucleotide Polymorphisms in the Human MC1R Gene

The melanocortin 1 receptor (MC1R) is involved in the control of melanogenesis. Polymorphisms in this gene have been associated with variation in skin and hair color and with elevated risk for the development of melanoma. Here we used 11 computational tools based on different approaches to predict the damage-associated non-synonymous single nucleotide polymorphisms (nsSNPs) in the coding region of the human MC1R gene. Among the 92 nsSNPs arranged according to the predictions, 62% were classified as damaging by more than five tools. The classification was significantly correlated with the scores of two consensus programs. Alleles associated with the red hair color (RHC) phenotype and with the risk of melanoma were examined. The R variants D84E, R142H, R151C, I155T, R160W and D294H were classified as damaging by the majority of the tools, while the r variants V60L, V92M and R163Q were predicted as neutral by most of the programs. The combination of the prediction tools resulted in 14 nsSNPs indicated as the most damaging mutations in MC1R (L48P, R67W, H70Y, P72L, S83P, R151H, S172I, L206P, T242I, G255R, P256S, C273Y, C289R and R306H); C273Y was predicted to be highly damaging by the SIFT, Polyphen-2, MutPred, PANTHER and PROVEAN scores. The computational analysis proved capable of identifying the potentially damaging nsSNPs in MC1R, which are candidates for further laboratory studies of the functional and pharmacological significance of the alterations in the receptor and the phenotypic outcomes.


Introduction
The melanocortin 1 receptor (MC1R) gene encodes a G protein-coupled receptor (GPCR) with seven transmembrane domains involved in the control of melanogenesis. Binding of the α-melanocyte stimulating hormone (α-MSH) to MC1R stimulates adenylate cyclase, with a consequent increase of cAMP levels that leads to the activation of tyrosinase (TYR) and other enzymes, resulting in the switch from the synthesis of phaeomelanin (red/yellow pigment) to eumelanin (black/brown pigment) in melanocytes [1].
The human MC1R protein contains 317 amino acids encoded in a single exon, and shows many polymorphisms that have been described in different populations [2]. Some human MC1R variants have been associated with variation in hair and skin pigmentation and with increased risk of developing melanoma and other skin cancers, and have been characterized in laboratory studies [3] [4] [5] [6] [7] [8] [9]. However, many of the polymorphisms have unknown effects. The non-synonymous single nucleotide polymorphisms (nsSNPs) in the coding region alter the corresponding proteins. These changes may affect the protein functions in many different ways, for instance by altering the catalytic or ligand binding sites, leading to improper protein folding, incorrect intracellular transportation, or decrease in the stability or loss of function of the gene product [10] [11] [12] [13] [14] [15] [16] [17] [18]. Understanding which molecular variations are related to Mendelian or complex diseases and to variations in phenotype is a challenge in genetic research [19]. Genome-wide association studies (GWAS) are powerful approaches to detect complex disease-associated SNPs [20] [21] [22] [23] [24]; however, factors such as the degree of linkage disequilibrium between the disease variant and the SNP marker, differences in allele frequencies and the choice of SNPs affect GWAS, resulting in lower detection power and in the need for much larger samples than association studies using targeted candidate loci [25] [26] [27]. While in vitro tests can assess the effect of specific variations, it is laborious and time-consuming to evaluate the large amount of variation in the human genome [28].
Determining which SNPs affect the phenotype would make it possible to identify the molecular mechanisms of disease and phenotypic variation, and to help select the most important SNPs for association studies with populations. Several tools have been developed to differentiate the deleterious or disease-associated SNPs occurring in a gene from the neutral or tolerated alterations, and these tools use approaches based on different features [10]. These approaches include sequence-based methods that use evolutionary information on the amino-acid conservation in the gene, based on multiple sequence alignment (MSA) of homologous proteins in related species. Assuming that amino acids that are highly important for the structure and function of the protein will be more conserved in a protein family, mutations in those positions are more likely to be deleterious. Methods based on the structural, physical and chemical properties of the wild and mutant proteins are also available, and allow the identification of the SNPs that affect the stability and function of the protein [29] [30]. Other tools use machine-learning methods (such as the support vector machine, SVM; or Random Forest, RF) to predict the association of the SNPs with disease. These tools combine properties of the amino acid residues, structural information and evolutionary conservation, and databases that contain validated information about the biochemical and clinical evidence for SNPs known to be deleterious [19] [28]. In order to combine the results of the various tools, consensus predictors have been developed to allow comparison between methods that use different analytical approaches [10] [31].
Studies using combination of different prediction tools have identified deleterious mutations in genes involved in different biological processes, including, for example, cancer (breast cancer 1, early onset-BRCA1 gene) [32], STIL gene [33], Centromere-associated protein-E gene (CENP-E) [34], leukemia (c-abl oncogene 1-ABL1 gene) [35], lipoprotein metabolism (ATP-binding cassette transporter A1-ABCA1 gene) [36], cardiomyopathy (beta myosin heavy chain-MyH7 gene) [28], oxidative stress (superoxide dismutase 2-SOD2 gene) [37], amyotrophic lateral sclerosis (superoxide dismutase 1-SOD1 gene) [38], and melanogenesis (receptor tyrosine kinase-KIT gene [39], oculocutaneous albinism type 2-OCA2-P protein gene [40], tyrosinase-TYR gene [41], and tyrosinase-related protein 1-TYRP1 gene [42]), resulting in the establishment of the mutations with the highest pathogenic prediction.
Here we used prediction tools to evaluate 92 nsSNPs in the MC1R gene in relation to their damaging or pathogenic effects, and to predict the disease-associated variation.
Thus, by the combination of the prediction tools we classified the nsSNPs in the MC1R gene, and selected those that are the most likely to affect the function of the receptor in a way that could result in disease or phenotypic variation in pigmentation.

Data
Human MC1R gene data were obtained from OMIM (#155555; http://www.ncbi.nlm.nih.gov/omim) and Entrez on the National Center for Biotechnology Information (NCBI) website, including the protein accession number (NP_002377) and mRNA accession number (NM_002386). The Uniprot accession number (Q01726) was obtained from the Swissprot database (http://expasy.org). The information on 92 SNPs in human MC1R was collected from dbSNP (http://www.ncbi.nlm.nih.gov/snp), including SNP ID (S1 Table), chromosome position, alleles and functional consequences, when available.

Functional analysis Prediction
The nsSNPs were analyzed using 11 prediction tools: SIFT, MutPred, Polyphen-2, PROVEAN, I-Mutant 3.0, PANTHER, SNPs3D, Mutation Assessor, PhD-SNP, SNPs&GO and SNAP (Table 1) and the consensus prediction tools PON-P and PredictSNP 1.0. The data for chromosome location, amino acid sequence of the human MC1R gene (RefSeq NP_002377), Uniprot accession number (Q01726), position in the protein, and wild and mutated residue of the nsSNPs were used according to the program requirements. The prediction tools were selected because they use different approaches, in order to obtain a classification of the nsSNPs according to one or more features. The tools are freely accessible and described in the literature. Each program's approach is detailed below.
The SIFT (Sorting Intolerant From Tolerant) tool uses a sequence homology approach based on conservation in the multiple sequence alignment (MSA) to classify the nsSNPs as tolerated by or damaging to the protein. The SIFT score is the normalized probability that the amino acid change is tolerated. The score ranges from 0 to 1 with a cut-off score of 0.05. Amino acid substitutions with scores less than 0.05 are predicted to be deleterious, and those with scores greater than or equal to 0.05 are predicted to be tolerated [43].
The MutPred tool was developed to classify an amino acid substitution as deleterious/disease-associated or neutral, based on three classes of attributes: the evolutionary conservation of the protein sequence, the protein structure and dynamics, and functional properties, including secondary structure, solvent accessibility, stability, intrinsic disorder, B-factor, transmembrane helix, catalytic residues and others. It determines the changes at atomic and molecular level induced by the amino acid substitution. MutPred uses the RF (Random Forest) classifier to provide the g score for the prediction of the probability that the substitution is deleterious, and the p score for the indication of the structural and functional properties impacted, for instance, gain of helical propensity or loss of a phosphorylation site [44].
Polyphen-2 (Polymorphism Phenotyping v2) is a sequence and structure-based method that determines the structural and functional consequences of nsSNPs. Polyphen-2 calculates the posterior probability that an nsSNP is damaging by a Bayesian classifier [45]. The conservation of a position in the MSA and the deleterious effect on the protein structure result in the Position-Specific Independent Count (PSIC) score that ranges from 0 to 1. The nsSNPs are classified as Possibly Damaging or Probably Damaging (PSIC > 0.5) or as Benign (PSIC < 0.5).
PROVEAN (Protein Variation Effect Analyzer) measures the damaging effect of variations in protein sequences [46]. The prediction is based on the change, caused by an nsSNP, in the similarity of the sequence to related protein sequences in an MSA. PROVEAN uses a delta alignment score based on the reference and variant versions of the protein sequence with respect to the alignment of homologous sequences [47]. A score equal to or below the threshold of -2.5 determines the classification as a deleterious nsSNP.
I-Mutant 3.0 is a support vector machine (SVM) tool for the prediction of the protein stability free-energy change (ΔΔG or DDG) caused by a specific nsSNP. It predicts the free energy changes starting from either the protein structure or the protein sequence [48]. A negative DDG value means that the mutation decreases the stability of the protein, while a positive DDG value indicates an increase in stability. I-Mutant 3.0 also implements a prediction of disease-associated SNPs from a sequence analysis based on a decision tree with the SVM-based classifier (SVM-Sequence) coupled to the SVM-Profile trained on sequence profile information. The nsSNPs are then classified as disease-related or neutral polymorphisms.
PANTHER (Protein ANalysis THrough Evolutionary Relationships) estimates the likelihood that a particular nsSNP will result in a functional alteration of the protein. It calculates the subPSEC (substitution position-specific evolutionary conservation) score based on a hidden Markov model alignment of evolutionarily related proteins [49] [50]. A substitution with subPSEC = 0 is indicated as functionally neutral, whereas negative values of subPSEC predict deleterious substitutions. A subPSEC score cut-off of -3 corresponds to a 50% probability that an nsSNP is deleterious to the protein, that is, a probability of causing a deleterious effect on the protein function (Pdeleterious) of 0.5.
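The relation between subPSEC and Pdeleterious can be illustrated with a short sketch. The logistic form used here, Pdeleterious = 1 - 1/(1 + exp(-subPSEC - 3)), is an assumption taken from the PANTHER literature rather than from the text above; it is consistent with the stated property that subPSEC = -3 gives a probability of 0.5. The function name is hypothetical.

```python
import math


def panther_p_deleterious(sub_psec: float) -> float:
    """Map a subPSEC score to a probability of a deleterious effect.

    Assumed logistic form (not stated in the text above):
        P = 1 - 1 / (1 + exp(-subPSEC - 3))
    """
    return 1.0 - 1.0 / (1.0 + math.exp(-sub_psec - 3.0))


# subPSEC = 0 is treated as neutral, -3 is the 50% cut-off,
# and more negative scores give higher probabilities.
for score in (0.0, -3.0, -6.0):
    print(score, round(panther_p_deleterious(score), 3))
```

Under this assumed form, the cut-off of -3 is simply the point where the logistic curve crosses 0.5, so "subPSEC equal to or lower than -3" and "Pdeleterious of at least 0.5" describe the same set of variants.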
SNPs3D analyzes the likely impact of nsSNPs on protein function by two methods. The first is based on the protein structure and stability, stemming from the hypothesis that many disease nsSNPs affect protein function primarily by decreasing protein stability, and is intended to identify which amino acid substitutions significantly destabilize the folded state. The second is based on the analysis of homology in sequence families related to human proteins, through analysis of amino acid conservation at the affected sequence position [30] [51]. A positive SVM score indicates a variant classified as non-deleterious, and a negative score indicates a deleterious variant. The larger the absolute value of the score, the more confident is the classification of the nsSNP, with accuracy significantly higher for scores greater than 0.5 or less than -0.5 [51].
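The score cut-offs described so far can be gathered into one lookup that turns a numerical score into a binary damaging/neutral call. This is a minimal sketch: the function and dictionary names are hypothetical, the example scores are invented for illustration, and only the five tools whose cut-offs are stated above are included.

```python
# Cut-offs as stated in the tool descriptions above.
# Each rule returns True when the score falls on the damaging side.
DAMAGING_RULES = {
    "SIFT": lambda s: s < 0.05,       # < 0.05 is deleterious
    "Polyphen-2": lambda s: s > 0.5,  # PSIC > 0.5 is possibly/probably damaging
    "PROVEAN": lambda s: s <= -2.5,   # equal to or below -2.5 is deleterious
    "PANTHER": lambda s: s <= -3.0,   # subPSEC of -3 or lower, P >= 0.5
    "SNPs3D": lambda s: s < 0.0,      # negative SVM score is deleterious
}


def is_damaging(tool: str, score: float) -> bool:
    """Return the binary call for one tool and one numerical score."""
    return DAMAGING_RULES[tool](score)


# Invented scores for a single hypothetical variant.
example_scores = {
    "SIFT": 0.00,
    "Polyphen-2": 0.98,
    "PROVEAN": -1.2,
    "PANTHER": -4.1,
    "SNPs3D": 0.3,
}
calls = {tool: is_damaging(tool, s) for tool, s in example_scores.items()}
print(calls)
```

Note that the direction of the comparison differs between tools: high scores are damaging in Polyphen-2, whereas low or negative scores are damaging in the other four.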
The Mutation Assessor predicts the functional impact of amino acid substitutions in proteins based on evolutionary conservation of the affected amino acid in protein homologs, providing a rough estimate of the probability that the mutation has a phenotypic consequence at the level of the organism. It uses information based on the analysis of evolutionary conservation patterns in protein family multiple-sequence alignments, which are subject to selective forces at the level of the ability of the organism to survive and reproduce [52]. The analysis results in a functional impact score based on evolutionary information (FIS) that classifies the nsSNP as neutral, low, medium or high.
PhD-SNP (Predictor of Human Deleterious Single Nucleotide Polymorphisms) is a SVM-based classifier that uses protein sequence information to predict whether an nsSNP is disease-associated, based on a supervised training algorithm. The output is obtained from the frequencies of the wild and mutant residues, the number of aligned sequences, and the conservation index calculated for the position involved, and provides a prediction of disease-related (disease) or neutral polymorphism [53].
SNPs&GO is a method based on SVM to predict disease-related mutations from the protein sequence, that uses information derived from evolutionary information, protein sequence and function as encoded in the Gene Ontology (GO) terms annotation to predict if a given mutation can be classified as disease-related or neutral [54].
SNAP (Screening for Non-Acceptable Polymorphisms) is a neural network-based method for the prediction of the functional effects of nsSNPs. SNAP uses evolutionary information for the residue conservation within sequence families, aspects of protein structure, and annotations, when available. The SNAP network takes protein sequences and lists of mutants and provides a score for each substitution, which can then be translated into binary predictions of a neutral or non-neutral effect [55].
We compared the prediction results of our combined analysis with two consensus tools, PON-P and PredictSNP 1.0. PON-P is a meta tool that combines five methods (SIFT, PhD-SNP, PolyPhen-2, SNAP and I-Mutant 3.0) to predict the probability that an nsSNP will affect protein function and may consequently be disease-related. It utilizes a machine learning-based method (RF) for predicting whether variants affect functions and thereby lead to diseases. PON-P classifies the nsSNPs as neutral, unclassified or pathogenic with a corresponding probability of pathogenicity, and provides the data available in the Uniprot database for each entry [56].
PredictSNP1.0 is a SNP classifier tool that combines six prediction methods (MAPP, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP) to obtain a consensus prediction of the effect of the amino acid substitution. The six prediction tools are run using a dataset of nonredundant mutations. The individual confidence scores are transformed to percentages to allow comparison, and the individual predictions are combined in the consensus prediction. The predictions are supplemented by experimental annotations from Protein Mutant Database and Uniprot [31].
In order to identify the nsSNPs that are most probably damaging in the gene, the categorical predictions of the individual tools were combined by counting the damaging results, and the nsSNPs were ranked from the most neutral (no damaging results) to the most damaging (damaging prediction in all eleven tools).
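The counting step described above can be sketched as follows. The variant labels and calls are hypothetical placeholders, not data from this study, and the sketch assumes that each tool's output has already been reduced to a binary damaging/neutral call (how the four-level Mutation Assessor output is mapped to a binary call is not specified here).

```python
# The 11 individual tools named in the text.
TOOLS = [
    "SIFT", "MutPred", "Polyphen-2", "PROVEAN", "I-Mutant 3.0", "PANTHER",
    "SNPs3D", "Mutation Assessor", "PhD-SNP", "SNPs&GO", "SNAP",
]


def damage_count(calls):
    """Count the tools whose categorical call is damaging (True)."""
    return sum(1 for tool in TOOLS if calls.get(tool, False))


def rank_variants(table):
    """Sort variants from the most damaging to the most neutral."""
    counts = {variant: damage_count(calls) for variant, calls in table.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical calls for three placeholder variants.
example = {
    "variant_A": {tool: True for tool in TOOLS},
    "variant_B": {tool: False for tool in TOOLS},
    "variant_C": {tool: tool in ("SIFT", "SNAP", "I-Mutant 3.0") for tool in TOOLS},
}
ranking = rank_variants(example)
print(ranking)
```

A count of 11 corresponds to the "most damaging" end of the scale and a count of 0 to the "most neutral" end.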

Statistical analysis
The Pearson correlation coefficients between the prediction scores for deleterious effect or the probability of pathogenicity provided by the programs SIFT, Polyphen-2, PROVEAN, MutPred, PANTHER, SNPs3D and Mutation Assessor were analyzed. The associations among the neutral or damaging results of the categorical classification of the prediction tools were evaluated by the Chi-square test (χ²) for independence by contingency table analysis. The statistical significance of differences in the combined counts of damaging results of the individual tools among the domains of the MC1R protein was evaluated by the Kruskal-Wallis test. The statistical analyses were performed in the SPSS v. 20 program (IBM Corp., Armonk, NY, USA).
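As an illustration of the contingency table analysis, the sketch below computes the Pearson Chi-square statistic for a table of damaging/neutral calls from two tools. It is a plain-Python sketch, not the SPSS procedure used in the study: no continuity correction is applied and no p-value is derived, and the example counts are invented.

```python
def chi_square_statistic(table):
    """Pearson Chi-square statistic for an r x c contingency table.

    `table` is a list of rows of observed counts. Expected counts are
    row_total * column_total / grand_total (independence hypothesis).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Invented 2 x 2 table: rows = tool 1 (damaging, neutral),
# columns = tool 2 (damaging, neutral).
observed = [[20, 0],
            [0, 20]]
print(chi_square_statistic(observed))
```

For a 2 x 2 table the statistic has one degree of freedom, so values above about 3.84 correspond to P < 0.05.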

Prediction Programs
A total of 92 nsSNPs from the NCBI dbSNP database were analyzed to identify the deleterious mutations. Of these, 76 were found to be damaging (score < 0.05) by SIFT, with 38 assigned a score of 0.
The PROVEAN score was lower than -2.5 for 51 nsSNPs, indicating that these variants affect the protein function and are likely to be deleterious.
In Polyphen-2, a total of 54 nsSNPs were predicted as damaging (PSIC > 0.5); 12 of these nsSNPs were predicted to be highly deleterious, with a PSIC score of 1.
In the MutPred analysis, 57 nsSNPs showed a probability of being a deleterious mutation, with g scores higher than 0.5. For 22 of these nsSNPs the program indicated an actionable or confident hypothesis (p score < 0.05) that the molecular mechanism would be disrupted.
The PANTHER software estimates the likelihood that the nsSNPs will affect the function of the protein [50]. The calculated subPSECs were equal to or lower than -3, resulting in a probability of deleterious effect of 0.5 or higher, for 43 nsSNPs.
The DDG predicted by I-Mutant 3.0 classified 86 of the nsSNPs as decreasing the stability of the mutated protein (DDG < 0) and 6 as increasing it (DDG > 0). We used the sequence-based tool of the I-Mutant 3.0 suite to predict the disease-associated nsSNPs. A total of 73 nsSNPs were predicted to be disease-related by this method.
A negative SVM score in SNPs3D was obtained for 49 nsSNPs, indicating a variant classified as deleterious; the other 43 nsSNPs received a positive score, which indicates a likely non-deleterious mutation.
The PhD-SNP 2.0 and SNPs&GO tools classify the mutation as a disease-related or neutral polymorphism. Of the set of nsSNPs in the MC1R gene analyzed, 56 were predicted to be disease-related by PhD-SNP 2.0, and the SNPs&GO method classified 24 nsSNPs as disease-related. The SNAP method indicated that 60 nsSNPs were functionally non-neutral. The prediction results of the 11 tools are summarized in Fig. 1.
The scores from SIFT, Polyphen-2, PROVEAN, MutPred, PANTHER, SNPs3D and Mutation Assessor provide a numerical value associated with the prediction. In Polyphen-2, MutPred and Mutation Assessor, higher scores indicate damaging mutations, while in SIFT, PROVEAN, PANTHER and SNPs3D, lower or negative scores correspond to damaging SNPs. These differences in score direction result in negative values of the correlation coefficient between tools whose scores have opposite signs. Considering the absolute value of the Pearson coefficients, the tools showed significant correlation with each other, with R² ranging from 0.276 between SIFT and MutPred to 0.755 between SNPs3D and Mutation Assessor (Table 2).
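The effect of score direction on the sign of the correlation can be shown with a small sketch. The two score lists are invented for illustration (one where low values are damaging, one where high values are damaging) and are not outputs of the actual tools.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented scores for five hypothetical variants.
low_is_damaging = [0.00, 0.01, 0.20, 0.60, 1.00]   # SIFT-like direction
high_is_damaging = [1.00, 0.95, 0.40, 0.10, 0.00]  # Polyphen-2-like direction

r = pearson_r(low_is_damaging, high_is_damaging)
# Reversing the direction of one score flips the sign but not the magnitude.
r_flipped = pearson_r([-s for s in low_is_damaging], high_is_damaging)
print(r < 0, abs(r) == abs(r_flipped) or math.isclose(abs(r), abs(r_flipped)))
```

This is why the comparison between tools is made on the absolute value of the coefficient: the sign only reflects the scoring convention of each tool.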
The majority of the 11 tools had a significant association between their categorical prediction results (Chi-square test for independence, P < 0.05), with the exception of I-Mutant 3.0, which showed a significant association only with SNPs&GO (Table 3).
The results of the 11 prediction tools were combined in order to identify the most damaging nsSNPs in the MC1R gene. A total of 57 nsSNPs (about 62%) were predicted as damaging by more than five tools (Fig. 2).
The numbers of damaging results in the 11 tools for the 92 nsSNPs in the MC1R protein are represented in Fig. 3. Two nsSNPs (T19I and I98V) showed neutral results in all tools. A total of 14 nsSNPs (L48P, R67W, H70Y, P72L, S83P, R151H, S172I, L206P, T242I, G255R, P256S, C273Y, C289R and R306H) were predicted as damaging by all 11 tools.
The prediction scores of the tools indicate differences between the nsSNPs selected as damaging by the 11 tools. Among the 14 nsSNPs, 12 showed a SIFT score of 0, and six (L48P, R67W, R151H, L206P, P256S and C273Y) showed a Polyphen-2 PSIC score of 1, indicating that they may be highly damaging mutations. The MutPred tool indicated hypotheses of the molecular mechanisms disrupted (g score > 0.5 and p score < 0.05) by the nsSNPs L48P, R67W, R151H, S172I, L206P and C273Y, including loss of solvent accessibility, loss of catalytic

Analysis of consensus prediction tools
The PredictSNP 1.0 and PON-P consensus tools predicted 58 and 20 nsSNPs as deleterious and pathogenic, respectively (S1 Table). PON-P gave unclassified results for 36 nsSNPs. The two consensus analysis tools showed a significant association with each other (χ² = 36.823, p < 0.05).
While most of the nsSNPs with more than five damaging results coincided with PredictSNP 1.0 classifications, three nsSNPs that were classified as deleterious (S41C, I120T and I297V) were predicted as neutral in PredictSNP 1.0, and four (M1I, M128T, K278E, and I292T) with less than five damaging results were classified as deleterious in the PredictSNP 1.0 analysis.
Of the 57 nsSNPs classified as deleterious by more than five tools, 20 were predicted as pathogenic, 30 as unclassified and 7 as neutral by PON-P, while of the 35 nsSNPs classified as neutral in the combined analysis, 29 were also classified as neutral in PON-P and six were predicted as unclassified.
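The counts reported above can be arranged as a cross-tabulation and checked for internal consistency. The sketch below uses only the numbers given in the text; the dictionary layout and labels are illustrative, and the zero in the "pathogenic" cell of the neutral group is implied by 29 + 6 = 35 rather than stated explicitly.

```python
# Combined analysis (rows) versus PON-P classification (columns),
# using the counts reported in the text.
cross_tab = {
    "damaging (more than five tools)": {"pathogenic": 20, "unclassified": 30, "neutral": 7},
    "neutral (combined analysis)": {"pathogenic": 0, "unclassified": 6, "neutral": 29},
}

row_totals = {group: sum(cells.values()) for group, cells in cross_tab.items()}
col_totals = {
    label: sum(cells[label] for cells in cross_tab.values())
    for label in ("pathogenic", "unclassified", "neutral")
}
total = sum(row_totals.values())
share_damaging = 100.0 * row_totals["damaging (more than five tools)"] / total

print(row_totals)
print(col_totals)
print(total, round(share_damaging))
```

The column totals recover the 20 pathogenic and 36 unclassified nsSNPs reported for PON-P, the grand total is 92, and 57 of 92 corresponds to the "about 62%" quoted for the combined analysis.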

Determination of the most damaging nsSNPs
The non-synonymous polymorphisms situated in the MC1R gene were evaluated by 11 programs that use different methods to predict the damaging nsSNPs. The differences in the predictions generated by the programs indicate the need for a combined analysis that could identify with accuracy the nsSNPs that are most damaging to the function of the MC1R gene. For this purpose we combined the results of the 11 tools to classify the nsSNPs from the most neutral to the most damaging. The majority of the nsSNPs (57, about 62%) were predicted as damaging, deleterious or disease-associated by more than five programs, showing high concordance with the two consensus prediction tools (Fig. 2).
The 14 nsSNPs classified as deleterious by all 11 tools were selected as the most damaging in our combined analysis and were predicted as deleterious by PredictSNP 1.0, and as pathogenic or unclassified by PON-P (S1 Table). Among the 14 nsSNPs, only C289R (rs369542041) has been previously analyzed in the literature [8], showing absence of functional coupling to the cAMP pathway and an inability to bind agonist efficiently. The C273Y nsSNP, which received highly damaging scores in five of the 11 tools, is located in the third extracellular loop domain (Fig. 3) and affects a cysteine that is highly conserved in MC1R across different species, according to MSA analysis in Polyphen-2, PANTHER and Mutation Assessor. Although the majority of the 14 most damaging nsSNPs described here have not been analyzed by in vitro tests, and there is no information on the functional significance of these mutations in the MC1R protein, the results indicate that they can be prioritized in further population and laboratory studies.
The strategy of using the predictions of different tools has been applied to analyze the nsSNPs in different genes involved in biological processes, allowing the most deleterious mutations to be selected. The combination of tools resulted in the indication of four, two and one nsSNPs as the most deleterious mutations in the TYR, TYRP1 and P proteins, which are associated with oculocutaneous albinism type IA (OCA1A) [41], type III (OCA3) [42] and type II (OCA2) [40], respectively. These results demonstrate that the use of a combination of tools could adjust for the differences between the programs and improve the accuracy of the search for the polymorphisms that are important for the occurrence of diseases or for phenotypic variation.

Analysis of Red Hair Color (RHC) and Pathogenic MC1R variants
The MC1R gene has been associated with variation in human skin and hair pigmentation, UV-induced skin damage, and cutaneous malignant melanoma. The red hair color (RHC) phenotype is due to the production of more pheomelanin than eumelanin, and is usually a result of MC1R recessive alleles that impair the function of the receptor [57] [58]. The variants D84E, R151C, R160W and D294H are strongly associated with red hair and fair skin phenotypes, and are classified as high-penetrance R alleles, while the variants V60L, V92M, and R163Q have low penetrance in these features and are classified as r alleles [ [62]. The variants R142H and I155T are less frequent and have also been associated with RHC, based on findings of a strong family association. R142H shows an association with RHC that is similar to the other R alleles, while the association of I155T was low in a meta-analysis [63].
Additionally, some polymorphisms (V60L, D84E, V92M, R142H, R151C, I155T, R160W, R163Q and D294H) were identified as involved in elevated risk of the development of melanoma [ [68]. The available information in the NCBI and Uniprot databases about nsSNPs that are classified as pathogenic is listed in S2 Table. Of the polymorphisms characterized as RHC-associated or pathogenic in the dbSNP database, R142H, R151C, R160W and D294H were predicted as having damaging effects in 10 of the 11 programs, I155T in nine programs and D84E in seven programs (Fig. 3 and S2 Table). These six polymorphisms were classified as deleterious in the two consensus analyses (S1 Table).
The nsSNP R163Q was predicted as damaging in three programs, and V60L in two. The V92M mutation was classified as damaging only in I-Mutant 3.0. Those three nsSNPs were predicted as neutral in PredictSNP and PON-P consensus analyses.
Kanetsky et al. [69] found a concordance between the RHC categories of the MC1R variants and the prediction of damaging changes, by means of an evolutionary amino acid conservation approach using SIFT. The R alleles D84E, R142H, R151C, I155T, R160W and D294H were predicted to be intolerant, and the variants V60L, V92M and R163Q were predicted to be tolerant. Their categories defined by SIFT gave similar results in the analysis of association with phenotypes in relation to the literature classification in a Caucasian population. Zhang et al. [70] analyzed a set of 22 nsSNPs in MC1R with SIFT and Polyphen, and found that the two programs classified 11 as damaging, including the R variants.
The variation in the prediction results of nsSNPs indicated in the literature classification as major (R) and minor (r) alleles associated with the RHC phenotype [71] [72] [73] [74] [75] highlights the need for laboratory studies of the functional effects of the other nsSNPs predicted as damaging in the MC1R gene.

Conclusion
The analysis of the SNPs involved in the determination of variation in phenotypes or in complex diseases is a challenge that requires different approaches. Here, we used different methods to predict the most damaging mutations in the human MC1R gene, which encodes a key protein in the control of pigmentation in animals. Although some of the polymorphisms found in MC1R have been studied in the laboratory, many others have not yet been evaluated with respect to their possible damaging effects on protein structure and function.
The programs used here are based on evolutionary, structural and computational methods, gathering information on these different properties of the alterations caused by the mutations and predicting those that are most probably damaging or disease-associated. The analysis of the results demonstrated the association between the different methods employed, with the consensus tools supporting the strategies applied to the discrimination of the damaging from the neutral nsSNPs.
Our characterization of the nsSNPs as damaging or neutral, based on the combination of the tools, indicates differences in the damaging prediction of the RHC-associated alleles classified in the literature as high-penetrance (R) or low-penetrance (r) alleles, although it is not clear what mechanism or mechanisms are involved in the differences in the effects of these alleles. The nsSNPs selected as most probably damaging could be prioritized in further studies of the functional properties of the mutated receptor. In particular, the C273Y polymorphism, located in the third extracellular loop, was indicated as the most deleterious by different tools.
Finally, these results may contribute to the understanding of the variations in skin and hair phenotypes, and of the causes of complex diseases such as melanoma.

Supporting Information
S1 Table. Prediction results of the nsSNPs in the human MC1R gene. Results of the eleven individual tools and of the two consensus tools PON-P and PredictSNP 1.0. The nsSNPs in bold were selected by filter analysis. (DOC)
S2 Table. Information available about the MC1R nsSNPs. The data in the dbSNP (NCBI) and Uniprot databases about the nsSNPs classified as pathogenic and the alleles associated with the RHC phenotype in the literature. R: alleles with high penetrance; r: alleles with low penetrance in RHC. *: alleles with divergences in the RHC classification. (DOC)

Author Contributions
Conceived and designed the experiments: DH GLG TROF. Performed the experiments: DH GLG TROF. Analyzed the data: DH GLG TROF. Contributed reagents/materials/analysis tools: DH GLG TROF. Wrote the paper: DH GLG TROF. Selection of prediction tools: DH GLG TROF.