It is a commonly accepted belief that cancer cells modify their transcriptional state during the progression of the disease. We propose that the progression of cancer cells towards malignant phenotypes can be efficiently tracked using high-throughput technologies that follow the gradual changes observed in the gene expression profiles by employing Shannon's mathematical theory of communication. Methods based on Information Theory can then quantify the divergence of cancer cells' transcriptional profiles from those of normally appearing cells of the originating tissues. The relevance of the proposed methods can be evaluated using microarray datasets available in the public domain but the method is in principle applicable to other high-throughput methods.
Using melanoma and prostate cancer datasets we illustrate how it is possible to employ Shannon Entropy and the Jensen-Shannon divergence to trace the transcriptional changes progression of the disease. We establish how the variations of these two measures correlate with established biomarkers of cancer progression. The Information Theory measures allow us to identify novel biomarkers for both progressive and relatively more sudden transcriptional changes leading to malignant phenotypes. At the same time, the methodology was able to validate a large number of genes and processes that seem to be implicated in the progression of melanoma and prostate cancer.
We thus present a quantitative guiding rule, a new unifying hallmark of cancer: the cancer cell's transcriptome changes lead to measurable observed transitions of Normalized Shannon Entropy values (as measured by high-througput technologies). At the same time, tumor cells increment their divergence from the normal tissue profile increasing their disorder via creation of states that we might not directly measure. This unifying hallmark allows, via the the Jensen-Shannon divergence, to identify the arrow of time of the processes from the gene expression profiles, and helps to map the phenotypical and molecular hallmarks of specific cancer subtypes. The deep mathematical basis of the approach allows us to suggest that this principle is, hopefully, of general applicability for other diseases.
Citation: Berretta R, Moscato P (2010) Cancer Biomarker Discovery: The Entropic Hallmark. PLoS ONE 5(8): e12262. https://doi.org/10.1371/journal.pone.0012262
Editor: William C. S. Cho, Queen Elizabeth Hospital, Hong Kong
Received: December 13, 2009; Accepted: June 26, 2010; Published: August 18, 2010
Copyright: © 2010 Berretta, Moscato. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: The authors acknowledge the support of the Australian Research Council (ARC) Centre of Excellence in Bioinformatics, Hunter Medical Research Institute, The University of Newcastle, and ARC Discovery Projects DP0559755 (Evolutionary algorithms for problems in functional genomics data analysis) and DP0773279 (Application of novel exact combinatorial optimisation techniques and metaheuristic methods for problems in cancer research). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
In a seminal review paper published nine years ago, Hanahan and Weinberg  introduced the “hallmarks of cancer”. They are six essential alterations of cell physiology that generally occur in cancer cells independently of the originating tissue type. They listed: “self-sufficiency in growth signals, insensitivity to growth-inhibitory signals, evasion of the normal programmed-cell mechanisms (apoptosis), limitless replicative potential, sustained angiogenesis, and finally, tissue invasion and metastasis”. More recently, several researchers have advocated including “stemness” as the seventh hallmark of cancer cells. This conclusion has been reached from the outcomes of the analysis of high-throughput gene expression datasets , . The new role of stemness as a hallmark change of cancer cells is also supported by the observation that histologically poorly differentiated tumors show transcriptional profiles on which there is an overexpression of genes normally enriched in embryonic stem cells. For example, in breast cancer the activation targets of the pluripotency markers like NANOG, OCT4, SOX2 and c-MYC have been shown to be overexpressed in poorly differentiated tumors in marked contrast with their expression in well-differentiated tumors .
Other authors suggest different hallmarks, with many papers pointing alternative processes as their primary focus of their research. The difference may stem from the fact that these authors prefer to cite as “key hallmarks” physiological changes which occur at a “lower level” scale closer to the molecular events. These authors cite, for example, “mitochondrial dysfunction” ,  (including, but not limited to “glucose avidity”  and “a shift in glucosemetabolism from oxidative phosphorylation to glycolysis” , , “altered glycolysis” , “altered bioenergetic function of mitochondria” ), “dysregulation of cell cycle and defective genome-integrity checkpoints” , “aberrant DNA methylation”  (“promoter hypermethylation of hallmark cancer genes”  and “CpG island hypermethylation and global genomic hypomethylation” ), “shift in cellular metabolism” , , , “regional hypoxia” , “microenviroment acidosis” , “abnormal microRNA regulation” , , “aneuploidy” and “chromosome aberrations” , , , , , “disruption of cellular junctions” , “avoidance of the immune response” , “pre-existing chronic inflammatory conditions” , , “cancer-related inflammation” , “disabled autophagy” , “impaired cellular senescence” , “altered NF-kappaB signalling” , “altered growth patterns, not altered growth per se” , “disregulated DNA methylation and histone modifications” , “tissue dedifferentiation” , , and “somatically heritable molecular alterations” . This research enriches the list of the most important cancer hallmarks. However, these physiological changes occur at a “lower” molecular level they are likely related sub events of the orginial seven instead of newly discovered “key hallmarks”. More recently, Luo et al attempted a “stress-based” description of some of the hallmarks in terms of “stresses” (“DNA damage/replication stress, proteotoxic stress, mitotic stress, metabolic stress, and oxidative stress”) . While this is an interesting descriptive grouping, it is still a phenotypical characterization. What is needed is a higher level unifying genotypical characterization, from which individual disregulated processes can be identified in a quantitative way using the existing high-throughput data capture methodologies. It is clear that a unifying hallmark is needed if we aim at quantifying the cell's progression. It is then evident for us that a unifying mathematical formalism is necessary to uncover the cell transcriptome's progression from a normal to a more malignant phenotype.
We start our quest assuming an implicit working hypothesis common to many research groups around the world: the macroscopic physiological changes (i.e. Hanahan and Weinberg's “hallmarks”) must also correlate with global alterations of the molecular profiles of gene transcription. It is also assumed that the “hallmark changes” occur along a certain timeline, but that some of the sub-processes discussed before are concurrent. These processes may start in a slow incremental way with some of the major changes being early events while others (e.g. tissue invasion and metastasis) are likely later processes triggered by new events during cancer progression. The timeline is not explicit and it is also likely that cancer subtypes progress to similar timelines. In some cases the sequence of events are better understood (e.g. some leukaemia subtypes ). The elicitation and regulation of molecular events is likely to be an ongoing quest during this century for many types of cancer.
It is not to be assumed that some of the transitions of the transcriptome are gradual. That is a hypothesis that is unnecessary in this study. We envision that the progression of cancer may have “switches”, with a number of concurrent converging events leading to macroscopic observable changes in the gene expression profile resulting in dramatic variations of expression patterns. For instance, these molecular switches could not be characterized by an “oncogene” but by a large number of the genes that have changed its transcriptional state. These abrupt changes may be triggered by the confluence of several non-linear interactions, and are likely to be related to the physiological hallmarks we refer to above.
The presence of macroscopic observable changes that are computable from a large number of relatively smaller changes mean that it may be possible to find an objective mathematical formalism to infer the turning point at which these radical changes occur.
It is then evident that computing the Jensen-Shannon divergences, the Normalized Shannon Entropy, and the Statistical Complexity of samples reveal different global transcriptional changes. It is, however, not easy to infer if these changes would correlate with a gradual progression or sudden changes. However, one valid mathematical possibility is that the most important “hallmark of cancer”, a unifying principle above all, is the existence of a measurable gradual “progression” from a well-differentiated gene expression profile (corresponding to a healthy tissue). This would reveal the timeline of a higher level process that is observable and measurable via a change of Normalized Shannon Entropy and an increment of Jensen-Shannon divergences from the originating tissue type. If this is the case, by correlating the changes in Information Theory quantifiers with the expression of the genes we would be able to not only uncover useful biomarkers to track this progression but to explain the “hallmarks” in an ordered timeline. The timeline also yields clinical and translational important outcomes. Such analytical methodology will naturally produce “a continuous staging” of the cancer samples, based on a solid foundations of Information Theory, based on the knowledge of transcriptional profile of healthy cells as reference to measure divergences. In addition, as a mathematical methodology, it can be applied to other high-throughput technologies for which a probability distribution function of observed abundances has been computed.
With these ideas in mind, we provide a “transcriptomic-driven” method revealing important biomarkers for cancer progression a direction of time for which they are presented. The method, however, is generalizable to other type of high-throughtput techonologies (e.g. proteomic studies). We have chosen two types of cancers to study which are almost at the antipodes in terms of progression rates: prostate cancer and melanoma.
Prostate cancer progresses very slowly. Pathological samples are common in autopsies of men as young as 20 years old. By the age of 70 more than 80% of men have these alterations, a fact that already shows a relationship of this cancer type with increasing age. The clinical management of prostate cancer requires the identification of the so-called Gleason patterns in the biopsies , which after almost fifty years is still “the sole prostatic carcinoma grading system recommended by the World Health Organization”. However, undergrading, underdiagnosis, interobserver reproducibility and variable trends in grading have been observed as major problems , . Melanoma, on the other hand, differs from prostate cancer in its rapid progression  and it is considered one of the most aggressive types of cancer. One of melanoma's usual markers of progression and concern (i.e thickness) is measured in millimetres, which gives a rough idea of how devastatingly fast the disease can spread.
We will present our results starting with one prostate cancer dataset, followed by another in melanoma, to come back to the prostate cancer discussion using another highly relevant dataset. This is a departure from the alternative approach in which each disease is discussed in separate sections. However, after considering several possibilities, we are convinced that our approach is the most appropriate to showcase the technique and its power. Details on the datasets and methods used are given in the ‘Materials and Methods’ section of this paper. We also refer to the original studies and manuscripts associated to the three datasets we analysed.
Prostate Cancer – Lapointe et al.'s dataset (File S1)
The first dataset is the one from Figure one in Lapointe et al. . This data is available from http://microarray-pubs.stanford.edu/prostateCA/images/fig1data.txt and supplemen-tary material is also available from http://microarray-pubs.stanford.edu/prostateCA/.
In the original study, the authors used a cDNA microarray technology that allowed them to measure gene expression of several thousand genes on 112 samples, including 41 normal prostate specimens, 62 primary prostate tumours and 9 lymph node metastases. From that set, a subset of 5,153 probes were selected as differentiating prostate cancer samples from normal and metastases (this is the set from figure one in Lapointe et al.  and available at the web address given above). After imputation of missing values, we first calculated the Normalized Shannon Entropy and the MPR-Statistical Complexity for the each sample.
The flowing section explains the context in which our results were generated (refer to the ‘Materials and Methods’ section for detail on how our quantities are computed). The Normalized Shannon Entropy measure is widely used in ecosystem modelling to quantify species diversity, where it is acknowledge as having great sensitivity to relative abundances of species in an ecosystem . We utilise the same sensitivity to differentiate a samples in cancer datasets. Figure 1 shows that the Normalized Shannon Entropy of prostate cancer tumor samples do not differ much from normal samples. This is in contrast to lymph node metastasis samples that appear to have smaller values of Normalized Shannon Entropy.
Metastatic samples have typically lower values of Normalized Shannon Entropy than normal samples and prostate cancer primary tumors. The reduction in Normalized Shannon Entropy indicates that there exists a significant reduction on the expression of a large number of genes, or that the gene profile of metastatic samples has a more “peaked” distribution (due to the upregulation of a selected subset of genes). Both possibilities just cited are not mutually exclusive. We also note that neither the Normalized Shannon Entropy, nor the MPR-Statistical Complexity (as a single unsupervised quantifier), can help differentiate between tumor and normal samples, indicating that other Information Theory quantifiers are required for this discrimination.
A mathematical interpretation of this result is that the samples from lymph node metastases have cells that not only varied their transcriptomic profile, they have also “peaked” the distribution of expression values with significant fold increases on a smaller number of probes. This explains the reduction in Normalized Shannon Entropy. We note that there are several mechanisms that can explain a macroscopically observable global reduction of transcription. For instance, this may indicate that a relatively large number of genes have reduced their expression levels by genome damage, changes in gene regulation, or other silencing processes. It is reassuring to observe that the changes of the most prototypical quantitative measure we can draw from Information Theory, the Normalized Shannon Entropy correlate well with the transition between normal samples with to ones with metastases. However, it is also evident from that normal samples do not differentiate much from the tumor group (the Normalized Shannon Entropy values do not differ much). It is then not the number of genes with high expression values, but the change in the distribution of expression levels on the molecular profile, that can provide the other measure that could distinguish these other samples. This must be handled by the other statistical complexity measures to be discussed next.
Several statistical complexity measures can be defined which aim to clarify our argument. We will first discuss the results of computing the MPR-Statistical Complexity measure (in the previous figure the y-coordinates correspond to the MPR-Statistical Complexity values of each sample). The MPR-Statistical Complexity is proportional to both the Normalized Shannon Entropy associated to the transcription profile and the Jensen-Shannon's divergence between that probability density function and the uniform probability distribution. Again, we refer the reader to the ‘Materials and Methods’ section for an explanation of how these magnitudes are computed.
Although the results of using the MPR-Statistical Complexity might not seem particularly impressive, there are a few reasons why we introduce them at this stage. We want to illustrate a fact that can already be observed when we employ this measure on this dataset. In this dataset, for a given entropy value interval, normal tissue samples tend to have relatively lower MPR-Statistical Complexity values than tumor and lymph node metastasis. This means that both prostate cancer and metastases samples diverge from a “more uniform” distribution indicating that the distribution “peaks” in fewer active genes. It also means that, in terms of Jensen-Shannon's divergence, the transcriptional profile of a normal prostate cell sample is “closer” to a uniform distribution than to the one that is observed in a prostate cancer cell sample.
The reader will readily argue, and with reason, that the transcriptional profile of a normal cell is tissue-specific and that it hardly resembles that of a uniform distribution of expression values. That is correct and this observation motivates the introduction of two new statistical complexity measures. We generically call these two variants as ‘M-complexities’ (with ‘M’ standing for “modified”). They have the same functional form as the MPR-Statistical Complexity, but instead of computing the Jensen-Shannon's divergence from a uniform probability distribution we compute it against an ad hoc probability distribution functions derived from the data. In this sense, these measures are more supervised then the MPR-Statistical Complexity is. Another perspective is that the MPR-Statistical Complexity is a special case of this measure in which the ad hoc probability distribution function of reference is the equiprobability distribution. The relevance of this measure derives from being a general definition that allows accommodating several different reference states. We will use it to measure divergences to the “initial” and “final” transcriptomic states (two states of reference). Taken as computed averages over normal samples, and respectively metastatic ones, these measures will allow tracking the processes of differentiation of a cancer cell from a particular tissue type.
For example, using Lapointe et al.'s dataset, the M-Normal statistical complexity quantifier first requires the computation of the probability distribution function of the average gene expression profile of all normal prostate samples. Afterwards, the Normalized Shannon Entropy and the Jensen-Shannon's divergence of any sample profile will be computed using the divergence to that averaged normal distribution. Analogously, we compute the M-Metastases statistical complexity quantifier by first calculating the average profile of the metastases samples, and then generating the corresponding probability distribution function, finally computing the Jensen-Shannon's divergence with that profile. We refer to the ‘Materials and Methods’ section for details of the calculations.
The results can be observed in Figure 2. On the x-axis, the lymph node metastases have the largest values of M-Normal indicating a divergence from the normal profile. In addition, the M-metastases values of normal samples tend to be higher than most of the metastasis samples (with the exception of only one).
We have seen in Figure 1, that the Normalized Shannon Entropy and the MPR-Statistical Complexity differentiate the metastatic samples from the normal samples, but that these two measures can not help to discriminate the primary tumors from the normals. We show here the results of two statistical complexity measures which are in some sense supervised (i.e. dependent on the dataset being interrogated). We call these two stastical mesured M-Normal and M-Metastases. They have the same functional form of the MPR-Statistical Complexity, but they use the average normal and average metastatic profile as probability distribution functions of reference. As a consequence, the M-normal and M-metastases are directly proportional to the Jensen-Shannon divergences with the normal (and respectively with the metastatic) gene expression profile. It is remarkable that, although we are using these end processes only (from Lapointe et al's, dataset of 5,153 probes×112 samples), most of the primary tumor samples appear as a transitional state between the normal and metastatic group. This is remarkable since the primary tumor samples were not used to define the M-normal and M-metastases measures and, in principle, the samples could have been located anywhere in the (M-normal, M-metastases)-plane. Computation of correlations of the probe expressions values can help us identify genes which are highly correlated with a divergence from the normal expression profile and, at the same time, converge towards the average metastatic profile.
Figure 2 shows a gradual progression of the samples positions on this plane from a well-differentiated tissue type specific profile, first to a more heterogeneous primary tumor cluster, and finally to an even less differentiated metastatic profile.
The result presented in Figure 2 shows that the prostate cancer samples, which are not metastases and therefore could have been scattered anywhere on the plane, are clustered on a particular confined area between the two other groups. We understand that there are reasons to be sceptical about this result being not just a simple consequence of the gene selection process used by Lapointe et al. For example, if we assume that the 5,153 probes singled out by Lapointe et al. in their figure one of Ref.  (and that constitute our original data) have been selected with a supervised method that try to distinguish between normal and metastases, then the relative position of normal and metastases samples is perhaps something to be expected. However, even under that assumption, what is not expected is the position of all primary tumor prostate cancer samples, linking the normal cluster of samples with the metastases one. Note that the definition of both the M-Normal and M-Metastases measures do not use any information from the primary tumor prostate cancer samples, so the location of these samples between the normal cluster and the metastases, bridging them naturally is something to highlight. Together with Figure 1, it gives evidence that supports the working hypothesis that a gradual “progression” occurs, from the normal tissue specific profile to the metastasis one.
Indeed, following our line of argument, Figure 2 has even more relevance when we highlight the fact that the 5,153 probes have not been selected with a supervised method. The authors say that the only selection criteria was to single out the 5,153 cDNAs whose expression varied most across samples. In the supplementary notes of their paper the authors say: “We included for subsequent analysis only well measured genes whose expression varied, as determined by (1) signal intensity over background >1.5-fold in both test and reference channels in at least 75% of samples, and (2) 3-fold ratio variation from the mean in at least two samples; 5,153 genes met these criteria.” As a consequence, Figure 2 has been generated without class selection bias only using the genes that have the most varied expression pattern.
We now turn to another aspect of the statistical complexity and entropy analysis. We note that Figure 2 shows that the metastases samples have a clear reduction on Normalized Shannon Entropy in comparison with the values observed for the normal samples. At the same time, metastases samples, as expected, have higher M-normal complexity than the normal samples (Figure 2). It is then interesting to evaluate the value of the Jensen-Shannon divergence of these samples and to identify the genes that most correlate with the variations of Jensen-Shannon divergence to quantify one of the factors that is related to the statistical complexity changes.
We have computed the correlation of the gene expression profile corresponding to each of the 5,123 probes. For each of the 5,123 probes, we computed both the Pearson correlation (x-axis of Figure 3) and the Spearman correlation (y-axis of Figure 3) of each probe profile with the Jensen-Shannon divergence having as probability distribution of reference that of a metastasis profile (these values are called JSM2-Pearson and JSM2-Spearman in the accompanying Excel file provided). With this data, we have produced Figure 3, a scatter plot of the values associated to each probe. In this figure, there are two probes that are immediately recognizable by any cancer researcher, and in particular for those in prostate cancer: KLK3/PSA (Prostate Specific Antigen) and FOS.
We have computed the Pearson and Spearman correlation of each probe expression (across samples) with the Jensen-Shannon divergence of each of the samples with the average metastasis profile (these values are called JSM2-Pearson and JSM2-Spearman in the accompanying Excel file provided). One of the clinically most relevant markers for prostate cancer (KLK3/PSA) together with FOS, CCL2/MCP-1, SOX9 and a probe for LOC51334 (mesenchymal stem cell protein DSC54) appear with highly negative Spearman and Pearson correlations values, indicating that they are negatively correlated with the Jensen-Shannon divergence from the average metastatic profile. BRCA2 (highly regarded as a tumor suppressor in cancer research), FOXM1 (a putative regulator of the mitotic program and the control of chromosomal stability ), and CDKN2D (a CDK4 inhibitor) in opposition with KLK3/PSA, seems to be positively correlated. As will be seen later in the analysis of the melanoma dataset, these positive correlations with the Jensen-Shannon divergence from the average metastatic profile indicate a possible dysregulation of these critical processes for which these genes have key roles.
The interpretation of these scatter plots is not immediate and needs an introductory explanation. Each dot corresponds to one probe of the array. For example, a dot that is very close to the origin of coordinates (0,0) indicates a probe such that its pattern of gene expression (across all samples) is not correlated with the Jensen-Shannon divergence to the average profile of a metastasis pattern. It is, in essence, a probe which is highly uninteresting in this regard. Probes that have a high correlation, across all samples, either positive or negative with the Jensen-Shannon divergence to the average profile of a metastasis pattern are highly informative. They “co-express” with this measure.
Although we provide in the supplementary material the information corresponding to all probes, we will discuss just a few of them. This will allow the reader to understand these plots and will put our results in the perspective with current research in prostate cancer. We particularly highlight the position of KLK3/PSA, FOS and CCL2. To our surprise, we have found which is perhaps the most famous biomarker in prostate cancer KLK3/PSA (Kallikrein-related peptidase 3), probe G_914588 (correlations of −0.9312 and −0.9000 respectively). FOS and KLK3/PSA are the second and the fourth most negatively correlated probes in this ranking of all the genes in the microarray. With opposite signs for correlations are CDKN2D, FOXM1, and BRCA2. The following is a discussion of a selection of probes (highlighted in Figure 3) in the context of prostate cancer.
CDKN2D (Cyclin-dependent kinase inhibitor 2D, p19, inhibits CDK4).
One of the genes that has strong positive correlations is CDKN2D, (Cyclin-dependent kinase inhibitor 2D, p19, inhibits CDK4) (Pearson correlation of 0.7543, Spearman correlation 0.6833), probe G_145503. A gene that shows a positive correlation with the divergence of a metastasis profile indicates a gene that has a putative reduced expression on these samples. CDKN2D is a known regulator of cell growth regulator and controls cell cycle G1 progression , . Loss of CDKN2D in cancer cells is one event which is generally associated to a more malignant phenotype.
Another probe that presents positive correlations is FOXM1 (Forkhead box M1), with Pearson correlation of 0.7039 and Spearman correlation 0.7500), probe G_564803. It has been recently shown that the depletion of FOXM1 still allows cells to enter mitosis but they are unable to complete cell division. As a consequence this leads to mitotic catastrophe or endoreduplication . FOXM1 is considered a key regulator of a transcriptional cluster which is that is essential for proper execution of the mitotic program and the control of chromosomal stability .
BRCA2 - (Breast cancer 2, early onset).
Another gene with positive correlations is BRCA2 (Breast cancer 2, early onset), probe G_193736, with Pearson correlation of 0.8161 and Spearman correlation 0.7333). While the loss of BRCA2 function and its consequences in prostate cancer is being reconsidered , , , , BRCA2 is generally regarded as a “tumor suppressor”, with an established role in maintaining genomic stability via its function in the homologous recombination pathway for double-strand DNA repair. This result is supporting its proposed function. Loss of BRCA2 function is thus a warning sign of the existence of error prone cell processes. In prostate cancer BRCA2 has been associated to promotion of invasion through upregulation of MMP9 . BRCA2 loss of function due to mutations is linked to poor survival in prostate cancer  and rare germline mutations have been associated with early-onset of prostate cancer .
CCL2/MCP-1 (chemokine (C-C motif) ligand 2).
Bone is one of the most common sites of prostate cancer metastasis; close to 85% of men who die of prostate cancer have bone metastasis . The successful metastatic process to bone follows from the activation of osteoclasts with bone resorption, which in turns leads to the release of different growth factors from the bone matrix . CCL2 has been previously reported as expressed in human bone marrow endothelial cells; the CCL2 stimulation promotes prostate cancer cell migration and proliferation ,  and it has been proposed as a paracrine and autocrine factor for invasion and growth of prostate cancer . As a consequence of this central role in the tumor microenvironment, CCL2 is being the object of several studies and is included in the list of potential targets for novel therapies , , , , , , , , , .
FOS (V-fos FBJ murine osteosarcoma viral oncogene homolog).
A probe for FOS (G_811015; correlations of −0.9380 and −0.9500 computed with Pearson and Spearman) has a similar correlation than KLK3/PSA. The high rank of FOS was unexpected, but perhaps it is less of a surprise for some experienced researchers in prostate cancer as its role has been highlighted in the past , , . Amplification of members of the MAPK pathway was associated with androgen independent prostate cancer, and co-expression of RAF1, ERBB2/HER2 and c-FOS would lead to this phenotype .
We will not discuss in depth the known relationships between FOS, Lamin A/C and prostate cancer. We leave this discussion for later, as Lamin A/C will also appear in our study of the other prostate cancer dataset studied in this paper. Lamin A/C appears as a member of a set of genes with reduced expression for higher grade primary prostate cancer samples (note that the current analysis that gave FOS as a biomarker is on lymph node metastatic samples like here). However, we would like to point out a connection that is currently hypothesized between Lamin A/C and FOS, the gene we have just discussed. Ivorra et al. have recently proposed that “lamin A overexpression causes growth arrest, and ectopic c-Fos partially overcomes lamin A/C-induced cell cycle alterations. We propose lamin A/C-mediated c-Fos sequestration at the nuclear envelope as a novel mechanism of transcriptional and cell cycle control” . In addition: “c-Fos accumulation within the extraction-resistant nuclear fraction (ERNF) and its interaction with lamin A are reduced and enhanced by gain-of and loss-of ERK1/2 activity, respectively.” . These novel interactions between LMNA and FOS, their putative role in prostate cancer metastasis and their seemingly different behaviours in prostate cancer lymph node metastases warrant further investigation.
SOX9 (SRY (sex determining region Y)-box 9).
This transcription factor has been recently identified as having an importat role during embryogenesis and in the early stages of prostate development ,  and in testis determination , processes that link SOX9 upregulation to cancer development . Basal epithelial cells do express SOX9 in a normal prostate. While there exists no detectable expression in lumina epithelial cells, SOX9 has already been reported as “expressed in primary prostate cancer in vivo, at a higher frequency in recurrent prostate cancer and in prostate cancer cell lines (LNCaP, CWR22, PC3, and DU145)” . Wang et al., also in  add that: “Significantly, down-regulation of SOX9 by siRNA in prostate cancer cells reduced endogenous AR protein levels, and cell growth indicating that SOX9 contributes to AR regulation and decreased cellular proliferation. These results indicate that SOX9 in prostate basal cells supports the development and maintenance of the luminal epithelium and that a subset of prostate cancer cells may escape basal cell requirements through SOX9 expression.” An increased value of SOX9 expression in advanced prostate cancer has been associated to tumor progression and the epithelial-mesenchymal transition . SOX9 expression has been associated with a putative subgroup of prostate cancer , associated to lymph-node metastasis (as seems to be the case in this dataset) and has a know role in chondrogenic differentiation processes .
KLK3/PSA – (Kallikrein-related peptidase 3)/Prostate Specific Antigen.
To finalize our initial discussion on this dataset, we address KLK3. The high ranking of KLK3/PSA in our list is perhaps one of the most remarkable retrodictive outcomes of our approach. KLK3/PSA (also known as Prostate Specific Antigen) is a conspiquous member of our top rank list. It is perhaps the best blood biomarker for prostate cancer screening. Its relevance and popularity as a target of studies is so wide that it makes unfeasible any serious attempt to uncover its relevance in the prostate cancer literature. A search using PubMed using the keyword ‘KLK3’ (and the other alias names of this gene) reveals a total of 11,429 published papers. Of course, many of these publications relate to its role for early screening, but in this study we are uncovering its role as a tissue biomarker. Our results echoes a recent contribution by S. Miyano's and his collaborators  on a massive meta-analysis of microarray datasets. It is also in line with results from clinical studies that indicate that a 5-year PSA value is useful for predicting prostate cancer recurrence. Stock et al. recently concluded that “patients with a PSA value <0.2 ng/mL are unlikely to develop subsequent biochemical relapse”. Denham et al., studying data from radiation-treated patients on the TROG 96.01 clinical trial, found that on 270 patients there were two distinct “PSA-signatures”. These two different dynamical patterns (characterized as “single exponential” or “non-exponential”) stratified the population. Those patients in the second group (50% of the total) “had lower PSA nadir (nPSA) levels (p<.0001), longer doubling times on relapse (p = .006) and significantly lower rates of local (hazard ratio [HR]: 0.47, 95% confidence interval [0.30–0.75], p = .0014) and distant failure (HR: 0.25[0.13–0.46], p<.0001), death due to PC (HR: 0.20[0.10–0.42], p<.0001) and death due to any cause (HR: 0.37 [0.23–0.60], p<.0001)” . Certainly the dynamics of PSA, now perhaps with FOS and SOX9 added to the set of biomarkers of interest, warrant further investigation for patient population stratification after initial treatment.
The biomarkers discussed in this section warrant further investigation in prediction of lymph-node metastasis and clinical management of prostate cancer , , , , , , , , , , , , , , , , , , , , , , , . We refer the reader to the Supplementary Material to have a complete list of probes and their correlations with the Information Theory quantifiers.
Melanoma – Haqq et al.'s dataset (File S2)
The following sections present the results that we obtained with a melanoma dataset. Our aim is to observe if variations of the Normalized Shannon Entropy and the statistical complexity measures, MPR-complexity and the modified forms M-normal and M-metastases, provide interesting results in a different disease and experimental setting.
In this case we have selected a gene expression dataset from Haqq et al.  containing information of 14,772 cDNAs in 37 samples (Figure two from the ). The 37 samples include 3 normal skin, 9 nevi, 6 primary melanoma and 19 melanoma metastases. This datasets has more phenotypical characteristics for the group of samples.
After an initial process of data cleaning, we removed 35 probes which had an unsually high expression value on only a few samples, in some cases on a single one. The dataset we work with from original contributed by Haqq et al.consists of 14,737 probes. First, we computed the Normalized Shannon Entropy and the MPR-Statistical Complexity for each sample (refer to the ‘Materials and Methods’ section for a detailed presentation of these calculations). Figure 4 shows the values of these quantifiers for each sample.
It presents the MPR-Statistical Complexity of each sample as a function of its Normalized Shannon Entropy. This dataset contains information of 14,737 probes and 37 samples. The samples include 3 normal skin, 9 nevi, 6 primary melanoma and 19 melanoma metastases (these samples are 5 of melanoma metastasis ype I and 14 of type II, as labelled by Haqq et al). Following Haqq et al's original classification, the two types of melanoma metastases they identified are presented with different color coding. The plot illustrates that in this case, the Normalized Shannon Entropy does not help to differentiate the normal to metastatic progression (as it happened in the case of prostate cancer). We will show in Figure 5 that the modified statistical complexities M-skin and M-metastasis allow visualizing a clearer transitional pattern.
We first observe an important difference between Figure 1 and Figure 4. In this melanoma dataset, neither the use of the Normalized Shannon Entropy nor the MPR-complexity helps to discriminate between normal skin, nevi, primary and metastastic melanomas. Nevertheless, we decided to present this figure for methodological reasons. We envision that some researchers will calculate the Normalized Shannon Entropy and MPR-complexity using all the probes. We note that in Figure one of Haqq et al's original paper, the whole probe set was previously filtered by selecting those which vary across samples, thus indicating that they may have information about disease subtypes (although the phenotypic types were not biasing the selection). In this case we want to illustrate both the Normalized Shannon Entropy and MPR-complexity calculated using all the probes does not give the expected benefits. We will now see the benefits of using the M-complexities.
As we did for prostate cancer (see Figure 2), we aim at identifying if the use of the modified forms of the statistical complexity (the M-complexities) could give some insight where the Normalized Shannon Entropy and MPR-complexity measures fail. To compute the M-normal measure, we need to define the average gene expression profile for a normal cell (which we call Pave). We thus resort to the three normal skin profiles and we produce the average based on these profiles (details for computing the average profiles are given in the ‘Materials and Methods’ section). We call M-skin the resulting measure that relies on this profile. Analogously, we need to compute a pattern for M-metastasis, and we proceed to calculate the Pave profile averaging over the 19 metastases samples. The result is encouraging, as samples plotted in the (M-skin, M-metastasis)-plane cluster in groups, showing an important M-skin complexity transition between normal skin cells and nevi. Most importantly, this method naturally shows that some of the metastatic samples have a large value of M-skin complexity, so we present the results of another experiment, aimed at clarifying this fact.
In their original publication, Haqq et al. classified the melanoma metastases in two groups due to their molecular profiles: five samples were classified as ‘Type I’ and fourteen as ‘Type 2’ based on a hierarchical clustering approach. Our result reinforced the view that the Type II melanomas metastasis is a pretty homogeneous group, we will present the results on the (M-skin, M-metastasis I)-plane. This means that now the Pave profile will not be obtained by averaging over the 19 metastases samples, but instead using only the 14 samples which have been labelled as ‘Type II’. As such, we aim at revealing if Type I samples are indeed different in this plane, and if other clusters are also present.
Figure 5 presents the results. The first fact worth commenting is the pronounced gap between normal skin samples and the nevi, primary, and metastatic melanoma samples as revealed by the M-skin measure. Note also that the M-skin is based on the average profile that of the normal samples, which indicates that no information about the profiles of metastasis are used, yet M-skin reveals that increasing values of this measure may be linked with a ‘progression’ from nevi to primary and metastasis melanoma profiles.
This is the same set of samples of Figure 4 and we have used the same color coding. We are now using the modified statistical complexity measures M-skin and M-metastasis II. As expected, normal skin samples (in green) have a low value of the M-skin measure. Interestingly, most of the nevi samples (in yellow) have an intermediate value of the M-skin measure, and most of the primary and metastatic samples have even larger values of M-skin. This result, together with our observation and analysis of Figure 4, indicate that the Jensen-Shannon divergence of melanoma samples from the normal skin profile may be a relevant measure to quantitatively analyse progression even when the whole gene expression dataset is used. We observe that, although the M-metastasis II measure has used all the samples labelled as Type 2 (in Haqq et al.'s original contribution), their position in this plane shows two different clusters. This may indicate that a further heterogeneity may exist in this subgroup, a fact that warrants further study with a larger group of samples.
We now introduce another useful technique to identify genes which correlate with the transitions. The challenge is to find genes which are related with the progression towards metastases profiles, even when we recognize that there the group of metastasis samples is heterogeneous (containing at least two groups). Since the final outcome of Figure 4 and Figure 5 is that the Normalized Shannon Entropy does not help much in this experimental scenario, we will concentrate only on one of the multiplicative factors of the M-complexities, the Jensen-Shannon divergence. We compute two Pave profiles, one with the normal skin samples only, and the other with all the metastasis samples (regardless their type). We will call the two divergences JSM0 and JSM5 respectively. We then compute the Spearman correlation of the profile of all gene probes in the array across the 37 samples to both JSM0 and JSM5. We have listed all probes according to the absolute value of the difference of these correlations, i.e. Abs. Diff. (probe) = |JSM0(probe)−JSM5(probe)| in decreasing order. The results are provided as Haqq-PLoSONE-SupFile.xls, in the sheet labelled ‘Results-correlation’.
The rationale is to identify those probes which are highly correlated (both positively or negatively) with the Jensen-Shannon divergence of the normal tissue profile and that “reverse signs”. For instance, a probe for the TP63 gene (Tumor protein p63, keratinocyte transcription factor KET), AA455929, is ranked in the third position. Its correlation with the Jensen-Shannon divergence of the normal skin type is relatively high and negative (JSM0 = −0.63632) while at the same time is has a positive correlation with the Jensen-Shannon divergence of the metastasis profile (JSM5 = 0.62138). In the ranking, the first probe that presents the opposite behaviour is one for ADA (Adenosine deaminase), AA683578. Figure 6 helps to understand the relationship of these correlations with expression. Not only are these genes well correlated with the divergences, they also seem to be good markers of the progression from one tissue type profile to the metastasis profile.
We have computed the Jensen-Shannon divergence of each sample with the normal skin average. We then computed the correlation of each individual probe expression with the Jensen-Shannon divergence of each sample. As this correlation is computed on all samples, the resulting value (x-axis) was denoted as JSM0A-Spearman. Analogously, we compute the Jensen-Shannon divergence of each sample with the average metastastic profile and we also compute the correlation of each probe with this measure (y-axis). The position of one probe corresponding to the TP63 gene (Tumor protein p63, keratinocyte transcription factor KET), AA455929, is highlighted. The expression of this probe has a relatively high negative correlation with the Jensen-Shannon divergence of the normal skin type (JSM0-Spearman = −0.63632) while at the same time is has a positive correlation with the Jensen-Shannon divergence of the metastasis profile (JSM5 = 0.62138). The first probe that presents an opposite behaviour is one for ADA (Adenosine deaminase), AA683578. Probes for SPP1 (Secreted phosphoprotein 1 or Osteopontin) and PLK1 (Polo-like kinase 1 or Drosophila) are also highlighted. While PLK1 is currently less recognized as a biomarker in melanoma research, the importance of SPP1 in cutaneous pathology , , ,  and in particular in melanoma , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,  is increasing. Using a 5-biomarker panel that included SPP1, Kashani-Sabet et al. used tissue microarrays on 693 melanocytic neoplasms to show that SPP1 expression collaborates significantly improving the detection of high percentage of melanomas arising in a nevus, Spitz nevi, dysplastic nevi and misdiagnosed lesions . Like in the case of prostate cancer (Figure 3, in which KLK3/PSA - Prostate Specific Antigen was highlighted), our method allows the detection of important biomarkers with a high degree of concordance with current biological understanding of metastatic processes.
We will now discuss three of these genes in the context of current biological knowledge on melanoma drivers and metastatic progression. We provide many references for one of them, SPP1 (Secreted phosphoprotein 1 or Osteopontin). The discussion on this gene will be left for later, when we will discuss specifc oncosystems related to cell proliferation, chemotaxis and responses to external simulus. Figure 7 shows the expression of ADA (Adenosine deaminase, AA683578) as a function of TP63 (keratinocyte transcription factor KET, AA455929). All normal skin samples, as well as nevi and a couple of primary melanomas have relatively low values of ADA but they express TP63. There is a change of roles in metastatic and some primary melanomas, which have reduced TP63 expression but increased values of expression of ADA. As we will later see, these events correlate with other major transcriptional modifications which involve dozens of genes and that we have been able to map thanks to functional genomics bioinformatics tools. The role of SPP1 will be discussed in that context after some references to TP63, ADA, and PLK1 which follow.
All the samples that have TP63 expression are normal or nevi, with two primary melanomas still preserving TP63 expression but with higher ADA. The trend reverses for the rest of the primary melanoma samples and the metastatic ones, which all express ADA but not TP63.
The product of this gene ,  belongs to the same protein family of its more famous relative, TP53, a gene that is often mutated in human cancers  and highly regarded as a key “tumor suppressor”. TP63's product, p63, is a homologous protein to p53, which is considered to be phylogenetically newer  and also regarded as an important apoptotic and cell-cycle arrest protein. Mice that lack TP53 are born alive with a propensity for developing tumours; mice that lack TP63 do not appear to be tumour prone, although, new results are partially contradicting earlier findings . It appears that the diverse roles of the isoforms of the p63 family reveal that there exists a crosstalk with the different isoforms of the p53 family that needs to be systematically investigated . It has recently been shown that p63 is a key regulator of the development of stratified epithelial tissues  and that its deletion results in loss of stratified epithelial and of all keratinocytes . Melanocytes also express two isoforms of p63 , but p63 expression is not reported in 57 out of 59 tumors in a tissue microarray study performed by Brinck et al. . It is clear that the the role of loss of expression of TP63 in melanoma warrants further investigation.
ADA - (Adenosine deaminase) and DPP4/CD26 (Dipeptidyl-peptidase 4, CD26, adenosine deaminase complexing protein 2).
A link between TP63 and ADA has already been reported in the literature. ADA is a gene involved in cell division and proliferatation  and it has been suggested to have a regulatory role in dendritic cell innate immune responses .Translational modification is also a function of p63. Sbisa et al. have proved that ADA is a direct target of isoforms of p63, which is an important discovery as ADA has two TP53 binding sites, leading to a complex metabolic balance due to the different relationships between this trio and p21 yet to be completely elicitated , . Several studies indicate elevation of adenosine deaminase levels in sera of breast , head and neck , colorectal , acute lymphoblastic leukaemia  and laryngeal cancers .
We observe a marked increase of expression of a probe for ADA with melanoma progression while at the same time we observe a loss of expression of a probe corresponding to DPP4/CD26 (Dipeptidyl-peptidase 4, CD26, adenosine deaminase complexing protein 2), a membrane-bound, proline-specific serine protease  that has been attributed tumor suppressor functions . It has been previously reported that loss of DPP4 immunostaining helps to discriminate malignant melanomas from deep penetrating nevi, a variant of benign melanocytic nevus  and early reports of their absence in metastatic melanomas exist , . As deep penetrating nevi can mimic the vertical growth phase of nodular malignant melanoma, and ADA could potentially be downregulating DPP4 ,  we believe that the elicitation of the complementary role of these two biomarkers to distinguish these two entities is necessary and also warrants further clinical studies.
PLK1 (Polo-like kinase 1 (Drosophila)).
Another probe for gene that ranks high as a positive marker of metastasis is PLK1, Polo-like kinase 1, Serine/Threonine protein kinase 13 (AA629262). PLK1 is a centrosomal kinase  which is regarded as being linked to centrosome maturation and spindle assembly . PLK1 expression has also been singled out as a biomarker of a “death-from-cancer” signature, sharing with others the function of being an activator of mitotic spindle check point proteins. With other proteins it would has a stem cell-like expression profile phenotypically characterized by enabling metastasis with anoikis resistance and disregulated cell-cycle control . PLK1 inhibition could be a common target for gastric adenocarcinoma , bladder cancer , colon cancer , , hepatocellular carcinoma , medullary thyroid carcinoma , esophageal cancer , pancreatic cancer  and in some types of non-Hodgkin lymphomas  and breast cancer .
PLK1's Spearman correlation with the values of the Jensen-Shannon divergence of samples with the normal skin profile is relatively high (0.5863). PLK1 also has a high value of (negative) Spearman correlation with the values of the Jensen-Shannon divergence of samples with the average metastatic profile (−0.44571). In 2002 Kneisel et al. have conducted a study to investigate the expression of PLK1 in very thin melanomas (smaller or equal to 0.75 mm). On 36 patients, within five-years of follow-up, 22 melanomas developed metastases while 14 did not. In the comparison, it was found that metastatic malignant melanomas with expressed PLK1 at markedly elevated levels (median, 60.00% vs. 37.98%; p-value<0.000053), concluding that PLK1 is a reliable biomarker for patients at high risk of metastases, even when the most important prognostic clinical factor (Breslow's maximum thickness of the primary malignant melanoma) indicates the contrary . We consider this an important finding as PLK1 silencing is already part of an integrated oncolytic adenovirus approach currently being studied in mice models of orthotopic gastric carcinoma  and has promise due to the lack of a reported measurable immune response of siRNA-based therapeutics . Another positive note is the less sensitivity to PLK1 depletion of cells with a functional p53 , , and can help to sensitize cells to chemotherapy (as observed in lung cancer ). This constraint of aneuploid cancer cells to PLK1 expression, particularly in cells with inactivated p53 , could be exploited by lentivirus-based RNA interference .
Correlation analysis with Jensen-Shannon divergences reveals biomarkers for loss of cell adhesion, cell-cell communication, impairment of tight junction mechanisms and dysregulation of epithelial cell polarity.
As discussed before, the probe for ADA (Adenosine deaminase) is the first that has a different trend. Since we put all metastasis samples together in the same group when we calculated the average probability profile (and we have a heterogeneous group) we have on our ranking 58 probes that appear before ADA (we refer to the Supplementary File Haqq-PLoSONE-SupFile.xls). An analysis using GATHER (http://gather.genome.duke.edu/)  to interpret the collective influence of the lack of expression of all these genes in the metastasis samples reveals an interesting new perspective. Using Gene Ontology, we found that six of the 44 genes identified by GATHER are related to epidermis development (CDSN, DSP, EVPL, GJB5, KRT13, KRT5), p-value<0.0001, Bayes Factor 16, and eight genes are related to cell adhesion (CDSN, CLDN1, DSG1, DST, LGALS7, LRIG3, PCDH21, PKP1), p-value<0.0001, Bayes Factor 7. ANK1 (Ankyrin 1, erythrocytic), AA464755 was also singled out as by our Gene Ontology analysis as related to the maintenance of epithelial cell polarity (p-value = 0.002, Bayes Factor 3). The use of another profiler of genome signatures (g:Profiler, ) also reinforces the view that many genes that have lost expression are related to ‘Epidermis Development’ (COL17A1, DSP, EVPL, GJB5, KRT13, KRT5, LCE1C, MAFG, TGM3) with p-value = 7.78E-11. Thirteen are associated with Gene Ontology function of cell communication (ANK1, CDSN, CLDN1, DSG1, DST, GCHFR, GJB5, GPR115, LGALS7, LRIG3, PCDH21, PKP1, PTGER3), albeit with a p-value of only 0.02. GCHFR is also involved in nitric oxide metabolism.
If we add to the list of 44 genes already recognized by GATHER the other 77 probes that after ADA in this ranking have also loss of expression (until we found PDXP (Pyridoxal (pyridoxine, vitamin B6) phosphatase), the evidence is stronger, now COL7A1, GJB5, KLK4, and KRT1 also is in this group (the Bayes factor of this association returned by Gather is now 21 for the GO term ‘Epidermis development’). ‘Cell adhesion’ has now 13 genes, CDSN, CLDN1, COL7A1, DSC2, DSG1, DST, JUP, LGALS7, LRIG3, PCDH21, PKP1, SLIT3 THBS3 (p-value<0.001, Bayes factor 10). These results are considered statistically very relevant as identifiers of a particular process which seems to be undermined by this collective loss of expression.
If we put all this information together, we clearly observe a pattern of downregulation of gene expression that is associated with an impairment of epidermis development and the maintainance of its structure (Figure 8 and Table 1). This is, perhaps, an instantiation of one of the “extended hallmarks of cancer” (that of “tissue dedifferentiation”). This process includes the loss of function of genes that are essential for the maitainance of tight junction and epithelial cell-cell communication. While loss of epithelial structure is related to these genes, we observe that those that increase expression are associated to other developmental processes, not necessarily concerted in this panel. Instead they show a pattern of increasing cell motility, chemotaxis and positive regulation of cell proliferation. We will first discuss the processes related to the loss of adhesion, which could be linked to an increased probability of metastatic potential of these cells.
The average expression of the skin samples is shown in green. In yellow, the nevi samples, showing that some of them have a reduced average expression. The primary melanomas have a mixed behaviour (orange columns) with four of them having almost zero of negative average expression. The metastatic samples (columns in red) have all a negative average expression. Overall the figure indicates a progression, from the positive average expression of this gene panel for nevi and normal skin samples, towards negative expression values of the metastatic samples, “passing” through the mixed behaviour present in primary melanomas.
The loss of expression of Plakophilin 1, Junction plakoglobin, Desmoplakin and Desmoglein 1 indicate deficiencies in desmosome processes.
In general, this panel is composed of a number of genes that are losing expression during progression and that have Gene Ontology annotations related to tight junctions, gap junctions, adherens junctions and desmosomes, and an impaired set of processes that link, via intercellular channels and bridges, the cells of the epidermis. Mutations in these genes are linked to a number of skin genetic diseases , , , , , , , , , , , , , 
The desmosome are cell-cell adhesive junctions which provide a mechanical coupling between cells. These junctions are found in several epithelial tissues and the decreased assembly of the desmosome has been shown to be a common feature of many epithelial cancers , . Plakoglobin helps to connect transmembrane elements to the cytoskeleton . Plakophilin 1  (PKP1, one of the genes in our panel above) is a desmosomal plaque component  that stabilizes desmosomal proteins at the plasma membrane ,  and, with desmoplakin , recruits filaments to sites of cell-cell contacts . As a consequence, it has been proposed that the lack of PKP1 increases keratinocyte migration  and loss of PKP1 expression in head and neck squamous cell carcinoma and in esophageal squamous cell carcinoma may contribute to an invasive phenotypic behaviour , perhaps as a consequence of the impaired recruitment of desmoplakin.
The desmoglein-specific cytoplasmic region (DSCR) is the site of caspase cleavage during apopotosis and is a conserved region of yet undefined function and unknown structure, but it specifies the function of the desmoglein family of cell adhesion molecules (of which DSG 1 is a member). It has been recently shown that the DSCR has a weak interaction with PKP1, Plakophilin 1 (ectodermal dysplasia/skin fragility syndrome) and the cytoplasmic domain of Desmocollin 1 . Plakoglobin is cleaved by Caspase 3 during apoptosis . In addition, Kami et al. in Ref  also report and conclude that: “desmoglein 1 membrane proximal region also interacts with all four DSCR ligands, strongly with plakoglobin and plakophilin and more weakly with desmoplakin and desmocollin 1. Thus, the DSCR is an intrinsically disordered functional domain with an inducible structure that, along with the membrane proximal region, forms a flexible scaffold for cytoplasmic assembly at the desmosome”.
As previously discussed, all these genes progress towards a loss of expression, and they are highly correlated. Figure 9 shows the average expression of PKP1/Plakophilin 1 (ectodermal dysplasia/skin fragility syndrome), (NM_000299) and JUP, Junction plakoglobin, (BX648177) on the x-axis against that of DSP, Desmoplakin (NM_004415 Hs.519873) on the y-axis. Again, we see a clear pattern of progressive reduction of expression from normal skin and nevi (green and yellow, respectively), primary melanomas (in orange) and melanoma metastases (red).
The joint expression of the probe for PKP1 (Plakophilin 1 - ectodermal dysplasia/skin fragility syndrome - NM_000299) and the probe for JUP (Junction plakoglobin - BX648177), as added values on the x-axis, against the expression of the probe for DSP (Desmoplakin - NM_004415 Hs.519873) on the y-axis. There is a clear common downregulation trend of these biomarkers from the normal skin (Skin) to the nevi (MN) and to the primay melanoma and metastic melanoma samples (PM and MM respectively).
Joint loss of expression of Claudin 1 and members of the Aquaporin family are also linked to a transition to a more malignant phenotype
We note however, the Gene Ontology annotation is not the only way that we can make sense of this information. A detailed analysis of that list of 58 genes reveals other proteins involved in tight junction, like Aquaporin 3 (AQP3). Probes for AQP3 and Claudin 1 (CLDN1) have reduced expression with the progression of the disease as shown in Figure 10.
Other members of the aquaporin family of proteins have a similar behaviour. AQP3, together with CLDN1 are key components of the tight junction complexes of the epidermis and their joint loss of expression seem to be related to a transition to a more malignant phenotype. We use the same color coding as Figure 9.
AQP3 (Gill blood group) is a member of the aquaporin family of proteins, and currently is recognized as an ‘aquaglyceroporin’  of great importance to maintain skin hydration of mammals epidermis . Three proteins of this family (AQP1, AQP3, and AQP9) have probes that seem correlated with melanoma progression, all losing their expression in the process of going from normal skin to metastatic melanoma. AQP3 water channels have been pointed out as an essential pathway for volume-regulatory water transport in human epithelial cells . AQP3 is also selective for the passage of glycerol and urea and it has been suggested that osmotic stress up-regulates AQP3 gene expression in cultured keratinocytes . AQP3 was found to be the predominant aquaporin in human skin which increased expression and altered cellular distribution of AQP3 in eczema thus contributing to water loss . The putative involvement of aquaporins in the progression of melanoma, uncovered by our method in our results, warrants further investigation as it has been recently shown that another member of this family (AQP8) also facilitates hydrogen peroxide diffusion across membranes . It is suspected that AQP3 has other functions with a suggestion that it is involved in ultraviolet radiation induced skin dehydration . There is no probe for AQP8 in Haqq et al.'s dataset that we could scrutinize from its trend with progression but we note that a novel strategy for drug development for melanoma (i.e. Elesclomol) works by inducing apoptosis via a mechanism of elevation of reactive oxygen species (of course, including hydrogen peroxide in cancer cells) thus exploiting the “Achilles hell of cancer metabolism” .
Claudin 1, CLDN1 , a gene which is reported to be “normally expressed in all the living layers of the epidermis” , in concert with AQP3, is a key component of the tight junction complexes of the epidermis. Low CLDN1 gene expression was correlated with shorter overall survival in lung adenocarcinoma. Overexpression of CLDN1 was correlated with suppression of cancer cell migration, invasion and metastasis . Hoevel et al. report that re-expression of CLDN1, in breast tumor spheroids, induces apoptosis and they conclude: “These findings support a potential role of the tight junction protein CLDN1 in restricting nutrient and growth factor supplies in breast cancer cells, and they indicate that the loss of the cell membrane localization of the tight junction protein CLDN1 in carcinomas may be a crucial step during tumor progression” . Tokes et al.also report that malignant invasive breast tumors are negative for CLDN1 . As in breast cancer , in which reduced expression correlated with recurrence status, the low expression of CLDN1 and other tight junction proteins seems to contribute to cellular detachment.
The complementary set of correlations with the Jensen-Shannon divergences unveils biomarkers for cell proliferation, chemotaxis, and responses to external simulus.
If the use of Gene Ontology has produced very peculiar results, helping us to link the loss of expression of 44 genes with a significant change in epithelial structure and development. A natural question arises: “Which is the significance of another set, now arbitrarily chosen to be also of the same cardinality (i.e 44 genes) with the complementary behavioural pattern?” We have now listed all the probes according to Diff. (probe) = JSM0(probe)−JSM5(probe) in decreasing order. The results are provided as Haqq-PLoSONE-SupFile.xls (‘Results-correlation’ sheet). This now gives ADA as the first ranked gene. Again using GATHER  on the first 44 genes recognized by the software, and again using Gene Ontology, we observe as most important common function that of cell motility (CCL3, CXCL10, FPRL1, SEMA6A, SPP1), p-value = 0.0002, Bayes Factor 5, and chemotaxis (CCL3, CKLFSF7, CXCL10, FPRL1, SPP1), p-value<0.0001, Bayes Factor 7. The genes CXCL10, SPP1, and WARS, together with another gene that has been annotated as related to positive regulation of mitosis (SCH1), have also been annotated as regulators of cell proliferation (p-value = 0.007, Bayes Factor 2). Using the g:Profiler software , we obtain a complementary information. Sixteen genes (including SPP1, SEMA6A, LEF1 , CD230, ALS2CR2, DKK1, CYFIP2, SHC1, ANKRD7, IFI6, CITED1, and MID1) have been associated to the Gene Ontology term of ‘developmental process’.
SPP1 - Secreted phosphoprotein 1 (osteopontin).
SPP1 is one of the most conspicuous melanoma biomarkers , , , , , , , , , , , , , , , , , , , , , , , ,  (see also the references cited in Figure 6 and note its eminent position in this scatter plot). In 1990, Craig et al. reported that SPP1 may work as an autocrine adhesion factor for tumor cells (see also , , ). They observed that “SPP1 mRNA, which is barely detectable in normal mouse epidermis, was expressed at moderate-to-high levels in 2 of 3 epidermal papillomas and at consistently high levels in 7 of 7 squamous-cell carcinomas induced by an initiation-promotion regimen” . The evidence is being constantly expanded on the role of SPP1 as a molecular prognostic biomarker in melanoma . Activation of SPP1 may be an important event that allows the transformed melanocytes to invade the dermis as proposed by Geissinger et al. in 2002 . This causes SPP1 to avoid the apoptotic stimulus, one of the “hallmarks of cancer”, which invasive cells will be receiving from this new tissue.
If we extend the literature-based search so that we now include the first 200 gene probes recognized by GATHER then we have 27 gene probes associated with the Gene Ontology in terms of “cell proliferation” (p-value = 0.0002, Bayes Factor 5), and ‘regulation of cell proliferation’, p-value = 0.003, Bayes factor 3). However, other partners of PLK1 appear and their function in ‘mitotic cell cycle’ (p-value = 0.0003, Bayes Factor 5) is increasingly present (in particular, the M phase of the mitotic cell cycle). The details of the Gene Ontology terms which are significant and the genes associated to them are listed in Table 2.
The analysis using g:Profiler largely coincides with the analysis using GATHER, however, it retrieves 12 genes associated with the M phase of mitotic cell cycle, namely: AURKA and AURKB , , , BUB1 , , CDCA5A/Sororin/p35 , CDC7 , , CHEK1 , KIF23/MKLP-1 , , , MAP9/ASAP , , NCAPD3, NCAPG2 , NEK6 , , , , PLK1 , , , PTTG1/Securin , SHC1/p66 , ,  (discussed in the context of SHC4 signalling), and TFDP1/DP-1 . These are a significant finding by g:Profiler (p-value = 4.03E-07).
We have listed above some of the genes gene associated to the M phase of mitotic cell cycle and associated references which are either to current research in melanoma and/or its biological function. We now list other genes which have been associated with the term ‘cell proliferation’ by GATHER. These genes are: ARPC1B , ARPC2 (which, together with SPP1, is also in the novel 5-biomarker panel of Kashani-Sabet et al. ), BCCIP (BRCA2 and CDKN1A-interacting protein)/P21-and CDK-associated protein 1) , BST2/Bone marrow stromal antigen 2/Tetherin , CCL3/MIP-1alpha , , , CCT4, CDCA5/Sororin , , , , , CENPF/Mitosin , CXCL1/chemokine (C-X-C motif) ligand 1 (melanoma growth stimulating activity, alpha) , , , , , , , , , , , , , , , , , , , ,  (in uveal melanoma see ), CXCL10 , FLT1/VEGFR1 , , , , , , , , , , , , , FTH1/Ferritin Heavy Chain , ,  (which may indicate a necessary condition for the mainainance of iron sequestration and suppression of reactive oxygen species accumulation ), FPRL1, LIG3/DNA Ligase 3  (which, together with XPA and ERCC5 is associated to DNA repair in ionizing radition studies ), MCMDC1, PSEN2, NRP2/Neuropilin 2/Vascular endothelial cell growth factor 165 receptor 2 , , , SEMA6A (a member of the Semaphorin family, of increasing importance in cancer research , ,  and in particular due to its observed upregulation in undifferentiated embryonic stem cells ), SLAMF1/CD150 (a marker associated with hematopoietic stem cells ), SPP1/Osteopontin (which, together with ARPC2, is also in the novel 5-biomarker panel of Kashani-Sabet et al. ) , , , , , , ,, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , STK6 , , and WARS/Tryptophanyl-tRNA synthetise . Figure 11 shows a heat map of discussed gene probes annotated with functions on cell proliferation.
We have used the same convention we employed in Figure 8: in green, the normal skin samples; in yellow, the nevi samples; the primary melanoma samples (in orange) show increased expression for most of these biomarkers. This may indicate that the upregulation of genes involved in these processes is an earlier event (it occurs as a common feature in all the primary melanoma samples) while modifications to cell adhesion, cell-cell communication, tight junction mechanisms and epithelial cell polarity occur later (primary melanomas in Figure 4 show a transition). Finally, the metastatic samples (in red) show some heterogeneity, but overall provide an increased expression. The average expression of this panel could be a good indicator of the transition from nevi to a malignant phenotype, while the panel of Figure 8 can complement the information indicatingthe onset of tissue dedifferentiation processes.
The references provided next to each gene help to related these upregulated genes in the context of current research in melanoma or with the M phase of mitotic cell cycle, showing a high degree of correlation between our results and with published literature.
Prostate Cancer - True et al.'s dataset (File S3)
Another microarray dataset we have selected to evaluate for the relevance of transitions of Normalized Shannon Entropy and Statistical Complexity was contributed by True et al.  in 2006.
The original goal of True et al. was to identify a molecular correlate for Gleason patterns 3 and, if possible, the clinically most worrisome patterns 4 and 5. They partially succeeded by linking the expression of only 86 genes with Gleason pattern 3  using a standard statistical analysis. In this study, we eliminated sample 02-209C since data was acquired using a different platform and would not be useful for our analysis. The remaining thirty one (31) samples were assayed with the GPL3834 (FHCRC Human Prostate PEDB cDNA Array v4) platform using 15,488 probes. We also eliminated all the probes with missing values, remaining 13,188 probes.
We have first plotted the samples on the (Normalized Shannon Entropy, MPR-Statistical Complexity) plane (Figure 12). It was interesting to observe that there exists a high correlation between the two measures. Samples that are entirely composed of Gleason pattern 3 tend to have a greater value of Normalized Shannon Entropy than 0.985. We can also identify a cluster of samples that present Gleason patterns which are either 4 or 5. Note that there seems to be two outliers (02_003E and 03_063) to the generic trend of the other 29 samples. The two outliers are samples that correspond to samples labelled as having Gleason 3 patterns and both have unusually low values of Normalized Shannon Entropy that are well below the values of the rest of the group.
The dataset contains the expression of 13,188 probes and 31 samples. The samples include 11 samples labelled ‘Gleason 3’ (in green), 12 ‘Gleason 4’ samples, and 8 ‘Gleason 5’ (in red). Two samples seem to be outliers to a generic trend, which is somewhat expected. We do expect samples with a ‘Gleason 3’ label to have higher values of Normalized Shannon Entropy. This is indeed the case, no sample with a ‘Gleason 3’ label has a value of Normalized Shannon Entropy lower than 0.985, while 14 samples corresponding to samples which are either ‘Gleason 4’ or ‘Gleason 5’ have values smaller than that threshold. In agreement with some of the caveats discussed by True et al., there exist a group of samples that, irrespective of their label, have similar values of Normalized Shannon Entropy (near 0.992). Samples 02_003E and 03_063 seem to be outliers to this trend, and in the case of 03_063 the sample is not even close to a hypothetical linear fit which seems to be the norm for all the samples. Figure 13 will provide further evidence that may indicate that these two samples are outliers or not to the overall trend.
This raised a suspicion about the true nature of this phenomenon. If the labelling is correct, this may indicate a subsampled group of prostate cancer that has Gleason 3 pattern characteristics but very low entropy. Alternatively, it may indicate an experimental bias for reasons we can not explain with the available clinical information. In order to clarify the situation, and see if we can declare these two samples as outliers of the other group, we performed another experiment. We have now computed two modified complexities, which we will call M-Gleason 3 and M-Gleason 5 (Figure 13). The names are probably self-explanatory, but a brief reminder follows. To calculate the MPR-Complexity, by definition, we have used the equiprobable distribution as our probability distribution of reference (for the computation of the Jensen-Shannon Divergence of the gene expression profile to this distribution). In the case of the M-Gleason 3, the probability distribution of the reference is obtained averaging all the probability distributions of the samples that have been labelled as Gleason 3 (analogously, we calculated M-Gleason 5). Samples that have Gleason pattern 3 and 5 appear as separate clusters in the (M-Gleason 3, M-Gleason 5) plane with the two putative outliers of the general trend far apart (even if they have been used to calculate the average probability distribution function of the Gleason 3 pattern). Even samples with Gleason 4 pattern are located closer to samples of Gleason patterns 3, and 5, indicating that, perhaps, there exists a subsampled subtype of prostate cancer or there might be another experimental bias or factor that at present we can not resolve with the information we have for these samples. Consequently, we have decided to eliminate both samples (02-003E and PNA_03-063A) from further calculations. With these considerations, we now have a dataset with 13,188 probes and 29 samples as our dataset for further analysis.
We have used the same color coding convention we have used in Figure 12. We plot the values of two modified statistical complexities, which we will call M-Gleason 3 and M-Gleason 5. Instead of using the equiprobable distribution as our probability distribution of reference (for the computation of the Jensen-Shannon Divergence of the gene expression profile to this distribution), as required for the MPR-Statistical Complexity calculation, we used a different one. For the M-Gleason 3, the probability distribution of the reference is obtained averaging all the probability distributions of the samples that have been labelled as Gleason 3 (analogously, we calculated M-Gleason 5). This is analogous to our approach in melanoma (Figure 5) in which we used normal and metastatic samples as reference sets for a modified statistical complexity. We observe that, even in this case, 02_003E and 03_063 continue to appear as outliers. In addition to the evidence, we have observed that the deletion of these two samples did not significantly alter the identification of biomarkers.
Figure 14 shows the distribution of the samples using the Normalized Shannon Entropy and the MPR-complexity. By definition, the positions of the 29 samples in the plane do not change (this figure is basically “zooming in” one region of Figure 12 that contains these samples). We note again, however, that the 29 samples seem to be separating in three different clusters. Whether we can argue about the existence or not of these gaps in Normalized Shannon Entropy, it is clear that there seems to be a progression as we have seen with Lapointe et al's dataset. There is a group of three samples with Gleason pattern 3 that seem to have the the largest Normalized Shannon Entropy values. There is also a cluster that only contains samples of either Gleason pattern 4 and 5, all with Normalized Shannon Entropy values smaller than 0.985.
Due to the characteristics of this microarray dataset and the experiment setting, the Normalized Shannon Entropy correlates well with the established clinical notions of malignancy (high Gleason patterns). Most Gleason pattern 5 samples (in red) have lower values of Normalized Shannon Entropy than Gleason pattern 3 samples.
There is also very little variation (see Figure 15) of the positions of the 29 samples on the (M-Gleason 3, M-Gleason 5)-plane, indicating a degree of robustness that the computation of these modified complexities have, even in the presence of some outliers.
Correlations of the genes' expressions profiles across samples with the transitions of Entropy
After observing that Figure 14 shows a correlation of Gleason pattern score with Normalized Shannon Entropy, we asked ourselves: ‘which are the genes that most positively and negatively correlate with the transitions of Normalized Shannon Entropy?’ We have plotted Spearman versus Pearson correlation values of probe expressions to attempt to find those that best correlate, either positively or negatively, with the Normalized Shannon Entropy values of the samples. The results have revealed some of the most relevant biomarkers of progression, and some unexpected newcomers. Figure 16 shows the Pearson and Spearman correlations of all the 13,188 probes in the dataset with the Normalized Shannon Entropy values of the samples. We have highlighted some particular genes that are discussed below.
The identification of probes that best correlate, either positively or negatively, with the values of the Normalized Shannon Entropy of the samples highlights some of the most important biomarkers in prostate cancer, like CDKN2C, MAOA, CDK4, CDK7, AMACR, TP53 and BRCA1 (with an upregualtion trend from their normal expression values). The list includes others that present a downregulation from their normal values, like LMNA, CD40, and SFPQ. These genes are discussed in detail in the context of current prostate cancer research in the main text. This result has revealed some of the most relevant biomarkers of prostate cancer progression (AMACR, MAOA, CDK4, TP53, BRCA1, STAT3), and some unexpected new complementary biomarkers (i.e. SFPQ, CD40, STAT3, LMNA, CD59 etc).
CDKN2C (cyclin-dependent kinase inhibitor 2C (p18, inhibits CDK4).
When we compute the correlations of the probes expressions with the Normalized Shannon Entropy values of the samples, the gene that has the most negative correlations is CDKN2C (cyclin-dependent kinase inhibitor 2C - p18, inhibits CDK4 - NM_078626), which has been previously associated with the transition from prostatic intraepithelial neoplasia (PIN) to prostate cancer  (Spearman correlations with the Normalized Shannon Entropy range between −0.8010 and −0.7276 for all the probes for NM_078626 in this array). It has been recently argued that CDKN2C and PTEN partner in tumor suppression by constraining a positive regulatory loop between cell growth and cell cycle control pathways. Bai et al. reported that a “double mutant mice develop a wider spectrum of tumors, including prostate cancer in the anterior and dorsolateral lobes, with nearly complete penetrance and at an accelerated rate” . Using the cancer cell lines LNCaP, PC3, PC3M, PC3M-Pro4, and PC3M-LN4 and three immortalized prostate epithelial cell lines Wang et al. report hypermethylation of CDKN2C .
MAOA, monoamine oxidase A.
Four probes for MAOA (Monoamine oxidase type A), two for NM_000240 and two for BC008064, follow closely with CDKN2C (Spearman correlations with Normalized Shannon Entropy ranging between −0.7650 and −0.7202 echoing the interest of True et al. and other researchers on MAOA , , , ). Zhao et al. have recently reported that “MAO-A is also expressed in the basal epithelial cells of normal prostate glands. Using cultured primary prostatic epithelial cells as a model, we showed that MAO-A prevents basal epithelial cells from differentiating into secretory cells. Under differentiation-promoting conditions, clorgyline, an irreversible MAO-A inhibitor, induced secretory cell-like morphology and repressed expression of cytokeratin 14, a basal cell marker”. They also observed mRNA and protein expression of AR, the androgen receptor . Peehl et al. now report correlation of MAOA expression with the dedifferentiation process, with preoperative PSA levels and the percent of Gleason 4 and 5 cancers .
AMACR, Cyclin G2, CDK4 and CDK7.
Other probes that also have high negative correlations with the Shannon Normalized Entropy correspond to CCNG2 (Cyclin G2) CR598707, CDK4 (Cyclin-dependent kinase 4), CDK7 (Cyclin-dependent kinase 7, TFIIH basal transcription factor complex kinase subunit) , and AMACR (Alpha-methylacyl-CoA racemase), an “obscure metabolic enzyme (that has taken) centre stage”  as judged by the extraordinary convergence to this biomarker in prostate. We believe that our result is an important finding. AMACR was not judged of importance according to the methodology used in  and it was barely cited in that manuscript. Here we present results, from an unifying biological and informational principle, which allows (using Ref. 's own data) the identification of the most central current biomarker with a truly compelling body of support in independent studies , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,  that currently exists in prostate cancer.
TP53 and BRCA1.
There exist several studies linking two “tumor suppressors” BRCA1 and TP53, its expression, status and mutations, to prostate cancer progression , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , . BRCA1 is one coregulator of AR, the androgen receptor , , ,  and inhibits ESR1 (Estrogen receptor alpha) activity , . Knockdown of BRCA1 results in the accumulation of multinucleated cells, indicating that BRCA1 regulates gene expression of an orderly progression during mitosis , preserving chromosomal stability . BRCA1 showed decreased expression in a study involving immortalized prostate epithelial cells before and after their conversion to tumorigenicity . Lack of BRCA1 function may impair activation of STAT3 . Inactivation of TP53 by somatic mutations is also associated to the panel of disruptions which are common for this “tumor suppressor” . One possible mechanism for gene silencing is CpG island methylation. Rabiau et al.show in  that BRCA1, RASSF1, GSTP1 and EPHB2 promoter methylation is common in prostate biopsy samples. Mannicia et al. suggest that the mitochondrial localization of BRCA1 proteins may be a significant factor in regulating the mitochondrial DNA damage .
SFPQ - (Polypyrimidine tract-binding protein-associated splicing factor).
The most positively correlated gene with the loss of Normalized Shannon Entropy is SFPQ/PSF (Polypyrimidine tract-binding protein-associated splicing factor) (Spearman correlation of 0.7902), a multifaceted nuclear factor ,  which is also a putative regulator of growth factor-stimulated gene expression . This is extremely interesting as it has been recently shown that the AR/PSF complex interacts with human PSA gene and that PSF inhibits AR transcriptional activity . The loss of expression of SFPQ and other proteins that together regulate androgen receptor-mediated gene transcription  (see also , ) may indicate they have a role not only as a biomarker of the progression and well as transitions of the disease to androgen independence. In a study of human labor, Dong et al., also showed that SFPQ acts as a Progesterone Receptor corepressor, thus putatively contributing to the functional withdrawal of progesterone . We will return to this particular gene later on the ‘Discussion’ section as new evidence of its role in nuclear organization has been documented.
CD40 - (TNFRSF5, B-cell surface antigen CD40).
The loss of Normalized Shannon Entropy gives us several markers that indicate a de-differentiation from a epithelial basal phenotype and an increasing loss of control of cell cycle regulation (due to uncoordinated upregulation of CDK4, CDK7, CCNG2 with their functional partners). This poses the question: What can we observe while looking at the genes that most positively correlate with the loss of Normalized Shannon Entropy? We observe, second on the ranking of all samples, a probe for CD40 (TNFRSF5, B-cell surface antigen CD40), BX381481 with a Spearman correlation of 0.7616. Loss of CD40 expression has been previously reported in prostate cancer and it is the object of a study that attempts to establish dendritic cell gene therapies , , , , , , , , , , , , , , , , , , . We will continue discussing CD40 in the following subsection in concert with other genes.
Correlations of the genes' expressions profiles across samples with the MPR-Statistical Complexity
Another natural question can be asked: Which is the extra information that we can obtain the by analysing the correlations with the MPR-Statistical Complexity in this case? As we have discussed before, and can be appreciated from Figure 14, there is a strong correlation between the MPR-Statistical Complexity and the value of the Normalized Shannon Entropy. It appears in prostate cancer, as in this gene expression dataset, the reduction of Entropy is not the major factor responsible for the increase in MPR-Statistical Complexity. Again, it is perhaps better to now look at one of the multiplicative factors of the statistical complexity measure, the Jensen-Shannon divergence to the equiprobability distribution, as this is increasing the MPR-complexity.
We present more evidence of the case of CD40 as a biomarker, since a probe for CD40 (BX381481) ranks 6th (the Spearman correlation of the probe expression with the Jensen-Shannon divergence from the equiprobability distribution is −0.5764). CD40 is a member of the TNF receptor superfamily. Notably, in 56 out of 57 archival prostate cancer samples Palmer et al. have reported no CD40 expression . However, CD40 expression was present in normal prostatic acini, so they proposed that “invasive prostate cancer is a CD40-negative tumour” (see the previous results of Moghaddami et. al. ). Matching our observations, they proposed that CD40 provides “insight into progression of cancer from normal epithelium”; our proposed methodology is revealing this fact as well. Depletion of CD40 in the tumour microenvironment may be central in avoiding the action of the immune system , as prostate cancer induces a progressive suppression of the dendritic cell system . It is perhaps a central piece which should be put together in the context of other pieces of information coming from immunotherapy , , ,  and pharmacological studies  that warrant serious investigation towards the design of new and improved clinical studies , .
CD59 molecule, complement regulatory protein.
Four probes for protectin , , , CD59, with Spearman correlations with the Jensen-Shannon divergence from the equiprobable distribution, ranging from −0.61823 to −0.5089, rank between the 1st and 39th position (when we rank genes according to this correlation in ascending order). CD59 is an interesting gene as “a comprehensive investigation of CD59 expression in prostate cancer has not been conducted yet” . Like LMNA (which is ranked third and will be discussed later) the rank of CD59/protectin means that these genes progressively loose expression of these probes. CD59 is expressed in the prostatic epithelium  and in prostasomes ; secretory granules which are produced, stored and released by the glandular epithelial cells of the prostate . Babiker et al. concluded in  that prostasomes (via expression CD59) contribute to the protection of malignant cells from complement attack. We now investigate if the ratio of delta-catenin to CD59 can is a more robust biomarker for non-invasive prostate cancer detection, particularly after the results presented in . We also note that CD59 may be also relevant to reveal the heterogeneous nature of prostate cancer. Its correlation was good, but is not lower than −0.62, which in our experience, indicates that we may be dealing with at least two types tumors in this dataset. Indeed, Xu et al. obtained CD59 mRNA levels were determined by real-time PCR in matched (tumor/normal) microdissected tissues from 26 cases and they found that: “High rates of CD59 expression were noted in 36% of prostate cancer cases and were significantly associated with tumor pT stage (P = 0.043), Gleason grade (P = 0.013) and earlier biochemical (PSA) relapse in Kaplan-Meier analysis (P = 0.0013). On RNA level, we found an upregulation in 19.2% (five cases), although the general rate of CD59 transcript was significantly lower in tumor tissue (P = 0.03)” . They concluded that: “CD59 protein is strongly expressed in 36% of adenocarcinomas of the prostate and and is associated with disease progression and adverse patient prognosis” . Jarvis et al. have previously hypothesized that CD59 expression, in some cancer cells, may help to regulate the immunological response, protecting them from the cytolytic activity of complement  (see also , ).
LMNA (Lamin A/C).
The third probe in the ranking corresponds to a LMNA (Lamin A/C), AY528714. Mutations on LMNA have been linked at 10 different human diseases , . LMNA, due to its functions, could be involved in important cell fate decisions as lamins are involved in the organization of the functional state (and position) of interphase chromosome . Lamins are “scaffolders” for the function of nuclear processes such as chromatin organization, DNA replication, cellular integrity and transcription . As a consequence Lamins are involved in several clinical syndromes , , . Among the recent functions attributed to LMNA is as an intrinsic modulator of ageing within adult stem cells via a mechanism where LMNA act as signalling receptors in the nucleus. These observations correspond to Pekovic and Hutchinson who observed that dysfunction of LMNA leads to inappropriate activation of self-renewal pathways and initiation of stress-induced senescense . In lmna-deficient mouse embryonic fibroblasts (lmna(−/−) MEFs), the loss of lmna“dramatically affects the micromechanical properties of the cytoplasm”, since “Both the elasticity (stretchiness) and the viscosity (propensity of a material to flow) of the cytoplasm in Lmna(−/−) MEFs are significantly reduced” . Using ballistic intracellular nanorheology to evaluate the micromechanical properties of the cytoplasm of these cells, Lee et al. conclude: “Together these results show that both the mechanical properties of the cytoskeleton and cytoskeleton-based processes, including cell motility, coupled MTOC and nucleus dynamics, and cell polarization, depend critically on the integrity of the nuclear lamina, which suggest the existence of a functional mechanical connection between the nucleus and the cytoskeleton. These results also suggest that cell polarization during cell migration requires tight mechanical coupling between MTOC and nucleus, which is mediated by lamin A/C”  (see also , ). In addition to these very interesting findings, a functional association of LMNA and the retinoblastoma protein (pRB) exists. Nitta et al. have shown that pRB needs to be stabilized by LMNA for INK4A-mediated cell cycle arrest and that somatic mutations in LMNA may also have a role in tumor progression . In mammalian cells, LMNA a) colocalizes with c-FOS at the nuclear envelope, b) suppresses AP-1 through a direct interaction with c-FOS and, in LMNA-null cells perinuclear localization of c-FOS is absent (but it is restored when it is overexpressed, c) LMNA-null cells have enhanced proliferation . These results obtained by Ivorra et al. are giving the indication that of yet another mechanism of cell cycle and transcriptional control mediated by LMNA  (see also ). LMNA has also been proposed as an inhibitor of adipocyte differentiation . Hutchingson et al. have proposed the alias of “guardian of the soma” for lamins A and C as they seem to have “essential functions in protecting cells from physical damage, as well as in maintaining the function of transcription factors required for the differentiation of adult stem cells” .
NF-kappaB regulated genes reveal links to focal adhesion and ECM-receptor interaction and immune response disregulation
From our results, we can not completely establish if the downregulation of CD40 and CD59 are enough to pinpoint an impaired or abnormal immune response. If we continue the inspection of the list, the first 20 probes give us more supporting evidence. The 20 probes correspond to 13 different genes. Five of these 13 genes have Genome Ontology information annotated as “defense response”, the above mentioned CD59 and CD40 as well as IL4R (interleukin 4 receptor, CR616481), XBP1 (X-box binding protein 1, AK093842) and HLA-A (major histocompatibility complex class I HLA-A29.1, BU075230). Takahashi et al.  report an inverse correlation between XBP1 expression and histological differentiation in a series of prostate cancers without hormonal therapy, the expression of XBP1 was localized in epithelial and adenocarcinoma cells of the prostate and the majority of refractory cancer cases exhibited weak XBP1 expression), MST1/STK4 (along with MST2/STK3) act as inhibitors of endogenous AKT1, a mediator of cell growth and survival .
We can not yet know what reason is behind their joint downregulation, but another interesting common denominator is that 12 out of 13 genes share a regulatory motif for NF-kappaB (according to TRANSFAC, V$NFKB_Q6_01). A putative role for NF-kappaB in prostate cancer has been reported based on the observation of the centrality of NFKB on two up- and down-regulated networks compairing prostate tumors and healthy tissue  and in a larger study by McDonnel et al.  (255 core prostate cancer tissue microarrays from 47 prostatectomy specimens). Several other researchers are currently investigating different roles of the NFKB family in prostate cancer , , , , , , ,  and it could be a promising target for intervention , , , , , , , , , , , , , , , , . If we include other genes following the ranking order, the first 38 genes in the ranking include 33 that have the regulatory motif V$NFKB_Q6_01 (GATHER reports for this list a p-value of 0.0006). Even when we double the list to the probes that correspond to the first 76 different genes recognized by GATHER, 58 of them have the regulatory motif V$NFKB_Q6_01, with p-value = 0.003 (ATP6AP2, BCAT1, BTG2 , , , , , , , C14orf123, C18orf45, CCL2, CD302, CD40 (already discussed), CD59 (already discussed), CHI3L1, COL16A1, COMMD6, CRABP2, CSRP1, CTBP2, CTGF (Connective tissue growth factor, , , , ), DES, DMN, DNAJB1, EGF, EMP1, FHL2 , , , , , , GRIPAP1, GSTM1 , , HBEGF, IL4R, ITGA3, ITGA7, JUNB , , KIAA0152, KIAA1191, KIAA1324, KLF6, LAMB2, LMNA (already discussed), NFATC1, NFKB2, NUDC , P4HB, PDK2, PIM1, PISD, PXN, RAP1B, RNF40, SARA1, SEC61A1, SGTA , SLC12A2, SRD5A2, STAT6 , , TACSTD2, TBX1, TMED3, VPS39, WDFY3, XBP1 , ZAK). This result indicates that our results support the importance of NFkappa-B and the huge amount of research effort to understand the role of the NFkappa-B activity and its potential as a target for intervention in prostate cancer (File S4).
The group of 58 biomarkers contains one of particular interest, STAT6. This gene is considered a survival factor in prostate cancer and a key regulator of the genetic transcriptional program responsible for progression . STAT6 has been recently linked to HPN as one of the most robust pair of biomarkers for prostate cancer using an integrative approach that linked several microarray datasets .
Focal and cell adhesion modifications can be inferred by monitoring losses of a group of genes composed by EFG, Integrins, LAMB2, Paxillin and RAP1B
Analysis using GATHER of this group reveals that six of these 58 genes are in KEGG pathway path:hsa04510, Focal adhesion (EGF, ITGA3, ITGA7, LAMB2, PXN, RAP1B, p-value<0.0007) and from these there are three in pathway:hsa045122, ECM-receptor interaction (ITGA3, ITGA7, LAMB2, p-value<0.005) while four of these six are also in path:hsa04810: Regulation of actin cytoskeleton, (EGF, ITGA3, ITGA7, PXN, p-value<0.01).
Alterations of the gene profile of LAMB2 and CDKN2C/p18(Ink4c), a CDK4 inhibitor, have been reported on the transition from prostatic intraepithelial neoplasia (PIN) to prostate cancer  (see also ).
ITGA7 (integrin, alpha 7) and ITGA3 (integrin, alpha 3).
The contribution of the loss of these integrins and the subsequent derived impairment on cell adhesion has been reported in several tumours. Ren et al. in  report that “Focal or no integrin alpha 7 eexpression in human prostate cancer and soft tissue leiomyosarcoma was associated with a reduction of metastasis-free survival (for example, for prostate cancer with focal or no expression, 5-year metastasis-free survival was 32%, 95% CI = 24.4% to 40.3%, and for prostate cancer with at least weak expression, it was 85%, 95% CI = 79% to 91%; p-value<.001)”.
“Any method involving the notion of entropy, the very existence of which depends on the second law of thermodynamics, will doubtless seem to many far-fetched, and may repel beginners as obscure and difficult of comprehension.”
Willard Gibbs, Graphical Methods in the Thermodynamics of Fluids, (1873)
Transcriptional vs. Karyotypic Entropy
The changes of the Normalized Shannon Entropy and Statistical Complexity of the gene expression profile of a cancer cell are associated with the gradual deterioration of genome transcriptional information content due to the modification of its structural and functional integrity during disease progression. Our results clearly suggest that we can track the cancer cell's progression by following observable changes in the Shannon Entropy and, in particular, by employing the Jensen-Shannon Divergence of the gene expression profile of a sample to the normal expression profile. We have also shown if an average expression profile of some state of interest can be properly defined (i.e. distant metastasis) then the Jensen-Shannon Divergence can help us to identify which probes best correlate with these measures resulting in useful biomarkers.
Before any thermodynamical consideration could be discussed, we note that there is a clear and objective informational perspective that our study delivers. In this study we have chosen to position ourselves as the ‘receivers’ of a ‘transcriptional message’. In this experimental perspective the tumor tissue is the ‘sender’ (the source of information) and the high-throughput technology (gene expression microarrays in this case) can be regarded as the transmission medium (providing noise and distortion). As we explain in the ‘Materials and Methods’ section, the Shannon Entropy of a gene expression profile is the average expected surprisal of that profile understood as a message. The Normalized Shannon Entropy makes this surprisal an intensive measure and the correlation of the gene expression patterns across samples with this measure can deliver useful biomarkers to track the progression of transcriptional change. After normalization, we have a measure that does not depend of the number of probes of the high-throughput technology, although, it obviously does depend on the type of probes used.
We believe that the readers may have already noticed an apparent paradox. While some researchers understand cancer progression as a mechanism that increases entropy, we actually observe a reduction of Normalized Shannon Entropy in this work. This means that our normalized average expected surprisal, as receivers of the transcriptional message, is smaller. We must then discuss the physical meaning of thermodynamic entropy, its current use in systems biology and cancer research genetics and the informational measure we use in this paper to clarify these notions in this context.
In biomedical research there exists a certain consensus among cancer researchers that genetic instability or “mutability” is a major critical force of cancer progression, but it is not the only one to consider. It is clear that the mutational damage of key genes (like TP53, TERT, BRCA1, RB1, etc.), and the collective damage inflicted on key DNA repair mechanisms (like Nucleotide-excision repair and Base-excision repair) collaborate for an increasing acceleration of the number of genomic changes. Sub-microscopic alterations of the genome accumulate in cancer progression in an irreversible way and “are compounded by the widespread scrambling of the chromosome structure, and thus the karyotype, found in cells from the great majority of solid tumours” . In Weinberg's own words : “we learned that this chromosomal chaos also contributes this progression forward”.
This “chromosomal chaos”  or “cancer as a chromosomal disease” perspective is viewed by some researchers not as just a side consequence of mutational damage, but as the main core theme to understand a number of unexplained issues in cancer progression. “In sum, cancer is caused by chromosomal disorganization, which increases karyotypic entropy” . Regarding the cancer types studied in this paper, one particular “measure of disorder of a system”, aneuploidy, has been observed in poorly-differentiated prostate cancer cells and it is often associated with a more agreessive phenotype , , increased PSA levels , , and correlate with Gleason score , , . Gene fusions and chromosomal rearrangements are other source of increase in the “disorder” of the genome organization and they are increasingly being recognized as a major player in prostate cancer progression . The increase in “karyotypic complexity” and “extended aneuploidy and heteroploidy” may be already enough to develop a malignant melanoma phenotype, as the report of Gagos et al. indicate . The observed finding of aneuploidy in melanoma (also including uveal melanoma) is also increasingly important due to a number of different independent observations , , , , , , , , . It is in this context that the word ‘entropy’ has been used.
The magnitude of the “chromosomal chaos” is also evident from comparative genomic hybridization (CGH) studies which show significant variations in the copy number of individual chromosomal segments. ‘Chaos’ is really a very appropriate word to describe what we observe from CGH data. The genomic changes are not distributed uniformly at random. ‘Chaos’ has been described by some researchers as “a kind of order without any periodicity”. Some common changes seem to consistently appear in several independently arising tumours of the same type, and sometimes the researchers suggest common links . Our work has addressed, in part, this question: “Can we quantify the chaos observed in the genome from the increasingly available transcriptional data and relate it to tumour progression?” If no commonalities were observed, we would not have found interesting biomarkers that seem that strongly correlate with the divergences from normal tissue types. We know from our results that these commonalities do occur.
We need to go back to basics to explain these evolving concepts and resolve this apparent paradox. The phrase “karyotypic entropy” has been used in the past to define what is actually a divergence from the normal chromosome structure and it genomic organization. This denomination has also been employed by several authors, notably , but it has also been used in at least two other publications , . These works have in common the use of this term to refer to a “disorder”, fuelled by the undergraduate textbooks indoctrination of associating increase of entropy in natural spontaneous processes with the increase of “observed disorder” in the system. We propose that the use of a natural measure of divergence, the Jensen-Shannon divergence, could not only be a more formal, but also more appropriate modelling approach. As such, we propose to introduce the term ‘karyotypic divergence’ or ‘karyotypic Jensen-Shannon divergence’ to replace this concept and to avoid a subjective approach.
Why is it the case that we observe the Normalized Shannon Entropy of the transcriptional profile decreasing with cancer progression when intuitively our average expected surprisal (Shannon Entropy) should increase with progression?
Arieh Ben-Naim in his recent book “A farewell to Entropy: Statistical Thermodynamics based on Information”  comments:“It is interesting to note that Landsberg (1978) not only contended that disorder is an ill-defined concept, but actually made the assertion that ‘it is reasonable to expect ‘disorder’ to be an intensive variable’”. Ben-Naim also states: “In my view, it does not make any difference if you refer to information or to disorder, as subjective or objective. What matters is that order and disorder are not well-defined scientific concepts. On the other hand, information is a well-defined scientific quantity, as much as a point or a line are scientific in geometry, or mass or charge of a particle are scientific in physics.” However, in a manuscript entitled “Can Entropy and ‘order’ increase together ?” Landberg defines (in an attempt to decouple the notions of order and entropy), for a thermodynamical system that can be on N states the ‘disorder’ D(N) to be the Normalized Entropy (which is a function of N) divided by Boltzmann's constant . ‘Disorder’ then is an intensive magnitude bounded by 0 and 1, and ‘order’ is defined as 1-D(N).
While Landberg's decoupling argument between order and entropy  may still be controversial in Physics, the question is pertinent for our apparent paradox (the question that motivates this subsection). Borrowing from the title of his paper we could now state the central question as “Can Shannon Entropy increase while the Normalized Shannon Entropy decrease?” The solution of this apparent paradox is a trick of escapologism, perhaps also paralleled by what a cancer cell may be experiencing (or “reacting” in response to increased sources of stresses), and it is worth discussing in this context. Let H[X] be Shannon Entropy for an ensamble X with N different values. We will now assume, and here is the trick, that N is not a constant, but a function of time N(t). Let D(X(N(t))) be the Normalized Shannon Entropy. By definition D(X(N(t))) = H(X(N(t)))/. Then, just by taking the time derivatives it can be shown that the time variation of D(X(N(t))) can be negative, although the time rate of H[X] can be positive.where k is a constant. The escape to our paradox is “achieved” via making explicit the time variability of N(t). Landberg explicitly mentions that biological systems are examples where growth processes increase N(t), and perhaps the increased diversity in the transcriptome of a cancer cell during progression is one of such examples.
This discussion somehow resolves the apparent disassociations due to language barriers that may exist between the different disciplines (physics, information theory, molecular biology and oncology). A biologist may regard a cancer cell as an entity that, during progression, may “spread” its transcriptomic profile, including the generation of a large number of novel molecular species (due to adquired characteristics during its “devolution” from the normal type). In our informational perspective, this would be analogous to a situation in which the sender of a message, after some time, decides to increase the size of the alphabet of transmitted symbols. Clearly, it is intuitive to think that the receiver would be in a situation of increased Shannon Entropy. However, if the receiver is not aware of the new symbols (or is not able to detect them) and some of the symbols of the previous alphabet are no longer used, the receiver would now perceive a reduction of Normalized Shannon Entropy, observing an increasing order.
We now borrow an illustrative example from Landberg , but we add a twist to this argument for the purpose of illustrating this discussion. Suppose we have a sender transmitting only two possible symbols (N = 2), and we will assume that we have the same probability, let's denote this as (1/2, 1/2). Then the average expected surprisal (Shannon Entropy), is H(X) = 1, and the Normalized Shannon Entropy is also equal to one. Assume now that now our sender starts to transmit using another symbol, so that we now have theoretical probabilities of (0.5, 0.25, 0.25). Then N = 3, and the average expected surprisal increases to H(X′) = 1.5 the Normalized Shannon Entropy is now 1.5/ = 0.946… (a reduction). This ‘third symbol’ could actually represent a new “molecular species” or a protein isoform that would not be normally expressed in that tissue type , or even something entirely new, product of a mutational/deletional event. If our hypothetical high-throughput technology can only be detecting the first two symbols, and following the conventions we established in the ‘Materials and Methods’ section, we would be “observing” frequencies of (2/3,1/3) since the other events would not be detected with our equipment. As a consequence, the both the = 1, Shannon Entropy and the Normalized Shannon Entropy are both reduced to 0.918293. Obviously, we can not count what we can not observe. As a consequence, a degenerating transcriptional profile that produces novel molecular species, and at the same time reduces those which we can not measure with a particular technology, would look increasingly more ordered.
Exporting entropy, Maxwell Demons and Aquaporins
We envision that physicists may find here a fertile ground to explore new ideas and attempt novel mathematical formalisms for cancer progression from the realm of finite-state thermodynamics  and in particular endorevesible processes  and endoreversible thermodynamics . Some molecular alterations would then be part of the set of revesible processes that could occur in a cancer cell, while other processes like aneuploidy or gene fusions could be truly “irreversible genetic switches” associated with cancer progression . If we assume that the process is slow (i.e. the times required for significant variations of the transcriptome's profile is large in comparison with the cell's processes time scales), and follwing the results of Spirkl and Reis , it may be possible that we have a constant entropy production rate exists during cancer progression leading to Hauptmann's “entropic devolution” . Hauptmann sees a malignant tumour as “a dissipative structure arising within the thermodynamical open system of the human body” that starts when “a localized surplus of energy exists and there is no possibility to export entropy. An energetic overload in most malignant cells is indicated by their abnormally high phosphorylation state.” His perspective, preceeded in part by Dimitrov , Klimek ,  and Marinescu and Viculetz  might then fit well an endoreversible thermodynamic formalism. Hauptmann says in  “I believe that cancer is a special kind of adaptation to energetic overload, characterized by multiplication and mutation of genomic DNA (generation of new biomolecules which enhance the probability of survival under harmful conditions), and by chiral alterations (reduction of entropy by entrapping energy) leading to abnormal configurated biomolecules. In this regard the genetic alterations are probably secondary changes. Cancer serves to dissipate energy in a type of developmental process but one in which the results are harmful to the whole organism: an entropic devolution.”
This thermodynamical perspective is now worth exploring and we will discuss it in this context. Assuming that a cancer cell is in a state of “energy overload”, without “the possibility of exporting entropy”, could it lead to some type of “genetic alterations”? Which key mechanisms might be impaired? What consequences is this “system” delivering? Could this be another hallmark for oncosystems indentification?
In 1871, in this book called “Theory of Heat”, Maxwell speculated the idea of “a being, who can see the individual molecules” and who has enough reactive intelligence to open and close a unique small hole existing between two communicating vessels (called ‘A’ and ‘B’). An ideal gas filled both vessels, so that starting at uniform temperature the intelligent being could observe the molecules and close and open the hole accordingly to a mission: “to allow only the swifter molecules to pass from A to B, and only the slower ones pass from B to A.” The being, “without expenditure of work raise the temperature of B and lower that of A in contradiction to the second law of thermodynamics.” The ability of the “being” to use observable information about the system to lower the thermodynamical entropy has motivated many articles in physics and fuelled the imagination of many since it was originally introduced by Mawell, and named as “demon” by Thomson three years later . An excellent collection of articles until 1990 , , , , , , , , , ,  was edited by Leff and Rex . The Maxwell “demon”, far from being “exorcised” from Physics, still inspires interesting new perspectives , , , , , , , , , , , , .
In a letter to Peter Guthrie Tait, Maxwell writes about the “demons”: “Is the production of an inequality of temperature their only occupation? No, for less intelligent demons can produce a difference in pressure as well as temperature by merely allowing all particles going in one direction while stopping all those going the other way. This reduces the demon to a valve. As such value him. Call him no more a demon but a valve like that of the hydraulic ram, suppose.” (from , p. 6). Maxwell gives again here a sign of his brilliant mind, “degrading” the demon to a valve, but also offering an inspiring perspective to oncosystems research. Which types of mechanisms exist in biological systems, and particularly in individual cells, to control these differential values in key parameters? Could changes of key physical parameters for metabolic processes of the cytoplasm and cell's organelles like temperature, volume, pH or electrochemical potentials be also implicated in cancer progression?
The influence of temperature may be giving an interesting working hypothesis for further research. What are the consequences if cancer cells are a different type of open system which also operates at a different temperature than a normal cell? Butler et al. have studied p53 and they argue that at temperatures above 37 degrees centigrades wild-type p53 spontaneously loses DNA binding activity. While folding kinetics do not show important changes in a range from 5 to 35 degrees C, the unfolding rates accelerate 10,000-fold. This leads to a somewhat unexpected mechanism of p53 inactivation. It could be the case that a fraction of p53 molecules become trapped in misfolded conformations with each folding-unfolding cycle due to the increased frequency of cycling. The occurrence of misfolded p53 proteins can lead to aggregation and subsequent ubiquitination in the cell, leading to p53 inactivation , . If a key “guardian of the genome integrity” ,  and its remarkable conformational flexibility  is challenged by an increase of temperature , its role in genotoxic damage and adaptive response (like that of the skin to UVB damage ) may be impaired. The same may occur for other members of the DNA damage response. An increment in temperature has already been linked to skin carcinogenesis. Boukamp et al. report in that  “exposure of immortal human HaCaT skin keratinocytes (possessing UV-type p53 mutations) to 40 degrees C reproducibly resulted in tumorigenic conversion and tumorigenicity was stably maintained after recultivation of the tumors.”
On the other hand, natural gradients on physical biochemical properties can also be challenged in a cancer cell. This in turn derives in metabolic processes running under abnormal parametric circumstances. It is well-known that compartimentalization, in biological systems, naturally require the existence of mechanisms that would keep some key state variables relatively constant, or within bounds, for normal operation of the metabolic processes. One example is very illustrative and a case in point. Instead of demons, holes, or valves, the cell requires pores in its membranes to allow osmotic regulatory processes, yet it should preclude the conduction of protons. This is a nanotechnological design problem not faced by Maxwell, but certainly solved by biological systems without the need of an “intelligent being” as Mawell cleverly pointed to Tait in his letter.
This discussion brings us to one of the gene families we have already discussed in this paper, the aquaporins , , , , , , . They are considered the primary water channels of cell membranes , , , . The specific functions of each member of this family are now being slowly mapped by several research labs around the world . Their clinical role in cancer , , , , , , , , ,obesity , malaria ,  and other diseases is emerging , , , , , , , , , , , . In , our group observed the dowregulation of AQP3 in all melanoma cell lines studied of the NCI-60 dataset of Ross et al.; this dowregulation was also observed for the CNS and Renal cell lines. AQP3 was relatively upregulated for Leukaemia and Colon cell-lines (we refer the reader to the Supplementary Material of  for details). Inhibition of AQP3 in prostate cancer cells was already proposed as a mechanism that increases the sensitivity to cryotherapy treatment .
The aquaporins are not “an intelligent being” in any real sense, yet they are so formidable selective that they could easily parallel Maxwell demon's efficiency in creating the right conditions for the cell. Wu et al. give us some clues on the role of point mutations in the AQP1 and how their effective electrostatic proton barrier can be impaired . The elicitation of the detailed mechanistic explanation of this extraordinary selectivity is under intense investigation with a number of techniques, including sophisticated molecular dyanamics simulations, for an overview of this field see , , , , , , , , , , , , , , , , . One less known feature of aquaporins is that they may not only channel water, but also carbon dioxide and ammonia , , , glycerol  and urea and other small solutes  and, very relevant for cancer research, hydrogen peroxide . At least two of members of this family have been observed in the inner mitochondrial membrane in different tissues. This in turn may indicate mitochondrial roles for aquapotins in osmotic swelling induced by apoptotic stimuli .
Could it be possible that we can track cancer progression by looking at some of these “Maxwell demons”? We have seen in Figure 10, that AQP3 has a reduced expression with increased progression in our melanoma dataset. Cao et al., reported that ultraviolet radiation induced AQP3 down-regulation in human karatinocytes; thus AQP3 has become a strong and plausible link between UV radiation, skin dehydration ,  and photoaging . This may indicate an impared function on skin hydration , , , , , . The expression of AQP3, as well as AQP1, AQP5, and AQP9 seem to be correlated with melanoma progression, indicating a common pattern of downregulation from the higher values in normal skin and benign nevi (see Figure 17).
Primary melanaoma samples (annotated in green) and benign nevi (in yellow) show higher expression values. Primar melanoma (in orange) show a mixed behaviour and metastaic melanoma samples (in red) show in comparision that their expression is remarkably lower. We highlight the similarity of this finding with Figure 8, in which we have shown the same behaviour for a group of genes functionally annotated as being involved in cell adhesion, cell-cell communication, tight junction mechanisms and epithelial cell polarity. Metastatic melanoma samples, in comparison, show remarkably reduced values of the joint expression of these four probes, indicating the possibility of an impaired function of these highly selective mechanisms.
Does a similar pattern of aquaporin downregulation exist in prostate cancer? Wang et al. have looked at the expression and localization of AQP3 in human prostate using cell lines as well as patient samples. They have observed AQP3 mRNA “in both normal and cancerous epithelia of human prostate tissues, but not in the mesenchyme. In the normal epithelia of the prostate, localization was limited to cell membranes, particularly the basolateral membranes. However, the expression of AQP3 protein in the cancer epithelia was not observed on the cell membranes.” This finding seems to implicate the subcellular localization of AQP3 as a possible indicator of a transition to a more malignant phenotype. Lapointe's dataset allows us to see the downregulation of AQP3 and AQP1. A large subgroup of primary prostate tumors has reduced levels of AQP3 and AQP1 as most of the lymph node metastasis samples [Figure 18].
Most of the control samples have a positive joint expression value (in green). A reduction is observed in primary prostate tumor samples (in yellow), with more than one half of the samples now having negative values. On the rightmost part of the figure, most of the lymph node metastasis samples (in red) have a strong negative total joint expression of these two biomarkers.
Retrodictions, Postdictions, Predictions, Telomeres, non-coding RNAs and paraspeckles
One critique that we are aware we could receive is that the current manuscript presents a novel methodology and an underlying unifying theory based on retrodictions or postdictions. Indeed we have shown that the use of the Normalized Shannon Entropy and the Information Theory quantifiers (the M-complexities and the Jensen-Shannon divergence) allow to monitor cancer progression and to identify the best biomarkers that correlate with the transcriptomic changes. Our approach works in a retrodiction way in that it looks at data already obtained by other studies, but gives a unifying framework to track cancer progression. For instance, on True et al's dataset, our unifying hallmark of cancer gives not only MAOA, which was already identified in the original publication, but also AMACR, CD40, CDK4, etc. are very important biomarkers for prostate cancer. Analogously, the identification of KLK3/PSA in Lapointe's dataset is another important retrodiction which shows the power of the method.
In some sense our approach also works in a postdiction way, as it helps to evaluate the speculation that cancer cells have “an entropic devolution”. Our results show that the variations of Normalized Shannon Entropy and Jensen-Shannon divergences indeed give measurable changes, and that these changes are related to important biomarkers in the two types of cancer studied in this work.
In addition, we remark that we are literally making hundreds, or even thousands of predictions. The results in the ‘Supplementary Material’ provide this information for the detailed scrutiny of our peers. We believe that other probes with gene expression patterns in high correlation with the probes discussed in this paper, and perhaps less studied by immunohistochemistry and other methods in the two cancer types studied here, are worth exploring as a group of biomarkers. These predictions can be tested with further studies on staging and patient stratification.
A very recent study by Ballal et al. have linked BRCA1 to telomere length and maintenance and its loss from the telomere in response to DNA damage  (see also ). We have previously mentioned that BRCA1 is a conspiquous biomarker arising from the analysis of True et al.'s dataset using our methods. We found this to correlate with a preivous study that showed that BRCA1 has a reduced expression in immortalized prostate epithelial cells before and after their conversion to tumorigenicity . We also mentioned that the knockdown of BRCA1 leads to anaccumulation of multinucleated cells , preserving chromosomal stability . Ballal et al. telomeric ChIP assays to detect BRCA1 at the telomere and reported time-dependent loss of BRCA1 from the telomere following DNA damage. Due to the role of telomeres in maintaining chromosomal stability  and the inverse correlation of telomere length and divergent karyotypes in prostate cancer cell lines ,  (as well as the recognized role of telomere dysfunction in the induction of apoptosis or senescence in vivo , , , , , , increase of mutation rates , DNA fragmentation , and their relation with DNA damage signalling ), we checked for other probes of genes involved in telomeric function.
From those which we were able to identify in True et al's dataset, we have found a strong high correlation of the expression of BRCA1 with TERF2/TRF2 (telomeric repeat binding factor 2)  and a negative correlation with the expression pattern of TERF2IP (telomeric repeat binding factor 2, interacting protein) [Figure 19].
The first group of samples (1 to 9 in green) correspond to Gleason 3 pattern, indicating that most of the samples in this group have no significantly reduced expression of this pair of genes. The second group of columns (10 to 21 in yellow) correspond to Gleason 4 patterns and the last 8 columns (22 to 29 in red) correspond to Gleason 5 samples. A very recent study by Ballal et al. have linked BRCA1, to telomere length and maintenance and its loss from the telomere in response to DNA damage  (see also ). There is an increasing trend of dowregulation, so it would be interesting to evaluate if indeed this pair of proteins could be an early marker of dowregulation useful to evaluate samples with Gleason pattern 2, or if may constitute a biomarker useful to distinguish a prostate cancer subtype.
Finally, one particular type of probes has also caught our attention, and we would like to refer to them before concluding this section.
With the denomination of ‘non-coding RNA’ we identify those RNA molecules which are functional but that are not translated into proteins. Many microarray chips contain probes that are annotated as ‘non-protein coding’, indicating that there might be some valuable expression data that we can also mine for information. We note that our method, although employing transcriptomic data, does not limit its application to protein-coding information, and that the combined use of protein-coding and non-coding protein probe expression would allow a more comprehensive view of the transcriptional state of the cell.
Among non-protein coding, microRNAs  are gaining acceptance as key players in several cancers , ,  (including prostate cancer , ), but the so-called “long non-coding RNAs”  are also gaining a place in the scenario of cancer biomarkers (see , and , , ). We thus turned our attention to these probes that have been annotated as “non-protein coding” and we highlight some of them that have very high correlation values with the Normalized Shannon Entropy in True et al's prostate cancer dataset. In particular, the probes for MALAT1/MALAT-1 , , , , , , , , , , , , ,  have a very conspiquous position (See Figure 20). They located very closely to other protein coding biomarkers that have also lost expression and have been discussed in this work like SFPQ, CD40, BRCA1, and TP53 (see Figure 16). MALAT1 has been recently pointed as a biomarker in primary human lobular breast cancer as a result of an analysis of over 132,000 Roche 454 high-confidence deep sequencing reads . An international team, searching on thousands of novel non-coding transcripts of the breast cancer transcriptome, has been able to identify more than three hundred reads corresponding to MALAT1 . This is a non-coding RNA which was identified in 2003 in non-small cell lung cancer, was shown to be highly expressed (relative to GAPDH) in lung, pancreas and prostate, but not in other tissues including muscle, skin, stomach, bone marrow, saliva, thyroid and adrenal glands, uterus and fetal liver . MALAT-1, also known as NEAT2, is considered to be “extraordinarily conseved for a noncoding RNA, more so than even XIST” . Our results indicate that the reduction of expression of some non-coding RNAs, in particular of MALAT-1, and SNORA60 with respect to their normal expression in prostate, as well as the upregulation of SNHG8 and SNHG1 should be monitored as useful biomarkers to track disease progression.
We present again a scatter plot of Spearman versus Pearson correlation values of the probe expression of 13,188 probes in True et al's prostate cancer dataset with the Normalized Shannon Entropy values of the samples. All blue dots correspond to one of the probes, but the only difference with Figure 16 is that we have now highlighted the position of s ome probes which have been annotated as corresponding to “non-coding RNAs”. In particular, we highlight those of MALAT1 (Metastasis associated lung adenocarcinoma transcript 1, (non-protein coding)), SNORA60 (small nucleolar RNA, H/ACA box 60); both increasingly downregulated, SNHG1 (small nucleolar RNA host gene 1 (non-protein coding)) and SNHG8 (small nucleolar RNA host gene 8 (non-protein coding)). The probes for MALAT1/MALAT-1 , , , , , , , , , , , , ,  have a very conspiquous position, which we could judge a priori to be equivalent in relevance to those of the previously discussed roles of SFPQ, CD40, BRCA1, and TP53 (see Figure 16). MALAT1 has been recently pointed as a biomarker in primary human lobular breast cancer as a result of an analysis of over 132,000 Roche 454 high-confidence deep sequencing reads. Within the thousands of novel non-coding transcripts of the breast cancer transcriptome, Guffanti al., identified more than three hundred reads corresponding to MALAT1 . This non-coding RNA, first identified in 2003 in non-small cell lung cancer, was shown to be highly expressed (relative to GAPDH) in lung, pancreas and prostate, but not in other tissues including muscle, skin, stomach, bone marrow, saliva, thyroid and adrenal glands, uterus and fetal liver (see figure four of Ref. ). Our results indicate that the reduction of expression of some non-coding RNAs, in particular of MALAT-1, and SNORA60 with respect to their normal expression in prostate, as well as the upregulation of SNHG8 and SNHG1 should be monitored as useful biomarkers to track disease staging and progression to a more malignant phenotype. Interestingly enough, a study published in 2006 by Nadminty et al. has shown that KLK3/PSA modulates several genes, reporting a 16.5 fold downregulation of MALAT1 . While these results have been obtained using the human osteosarcoma cell line SaOS-2, our results indicate that MALAT1 expression in the normal prostate and in cancer cells could also be considered as a relevant biomarkers to be tested in the future.
We will now address another non-coding RNA called NEAT1 which, like NEAT2, is also conserved in the mammalian lingeage. Before we move onto NEAT1, we will first recall a previous result. We have noted before the conspiquous position of SFPQ/PSF (Polypyrimidine tract-binding protein-associated splicing factor) in Figure 16. The expression of a probe for SPQF has the highest correlation with the values of the Normalized Shannon Entropy. We highlighted before that SFPQ/PSF is a putative regulator of growth factor-stimulated gene expression . The loss of SFPQ expression during the progression of prostate cancer may be an important key to understand this disease or one of its subtypes. We have also mentioned that the AR/PSF complex interacts with the PSA gene (perhaps the most well-established prostate cancer biomarker) and that SFPQ/PSF inhibits AR transcriptional activity . Kuwahara et al. showed that SFPQ together with NONO (Non-POU-domain-containing, octamer binding protein) and PSPC1 (Paraspeckle protein 1 alpha isoform, formerly known as PSP1) are expressed in mouse Sertoli cells of the testis and form complexes that function as coregulators of androgen receptor-mediated transcription . While new research results  link SFPQ and NONO/P54NRB with the RAD51 family of proteins (largely regarded as another key protector of chromosome integrity as being involved in homologous recombination DNA repair), it is perhaps SFPQ and NONO's co-localization in paraspeckles that make this group also remarkable .
Paraspeckles , , , , , , , , , , , , , , , , ,  are a novel nuclear compartment, of approximately 0.2–1 µm in size, discovered in 2002, by Fox et al. in Dundee Scotland, following the identification of the protein PSPC1 (AF448795) in the nucleolar proteomics project at Lamond's lab which is described well by Fox et al. . Three years later, Fox, Bond and Lamond showed that NONO and PSPC1 form a heterodimer that localizes to paraspeckles in an RNA-dependent manner . Paraspeckles are dynamic structures, observed in numbers that vary between 10 and 20, that seem to control gene expression via retention of RNA in the nucleus . A long noncoding RNA called NEAT1/MEN epsilon/beta , , , , , that colocalizes with paraspeckles, seems to be integral to their structure. Depletion of NEAT1 erradicates paraspeckles and a biochemical analysis by Clemson et al indicates that the NEAT1 binds with paraspeckle proteins SFPQ/PSF, P54NRB/NONO and PSPC1. NEAT1 is also known as TncRNA (trophoblast-derived noncoding RNA) , , , , , , , ,  and probes for TncRNA exist on this dataset, We have observed in True et al.'s dataset that there exists a high correlation between the Normalized Shannon Entropy with the expression of SFPQ/PSF, P54NRB/NONO, and TncRNA. Overall, this implies that the disruption of the function of the paraspeckles is correlated with the increasing signs of deterioration of normal transcriptomic state of the cells. While a causal relationship still needs to be proved, we admire the mathematical elegance of the Normalized Shannon Entropy of the samples, a global measure of the average expected surprisal of the transcriptome, which in turn has lead us to consider the dysfunction of the smallest nuclear body as a putative biomarker of disease progression. The role of SFPQ/PSF in the control of tumorigenesis is under investigation  and the information coming from these studies would need to be integrated with their role, together with P54NRB/NONO and TncRNA, in paraspeckles if we want to achieve a better understanding of these mechanisms.
In this contribution we have shown that for the melanoma and prostate cancer datasets studied, the quantitative changes of Information Theory measures, Normalized Shannon Entropy, Jensen–Shannon divergence and the novel Statistical Complexity quantifiers defined here are in high correlation with gene expression changes of well-established biomarkers associated to cancer progression. In addition, variations of the basic technique (i.e. a modified form of statistical complexity) which allows us to better understand the phenotypic changes observed in these samples which are associated with the progression and the transitions of the gene expression profiles. For instance, in a properly defined Statistical Complexity vs. Entropy plane, on a melanoma dataset first studied in Ref. , samples appear in well differentiated “clusters”. These clusters correlate well with the phonotypic characteristics of normal skin, nevi, primary and metastatic melanoma. In this “Complexity vs. Entropy” plane, primary melanomas samples appear “bridging” benign nevi and metastatic melanoma samples. Our results may also suggest that the evolution of metastatic melanoma leads to at least two different subtypes.
The Normalized Shannon Entropy of a transcriptional sample profile is calculated associating the measured expression values of a gene with the relatively probability of being expressed. We have observed that, in general, the transcriptomes of tumour progressing cells tend to have lower values of Normalized Shannon Entropy than normal ones. Given a population of normal cells of a given tissue type it is then possible to compute useful measure of divergence of cancer cell profiles from the normal expression average profile, in terms of Information Theory quantifiers, the Shannon Eveness normalized entropy and generalized statistical complexity , , .
In addition, our observation of the correlation of the statistical complexity of tumours with its natural progression allows an unprecedented way of finding biomarkers that links with the gradual deterioration of the genome integrity. The proposed methodology uncovered, for the first time, evidence of the putative role of impared centrosome cohesion in melanoma progression.
Statistical complexity has then been able to pinpoint otherwise unrecognized biomarkers in concert with existing ones, reinforcing the view that “chromosomal chaos” and “cancer as a chromosomal disease” can be a useful guiding principle to understand the molecular biology of cancer and uncover the timeline of its progression. This is a powerful method to uncover “oncosystems” instead of “oncogenes”. “Oncosystems” are a highly differentially disregulated set of genes that, if linked with the molecular “hallmarks of cancer” described in the introduction, and existing databases with putative common functional genomic annotations, can help to understand the biological progression pathways that drive the disease.
On one of the prostate cancer dataset studied (obtained from a previous published study, ), we observe a gradual pattern of reduction of Normalized Shannon Entropy from three well characterized tissue types: normal prostate, primary prostate tumours and lymph node metastases. On a different dataset on prostate cancer (from Ref ), we observe that a group of samples having Gleason patterns 4 and 5 (two patterns which are typically associated to an aggressive phenotype) have lower Normalized Shannon Entropy values than a subset of Gleason pattern 3 (a pattern which is normally associated to a less aggressive phenotype but which nevertheless is still of clinical concern). However, a group of samples having Gleason patterns 3, 4, and 5 is revealed; this mixed cluster has a mid-range entropy. This is an interesting fact which correlates with the limitations observed in Ref. . We note the authors' comment: “We were unable to identify a cohort of genes that could distinguish between pattern 4 and 5 cancers with sufficiently high accuracy to be useful, suggesting a high degree of similarity between these cancer histologies or substantial molecular heterogeneity in one or both of these groups.” Our results provide a conciliatory middle ground that explains the perceived clinical usefulness of Gleason pattern classification, widely used around the world, while at the same time reveals the reason for the difficulties of obtaining a good transcriptional signature for the other two patterns .
We have seen, through a detailed discussion of several biomarkers in three different datasets, that the variation of the gene expression distributional profile can be characterized via Information Theory quantifiers. Our study also showed that current established biomarkers of the two diseases studied seem to correlate with those that best co-variate with these quantifiers. For instance, AMACR, in our second prostate cancer dataset studied, naturally appears as one of the most correlated genes (in both the Pearson and the Spearman sense) with the pattern of variation of Entropy of the samples. Together with MAOA, which is the highlighted gene in True et al.'s  original publication, AMACR is now being recognized as one of the best biomarkers in primary prostate cancer with approximately 180 publications dedicated to it in the past five years. We have also shown that many gene probes that best correlate with the divergence of the normal tissue profile have been identified as useful biomarkers (via other accepted validation methods). This said, the use of other sources of information, like pathway or gene ontology databases has lead as to the identification of other cell processes that may be altered.
We have presented a unifying hallmark of cancer, the cancer cell's transcriptome changes its Normalized Shannon Entropy (as measured by high-througput technologies), while it increments its physical Entropy (via creation of states we might not measure with our devices). This hallmark allows, via the use of the Jensen-Shannon divergence, to identify the arrow of time of the process, and helps to map the phenotypical and molecular hallmarks of cancer as major converging trends of the transcriptome. The methodology has produced remarkable postdictions and retrodictions that show that it can predictively guide biomarker discovery.
Materials and Methods
We refer the reader to the original publications for details of methods for data collection, but we highlight here some aspects that are important to understand the data generation process for the purpose of our analysis.
Lapointe et al.'s dataset (File S1)
Samples were obtrained from radical prostatectomy surgical procedures. Samples are labelled as “tumors” if they contain at least 90% of cancerous epithelial cells, and they were considered as “non-tumor” if they contain no tumor epithelium and are from the noncancerous region of the prostate. The later samples were labelled “normals” although the authors alert that some may contain dysplasia. In this dataset, Lapointe et al. have performed a gene expression profiling by using cDNA microarrays containing 26,260 different human genes (UniGene clusters). Using 50 µg of total RNA from prostate samples Cy5-labeled cDNA was prepared and Cy3-labeled cDNA used 1.5 µg of mRNA common reference, pooled from 11 human cell lines (see Ref. ). The fluorescence ratios were subsequently normalized by mean centering genes for each array, a relatively standard procedure. In addition, to minimize potential print run specific bias, Lapointe et al. report that ratios were then mean centered for each gene across all arrays according to Ref. . We have only used the genes that the authors report in their first figure, 5,153 genes that have been well measured and have significan variation in some of the samples. For the other details of their matrials and methods we refer the readers to the Supporting Notes and the Materials and Methods section of their original publication .
Haqq et al.'s dataset (File S2)
Samples were obtained from nevus volunteers and melanoma patients and only those samples that have more than 90% of tumor cells were profiled. The 20,862 cDNAs used (Research Genetics, Huntsville, AL) represent 19,740 independent loci. (Unigene build 166).median of ratio values from the experiment were subjected to linear normalization in nomad (which can be accessed at http://derisilab.ucsf.edu), log-transformed (base 2), and filtered for genes where data were present in 80% of experiments, and where the absolute value of at least one measurement was >1.
True et al's dataset (File S3)
In this dataset, samples have information of 15,488 spots per array, with a total of 7,700 unique cDNAs represented. The samples were obtained from frozen tissue blocks from 29 radical prostatectomies accessioned and selected to represent Gleason grades 3, 4, and 5. The samples are “treatment naïve”, meaning that they were also selected such that their gene expression profile is also and the absence of any bias that the treatment before prostatectomy. The frozen sections (8 µm) were cut from optimal cutting temperature medium blocks and immediately fixed in cold 95% ethanol. Around 5,000 epithelial cells from both histologically benign glands and cancer glands were separately laser-capture microdissected (LCM). The authors of the study have also been very careful to include only one Gleason pattern in each laser-captured cancer sample, following a process in which the patterns were assessed independently by two investigators.The matched benign epithelium was captured for each cancer sample for a total of 121 samples.
An important characteristic of this dataset is the normalization procedure. For each spot and in each channel (Cy3 and Cy5), True et al. substracted the median background intensity from the median foreground intensity, and subsequently the log ratios of cancer expression to benign expression were computed. These ratios were obtained by first dividing the background-subtracted intensities (Prostate Cancer/Benign) and then taking the logarithm base 2. In the case that the median background intensity was greater than the median foreground intensity, the spot was considered missing. We refer to the original publication for the other aspects of imputation, spot quality and filtering, but, like in Lapointe et al's study, they also filter to keep informative (expression ratios of benign versus cancer should at least be 1.5-fold or greater in at least half of one of the Gleason groups as one of the selection criteria).
Normalized Shannon Entropy, Jensen-Shannon Divergence and Statistical Complexity
In many circumstances, experimental measurements are associated with the accumulation of individual results which, ultimately, qualitatively and quantitatively characterized our experimental observations. The presence (or absence) of a particular result of an individual experimental measure is called an event. An event which can take one of several possible values is called a random variable. Analogously, a random event is an event that can either fail to happen, or happens, as a result of an experiment. An event is certain if it can not fail to happen and it is said to be impossible if it can never happen.
Following Andreyev , we will define the probability p(x) of an event x, as the theoretical frequency of the event x about which the actual frequency occurrence of the event shows a tendency to fluctuate as the experiment is repeated many times. The Shannon information content of an event x (or the surprisal of an event x, ), is defined asFollowing McKay , an ensamble X is a triple , where x is the value of a random variable, which takes on one of a set of possible values, , having probabilities , with , and .
The Shannon Entropy of an ensemble X (also known as the uncertainty of X), denoted as H[X], is defined to be the average Shannon information content. It is the average expected surprisal for an infinitely long series of experiments. We use the theoretical frequencies to compute this average, and then we haveSuppose that we have a fair dice, the theoretical frequency of an event ‘the dice shows a three’ is 1/6, (if the dice is assumed fair, the theoretical frequency is the same for any number from 1 to 6). In that case a hypothetical experimentalist guessing will have an average expected surprise of H[X] = . We note the two natural bounds that the entropy can have. The Shannon Entropy of an ensemble X is always greater or equal to zero. It can only be zero if for only one of the N elements of . On the other hand, the Shannon Entropy is maximized in the case that . This is the so-called “equiprobable distribution”, a uniform probability distribution over the finite set.
Transcriptional Shannon Entropy.
Let the expression value of probe i (i = 1,…, N) on sample j (j = 1, …, M). For each sample j we first normalize the expression values. We interpret them as the theoretical frequency of a single hybridization event. We then define a probability distribution function (PDF) over a finite set as:The uniform (equiprobably) distribution is defined asand the average probability distribution over all M samples asLet , then in this paper we always use the Normalized Shannon Entropy, defined as:
The Jensen-Shannon divergence and the Statistical complexity measures
Given a probability distribution function over a discrete finite set, is then straightforward to calculate its Normalized Shannon Entropy if we have the theoretical frequencies. Several measures of “complexity” of a probability distribution function have been proposed. In this work we have used Statistical Complexity measures.
All the complexity measures used in this work are the product of a Normalized Shannon Entropy of the probability distribution function, and a divergence measure to a reference probability distribution function. We follow earlier proposals by López-Ruiz, Mancini and Calbet who first introduced a statistical complexity measure based on such a product in . The LMC-Statistical Complexity is the product of the Normalized Shannon Entropy, H[P], times the disequilibrium, Q[P]; the latter given by the Euclidean distance from P to Pe, the uniform probability distribution over the ensemble. In this paper we used a later modification which we refer as the MPR-Statistical Complexity  which replaces the Euclidean distance between P to Pe by the Jensen-Shannon divergence , . The Jensen-Shannon divergence is linked in physics to the thermodynamic length , , , .
We define the MPR-Statistical complexity  as:where , Q0 is a normalization factor, and is the Jensen-Shannon's divergence between two probability density functions P(1) and P(2), which in turn is defined asIn this work, in many cases we compute the Jensen-Shannon divergences of a probability with a probability of reference which is not the uniform probability distribution over the ensemble. In general, it is the average over a subset of probability distribution functions which are consider to be either the “initial” of “final” states of interest. Let be such an average, then the M-Statistical Complexity of a probability distribution function , given a of reference, is given by
An illutrative example.
In order to discuss a relatively simple example that can intuitively provide a grasp of the basic mathematical principles of Information Theory we present a hypothetical “gene expression” dataset involving four samples each with the expression of five unique probes corresponding to five genes (not necessarily different) as follows in Table 3.
One of the quantifiers that we use in this contribution describes a measure of order for a sample: the Normalized Shannon Entropy also known as Shannon Evenness Index . This section focuses on this quantifiers use and importance (refer to the ‘Materials and Methods’ section to see how this measure is calculated). In Sample 4 all probes have the same expression therefore it has the highest achievable value of Normalized Shannon Entropy (H = 1). The Normalized Shannon Entropy values for samples 1 and 2 are the same (H = 0.82). Sample 3, which tends to be less peaked and has the two most significantly expressed genes with the same value, has a higher value of Normalized Shannon Entropy (H = 0.92) (see Figure 21).
Sample 4 has the largest attainable value since the expression of all probes is the same. Samples 1 and 2, which have the same set of expression values, although in different probes, have the same value of Normalized Shannon Entropy. As a consequence, there is a need for another quantifier of gene expression to address the permutational indistinguishability of these two expression profiles. The Jensen-Shannon divergence provides a natural alternative (see Table 4).
This simple example shows that the Normalized Shannon Entropy variations of the gene expression profile convey information about global transcriptomic changes; however, this measure alone is not enough to characterize the deviations from normal tissue profiles. For example, assume that Sample 1 is the normal profile of a particular tissue type. Assume that Sample 3 is the profile of a cancer cell that originated from that tissue type, the variation of Normalized Shannon Entropy can be related to this malignant change. However, as Sample 2 illustrates, Normalized Shannon Entropy is not enough to let us to measure the variation from a profile and at least another Information Theory quantifier is needed. We resort to Statistical Complexity quantifiers, which in turn use the Jensen-Shannon divergence  to provide this complementary dimension  (refer to the ‘Materials and Methods’ section for a mathematical definition of the Jensen-Shannon divergence).
Figure 21 shows how the Jensen-Shannon divergence helps us to evaluate the variation between profiles. Samples 1 and 2, as perhaps intuitively expected, have the largest divergence between them, their Jensen Shannon divergence is 0.286636 (JS(1,2) = JS(2,1) = 0.286636). The two “closest” pair of profiles correspond to Samples 3, and 4, (JS(3,4) = JS(4,3) = 0.035851). See Table 4.
Let be the Normalized Shannon Entropy of a transcriptional sample profile, then the MPR-Statistical Complexity is defined as being proportional to the product of the Normalized Shannon Entropy times the Jensen-Shannon divergence of the profile with the equiprobable distribution (in the example above the equiprobable distribution is that of Sample 4). Then we haveWhere is a normalization factor. Once again, we refer to the ‘Materials and Methods’ sections for the accompanying formal mathematical presentation. As a consequence, we can plot the MPR-Statistical Complexity of the samples of our example as a function of the Normalized Shannon Entropy as can be seen in Figure 22.
The MPR-Statistical complexity is proportional to the Normalized Shannon Entropy (labelled ‘MPR’, y-axis) of a sample and the Jensen-Shannon divergence of the sample and a hypothetical sample with an equiprobability distribution of gene expression.
Haqq Data Set Supporting File
(3.92 MB XLS)
Lapointe Data Supporting file
(1.51 MB XLS)
List of references for research into NFKappa-B as a target for the intervention in prostate cancer
(0.10 MB DOC)
The authors would like to thank three Research Associates of our Centre (Osvaldo Rosso, Carlos Riveros and John Marsden) for discussions on this topic. We thank the first two, in particular to Osvaldo, for their collaboration on data cleaning and the computation of the Entropy and Jensen-Shannon divergences. We thank Marsden for his advice on editing the final draft. PM would also like to thank two stimulating discussions (decades apart) with Dr. Carlos Reigosa and Prof. Elizabeth Blackburn, as well as the invisible hand of Prof. Yaser Abu-Mostafa.
Conceived and designed the experiments: RB PM. Performed the experiments: RB PM. Analyzed the data: RB PM. Wrote the paper: RB PM. Produced all graphical material: RB.
- 1. Hanahan D, Weinberg RA (2000) The hallmarks of cancer. Cell 100: 57–70.
- 2. Wong DJ, Segal E, Chang HY (2008) Stemness, cancer and cancer stem cells. Cell Cycle 7:
- 3. Glinsky GV (2008) “Stemness” genomics law governs clinical behavior of human cancer: implications for decision making in disease management. J Clin Oncol 26: 2846–2853.
- 4. Ben-Porath I, Thomson MW, Carey VJ, Ge R, Bell GW, et al. (2008) An embryonic stem cell-like gene expression signature in poorly differentiated aggressive human tumors. Nat Genet 40: 499–507.
- 5. Maniccia AW, Lewis C, Begum N, Xu J, Cui J, et al. (2009) Mitochondrial localization, ELK-1 transcriptional regulation and growth inhibitory functions of BRCA1, BRCA1a, and BRCA1b proteins. J Cell Physiol 219: 634–641.
- 6. Rustin P, Kroemer G (2007) Mitochondria and cancer. Ernst Schering Found Symp Proc 1–21.
- 7. Ortega AD, Sanchez-Arago M, Giner-Sanchez D, Sanchez-Cenizo L, Willers I, et al. (2009) Glucose avidity of carcinomas. Cancer Lett 276: 125–135.
- 8. Lee HC, Wei YH (2009) Mitochondrial DNA instability and metabolic shift in human cancers. Int J Mol Sci 10: 674–701.
- 9. Yeung SJ, Pan J, Lee MH (2008) Roles of p53, MYC and HIF-1 in regulating glycolysis - the seventh hallmark of cancer. Cell Mol Life Sci 65: 3981–3999.
- 10. Isidoro A, Martinez M, Fernandez PL, Ortega AD, Santamaria G, et al. (2004) Alteration of the bioenergetic phenotype of mitochondria is a hallmark of breast, gastric, lung and oesophageal cancer. Biochem J 378: 17–20.
- 11. Bartkova J, Rajpert-De Meyts E, Skakkebaek NE, Lukas J, Bartek J (2003) Deregulation of the G1/S-phase control in human testicular germ cell tumours. APMIS 111: 252–265; discussion 265–256.
- 12. Esteller M (2005) Aberrant DNA methylation as a cancer-inducing mechanism. Annu Rev Pharmacol Toxicol 45: 629–656.
- 13. Licchesi JD, Westra WH, Hooker CM, Herman JG (2008) Promoter hypermethylation of hallmark cancer genes in atypical adenomatous hyperplasia of the lung. Clin Cancer Res 14: 2570–2578.
- 14. Fraga MF, Ballestar E, Villar-Garea A, Boix-Chornet M, Espada J, et al. (2005) Loss of acetylation at Lys16 and trimethylation at Lys20 of histone H4 is a common hallmark of human cancer. Nat Genet 37: 391–400.
- 15. Tennant DA, Duran RV, Boulahbel H, Gottlieb E (2009) Metabolic transformation in cancer. Carcinogenesis 30: 1269–1280.
- 16. Sheng H, Niu B, Sun H (2009) Metabolic targeting of cancers: from molecular mechanisms to therapeutic strategies. Curr Med Chem 16: 1561–1587.
- 17. Kroemer G, Pouyssegur J (2008) Tumor cell metabolism: cancer's Achilles' heel. Cancer Cell 13: 472–482.
- 18. Ruan K, Song G, Ouyang G (2009) Role of hypoxia in the hallmarks of human cancer. J Cell Biochem 107: 1053–1062.
- 19. Fang JS, Gillies RD, Gatenby RA (2008) Adaptation to hypoxia and acidosis in carcinogenesis and tumor progression. Semin Cancer Biol 18: 330–337.
- 20. Ruan K, Fang X, Ouyang G (2009) MicroRNAs: Novel regulators in the hallmarks of human cancer. Cancer Lett.
- 21. Dalmay T, Edwards DR (2006) MicroRNAs and the hallmarks of cancer. Oncogene 25: 6170–6175.
- 22. Tycko B (2003) Genetic and epigenetic mosaicism in cancer precursor tissues. Ann N Y Acad Sci 983: 43–54.
- 23. Panigrahi AK, Pati D (2009) Road to the crossroads of life and death: Linking sister chromatid cohesion and separation to aneuploidy, apoptosis and cancer. Crit Rev Oncol Hematol.
- 24. Albertson DG, Collins C, McCormick F, Gray JW (2003) Chromosome aberrations in solid tumors. Nat Genet 34: 369–376.
- 25. Fabarius A, Hehlmann R, Duesberg PH (2003) Instability of chromosome structure in cancer cells increases exponentially with degrees of aneuploidy. Cancer Genet Cytogenet 143: 59–72.
- 26. Duesberg P, Rausch C, Rasnick D, Hehlmann R (1998) Genetic instability of cancer cells is proportional to their degree of aneuploidy. Proc Natl Acad Sci U S A 95: 13692–13697.
- 27. Weinstein RS, Pauli BU (1987) Cell junctions and the biological behaviour of cancer. Ciba Found Symp 125: 240–260.
- 28. Galluzzi L, Morselli E, Kepp O, Vitale I, Rigoni A, et al. (2009) Mitochondrial gateways to cancer. Mol Aspects Med.
- 29. Colotta F, Allavena P, Sica A, Garlanda C, Mantovani A (2009) Cancer-related inflammation, the seventh hallmark of cancer: links to genetic instability. Carcinogenesis 30: 1073–1081.
- 30. Allavena P, Garlanda C, Borrello MG, Sica A, Mantovani A (2008) Pathways connecting inflammation and cancer. Curr Opin Genet Dev 18: 3–10.
- 31. Caino MC, Meshki J, Kazanietz MG (2009) Hallmarks for senescence in carcinogenesis: novel signaling players. Apoptosis 14: 392–408.
- 32. Naugler WE, Karin M (2008) NF-kappaB and cancer-identifying targets and mechanisms. Curr Opin Genet Dev 18: 19–26.
- 33. Doratiotto S, Marongiu F, Faedda S, Pani P, Laconi E (2009) Altered growth pattern, not altered growth per se, is the hallmark of early lesions preceding cancer development. Histol Histopathol 24: 101–106.
- 34. Fabbri M (2008) MicroRNAs and cancer epigenetics. Curr Opin Investig Drugs 9: 583–590.
- 35. Daley GQ (2008) Common themes of dedifferentiation in somatic cell reprogramming and cancer. Cold Spring Harb Symp Quant Biol 73: 171–174.
- 36. Tenen DG (2003) Disruption of differentiation in human cancer: AML shows the way. Nat Rev Cancer 3: 89–101.
- 37. Blagosklonny MV (2005) Molecular theory of cancer. Cancer Biol Ther 4: 621–627.
- 38. Luo J, Solimini NL, Elledge SJ (2009) Principles of cancer therapy: oncogene and non-oncogene addiction. Cell 136: 823–837.
- 39. Wodarz D (2008) Stem cell regulation and the development of blast crisis in chronic myeloid leukemia: Implications for the outcome of Imatinib treatment and discontinuation. Med Hypotheses 70: 128–136.
- 40. Gleason DF (1966) Classification of prostatic carcinomas. Cancer Chemother Rep 50: 125–128.
- 41. Ghani KR, Grigor K, Tulloch DN, Bollina PR, McNeill SA (2005) Trends in reporting Gleason score 1991 to 2001: changes in the pathologist's practice. Eur Urol 47: 196–201.
- 42. Oyama T, Allsbrook WC Jr, Kurokawa K, Matsuda H, Segawa A, et al. (2005) A comparison of interobserver reproducibility of Gleason grading of prostatic carcinoma in Japan and the United States. Arch Pathol Lab Med 129: 1004–1010.
- 43. Tarhini AA, Agarwala SS (2006) Cutaneous melanoma: available therapy for metastatic disease. Dermatol Ther 19: 19–25.
- 44. Lapointe J, Li C, Higgins JP, van de Rijn M, Bair E, et al. (2004) Gene expression profiling identifies clinically relevant subtypes of prostate cancer. Proc Natl Acad Sci U S A 101: 811–816.
- 45. Camargo JA (2008) Revisiting the relation between species diversity and information theory. Acta Biotheor 56: 275–283.
- 46. Forget A, Ayrault O, den Besten W, Kuo ML, Sherr CJ, et al. (2008) Differential post-transcriptional regulation of two Ink4 proteins, p18(Ink4c) and p19(Ink4d). Cell Cycle 7:
- 47. Canepa ET, Scassa ME, Ceruti JM, Marazita MC, Carcagno AL, et al. (2007) INK4 proteins, a family of mammalian CDK inhibitors with novel biological functions. IUBMB Life 59: 419–426.
- 48. Wonsey DR, Follettie MT (2005) Loss of the forkhead transcription factor FoxM1 causes centrosome amplification and mitotic catastrophe. Cancer Res 65: 5181–5189.
- 49. Laoukili J, Kooistra MR, Bras A, Kauw J, Kerkhoven RM, et al. (2005) FoxM1 is required for execution of the mitotic programme and chromosome stability. Nat Cell Biol 7: 126–136.
- 50. Ostrander EA, Udler MS (2008) The role of the BRCA2 gene in susceptibility to prostate cancer revisited. Cancer Epidemiol Biomarkers Prev 17: 1843–1848.
- 51. Mitra A, Fisher C, Foster CS, Jameson C, Barbachanno Y, et al. (2008) Prostate cancer in male BRCA1 and BRCA2 mutation carriers has a more aggressive phenotype. Br J Cancer 98: 502–507.
- 52. Moro L, Arbini AA, Marra E, Greco M (2005) Down-regulation of BRCA2 expression by collagen type I promotes prostate cancer cell proliferation. J Biol Chem 280: 22482–22491.
- 53. Gronberg H, Ahman AK, Emanuelsson M, Bergh A, Damber JE, et al. (2001) BRCA2 mutation in a family with hereditary prostate cancer. Genes Chromosomes Cancer 30: 299–301.
- 54. Moro L, Arbini AA, Yao JL, di Sant'Agnese PA, Marra E, et al. (2008) Loss of BRCA2 promotes prostate cancer cell invasion through up-regulation of matrix metalloproteinase-9. Cancer Sci 99: 553–563.
- 55. Narod SA, Neuhausen S, Vichodez G, Armel S, Lynch HT, et al. (2008) Rapid progression of prostate cancer in men with a BRCA2 mutation. Br J Cancer 99: 371–374.
- 56. Agalliu I, Karlins E, Kwon EM, Iwasaki LM, Diamond A, et al. (2007) Rare germline mutations in the BRCA2 gene are associated with early-onset prostate cancer. Br J Cancer 97: 826–831.
- 57. van Golen KL, Ying C, Sequeira L, Dubyk CW, Reisenberger T, et al. (2008) CCL2 induces prostate cancer transendothelial cell migration via activation of the small GTPase Rac. J Cell Biochem 104: 1587–1597.
- 58. Mizutani K, Sud S, Pienta KJ (2009) Prostate cancer promotes CD11b positive cells to differentiate into osteoclasts. J Cell Biochem.
- 59. Li X, Loberg R, Liao J, Ying C, Snyder LA, et al. (2009) A destructive cascade mediated by CCL2 facilitates prostate cancer growth in bone. Cancer Res 69: 1685–1692.
- 60. Loberg RD, Day LL, Harwood J, Ying C, St John LN, et al. (2006) CCL2 is a potent regulator of prostate cancer cell migration and proliferation. Neoplasia 8: 578–586.
- 61. Roca H, Varsos ZS, Mizutani K, Pienta KJ (2008) CCL2, survivin and autophagy: new links with implications in human cancer. Autophagy 4: 969–971.
- 62. Roca H, Varsos Z, Pienta KJ (2008) CCL2 protects prostate cancer PC3 cells from autophagic death via phosphatidylinositol 3-kinase/AKT-dependent survivin up-regulation. J Biol Chem 283: 25057–25073.
- 63. Craig M, Ying C, Loberg RD (2008) Co-inoculation of prostate cancer cells with U937 enhances tumor growth and angiogenesis in vivo. J Cell Biochem 103: 1–8.
- 64. Lu Y, Xiao G, Galson DL, Nishio Y, Mizokami A, et al. (2007) PTHrP-induced MCP-1 production by human bone marrow endothelial cells and osteoblasts promotes osteoclast differentiation and prostate cancer cell proliferation and invasion in vitro. Int J Cancer 121: 724–733.
- 65. Lu Y, Cai Z, Xiao G, Keller ET, Mizokami A, et al. (2007) Monocyte chemotactic protein-1 mediates prostate cancer-induced bone resorption. Cancer Res 67: 3646–3653.
- 66. Loberg RD, Ying C, Craig M, Yan L, Snyder LA, et al. (2007) CCL2 as an important mediator of prostate cancer growth in vivo through the regulation of macrophage infiltration. Neoplasia 9: 556–562.
- 67. Loberg RD, Ying C, Craig M, Day LL, Sargent E, et al. (2007) Targeting CCL2 with systemic delivery of neutralizing antibodies induces prostate cancer tumor regression in vivo. Cancer Res 67: 9417–9424.
- 68. Loberg RD, Tantivejkul K, Craig M, Neeley CK, Pienta KJ (2007) PAR1-mediated RhoA activation facilitates CCL2-induced chemotaxis in PC-3 cells. J Cell Biochem 101: 1292–1300.
- 69. Chetcuti A, Margan S, Mann S, Russell P, Handelsman D, et al. (2001) Identification of differentially expressed genes in organ-confined prostate cancer by gene expression array. Prostate 47: 132–140.
- 70. Edwards J, Krishna NS, Mukherjee R, Bartlett JM (2004) The role of c-Jun and c-Fos expression in androgen-independent prostate cancer. J Pathol 204: 153–158.
- 71. Schlomm T, Hellwinkel OJ, Buness A, Ruschhaupt M, Lubke AM, et al. (2008) Molecular Cancer Phenotype in Normal Prostate Tissue. Eur Urol.
- 72. Ouyang X, Jessen WJ, Al-Ahmadie H, Serio AM, Lin Y, et al. (2008) Activator protein-1 transcription factors are associated with progression and recurrence of prostate cancer. Cancer Res 68: 2132–2144.
- 73. Mukherjee R, Bartlett JM, Krishna NS, Underwood MA, Edwards J (2005) Raf-1 expression may influence progression to androgen insensitive prostate cancer. Prostate 64: 101–107.
- 74. Ivorra C, Kubicek M, Gonzalez JM, Sanz-Gonzalez SM, Alvarez-Barrientos A, et al. (2006) A mechanism of AP-1 suppression through interaction of c-Fos with lamin A/C. Genes Dev 20: 307–320.
- 75. Gonzalez JM, Navarro-Puche A, Casar B, Crespo P, Andres V (2008) Fast regulation of AP-1 activity through interaction of lamin A/C, ERK1/2, and c-Fos at the nuclear envelope. J Cell Biol 183: 653–666.
- 76. Thomsen MK, Francis JC, Swain A (2008) The role of Sox9 in prostate development. Differentiation 76: 728–735.
- 77. Thomsen MK, Butler CM, Shen MM, Swain A (2008) Sox9 is required for prostate development. Dev Biol 316: 302–311.
- 78. Ostrer H (2000) Sexual differentiation. Semin Reprod Med 18: 41–49.
- 79. Schaeffer EM, Marchionni L, Huang Z, Simons B, Blackman A, et al. (2008) Androgen-induced programs for prostate epithelial growth and invasion arise in embryogenesis and are reactivated in cancer. Oncogene 27: 7180–7191.
- 80. Wang H, McKnight NC, Zhang T, Lu ML, Balk SP, et al. (2007) SOX9 is expressed in normal prostate basal cells and regulates androgen receptor expression in prostate cancer cells. Cancer Res 67: 528–536.
- 81. Acevedo VD, Gangula RD, Freeman KW, Li R, Zhang Y, et al. (2007) Inducible FGFR-1 activation leads to irreversible prostate adenocarcinoma and an epithelial-to-mesenchymal transition. Cancer Cell 12: 559–571.
- 82. Wang H, Leav I, Ibaragi S, Wegner M, Hu GF, et al. (2008) SOX9 is expressed in human fetal prostate epithelium and enhances prostate cancer invasion. Cancer Res 68: 1625–1630.
- 83. Dudley AC, Khan ZA, Shih SC, Kang SY, Zwaans BM, et al. (2008) Calcification of multipotent prostate tumor endothelium. Cancer Cell 14: 201–211.
- 84. Fujita A, Gomes LR, Sato JR, Yamaguchi R, Thomaz CE, et al. (2008) Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates. BMC Syst Biol 2: 106.
- 85. Denham JW, Lamb DS, Joseph D, Matthews J, Atkinson C, et al. (2008) PSA response signatures - a powerful new prognostic indicator after radiation for prostate cancer? Radiother Oncol.
- 86. Winter A, Uphoff J, Henke RP, Wawroschek F (2009) [First results of PET/CT-guided secondary lymph node surgery on patients with a PSA relapse after radical prostatectomy]. Aktuelle Urol 40: 294–299.
- 87. Fujita K, Nakayama M, Nakai Y, Takayama H, Nishimura K, et al. (2009) Vascular endothelial growth factor receptor 1 expression in pelvic lymph nodes predicts the risk of cancer progression after radical prostatectomy. Cancer Sci 100: 1047–1050.
- 88. Roche JB, Malavaud B, Soulie M, Cournot M, Game X, et al. (2008) [Pathological stage T3 prostate cancer after radical prostatectomy: a retrospective study of 246 cases]. Prog Urol 18: 586–594.
- 89. Roder MA, Reinhardt S, Brasso K, Iversen P (2008) [Prostate cancer patients with lymph node metastasis. Outcome in a consecutive group of 59 patients]. Ugeskr Laeger 170: 2554–2558.
- 90. Bastide C, Brenot-Rossi I, Garcia S, Rossi D (2009) Radioisotope guided sentinel lymph node dissection in patients with localized prostate cancer: results of the first 100 cases. Eur J Surg Oncol 35: 751–756.
- 91. Pettus JA, Masterson TA, Abel EJ, Middleton RG, Stephenson RA (2008) Risk stratification for positive lymph nodes in prostate cancer. J Endourol 22: 1021–1025.
- 92. Karam JA, Svatek RS, Karakiewicz PI, Gallina A, Roehrborn CG, et al. (2008) Use of preoperative plasma endoglin for prediction of lymph node metastasis in patients with clinically localized prostate cancer. Clin Cancer Res 14: 1418–1422.
- 93. Karakiewicz P (2006) The rate of lymph node invasion (LNI) in men with PSA values less than 10 ng/ml. Eur Urol 50: 277–278.
- 94. Wang D, Lawton C (2008) Pelvic lymph node irradiation for prostate cancer: who, why, and when? Semin Radiat Oncol 18: 35–40.
- 95. Sheridan T, Herawi M, Epstein JI, Illei PB (2007) The role of P501S and PSA in the diagnosis of metastatic adenocarcinoma of the prostate. Am J Surg Pathol 31: 1351–1355.
- 96. Schoder H, Bochner B (2007) Detection and management of isolated lymph node recurrence in patients with PSA relapse. Eur Urol 52: 310–312.
- 97. Miyake H, Kurahashi T, Hara I, Takenaka A, Fujisawa M (2007) Significance of micrometastases in pelvic lymph nodes detected by real-time reverse transcriptase polymerase chain reaction in patients with clinically localized prostate cancer undergoing radical prostatectomy after neoadjuvant hormonal therapy. BJU Int 99: 315–320.
- 98. Cimitan M, Bortolus R, Morassut S, Canzonieri V, Garbeglio A, et al. (2006) [18F]fluorocholine PET/CT imaging for the detection of recurrent prostate cancer at PSA relapse: experience in 100 consecutive patients. Eur J Nucl Med Mol Imaging 33: 1387–1398.
- 99. Weckermann D, Goppelt M, Dorn R, Wawroschek F, Harzmann R (2006) Incidence of positive pelvic lymph nodes in patients with prostate cancer, a prostate-specific antigen (PSA) level of < or = 10 ng/mL and biopsy Gleason score of < or = 6, and their influence on PSA progression-free survival after radical prostatectomy. BJU Int 97: 1173–1178.
- 100. Radosavljevic R, Hadzi-Djokic J, Acimovic M, Tulic C, Dzamic Z, et al. (2005) Histopathological evaluation of radical prostatectomy in the treatment of localized prostate cancer. Acta Chir Iugosl 52: 109–112.
- 101. Kroepfl D, Loewen H, Roggenbuck U, Musch M, Klevecka V (2006) Disease progression and survival in patients with prostate carcinoma and positive lymph nodes after radical retropubic prostatectomy. BJU Int 97: 985–991.
- 102. Schumacher MC, Burkhard FC, Thalmann GN, Fleischmann A, Studer UE (2006) Is pelvic lymph node dissection necessary in patients with a serum PSA<10ng/ml undergoing radical prostatectomy for prostate cancer? Eur Urol 50: 272–279.
- 103. Sakai I, Harada K, Kurahashi T, Muramaki M, Yamanaka K, et al. (2006) Usefulness of the nadir value of serum prostate-specific antigen measured by an ultrasensitive assay as a predictor of biochemical recurrence after radical prostatectomy for clinically localized prostate cancer. Urol Int 76: 227–231.
- 104. Cheng L, Jones TD, Lin H, Eble JN, Zeng G, et al. (2005) Lymphovascular invasion is an independent prognostic factor in prostatic adenocarcinoma. J Urol 174: 2181–2185.
- 105. Malmstrom PU (2005) Lymph node staging in prostatic carcinoma revisited. Acta Oncol 44: 593–598.
- 106. Palapattu GS, Allaf ME, Trock BJ, Epstein JI, Walsh PC (2004) Prostate specific antigen progression in men with lymph node metastases following radical prostatectomy: results of long-term followup. J Urol 172: 1860–1864.
- 107. Rogers CG, Khan MA, Craig Miller M, Veltri RW, Partin AW (2004) Natural history of disease progression in patients who fail to achieve an undetectable prostate-specific antigen level after undergoing radical prostatectomy. Cancer 101: 2549–2556.
- 108. Shoskes DA, Trachtenberg J (1993) The value of prostate-specific antigen levels in pelvic lymph nodes for diagnosing metastatic spread of prostate cancer. Can J Surg 36: 33–36.
- 109. Bishoff JT, Reyes A, Thompson IM, Harris MJ, St Clair SR, et al. (1995) Pelvic lymphadenectomy can be omitted in selected patients with carcinoma of the prostate: development of a system of patient selection. Urology 45: 270–274.
- 110. Haqq C, Nosrati M, Sudilovsky D, Crothers J, Khodabakhsh D, et al. (2005) The gene expression signatures of melanoma progression. Proc Natl Acad Sci U S A 102: 6092–6097.
- 111. Chen X (1999) The p53 family: same response, different signals? Mol Med Today 5: 387–392.
- 112. Johnson J, Lagowski J, Sundberg A, Kulesz-Martin M (2005) P53 family activities in development and cancer: relationship to melanocyte and keratinocyte carcinogenesis. J Invest Dermatol 125: 857–864.
- 113. Tomkova K, Tomka M, Zajac V (2008) Contribution of p53, p63, and p73 to the developmental diseases and cancer. Neoplasma 55: 177–181.
- 114. Blandino G, Dobbelstein M (2004) p73 and p63: why do we still need them? Cell Cycle 3: 886–894.
- 115. Petitjean A, Hainaut P, Caron de Fromentel C (2006) TP63 gene in stress response and carcinogenesis: a broader role than expected. Bull Cancer 93: E126–135.
- 116. Petitjean A, Ruptier C, Tribollet V, Hautefeuille A, Chardon F, et al. (2008) Properties of the six isoforms of p63: p53-like regulation in response to genotoxic stress and cross talk with DeltaNp73. Carcinogenesis 29: 273–281.
- 117. Lena AM, Shalom-Feuerstein R, di Val Cervo PR, Aberdam D, Knight RA, et al. (2008) miR-203 represses ‘stemness’ by repressing DeltaNp63. Cell Death Differ 15: 1187–1195.
- 118. Kulesz-Martin M, Lagowski J, Fei S, Pelz C, Sears R, et al. (2005) Melanocyte and keratinocyte carcinogenesis: p53 family protein activities and intersecting mRNA expression profiles. J Investig Dermatol Symp Proc 10: 142–152.
- 119. Brinck U, Ruschenburg I, Di Como CJ, Buschmann N, Betke H, et al. (2002) Comparative study of p63 and p53 expression in tissue microarrays of malignant melanomas. Int J Mol Med 10: 707–711.
- 120. Sbisa E, Mastropasqua G, Lefkimmiatis K, Caratozzolo MF, D'Erchia AM, et al. (2006) Connecting p63 to cellular proliferation: the example of the adenosine deaminase target gene. Cell Cycle 5: 205–212.
- 121. Desrosiers MD, Cembrola KM, Fakir MJ, Stephens LA, Jama FM, et al. (2007) Adenosine deamination sustains dendritic cell activation in inflammation. J Immunol 179: 1884–1892.
- 122. Finlan LE, Nenutil R, Ibbotson SH, Vojtesek B, Hupp TR (2006) CK2-site phosphorylation of p53 is induced in DeltaNp63 expressing basal stem cells in UVB irradiated human skin. Cell Cycle 5: 2489–2494.
- 123. Aghaei M, Karami-Tehrani F, Salami S, Atri M (2005) Adenosine deaminase activity in the serum and malignant tumors of breast cancer: the assessment of isoenzyme ADA1 and ADA2 activities. Clin Biochem 38: 887–891.
- 124. Kalcioglu MT, Kizilay A, Yilmaz HR, Uz E, Gulec M, et al. (2004) Adenosine deaminase, xanthine oxidase, superoxide dismutase, glutathione peroxidase activities and malondialdehyde levels in the sera of patients with head and neck carcinoma. Kulak Burun Bogaz Ihtis Derg 12: 16–22.
- 125. Eroglu A, Canbolat O, Demirci S, Kocaoglu H, Eryavuz Y, et al. (2000) Activities of adenosine deaminase and 5′-nucleotidase in cancerous and noncancerous human colorectal tissues. Med Oncol 17: 319–324.
- 126. Hussein NG, el-Belbessy SF (1998) Serum adenosine deaminase and arylsulphatase A as an index of early infiltration of central nervous system in acute lymphoblastic leukemia. J Egypt Public Health Assoc 73: 97–109.
- 127. Canbolat O, Akyol O, Kavutcu M, Isik AU, Durak I (1994) Serum adenosine deaminase and total superoxide dismutase activities before and after surgical removal of cancerous laryngeal tissue. J Laryngol Otol 108: 849–851.
- 128. Gilmore BF, Lynas JF, Scott CJ, McGoohan C, Martin L, et al. (2006) Dipeptide proline diphenyl phosphonates are potent, irreversible inhibitors of seprase (FAPalpha). Biochem Biophys Res Commun 346: 436–446.
- 129. Wesley UV, McGroarty M, Homoyouni A (2005) Dipeptidyl peptidase inhibits malignant phenotype of prostate cancer cells by blocking basic fibroblast growth factor signaling pathway. Cancer Res 65: 1325–1334.
- 130. Roesch A, Wittschier S, Becker B, Landthaler M, Vogt T (2006) Loss of dipeptidyl peptidase IV immunostaining discriminates malignant melanomas from deep penetrating nevi. Mod Pathol 19: 1378–1385.
- 131. Morrison ME, Vijayasaradhi S, Engelstein D, Albino AP, Houghton AN (1993) A marker for neoplastic progression of human melanocytes is a cell surface ectopeptidase. J Exp Med 177: 1135–1143.
- 132. Van den Oord JJ (1998) Expression of CD26/dipeptidyl-peptidase IV in benign and malignant pigment-cell lesions of the skin. Br J Dermatol 138: 615–621.
- 133. Linden J (2006) Adenosine metabolism and cancer. Focus on “Adenosine downregulates DPPIV on HT-29 colon cancer cells by stimulating protein tyrosine phosphatases and reducing ERK1/2 activity via a novel pathway”. Am J Physiol Cell Physiol 291: C405–406.
- 134. Tan EY, Mujoomdar M, Blay J (2004) Adenosine down-regulates the surface expression of dipeptidyl peptidase IV on HT-29 human colorectal carcinoma cells: implications for cancer cell behavior. Am J Pathol 165: 319–330.
- 135. Rai R, Phadnis A, Haralkar S, Badwe RA, Dai H, et al. (2008) Differential regulation of centrosome integrity by DNA damage response proteins. Cell Cycle 7: 2225–2233.
- 136. Glinsky GV (2006) Genomic models of metastatic cancer: functional analysis of death-from-cancer signature genes reveals aneuploid, anoikis-resistant, metastasis-enabling phenotype with altered cell cycle control and activated Polycomb Group (PcG) protein chromatin silencing pathway. Cell Cycle 5: 1208–1216.
- 137. Weichert W, Ullrich A, Schmidt M, Gekeler V, Noske A, et al. (2006) Expression patterns of polo-like kinase 1 in human gastric cancer. Cancer Sci 97: 271–276.
- 138. Yamamoto Y, Matsuyama H, Kawauchi S, Matsumoto H, Nagao K, et al. (2006) Overexpression of polo-like kinase 1 (PLK1) and chromosomal instability in bladder cancer. Oncology 70: 231–237.
- 139. Weichert W, Kristiansen G, Schmidt M, Gekeler V, Noske A, et al. (2005) Polo-like kinase 1 expression is a prognostic factor in human colon cancer. World J Gastroenterol 11: 5644–5650.
- 140. Takahashi T, Sano B, Nagata T, Kato H, Sugiyama Y, et al. (2003) Polo-like kinase 1 (PLK1) is overexpressed in primary colorectal cancers. Cancer Sci 94: 148–152.
- 141. Wang XQ, Zhu YQ, Lui KS, Cai Q, Lu P, et al. (2008) Aberrant Polo-like kinase 1-Cdc25A pathway in metastatic hepatocellular carcinoma. Clin Cancer Res 14: 6813–6820.
- 142. Ito Y, Nakamura Y, Yoshida H, Tomoda C, Uruno T, et al. (2005) Polo-like kinase 1 expression in medullary carcinoma of the thyroid: its relationship with clinicopathological features. Pathobiology 72: 186–190.
- 143. Bu Y, Yang Z, Li Q, Song F (2008) Silencing of polo-like kinase (Plk) 1 via siRNA causes inhibition of growth and induction of apoptosis in human esophageal cancer cells. Oncology 74: 198–206.
- 144. Gray PJ Jr, Bearss DJ, Han H, Nagle R, Tsao MS, et al. (2004) Identification of human polo-like kinase 1 as a potential therapeutic target in pancreatic cancer. Mol Cancer Ther 3: 641–646.
- 145. Bhanot G, Alexe G, Levine AJ, Stolovitzky G (2005) Robust diagnosis of non-Hodgkin lymphoma phenotypes validated on gene expression data from different laboratories. Genome Inform 16: 233–244.
- 146. Weichert W, Kristiansen G, Winzer KJ, Schmidt M, Gekeler V, et al. (2005) Polo-like kinase isoforms in breast cancer: expression patterns and prognostic implications. Virchows Arch 446: 442–450.
- 147. Kneisel L, Strebhardt K, Bernd A, Wolter M, Binder A, et al. (2002) Expression of polo-like kinase (PLK1) in thin melanomas: a novel marker of metastatic disease. J Cutan Pathol 29: 354–358.
- 148. Huang X, Zhuang L, Cao Y, Gao Q, Han Z, et al. (2008) Biodistribution and kinetics of the novel selective oncolytic adenovirus M1 after systemic administration. Mol Cancer Ther 7: 1624–1632.
- 149. Judge AD, Robbins M, Tavakoli I, Levi J, Hu L, et al. (2009) Confirming the RNAi-mediated mechanism of action of siRNA-based cancer therapeutics in mice. J Clin Invest 119: 661–673.
- 150. Liu X, Lei M, Erikson RL (2006) Normal cells, but not cancer cells, survive severe Plk1 depletion. Mol Cell Biol 26: 2093–2108.
- 151. Liu X, Erikson RL (2003) Polo-like kinase (Plk)1 depletion induces apoptosis in cancer cells. Proc Natl Acad Sci U S A 100: 5789–5794.
- 152. Zhou Q, Su Y, Bai M (2008) Effect of antisense RNA targeting Polo-like kinase 1 on cell growth in A549 lung cancer cells. J Huazhong Univ Sci Technolog Med Sci 28: 22–26.
- 153. Sur S, Pagliarini R, Bunz F, Rago C, Diaz LA Jr, et al. (2009) A panel of isogenic human cancer cells suggests a therapeutic approach for cancers with inactivated p53. Proc Natl Acad Sci U S A 106: 3964–3969.
- 154. Lei M, Erikson RL (2008) Plk1 depletion in nontransformed diploid cells activates the DNA-damage checkpoint. Oncogene 27: 3935–3943.
- 155. Chang JT, Nevins JR (2006) GATHER: a systems approach to interpreting genomic signatures. Bioinformatics 22: 2926–2933.
- 156. Reimand J, Kull M, Peterson H, Hansen J, Vilo J (2007) g:Profiler–a web-based toolset for functional profiling of gene lists from large-scale experiments. Nucleic Acids Res 35: W193–200.
- 157. Steijlen PM, van Steensel MA, Jansen BJ, Blokx W, van de Kerkhof PC, et al. (2004) Cryptic splicing at a non-consensus splice-donor in a patient with a novel mutation in the plakophilin-1 gene. J Invest Dermatol 122: 1321–1324.
- 158. Lai-Cheong JE, Arita K, McGrath JA (2007) Genetic diseases of junctions. J Invest Dermatol 127: 2713–2725.
- 159. Ersoy-Evans S, Erkin G, Fassihi H, Chan I, Paller AS, et al. (2006) Ectodermal dysplasia-skin fragility syndrome resulting from a new homozygous mutation, 888delC, in the desmosomal protein plakophilin 1. J Am Acad Dermatol 55: 157–161.
- 160. Wessagowit V, McGrath JA (2005) Clinical and molecular significance of splice site mutations in the plakophilin 1 gene in patients with ectodermal dysplasia-skin fragility syndrome. Acta Derm Venereol 85: 386–388.
- 161. McGrath JA (2005) Inherited disorders of desmosomes. Australas J Dermatol 46: 221–229.
- 162. Boralevi F, Haftek M, Vabres P, Lepreux S, Goizet C, et al. (2005) Hereditary mucoepithelial dysplasia: clinical, ultrastructural and genetic study of eight patients and literature review. Br J Dermatol 153: 310–318.
- 163. Sprecher E, Molho-Pessach V, Ingber A, Sagi E, Indelman M, et al. (2004) Homozygous splice site mutations in PKP1 result in loss of epidermal plakophilin 1 expression and underlie ectodermal dysplasia/skin fragility syndrome in two consanguineous families. J Invest Dermatol 122: 647–651.
- 164. Cheng X, Mihindukulasuriya K, Den Z, Kowalczyk AP, Calkins CC, et al. (2004) Assessment of splice variant-specific functions of desmocollin 1 in the skin. Mol Cell Biol 24: 154–163.
- 165. McMillan JR, Shimizu H (2001) Desmosomes: structure and function in normal and diseased epidermis. J Dermatol 28: 291–298.
- 166. Whittock NV, Haftek M, Angoulvant N, Wolf F, Perrot H, et al. (2000) Genomic amplification of the human plakophilin 1 gene and detection of a new mutation in ectodermal dysplasia/skin fragility syndrome. J Invest Dermatol 115: 368–374.
- 167. Whittock NV, Eady RA, McGrath JA (2000) Genomic organization and amplification of the human plakoglobin gene (JUP). Exp Dermatol 9: 323–326.
- 168. McGrath JA, Hoeger PH, Christiano AM, McMillan JR, Mellerio JE, et al. (1999) Skin fragility and hypohidrotic ectodermal dysplasia resulting from ablation of plakophilin 1. Br J Dermatol 140: 297–307.
- 169. McGrath JA (1999) Hereditary diseases of desmosomes. J Dermatol Sci 20: 85–91.
- 170. McGrath JA (1999) A novel genodermatosis caused by mutations in plakophilin 1, a structural component of desmosomes. J Dermatol 26: 764–769.
- 171. Sobolik-Delmaire T, Katafiasz D, Keim SA, Mahoney MG, Wahl JK 3rd (2007) Decreased plakophilin-1 expression promotes increased motility in head and neck squamous cell carcinoma cells. Cell Commun Adhes 14: 99–109.
- 172. Moll I, Kurzen H, Langbein L, Franke WW (1997) The distribution of the desmosomal protein, plakophilin 1, in human skin and skin tumors. J Invest Dermatol 108: 139–146.
- 173. Acehan D, Petzold C, Gumper I, Sabatini DD, Muller EJ, et al. (2008) Plakoglobin is required for effective intermediate filament anchorage to desmosomes. J Invest Dermatol 128: 2665–2675.
- 174. Schmidt A, Jager S (2005) Plakophilins–hard work in the desmosome, recreation in the nucleus? Eur J Cell Biol 84: 189–204.
- 175. Hatzfeld M, Haffner C, Schulze K, Vinzens U (2000) The function of plakophilin 1 in desmosome assembly and actin filament organization. J Cell Biol 149: 209–222.
- 176. Hatzfeld M (2007) Plakophilins: Multifunctional proteins or just regulators of desmosomal adhesion? Biochim Biophys Acta 1773: 69–77.
- 177. Setzer SV, Calkins CC, Garner J, Summers S, Green KJ, et al. (2004) Comparative analysis of armadillo family proteins in the regulation of a431 epithelial cell junction assembly, adhesion and migration. J Invest Dermatol 123: 426–433.
- 178. Godsel LM, Hsieh SN, Amargo EV, Bass AE, Pascoe-McGillicuddy LT, et al. (2005) Desmoplakin assembly dynamics in four dimensions: multiple phases differentially regulated by intermediate filaments and actin. J Cell Biol 171: 1045–1059.
- 179. Wahl JK 3rd (2005) A role for plakophilin-1 in the initiation of desmosome assembly. J Cell Biochem 96: 390–403.
- 180. South AP, Wan H, Stone MG, Dopping-Hepenstal PJ, Purkis PE, et al. (2003) Lack of plakophilin 1 increases keratinocyte migration and reduces desmosome stability. J Cell Sci 116: 3303–3314.
- 181. Kami K, Chidgey M, Dafforn T, Overduin M (2009) The desmoglein-specific cytoplasmic region is intrinsically disordered in solution and interacts with multiple desmosomal protein partners. J Mol Biol 386: 531–543.
- 182. Weiske J, Schoneberg T, Schroder W, Hatzfeld M, Tauber R, et al. (2001) The fate of desmosomal proteins in apoptotic cells. J Biol Chem 276: 41175–41181.
- 183. Ishibashi K, Hara S, Kondo S (2008) Aquaporin water channels in mammals. Clin Exp Nephrol.
- 184. Boury-Jamot M, Daraspe J, Bonte F, Perrier E, Schnebert S, et al. (2009) Skin aquaporins: function in hydration, wound healing, and skin epidermis homeostasis. Handb Exp Pharmacol 205–217.
- 185. Kida H, Miyoshi T, Manabe K, Takahashi N, Konno T, et al. (2005) Roles of aquaporin-3 water channels in volume-regulatory water flow in a human epithelial cell line. J Membr Biol 208: 55–64.
- 186. Sugiyama Y, Ota Y, Hara M, Inoue S (2001) Osmotic stress up-regulates aquaporin-3 gene expression in cultured human keratinocytes. Biochim Biophys Acta 1522: 82–88.
- 187. Olsson M, Broberg A, Jernas M, Carlsson L, Rudemo M, et al. (2006) Increased expression of aquaporin 3 in atopic eczema. Allergy 61: 1132–1137.
- 188. Bienert GP, Moller AL, Kristiansen KA, Schulz A, Moller IM, et al. (2007) Specific aquaporins facilitate the diffusion of hydrogen peroxide across membranes. J Biol Chem 282: 1183–1192.
- 189. Cao C, Wan S, Jiang Q, Amaral A, Lu S, et al. (2008) All-trans retinoic acid attenuates ultraviolet radiation-induced down-regulation of aquaporin-3 and water permeability in human keratinocytes. J Cell Physiol 215: 506–516.
- 190. Fruehauf JP, Meyskens FL Jr (2007) Reactive oxygen species: a breath of life or death? Clin Cancer Res 13: 789–794.
- 191. Swisshelm K, Macek R, Kubbies M (2005) Role of claudins in tumorigenesis. Adv Drug Deliv Rev 57: 919–928.
- 192. Arabzadeh A, Troy TC, Turksen K (2007) Changes in the distribution pattern of Claudin tight junction proteins during the progression of mouse skin tumorigenesis. BMC Cancer 7: 196.
- 193. Chao YC, Pan SH, Yang SC, Yu SL, Che TF, et al. (2009) Claudin-1 is a metastasis suppressor and correlates with clinical outcome in lung adenocarcinoma. Am J Respir Crit Care Med 179: 123–133.
- 194. Hoevel T, Macek R, Swisshelm K, Kubbies M (2004) Reexpression of the TJ protein CLDN1 induces apoptosis in breast tumor spheroids. Int J Cancer 108: 374–383.
- 195. Tokes AM, Kulka J, Paku S, Szik A, Paska C, et al. (2005) Claudin-1, -3 and -4 proteins and mRNA expression in benign and malignant breast lesions: a research study. Breast Cancer Res 7: R296–305.
- 196. Morohashi S, Kusumi T, Sato F, Odagiri H, Chiba H, et al. (2007) Decreased expression of claudin-1 correlates with recurrence status in breast cancer. Int J Mol Med 20: 139–143.
- 197. Murakami T, Toda S, Fujimoto M, Ohtsuki M, Byers HR, et al. (2001) Constitutive activation of Wnt/beta-catenin signaling pathway in migration-active melanoma cells: role of LEF-1 in melanoma with increased metastatic potential. Biochem Biophys Res Commun 288: 8–15.
- 198. Brown LF, Papadopoulos-Sergiou A, Berse B, Manseau EJ, Tognazzi K, et al. (1994) Osteopontin expression and distribution in human carcinomas. Am J Pathol 145: 610–623.
- 199. Katagiri Y, Mori K, Hara T, Tanaka K, Murakami M, et al. (1995) Functional analysis of the osteopontin molecule. Ann N Y Acad Sci 760: 371–374.
- 200. Chellaiah M, Fitzgerald C, Filardo EJ, Cheresh DA, Hruska KA (1996) Osteopontin activation of c-src in human melanoma cells requires the cytoplasmic domain of the integrin alpha v-subunit. Endocrinology 137: 2432–2440.
- 201. Smith LL, Cheung HK, Ling LE, Chen J, Sheppard D, et al. (1996) Osteopontin N-terminal domain contains a cryptic adhesive sequence recognized by alpha9beta1 integrin. J Biol Chem 271: 28485–28491.
- 202. Jang A, Hill RP (1997) An examination of the effects of hypoxia, acidosis, and glucose starvation on the expression of metastasis-associated genes in murine tumor cells. Clin Exp Metastasis 15: 469–483.
- 203. Smith LL, Giachelli CM (1998) Structural requirements for alpha 9 beta 1-mediated adhesion and migration to thrombin-cleaved osteopontin. Exp Cell Res 242: 351–360.
- 204. Gilbert M, Shaw WJ, Long JR, Nelson K, Drobny GP, et al. (2000) Chimeric peptides of statherin and osteopontin that bind hydroxyapatite and mediate cell adhesion. J Biol Chem 275: 16213–16218.
- 205. Harant H, Spinner D, Reddy GS, Lindley IJ (2000) Natural metabolites of 1alpha, 25-dihydroxyvitamin D(3) retain biologic activity mediated through the vitamin D receptor. J Cell Biochem 78: 112–120.
- 206. Nemoto H, Rittling SR, Yoshitake H, Furuya K, Amagasa T, et al. (2001) Osteopontin deficiency reduces experimental tumor cell metastasis to bone and soft tissues. J Bone Miner Res 16: 652–659.
- 207. Philip S, Bulbule A, Kundu GC (2001) Osteopontin stimulates tumor growth and activation of promatrix metalloproteinase-2 through nuclear factor-kappa B-mediated induction of membrane type 1 matrix metalloproteinase in murine melanoma cells. J Biol Chem 276: 44926–44935.
- 208. Geissinger E, Weisser C, Fischer P, Schartl M, Wellbrock C (2002) Autocrine stimulation by osteopontin contributes to antiapoptotic signalling of melanocytes in dermal collagen. Cancer Res 62: 4820–4828.
- 209. Rashid MM (2002) [Cooperative role of osteopontin with type I collagen on the metastasis of murine melanoma cells]. Hokkaido Igaku Zasshi 77: 341–350.
- 210. Philip S, Kundu GC (2003) Osteopontin induces nuclear factor kappa B-mediated promatrix metalloproteinase-2 activation through I kappa B alpha/IKK signaling pathways, and curcumin (diferulolylmethane) down-regulates these pathways. J Biol Chem 278: 14487–14497.
- 211. Ohyama Y, Nemoto H, Rittling S, Tsuji K, Amagasa T, et al. (2004) Osteopontin-deficiency suppresses growth of B16 melanoma cells implanted in bone and osteoclastogenesis in co-cultures. J Bone Miner Res 19: 1706–1711.
- 212. Rangaswami H, Bulbule A, Kundu GC (2004) Nuclear factor-inducing kinase plays a crucial role in osteopontin-induced MAPK/IkappaBalpha kinase-dependent nuclear factor kappaB-mediated promatrix metalloproteinase-9 activation. J Biol Chem 279: 38921–38935.
- 213. Skamrov AV, Nechaenko MA, Goryunova LE, Feoktistova ES, Khaspekov GL, et al. (2004) Gene expression analysis to identify mRNA markers of cardiac myxoma. J Mol Cell Cardiol 37: 717–733.
- 214. Das R, Philip S, Mahabeleshwar GH, Bulbule A, Kundu GC (2005) Osteopontin: it's role in regulation of cell motility and nuclear factor kappa B-mediated urokinase type plasminogen activator expression. IUBMB Life 57: 441–447.
- 215. Denhardt D (2005) Osteopontin expression correlates with melanoma invasion. J Invest Dermatol 124: xvi–xviii.
- 216. Sturm RA (2005) Osteopontin in melanocytic lesions–a first step towards invasion? J Invest Dermatol 124: xiv–xv.
- 217. Zhou Y, Dai DL, Martinka M, Su M, Zhang Y, et al. (2005) Osteopontin expression correlates with melanoma invasion. J Invest Dermatol 124: 1044–1052.
- 218. Kadkol SS, Lin AY, Barak V, Kalickman I, Leach L, et al. (2006) Osteopontin expression and serum levels in metastatic uveal melanoma: a pilot study. Invest Ophthalmol Vis Sci 47: 802–806.
- 219. Packer L, Pavey S, Parker A, Stark M, Johansson P, et al. (2006) Osteopontin is a downstream effector of the PI3-kinase pathway in melanomas that is inversely correlated with functional PTEN. Carcinogenesis 27: 1778–1786.
- 220. Samanna V, Wei H, Ego-Osuala D, Chellaiah MA (2006) Alpha-V-dependent outside-in signaling is required for the regulation of CD44 surface expression, MMP-2 secretion, and cell migration by osteopontin in human melanoma cells. Exp Cell Res 312: 2214–2230.
- 221. Tscheudschilsuren G, Bosserhoff AK, Schlegel J, Vollmer D, Anton A, et al. (2006) Regulation of mesenchymal stem cell and chondrocyte differentiation by MIA. Exp Cell Res 312: 63–72.
- 222. Soikkeli J, Lukk M, Nummela P, Virolainen S, Jahkola T, et al. (2007) Systematic search for the best gene expression markers for melanoma micrometastasis detection. J Pathol 213: 180–189.
- 223. Rudzki Z, Jothy S (1997) CD44 and the adhesion of neoplastic cells. Mol Pathol 50: 57–71.
- 224. Katagiri YU, Murakami M, Mori K, Iizuka J, Hara T, et al. (1996) Non-RGD domains of osteopontin promote cell adhesion without involving alpha v integrins. J Cell Biochem 62: 123–131.
- 225. Craig AM, Bowden GT, Chambers AF, Spearman MA, Greenberg AH, et al. (1990) Secreted phosphoprotein mRNA is induced during multi-stage carcinogenesis in mouse skin and correlates with the metastatic potential of murine fibroblasts. Int J Cancer 46: 133–137.
- 226. Rangel J, Nosrati M, Torabian S, Shaikh L, Leong SP, et al. (2008) Osteopontin as a molecular prognostic marker for melanoma. Cancer 112: 144–150.
- 227. Steigemann P, Wurzenberger C, Schmitz MH, Held M, Guizetti J, et al. (2009) Aurora B-mediated abscission checkpoint protects against tetraploidization. Cell 136: 473–484.
- 228. Arbitrario JP, Belmont BJ, Evanchik MJ, Flanagan WM, Fucini RV, et al. (2009) SNS-314, a pan-Aurora kinase inhibitor, shows potent anti-tumor activity and dosing flexibility in vivo. Cancer Chemother Pharmacol.
- 229. Rosner MR (2007) MAP kinase meets mitosis: a role for Raf Kinase Inhibitory Protein in spindle checkpoint regulation. Cell Div 2: 1.
- 230. Lewis TB, Robison JE, Bastien R, Milash B, Boucher K, et al. (2005) Molecular classification of melanoma using real-time quantitative reverse transcriptase-polymerase chain reaction. Cancer 104: 1678–1686.
- 231. Riker AI, Enkemann SA, Fodstad O, Liu S, Ren S, et al. (2008) The gene expression profiles of primary and metastatic melanoma yields a transition point of tumor progression and metastasis. BMC Med Genomics 1: 13.
- 232. Heenen M, Laporte M (2003) [Molecular markers associated to prognosis of melanoma]. Ann Dermatol Venereol 130: 1025–1031.
- 233. Kaufmann WK, Nevis KR, Qu P, Ibrahim JG, Zhou T, et al. (2008) Defective cell cycle checkpoint functions in melanoma are associated with altered patterns of gene expression. J Invest Dermatol 128: 175–187.
- 234. Clarke LE, Fountaine TJ, Hennessy J, Bruggeman RD, Clarke JT, et al. (2009) Cdc7 expression in melanomas, Spitz tumors and melanocytic nevi. J Cutan Pathol 36: 433–438.
- 235. Estler M, Boskovic G, Denvir J, Miles S, Primerano DA, et al. (2008) Global analysis of gene expression changes during retinoic acid-induced growth arrest and differentiation of melanoma: comparison to differentially expressed genes in melanocytes vs melanoma. BMC Genomics 9: 478.
- 236. Mishima M, Pavicic V, Gruneberg U, Nigg EA, Glotzer M (2004) Cell cycle regulation of central spindle assembly. Nature 430: 908–913.
- 237. Gruneberg U, Neef R, Honda R, Nigg EA, Barr FA (2004) Relocation of Aurora B from centromeres to the central spindle at the metaphase to anaphase transition requires MKlp2. J Cell Biol 166: 167–172.
- 238. Venoux M, Basbous J, Berthenet C, Prigent C, Fernandez A, et al. (2008) ASAP is a novel substrate of the oncogenic mitotic kinase Aurora-A: phosphorylation on Ser625 is essential to spindle formation and mitosis. Hum Mol Genet 17: 215–224.
- 239. Saffin JM, Venoux M, Prigent C, Espeut J, Poulat F, et al. (2005) ASAP, a human microtubule-associated protein required for bipolar spindle assembly and cytokinesis. Proc Natl Acad Sci U S A 102: 11302–11307.
- 240. Ryu B, Kim DS, Deluca AM, Alani RM (2007) Comprehensive expression profiling of tumor cell lines identifies molecular signatures of melanoma progression. PLoS One 2: e594.
- 241. O'Regan L, Fry AM (2009) The Nek6 and Nek7 protein kinases are required for robust mitotic spindle formation and cytokinesis. Mol Cell Biol 29: 3975–3990.
- 242. Rapley J, Nicolas M, Groen A, Regue L, Bertran MT, et al. (2008) The NIMA-family kinase Nek6 phosphorylates the kinesin Eg5 at a novel site necessary for mitotic spindle formation. J Cell Sci 121: 3912–3921.
- 243. Takeno A, Takemasa I, Doki Y, Yamasaki M, Miyata H, et al. (2008) Integrative approach for differentially overexpressed genes in gastric cancer by combining large-scale gene expression profiling and network analysis. Br J Cancer 99: 1307–1315.
- 244. Lee MY, Kim HJ, Kim MA, Jee HJ, Kim AJ, et al. (2008) Nek6 is involved in G2/M phase cell cycle arrest through DNA damage-induced phosphorylation. Cell Cycle 7: 2705–2709.
- 245. Schmit TL, Zhong W, Nihal M, Ahmad N (2009) Polo-like kinase 1 (Plk1) in non-melanoma skin cancers. Cell Cycle 8: 2697–2702.
- 246. Schmit TL, Zhong W, Setaluri V, Spiegelman VS, Ahmad N (2009) Targeted Depletion of Polo-Like Kinase (Plk) 1 Through Lentiviral shRNA or a Small-Molecule Inhibitor Causes Mitotic Catastrophe and Induction of Apoptosis in Human Melanoma Cells. J Invest Dermatol.
- 247. Winnepenninckx V, Debiec-Rychter M, Belien JA, Fiten P, Michiels S, et al. (2006) Expression and possible role of hPTTG1/securin in cutaneous malignant melanoma. Mod Pathol 19: 1170–1180.
- 248. Kasuno K, Naqvi A, Dericco J, Yamamori T, Santhanam L, et al. (2007) Antagonism of p66shc by melanoma inhibitory activity. Cell Death Differ 14: 1414–1421.
- 249. Fagiani E, Giardina G, Luzi L, Cesaroni M, Quarto M, et al. (2007) RaLP, a new member of the Src homology and collagen family, regulates cell migration and tumor growth of metastatic melanomas. Cancer Res 67: 3064–3073.
- 250. Pasini L, Turco MY, Luzi L, Aladowicz E, Fagiani E, et al. (2009) Melanoma: targeting signaling pathways and RaLP. Expert Opin Ther Targets 13: 93–104.
- 251. Halaban R, Cheng E, Smicun Y, Germino J (2000) Deregulated E2F transcriptional activity in autonomously growing melanoma cells. J Exp Med 191: 1005–1016.
- 252. Kumagai K, Nimura Y, Mizota A, Miyahara N, Aoki M, et al. (2006) Arpc1b gene is a candidate prediction marker for choroidal malignant melanomas sensitive to radiotherapy. Invest Ophthalmol Vis Sci 47: 2300–2304.
- 253. Kashani-Sabet M, Rangel J, Torabian S, Nosrati M, Simko J, et al. (2009) A multi-marker assay to distinguish malignant melanomas from benign nevi. Proc Natl Acad Sci U S A 106: 6268–6272.
- 254. Meng X, Lu H, Shen Z (2004) BCCIP functions through p53 to regulate the expression of p21Waf1/Cip1. Cell Cycle 3: 1457–1462.
- 255. Walter-Yohrling J, Cao X, Callahan M, Weber W, Morgenbesser S, et al. (2003) Identification of genes expressed in malignant cells that promote invasion. Cancer Res 63: 8939–8947.
- 256. Harlin H, Meng Y, Peterson AC, Zha Y, Tretiakova M, et al. (2009) Chemokine expression in melanoma metastases associated with CD8+ T-cell recruitment. Cancer Res 69: 3077–3085.
- 257. Mellado M, de Ana AM, Moreno MC, Martinez C, Rodriguez-Frade JM (2001) A potential immune escape mechanism by melanoma cells through the activation of chemokine-induced T cell death. Curr Biol 11: 691–696.
- 258. Yurkovetsky ZR, Kirkwood JM, Edington HD, Marrangoni AM, Velikokhatnaya L, et al. (2007) Multiplex analysis of serum cytokines in melanoma patients treated with interferon-alpha2b. Clin Cancer Res 13: 2422–2428.
- 259. Diaz-Martinez LA, Gimenez-Abian JF, Clarke DJ (2007) Regulation of centromeric cohesion by sororin independently of the APC/C. Cell Cycle 6: 714–724.
- 260. Schmitz J, Watrin E, Lenart P, Mechtler K, Peters JM (2007) Sororin is required for stable binding of cohesin to chromatin and for sister chromatid cohesion in interphase. Curr Biol 17: 630–636.
- 261. Rankin S (2005) Sororin, the cell cycle and sister chromatid cohesion. Cell Cycle 4: 1039–1042.
- 262. Rankin S, Ayad NG, Kirschner MW (2005) Sororin, a substrate of the anaphase-promoting complex, is required for sister chromatid cohesion in vertebrates. Mol Cell 18: 185–200.
- 263. Lee JH, Jeong MW, Kim W, Choi YH, Kim KT (2008) Cooperative roles of c-Abl and Cdk5 in regulation of p53 in response to oxidative stress. J Biol Chem 283: 19826–19835.
- 264. Jaeger J, Koczan D, Thiesen HJ, Ibrahim SM, Gross G, et al. (2007) Gene expression signatures for tumor progression, tumor subtype, and tumor thickness in laser-microdissected melanoma tissues. Clin Cancer Res 13: 806–815.
- 265. Horuk R, Chitnis CE, Darbonne WC, Colby TJ, Rybicki A, et al. (1993) A receptor for the malarial parasite Plasmodium vivax: the erythrocyte chemokine receptor. Science 261: 1182–1184.
- 266. Shattuck RL, Wood LD, Jaffe GJ, Richmond A (1994) MGSA/GRO transcription is differentially regulated in normal retinal pigment epithelial and melanoma cells. Mol Cell Biol 14: 791–802.
- 267. Wang D, Richmond A (2001) Nuclear factor-kappa B activation by the CXC chemokine melanoma growth-stimulatory activity/growth-regulated protein involves the MEKK1/p38 mitogen-activated protein kinase pathway. J Biol Chem 276: 3650–3659.
- 268. Mangahas CR, dela Cruz GV, Friedman-Jimenez G, Jamal S (2005) Endothelin-1 induces CXCL1 and CXCL8 secretion in human melanoma cells. J Invest Dermatol 125: 307–311.
- 269. Gallagher PG, Bao Y, Prorock A, Zigrino P, Nischt R, et al. (2005) Gene expression profiling reveals cross-talk between melanoma and fibroblasts: implications for host-tumor interactions in metastasis. Cancer Res 65: 4134–4146.
- 270. Ghosh S, Spagnoli GC, Martin I, Ploegert S, Demougin P, et al. (2005) Three-dimensional culture of melanoma cells profoundly affects gene expression profile: a high density oligonucleotide array study. J Cell Physiol 204: 522–531.
- 271. Mockenhaupt M, Peters F, Schwenk-Davoine I, Herouy Y, Schraufstatter I, et al. (2003) Evidence of involvement of CXC-chemokines in proliferation of cultivated human melanocytes. Int J Mol Med 12: 597–601.
- 272. Dhawan P, Richmond A (2002) Role of CXCL1 in tumorigenesis of melanoma. J Leukoc Biol 72: 9–18.
- 273. Payne AS, Cornelius LA (2002) The role of chemokines in melanoma tumor growth and metastasis. J Invest Dermatol 118: 915–922.
- 274. Middleman BR, Friedman M, Lawson DH, DeRose PB, Cohen C (2002) Melanoma growth stimulatory activity in primary malignant melanoma: prognostic significance. Mod Pathol 15: 532–537.
- 275. Dhawan P, Richmond A (2002) A novel NF-kappa B-inducing kinase-MAPK signaling pathway up-regulates NF-kappa B activity in melanoma cells. J Biol Chem 277: 7920–7928.
- 276. Yang J, Luan J, Yu Y, Li C, DePinho RA, et al. (2001) Induction of melanoma in murine macrophage inflammatory protein 2 transgenic mice heterozygous for inhibitor of kinase/alternate reading frame. Cancer Res 61: 8150–8157.
- 277. Yang J, Richmond A (2001) Constitutive IkappaB kinase activity correlates with nuclear factor-kappaB activation in human melanoma cells. Cancer Res 61: 4901–4909.
- 278. Wang D, Yang W, Du J, Devalaraja MN, Liang P, et al. (2000) MGSA/GRO-mediated melanocyte transformation involves induction of Ras expression. Oncogene 19: 4647–4659.
- 279. Haghnegahdar H, Du J, Wang D, Strieter RM, Burdick MD, et al. (2000) The tumorigenic and angiogenic effects of MGSA/GRO proteins in melanoma. J Leukoc Biol 67: 53–62.
- 280. Shih IM, Herlyn M (1994) Autocrine and paracrine roles for growth factors in melanoma. In Vivo 8: 113–123.
- 281. Tettelbach W, Nanney L, Ellis D, King L, Richmond A (1993) Localization of MGSA/GRO protein in cutaneous lesions. J Cutan Pathol 20: 259–266.
- 282. Richmond A, Thomas HG (1988) Melanoma growth stimulatory activity: isolation from human melanoma tumors and characterization of tissue distribution. J Cell Biochem 36: 185–198.
- 283. Thomas HG, Richmond A (1988) Immunoaffinity purification of melanoma growth stimulatory activity. Arch Biochem Biophys 260: 719–724.
- 284. Bordoni R, Fine R, Murray D, Richmond A (1990) Characterization of the role of melanoma growth stimulatory activity (MGSA) in the growth of normal melanocytes, nevocytes, and malignant melanocytes. J Cell Biochem 44: 207–219.
- 285. Owen JD, Strieter R, Burdick M, Haghnegahdar H, Nanney L, et al. (1997) Enhanced tumor-forming capacity for immortalized melanocytes expressing melanoma growth stimulatory activity/growth-regulated cytokine beta and gamma proteins. Int J Cancer 73: 94–103.
- 286. Di Cesare S, Marshall JC, Logan P, Antecka E, Faingold D, et al. (2007) Expression and migratory analysis of 5 human uveal melanoma cell lines for CXCL12, CXCL8, CXCL1, and HGF. J Carcinog 6: 2.
- 287. Sini P, Samarzija I, Baffert F, Littlewood-Evans A, Schnell C, et al. (2008) Inhibition of multiple vascular endothelial growth factor receptors (VEGFR) blocks lymph node metastases but inhibition of VEGFR-2 is sufficient to sensitize tumor cells to platinum-based chemotherapeutics. Cancer Res 68: 1581–1592.
- 288. Sun T, Sun BC, Ni CS, Zhao XL, Wang XH, et al. (2008) Pilot study on the interaction between B16 melanoma cell-line and bone-marrow derived mesenchymal stem cells. Cancer Lett 263: 35–43.
- 289. Tas F, Duranyildiz D, Oguz H, Camlica H, Yasasever V, et al. (2006) Circulating serum levels of angiogenic factors and vascular endothelial growth factor receptors 1 and 2 in melanoma patients. Melanoma Res 16: 405–411.
- 290. Prickett TD, Agrawal NS, Wei X, Yates KE, Lin JC, et al. (2009) Analysis of the tyrosine kinome in melanoma reveals recurrent mutations in ERBB4. Nat Genet.
- 291. Brychtova S, Bezdekova M, Brychta T, Tichy M (2008) The role of vascular endothelial growth factors and their receptors in malignant melanomas. Neoplasma 55: 273–279.
- 292. Bolander A, Wagenius G, Larsson A, Brattstrom D, Ullenhag G, et al. (2007) The role of circulating angiogenic factors in patients operated on for localized malignant melanoma. Anticancer Res 27: 3211–3217.
- 293. Gille J, Heidenreich R, Pinter A, Schmitz J, Boehme B, et al. (2007) Simultaneous blockade of VEGFR-1 and VEGFR-2 activation is necessary to efficiently inhibit experimental melanoma growth and metastasis formation. Int J Cancer 120: 1899–1908.
- 294. Singh N, Jani PD, Suthar T, Amin S, Ambati BK (2006) Flt-1 intraceptor induces the unfolded protein response, apoptotic factors, and regression of murine injury-induced corneal neovascularization. Invest Ophthalmol Vis Sci 47: 4787–4793.
- 295. Lacal PM, Ruffini F, Pagani E, D'Atri S (2005) An autocrine loop directed by the vascular endothelial growth factor promotes invasiveness of human melanoma cells. Int J Oncol 27: 1625–1632.
- 296. Graells J, Vinyals A, Figueras A, Llorens A, Moreno A, et al. (2004) Overproduction of VEGF concomitantly expressed with its receptors promotes growth and survival of melanoma cells through MAPK and PI3K signaling. J Invest Dermatol 123: 1151–1161.
- 297. Hiratsuka S, Nakamura K, Iwai S, Murakami M, Itoh T, et al. (2002) MMP9 induction by vascular endothelial growth factor receptor-1 is involved in lung-specific metastasis. Cancer Cell 2: 289–300.
- 298. Hornig C, Barleon B, Ahmad S, Vuorela P, Ahmed A, et al. (2000) Release and complex formation of soluble VEGFR-1 from endothelial cells and biological fluids. Lab Invest 80: 443–454.
- 299. Lin EY, Piepkorn M, Garcia R, Byrd D, Tsou R, et al. (1999) Angiogenesis and vascular growth factor receptor expression in malignant melanoma. Plast Reconstr Surg 104: 1666–1674.
- 300. Gray CP, Arosio P, Hersey P (2003) Association of increased levels of heavy-chain ferritin with increased CD4+ CD25+ regulatory T-cell levels in patients with melanoma. Clin Cancer Res 9: 2551–2559.
- 301. Gray CP, Arosio P, Hersey P (2002) Heavy chain ferritin activates regulatory T cells by induction of changes in dendritic cells. Blood 99: 3326–3334.
- 302. Gray CP, Franco AV, Arosio P, Hersey P (2001) Immunosuppressive effects of melanoma-derived heavy-chain ferritin are dependent on stimulation of IL-10 production. Int J Cancer 92: 843–850.
- 303. Pham CG, Bubici C, Zazzeroni F, Papa S, Jones J, et al. (2004) Ferritin heavy chain upregulation by NF-kappaB inhibits TNFalpha-induced apoptosis by suppressing reactive oxygen species. Cell 119: 529–542.
- 304. Moser J, Kool H, Giakzidis I, Caldecott K, Mullenders LH, et al. (2007) Sealing of chromosomal DNA nicks during nucleotide excision repair requires XRCC1 and DNA ligase III alpha in a cell-cycle-specific manner. Mol Cell 27: 311–323.
- 305. Fachin AL, Mello SS, Sandrin-Garcia P, Junta CM, Ghilardi-Netto T, et al. (2009) Gene expression profiles in radiation workers occupationally exposed to ionizing radiation. J Radiat Res (Tokyo) 50: 61–71.
- 306. Lacal PM, Failla CM, Pagani E, Odorisio T, Schietroma C, et al. (2000) Human melanoma cells secrete and respond to placenta growth factor and vascular endothelial growth factor. J Invest Dermatol 115: 1000–1007.
- 307. Graeven U, Rodeck U, Karpinski S, Jost M, Andre N, et al. (2000) Expression patterns of placenta growth factor in human melanocytic cell lines. J Invest Dermatol 115: 118–123.
- 308. Bagri A, Tessier-Lavigne M, Watts RJ (2009) Neuropilins in tumor biology. Clin Cancer Res 15: 1860–1864.
- 309. Capparuccia L, Tamagnone L (2009) Semaphorin signaling in cancer cells and in cells of the tumor microenvironment–two sides of a coin. J Cell Sci 122: 1723–1736.
- 310. Prislei S, Mozzetti S, Filippetti F, De Donato M, Raspaglio G, et al. (2008) From plasma membrane to cytoskeleton: a novel function for semaphorin 6A. Mol Cancer Ther 7: 233–241.
- 311. Flannery E, Duman-Scheel M (2009) Semaphorins at the interface of development and cancer. Curr Drug Targets 10: 611–619.
- 312. Katoh M (2007) Comparative integromics on non-canonical WNT or planar cell polarity signaling molecules: transcriptional mechanism of PTK7 in colorectal cancer and that of SEMA6A in undifferentiated ES cells. Int J Mol Med 20: 405–409.