To better understand prostate function and disease, it is important to define and explore the molecular constituents that signify the prostate gland. The aim of this study was to define the prostate specific transcriptome and proteome, in comparison to 26 other human tissues. Deep sequencing of mRNA (RNA-seq) and immunohistochemistry-based protein profiling were combined to identify prostate specific gene expression patterns and to explore tissue biomarkers for potential clinical use in prostate cancer diagnostics. We identified 203 genes with elevated expression in the prostate, 22 of which showed more than five-fold higher expression levels compared to all other tissue types. In addition to previously well-known proteins we identified two poorly characterized proteins, TMEM79 and ACOXL, with potential to differentiate between benign and cancerous prostatic glands in tissue biopsies. In conclusion, we have applied a genome-wide analysis to identify the prostate specific proteome using transcriptomics and antibody-based protein profiling to identify genes with elevated expression in the prostate. Our data provides a starting point for further functional studies to explore the molecular repertoire of normal and diseased prostate including potential prostate cancer markers such as TMEM79 and ACOXL.
Citation: O'Hurley G, Busch C, Fagerberg L, Hallström BM, Stadler C, Tolf A, et al. (2015) Analysis of the Human Prostate-Specific Proteome Defined by Transcriptomics and Antibody-Based Profiling Identifies TMEM79 and ACOXL as Two Putative, Diagnostic Markers in Prostate Cancer. PLoS ONE 10(8): e0133449. https://doi.org/10.1371/journal.pone.0133449
Editor: Natasha Kyprianou, University of Kentucky College of Medicine, UNITED STATES
Received: April 10, 2015; Accepted: June 25, 2015; Published: August 3, 2015
Copyright: © 2015 O'Hurley et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: Funding was provided by the Knut and Alice Wallenberg Foundation and the Swedish Cancer Foundation and by the Marie Curie Industry-Academia Partnerships and Pathways program, FAST-PATH (No. 285910). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. OncoMark Ltd provided support in the form of salaries for author GOH, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.
Competing interests: The authors have declared that no competing interests exist. OncoMark Ltd provided support in the form of salaries for author GOH. This does not alter the authors' adherence to PLOS ONE policies on sharing data and materials.
Prostate specific antigen (PSA) has emerged as a useful tumor marker in oncology and PSA-based screening is widely used despite a relative lack of both specificity, leading to overdiagnosis and treatment of early stage prostate cancer, and sensitivity, leading to prostate cancer not being detected early enough [1–5]. Thus there is a need for better markers for early detection of prostate cancer.
PSA is a serine protease and one of the three most abundant proteins secreted from the prostate gland. In the malignant prostate, tissue architecture is abnormal, which facilitates PSA leakage to capillaries in the stromal compartment. Non-malignant prostate conditions, including prostatitis and benign prostatic hyperplasia (BPH), can lead to elevated serum PSA, limiting the specificity of PSA elevation for cancer detection. Thus, determining which patients require further examination with transrectal ultrasonography (TRUS)-guided biopsies remains a significant problem.
Several other markers have been implicated as potential biomarkers of prostate cancer, such as alpha-methylacyl coenzyme A racemase (AMACR), which has been shown to be significantly up-regulated in prostate cancer and detectable in both serum and cancer tissue. Other such diagnostic biomarkers include prostate carcinoma mucin-like antigen (PMA), GOLM1, fatty acid synthase (FASN), TMPRSS2-ERG fusion, prostate cancer antigen 3 (PCA3), KLK3, KLK2, HOXB13, GRHL2 and FOXA1 [8–12]. However, to date, no individual marker has proven better than PSA.
Prostate cancer is diagnosed based on histopathological examination of multiple TRUS-guided prostatic core biopsies. The identification of cancer in the prostate is prone to subjectivity and error because it relies on human interpretation and because biopsies provide only a small amount of tissue, which often includes only a few malignant glands and benign histological mimics of cancer. The discovery of a specific marker of either prostate cancer or benign prostatic glands that could also be measured in serum would be beneficial to avoid unnecessary invasive diagnostic tests.
The interpretation of quantitative transcriptomics data based on mRNA sequencing of tissue samples is a challenge due to the heterogeneity of cell types that comprise various tissue types. Here we have analyzed genes expressed in normal human prostate and compared these data to the transcriptomes of 26 other normal human tissue types based on recently published RNA-seq data. The transcriptomics analysis was combined with immunohistochemistry-based protein profiling data available from the Human Protein Atlas (www.proteinatlas.org) [14, 15] to provide a map of gene expression on both the RNA and protein level in the prostate. The expression patterns of two proteins encoded by previously uncharacterized genes, TMEM79 and ACOXL, with elevated expression in the prostate gland were further analyzed using tissue microarrays (TMA), including normal prostate and prostate cancer, to explore their potential value as diagnostic biomarkers.
Materials and Methods
Fresh frozen human tissue representing 27 different normal human tissue types was included in the RNA-seq analysis as previously described, including 4 samples of prostatic tissue. Morphologically normal, non-cancerous prostate tissue was sampled from prostatectomy specimens derived from 4 male patients (age 62–68 y) with localized prostate cancer.
Formalin fixed, paraffin embedded (FFPE) human tissue samples were collected from the clinical Department of Pathology, Uppsala University Hospital, Uppsala, Sweden and assembled into TMAs. TMAs were created and used for protein profiling as previously described. The screening TMA contained 1 mm cores of 46 different normal tissues in triplicate, including three normal prostate samples, and 216 cancer tissues representing the 20 most common cancers, including 12 cases of prostate cancer. The four validation TMAs contained normal and cancerous prostate tissue; the details of each validation TMA are shown in Table 1 and have been described previously [18, 19]. Prostatic intraepithelial neoplasia was excluded from this study.
All human tissue samples used for RNA-seq and screening of protein expression were anonymized and used in accordance with approval and advisory report from the Uppsala Ethical Review Board (Reference # 2002–577, 2005–338 and 2007–159 (protein) and # 2011–473 (RNA)). The validation TMA cohorts were approved by The Central Ethical Review Board in Sweden (Dnr Ö25-2006, date 2006-06-29) and the Regional Ethical Review Board at Lund University, Sweden (approval number DN. 445–07). All patients provided written informed consent.
Transcript profiling (RNA-seq) and data analysis
Transcriptomic profiling has been described previously. Briefly, hematoxylin-eosin (HE) stained frozen sections (4 μm) were prepared from each sample using a cryostat and the CryoJane Tape-Transfer System (Instrumedics, St. Louis, MO, USA) and reviewed by a pathologist to ensure proper tissue morphology. Three 10 μm sections were cut from each frozen tissue block and homogenized prior to extraction of total RNA, using the RNeasy Mini Kit (Qiagen, Hilden, Germany) following manufacturer’s instructions. The extracted RNA samples were analyzed using either an Agilent 2100 Bioanalyzer system (Agilent Biotechnologies, Palo Alto, USA) with the RNA 6000 Nano Labchip Kit or an Experion automated electrophoresis system (Bio-Rad Laboratories, Hercules, CA, USA) with the standard-sensitivity RNA chip. Only samples of high-quality RNA (RNA Integrity Number ≥7.5) were used in the following mRNA sample preparation for sequencing. Illumina HiSeq2000 and 2500 machines (Illumina, San Diego, CA, USA) were used to perform mRNA sequencing using the standard Illumina RNA-seq protocol with a read length of 2x100 bases.
Raw reads obtained from the sequencing system were trimmed for low quality ends with the software Sickle. A phred quality threshold of 20 was used. Reads shorter than 54 bp after the trimming were discarded. The processed reads were mapped to the GRCh37 version of the human genome with Tophat v2.0.3. Potential PCR duplicates were eliminated applying the MarkDuplicates module of Picard 1.77. To obtain quantification scores for all 20,050 human protein-coding genes, FPKM (fragments per kilobase of exon model per million mapped reads) values were calculated with Cufflinks v2.0.2, which corrects for transcript length and the total number of mapped reads from the library to compensate for different read depths for different samples. The average percentage of successfully mapped reads was 77%. The gene models from Ensembl build 69 were used in Cufflinks. In addition to Cufflinks, HTSeq v0.5.1 was run to calculate read counts for each gene, which were used for analyses of differentially expressed genes utilizing the DESeq package. All data was analyzed with R Statistical Environment and a network analysis was performed using Cytoscape 3.0. For analyses performed in this study where a log2-scale of the data was used, pseudo-counts of +1 were added to the data set.
The average FPKM value in all samples for a particular tissue was used to estimate the total gene expression level. A cut-off value of 1 FPKM, roughly corresponding to an average of 1 mRNA molecule per cell, was defined as the detection limit. Each of the 20,050 genes was classified into one out of nine categories based on the expression pattern in prostate in relation to all other tissues (Table 2).
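The normalization described above can be illustrated with a minimal sketch. It uses the textbook FPKM definition and the +1 pseudo-count before log2 transformation; it is not the estimation procedure that Cufflinks actually runs, and the function names and example numbers are hypothetical.

```python
import math


def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon model per million mapped fragments.

    Textbook definition only; Cufflinks estimates abundances with a
    statistical model, so its values will differ from this simple ratio.
    """
    kilobases = exon_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)


def log2_with_pseudocount(value, pseudocount=1.0):
    """log2 transform after adding a pseudo-count, so that 0 FPKM maps to 0."""
    return math.log2(value + pseudocount)


# Hypothetical gene: 500 fragments on 2,000 bp of exon in a library
# with 25 million mapped fragments.
example = fpkm(500, 2_000, 25_000_000)
print(example)                       # 10.0
print(log2_with_pseudocount(0.0))    # 0.0
```

Dividing by both transcript length and library size is what makes values comparable between genes and between samples sequenced to different depths.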
Antibody validation: siRNA transfection, immunofluorescence, imaging and statistical analysis
To further verify the specificity of the primary antibodies to TMEM79 (HPA055214) and ACOXL (HPA035392), antibody validation was extended beyond that provided for all proteins in the HPA database (www.proteinatlas.org). This was performed using siRNA-based knock-down of gene expression. U-2 OS and MCF-7 cells were used for siRNA knock-down experiments of ACOXL and TMEM79. U-2 OS cells were cultivated in McCoy’s media supplemented with 10% fetal bovine serum (FBS) and MCF-7 cells in EMEM supplemented with 10% FBS, 1% non-essential amino acids (NEAA) and 1% L-glutamine (all from FisherScientific, Stockholm, Sweden).
On the day of transfection, 10,000 U-2 OS cells or 12,000 MCF-7 cells were seeded into 96-well glass bottom plates (VWR, Stockholm, Sweden) pre-coated with fibronectin. After cell attachment, medium was replaced with 100 μl Opti-MEM containing 0.5 μl Lipofectamine 2000 (cat. no. 11668019, Life Technologies) and 2.5 pmol of ACOXL siRNA (Silencer Select Pre-designed siRNA product s30651, Life Technologies) or TMEM79 siRNA (Silencer Select Pre-designed siRNA product s228349, Life Technologies). A scrambled siRNA sequence (Silencer Select Negative control no. 1, cat. no. 4390843, Life Technologies) was used as negative control and AllStar (SI04381048, Qiagen) was used as a positive control to ensure successful transfection.
After 72 h of incubation, cells were fixed using 4% paraformaldehyde (PFA) and permeabilized using 0.1% Triton X-100 as previously described. Cells were stained with the antibodies targeting ACOXL or TMEM79, both at a concentration of 2 ng/μl. Cells were also stained with an antibody targeting the microtubules (ab7291, Abcam) at a concentration of 3 ng/μl, and with the nuclear probe 4’,6-diamidino-2-phenylindole (DAPI) at a concentration of 300 nM, to enable automated imaging and quantification of the reduced staining intensity.
The imaging and assay read-out of the siRNA experiments was done as previously described. The knock-down was measured as the relative fluorescence intensity (RFI) as compared to the negative control. The data was graphically presented in box-plots and a Mann-Whitney test was used to evaluate the significance of the median RFI of the silenced cell population compared to the RFI of the corresponding negative control.
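As an illustration of this read-out, the sketch below computes an RFI and a two-sided Mann-Whitney test in plain Python. Several points are assumptions rather than details from the source: RFI is taken here as the median intensity of silenced cells expressed as a percentage of the control median, the p-value uses the normal approximation without tie or continuity correction, and the per-cell intensities are invented.

```python
import math
from statistics import median


def relative_fluorescence_intensity(silenced, control):
    """Median silenced intensity as a percentage of the control median (assumed definition)."""
    return 100.0 * median(silenced) / median(control)


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, no tie/continuity correction)."""
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u1, p_value


# Invented per-cell intensities (arbitrary units), for illustration only.
silenced_cells = [50, 55, 60, 58, 62, 57]
control_cells = [95, 100, 105, 98, 102, 110]

rfi = relative_fluorescence_intensity(silenced_cells, control_cells)
u_stat, p = mann_whitney_u(silenced_cells, control_cells)
print(f"RFI = {rfi:.1f}%, U = {u_stat}, p = {p:.4f}")
```

A rank-based test is a natural fit here because single-cell intensity distributions are typically skewed, so comparing ranks is more robust than comparing means.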
Antibody-based tissue profiling
TMAs were cut in 4 micrometer thick sections and used for immunohistochemical staining, as previously described. The immunohistochemically stained and mounted slides were scanned using an Aperio ScanScope XT Slide Scanner (Aperio Technologies, Vista, CA) for generation of high-resolution digital whole slide images, followed by annotation by certified pathologists. In brief, the manual score of IHC-based protein expression for all proteins screened was determined as the fraction of positive cells defined in different tissues: 0 = 0–1%, 1 = 2–25%, 2 = 26–75%, 3 = >75% and intensity of immunoreactivity: 0 = negative, 1 = weak, 2 = moderate and 3 = strong staining. All annotation and immunohistochemical data for the screening TMA together with validation data for all primary antibodies are publicly available in the Human Protein Atlas (www.proteinatlas.org). Primary antibodies used for immunostaining of validation TMAs included HPA055214 (dilution 1:250) for detection of transmembrane protein 79 (TMEM79) and HPA035392 (dilution 1:1000) for detection of acyl-CoA oxidase-like protein (ACOXL).
Scoring of TMEM79 and ACOXL protein expression
The immunoreactivity of TMEM79 was assessed in the membrane and cytoplasm, and of ACOXL in the cytoplasm of epithelial cells of the prostate. Both antibodies were used for immunohistochemical staining of all validation TMAs, and the outcome was analyzed. Scoring was performed by two independent observers (GOH, CB).
For the purpose of statistical analysis, the immunohistochemical staining pattern of both antibodies was graded according to the following scale: 0, absence of reactivity; 1, faint but clearly detectable reactivity in >30% of epithelial cells; 2, moderate reactivity in >30% of epithelial cells; and 3, strong reactivity in >30% of epithelial cells.
For statistical analysis of TMEM79 and ACOXL expression versus histopathological features, the staining intensity of the epithelial cells was divided into two groups: low expression (immunohistochemical score of 0 or 1) including cases with negative or weak staining, and high expression (immunohistochemical score of 2 or 3) including cases with moderate or strong staining. Pearson Chi square tests and Spearman and Pearson Correlation tests were performed to test the association between protein expression and histopathological features on two-way contingency tables. Diagnostic performance criteria were tested by generating ROC curves. Kaplan–Meier survival analysis and multivariate Cox regression analysis were also performed on a subset of patients, where biochemical recurrence (BCR) data was available (N = 148), to analyze the association between BCR and protein expression, serum PSA value pre-prostatectomy and Gleason score. All calculations were performed with IBM SPSS 20 for Windows (SPSS, New York, NY, USA).
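The dichotomization and contingency-table test described above can be sketched as follows. This is an illustrative stand-in for the SPSS analysis, not a reproduction of it: the chi-square statistic is the uncorrected Pearson form for a 2x2 table, the p-value uses the closed form for one degree of freedom, and the counts in the example are hypothetical.

```python
import math


def dichotomize(score):
    """Map an IHC score of 0-3 to 'low' (0 or 1) or 'high' (2 or 3)."""
    if score not in (0, 1, 2, 3):
        raise ValueError(f"unexpected IHC score: {score!r}")
    return "high" if score >= 2 else "low"


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    observed = ((a, b), (c, d))
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    n = a + b + c + d
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical table: rows = benign / tumor, columns = high / low expression.
chi2, p = chi_square_2x2(30, 10, 10, 30)
print([dichotomize(s) for s in (0, 1, 2, 3)])  # ['low', 'low', 'high', 'high']
print(chi2)                                     # 20.0
```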
The transcriptomic analysis of prostate tissue
The transcriptomes of four prostate samples were quantified by RNA-seq and normalized mRNA levels, calculated as FPKM values, were determined for each sample. A total of 14,040 genes were detected in prostate, using a cutoff of mean expression value > 1 FPKM. Thus, approximately 70% of all putative protein coding genes (n = 20,050) were detected in the prostate. The distribution of FPKM values (mRNA expression levels) ranged from 0 FPKM up to 8,238, yielding a dynamic range of 10^4 between the highest and lowest expressed genes. The 30 genes with the highest levels of expression in the prostate are listed in S1 Table. A majority of these genes encode proteins with “house-keeping” functions expressed in all analyzed tissues.
The biological variation between the four individual prostate samples was analyzed by comparing the expression levels of all protein coding genes in pairwise scatterplots. The correlation overall was high with Spearman coefficients ranging from 0.98 (Fig 1A) to 0.91, with an average correlation coefficient of 0.96 for all the four prostate samples. These results show low inter-individual variation across the genome-wide expression pattern and demonstrate high technical reproducibility between the prostate samples. As expected, a higher degree of variation was observed when similar comparisons were performed between prostate and other tissue types. The lowest correlation was noted between prostate and testis with a Spearman coefficient of 0.72 (Fig 1B), whereas the tissue with the highest similarity to prostate was endometrium with a Spearman coefficient of 0.92 (Fig 1C).
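For readers unfamiliar with the statistic, the pairwise comparison above amounts to a Pearson correlation computed on ranks. The following is a minimal plain-Python sketch; the expression vectors are invented and far shorter than the 20,050-gene profiles used in the study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        avg = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[order[k]] = avg
        i = j
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented FPKM values for five genes in two samples.
sample_a = [0.0, 1.5, 12.0, 300.0, 8238.0]
sample_b = [1.1, 0.2, 15.0, 250.0, 7900.0]
print(round(spearman(sample_a, sample_b), 3))  # 0.9
```

Because only ranks enter the calculation, the few very highly expressed genes do not dominate the coefficient as they would in a Pearson correlation on raw FPKM values.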
Gene expression scatterplots showing all FPKM values and the pairwise Spearman and Pearson correlation coefficients between: (A) the two prostate samples with highest correlation coefficient, (B) the lowest correlation to any other tissue type (prostate vs. testis samples), and (C) the highest correlation to any other tissue type (prostate vs. endometrium). (D) Pie chart showing the distribution of the fraction of all human protein-coding genes in each of the categories, based on transcript expression levels in prostate compared to all other tissues.
Classification of the genes expressed in prostate
The transcriptomics data obtained from the 27 tissues enabled us to classify all of the 20,050 protein-coding genes into four major categories, firstly based on their expression levels in prostate (Fig 1D). These major categories included i) genes that were not detected in prostate (30%), ii) genes that showed a mixed expression pattern, being expressed in several but not in all tissue types (23%), iii) genes that were expressed in all tissues and thus characterized as “house-keeping” genes (46%) and iv) genes with an elevated level of expression in prostate as compared to other tissue types (1%). Then, the 203 genes within category iv, which showed an elevated expression pattern in prostate, were further divided into four other subcategories depending on degree of tissue specificity.
Six genes were defined as highly enriched in prostate (Table 3), characterized as the highest level of tissue-specificity with at least 50-fold higher FPKM level in prostate compared to any other tissue type. Sixteen genes were defined as moderately enriched in prostate (Table 3), with at least 5-fold higher FPKM level in prostate compared to all other tissues; 85 genes were defined as group enriched (S2 Table), with 5-fold higher average FPKM level in a group of 2–7 tissues including prostate compared to all other tissues; and 96 genes were prostate enhanced (S3 Table), defined as having a 5-fold higher FPKM level in prostate as compared to the average FPKM value of all the 27 tissues. Two well-studied genes in prostate, SLC45A3 and MSMB, were identified in this category.
A network plot of the group-enriched genes in prostate is presented in Fig 2, which shows the number of genes shared between a particular group of tissues (up to four different tissues) as well as the number of highly and moderately tissue enriched genes in prostate. Out of the 27 tissue types analyzed, 22 tissue types had common group enriched genes with the prostate. The tissue type with most group enriched genes in common with prostate is the esophagus (n = 15), followed by testis (n = 6), brain and heart (n = 5). The shared genes between esophagus and prostate are dominated by genes expressed in various muscle cells, included in the wall of esophagus and integrated in the prostate.
Groups of expressed genes are represented as blue circle nodes and linked to the respective enriched tissues represented as grey circles. The sizes of nodes are related to the number of enriched genes. The light blue circle shows the total number of highly and moderately tissue enriched genes. The network represents an overview of the grouped enriched genes with a maximum of 4 tissues combined.
Antibody based profiling of the prostate specific genes
The genes with elevated expression in prostate (n = 203) were evaluated using the online Human Protein Atlas database (HPA, www.proteinatlas.org) in order to compare the quantitative RNA-seq data with spatial expression data of corresponding protein levels. To further explore the protein expression patterns in prostate of this set of proteins, which include both previously well-studied as well as uncharacterized proteins, the immunohistochemical staining was visually evaluated with regard to benign prostate specificity and cellular distribution. Examples of expression patterns in normal prostate of proteins with elevated expression in prostate are shown in Fig 3.
Differentially expressed proteins in benign and malignant prostatic tissue
Transmembrane protein 79 (TMEM79).
Differential protein expression in benign prostate tissue versus prostate cancer tissue was next evaluated for uncharacterized genes for which validated antibodies against the corresponding proteins were available. TMEM79, with evidence of existence only at the transcript level according to UniProt, was identified as a “group enriched” gene with a FPKM value of 39 in prostate. TMEM79 gene expression was also observed in esophagus (54 FPKM) and skin (65 FPKM) at higher levels than in prostate (see S2 Table). However, at the protein level TMEM79 showed a more distinct immunoreactivity in normal prostate glands compared with the expression pattern in squamous epithelia in skin, esophagus, oral mucosa, vagina and cervix. Strong membranous immunostaining was observed in 3/3 benign prostate tissue cases, whereas there was no evident membranous staining in tumor cells from 12/12 cases of prostate cancer (images of immunostained normal and cancerous prostate tissues are available at www.proteinatlas.org and in Fig 4).
(A) TMEM79 membranous expression in a benign gland with lack of TMEM79 expression in surrounding tumor. (B) Dual IHC staining of TMEM79 (red) and TP63 (blue basal cell staining) in a benign gland. (C) A lack of TMEM79 protein expression in prostate cancer (Gleason grade 3). (D) A lack of TMEM79 protein expression in prostate metastatic tumor (bone metastasis). Scale, 100 μm.
To validate the initial protein screening results observed for TMEM79 in prostate, IHC was performed on four independent prostate cancer cohorts. Tissues from 333 cases were available for analysis (156 benign cases, 162 primary prostate cancer cases and 15 metastatic prostate cancer cases). Approximately 81% (127/156) of benign prostate tissue samples showed a positive membranous expression of TMEM79 and approximately 84% (148/177) of prostate cancer tissue samples did not express TMEM79 (Table 4). Thus, TMEM79 displayed a high sensitivity (81%) and specificity (84%) to distinguish benign prostate glands from prostate cancer. To statistically assess the hypothesis that positive TMEM79 expression is inversely associated with prostate cancer, a two-way contingency table was set up (Table 4), which classified the test variables into categories: benign tissue versus tumor (Gleason grade 2–5 and metastasis). To statistically assess the diagnostic performance criteria of TMEM79, a ROC curve was generated (S1 Fig), which produced an AUC value of 0.825 and a significant P value (P < 0.001), indicating that TMEM79 is a good diagnostic test for distinguishing between benign glands and tumor of the prostate. No association between BCR, serum PSA value pre-prostatectomy, Gleason score and TMEM79 expression was observed by either Kaplan–Meier survival analysis or multivariate Cox regression analysis on the subset of 148 patients analyzed.
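The diagnostic figures above can be checked with a few lines of arithmetic. The sketch rests on two assumptions that the text does not state outright: the cancer denominator is taken as the 177 cancer cases (162 primary plus 15 metastatic), and the ROC curve is treated as that of a single positive/negative cut-off, for which the area under the curve reduces to the mean of sensitivity and specificity.

```python
# Counts taken from the TMEM79 validation cohorts described above.
benign_total = 156
benign_positive = 127            # benign samples with membranous TMEM79
cancer_total = 162 + 15          # primary + metastatic cases (assumed denominator)
cancer_negative = 148            # cancer samples without TMEM79 expression

# "Positive" here means TMEM79-positive, i.e. the marker identifies benign glands.
sensitivity = benign_positive / benign_total
specificity = cancer_negative / cancer_total

# A binary test has one ROC operating point, (1 - specificity, sensitivity);
# joining (0, 0), that point and (1, 1) gives this trapezoidal area.
auc_single_cutoff = (sensitivity + specificity) / 2

print(f"sensitivity = {sensitivity:.3f}")        # 0.814
print(f"specificity = {specificity:.3f}")        # 0.836
print(f"AUC         = {auc_single_cutoff:.3f}")  # 0.825
```

Under these assumptions the computed area agrees with the reported AUC of 0.825, which supports reading the cancer denominator as 177.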
Acyl-CoA oxidase-like protein (ACOXL).
Similar to TMEM79, ACOXL was identified as a novel tissue marker of benign prostate following screening of protein expression in the Human Protein Atlas. In contrast to the membranous expression of TMEM79, ACOXL showed a strong granular, cytoplasmic expression pattern in benign prostatic glands. ACOXL was also identified as a “group enriched” gene, with a FPKM value of 3 in prostate tissue. Other tissues that were classified as “group enriched” for the ACOXL gene were lung (15 FPKM), urinary bladder (13 FPKM) and testis (4 FPKM). Immunostaining of ACOXL protein also showed a strong granular, cytoplasmic staining in bronchi and the fallopian tube as well as a weaker and diffuse staining pattern in lung, skin, gallbladder, salivary gland, thyroid, adrenal gland, vagina and brain.
All three benign prostate tissue cases (3/3) showed positive staining for ACOXL, and no staining was observed in 8/12 prostate cancers (see Fig 5 and www.proteinatlas.org for images of immunostained normal and cancerous prostate).
(A) Granular, cytoplasmic expression of ACOXL in a benign prostatic hyperplasia. (B) A lack of ACOXL protein expression in prostate cancer (Gleason grade 3). (C) A lack of ACOXL protein expression in prostate metastatic tumor (lymph node metastasis). (D) Granular, cytoplasmic expression of ACOXL in a benign gland with lack of ACOXL expression in surrounding tumor. Scale, 100 μm.
IHC was further performed on the four independent prostate cancer cohorts. Approximately 86% (132/154) of benign prostate tissue samples had positive cytoplasmic ACOXL expression and approximately 72% (129/179) of prostate cancer tissue samples did not express ACOXL, suggesting a high specificity and sensitivity also for ACOXL to identify benign prostate glands.
A contingency table and statistical analyses similar to those for TMEM79 were set up for the ACOXL data (Table 5). Both Pearson and Spearman correlations also showed that there is a moderate inverse relationship between the membranous expression of TMEM79 and prostate cancer. Separating the tumor category into Gleason grades and metastatic tumors showed a trend for decreased ACOXL expression in more advanced and aggressive tumors. No association between BCR and ACOXL expression in prostate cancer tissue was observed. To statistically assess the diagnostic performance criteria of ACOXL, a ROC curve was generated (S1 Fig), which produced an AUC value of 0.788 and a significant P value (P < 0.001), indicating that ACOXL is a good diagnostic test for distinguishing between benign glands and tumor of the prostate. No association between BCR, serum PSA value pre-prostatectomy, Gleason score and ACOXL expression was observed by either Kaplan–Meier survival analysis or multivariate Cox regression analysis on the subset of 148 patients analyzed.
Antibody validation for Rabbit Polyclonal anti-TMEM79 (HPA055214) and Rabbit Polyclonal anti-ACOXL (HPA035392)
To evaluate the specificity of the primary antibodies targeting ACOXL (HPA035392) and TMEM79 (HPA055214), the target proteins were knocked down using siRNA in U-2 OS and MCF-7 cells, and analyzed using immunofluorescence (S2 Fig). The immunofluorescence staining of ACOXL showed a mainly cytoplasmic staining pattern that was significantly decreased after silencing of the corresponding transcript in both U-2 OS and MCF-7 cells, with a RFI of 58% and 70%, respectively, indicating a specific binding of the ACOXL antibody to the intended target protein (S2 Fig).
For TMEM79, a significant decrease in staining intensity was observed in both cell lines (RFI of 74% and 70% for U-2 OS and MCF-7, respectively), also indicating a specific binding of the antibody to the TMEM79 protein (S2 Fig). In addition to staining in the cell membrane, as seen with IHC, immunofluorescence staining was found in the nucleoli. Additional image analysis of the nuclear staining intensity showed a slight decrease in the silenced cells compared to the controls (data not shown). However, this decrease was not significant.
Current clinical biomarkers for prostate cancer lack specificity and sensitivity to reliably distinguish aggressive versus non-aggressive prostate cancer [2–5]. Thus, determining which patients require treatment and stratification of patients that benefit from aggressive treatment strategies remains a diagnostic and clinical dilemma.
While major advancements in proteomic and metabolomic research have been seen in the past decade producing potential biomarkers for cancer, most have failed to replace existing markers due to lack of added value . A concrete approach to discover and identify specific biomarkers is to search for proteins that are specifically expressed in the tissue of interest prior to searching for such discriminating proteins in the blood or urine .
Unlike other previously published tissue specific expression studies, we combined a state of the art RNA-seq data set describing the prostate specific transcriptome with corresponding in situ protein expression in benign prostate. Our RNA-seq analysis identified 6 highly enriched genes in prostate, 16 moderately enriched genes, 85 group enriched genes and 96 prostate enhanced genes. Several genes with elevated expression in prostate (Table 3 and S2 and S3 Tables) are previously well-known genes in prostate cancer, and include KLK3 (Prostate Specific Antigen), which had the highest FPKM value (4701) of all genes with elevated expression in prostate, and ACPP (Human Prostatic Acid Phosphatase) which had a FPKM value of 1942, [32–34]. Other genes identified as highly enriched in prostate cancer were TGM4, KLK2 and KLK4, the latter kalekrein-related proteins belonging to the same family as PSA (KLK3). All three genes/proteins have been well characterized in prostate tissue . The identification of these genes in this category validates the RNA-seq and antibody profiling approach taken in our study to identifying prostate tissue specific markers.
The most interesting finding in this study was the identification of two novel, uncharacterized genes in prostate from the “group enriched” category, TMEM79 and ACOXL. Both genes showed excellent protein and RNA correlation and differential protein expression in benign and prostate tissue and, thus, were chosen to be further evaluated on a larger cohort of prostate cancer cases to investigate if they would be good markers of benign prostate in tissue.
TMEM79, which encodes transmembrane protein 79, is a member of the transmembrane protein (TMP) family which plays a crucial role in cells acting primarily as transporters and receptors . TMPs and misassembly of these proteins are related to several serious diseases. For example, the I655V mutation in the transmembrane α-helix of ERBB2 has been shown to increase risk of breast cancer and cystic fibrosis has been attributed to endoplasmic reticulum defects caused by misassembly of CFTR (the cystic fibrosis transmembrane conductance regulator) , making TMPs biological drug targets . In our study, we report strong membranous protein expression of TMEM79 in approximately 82% of benign prostate glands and a lack of membranous TMEM79 expression in 84% of prostate tumors, corresponding to approximately 82% sensitivity and 84% specificity at identifying benign prostate glands in tissue. In some cases of tumor where membranous expression was lost weak cytoplasmic expression was observed. This may indicate a translocation of the protein from the membrane internally to the cell cytoplasm in the conversion of normal epithelium to tumor epithelium. The underlying mechanisms and potential role of loss of TMEM79 expression in prostate cancer cells are unknown as the function of TMEM79 has yet to be elucidated. It could be speculated that a possible mutation or defect such as a deletion in the gene may play a role in its loss of expression as like many members of the TMP family. However, future functional studies of TMEM79 and further sequencing of prostate tumor tissue will be important to increase our knowledge regarding this promising prostate cancer marker.
Acyl-Coenzyme A oxidase-like (ACOXL) is proposed to participate in fatty acid β-oxidation, fatty acid metabolic processes and oxidation-reduction according to NCBI “Aceview” (39). ACOXL displayed strong granular cytoplasmic expression in mitochondrial regions in approximately 86% of benign glands, and this expression was lost in 72% of prostate tumors, corresponding to approximately 86% sensitivity and 72% specificity for identifying benign prostate glands in tissue. A trend towards increasing loss of ACOXL protein expression in more poorly differentiated and aggressive tumors was also noted. Recently, enrichment of the ACOXL gene was implicated in prostate cancer by metabolic quantitative trait loci analysis of serum from 402 Swedish men. This could suggest that the phenotypic transformation of normal epithelium to tumor epithelium results in loss of ACOXL protein expression in the epithelium, potentially due to leakage of ACOXL into the serum. However, analysis of protein levels in both tissue and matched serum is warranted to evaluate this hypothesis.
Our study identified two potential, novel biomarkers of benign prostate, TMEM79 and ACOXL. We observed high sensitivity and specificity of these markers for detecting benign prostate, which suggests that they could help to detect benign tissue on biopsy and so assist the pathological diagnosis of benign glands in combination with basal cell markers such as p63. Although these markers do not show as high sensitivity and specificity for identifying benign glands as established diagnostic IHC basal cell markers, such as TP63 or antibodies detecting high molecular weight cytokeratins (34βE12), we have shown that they are specific for benign prostate epithelium. Based on previously reported serum analysis in combination with our findings, we speculate that ACOXL could be leaked into serum during tumor growth, yielding a potential marker for prostate cancer screening. However, further analysis of TMEM79 and ACOXL at a number of levels, including functional analysis and analysis of ACOXL expression levels in matched prostate cancer tissue and serum, is warranted to determine whether they have clinical impact in disease screening.
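The sensitivity and specificity figures quoted above follow from a simple 2x2 tabulation of marker status against gland type, and can be related to the AUC values reported in S1 Fig. The Python sketch below illustrates that arithmetic. The class and function names are hypothetical, the counts are illustrative only (100 glands per group, chosen to match the reported percentages, and not the study's data), and the single-point AUC assumes the marker was scored as a dichotomous positive/negative call, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkerCounts:
    """2x2 counts for a marker used to call a gland benign (hypothetical layout)."""
    benign_positive: int  # benign glands expressing the marker (true positives)
    benign_negative: int  # benign glands lacking expression (false negatives)
    tumor_positive: int   # tumors retaining expression (false positives)
    tumor_negative: int   # tumors that lost expression (true negatives)


def sensitivity(c: MarkerCounts) -> float:
    """Fraction of benign glands correctly called benign."""
    return c.benign_positive / (c.benign_positive + c.benign_negative)


def specificity(c: MarkerCounts) -> float:
    """Fraction of tumors correctly called non-benign."""
    return c.tumor_negative / (c.tumor_negative + c.tumor_positive)


def single_point_auc(sens: float, spec: float) -> float:
    """Trapezoidal area under a ROC curve that has one operating point."""
    return (sens + spec) / 2.0


# Illustrative counts only, NOT the study's data.
tmem79_like = MarkerCounts(benign_positive=82, benign_negative=18,
                           tumor_positive=16, tumor_negative=84)
acoxl_like = MarkerCounts(benign_positive=86, benign_negative=14,
                          tumor_positive=28, tumor_negative=72)

for name, counts in (("TMEM79-like", tmem79_like), ("ACOXL-like", acoxl_like)):
    sens, spec = sensitivity(counts), specificity(counts)
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f} "
          f"single-point AUC={single_point_auc(sens, spec):.3f}")
```

Under these assumptions the rounded percentages give single-point AUCs of 0.830 and 0.790, close to the 0.825 and 0.788 reported in S1 Fig; the small differences would be consistent with the percentages in the text being approximate.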
In conclusion, our approach of combining RNA-seq analysis with immunohistochemistry-based protein profiling in normal human tissues provides an advantageous strategy for identifying tissue-specific markers of diseases such as prostate cancer. Using this strategy, TMEM79 and ACOXL were identified as two novel candidate biomarkers for prostate cancer.
S1 Fig. Receiver Operating Characteristic (ROC) curve analysis. (A) ROC curve testing the diagnostic performance criteria of TMEM79, where an AUC of 0.825 is observed. (B) ROC curve testing the diagnostic performance criteria of ACOXL, where an AUC of 0.788 is observed.
S2 Fig. Antibody validation of TMEM79 (HPA035392) and ACOXL (HPA055214) by siRNA gene knock down and immunofluorescence.
(A) ACOXL immunofluorescence staining of U-2 OS cells with ACOXL siRNA gene knock down and U-2 OS control cells. (B) ACOXL immunofluorescence staining of MCF-7 cells with ACOXL siRNA gene knock down and MCF-7 control cells. (C) TMEM79 immunofluorescence staining of U-2 OS cells with TMEM79 siRNA gene knock down and U-2 OS control cells. (D) TMEM79 immunofluorescence staining of MCF-7 cells with TMEM79 siRNA gene knock down and MCF-7 control cells.
S1 Table. The top 30 genes with the highest levels of expression in the prostate.
S2 Table. Group enriched genes which were defined as having at least 5-fold higher FPKM level in a group of 2–7 tissues including prostate compared to all other tissues.
We acknowledge the entire staff of the Human Protein Atlas program and the Science for Life Laboratory for valuable contributions. We thank the Department of Pathology at the Uppsala Akademiska hospital, Uppsala, Sweden and Uppsala Biobank for kindly providing clinical diagnostics and specimens used in this study. Funding was provided by the Knut and Alice Wallenberg Foundation and the Swedish Cancer Foundation and by the Marie Curie Industry-Academia Partnerships and Pathways program, FAST-PATH (No. 285910).
Conceived and designed the experiments: FP GOH MU. Performed the experiments: GOH CS LF BH. Analyzed the data: GOH CB EL CS JS. Contributed reagents/materials/analysis tools: AT KJ AB. Wrote the paper: GOH FP WG.
- 1. Brawer MK, Meyer GE, Letran JL, Bankson DD, Morris DL, Yeung KK, et al. Measurement of complexed PSA improves specificity for early detection of prostate cancer. Urology. 1998;52(3):372–8. Epub 1998/09/08. pmid:9730446.
- 2. Etzioni R, Penson DF, Legler JM, di Tommaso D, Boer R, Gann PH, et al. Overdiagnosis due to prostate-specific antigen screening: lessons from U.S. prostate cancer incidence trends. J Natl Cancer Inst. 2002;94(13):981–90. Epub 2002/07/04. pmid:12096083.
- 3. Carter HB, Metter EJ, Wright J, Landis P, Platz E, Walsh PC. Prostate-specific antigen and all-cause mortality: results from the Baltimore Longitudinal Study On Aging. J Natl Cancer Inst. 2004;96(7):557–8. Epub 2004/04/08. pmid:15069122.
- 4. Carter HB. Management of low (favourable)-risk prostate cancer. BJU Int. 2011;108(11):1684–95. Epub 2011/11/15. pmid:22077546; PubMed Central PMCID: PMC4086468.
- 5. Bangma CH, Roobol MJ. Defining and predicting indolent and low risk prostate cancer. Crit Rev Oncol Hematol. 2012;83(2):235–41. Epub 2011/10/29. pmid:22033113.
- 6. Lilja H. A kallikrein-like serine protease in prostatic fluid cleaves the predominant seminal vesicle protein. J Clin Invest. 1985;76(5):1899–903. Epub 1985/11/01. pmid:3902893; PubMed Central PMCID: PMC424236.
- 7. Makarov DV, Loeb S, Getzenberg RH, Partin AW. Biomarkers for prostate cancer. Annu Rev Med. 2009;60:139–51. Epub 2008/10/25. pmid:18947298.
- 8. Kristiansen G. Diagnostic and prognostic molecular biomarkers for prostate cancer. Histopathology. 2012;60(1):125–41. Epub 2012/01/04. pmid:22212082.
- 9. Fiorentino M, Capizzi E, Loda M. Blood and tissue biomarkers in prostate cancer: state of the art. Urol Clin North Am. 2010;37(1):131–41, Table of Contents. Epub 2010/02/16. pmid:20152526; PubMed Central PMCID: PMC3784983.
- 10. Ploussard G, de la Taille A. Urine biomarkers in prostate cancer. Nat Rev Urol. 2010;7(2):101–9. Epub 2010/01/13. pmid:20065953.
- 11. Schaefer A, Jung M, Kristiansen G, Lein M, Schrader M, Miller K, et al. MicroRNAs and cancer: current state and future perspectives in urologic oncology. Urol Oncol. 2010;28(1):4–13. Epub 2009/01/02. pmid:19117772.
- 12. Duijvesz D, Luider T, Bangma CH, Jenster G. Exosomes as biomarker treasure chests for prostate cancer. Eur Urol. 2011;59(5):823–31. Epub 2011/01/05. pmid:21196075.
- 13. Fagerberg L, Hallstrom BM, Oksvold P, Kampf C, Djureinovic D, Odeberg J, et al. Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics. Mol Cell Proteomics. 2014;13(2):397–406. Epub 2013/12/07. pmid:24309898; PubMed Central PMCID: PMC3916642.
- 14. Ponten F, Schwenk JM, Asplund A, Edqvist PH. The Human Protein Atlas as a proteomic resource for biomarker discovery. J Intern Med. 2011;270(5):428–46. Epub 2011/07/15. pmid:21752111.
- 15. Uhlen M, Oksvold P, Fagerberg L, Lundberg E, Jonasson K, Forsberg M, et al. Towards a knowledge-based Human Protein Atlas. Nat Biotechnol. 2010;28(12):1248–50. Epub 2010/12/09. pmid:21139605.
- 16. Kampf C, Olsson I, Ryberg U, Sjostedt E, Ponten F. Production of tissue microarrays, immunohistochemistry staining and digitalization within the human protein atlas. J Vis Exp. 2012;(63). Epub 2012/06/13. pmid:22688270; PubMed Central PMCID: PMC3468196.
- 17. Ponten F, Jirstrom K, Uhlen M. The Human Protein Atlas—a tool for pathology. J Pathol. 2008;216(4):387–93. Epub 2008/10/15. pmid:18853439.
- 18. Danielsson F, Skogs M, Huss M, Rexhepaj E, O'Hurley G, Klevebring D, et al. Majority of differentially expressed genes are down-regulated during malignant transformation in a four-stage model. Proc Natl Acad Sci U S A. 2013;110(17):6853–8. Epub 2013/04/10. pmid:23569271; PubMed Central PMCID: PMC3637701.
- 19. Tassidis H, Brokken LJ, Jirstrom K, Bjartell A, Ulmert D, Harkonen P, et al. Low expression of SHP-2 is associated with less favorable prostate cancer outcomes. Tumour Biol. 2013;34(2):637–42. Epub 2012/11/30. pmid:23192641.
- 20. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28(5):511–5. Epub 2010/05/04. pmid:20436464; PubMed Central PMCID: PMC3146043.
- 21. Flicek P, Amode MR, Barrell D, Beal K, Brent S, Carvalho-Silva D, et al. Ensembl 2012. Nucleic Acids Res. 2012;40(Database issue):D84–90. Epub 2011/11/17. pmid:22086963; PubMed Central PMCID: PMC3245178.
- 22. Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol. 2010;11(10):R106. Epub 2010/10/29. pmid:20979621; PubMed Central PMCID: PMC3218662.
- 23. R Core Team. R: A language and environment for statistical computing. Vienna, Austria; 2013.
- 24. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–504. Epub 2003/11/05. pmid:14597658.
- 25. Hebenstreit D, Fang M, Gu M, Charoensawan V, van Oudenaarden A, Teichmann SA. RNA sequencing reveals two major classes of gene expression levels in metazoan cells. Mol Syst Biol. 2011;7:497. Epub 2011/06/10. pmid:21654674; PubMed Central PMCID: PMC3159973.
- 26. Stadler C, Skogs M, Brismar H, Uhlen M, Lundberg E. A single fixation protocol for proteome-wide immunofluorescence localization studies. J Proteomics. 2010;73(6):1067–78. Epub 2009/11/10. pmid:19896565.
- 27. Stadler C, Hjelmare M, Neumann B, Jonasson K, Pepperkok R, Uhlen M, et al. Systematic validation of antibody binding and protein subcellular localization using siRNA and confocal microscopy. J Proteomics. 2012;75(7):2236–51. Epub 2012/03/01. pmid:22361696.
- 28. Uhlen M, Oksvold P, Fagerberg L, Lundberg E, Jonasson K, Forsberg M, et al. Towards a knowledge-based Human Protein Atlas. Nat Biotechnol. 2010;28(12):1248–50. Epub 2010/12/09. pmid:21139605.
- 29. Uniprot. Available from: http://www.uniprot.org/uniprot/Q9BSE2.
- 30. Ferlay J, Shin HR, Bray F, Forman D, Mathers C, Parkin DM. Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008. Int J Cancer. 2010;127(12):2893–917. Epub 2011/02/26. pmid:21351269.
- 31. Issaq HJ, Waybright TJ, Veenstra TD. Cancer biomarker discovery: Opportunities and pitfalls in analytical methods. Electrophoresis. 2010;32(9):967–75. Epub 2011/03/31. pmid:21449066.
- 32. Kong HY, Byun J. Emerging Roles of Human Prostatic Acid Phosphatase. Biomol Ther (Seoul). 2013;21(1):10–20. Epub 2013/09/07. pmid:24009853; PubMed Central PMCID: PMC3762301.
- 33. Gunia S, Koch S, May M, Dietel M, Erbersdobler A. Expression of prostatic acid phosphatase (PSAP) in transurethral resection specimens of the prostate is predictive of histopathologic tumor stage in subsequent radical prostatectomies. Virchows Arch. 2009;454(5):573–9. Epub 2009/03/21. pmid:19301031.
- 34. Henttu P, Vihko P. Prostate-specific antigen and human glandular kallikrein: two kallikreins of the human prostate. Ann Med. 1994;26(3):157–64. Epub 1994/06/01. pmid:7521173.
- 35. Borgono CA, Michael IP, Diamandis EP. Human tissue kallikreins: physiologic roles and applications in cancer. Mol Cancer Res. 2004;2(5):257–80. Epub 2004/06/12. pmid:15192120.
- 36. Wang H, He Z, Zhang C, Zhang L, Xu D. Transmembrane protein alignment and fold recognition based on predicted topology. PLoS One. 2013;8(7):e69744. Epub 2013/07/31. pmid:23894534; PubMed Central PMCID: PMC3716705.
- 37. Ng DP, Poulsen BE, Deber CM. Membrane protein misassembly in disease. Biochim Biophys Acta. 2012;1818(4):1115–22. Epub 2011/08/16. pmid:21840297.
- 38. Klabunde T, Hessler G. Drug design strategies for targeting G-protein-coupled receptors. Chembiochem. 2002;3(10):928–44. Epub 2002/10/04. pmid:12362358.
- 39. NCBI Aceview. Available from: http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/av.cgi?db=human&c=Gene&l=ACOXL.
- 40. Hong MG, Karlsson R, Magnusson PK, Lewis MR, Isaacs W, Zheng LS, et al. A genome-wide assessment of variability in human serum metabolism. Hum Mutat. 2013;34(3):515–24. Epub 2013/01/03. pmid:23281178.