PTRF/Cavin-1 and MIF Proteins Are Identified as Non-Small Cell Lung Cancer Biomarkers by Label-Free Proteomics

With the completion of the human genome sequence, biomedical sciences have entered the “omics” era, mainly due to high-throughput genomics techniques and the recent application of mass spectrometry to proteomics analyses. However, there is still a time lag between these technological advances and their application in the clinical setting. Our work is designed to build bridges between high-performance proteomics and clinical routine. Protein extracts were obtained from fresh frozen normal lung and non-small cell lung cancer samples. We applied phosphopeptide enrichment followed by LC-MS/MS. Subsequent label-free quantification and bioinformatics analyses were performed. We assessed protein patterns in these samples, showing dozens of differential markers between normal and tumor tissue. Gene ontology and interactome analyses identified signaling pathways altered in tumor tissue. We have identified two proteins, PTRF/cavin-1 and MIF, which are differentially expressed between normal lung and non-small cell lung cancer. These potential biomarkers were validated using western blot and immunohistochemistry. The application of discovery-based proteomics analyses in clinical samples allowed us to identify new potential biomarkers and therapeutic targets in non-small cell lung cancer.


Introduction
Lung cancer is the leading cause of cancer death in the world. The overall survival rate at 5 years is 15% and has not improved for decades. Two thirds of patients are diagnosed with advanced disease, where therapeutic options are palliative, and up to 55% of patients with limited disease eventually relapse after radical surgery [1].
Gene expression profiling has led to the identification of groups of patients with different outcome, thus reflecting the heterogeneity of this disease [2]. However, gene-level analyses do not detect subtle changes caused by post-translational modifications of proteins [3]. A deep understanding of the processes of carcinogenesis, tumor progression and metastasis requires the analysis of both the genome and the proteome [4]. Proteomic technologies based on mass spectrometry (MS) have emerged as preferred components of a strategy to discover diagnostic, prognostic and therapeutic protein biomarkers [5]. Continuing advances in this field give this strategy an enormous potential for such investigations [6,7].
Recent clinical trials demonstrating good response to new drugs in specific subgroups of patients underline the need for molecular tests that complement classical histopathological procedures [8]. In this context, proteomic profiling can provide valuable biomarker tools for efficient patient stratification and therapy selection.
Although it is possible to analyze proteins from tissues using mass spectrometry [3,9], the complexity of the clinical sample and the amount of available protein are limiting factors. Therefore, sample enrichment in biologically relevant analytes is required [5]. Most eukaryotic cellular processes are regulated by protein phosphorylation, and deregulation of this key post-translational modification is common in cancer and other diseases. This explains why protein kinases have emerged as the main class of new drug targets in oncology and other fields [10]. In this work we have applied phosphopeptide enrichment coupled with label-free MS techniques to identify already known and new potential biomarkers in non-small cell lung cancer clinical tissues and validate them using western blot and immunohistochemistry.

Ethics statement
Institutional approval from our ethical committee was obtained for the conduct of the study (Comité Ético de Investigación Clínica, Hospital Universitario La Paz). Data were analyzed anonymously. Patients provided written consent so that their samples and clinical data could be used for investigational purposes.

Sample selection
Frozen samples from patients diagnosed with lung cancer were retrieved from the Department of Pathology of Hospital Universitario La Paz (Madrid, Spain): 5 lung adenocarcinoma (AC), 5 lung squamous cell carcinoma (SC) and 5 normal lung (NL) samples. The histopathological features of each sample were reviewed by an experienced lung pathologist to confirm diagnosis and tumor content. At least 50% of a sample had to be made up of tumor cells for it to be eligible. Samples from patients were kindly provided by the IdiPAZ Biobank (RD09/0076/00073) integrated in the Spanish Hospital Biobanks Network (RetBioH; www.redbiobancos.es). Samples were registered and processed following current procedures and fixed/frozen immediately after their reception.

Total protein extraction, solubilization and digestion
Samples were cut in a Leica CM3050S cryostat, obtaining 10 sections of 10 µm thickness each. Tissue was processed with TRIzol reagent (Invitrogen) following the manufacturer's instructions. For MS analyses, protein pellets were resuspended in 6 M guanidine hydrochloride and heated for 10 minutes at 95 °C with agitation. Subsequently, 950 µl of 50 mM ammonium bicarbonate (pH 7-9) per sample were added. Protein sample concentration was measured by MicroBCA Protein Assay Kit (Pierce-Thermo Scientific). Trypsin MS Grade Gold (Promega) was added to each sample at a 1:50 ratio. Digestion was carried out overnight at 37 °C. The digested sample was divided into two aliquots.

Parallel IMAC (PIMAC)
Phosphopeptide enrichment was carried out as described previously [11]. Briefly, Fe(III)-based IMAC was performed in one aliquot of digested protein using the PHOS-Select Iron Affinity Gel (Sigma-Aldrich) following the manufacturer's instructions. Ga(III)-based IMAC was performed in another aliquot of digested protein using the Phosphopeptide Isolation Kit (Pierce-Thermo Scientific) following the manufacturer's instructions. Eluates were mixed, vacuum-dried and stored at −20 °C for later MS analysis.

LC-MS/MS analyses
Peptide mixtures were subjected to nano-liquid chromatography coupled with MS for protein identification. Peptides were injected into a C-18 reversed phase (RP) nano-column (100 µm I.D. and 12 cm, Mediterranea sea, Teknokroma) and analyzed in a continuous acetonitrile gradient consisting of 0-40% B in 90 min, 50-90% B in 20 min (B = 95% acetonitrile, 0.5% acetic acid). At the end of the gradient, the column was washed with 90% B and equilibrated with 5% B for 20 min. A flow rate of 300 nl/min was used to elute peptides from the RP nano-column to an emitter nanospray needle for real time ionization and peptide fragmentation on an LTQ-Orbitrap XL mass spectrometer (Thermo-Fisher). An enhanced FT-resolution spectrum (resolution = 60000) followed by the MS/MS spectra from the five most intense parent ions were analyzed along the chromatographic run (130 min). Dynamic exclusion was set at 1 min. For protein identification, fragmentation spectra were searched against the MSDB database (version 091509) using the Mascot 2.1 program (Matrixscience). Two missed cleavages were allowed, and an error of 10 ppm or 0.8 Da was set for full MS or MS/MS spectra searches, respectively. All identifications were performed by Proteome Discoverer 1.0 software (Thermo-Fisher). Decoy database search for false discovery rate analysis was set at 0.05 by applying corresponding filters. Raw data files were processed and compared with SIEVE version 1.2 (Thermo-Fisher). Protein identifications were validated using the BLAST tool from the blastp suite (http://blast.ncbi.nlm.nih.gov). For detailed peptide mass fingerprint and protein identification settings, see Table S4.
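The 10 ppm precursor tolerance quoted above is a relative error, so the absolute m/z window it defines grows with the mass of the ion. The following minimal sketch illustrates that arithmetic only; the function names are hypothetical and it does not reproduce how Mascot or Proteome Discoverer apply the tolerance internally.

```python
def ppm_window(mz: float, tol_ppm: float = 10.0) -> tuple[float, float]:
    """Return the (low, high) m/z bounds for a relative tolerance in ppm."""
    delta = mz * tol_ppm / 1e6  # absolute half-width of the window
    return mz - delta, mz + delta


def within_tolerance(observed: float, theoretical: float,
                     tol_ppm: float = 10.0) -> bool:
    """True if the observed m/z lies within tol_ppm of the theoretical m/z."""
    error_ppm = abs(observed - theoretical) / theoretical * 1e6
    return error_ppm <= tol_ppm


# Illustrative value: at m/z 1000, 10 ppm corresponds to +/- 0.01 m/z units.
low, high = ppm_window(1000.0)
```

By contrast, the 0.8 Da tolerance used for MS/MS spectra is an absolute window and does not scale with m/z.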

Immunoblotting assays
For immunoblotting assays, protein pellets were resuspended in 2% SDS and heated for 10 minutes at 95 °C with agitation. Protein sample concentration was measured by MicroBCA Protein Assay Kit (Pierce-Thermo Scientific). Western blots were performed using the WesternDot system (Invitrogen) in a SNAP i.d. device (Millipore). MIF antibody at 1:250 dilution (R&D Systems) and PTRF antibody at 1:125 dilution (BD Biosciences) were used. Densitometry analyses were performed using ImageJ 1.38e software (http://rsb.info.nih.gov/ij/) to measure the intensity of bands. For western blot normalization, total protein loading was measured using the Novex Reversible Membrane Stain Kit (Invitrogen).

Immunohistochemistry
Formalin-fixed, paraffin-embedded tissue blocks, representative of normal lung and non-small cell lung cancer diagnosis, were retrieved following routine histopathological assessment. Sections were processed using a Dako Autostainer universal staining system (Dako). For this study, 3.5-µm sections were immunostained with anti-MIF 1:2000 (R&D Systems) or anti-PTRF 1:100 (BD Biosciences). Images were obtained in a Leica microscope at ×40 magnification. The percentage of stained tissue and the stain intensity (0, +, ++ or +++) were obtained for each sample and marker evaluated. IHC staining was considered positive when at least 50% of the tissue (normal or tumoral) was stained with at least ++.
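The positivity criterion above can be written as a simple decision rule. The sketch below assumes that each sample is summarized by one percentage of stained tissue and one intensity grade; the function and variable names are hypothetical and are not part of the original scoring procedure.

```python
# Ordinal ranks for the intensity grades used in the scoring scheme.
INTENSITY_RANK = {"0": 0, "+": 1, "++": 2, "+++": 3}


def ihc_positive(percent_stained: float, intensity: str) -> bool:
    """Positive when >= 50% of the tissue is stained at ++ or stronger."""
    if intensity not in INTENSITY_RANK:
        raise ValueError(f"unknown intensity grade: {intensity!r}")
    return percent_stained >= 50 and INTENSITY_RANK[intensity] >= 2


# Illustrative calls with made-up scores.
example_positive = ihc_positive(70, "++")
example_negative = ihc_positive(70, "+")
```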

Statistical Analyses
Expression values between sample groups were compared using a Kruskal-Wallis test (Gaussian Approximation). To assess differences between pairs of groups, Dunn's Multiple Comparison Test was used. A p-value < 0.05 was considered significant. SIEVE and densitometry values were compared using Pearson's correlation coefficient.
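For readers unfamiliar with the Kruskal-Wallis test, the following sketch computes the tie-corrected H statistic for three groups (for example AC, SC and NL) using only the Python standard library. It is an illustration rather than the software used in the study: the p-value relies on the chi-square approximation, which has the closed form exp(-H/2) only for two degrees of freedom, and Dunn's post hoc test is not implemented. The input values are made up.

```python
import math
from collections import Counter


def kruskal_wallis_3(g1, g2, g3):
    """Return (H, p) for three groups; p uses the chi-square approximation."""
    groups = [list(g1), list(g2), list(g3)]
    values = sorted(v for g in groups for v in g)
    n = len(values)

    # Assign each distinct value the average of the ranks it occupies.
    rank = {}
    i = 0
    while i < n:
        j = i
        while j < n and values[j] == values[i]:
            j += 1
        rank[values[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_term = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Standard correction for tied observations.
    ties = Counter(values)
    correction = 1 - sum(t**3 - t for t in ties.values()) / (n**3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction

    p = math.exp(-h / 2)  # chi-square survival function for df = 2
    return h, p


# Made-up expression values for three completely separated groups.
h_stat, p_value = kruskal_wallis_3([1, 2, 3, 4, 5],
                                   [6, 7, 8, 9, 10],
                                   [11, 12, 13, 14, 15])
```

With only five samples per group the chi-square approximation is coarse, so in practice a dedicated statistics package would normally be preferred.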

Bioinformatics
Protein lists were processed using The Database for Annotation, Visualization and Integrated Discovery (DAVID) version 2.0 (http://david.abcc.ncifcrf.gov/home.jsp) [12,13]. To identify under- and over-represented functional categories we used the Protein ANalysis THrough Evolutionary Relationships (PANTHER) database v 6.1 (www.pantherdb.org) [14]. The tumor protein list was compared to the normal lung list using the binomial test [15] for each molecular function, biological process or pathway term in PANTHER. Protein-protein interactions were obtained from the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database v9.0 containing known and predicted physical and functional protein-protein interactions [16]. STRING in protein mode was used, and only interactions based on experimental protein-protein interaction data and curated databases, with confidence levels over 0.5, were kept.
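The binomial comparison of a test list against a reference list can be sketched as follows. The reference list (here, normal lung) fixes the expected proportion of proteins in a category, and the test asks how likely the count observed in the tumor list is under that proportion. This is one common formulation of the test; whether PANTHER v6.1 computes exactly this tail probability is an assumption, and all counts below are made up.

```python
import math


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials with success rate p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def representation_test(k_test: int, n_test: int, k_ref: int, n_ref: int):
    """Return (direction, p_value) for one category.

    k_test / n_test: proteins in the category / total proteins, test list.
    k_ref / n_ref:   the same counts for the reference list.
    """
    p0 = k_ref / n_ref            # expected proportion from the reference
    expected = n_test * p0
    if k_test > expected:
        # Over-representation: probability of k_test or more by chance.
        tail = range(k_test, n_test + 1)
        direction = "over"
    else:
        # Under-representation: probability of k_test or fewer by chance.
        tail = range(0, k_test + 1)
        direction = "under"
    return direction, sum(binom_pmf(k, n_test, p0) for k in tail)


# Made-up counts: 10 of 20 tumor proteins versus 50 of 200 reference proteins.
direction, p_value = representation_test(10, 20, 50, 200)
```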

Results
In this study, we assessed differences at the protein level between non-small cell lung cancer (NSCLC) and normal lung tissue using a phosphopeptide enrichment strategy and a label-free approach. Samples were analyzed on an LTQ-Orbitrap XL after being subjected to liquid chromatography. Since it is known that different techniques isolate distinct and overlapping segments of the phosphoproteome [17], including Fe(III) and Ga(III) IMAC [18], we mixed the Fe(III) and Ga(III) IMAC fractions from each sample and analyzed them together.
We evaluated the number of unique peptides and their corresponding proteins, as well as phosphopeptides and their corresponding phosphoproteins, identified in lung adenocarcinoma (AC), lung squamous cell carcinoma (SC) and normal lung (NL) samples, applying a decoy database search at a false discovery rate < 0.05. The extensive analysis performed in NSCLC and NL samples using LC-MS/MS allowed us to identify a mean of 381 unique peptides per sample, of which a mean of 56 were phosphopeptides. These peptides corresponded to a mean of 138 unique proteins identified per sample, of which a mean of 39 were phosphorylated. The fraction of phosphopeptides identified (number of phosphorylated peptides × 100 / number of identified peptides) was 19.9%.
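Applying the formula above directly to the reported means (56 phosphopeptides out of 381 peptides) gives about 14.7%, not 19.9%. One possible explanation, offered here only as an assumption, is that the fraction was computed per sample and then averaged, which in general differs from the ratio of the means. The sketch below illustrates the difference with hypothetical per-sample counts chosen so that their means equal 56 and 381; they are not the study's data.

```python
from statistics import mean


def phospho_fraction(n_phospho: float, n_total: float) -> float:
    """Percentage of identified peptides that are phosphorylated."""
    return 100.0 * n_phospho / n_total


# Hypothetical (phosphopeptides, identified peptides) counts for three samples.
samples = [(30, 100), (60, 400), (78, 643)]

# Option 1: compute the fraction in each sample, then average the fractions.
mean_of_fractions = mean(phospho_fraction(p, t) for p, t in samples)

# Option 2: average the counts first, then compute a single fraction.
fraction_of_means = phospho_fraction(mean(p for p, _ in samples),
                                     mean(t for _, t in samples))
```

In this made-up example the first option gives roughly 19% and the second roughly 14.7%, showing that samples with few identified peptides but a high phosphopeptide share can raise the averaged fraction.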
Gene ontology analyses were performed using all identified proteins. The tumor protein list was compared to the normal lung protein list for each molecular function (Figure S1), biological process (Figure S2), or pathway (Figure 1) term using PANTHER. This approach showed significant differences between normal lung and tumor samples (complete analyses are provided in Table S1). Differences in molecular functions are mainly related to the interaction with nucleic acids and the regulation of protein synthesis and activity. Processes controlling exocytosis, immune response, response to stimulus, response to stress and transport were significantly under-represented in tumors, whereas categories related to cell-matrix adhesion or response to toxin were over-represented. On the other hand, homeostasis categories were under-represented, whereas categories related to energy production and cell proliferation were over-represented in tumors. Remarkable differences in pathway analysis appeared in categories related to signal transduction control. While cytoskeletal regulation by Rho GTPase, inflammation mediated by chemokine and cytokine signaling pathway, integrin signaling pathway and Wnt signaling pathway were under-represented in tumor samples, EGF receptor signaling pathway, Glycolysis, p53 pathway and PI3 kinase pathway were over-represented.
Differential expression analysis between NSCLC and normal lung was performed using SIEVE 1.2 software. A total of 296 differentially expressed m/z peaks were found, 115 of which had available MS2 spectra, leading to the identification of proteins differentially expressed between normal lung and NSCLC samples (Table 1). All data obtained from SIEVE analyses, including relative expression values, are provided in Table S2.
PTRF/cavin-1 and MIF stand out among the proteins differentially expressed between NSCLC and normal lung samples in label-free analyses as the most down-regulated and up-regulated, respectively (Figure 2). PTRF/cavin-1 showed loss of expression in both adenocarcinoma and squamous cell carcinoma samples. On the other hand, MIF showed an increased expression in these samples, which was confirmed by western blot and immunohistochemistry (Figures 3 and 4). Tumor samples also confirmed loss of PTRF/cavin-1 expression when compared with normal lung (Figures 3 and 4) using both techniques. Pearson's correlation between SIEVE label-free expression values and western blot quantification was r = 0.723 and r = 0.754 (p < 0.005) for MIF and PTRF/cavin-1, respectively.
We searched the STRING database for interactomic connections of MIF and PTRF. In order to minimize the rate of false positives, we eliminated partners using stringent criteria, and only experimental protein-protein interactions and pathways from curated databases were taken into account. PTRF is included in the RNA transcription pathway, and physically interacts with TTF1. Other PTRF interactions comprise proteins involved in transcription regulation and EGFR (Figure 5). MIF is related to the phenylalanine metabolism pathway, and interacts with p53 and proteins of the COP9 signalosome complex, a complex involved in various cellular and developmental processes, including p53 phosphorylation-mediated degradation [19]. Other interactions comprise proteins related to cell death regulation and inflammatory processes (Figure 6).

Discussion
Proteomics in general and phosphoproteomics in particular are becoming the preferred methods of protein discovery-based analyses. The use of label-free, discovery-based approaches may help discover unexpected biological connections due to the absence of previous knowledge bias. Bioinformatics tools, such as gene ontology and interactome analyses, applied to clinical samples have great potential to identify pathways and molecules with implications at the therapeutic level and may offer clues to the genesis of diseases and their underlying molecular alterations. However, both the technology itself and data analysis tools should be further refined before their entry into the clinic.
Phosphopeptide enrichment of samples prior to MS analysis using PIMAC worked reasonably well, as 20% of measured peptides were phosphorylated. Previous studies have shown an enrichment of phosphopeptides of approximately 50% using an IMAC protocol similar to ours on a tryptic digest of a mixture of several reference proteins [20]. Considering that our samples were very complex and that we did not use any fractionation step, phosphopeptide enrichment was successful and comparable with that obtained in previous works [21,22]. However, most of the spectra showing a phosphate loss presented poor fragmentation, and no peptide identification was generated. The use of new fragmentation techniques, such as higher-energy collisional dissociation, improves the quality of fragmentation spectra [23], allowing large-scale phosphoproteome analysis.
Gene ontology analyses of biological processes and pathways showed an increase in categories related to energy production in cells, such as glycolysis and generation of precursor metabolites and energy. These differences in energy metabolism between normal and tumor cells are known as the Warburg effect [24]. Among the signaling pathways under-represented in NSCLC tissues, chemokine- and cytokine-mediated inflammation has been previously shown to be under-represented in NSCLC [25]. The over-representation of proteins belonging to the EGFR signaling pathway is remarkable, in a context where the clinical use of EGFR inhibitors has become the paradigm of personalized therapy for NSCLC [26,27,28]. More than 10% of detected peptides showed a differential expression between normal and tumor samples. The percentage of differential peptides was less than 2% when comparing adenocarcinoma and lung squamous cell carcinoma samples, but still there were substantial differences between these two NSCLC histological subtypes, as we have previously demonstrated [11].
We were able to validate NSCLC potential biomarkers identified in shotgun proteomics analyses using IHC and western blot approaches. MIF (macrophage migration inhibitory factor) discriminated between normal lung and NSCLC samples. This well-known factor is a proinflammatory cytokine capable of acting as a soluble growth factor, expressed and secreted in response to mitogens and integrin-mediated signals. MIF protein is involved in many malignancies, as it promotes cellular transformation, inhibits cytolytic immune response against tumor cells and promotes neovascularization [29]. Interactome analyses revealed a close relation between cell death regulation and MIF, and it is not surprising that MIF over-expression has been described in many types of cancer, including colorectal, breast, prostate, skin and lung cancer [30,31], having a major role in the development of tumors in the central nervous system [32]. Tumors co-expressing MIF and its membrane receptor (CD74 protein) have increased vascularization [33]. Although there are several molecules that inhibit the enzymatic activity of MIF, their high IC50 has limited their clinical use so far [34], but new molecules are currently under development [35]. Our results show an increased expression of MIF in NSCLC samples by label-free proteomics, confirmed by both western blot and immunohistochemistry.
PTRF (Polymerase I and Transcript Release Factor), also known as cavin-1, is a protein essential for RNA transcription [36] and caveolae formation [37]. These invaginations of the cell surface are associated with processes of vesicular transport, cholesterol homeostasis, signal transduction [38] and lipolysis control [39]. Therefore, it is not surprising that PTRF/cavin-1 mutations are associated with congenital generalized lipodystrophy type 4 in humans [40]. PTRF/cavin-1 colocalizes with caveolin 1 (CAV1) within caveolae [41], and positively modulates its expression [42]. Interactome analyses suggest that PTRF harbors unknown functions beyond some recently described [43,44,45]. Loss of PTRF/cavin-1 expression in prostate cancer has been related to progression [46], and it has been demonstrated that its expression decreases the migration of PTRF/cavin-1-deficient prostate cancer cells [47]. The loss of PTRF/cavin-1 expression in tumorigenic HBE cells as compared with normal human bronchial epithelial cells has been shown recently [48]. Bai and colleagues have reported recently that PTRF protein was down-regulated in breast cancer cell lines and breast tumor tissue, and that down-regulation of PTRF in breast cancer cells was associated with promoter methylation [49]. PTRF/cavin-1 phosphorylated species have been described in cells that over-express EGFR, which suggests a function in this signaling pathway [50]. Our label-free proteomics results indicate that PTRF expression is lost in NSCLC samples. These results were confirmed using both western blot and immunohistochemical staining. This is the first study showing PTRF/cavin-1 loss of expression in NSCLC tumor tissue at the protein level. This loss of expression, along with PTRF-EGFR interaction and EGFR pathway deregulation in NSCLC samples, suggests a role of PTRF in NSCLC development.
Our work demonstrates that it is possible to identify potential biomarkers using a label-free differential proteomics strategy on real clinical samples. We identified several differential markers, two of which were validated by alternative classical proteomic methods. Moreover, we show that gene ontology and interaction analyses can identify pathways and processes altered in tumor tissue, which may provide clues to the genesis of the disease and its underlying molecular alterations, and could be susceptible to therapeutic intervention. In this sense, this work indicates that the role of PTRF in NSCLC and its relationship with the EGFR pathway deserve further exploration.

Supporting Information
Figure S1 Analysis of differences in GO Molecular Function between NSCLC and normal lung. Comparison of number of proteins assigned to each GO pathway category. Normal tissue sample categories are represented as fold-change in relation to this category. Statistical significance is tested using the binomial test. Only significant categories (p < 0.05) are shown. (TIF)
Figure S2 Analysis of differences in GO Biological Process between NSCLC and normal lung. Comparison of number of proteins assigned to each GO pathway category. Normal tissue sample categories are represented as fold-change in relation to this category. Statistical significance is tested using the binomial test. Only significant categories (p < 0.05) are shown. (TIF)
Figure S3 Fragmentation spectra from PTRF SLKESEALPEK tryptic peptide. Diagram shows fragment ions corresponding to main fragmentation series (b-amino and y-carboxy). * indicates water loss; 2+, doubly charged fragment. Parental ion is marked with an arrow. (TIF)
Figure S4 Fragmentation spectra from MIF PMFIVNTNVPR tryptic peptide. Diagram shows fragment ions corresponding to main fragmentation series (b-amino and y-carboxy). * indicates water loss. Parental ion is marked with an arrow. (TIF)