Proteomic Analysis of Urine to Identify Breast Cancer Biomarker Candidates Using a Label-Free LC-MS/MS Approach

Introduction Breast cancer is a complex heterogeneous disease and is a leading cause of death in women. Early diagnosis and monitoring of breast cancer progression are important for improving prognosis. The aim of this study was to identify protein biomarkers in urine for early detection in screening and for monitoring invasive breast cancer progression. Method We performed a comparative proteomic analysis of urine from breast cancer patients (n = 20) and healthy control women (n = 20) using label-free LC-MS/MS with ion-count-based relative quantification. Results Unbiased label-free LC-MS/MS-based proteomics was used to provide a profile of abundant proteins in the biological system of breast cancer patients. Data analysis revealed 59 urinary proteins that were significantly different in breast cancer patients compared to the normal control subjects (p<0.05, fold change >3). Thirty-six urinary proteins were found exclusively in specific breast cancer stages, with 24 increasing and 12 decreasing in abundance. Amongst the 59 significant urinary proteins identified, 13 novel up-regulated proteins were revealed that may be used to detect breast cancer. These include stage-specific markers associated with pre-invasive breast cancer in the ductal carcinoma in-situ (DCIS) samples (Leucine LRC36, MAST4 and Uncharacterized protein CI131), early invasive breast cancer (DYH8, HBA, PEPA, uncharacterized protein C4orf14 (CD014), filaggrin and MMRN2) and metastatic breast cancer (AGRIN, NEGR1, FIBA and Keratin KIC10). Preliminary validation of 3 of the identified potential markers (ECM1, MAST4 and filaggrin) was performed in breast cancer cell lines by Western blotting. One potential marker, MAST4, was further validated in human breast cancer tissues by immunohistochemistry and in individual human breast cancer urine samples by Western blotting.
Conclusions Our results indicate that urine is a useful non-invasive source of biomarkers and that the profile patterns (biomarkers) identified have potential for clinical use in the detection of BC. Validation in a larger independent cohort of patients is required in a follow-up study.


Introduction
Breast cancer (BC) is a major public health problem worldwide. Despite the widespread use of mammographic screening, which has contributed to reduced mortality, BC is still the most common form of cancer among women. It can only be detected using mammography if there is a visible, detectable abnormality with architectural distortion or calcification, which correlates with the presence of several hundred thousand tumor cells. Once BC has been biopsied and the diagnosis has been confirmed pathologically, the tumor is surgically excised. The complexity and heterogeneity of individual tumors play an important role in therapeutic decision making. Pathological examination is still the gold standard for diagnosis and assessment of prognostic indicators in BC, which include tumor size, grade (degree of tumor cell differentiation), presence or absence of positive lymph nodes (metastases), and immunohistochemical expression of key proteins such as estrogen receptor (ER), progesterone receptor (PR) and HER2 [1].
Although advances in BC diagnosis have been made in the last decade, there are still many BC patients who cannot be diagnosed in the early stages of disease or monitored adequately for tumor recurrence using current techniques. To reduce morbidity and mortality from BC, novel approaches must be considered for screening, early detection and prevention, as well as for monitoring cancer progression or recurrence. The early detection of ductal carcinoma in-situ (DCIS) or invasive breast cancer (IBC) may prevent the development of life threatening metastatic disease. Additionally, monitoring metastatic progression could identify early BC recurrence and help guide therapeutic decision making.
Human urine is one of the most interesting and useful bio-fluids for clinical proteomics studies. Advances in proteomics, especially in mass spectrometry (MS) [2,3] have rapidly changed our knowledge of urine proteins which have simultaneously led to the identification and quantification of thousands of unique proteins and peptides in a complex biological fluid [4,5]. Proteomic studies of urine are highly informative, and have been successfully used to discover novel markers for cancer diagnosis and surveillance [6][7][8][9] as well as for monitoring cancer progression [10,11]. Technological development combined with the addition of urine screening would increase the knowledge about patient status and further assist assessment and treatment in clinical practice. Proteomic analysis of urine holds the potential to apply a noninvasive method to identify novel biomarkers of BC. However, investigation of urinary proteins from different stages of BC patients using a liquid chromatography tandem mass spectrometry (LC-MS/MS) proteomic approach has not been reported to date.
In this study, we used a label free LC-MS/MS technique to test the feasibility of urine as a source for BC biomarkers and identify the urinary proteins for BC diagnosis and monitoring progression. One potential marker (extracellular matrix protein 1 (ECM1) previously identified and associated with BC), and two novel potential protein markers (MAST4-microtubule associated serine/threonine kinase family member 4 and filaggrin) identified from BC urine were validated in BC cell lines and MAST4 was validated in a small number of primary BC tissues and in the individual human BC urine samples, demonstrating the link of these proteins with BC. However, a larger cohort of BC patients' samples is needed for the validation of the identified potential markers in the following studies. The proteins identified showed significant differences in abundance between the different BC disease stages which provide a useful reservoir of biomarkers for the detection of early and advanced BC.

Study design and ethics
In this pilot study, all the female BC subjects had undergone detailed diagnostic procedures, i.e. a physical breast examination, mammography, ultrasound and biopsy or excision with a detailed pathological report on the cancer. Ethics approval for the collection of human urine and tissue samples was granted by the South Eastern Sydney Area Health Service Ethics Committee (SEA HRCE) (#07/71Li). The study was designed and conducted in accordance with the ethical principles and all participants signed informed consent forms. None of the subjects had received any prior treatment, either endocrine or chemotherapy. The healthy disease-free control group (n = 20) was age matched with the BC patients (range 35-70 years, mean age 51 ± 10.5 years). Urine samples were collected prior to surgery, while BC tissues and normal breast tissues were collected after surgery (St George Private Hospital, Sydney, Australia). The collected samples were evaluated and grouped for analysis according to the histopathology report, after diagnosis. The breast carcinoma typing and grading were performed by a pathologist according to the World Health Organization criteria [1]. The samples were grouped into 3 different BC stages: DCIS (n = 6), early IBC (with or without axillary lymph node involvement, but no distant metastases, n = 8), and metastatic breast cancer (MBC) (distant metastases to viscera or bone, n = 6), along with a group of samples with benign breast disease (BBD) (n = 6). The histopathology characteristics and clinical features are summarized in Table 1.

Urine sample collection and processing
Clean-catch (no skin contamination), midstream urine samples (30-50 mL) were collected in sterile tubes and immediately transported on ice. The urine was centrifuged at 2000 x g (4000 rpm) at 4°C for 10 min to remove insoluble materials and cellular debris. The supernatants were aliquoted and frozen at -20°C and then transferred to -80°C for long term storage. All samples were handled by the same standard operating procedures and processed for storage within one hour of collection. Protein concentration and urine creatinine levels were measured in all urine samples, and abnormal samples were excluded from the study. Appropriate volumes of the urine samples were then pooled within each group to ensure the same total protein concentration for proteomic analysis. The pooled urine supernatants from each group were subjected to total protein precipitation with ice-cold (-20°C) acetone at a 1:8 sample-to-solvent ratio, mixed and stored for 1 hour at -20°C, and then subjected to high speed centrifugation (HSC) at 11,000 x g and 4°C for 30 min. The supernatants were removed and the pellets were air-dried.
To further precipitate and concentrate the proteins, the pellets were resuspended in 2 mL of fresh TCA solution (concentrated: 10 g TCA in 10 mL Milli-Q H2O) in a 4:1 sample-to-solvent ratio, vortexed, incubated at 4°C for 1 hour and then centrifuged with HSC at 4°C for 30 min. After carefully discarding the supernatants, protein pellets were washed twice with ice-cold acetone for 15 min, each wash followed by HSC at 4°C for 15 min. All pellets were air-dried according to our published method [12].
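The first centrifugation step is quoted both as a relative centrifugal force and as a rotor speed (2000 x g, 4000 rpm). The two are linked by the standard relation RCF = 1.118e-5 x r(cm) x rpm^2. The sketch below is only an illustration of that relation; the function names are hypothetical and the rotor radius is not given in the text, so it is back-calculated here rather than taken from the protocol.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given rotor speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


def implied_radius_cm(rcf: float, rpm: float) -> float:
    """Rotor radius (cm) implied by a quoted RCF / rpm pair."""
    return rcf / (RCF_CONSTANT * rpm ** 2)


# Radius implied by the quoted "2000 x g (4000 rpm)"; roughly 11.2 cm.
radius = implied_radius_cm(2000, 4000)
```

Reporting RCF rather than rpm keeps a protocol transferable between rotors, which is why the 11,000 x g step needs no rpm value to be reproducible.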

Urine sample protein clean-up and digestion
The protein fractions were enzymatically digested with trypsin. Lyophilized protein samples were reconstituted with 25 μL of 50 mM ammonium bicarbonate (AMBIC) (pH 8). Trypsin (12.5 ng/μL, proteomic grade, Sigma-Aldrich, St. Louis, MO, USA) was added to a final enzyme-to-protein ratio of 1:100 (w/w) and the mixture was incubated at 37°C overnight. The reaction was stopped by acidifying the preparation to ~pH 3 using neat formic acid (FA). Samples were dried in a vacuum centrifuge to concentrate them and were stored at -20°C. Following trypsin digestion, the peptide samples were purified using strong cation exchange (SCX) and C18 StageTips (Thermo Scientific, USA) following the manufacturer's instructions.
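The enzyme-to-protein ratio and the trypsin stock concentration together fix the volume of trypsin to add for a given protein amount. A minimal sketch of that arithmetic follows; the function name is hypothetical and the 100 μg protein amount in the example is an illustrative value, not one reported in the study.

```python
def trypsin_volume_ul(protein_ug: float,
                      protein_per_enzyme: float = 100.0,
                      stock_ng_per_ul: float = 12.5) -> float:
    """Volume of trypsin stock (uL) for a 1:protein_per_enzyme (w/w) digest.

    Defaults follow the text: 1:100 enzyme-to-protein, 12.5 ng/uL stock.
    """
    trypsin_ng = protein_ug * 1000.0 / protein_per_enzyme  # ug -> ng, then ratio
    return trypsin_ng / stock_ng_per_ul


# Illustrative only: 100 ug of protein needs 1000 ng of trypsin, i.e. 80 uL.
example_volume = trypsin_volume_ul(100.0)
```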

LC-MS/MS analysis of urine sample
Label-free LC-MS/MS quantification was performed using an Orbitrap Velos (LTQ-Orbitrap, Thermo Scientific, USA). All urine samples were run in triplicate. Peptides were reconstituted in 10 μL of 0.1% FA and separated by nano-LC using an Ultimate 3000 HPLC and autosampler (Dionex, Amsterdam, Netherlands). The samples (0.6 μL, 2 μg total load) were loaded onto a micro C18 pre-column (500 μm × 2 mm, Michrom Bio-resources, Auburn, CA, USA) with Buffer A (2% ACN and 0.01% heptafluorobutyric acid (HFBA) in water) at 10 μL/min. After a 4-min wash, the pre-column was switched (Valco 10 port valve, Dionex) into line with a fritless nano column (75 μm diameter × 12 cm) containing reverse phase C18 media (3 μm, 200Å Magic, Michrom Bio-resources). Peptides were eluted using a linear gradient of Buffer A to Buffer B (98% ACN, 0.01% HFBA in water) at 250 nL/min over 60 min. High voltage (2000 V) was applied to a low volume tee (Upchurch Scientific, Oak Harbor, WA, USA) and the column tip was positioned ~0.5 cm from the heated capillary (T = 280°C) of an Orbitrap Velos (Thermo Electron, Bremen, Germany) mass spectrometer. Positive ions were generated by electrospray and the Orbitrap was operated in data-dependent acquisition mode. A survey MS scan was acquired in the Orbitrap in the 350-1750 m/z range with the resolution set to 30 000 at m/z = 400 (with an accumulation target value of 1 000 000 ions) and with lockmass enabled. The 10 most intense ions (>5000 counts) with charge states +2 to +4 were sequentially isolated and fragmented within the linear ion trap using CID with an activation q = 0.25 and an activation time of 30 milliseconds (ms) at a target value of 30 000 ions. The m/z ratios selected for MS/MS were dynamically excluded for 30 seconds to prevent repetitive selection of the same peptide.
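The data-dependent acquisition logic described above (top 10 ions above 5000 counts, charge +2 to +4, 30 s dynamic exclusion) can be sketched as a small selection routine. This is an illustration of the selection rule only, not the instrument firmware; the names `Precursor` and `select_precursors` are hypothetical, and the m/z matching window used for exclusion (`mz_tol`) is an assumption, since the text does not state one.

```python
from dataclasses import dataclass


@dataclass
class Precursor:
    mz: float
    intensity: float  # ion counts in the survey scan
    charge: int


def select_precursors(survey, excluded_until, now_s,
                      top_n=10, min_counts=5000.0, charges=(2, 3, 4),
                      exclusion_s=30.0, mz_tol=0.01):
    """Choose precursors for MS/MS from one survey scan.

    excluded_until maps an m/z value to the time (s) at which its dynamic
    exclusion ends; it is updated in place. mz_tol is an assumed window.
    """
    # Release ions whose exclusion period has expired.
    for mz in [m for m, t in excluded_until.items() if t <= now_s]:
        del excluded_until[mz]

    def is_excluded(mz):
        return any(abs(mz - ex) <= mz_tol for ex in excluded_until)

    candidates = [p for p in survey
                  if p.intensity > min_counts
                  and p.charge in charges
                  and not is_excluded(p.mz)]
    # Most intense first, then keep the top N.
    candidates.sort(key=lambda p: p.intensity, reverse=True)
    chosen = candidates[:top_n]
    for p in chosen:
        excluded_until[p.mz] = now_s + exclusion_s
    return chosen
```

Calling the function on successive survey scans with a shared `excluded_until` dictionary shows the intended effect: an abundant peptide is fragmented once and then skipped for 30 s, leaving time for less intense co-eluting ions.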

Label-free LC-MS quantitative profiling
MS peak intensities were analyzed using Progenesis QI LC-MS data analysis software (version 4.1, Nonlinear Dynamics, Newcastle upon Tyne, UK). Ion intensity maps from each run were aligned to a reference map and ion feature matching was achieved by aligning consistent ion m/z and retention times. The peptide intensities were normalized against total intensity (sample-specific log-scale abundance ratio scaling factor) and compared between groups by one-way analysis of variance (ANOVA, p < 0.05 for statistical significance). Type I errors were controlled by False Discovery Rate (FDR) with the q value set at 0.02 [13,14].
MS/MS spectra were searched against the human UniProt protein database (downloaded January 2013) using the database search program MASCOT (Matrix Science, London, UK, www.matrixscience.com). Parent and fragment ions were searched with tolerances of ± 6 ppm and ± 0.6 Da, respectively. Searched peptide charge states were limited to +2 to +4. Deamination (M), oxidation and phosphorylation were chosen as variable modifications. Only peptides with an ion score >25 were considered for protein identification. Proteins were considered to be significantly different at p < 0.05 and fold change >3.
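To make the FDR step concrete, the sketch below converts a list of ANOVA p-values into q-values and applies the 0.02 cut-off. The text does not say which FDR procedure was used, so the Benjamini-Hochberg step-up adjustment here is an assumption, and the function name and example p-values are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical p-values for four peptide features.
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = bh_qvalues(example_p)
# Features passing both thresholds quoted in the text (p < 0.05, q at 0.02).
passing = [i for i, (p, qv) in enumerate(zip(example_p, example_q))
           if p < 0.05 and qv <= 0.02 + 1e-12]
```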
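The two search tolerances are expressed in different units: the parent ion window is relative (ppm) while the fragment window is absolute (Da). A short sketch of how each check works, with hypothetical function names and illustrative m/z values:

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def parent_within_tolerance(observed_mz, theoretical_mz, tol_ppm=6.0):
    """Relative tolerance, as used for parent ions (default 6 ppm)."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def fragment_within_tolerance(observed_mz, theoretical_mz, tol_da=0.6):
    """Absolute tolerance, as used for fragment ions (default 0.6 Da)."""
    return abs(observed_mz - theoretical_mz) <= tol_da
```

At m/z 500 a 6 ppm window is only 0.003 wide on each side, which reflects the higher mass accuracy of the Orbitrap survey scan compared with the ion trap fragment spectra.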

Generation of the heat map
The area under the curve (AUC) of all MS1 peaks generated from comparisons among different stages of BC in urine (Progenesis data) was normalised to the mean of all AUC using TIBCO Spotfire (Boston, MA, USA). Rows were clustered using UPGMA with cosine correlation as the distance measure on a logarithmic scale. Columns were clustered using Ward's method with half square Euclidean distance.
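The row clustering settings can be illustrated with two small building blocks: a cosine distance between log-scaled, mean-normalised abundance profiles, and the UPGMA rule that the distance between two clusters is the unweighted mean of all pairwise distances. This is a sketch of the general technique under stated assumptions, not a reproduction of the Spotfire implementation; the function names, the choice of log base 2 and the example values are all hypothetical.

```python
import math


def log_mean_normalise(row):
    """Divide a row of positive AUC values by its mean, then take log2."""
    mean = sum(row) / len(row)
    return [math.log2(v / mean) for v in row]


def cosine_distance(a, b):
    """1 - cosine similarity between two equal-length profiles."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def upgma_distance(cluster_a, cluster_b):
    """UPGMA (average linkage): mean pairwise distance between two clusters."""
    pairs = [(a, b) for a in cluster_a for b in cluster_b]
    return sum(cosine_distance(a, b) for a, b in pairs) / len(pairs)
```

Because cosine distance depends only on the direction of a profile, two proteins that rise and fall together across stages cluster together even when their absolute abundances differ.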

Immunohistochemistry
Standard immunoperoxidase procedures were used to visualize MAST4 expression using our published method [16]. Briefly, paraffin sections of BC tissues and normal breast tissues were deparaffinised in xylene, passed through a graded series of alcohols (100%, 95%, and 75%) and re-hydrated in water followed by Tris-buffered saline (TBS) (pH 7.5). Slides were subsequently immersed in boiling 0.1 M citrate buffer (pH 6.0) for 30 min to enhance antigen retrieval, treated with 3% hydrogen peroxide and then incubated with primary rabbit anti-MAST4 PAb (1:100 dilution) overnight at 4°C. After washing with TBS, slides were incubated with goat anti-rabbit IgG (Dako, North Sydney NSW, Australia) secondary antibody (1:100 dilution) for 45 min at room temperature. Sections were finally developed with 3,3' diaminobenzidine (DAB) substrate solution (Sigma-Aldrich, Pty Ltd, Castle Hills, NSW, Australia) as a chromogen, then counterstained with hematoxylin and blued with Scotts Bluing solution. Control slides were treated in an identical manner and stained with an isotype-matched non-specific immunoglobulin as a negative control. The MDA-MB-231 cell line was used as a positive control.

Assessment of immunostaining
Staining intensity (0-3) was assessed using light microscopy (Leica microscope, Germany) with a x40 objective as - (negative), + (weak), ++ (moderate), and +++ (strong) using our previously published method [16]. Evaluation of tissue staining was done independently by two experienced observers (JB and YL). All specimens were scored blind and an average of grades was taken. If discordant results were obtained, differences were resolved by joint review and consultation with a third observer experienced in immunohistochemical pathology.

Results and Discussion
Proteomic discovery of circulating urinary markers in human BC
Label-free LC-MS/MS quantification was used to characterize the differential expression of urinary proteins in various human BC stages. Urine samples from patients with DCIS, IBC, MBC and BBD and from normal healthy control subjects were analysed (ANOVA p<0.05; q<0.02). Using Progenesis software to compare protein expression between all the samples, we identified a total of 166 proteins at 1% FDR (protein-level FDR for identification, determined by searching a reverse database).
Using the raw urine data and TIBCO Spotfire software (2014), a biological heat map of clusters from different stages of BC patients and normal healthy control subjects was produced. The representative data are shown in Fig 1. This analysis presents the datasets as clustered patterns, which give an overview of the distribution of urine proteins according to their expression.
The data obtained with Progenesis LC-MS analysis were then used to calculate the fold change (FC) as a normalised ratio for disease compared to healthy control subjects. This statistical analysis revealed 59 significant urinary proteins in BC with >3-fold change relative to the normal healthy control subjects. These protein profiles are all recorded in Table 2 and S1-S3 Tables. A review of the literature indicates that the 59 significant proteins changing in abundance have not previously been detected in human urine in either BC or BBD. Several of the proteins identified were previously reported in blood, tissue and human BC cell lines (associated references are shown in Table 2 and S1-S3 Tables), supporting an association of the detected proteins with BC. Therefore, in this study we decided to focus on the unreported proteins changing in abundance, and their biological significance in BC. In addition, several proteins typically associated with plasma (CO3, KV101, ALBU, A1AG1, FETUA, LAC2, TTHY, A1BG, CO6A1, FIBA, CERU and HAPT), which are known to be excreted in urine [17], were also detected. These proteins are marked in Table 2 and S2 and S3 Tables. Several of these plasma-associated proteins (all except LAC2 and FIBA) were also found to be associated with BC, further substantiating our findings.
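The selection rule used here combines a p-value threshold with a fold-change threshold that applies in both directions (up- and down-regulated). A minimal sketch follows; the signed fold-change convention (negative values for decreases), the function names and the abundance values in the example are assumptions for illustration, not values from the study.

```python
def fold_change(disease: float, control: float) -> float:
    """Signed fold change of normalised abundances.

    Returns disease/control when abundance increases, and the negative
    reciprocal when it decreases (e.g. a 4-fold decrease gives -4.0).
    """
    ratio = disease / control
    return ratio if ratio >= 1.0 else -1.0 / ratio


def is_significant(p_value, disease, control, p_cut=0.05, fc_cut=3.0):
    """Apply the thresholds quoted in the text: p < 0.05 and >3-fold change."""
    return p_value < p_cut and abs(fold_change(disease, control)) > fc_cut


# Hypothetical normalised abundances: (p-value, disease, control).
examples = {
    "up_9x": (0.01, 900.0, 100.0),
    "down_4x": (0.02, 100.0, 400.0),
    "up_2x": (0.01, 200.0, 100.0),
    "up_9x_not_significant": (0.20, 900.0, 100.0),
}
selected = [name for name, (p, d, c) in examples.items() if is_significant(p, d, c)]
```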

Classification of identified urine proteins in BC
The 59 significant urinary proteins (p<0.05, >3-fold) identified (see Table 2 and S1-S3 Tables) were classified according to their subcellular locations based on the available UniProt entry information. Protein locations shown in Fig 2 demonstrate that 52% (32) of the proteins are secreted, 18% (11) cytoplasmic, 24% (15) membrane-associated, and 6% (4) grouped as others, consisting of nuclear, mitochondrial, cell organelle or unknown sub-cellular origin. The majority of significant BC-related proteins detected are secreted or membrane-associated in nature, either tumor or host in origin but associated with the presence of disease.
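A quick check of the location breakdown shows that the four counts sum to 62 rather than 59, and that the quoted percentages match a denominator of 62. One possible reading, which is an assumption and not stated in the text, is that a few proteins carry more than one location annotation. The arithmetic can be reproduced as follows (variable names are hypothetical):

```python
# Counts per location category as quoted in the text.
location_counts = {
    "secreted": 32,
    "cytoplasmic": 11,
    "membrane-associated": 15,
    "other": 4,
}

total_assignments = sum(location_counts.values())

# Percentages relative to the number of location assignments.
percent_of_assignments = {
    name: round(100 * count / total_assignments)
    for name, count in location_counts.items()
}

# The same counts relative to the 59 significant proteins, for comparison.
percent_of_proteins = {
    name: round(100 * count / 59)
    for name, count in location_counts.items()
}
```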

Urine protein distribution in BC patients
To investigate novel urine biomarkers in BC, we sought to identify proteins that either increase or decrease in abundance and are associated with BC prognosis. Our proteomics screening data provided a list of signature proteins for BC (Table 2 and S1 Table) and benign disease (S3 Table). Firstly, this signature list highlights 36 unique circulating proteins which were found to be expressed only in specific stages of BC and not across all the urine samples. These BC profiles included 24 up-regulated proteins (Table 2) and 12 down-regulated proteins (S1 Table). Additionally, 23 proteins were identified which appeared across the different BC urine samples, some of which displayed similar patterns of protein expression in DCIS and IBC (S2 Table).
Our findings in the current study indicate that several significant urinary proteins in the BC samples had a strong relationship with BC stage and are potentially promising BC markers. Our profile lists contained several markers which have already been investigated and are known to be associated with BC (see S2 Table). Therefore, the aim of this study was to highlight the novel additional proteins which are not yet reported. We found that 3 abundant proteins were associated with pre-invasive BC (i.e. DCIS patients): Leucine-rich repeat-containing protein 36 (LRC36), Microtubule-associated serine/threonine-protein kinase 4 (MAST4) and a novel uncharacterised protein C9orf131 (CI131) (Table 2). Also in the DCIS samples, Secretogranin-1 was found to be decreased (S1 Table).
Invasive BC is a cancer stage in which the tumor invades beyond the basement membrane of the lobule or duct into the breast tissue, and can then spread to lymph nodes and distant organs. Finding markers to detect the cancer before it spreads would help prevent life-threatening metastatic disease. Several of the proteins which we detected have been reported as markers of BC and include ANXA1, Vitronectin, Lactotransferrin, ITIH4 and NGAL (see Table 2). In our findings, the six unreported potential markers of early IBC are Dynein heavy chain 8 (DYH8), Haemoglobin subunit alpha (HBA) and Pepsin A (PEPA, >10 FC) (Table 2), along with uncharacterized protein C4orf14 (CD014, >200-fold), filaggrin (>30-fold) and Multimerin-2 (down >40-fold in DCIS, S2 Table), which are markedly elevated in the IBC samples. Protein DYH8 was previously detected in serum of normal healthy subjects [65]. Although HBA was reported as a potential serum biomarker in ovarian [39] and colon cancer [38], only hemoglobin subunit β was reported to be elevated in BC patients [19]. Notably, Desmoglein-1, Kallikrein-1, Keratin, type II cytoskeleton 2 epidermal (K22E) and Poliovirus receptor (PVR) were all significantly down-regulated in IBC (S1 Table). Desmoglein-1 was reported as a prognostic marker in anal carcinoma [66]. However, only Desmoglein-3 was found to be associated with BC cells [67]. Kallikrein-1 has not been assigned a specific biological function, even though Kallikrein-2 and 3 are serum and tissue markers for diagnosis and monitoring of prostate cancer and BC [68,69]. Poliovirus receptor (also called CD155/PVR), which was down-regulated, is known to have a key role in motility during cancer cell invasion and migration [70] and in cell adhesion in BC cell lines [71].
MBC is a stage of the cancer which has spread to distant sites within the body. A detection pattern for MBC could provide the opportunity for early therapeutic intervention. Our analysis provides a list of proteins, many of which are already linked to BC, including A1AG1, FETUA, APOA4, CAH1, SULF2 and SPRL1 (see Table 2). In MBC, the novel proteins detected include AGRIN and Neuronal growth regulator 1 (NEGR1) (Table 2), as well as Fibrinogen alpha chain (FIBA) and Keratin type 1 cytoskeleton 10 (KIC10) (S2 Table), which were exclusively elevated in the MBC samples. Only nerve growth factor 1 (NEGF1) was previously reported in relation to the survival and proliferation of BC cells [72,73]. Additionally, in this group of patients, Vasorin and Vitelline membrane outer layer protein 1 homolog (VMO1) were both down-regulated (S1 Table).
Analysis of the differential expression patterns of urine samples across different stages of BC highlights potential protein markers that can identify some similarities or difference between DCIS and IBC. DCIS is a non-invasive process which can progress to IBC. If a biomarker or a panel of biomarkers could be identified in DCIS stage, an early action taken may prevent IBC occurrence. Therefore, the identified potential BC markers have clinical significance.
The 23 proteins listed in S2 Table include 3 potential progression markers of BC: the immune response proteins immunoglobulin (Ig) kappa chain V-I region WEA (KV118) and lambda-2 chain C region (LAC2), and ECM1. Protein ECM1 was elevated in BC (S2 Table). ECM1 is a secreted glycoprotein, previously reported to be associated with BC metastatic bone homing [74], and plays an important role in BC progression [75]. High levels are detected in the aggressive tumorigenic cancer cell line MDA435 [76] and in ductal breast carcinomas [77], and its expression is also correlated with poor prognosis [78] and metastatic potential in cancer [79]. In the current study, this protein was detected in IBC urine samples and validated in primary and metastatic BC cell lines, further supporting its link with BC. The urine marker ECM1 therefore shows potential for BC diagnosis and monitoring.

Urine protein distribution in benign disease patients
Although some benign breast diseases are associated with increased risk of subsequent BC, a benign breast condition is generally considered a noncancerous disorder. In such cases, markers at this stage are important for early detection in high risk individuals and for understanding disease progression. In this study, we also identified a list of unreported proteins of interest in association with BBD (S2 and S3 Tables). The 6 up-regulated proteins found only in BBD include FAM184A, transcription factor E2F8 and Cadherin-1, along with the Ig response proteins kappa chain V-III region VG (KV309, >1000-FC), lambda-like polypeptide 5 (IGLL5) (S2 Table), and Ig kappa chain V-I region BAN (KV122) (S3 Table). Several down-regulated proteins were also identified in the benign breast patients (S3 Table). The most significant benign breast protein identified was Nucleobindin-1, which has a 200-fold change (S2 Table). Nucleobindin-1 is a major intracellular calcium-binding protein previously detected in colorectal cancer cells after treatment with anti-tumor compounds [80].

Ingenuity analysis of interaction networks of human urine proteins in BC patients
To identify the major biological pathways represented in BC urine, Ingenuity Pathway Analysis (IPA) software (Ingenuity Systems, Redwood City, CA, USA, www.ingenuity.com; release date 05-02-2013) was used for canonical pathway enrichment analysis and the derivation of mechanistic networks. The lists of proteins were uploaded directly into IPA for analysis, and functional pathways or networks with the highest confidence scores were then determined. Cell growth and proliferation analyses are shown in Fig 3. The proteins detected from Table 2 and S1 and S2 Tables were found to be associated with tumor growth and progression (Fig 3A), suggesting that these proteins are involved in the inhibition or proliferation of various cells in BC patients (Fig 3B). The enriched pathways associated with BC urine are shown in S1 Fig, while the best-scoring networks are shown in Fig 4. Highly interconnected networks are likely to represent significant biological functions associated with BC progression.
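Pathway over-representation of this kind is commonly scored with a right-tailed Fisher's exact test, which asks how likely it is to see at least the observed number of pathway members in a protein list by chance. Whether this is the exact statistic behind the scores reported here is an assumption, and the function name and the numbers in the example are hypothetical. A minimal sketch using the hypergeometric distribution:

```python
from math import comb


def enrichment_p_value(hits: int, list_size: int,
                       pathway_size: int, background_size: int) -> float:
    """Right-tailed Fisher's exact (hypergeometric) p-value.

    Probability of drawing at least `hits` pathway members when sampling
    `list_size` proteins from a background of `background_size`, of which
    `pathway_size` belong to the pathway.
    """
    upper = min(list_size, pathway_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 8 of 59 listed proteins fall in a 120-protein pathway,
# against an assumed background of 20000 proteins.
example_p = enrichment_p_value(8, 59, 120, 20000)
```

A small p-value indicates that the overlap between the protein list and the pathway is larger than expected from list and pathway sizes alone.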
Proteomics results from this study demonstrate that the proteins detected in BC urine are involved in the LXR/RXR activation and acute-phase response pathways, which are active during inflammation and/or as a contribution of the immune response to cancer. Other pathways also enriched include production of nitric oxide and reactive oxygen species (ROS) in macrophages and IL12 signaling and production in macrophages (Fig 4). IPA highlighted cholesterol metabolism as significant in our BC samples. Cholesterol is an essential structural component of the cell membrane and proliferating cells. Cancer cells are believed to have increased requirements for cholesterol. Cancer cells can increase lipid biosynthesis and uptake cholesterol from the bloodstream [81]. It seems that LDL-cholesterol enriched systemic environment promotes BC progression by activating key signaling pathways and modulating cell behaviour. LDL-cholesterol signaling was shown to induce BC proliferation and invasion [82].
These canonical pathway results show that multiple pathways and networks are involved in the systemic response to BC and that intrinsic and endocytosis signaling pathways play a role, along with communication between the innate and the adaptive immune system. Increasing evidence indicates that the immune response plays an important role in BC disease [83,84]. Therefore, these immune response proteins in urine have potential to be used as BC biomarkers for diagnosis and monitoring. The interaction between the pathways identified in patients' urine and BC disease is complex and biologically significant. Further work is needed to study the roles and functions of these signaling pathways in BC.

Validation of the identified potential urine markers in BC cell lines
To examine the association of the identified potential urine proteins with human BC, one existing marker (ECM1) and two selected novel protein markers (MAST4 and filaggrin) were evaluated in a human primary BC cell line (BT474) and metastatic BC cell lines (MDA-MB-231, MCF-7 and SK-BR-3) by Western blotting. As shown in Fig 5A, ECM1 and MAST4 were positive in all 4 BC cell lines and filaggrin was positive in the 3 metastatic BC cell lines, suggesting that the potential urine markers identified in BC patients are closely associated with human BC.

Preliminary validation of identified potential marker MAST4 in human primary BC tissues
To further investigate the clinical significance of our findings, we conducted immunohistochemistry (IHC) for preliminary validation of one novel protein marker, MAST4, using a small number of representative human primary BC tissue samples from DCIS and IBC patients, together with normal breast tissues. Our results indicate that MAST4 was positive in 80% (4/5) of IBC and in 60% (3/5) of DCIS tissues, and no positive staining was seen in any of the normal breast tissues (5/5 negative). The typical staining results are shown in Fig 5B. These findings further strengthen the link of the identified potential urine marker with BC disease. Due to the limited number of patient tissue samples, a larger sample size is required in a follow-up study.

Preliminary validation of potential marker MAST4 in individual human BC urine samples
In order to confirm MAST4 overexpression in the individual DCIS patients, the remaining BC urine samples available were re-examined using Western blotting. Our results indicate that high levels of MAST4 expression were found in the individual DCIS urine samples and low levels in IBC and MBC urine samples, while no expression was seen in the samples from BBD patients and normal healthy control subjects (see Fig 6), further supporting the strong link between MAST4 protein and the DCIS urine samples identified by LC-MS/MS.
MAST4 is a protein-coding gene. The roles and functions of this protein in cancers have not been reported so far. Differential expression of MAST4 was observed between the four different cell lines. A high level of MAST4 expression was found in the BT474 cell line, which is an aggressive luminal B subtype, suggesting that this protein could be a therapeutic target of interest for future studies of endocrine resistance. Although proteomics screening detected MAST4 as a significantly increased marker in DCIS, our results from BC cell lines and human BC tissues suggest that MAST4 may also be involved in BC progression, which is in line with the positive expression in the individual urine samples from IBC and MBC patients. A preliminary analysis of publicly available mRNA expression profiling data (Kaplan-Meier Plotter, an online survival analysis tool [85], http://kmplot.com/analysis/index.php?p=service&cancer=breast) demonstrated that MAST4 expression is highly significantly correlated with survival in a cohort of over 3,000 BC patients. Therefore, we plan to further investigate MAST4 expression in a large independent cohort of BC patients and study its role in BC metastasis.

Conclusions
Identification of new biomarkers in early and advanced BC is important in the prevention and monitoring of disease progression and is a recently developed research area. MS is a very powerful technique for comprehensive analysis of proteins. Collection of urine from patients is relatively easy and non-invasive, therefore making it an ideal candidate to sample for clinical management of patients and to search for biomarkers.
Within BC publications, most studies have used MS technologies to analyse urine metabolomic biomarkers. In this study, LC-MS/MS analysis was performed on urine samples from BC patients, benign patients and healthy control subjects. The results of this comparative discovery study provide a panel of novel, significantly altered urinary proteins that are abundant in pre-invasive and invasive BC but have not previously been detected in urine or other biological samples. The BC proteins identified provide further insight into the complex signaling pathway interactions occurring during the progression of BC. Our study indicates that the majority of the abundant BC urinary proteins detected are secreted proteins. The list of novel up-regulated proteins detected (Table 2) provides information which could be used to create a panel of targets that could form part of a urine screening "dipstick" test for the detection of non-invasive and invasive BC.
Breast cancer cell lines are preclinical models that represent different breast tumor subtypes. To link the potential urine markers identified with BC disease, we validated one existing marker and two novel biomarkers in human BC cell lines by Western blot analysis. We demonstrated significantly elevated expression of three interesting markers (ECM1, MAST4 and filaggrin) in a panel of human BC cell lines, and of one marker, MAST4, in a small group of clinical BC tissue (Fig 5) and urine (Fig 6) samples, indicating their promise for further investigation.
Urinary proteins can potentially provide a preliminary indication of the presence of BC during screening and could assist with direct examination and pathology testing for final diagnosis. The development of a non-invasive test of BC risk has been a major goal for more than 20 years. In the current study, we present potential urinary protein biomarkers that are related to BC stage and that can be used for early diagnosis and for monitoring cancer progression. These novel protein markers in urine need to be further evaluated in BC tissues and in an independent group of BC urine samples to test their specificity and sensitivity for early BC diagnosis, which may lead to potential applications in cancer surveillance and prevention.

Table. A list of differentially expressed urinary proteins in breast cancer and benign breast disease. (DOCX)
S3 Table. A list of urine proteins up- and down-regulated in benign breast disease. (DOCX)