Discovery of Prognostic Biomarker Candidates of Lacunar Infarction by Quantitative Proteomics of Microvesicles Enriched Plasma

Background Lacunar infarction (LACI) is a subtype of acute ischemic stroke that accounts for around 25% of all ischemic stroke cases. Despite an excellent recovery during the acute phase, certain LACI patients have a poor mid- to long-term prognosis due to the recurrence of vascular events or a decline in cognitive functions. Hence, blood-based biomarkers could be complementary prognostic and research tools. Methods and Findings Plasma was collected from forty-five patients following a non-disabling LACI along with seventeen matched control subjects. The LACI patients were monitored prospectively for up to five years for the occurrence of adverse outcomes and grouped accordingly (i.e., LACI-no adverse outcome, LACI-recurrent vascular event, and LACI-cognitive decline without any recurrence of vascular events). Microvesicle-enriched fractions isolated from the pooled plasma of the four groups were profiled by an iTRAQ-guided discovery approach to quantify the differential proteome. The data have been deposited to the ProteomeXchange with identifier PXD000748. Bioinformatics analysis and data mining revealed up-regulation of brain-specific proteins including myelin basic protein, proteins of the coagulation cascade (e.g., fibrinogen alpha chain, fibrinogen beta chain) and focal adhesion (e.g., integrin alpha-IIb, talin-1, and filamin-A), while albumin was down-regulated in both groups of patients with adverse outcome. Conclusion This data set may offer important insight into the mechanisms of poor prognosis and provide candidate prognostic biomarkers for validation on a larger cohort of individual LACI patients.


Introduction
Lacunar infarction (LACI) is a subtype of ischemic stroke that accounts for approximately a quarter of all ischemic stroke cases with a higher prevalence in south Asian population [1,2]. Current stroke guidelines do not differentiate between lacunar and nonlacunar strokes (e.g. large vessel stroke or cardioembolic) with respect to treatment or risk factor modification [1,3]. Similarly, many of the major secondary stroke prevention trials have not distinguished between different types of ischemic stroke, which may be important in determining the differential protective influence of various therapeutic approaches (e.g., antiplatelet drugs or thrombolysis) [1,4]. However, mounting evidence suggests differences in LACI pathology in comparison with nonlacunar strokes [1]. Nevertheless, LACI remains a poorly understood area in terms of its etiology, pathophysiology, and more importantly prognosis [2,5].
Unlike non-lacunar subtypes of ischemic stroke, the short-term prognosis of ischemic small-vessel disease (SVD), including LACI, is more favorable, with an almost negligible early mortality, an absence of neuropsychological impairment and an excellent neurological recovery. However, LACI causes an increase in the mid- or long-term risk of recurrent vascular events and cognitive impairment or neuropsychological abnormalities. It has been shown recently that the proportion of dementia caused by SVD ranges from 36 to 67% [6]. Therefore, identifying the patient cohorts that are at mid- or long-term risk for recurrent vascular events or secondary complications such as vascular cognitive impairment may allow for improved treatment and prevention paradigms.
Blood-based biomarkers can serve as an alternative tool to complement and improve the prognostic ability of clinical features and neuroimaging. Biomarkers for the prognosis of ischemic stroke are a relatively new concept compared to biomarkers for diagnosis. No single blood-based biomarker or panel of biomarkers has been validated by clinical trials for stroke or related secondary complications. Blood, CSF [7] or brain extracellular fluid [8] have been used as starting materials for biomarker discovery in stroke. Although several studies have been performed to validate protein biomarkers from blood [9,10,11], only a few of them were directed specifically to SVD [12,13,14,15]. In addition, most have tried to validate one or a few candidates and, although suggested, a proteomics result-guided discovery approach has never been utilized to discover a panel of potential stroke biomarkers [9]. This unbiased systematic approach could be complementary to the traditional hypothesis-driven approach of targeted selection and validation of a single or a few proteins.
Plasma microvesicles are a good source of disease biomarkers that enter the circulatory system following their release by cells from various tissues. It has been found that central nervous system (CNS)-specific cell types secrete microvesicles to mediate cell-to-cell communication under physiological and pathological conditions [16,17,18,19]. Here, we hypothesize that the brain cells of LACI patients with poor prognosis, under the influence of ischemic stress, may release microvesicles into circulation through the compromised blood-brain barrier (BBB) during its evolution. Detecting these plasma microvesicles with good sensitivity by downstream proteomics profiling could provide potential biomarkers for LACI prognosis. Isobaric labeling-based quantitative proteomics is a popular profiling approach that has found wide application in various areas of science and medicine [20,21]. Recently, we have successfully combined an iTRAQ-2D-LC-MS/MS-based proteomics strategy as a relative quantitation tool along with various types of biological samples (such as neuroblastoma cell-line, rodent and human brain tissue) to obtain important pathological insights in the area of ischemic stroke [22,23,24] and vascular dementia [25]. Here, we apply a similar methodology for comparative profiling of plasma microvesicles from three groups of LACI patients and a group of demographically matched controls to discover potential prognostic biomarkers of LACI. Plasma samples of forty-five LACI patients from the European Australasian Stroke Prevention in Reversible Ischemia Trial (ESPRIT) were used for this study [26]. The patients were monitored for up to 5 years after index stroke for adverse outcomes (i.e. recurrent vascular events or decline in cognitive functions). A microvesicle-enriched fraction was obtained by differential centrifugation and ultracentrifugation from the pooled plasma of each group for the iTRAQ experiment.
Analysis of the significantly regulated proteins from the iTRAQ data set revealed that up-regulation of brain-specific myelin basic protein (MBP), along with proteins related to integrin signaling [e.g. Integrin alpha-IIb (ITGA2B), Talin-1 (TLN1), and Filamin-A (FLNA)] and the coagulation cascade [Fibrinogen alpha chain (FGA), Fibrinogen beta chain (FGB)], is associated with an unfavorable outcome. Given that blood collection is a simple and cheap procedure, these candidates, once validated in a larger cohort of LACI patients, may have additive value over the existing imaging, clinical or neurobehavioral modalities used in the clinic.

Reagents
Unless indicated, all reagents were purchased from Sigma-Aldrich (St. Louis, MO, USA).

Ethics Statement
The study protocol was approved by Singapore General Hospital's and Nanyang Technological University's Institutional Review Board and Ethics Committee. Written informed consent was obtained from all patients or legal guardians. The European Australasian Stroke Prevention in Reversible Ischemia Trial (ESPRIT) was registered under http://clinicaltrials.gov with the identifier NCT00161070.

Sample Collection
The plasma samples were obtained from patients with a non-disabling ischemic stroke who were recruited at the Singapore General Hospital between 1999 and 2005 for the cognitive substudy of the ESPRIT (ESPRIT-cog). Detailed methodology of ESPRIT and ESPRIT-cog including the exclusion criteria have been reported previously [26,27]. Briefly, for ESPRIT, patients were eligible if they were within 6 months of a transient ischemic attack (including transient monocular blindness) or a non-disabling ischemic stroke (grade ≤3 on the modified Rankin scale [mRS]) of presumed arterial origin [28]. All patients were randomized to either aspirin (100 mg/day) or aspirin combined with dipyridamole (75-450 mg/day). The control plasma was collected from non-stroke subjects at the same site during 2004-2006. EDTA was used as the anti-coagulant during the processing of blood samples. The exclusion criteria were: a possible cardiac source of embolism, high-grade carotid stenosis for which carotid endarterectomy or endovascular treatment was planned, moderate to severe leukoaraiosis on brain imaging (for randomization into anticoagulation), any blood coagulation disorder, any contraindication for aspirin or dipyridamole, and a limited life expectancy [27].

Neuropsychological Test Battery - Determination of Cognitive Impairment
The cognitive status of the patients was determined by trained research psychologists using a standard neuropsychological test battery that has been validated for use in Singapore. Details of the procedure have been described previously [27,29]. Briefly, the battery assessed 6 domains: 2 memory domains (i.e. Verbal Memory and Visual Memory) and 4 non-memory domains (Attention, Language, Visuomotor Speed and Visuoconstruction). Failure in at least half of the tests in a domain constituted failure in that domain. Diagnoses of dementia were made according to the DSM-IV criteria [30]. Any patients who were demented at the baseline were excluded from this study. The patients who did not meet the criteria for dementia included individuals with diagnoses of cognitive impairment no dementia (CIND)-mild (impairment of 1-2 domains), CIND-moderate (impairment of 3-6 domains) and no cognitive impairment (NCI). Diagnoses of dementia and CIND were made after each patient's baseline and follow-up visits.
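The domain-counting rule described above can be illustrated with a minimal Python sketch. It covers only the non-demented categories (NCI, CIND-mild, CIND-moderate); the dementia diagnosis itself followed DSM-IV criteria and is not modeled here. The function names, the number of tests per domain and the example outcomes are hypothetical and chosen for illustration only.

```python
from typing import Dict, List


def domain_failed(test_failures: List[bool]) -> bool:
    """A domain counts as failed when at least half of its tests are failed."""
    if not test_failures:
        return False  # assumption: a domain with no recorded tests is not failed
    return 2 * sum(test_failures) >= len(test_failures)


def classify_non_demented(domains: Dict[str, List[bool]]) -> str:
    """Map the number of impaired domains to NCI, CIND-mild or CIND-moderate."""
    impaired = sum(domain_failed(tests) for tests in domains.values())
    if impaired == 0:
        return "NCI"
    if impaired <= 2:
        return "CIND-mild"
    return "CIND-moderate"  # 3 to 6 impaired domains


# Hypothetical patient: True marks a failed test within a domain.
example_patient = {
    "Verbal Memory": [True, True, False],      # 2 of 3 failed: domain failed
    "Visual Memory": [False, False],
    "Attention": [True, False],                # 1 of 2 failed: domain failed
    "Language": [False, False, False],
    "Visuomotor Speed": [False],
    "Visuoconstruction": [False, True, False]  # 1 of 3 failed: domain passed
}
print(classify_non_demented(example_patient))
```

With two impaired domains, the hypothetical patient above falls into the CIND-mild category.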

Baseline Risk Factors
Risk factor information was collected at baseline. Stroke subtype was classified according to the Oxfordshire Community Stroke Project as total anterior circulation infarct, partial anterior circulation infarct, posterior circulation infarct, or LACI [31]. Vascular risk factor data, such as age, diabetes mellitus status, hypertension, hyperlipidemia, smoking status, ischemic heart disease, peripheral artery disease, as well as past history of stroke, angina and myocardial infarction were obtained verbally from the patient and confirmed with hospital records.

Experimental Design Guided by Outcome Measures
The experimental design is depicted in Figure 1. The LACI patients were followed up annually for up to 5 years (median follow-up 3 years; interquartile range, 2 years) to monitor for the occurrence of any vascular event or for change in the cognitive status. Strokes, peripheral artery disease, intracranial bleeds, and any cardiac ischemia (stable and unstable angina, myocardial infarctions) or deaths from any of the above were considered to be a recurrent vascular event. Any LACI patient having a recurrence of vascular event during the follow-up period was included in the group called "recurrent vascular event" [27,29]. The patients whose cognitive status declined from the respective baseline status during the course of the prospective study were assigned to the "cognitive decline" group. Patients who did not suffer a recurrent vascular event or cognitive decline during this period were grouped as "LACI, no adverse outcome". Accordingly, plasma samples of 45 LACI patients were divided into three groups based on the outcome variables (LACI-no adverse outcome, n = 19; LACI-recurrent vascular events, n = 11; LACI-cognitive decline but no recurrent vascular events, n = 15). The age-matched control group had 17 subjects who never had a stroke or cancer and were cognitively normal at the baseline. The plasma samples were pooled group-wise before processing. A microvesicle-enriched fraction was isolated by sequential centrifugation combined with ultracentrifugation and labeled with isobaric tags, followed by 2D-LC-MS/MS analysis to improve the depth of identification and quantification. The iTRAQ samples were injected thrice in the LC-MS/MS analysis (technical replicate = 3).
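The outcome-based grouping can be sketched in Python as follows. Because the cognitive decline group is defined as having no recurrent vascular events, a recurrent vascular event takes precedence when both outcomes occur. The `FollowUp` record and the function name are hypothetical; the group sizes are those reported above.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    """Hypothetical per-patient follow-up summary (up to 5 years)."""
    recurrent_vascular_event: bool
    cognitive_decline: bool


def outcome_group(patient: FollowUp) -> str:
    """Assign a LACI patient to one of the three outcome groups."""
    if patient.recurrent_vascular_event:
        # Takes precedence: the cognitive decline group excludes these patients.
        return "LACI-recurrent vascular event"
    if patient.cognitive_decline:
        return "LACI-cognitive decline"
    return "LACI-no adverse outcome"


# Group sizes reported in the text; they should add up to the 45 LACI patients.
reported_sizes = {
    "LACI-no adverse outcome": 19,
    "LACI-recurrent vascular event": 11,
    "LACI-cognitive decline": 15,
}
total_laci = sum(reported_sizes.values())
print(total_laci)
```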

Sample Preparation
Separation of microvesicle-enriched fraction by sequential centrifugation. Frozen individual plasma samples were thawed on ice and pooled in a group-wise manner to obtain four tubes containing around 5 ml of plasma specimens from each group. The samples were subjected to sequential centrifugation to enrich the microvesicles using a modified protocol as described previously [32,33]. Briefly, sonicated plasma (5 × 1 min) was centrifuged at 4,000 g twice for 30 min and then at 12,000 g for 30 min to collect and remove the pellets. The resulting supernatant was subsequently diluted approx. five times with ice-cold 1X PBS before ultracentrifugation at 30,000 g for 2 h to collect the pellet of plasma membrane-derived vesicles or microparticles for a separate study. The supernatant was ultracentrifuged again at 200,000 g for 2 h 15 min to collect the microvesicle pellet (Figure 2). The microvesicle pellets were washed at least twice with 1X PBS and were lyophilized. The lyophilisate was dissolved using 50-100 µl of ice-cold dissolution buffer [6% sodium dodecyl sulfate, 20 mM dithiothreitol, 100 mM Tris-HCl with Complete Protease Inhibitor Cocktail (Roche; Mannheim, Germany), pH 7.75] by brief vortexing. Protein quantification was performed using the 2-D Quant kit (Amersham Biosciences, Piscataway, NJ, USA).

Proteomics
In-Gel tryptic digestion and isobaric labeling. The samples (500 µg/condition) were subjected to denaturing PAGE using a 4%-6%-25% gel following an identical procedure as described previously [23,24]. Briefly, the diced gel bands were extensively washed with 25 mM TEAB in 50% ACN to completely remove Tris-HCl and detergent before reduction and

Electrostatic Repulsion and Hydrophilic Interaction Chromatography (ERLIC)
The combined iTRAQ sample was desalted by Sep-Pak C18 SPE cartridges (Waters, Milford, MA, USA). A modified ERLIC with volatile salt-containing buffers was adopted [34]. The dried iTRAQ-labeled peptides were reconstituted in 200 µl of Buffer A (10 mM NH4HCO2, 85% ACN, 0.1% formic acid (FA)) and fractionated using a PolyWAX LP column (200 × 4.6 mm; 5 µm; 300 Å) (PolyLC, Columbia, MD, USA) on a Prominence HPLC system (Shimadzu, Kyoto, Japan) in a 65 min gradient with Buffer B (30% ACN, 0.1% FA). The HPLC gradient was composed of 100% buffer A for 10 min; 0-25% buffer B for 35 min; then 25-100% buffer B for 10 min; followed by 100% buffer B for 10 min. The chromatogram was recorded at 280 nm. Eluted fractions were collected every 1 min, and then pooled into 34 fractions depending on the peak intensities, before drying them in a vacuum centrifuge. They were stored at −20 °C till MS analysis.
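As an illustration of the gradient program, the following Python sketch returns the percentage of buffer B at a given time point. The segment durations and end points are taken from the description above; the assumption that each ramp is linear, and the names `SEGMENTS` and `percent_b`, are ours.

```python
# Each segment: (duration in min, %B at segment start, %B at segment end).
SEGMENTS = [
    (10.0, 0.0, 0.0),      # 100% buffer A
    (35.0, 0.0, 25.0),     # 0-25% buffer B
    (10.0, 25.0, 100.0),   # 25-100% buffer B
    (10.0, 100.0, 100.0),  # 100% buffer B
]


def percent_b(t_min: float) -> float:
    """Percentage of buffer B at time t_min, assuming linear ramps."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    start = 0.0
    for duration, b_start, b_end in SEGMENTS:
        end = start + duration
        if t_min <= end:
            return b_start + (b_end - b_start) * (t_min - start) / duration
        start = end
    raise ValueError("time is beyond the end of the gradient")


# The segment durations add up to the stated 65 min run time.
total_run_time = sum(duration for duration, _, _ in SEGMENTS)
print(total_run_time, percent_b(27.5), percent_b(50.0))
```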

Reverse Phase LC-MS/MS Analysis using QSTAR
The iTRAQ-labeled peptides were reconstituted with 0.1% FA, 3% ACN and analyzed using an HPLC system (Shimadzu) coupled with QSTAR Elite Hybrid MS (Applied Biosystems/MDS-SCIEX) as described previously with minor modifications. Briefly, most of the LC parameters for a 90 min gradient including column configuration, gradient and flow rate were kept constant except the mobile phase A composition (0.1% FA in 3% ACN) and sample injection volume (15 µl/injection). Regarding MS parameters, the precursors with a mass range of 300-1600 m/z and calculated charge of +2 to +5 were selected for fragmentation. The selected precursor ion was dynamically excluded for 20 s with a 50 mDa mass tolerance. The maximum accumulation time was set at 1.0 s. All other MS parameters were kept identical as reported previously [23].

Mass Spectrometric Raw Data Analysis
The Analyst QS 2.0 software (Applied Biosystems) was used for the spectral data acquisition. ProteinPilot Software 3.0, Revision Number: 114 732 (Applied Biosystems) was used for the peak list generation, protein identification and quantification against the concatenated target-decoy Uniprot human database (191242 sequences). The false discovery rate (FDR) of peptide identification was set to be less than 1% (FDR = 2.0 × decoy_hits/total_hits). Details of the analysis strategy have been described previously [23].
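The FDR estimate for a concatenated target-decoy search, FDR = 2.0 × decoy_hits/total_hits, can be written as a short Python function. The factor of 2 reflects the usual assumption that each decoy hit stands for one equally likely false hit among the targets. The function name and the example counts are hypothetical and are not taken from this data set.

```python
def target_decoy_fdr(decoy_hits: int, total_hits: int) -> float:
    """FDR for a concatenated target-decoy search: 2 * decoy hits / total hits."""
    if total_hits <= 0:
        raise ValueError("total_hits must be positive")
    if not 0 <= decoy_hits <= total_hits:
        raise ValueError("decoy_hits must lie between 0 and total_hits")
    return 2.0 * decoy_hits / total_hits


# Hypothetical counts for illustration: 5 decoy hits among 1000 accepted hits.
example_fdr = target_decoy_fdr(decoy_hits=5, total_hits=1000)
print(f"{example_fdr:.1%}")
```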

Bioinformatics Analysis
The bioinformatics analysis was performed using different attributes of DAVID, such as gene ontology (GO), pathway, protein interaction, tissue specificity, keywords or protein domains, to extract hidden trends and the enrichment of certain groups of proteins. DAVID uses modular enrichment analysis where the term-term/gene-gene relationships are considered for enrichment p-value calculation. It calculates the probability of the number of genes in the list that hit a given biology class as compared to pure random chance with the aid of Fisher's exact test [35]. Open-source GenePattern software (version 3.3.3) was used for clustering the final list of regulated proteins by a hierarchical clustering algorithm [36].
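The enrichment calculation can be illustrated with a plain one-sided Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. This is a minimal sketch of the general technique and not DAVID's own implementation, which may apply additional adjustments; the function name and all counts in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(list_hits: int, list_size: int,
                       background_hits: int, background_size: int) -> float:
    """One-sided Fisher's exact test: P(at least list_hits annotated proteins
    in a list of list_size drawn at random from the background)."""
    if not 0 <= list_size <= background_size:
        raise ValueError("list_size must lie between 0 and background_size")
    if not 0 <= background_hits <= background_size:
        raise ValueError("background_hits must lie between 0 and background_size")
    denominator = comb(background_size, list_size)
    upper = min(list_size, background_hits)
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, list_size - k)
        for k in range(list_hits, upper + 1)
    )
    return tail / denominator


# Hypothetical example: 8 of 43 shortlisted proteins carry a term that
# annotates 300 of 20000 background proteins.
p_example = enrichment_p_value(8, 43, 300, 20000)
print(f"{p_example:.3g}")
```

A small p-value indicates that the term occurs in the list more often than expected by chance alone.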

Statistical Analyses
All statistical analyses were performed using SPSS 13.0 for Windows software (SPSS Inc.). One-way ANOVA followed by post hoc Tukey test was used for scale variables such as age. The nonparametric Kruskal-Wallis H test was used for comparing ordinal variables such as demographic characteristics and baseline risk factors. Statistical significance was accepted at p < 0.05.

Patient Characteristics
The demographic characteristics, baseline risk factors and cognitive classifications of the study population, stratified by group, are presented in Table 1. The average age of the recruited subjects was 61 ± 10 years; 55% were males and 92% were Chinese. Notably, no significant difference was observed between the three groups of LACI patients in terms of most of the baseline risk factors except 'smoking' (H(2) = 7.276, p = 0.026). This reiterates the importance of having a complementary prognostic tool, as traditionally used risk factors (e.g. hypertension, diabetes mellitus) fail to predict adverse outcome in LACI patients.

Proteomics
Quality control and filtering of iTRAQ data set. The proteins and peptides that were identified and quantified by the iTRAQ experiment were exported from ProteinPilot and listed in Table S1 (Protein Summary) and Table S2 (Peptide Summary). There were 183 proteins with an FDR of 1.1% when a strict cutoff of unused prot-score >3 (>99.9% confidence) was used as the qualification criterion to minimize the false-positive identification of proteins for subsequent data mining. Around 97% of the identified proteins had ≥2 unique peptides having a confidence of >95%. 288, 377 and 458 proteins were identified with unused score ≥2 (>99% confidence), >1.3 (>95% confidence) and >1.0 (>90% confidence) respectively. Here, our result is either comparable [37], or even better [33,38,39] than published reports on plasma proteome profiling. Notably, these studies had started with human plasma and used various approaches, such as depletion of high-abundant plasma proteins [37,38,39] or microvesicle enrichment [33] upstream of the proteomics experiment in order to improve the depth of identification.
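The score cut-offs quoted above pair with their confidence levels in a way that matches the relation ProtScore = −log10(1 − confidence), which is our understanding of the ProteinPilot convention. The Python sketch below converts between the two under that assumption; the function names are hypothetical.

```python
import math


def protscore_from_confidence(confidence: float) -> float:
    """ProtScore for a confidence given as a fraction (e.g. 0.99),
    assuming ProtScore = -log10(1 - confidence)."""
    if not 0.0 <= confidence < 1.0:
        raise ValueError("confidence must be in the range [0, 1)")
    return -math.log10(1.0 - confidence)


def confidence_from_protscore(score: float) -> float:
    """Inverse relation: confidence = 1 - 10 ** (-score)."""
    if score < 0:
        raise ValueError("score must be non-negative")
    return 1.0 - 10.0 ** (-score)


# Confidence levels implied by the cut-offs used in the text.
for cutoff in (1.0, 1.3, 2.0, 3.0):
    print(cutoff, f"{confidence_from_protscore(cutoff):.2%}")
```

Under this relation, a score of 1.3 corresponds to a confidence just below 95%, which is why the 95% level is conventionally quoted with a cut-off of 1.3.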
Next, a cut-off of p-value < 0.05 was used for filtering the proteins with significant ratios from each condition. There were 17, 33 and 28 proteins for the three ratios (i.e. 115/114, 116/114 and 117/114) respectively with an acceptable p-value after excluding the keratins from the list. Of note, this p-value is not related to the biological variation as a pooling strategy was adopted during the proteomics sample preparation. The groups with adverse outcome (either recurrent vascular event or cognitive decline) following LACI had a higher percentage of perturbed proteins in the plasma microvesicles (33/183 and 28/183) in comparison with the LACI patients with a good recovery profile (17/183). Overall, 43 proteins having at least one ratio with an acceptable level of confidence were shortlisted for the bioinformatics analysis to retrieve useful biological trends (Table 2).
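For reference, the proportions quoted above can be expressed as percentages of the 183 confidently identified proteins. The short Python sketch below does this arithmetic; the dictionary is keyed by reporter-ion ratio as listed in the text, and the variable names are ours.

```python
TOTAL_PROTEINS = 183

# Number of proteins with a significant ratio (p < 0.05), keratins excluded.
significant_counts = {"115/114": 17, "116/114": 33, "117/114": 28}

percent_perturbed = {
    ratio: round(100.0 * count / TOTAL_PROTEINS, 1)
    for ratio, count in significant_counts.items()
}
print(percent_perturbed)
```

This gives roughly 9.3% for the ratio with 17 significant proteins, against 18.0% and 15.3% for the two ratios corresponding to the adverse outcome groups.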
Bioinformatics analysis of perturbed proteome of microvesicle-enriched plasma. Uniprot accession numbers of the shortlisted 43 proteins were uploaded in DAVID to compare them with the human proteome, which was used as the background. To check the enrichment, p-value ≤ 0.01 and FDR < 1% were used as cut-offs. The GO analysis in the 'biological process' category shortlisted 'response to wounding', 'acute inflammatory response' and 'lipid transport' as some of the perturbed processes whereas 'enzyme (peptidase and endopeptidase) inhibitor activity' was the key 'molecular function' that was enriched in the perturbed proteome. A complementary trend was observed for significantly enriched 'cellular component' as 'extracellular region or space' and 'platelet alpha granule' were shortlisted. Searching for enriched pathways using various modules (e.g. KEGG, Biocarta, Reactome) showed 'complement and coagulation cascades', 'intrinsic prothrombin activation pathway' or 'integrin cell surface interactions' as significantly over-represented (Table 2).
The hierarchical clustering analysis classified the proteins into two major clusters (Figure 3A, I and II) separating up- and down-regulated proteins in adverse outcome groups. A few trends were apparent. First, the pattern of regulation of the significantly perturbed proteins in both groups with adverse outcome (recurrent vascular events and cognitive decline) was similar in most cases (except Fibrinogen gamma chain and Complement component 4 binding protein, alpha) amid differences in magnitudes only. However, the extent of deregulation is generally more for the 'recurrent vascular event' group compared to the 'cognitive decline' group (Table 2, Figure 4). This could indicate the involvement of vascular abnormality in both groups which may remain at a subclinical stage in the patients with cognitive decline. Proteins related to 'enzyme inhibitor activity' (e.g. Complement C5, Complement C3, Vitamin K-dependent protein S and Inter-alpha (Globulin) inhibitor H4 (ITIH4)) were generally up-regulated in the LACI group with better outcome and down-regulated in LACI groups with adverse outcome. A recent study reported the reduction of serum ITIH4 during the first 2-4 days after stroke onset compared to control serum that returned to baseline levels subsequently with the improvement of patients' condition [40]. Interestingly, proteins related to 'integrin cell surface interactions' (e.g. ITGA2B, TLN1, FGB and FGA) were down-regulated in LACI patients with no adverse outcome while up-regulated in both groups with adverse outcome. In contrast, the lipoproteins did not show a differential regulation between groups. Most of the lipoproteins (Apolipoprotein E, A-I, A-II and L1) were down-regulated except Lipoprotein, Lp(A) which was significantly up-regulated across the LACI groups in comparison with the control. Overall, ALB and MBP were the two most deregulated candidates. The abundance of ALB was validated using pooled samples with WB analysis to check the technical reliability of the iTRAQ result. The WB result showed consistent trends with the iTRAQ result (Figure 3B).
Figure 3. A) Hierarchical clustering of the filtered list of proteins from microvesicle-enriched plasma. Log2-transformed ratios (e.g. ln (115/114)) of each protein (row) were presented for all conditions (column). Pearson correlation was applied for the measurement of row and column distance. Globally normalized view was presented here. The color scale of the heat map ranges from saturated blue (value, −2.45) to saturated red (value, 2.19) in the natural logarithmic scale. The proteins were mainly clustered into two parts as shown by I (up-regulated in adverse outcome groups) and II (down-regulated in the adverse outcome groups). The pattern of regulation was similar between recurrent vascular event and cognitive decline group amid subtle differences in magnitudes. MBP and ALB were the two most regulated proteins. The protein names and accession numbers were taken from the Uniprot protein database. The gene symbols are provided within brackets along with the protein name, wherever available. B) Technical validation of iTRAQ result by WB analysis of ALB on pooled lysates. ALB showed down-regulation in both LACI groups with adverse outcome, which is consistent with the iTRAQ result. doi:10.1371/journal.pone.0094663.g003
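To illustrate the distance measure behind the hierarchical clustering, the Python sketch below log-transforms iTRAQ ratios and computes a Pearson correlation distance between two protein profiles. Taking the distance as 1 − r is our assumption about how the correlation is converted into a distance, and the three example profiles are hypothetical values, not entries from Table 2.

```python
import math
from typing import List, Sequence


def ln_ratios(ratios: Sequence[float]) -> List[float]:
    """Natural-log transform of reporter-ion ratios (e.g. 115/114)."""
    return [math.log(r) for r in ratios]


def pearson_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation distance, taken here as 1 - r (assumption)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - sxy / math.sqrt(sxx * syy)


# Hypothetical ratios in the order 115/114, 116/114, 117/114.
up_a = ln_ratios([0.8, 2.0, 1.5])    # up-regulated in adverse outcome groups
up_b = ln_ratios([0.9, 2.5, 1.8])    # similar pattern, different magnitude
down = ln_ratios([1.2, 0.5, 0.7])    # opposite pattern

print(pearson_distance(up_a, up_b), pearson_distance(up_a, down))
```

Profiles with the same direction of regulation give a distance near 0 even when their magnitudes differ, whereas opposite patterns give a distance near 2, which is why a correlation-based distance separates up- and down-regulated proteins into distinct clusters.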

Discussion
Here we report the significantly altered plasma proteome of microvesicle-rich fraction by comparative profiling of three groups of prospectively followed-up LACI patients and a demographically matched control group that could predict adverse outcome in the surviving LACI patients.

Quantitative Proteomics of Microvesicle-enriched Fraction - An Alternative Approach of Biomarker Discovery for LACI Prognosis
The proteomics approach for biomarker discovery from crude plasma is technically limited by its complexity and extreme dynamic range (>10^10), thereby resulting in poor sensitivity for detecting low-abundant plasma proteins [41]. To overcome this challenge, multiple approaches have been described, including biophysical fractionation, enrichment of a target sub-proteome and immunodepletion of the abundant interfering proteins. However, none of them was able to significantly outclass the other techniques [38,42]. Here we adopted an alternative approach of targeting the plasma microvesicles in order to enrich the low-abundant pathogenic proteins for quantitative profiling. Microvesicles, including exosomes, are membrane-bound particles and are increasingly being recognized as reservoirs of potential biomarkers [43]. They are reported to be involved in the pathogenesis of various diseases such as ischemic stroke, thrombosis, diabetes, inflammation, atherosclerosis and vascular cell proliferation [18,44,45]. Microvesicles can be secreted from endothelial, circulatory (e.g. platelets, leukocytes, erythrocytes), and even central nervous system (CNS)-specific cell types (e.g. microglia and oligodendrocytes) [16,17,18].
The discovery of circulatory biomarkers for neurological disorders represents an additional challenge as brain parenchyma remains selectively accessible by the systemic circulation due to the presence of the BBB under physiological conditions. This makes blood an indirect reflector to sense any events happening inside the brain tissue. However, different cell types of the brain (e.g. microglia and oligodendrocytes) are reported to release microvesicles for delivering signals to the neighboring cells and external environment [18]. A fraction of these microvesicles may drain into the cerebrospinal fluid (CSF) or eventually into the blood. In addition, ischemic SVD is well-known to cause an endothelial dysfunction and a diffuse increase in the BBB permeability that may facilitate the leakage of microvesicles into the general circulation [46]. Hence, profiling of circulatory microvesicle-enriched fractions by quantitative proteomics during the post-stroke recovery phase constituted a technically and conceptually preferred strategy to investigate the on-going neuro-pathological processes and to discover useful prognostic markers. Accordingly, the detection of several commonly known exosome markers (e.g. Galectin-3-binding protein, ITGA2B, Peptidyl-prolyl cis-trans isomerase) in the shortlist of perturbed candidates as well as in the complete list (e.g. CD9, CD81, Gelsolin, Glyceraldehyde-3-phosphate dehydrogenase, Pyruvate kinase, Tubulin alpha-1B and beta-1) of confidently identified proteins indicated the successful enrichment of plasma microvesicles, including exosomes, for this study (Table 2, Table S1) [43].
Blood-based biomarker studies in the area of ischemic stroke mostly correlated acute levels of biomarkers (within the first week after stroke onset) to short-term outcome (e.g. death, disability or infarct volume) without focusing on certain subtypes of ischemic stroke. Most of the investigational biomarkers are proteins of extra-cranial source that are related to inflammation, the cardiovascular system and hemostasis, apart from a few proteins of brain origin [47,48]. Our study, using the microvesicle profiling approach, has identified differentially regulated peripheral as well as brain-specific candidates (e.g. MBP and glial fibrillary acidic protein (GFAP)) targeting only lacunar stroke while relating them to the long-term outcome measures such as cognitive decline and recurrent vascular events (Table 2, Table S1). In addition, the plasma was collected during the convalescent phase following the index event, thus effectively evading the acute systemic response.

Up-regulation of Integrin Signaling - Probable Failure of Aspirin Therapy
The down-regulation of candidates from the coagulation cascade (e.g. FGB) and integrin signaling pathway (e.g. ITGA2B, FLNA and TLN1) and up-regulation of plasminogen (PLG) in the LACI patients was associated with no adverse outcome. PLG is secreted as a zymogen and activated by proteolysis through tissue plasminogen activator to generate plasmin, which dissolves fibrin in blood clots and helps to restore circulation. ITGA2B or CD41 is a platelet membrane glycoprotein and receptor for diverse ligands including fibronectin, fibrinogen, plasminogen, prothrombin, and thrombospondin. Activated ITGA2B mediates platelet spreading and aggregation on vascular surfaces during hemostasis and thrombosis. It has been shown that TLN1 can independently activate β integrin by binding on its cytoplasmic tail [49] (Figure 4). FLNA on the other hand can compete with TLN1 for binding to integrins, thereby regulating its activation under certain circumstances [50]. In a recent study, involvement of TLN1-dependent activation of ITGA2B or Rac1 in platelets has been demonstrated for late phase stability of thrombus on undisrupted endothelial cells [51]. Thus, the overall suppression of integrin signaling in patients with no adverse outcome is complementary to the down-regulation of its ligands (i.e. FGB) and the up-regulation of PLG. Aspirin has been reported to partially inhibit the inside-out ITGA2B signaling apart from its anti-platelet action [52]. As all LACI patients were on aspirin therapy, down-regulation of the pro-aggregatory platelet proteins and suppression of proteins from the integrin signaling pathway in platelets should be related to the desired anti-thrombotic effect (Figure 4). This speculation was further supported as all the above-mentioned proteins (i.e. ITGA2B, TLN1, FLNA, FGA, FGB, and PLG) showed opposite or no regulation in both groups with adverse outcome (Figure 4A).
High plasma fibrinogen is well-studied to be an independent risk factor for stroke and is associated with an increased risk of recurrent cardiovascular events, when stroke sub-types were not specified [53]. In another study dealing with SVD in particular, a positive correlation was obtained between fibrinogen level and the amount of leukoaraiosis [54]. Fibrinogen is one of the main determinants of plasma viscosity. Thus, higher levels of fibrinogen in surviving LACI patients may aggravate the cerebrovascular dysfunction through hemorheologic impairment or by inducing a state of hypercoagulability [54].

Up-regulation of Brain-specific MBP - Predictor of Poor Outcome
Brain-specific MBP has been detected in the systemic circulation in nanogram concentrations during the acute phase (e.g. hours to a few days) of ischemic stroke and is correlated with acute (24 hours) or subacute (3 months) outcomes using targeted assays [55,56]. Here, we confidently identified MBP with five unique peptides (unused score = 10.7) by a proteomics profiling approach, justifying the utility of this methodology for sensitive detection of low-abundant plasma proteins. Our result indicates that a significantly higher MBP concentration during the convalescent stage is associated with adverse outcome, which is consistent with the previous reports. BBB abnormality is generally more diffuse in small vessel stroke compared to non-lacunar stroke subtypes, which may cause gradual and sustained leakage of brain-specific MBP into the general circulation [46]. Chronic hypoperfusion of the white matter leading to progressive and selective death of oligodendrocytes by apoptosis and subsequent degeneration of myelinated fibres has been demonstrated in animal models of SVD [57]. Hence, the release of MBP, which is a structural component of CNS myelin and a marker of oligodendrocytes, signifies either an increased glial injury or an increased permeability of the BBB. Both of these may be responsible for the recurrent vascular event or cognitive decline in the groups with adverse outcome. Leaked MBP along with other CNS-specific proteins may also act as antigenic signals to activate a systemic immune response that could exacerbate the ischemic injury through inflammatory pathways [58].

Down-regulation of ALB - Indicative of Poor Outcome
Our results showed that significant down-regulation of plasma ALB is associated with adverse outcome among the surviving LACI patients (Table 2, Figure 3B). Several clinical studies have reported that a higher concentration of circulatory ALB at admission is predictive of a better functional outcome and lower mortality in ischemic stroke patients [59,60]. Unlike those studies, we observed a complementary trend by focusing only on patients with non-disabling lacunar stroke. ALB is known to be neuroprotective in preclinical animal models of stroke and was under clinical trial as a potential neuroprotective agent [61,62]. It might play a beneficial role through its antioxidant or prothrombolytic action, or by promoting and sustaining perfusion in the cerebral microcirculation [63]. Hence, the procoagulatory condition discussed previously complements the down-regulation of ALB in the groups with poor outcome.

Limitations
The samples had been stored for more than 5 yrs (up to a maximum of 12 yrs) at −80 °C, which should be taken into account before comparing the data with those of similar studies in a meta-analysis. However, because this study aims to discover potential prognostic biomarkers and targets long-term outcome variables, a major part of the storage time is accounted for by the study duration itself. Further, similar storage times are common in biomarker studies and have been shown to adequately preserve the quality of frozen samples, when compared with freshly collected specimens, for various circulatory proteins such as insulin-like growth factor-I and transforming growth factor β [55,64].
The microvesicles obtained by sequential centrifugation and ultracentrifugation are often contaminated by co-sedimenting vesicles and protein aggregates. We have acknowledged this by describing the fraction as 'microvesicle-enriched' instead of a pure microvesicle preparation. However, there is no optimized and universally accepted protocol for isolating pure microvesicles. The cellular source of the plasma microvesicles is also unknown in our study; they may be derived from circulatory cells or even from some cells of CNS origin, as suggested by the detection of MBP and GFAP among the identified candidates. The traditionally used flow cytometric approaches may also fail to detect microvesicles from a specific source because of their ultra-low abundance [18]. Hence, the results from this microvesicle-enriched preparation of undefined anatomical origin should be interpreted with caution.
The small amount of starting plasma (~5 ml after pooling) and the low yield of the resulting microvesicles did not allow extensive validation of the candidate markers on pooled or individual samples. Hence, no conclusion could be drawn about the biological variation of the candidate biomarkers. Validation using a multiple reaction monitoring based mass spectrometric approach or a higher volume of starting samples could alleviate this problem.

Concluding Remarks
Our study is the first of its kind in which a discovery proteomics approach was used to identify prognostic circulatory biomarkers of ischemic stroke. The therapeutic importance of plasma microvesicles and the technical advantages of iTRAQ quantitative proteomics are addressed in a unique experimental design. We also proposed several hypotheses for further testing in a larger population of individual patients. The up-regulation of many platelet-related proteins, including proteins of the integrin pathway and coagulation cascade, probably due to the failure of antiplatelet therapy, is associated with adverse outcome in the LACI patients. The opposite regulation of MBP and ALB could indicate underlying pathological changes ongoing during the convalescent stage of LACI.
As the plasma levels of many of these candidates are modifiable via drugs or changes in lifestyle, the perturbed candidates, alone or as a panel, can be tested to identify high-risk patients who should be prioritized for hospital admission, treatment or rehabilitation, or to monitor the effect of therapy on the long-term functional outcome of the disease. Conversely, these proteins can also be used as surrogate markers in LACI-related clinical trials to monitor the consequences of therapeutic interventions [65]. Our study will facilitate a better understanding of the underlying pathology and stimulate research interest in individual candidates, because it is difficult to acquire a comparable data set from LACI-affected brain samples. In conclusion, this study will foster the emerging area of quantitative clinical proteomics as a viable tool for the discovery of novel biomarkers for ischemic stroke.

Raw Data Availability
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository [66] with the dataset identifier PXD000748.

Supporting Information
Table S1 Complete information on the full list of qualified proteins obtained from the bias- and background-corrected iTRAQ data set. (XLSX)