A Novel Multivariate Index for Pancreatic Cancer Detection Based On the Plasma Free Amino Acid Profile

Background. The incidence of pancreatic cancer (PC) continues to increase worldwide, while most patients are diagnosed at an advanced stage and survive <12 months. This poor prognosis is largely attributable to the difficulty of early detection. Here we developed and evaluated a multivariate index composed of plasma free amino acids (PFAAs) for early detection of PC.
Methods. We conducted a multi-institutional cross-sectional study in Japan. Fasting plasma samples were collected from PC patients (n = 360), chronic pancreatitis (CP) patients (n = 28), and healthy control (HC) subjects (n = 8372) without apparent cancers who were undergoing comprehensive medical examinations. Concentrations of 19 PFAAs were measured by liquid chromatography–mass spectrometry. We generated an index consisting of the following six PFAAs as variables: serine, asparagine, isoleucine, alanine, histidine, and tryptophan. The index was built for discrimination in a training set (120 PC and 600 matched HC) and evaluated in a validation set (240 PC, 28 CP, and 7772 HC).
Results. Several amino acid concentrations in plasma were significantly altered in PC. Plasma tryptophan and histidine concentrations were particularly low in PC, while serine was particularly high, compared with HC. The area under the curve (AUC) based on receiver operating characteristic (ROC) curve analysis of the resulting index for discriminating PC from HC was 0.89 [95% confidence interval (CI), 0.86–0.93] in the training set. In the validation set, the AUCs of the PFAA index were 0.86 (95% CI, 0.84–0.89) for all PC patients versus HC subjects, 0.81 (95% CI, 0.75–0.86) for PC patients from stage IIA to IIB versus HC subjects, and 0.87 (95% CI, 0.80–0.93) for all PC patients versus CP patients.
Conclusions. These findings suggest that the PFAA profile of PC patients differs significantly from that of HC subjects. The PFAA index is a promising biomarker for screening and diagnosis of PC.


Introduction
Pancreatic cancer (PC) is currently the eighth leading cause of cancer-related mortality, with an estimated 266,000 deaths worldwide in 2008 [1], and remains one of the most challenging malignancies to treat. The only potentially curative therapy is surgical resection; however, approximately 70% of cases initially present with advanced disease (stage III-IV), which cannot be cured by surgery. Advanced PC has a very poor prognosis, which is attributable to absence of early symptoms and useful screening methods, with a median survival period of 7.7 months for stage III and 2.5 months for stage IV disease [2]. Moreover, the 5-year survival rates are reportedly only 21.3% for local stage, 8.9% for regional stage, and 1.8% for distant stage [3].
Several tumor-associated antigens have been evaluated as potential prognostic factors for PC, including carcinoembryonic antigen (CEA) and carbohydrate antigen (CA) 19-9. CA19-9 is the most clinically useful diagnostic marker, with sensitivity of 79%-81% and specificity of 82%-90% in symptomatic patients, but its low positive predictive value makes it a poor marker for screening [4]. In addition, enhanced computed tomography (CT) and endoscopic ultrasound (EUS) are useful for the diagnosis of PC; however, these modalities are costly and potentially hazardous. Therefore, it is necessary to establish more effective screening methods for PC, particularly in the early stages of the disease. Amino acids are either ingested or endogenously synthesized and play essential physiological roles both as basic metabolites and as metabolic regulators. Plasma free amino acids (PFAAs) are promising biomarker candidates because PFAA profiles are known to be influenced by metabolic variations in specific organ systems induced by specific diseases [5][6][7][8][9]. Previous comprehensive metabolomic studies have often focused on changes in PFAA profiles [10][11][12][13]. Measuring PFAA concentrations as possible markers of disease is also better suited than comprehensive metabolomics to accurate, high-throughput analysis by mass spectrometry. Changes in PFAA profiles are characteristic of several cancers; thus, a multivariate index composed of these PFAAs could be used to better discriminate individual cancer types from healthy controls [14][15]. In the present study, we investigated the patterns of PFAA profiles, and then developed and validated a multivariate index for detection of PC.
Kameda Medical Center, and the Kanagawa Health Service Association. All subjects gave written informed consent before participation in this study. All clinical information was anonymized before data analysis.

Subjects
PC patients (n = 360) included in this study were recruited from the Osaka Medical Center of Cancer and Cardiovascular Diseases, the Kanagawa Cancer Center, the National Cancer Center Hospital, Tokai University Hospital, and the Gunma Prefectural Cancer Center between 2007 and 2014. Patients with chronic pancreatitis (CP; n = 28) were recruited from JCHO Osaka Hospital between 2013 and 2014, while healthy control (HC) subjects (n = 8372) who underwent comprehensive health examinations were recruited from Kanagawa Health Service Association, Kameda Medical Center (Makuhari Clinic), and Mitsui Memorial Hospital between 2008 and 2010. Over 95% of consecutive PC and CP cases and HCs agreed to provide consent during the study period. PC patients with the following characteristics were excluded: (1) simultaneously diagnosed with cancer in another organ, (2) hepatitis C, and (3) under treatment with anti-cancer agents. CP patients with the following characteristics were excluded: (1) diagnosed with cancer and (2) hepatitis C. The inclusion criteria for HC subjects were as follows: (1) no history of any cancer, and (2) no history of hepatitis C. PC stage was determined according to the Sixth Edition of the International Union Against Cancer (UICC) Tumor-Node-Metastasis (TNM) Classification of Malignant Tumors [16].

Dataset preparation
Among the 360 PC patients, the 120 whose blood samples were collected earliest were used as the training dataset. To prepare the HC subjects for the training dataset, 600 of the 8372 HC subjects were selected using propensity score matching based on gender and age distribution. The remaining 240 PC patients, all 28 CP patients, and the remaining 7772 HC subjects were used as the validation dataset.

Statistical analysis
Mean and standard deviation (SD). The mean amino acid concentrations ± SDs were calculated to determine summarized PFAA profiles for both patients and controls.
Mann-Whitney U-test. The Mann-Whitney U-test was used to assess significant differences in PFAA concentrations between patients and controls.
Receiver-operator characteristic (ROC) analysis. ROC analysis was performed to determine the capabilities of uni- and multivariate analyses to discriminate between patients and controls. The patient labels were fixed as positive class labels. Therefore, an area under the ROC curve (AUC of ROC) value of <0.5 indicated that the amino acid level was lower in patients than in controls, whereas an AUC of ROC value of >0.5 indicated that it was higher. The 95% confidence intervals (95% CI) of the AUC of ROC for the discrimination of patients based on amino acid concentrations and ratios were also estimated using the methods described by Hanley and McNeil [20].
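The AUC and its confidence interval described above can be sketched as follows. This is a minimal illustration, not the code used in the study (which was run in MATLAB and Prism): the function names and the toy scores are hypothetical, the AUC is computed through its Mann-Whitney interpretation, and the interval assumes the Hanley and McNeil standard error with a normal approximation.

```python
import math

def auc_mann_whitney(pos, neg):
    """AUC as the fraction of (patient, control) pairs in which the patient
    value is higher; ties count as one half."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate CI for an AUC from the Hanley-McNeil standard error."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    se = math.sqrt(var)
    return max(0.0, auc - z * se), min(1.0, auc + z * se)

# Toy values (hypothetical, not study data): patients are the positive class.
patients = [3, 5, 6]
controls = [1, 2, 4, 5]
toy_auc = auc_mann_whitney(patients, controls)

# Interval for an AUC of 0.89 with 120 patients and 600 controls.
lower, upper = hanley_mcneil_ci(0.89, 120, 600)
```

With an AUC of 0.89 and the training-set sizes (120 PC, 600 HC), this formula gives an interval of about 0.85 to 0.93. An amino acid that is lower in patients would give a toy AUC below 0.5 under this convention, matching the interpretation stated above.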
Logistic regression analysis. Multivariate logistic regression analysis was performed to estimate the model discriminating PC patients from control subjects.
Model selection of PFAA index. The PFAA index was defined as a multivariate model using PFAA concentrations as variables. Logistic regression analysis with variable selection was performed to distinguish PC patients from the HC subjects. The number of explanatory variables was restricted to fewer than seven to avoid potential multicollinearity. For model selection, the AUC of ROC was obtained after leave-one-out cross-validation (LOOCV). In brief, one matched set composed of one PC patient and the corresponding control subjects was omitted from the training dataset, and the logistic regression model was fitted to the remaining samples to estimate coefficients for each amino acid. The function values for the left-out matched set were calculated based on this model. This process was repeated until every sample in the training dataset had been left out once.
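The hold-out scheme described above, in which a whole matched set is left out at a time rather than a single sample, can be sketched as follows. The function name, the set identifiers, and the toy layout (one patient with two controls per set) are hypothetical; the model fitting step is omitted and only the splitting logic is shown.

```python
from collections import defaultdict

def leave_one_matched_set_out(set_ids):
    """Yield (set_id, train_idx, test_idx), holding out one whole matched set
    (a patient plus the matched controls) in each fold."""
    members = defaultdict(list)
    for idx, sid in enumerate(set_ids):
        members[sid].append(idx)
    for sid, test_idx in members.items():
        held_out = set(test_idx)
        train_idx = [i for i in range(len(set_ids)) if i not in held_out]
        yield sid, train_idx, test_idx

# Toy layout (hypothetical): 3 matched sets, each with 1 patient and 2 controls.
set_ids = ["s1", "s1", "s1", "s2", "s2", "s2", "s3", "s3", "s3"]
folds = list(leave_one_matched_set_out(set_ids))

for sid, train_idx, test_idx in folds:
    # A model would be fitted on train_idx and scored on test_idx here.
    print(sid, "train:", train_idx, "test:", test_idx)
```

Keeping a patient and the matched controls in the same fold means the held-out scores never come from a model that has seen any member of that set, which is the point of omitting the matched set as a unit.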

Software
All statistical and multivariate analyses were performed using MATLAB (MathWorks, Natick, MA, USA) and Prism (GraphPad Software, Inc., San Diego, CA, USA) statistical software.

Table 1 summarizes the characteristics of PC and HC subjects included in this study. PC patients with stage 0-IIB disease, as a resectable stage subgroup, accounted for 35.8% of the training set and 35.0% of the validation set.

PFAA profiles of PC patients
We first measured the concentrations of 19 plasma amino acids in the training set by HPLC-ESI-MS and found a significant increase in the Ser concentration and significant decreases in the concentrations of 14 amino acids (Thr, Asn, Pro, Ala, Cit, Val, Met, Leu, Tyr, Phe, His, Trp, Lys, and Arg) in PC patients compared with HC subjects (p < 0.05) (Table 2). Plasma Ser concentrations were especially high, while Trp and His concentrations were particularly low, in PC patients compared with HC subjects. The PFAA profiles of PC patients with stage 0-IIB disease, as a resectable stage subgroup, were similar to those of all other PC patients.

Multivariate PFAA index
For effective detection of PC patients, we calculated optimal PFAA indices by multiple logistic regression analysis. The AUC of ROC values obtained after LOOCV in the top 50 models were virtually the same (0.88-0.89). We selected a representative model composed of Ser, Asn, Ile, Ala, His, and Trp as the best model (Table 3). Additional logistic regression analyses that added BMI and/or smoking history to the explanatory variables were then performed to estimate the effects of potential confounding. No obvious increase in significance was observed for any amino acid when those factors were added to the model, suggesting that the changes in the plasma levels of those amino acids caused by PC were independent of the BMI and smoking status of subjects (Table 3).
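For orientation, a logistic regression model on the six selected amino acids has the general form below. This is the standard textbook form, given as a sketch: the symbols $\beta_0$ and $\beta_a$ are placeholders for the fitted intercept and coefficients, $x_a$ denotes the plasma concentration of amino acid $a$, and whether the reported index is the linear predictor or the resulting probability is an assumption not settled by the text.

```latex
\[
  \eta = \beta_0 + \sum_{a \in \{\mathrm{Ser},\,\mathrm{Asn},\,\mathrm{Ile},\,\mathrm{Ala},\,\mathrm{His},\,\mathrm{Trp}\}} \beta_a\, x_a ,
  \qquad
  P(\mathrm{PC} \mid x) = \frac{1}{1 + e^{-\eta}} .
\]
```

Because the logistic function is monotonic, ranking subjects by $\eta$ or by $P(\mathrm{PC} \mid x)$ gives the same ROC curve and the same AUC.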
To assess discrimination of PC patients from control subjects, ROC curves for PC vs. HC or CP were calculated in the training set and the validation set (Fig 3A and 3B, respectively). In the training set, the AUC of the PFAA index for detection of PC patients vs. HC subjects was 0.89 (95% CI, 0.86-0.93) among all PC patients and 0.89 (95% CI, 0.83-0.95) among PC patients with stage 0-IIB disease. The sensitivities of the PFAA index at 95% and 80% specificity were 60.0% and 82.5%, respectively, for all PC patients, and 53.5% and 83.7%, respectively, for PC patients with stage 0-IIB disease (Table 4). In the validation set, the AUC of the PFAA index was 0.86 (95% CI, 0.84-0.89) for all PC patients and 0.81 (95% CI, 0.75-0.86) for PC patients with stage IIA and IIB disease (Fig 3B). The sensitivities of the PFAA index at 95% and 80% specificity were 57.5% and 76.7%, respectively, for all PC patients, and 48.8% and 64.3%, respectively, for PC patients with stage IIA and IIB disease (Table 4). The AUC of the PFAA index for detection of PC vs. CP was 0.87 (95% CI, 0.80-0.93) for all PC patients, and the false-positive rates at 95% and 80% specificity were 7.1% and 25.0%, respectively (Table 4).
To avoid choosing an inappropriate model showing multicollinearity, we calculated the variance inflation factor (VIF), defined here as the maximum diagonal element of the inverse of the correlation coefficient matrix, for each of the top 50 models. All the models passed the test; that is, all VIFs were less than 10. In particular, the VIF of the representative model was 1.70, suggesting that no multicollinearity occurred.
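The VIF check described above can be illustrated with a short sketch. The code below is a minimal pure-Python example rather than the analysis code used in the study: the function names and the two-variable correlation matrix are hypothetical, and the matrix is inverted by Gauss-Jordan elimination to keep the example self-contained.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I] becomes [I | A^-1].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def max_vif(corr):
    """Largest diagonal element of the inverse correlation matrix."""
    inverse = invert(corr)
    return max(inverse[i][i] for i in range(len(corr)))

# Toy correlation matrix (hypothetical): two variables with r = 0.6.
corr = [[1.0, 0.6],
        [0.6, 1.0]]
worst = max_vif(corr)  # for two variables this equals 1 / (1 - r**2)
passes = worst < 10    # threshold used in the text
```

For uncorrelated variables the correlation matrix is the identity and every VIF equals 1, so the reported value of 1.70 for the representative model indicates only mild correlation among its six amino acids.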

Discussion
Dysregulation of PFAA content in PC has been investigated in several recent studies using metabolomics or amino acid analysis [21][22][23]. However, specific PFAA profiles in PC, particularly at resectable stages, remain unconfirmed because of the relatively small numbers of PC patients and control subjects in these studies. Therefore, we measured fasting PFAA concentrations in a large-scale study of 360 PC patients and 8372 control subjects and identified specific PFAA profiles in PC patients using a gender- and age-matched training set (120 PC, 600 HC) (Fig 2). In addition, a similar PFAA profile was observed in patients with stage 0-IIB disease, who accounted for 35.8% of the PC patients in the training set (Fig 2). As shown in Table 2, the plasma concentrations of several amino acids were significantly altered in PC patients, in accordance with the PFAA profiles of five types of cancer reported by Miyagi et al. [14], although plasma His and Trp concentrations were particularly decreased, while Ser concentrations were notably increased (Fig 2). Furthermore, using the training set, we developed a PFAA index composed of six amino acids that were clearly characteristic of the amino acid profile in PC. We demonstrated in an independent validation set that this index can efficiently detect not only advanced PC but also operable PC (stage IIA and IIB disease) (Table 4). Moreover, we showed that the PFAA index rarely classifies chronic pancreatitis as PC (Table 4). PFAA profiles of PC patients have been reported in several previous studies; several amino acid changes were similar among them, although there were some obvious discrepancies [21][22][23]. For example, we found a significant increase in plasma Ser concentrations in PC, while this trend was not observed in other studies. In addition, there were discrepancies in Asn, Gln, Met, Ile, Phe, Leu, and Pro levels.
In contrast, these previous studies commonly reported a significant decrease in Thr concentrations, while changes in Arg, Cit, and Trp concentrations were not determined in one study, and significant decreases in these amino acids in PC were observed in our study as well as two others. We considered several reasons for these discrepancies. First, these previous studies included relatively small numbers of subjects compared with the present study, which included the largest number of subjects to date. In this study, the PFAA index was robust and the AUC barely decreased even with the validation set because it was developed based on a training set with an adequate sample size. Second, differences between our results and those of other studies may have occurred because of variations in   [29]. Furthermore, leaving collected blood samples at room temperature is known to alter plasma amino acid concentrations [30]. To overcome this confounding factor, all participating facilities in this study used the same protocol, in which blood was drawn in the morning before breakfast after overnight fasting and the collected samples were quickly cooled to prevent alterations in amino acid concentrations because of enzymatic reactions. Therefore, the acquired samples were of high quality and the extracted data accurately reflects in vivo amino acid profiles during fasting. Furthermore, determination of amino acid concentrations using HPLC-ESI-MS in this study was calculated not as a semiquantified value using metabolomics as in previous studies but rather as directly quantified absolute concentrations by creating a calibration curve from the peak area of standard references of each amino acid [19]. These measurements were highly accurate and precise to guarantee validation, reproducibility, and limited daily error [19]. Thus, the findings of this study may more clearly demonstrate profile characteristics in comparison with those demonstrated by previous studies. 
The use of multivariate analysis of markers for PC has also been reported [23,31]. For example, Kobayashi et al. [23] constructed a multiple logistic regression model using a 43-case training set, with the concentrations of four metabolites selected as variables from data comprehensively semiquantified by GC-MS. However, of the four selected metabolites, xylitol is a food-derived substance that is present at very low concentrations in healthy individuals [32][33], and it is unclear whether these concentrations are physiologically maintained at certain levels in vivo. Furthermore, Leichtle et al. [31] constructed a combined metabolite panel to discriminate PC from CP and HC using aspartic acid (Asp) and CA19-9 as variables. However, the plasma Asp concentration tends to be comparatively low, with an analytical variability of >25% reported elsewhere [34]. In the present study, the PFAA index was constructed using only amino acids with moderate to high plasma concentrations to secure measurement precision. Therefore, we believe that the PFAA index offers high discriminatory ability without being unduly influenced by measurement errors. Because amino acid analysis is widely used clinically, the PFAA index is likely to be verified quickly, and we expect its use to become widespread in the near future. Genetic, racial, and geographical factors may also contribute to the differences among studies, which should be clarified in future research.
There are several possible mechanisms that may influence PFAA profiles in cancer patients. First, previous studies have demonstrated marked metabolic changes in local cancer, including varied amino acid profiles and different expression of amino acid transporters in cancer cells compared with healthy cells [10][35]. For example, L-type amino acid transporter 1 (LAT1) is strongly expressed in PC cells [36]. With respect to Ser, de novo Ser biosynthesis is upregulated in cancer cells and Ser acts as an allosteric activator of pyruvate kinase isozyme M2 [37]. This characteristic may be related to factors that also increase plasma Ser concentrations. A second possible mechanism is the induction of metabolic changes in remote organs by factors released from cancer cells. For example, Luo et al. [38] reported that HMGB-1 secreted by cancer cells caused the breakdown of remote muscle tissue proteins into amino acids, some of which leak into the blood, thereby altering the PFAA profile. A third possible mechanism is involvement of the immune system. For example, plasma concentrations of Trp have been correlated with common metabolic changes, both in our study and in a previous study that investigated Trp levels in five different types of cancer [14]. Expression of indoleamine 2,3-dioxygenase (IDO), which is involved in the kynurenine metabolic pathway, is induced in various types of cancer (in cancer cells or immune cells) and is known to play an important role in immunosuppression [39]. IDO is also known to be overexpressed in PC cells [40]. Nevertheless, several points regarding the mechanisms behind changes in PFAA profiles in PC remain unclear, and further research is needed to clarify these issues.
Recently, Mayers et al. [41] reported that branched-chain amino acid (BCAA) serum levels are elevated 2-5 years before the onset of carcinogenesis in PC, suggesting that BCAA elevation is an independent risk factor for PC. However, BCAA levels return to normal levels within the 2 years before confirmation of cancer. In addition, the results of a mouse study indicated that the period of BCAA elevation was bell-shaped and only temporary. In our study, we presented the PFAA profiles of definitively diagnosed patients after cancer detection via diagnostic imaging. Among cases with resectable stage disease (up to stage IIB), there were no significant changes in BCAA concentrations compared with control subjects. In cases of advanced cancer, Leu and Val concentrations were decreased. Our study data identified characteristics of PC phases that are readily confirmed by currently available imaging modalities; thus, it is possible that these characteristics differed among the main stages of microcarcinoma or before the onset of carcinogenesis. Because we did not examine PFAA concentrations before carcinogenesis in this study, future studies are needed to accurately identify BCAA dynamics from before PC onset to carcinogenesis. However, the abovementioned studies found that metabolic changes alter systemic amino acid profiles together with changes in plasma BCAA concentrations in the precancerous phase or extremely early stages of PC [41]. Therefore, the observed changes in PFAA profiles of the patients with PC lesions, which could be diagnosed via imaging and considered for resection in our study, may also have been caused by systemic metabolic changes.
Currently, CA19-9 is the most widely used marker to predict PC treatment outcome and post-treatment prognosis [42]. However, CA19-9 is not synthesized by patients classified as Lewis blood group Le(a-b-), which accounts for 10% of cases; therefore, this marker may not be elevated in some patients, even those with advanced stage PC [43]. CEA is widely used as a prognostic marker in gastrointestinal cancers; however, its sensitivity and specificity for PC are poor [44]. The pancreatic enzyme elastase-1, which is thought to increase with pancreatitis caused by pancreatic duct stenosis, has been demonstrated as an effective early diagnostic marker [45]. In the present study, we found no correlations between the PFAA index and CA19-9, CEA, or elastase-1 levels (Fig 5). Thus, when used concurrently, the PFAA index and CA19-9, CEA, or elastase-1 may complement each other and detect PC more accurately. However, CA19-9 and elastase-1 were not measured in the HC subjects in this study. Thus, further studies are needed to compare discriminatory ability accurately and to confirm any synergistic effect with these markers. The discriminatory ability of the PFAA index was shown to be high even for small pancreatic tumors of TS-1 according to subgroup analysis (AUC of ROC = 0.76) (Fig 4B). We also found that the PFAA index was not dependent on the location of the pancreatic tumor (Fig 4C). Although general abdominal ultrasonography is used to diagnose PC in the initial phase, it is difficult to image small tumors or lesions in the pancreatic tail or uncinate process using this modality. Our results suggested that the PFAA index offers similar sensitivity regardless of tumor location. Therefore, the combined use of abdominal ultrasonography and the PFAA index may increase the detection rate of lesions of the pancreatic tail and uncinate process.
In this study, the training set and validation set were divided chronologically. As a result, no early cases (stage I or lower) were included in the validation set, which comprised the chronologically later cases. The fact that the discriminatory ability of the PFAA index for early cases of stage I or lower remains unknown is a limitation of this study. In addition, this study was cross-sectional; therefore, we cannot exclude the possibility of reverse causation or residual confounding from complications such as diabetes or indigestion. In our future work, we plan to demonstrate the clinical significance of our proposed PFAA index and to confirm its ability to discriminate the early stages of PC, as well as the association of the PFAA index with the complications of PC.

Conclusions
In this study, we developed a novel PFAA index using fasting PFAA profiles to discriminate PC patients from control subjects, and validated the index in a large independent validation set. Because the study was cross-sectional, reverse causation, including effects of symptoms or complications associated with PC, cannot be ruled out. Additional studies with a larger patient cohort that includes patients with early stage PC are also required. Nevertheless, we believe the PFAA index will help to improve the early detection of PC in patients with asymptomatic and resectable stage disease.