Urinary Collagen Fragments Are Significantly Altered in Diabetes: A Link to Pathophysiology

Background The pathogenesis of diabetes mellitus (DM) is variable, comprising different inflammatory and immune responses. Proteome analysis holds the promise of delivering insight into the pathophysiological changes associated with diabetes. Recently, we identified and validated urinary proteomics biomarkers for diabetes. Based on these initial findings, we aimed to further validate urinary proteomics biomarkers specific for diabetes in general, and particularly those associated with either type 1 (T1D) or type 2 diabetes (T2D). Methodology/Principal Findings The low-molecular-weight urinary proteome of 902 subjects from 10 different centers, 315 controls and 587 patients with T1D (n = 299) or T2D (n = 288), was analyzed using capillary electrophoresis-mass spectrometry. The 261 urinary biomarkers (100 were sequenced) previously discovered in 205 subjects were validated in an additional 697 subjects to distinguish DM subjects (n = 382) from control subjects (n = 315) with 94% (95% CI: 92–95%) accuracy in this study. To identify biomarkers that differentiate T1D from T2D, a subset of normoalbuminuric patients with T1D (n = 68) and T2D (n = 42) was employed, enabling identification of 131 biomarker candidates (40 were sequenced) differentially regulated between T1D and T2D. These biomarkers distinguished T1D from T2D in an independent validation set of normoalbuminuric patients (n = 108) with 88% (95% CI: 81–94%) accuracy, and in patients with impaired renal function (n = 369) with 85% (95% CI: 81–88%) accuracy. Specific collagen fragments were associated with diabetes and type of diabetes, indicating changes in collagen turnover and extracellular matrix as one hallmark of the molecular pathophysiology of diabetes. Additional biomarkers indicating inflammatory processes and pro-thrombotic alterations were observed.
Conclusions/Significance These findings, based on the largest proteomic study performed to date on subjects with DM, validate the previously described biomarkers for DM, and pinpoint differences in the urinary proteome of T1D and T2D, indicating significant differences in extracellular matrix remodeling.


Introduction
Diabetes mellitus (DM) is a complex disease characterized by insufficient insulin production and resultant hyperglycemia with alterations in fat and protein metabolism. With time these alterations cause secondary cellular dysfunctions and vascular damage including diabetic nephropathy, retinopathy, neuropathy, and macrovascular disease or vascular alterations. The most common types of DM are type 1 diabetes (T1D) and type 2 diabetes (T2D). T1D is associated with destruction of insulin-producing β-cells in the islets of Langerhans in the pancreas, typically by an autoimmune mechanism, leading to insufficient insulin production. In contrast, T2D is caused by insulin resistance combined with insufficient insulin synthesis and is often associated with obesity.
Although all forms of DM are characterized by hyperglycemia and β-cell dysfunction, the pathogenesis of DM is variable, comprising different degrees of β-cell dysfunction, apoptosis, inflammation and immune responses. Proteome analysis holds the promise of delivering substantial insight into the pathophysiological changes associated with different types of DM. Urine represents an excellent specimen for proteome analysis, as it can be obtained in high quantities without the need for special collection procedures [1], shows higher stability than blood [2,3], and enables the identification of valid biomarkers for renal, as well as systemic diseases [4,5]. Recently, we identified and validated urinary proteomics biomarkers for DM, and DM associated micro- and macrovascular complications [3,6–11]. These biomarkers also gave indications of relevant pathophysiological changes: the interference with homeostasis of extracellular matrix (ECM) turnover [9].
Based on these initial findings, we aimed to further validate urinary proteomics biomarkers for DM in general, and examine specific association of urinary proteins and peptides with either T1D or T2D. The identification of these differences in the urinary proteome should provide a deeper understanding of the pathophysiological changes associated with DM, especially DM associated micro- and macrovascular disease, and may result in advancements in therapeutic strategies.

A. Urinary biomarkers for DM
Recently, we identified a panel of 261 urinary biomarkers that exhibit significant differences between patients with DM and non-DM individuals [11]. In the study by Snell-Bergeon et al. [11], an SVM-derived classifier based on the DM specific panel ("diabetes 7") was tested in a small one-center cohort of patients with T1D. In this first part of the study (A), to thoroughly validate these marker candidates in an independent multicenter validation set, we collected 697 urine samples from patients with either T1D or T2D and healthy controls in 9 additional centers. Urine samples of 382 DM and 315 non-DM subjects were analyzed using CE-MS urinary proteome analysis, as graphically outlined in Figure 1A. The distribution of the 261 biomarkers in the 697 validation samples is given in Table S1. The established diabetes 7 model enabled classification of this independent validation cohort with an AUC in ROC analysis of 94% (95% CI: 92–95%) (Figure 2A). The classification scores of the non-DM control samples differed highly significantly (P < 0.0001) from those of both T1D and T2D patients (Figure 2B and Table S2). In order to further validate the individual DM biomarker candidates, we applied Mann-Whitney U-testing to identify, out of the 261 peptides, those that are significantly associated with DM also in the independent multicenter cohort of 697 subjects. Of the 261 peptides, 148 displayed P ≤ 0.05 in the validation cohort, indicating significant association with DM in this independent cohort.
In summary, the previously developed DM specific panel is able to identify patients with DM independent of the diabetes type. However, the AUC value of only T1D patients compared to controls is higher (0.946) than the AUC value of T2D patients (0.932). Therefore, we also compared the median scoring of T1D and T2D patients and, interestingly, it was significantly (P = 0.034) different.

B. Urinary biomarkers distinguishing type 1 and type 2 DM
After successful validation of DM specific biomarkers, and initiated by the observed difference between T1D and T2D, we subsequently aimed to investigate these differences in more detail in this second part of the study (B). For this purpose we employed the urinary proteome data from the 382 DM subjects described above and additional urinary proteome data from 205 diabetic subjects previously used for DM biomarker discovery [11], a total of 587 datasets from DM subjects (299 T1D and 288 T2D, Figure 1B).
To avoid any interference of peptides deriving from diabetic nephropathy, we only included DM patients without any evidence for renal disease. Therefore, we used urine samples of normoalbuminuric T1D and T2D patients to identify DM type specific biomarkers. Of the 587 diabetic subjects, 369 were excluded due to evidence of chronic renal disease, and 218 had normal renal function (136 with T1D and 82 with T2D). These 218 subjects were randomly divided into a discovery set (n = 110, 68 T1D and 42 T2D) and an independent validation set (n = 108, 68 T1D and 40 T2D, see Figure 1B, flow sheet, and Table 1). Characteristics of patients in the discovery set (n = 110), validation set (n = 108), and the remaining patients with DM who had chronic renal impairment (n = 369) are given in Table 2, stratified by DM type. The statistical comparison of the single urinary peptides and proteins in the discovery data set resulted in the tentative identification of 222 potential marker candidates (see Table S3/set I).
The differences of biomarkers in T1D and T2D patient urine samples may be caused by different pathophysiology of the DM, but also by differences of other clinical parameters in both cohorts.
For all data sets, T1D subjects were younger, had longer diabetes duration, lower systolic blood pressure and BMI, and were less likely to be treated for hypertension (HTN) or dyslipidemia than patients with T2D. All T1D patients were treated with insulin and none were treated with oral hypoglycemics, in contrast to T2D patients. We analysed whether the different variables contributed to the prediction of diabetes type. Logistic regression can be used for prediction of the probability of occurrence of an event and makes use of several predictor variables that may be either continuous or categorical. Therefore, logistic regression was utilized to assess if demographic or clinical data, or medication use differed by DM type. For this analysis the discovery set was used. The results revealed that the prediction of DM type was not significantly dependent on gender, urinary albumin, ACR, GFR, systolic and diastolic blood pressure, BMI, smoking status, TC, HDL, LDL, TG and medication status. Of all included parameters, only age and duration of DM were significantly independently associated with DM type.
To correct the 222 marker candidates for age and duration of DM related proteomic changes, we performed a non-parametric analysis of variance (Kruskal-Wallis test). The analysis identified 91 peptides significantly correlated with age (see Table S3/set I), and one peptide correlated with duration of DM, which was also among the 91 peptides correlated with age. These 91 peptides were excluded from the list of potentially diabetes type-associated biomarkers.
The remaining 131 age and DM duration independent candidate biomarkers (Table S3/set II, Figure 3B) were employed in an SVM-based classifier, which was trained as a potential 'diabetes type specific polypeptide panel' (DTspP) in the discovery set.
Subsequently, DTspP was evaluated in the validation set (n = 108) consisting of 68 normoalbuminuric T1D and 40 T2D patient samples. As shown in Figure 3C, the corresponding ROC analysis resulted in an AUC value of 88% (95% CI of 81–94%). Urine samples of both cohorts (discovery and validation set) were derived from patients without any measurable renal function loss. To verify if DN could interfere in the discrimination between T1D and T2D patients, the DTspP was applied to a further cohort of DM subjects with impaired renal function (n = 369). This classification resulted in an AUC in ROC analysis (Figure 3D) of 85% (95% CI of 81–88%; classification factors are listed in Table S2).
Figure 2. Results for validation of the urinary proteome pattern specific for diabetes. (A) ROC curve for the independent validation set (n = 697). ROC analysis for diagnosis of DM irrespective of diabetes type using a 261 marker panel [11]. An AUC value of 94% was calculated for the discrimination of case and control groups of the multicenter patient cohort (P < 0.0001).
To identify those peptides significantly differentiating T1D and T2D patients in the validation cohort without (n = 108) and with (n = 369) renal impairment, we applied Mann-Whitney U-testing in these cohorts. This held true in the validation set for 70 markers in the normoalbuminuric patients group and 86 peptides in the kidney disease cohort. 57 candidates were significant in both groups (P < 0.05) (Table S3/set II).

Sequencing of DM specific and DM type specific biomarkers
We applied tandem mass spectrometry to obtain peptide sequences. We successfully obtained sequences for 100 of the 261 DM biomarkers and 40 of 131 DM type specific peptides (Table S1 and S3/set II). Of the validated 148 DM markers and the 57 DM type specific biomarkers we were able to identify 56 and 20 peptides, respectively. Table S4 displays sequence and information on the regulation of the identified and validated biomarkers for DM. The regulation of these markers in urine of DM patients and healthy controls is also shown in Figure 4. The validated and sequenced DM type-specific markers are listed in Table 3, and their regulation between T1D and T2D patients is shown in Figure 5.
The majority of the identified biomarkers were fragments of collagen alpha-1 (I) and (III). In general, collagen fragment levels were decreased in urine of patients with DM compared to non-DM subjects (Figure 4A and B), with even further decreased levels in urine of T2D compared to T1D patients (Figure 5A and B). Most of these collagen fragments are C-terminal. In contrast, fragments of fibrinogen alpha and beta were increased in the urine of patients with DM compared to non-DM subjects (Figure 4C). Furthermore, fragments of alpha-1-antitrypsin, membrane-associated progesterone receptor component 1 and uromodulin (for regulation see Table S4, 4, Table S1 or S3/set II and Figure 4D, 5C) were among the biomarkers.

Discussion
This study represents the largest proteomic study (with respect to cohort size) reported to date. Furthermore, to our knowledge, this is the first study to investigate differences between the urinary proteome of T1D and T2D patients. In this work we successfully validated urinary peptides that are specific for DM in general (part A), and further identified urinary peptides significantly associated with T1D or T2D (part B). The defined biomarkers indicate (patho)physiological differences in extracellular matrix remodeling between T1D and T2D. Due to different etiopathologies of T1D and T2D, T1D subjects in our study were significantly younger, and had significantly longer duration of DM. In addition, all T1D subjects received insulin treatment. All these potential confounding factors were considered in the statistical analysis, and peptides which were significantly associated with these factors were excluded from further examinations.
The most prominent DM associated urinary proteome changes were a significant reduction of specific collagen alpha-1 (I) and (III) fragments, and in direct comparison among patients without evidence of chronic kidney disease, these changes were significantly more pronounced in T2D than in T1D, despite the lower ACR and higher estimated GFR in T2D patients. This corresponds to the morphological observation that along with increased β-cell apoptosis, pancreatic islets from T2D patients contain amyloid deposits and resulting fibrosis [12,13]. In this context it is worth mentioning that extracellular matrix (ECM) homeostasis is maintained by the balance between tissue inhibitors of metalloproteinases (TIMP) and matrix metalloproteinases (MMP). Decreased activity of certain MMPs (e.g. MMP-2, -3, and -9), as described in diabetes [14–16], would account for our finding of decreased urinary excretion of collagen fragments, since fewer collagen filaments would in this case be cleaved from the ECM. Our data support the hypothesis that physiological degradation of ECM components, especially collagen fibers, may be disturbed as a result of DM and that this phenomenon would subsequently result in the morphologically observed increase in ECM deposits [15–17]. These data indicate a demand for further research to investigate the detailed relationship between MMPs/TIMPs/ECM in DM-associated complications in a systems approach, as recently suggested [18,19].
In addition, the data on the differences in urinary collagen fragments may indicate that the mechanism of attenuation of collagen degradation is different in T1D and T2D. In addition to MMPs, advanced glycation end products (AGEs) are prominent candidates possibly responsible for a disturbance in collagen breakdown and chemical modification of collagen [20,21]. While we could not find reports indicating significant differences in protease activity or AGE status between T1D and T2D, both phenomena have been observed when comparing patients with diabetes to normal controls [9,22]. Based on the data reported here, we hypothesize that the underlying molecular changes that result in vascular damage and fibrosis in diabetes may be different between T1D and T2D, as indicated by significant differences in urinary collagen fragments.
Alpha-1-antitrypsin (AAT) is a member of the serpin family, a major acute phase protein, and a physiological inhibitor of serine proteases like neutrophil elastase, resulting in a plethora of anti-inflammatory and anti-apoptotic effects [23]. Plasma levels and activity of AAT are reported to be significantly decreased in DM patients [24–26], while we and others found urinary fragments of AAT to be significantly increased [27], suggesting increased degradation and subsequent renal clearance of AAT-derived peptides in DM. Increased degradation, resulting in decreased AAT serum levels, would facilitate conversion of fibrinogen to fibrin by thrombin and release of fibrinogen-alpha and -beta. This assumption is supported by the observed increase of urinary fibrinogen-alpha and -beta-chain fragments in diabetics compared to controls, and is consistent with the significant prothrombotic risk in DM observed by others [28].
Progesterone receptor membrane component 1 (PGRMC1) is a member of the so-called membrane-associated progesterone receptors (MAPR) [29]. As an adaptor protein, PGRMC1 was proposed to be involved in regulating protein interactions, intracellular signal transduction and/or membrane trafficking [29]. Interestingly, in the rat, PGRMC1 activation by progesterone is discussed as an inhibitor of cell respiration and suppressor of glucose transport in late rodent pregnancy [30]. This effect could contribute to pregnancy associated changes in glucose homeostasis in gestational diabetes. While uromodulin has previously been reported to be decreased in patients with DM [31–33], we observe the up-regulation of a uromodulin fragment without a C-terminal arginine residue (Table 3). This may be a result of increased proteolytic activity in DM, resulting in a decrease of the parental protein, but an increase in degradation products. However, this hypothesis requires further investigation.
Several approaches aiming at the analysis of differently regulated proteins in body fluids from patients with T1D and T2D have been performed [31,34]. One early proteomic approach using fractionated human serum samples in the context of T2D and insulin resistance was performed by Zhang et al. [34] to mine low abundant proteins. When comparing serum from patients with T2D or insulin resistance to controls' serum, haptoglobin was elevated. Also, several other proteins involved in the inflammatory response, like α-2 macroglobulin, fibrinogen, complement C3 and C1 inhibitor were altered. Many of the detected proteins have been connected to DM, such as the acute phase protein haptoglobin, which has been associated with cardiovascular and renal complications in T1D [35,36]. However, we are not aware of any investigation using urine for the analysis of differences in the proteome of T1D versus T2D.
Our study has some potential limitations. Health-care provider definitions of diabetes type were used, and although standard clinical methodology was used by experienced diabetologists, tests such as T1D-specific auto-antibodies were not performed. However, any possible misclassification of subjects by diabetes type would bias our findings toward the null. Additionally, as expected, the T1D subjects had different clinical and demographic characteristics compared to the T2D subjects. Therefore, we adjusted for these differences in the statistical analyses to avoid introduction of bias. Although we used state-of-the-art tandem mass spectrometry to identify discovered biomarker candidates by peptide sequence, we were unable to sequence all biomarker candidates. Most likely, we have reached the technical limits of currently available sequencing technology of naturally occurring peptides [37]. In general, native peptide sequencing is limited by post-translational modifications, complicating not only peptide ion fragmentation, but also subsequent database searches [37,38]. Additionally, the proteomics CE-MS technology is able to detect polypeptides with a high analytical sensitivity [39,40], whereas tandem mass spectrometry used for sequencing has higher detection limits [41,42]. In conclusion, this work gives clear and valid evidence, based on a multicenter cohort, of differences in the urinary proteome of T1D versus T2D patients with normal renal function, validated also in those with chronic kidney disease. Future studies should enable identification of not yet sequenced differentially expressed peptides and determine how these differences can be exploited for disease monitoring and therapeutic issues. However, the vast amount of data reported here and available today clearly suggest that alterations in the remodeling of extracellular matrix, and likely in endogenous proteolytic activity, are among the hallmarks of DM. 
These pathophysiological changes likely represent promising targets for pharmacological intervention, aiming specifically at prevention of diabetes-associated vascular complications. Further, the alterations in urinary ECM degradation products show significant differences between T1D and T2D.

Patient characteristics and study design
Urine samples were collected as described previously [43], in agreement with the protocol established by HUPO (www.hupo.org/research/urine) and EuroKUP (www.eurokup.org). In short, urine samples were collected using standard operation procedures and frozen immediately without the addition of any preservatives. 587 patients with either T1D (n = 299) or T2D (n = 288) were recruited at 10 different hospital centers in the US, Europe, and Australia (see Table 1 for details). The diagnosis of T1D and T2D was based on commonly accepted diagnostic criteria [44]. The pre-existing diagnosis of T1D and T2D as assigned in each center was considered as reference-standard for the purpose of comparison with the generated DM-specific urinary polypeptide panels. 205 of the 587 diabetes patients were previously used [11] for development of the DM-specific panel. The remaining 382 DM samples and an additional 315 samples from healthy non-DM controls (53% male; mean age ± SD, 40 ± 10 years) [45,46] were used in this study as an initial step to validate the DM (yes/no) panel (Figure 1).
Chronic renal impairment was assessed using albumin/creatinine ratio >30 mg/g, or with a glomerular filtration rate (GFR) <60 unit, and scoring negative in a previously published classification model for chronic kidney disease [47]. The Cockcroft-Gault method was used to estimate GFR.

Sample preparation
A 0.7 mL aliquot of urine was thawed immediately before use and diluted with 0.7 mL 2 M urea, 10 mM NH4OH containing 0.02% SDS. In order to remove high molecular weight compounds of urine, samples were filtered using Centrisart ultracentrifugation filter devices (20 kDa molecular weight cutoff; Sartorius, Goettingen, Germany) at 3,000 g until 1.1 mL of filtrate was obtained. Subsequently, filtrate was desalted using a PD-10 column (GE Healthcare, Sweden) equilibrated in 0.01% NH4OH in HPLC-grade water. Finally, samples were lyophilized and stored at 4 °C. Shortly before CE-MS analysis, lyophilisates were resuspended in HPLC-grade water to a final protein concentration of 0.8 mg/mL as verified by BCA assay (Interchim, Montlucon, France).

Urinary proteome analysis
CE-MS analysis was performed as described previously [2,40]. By this procedure the average recovery rate in the preparation procedure was ~85% and the limit of detection was ~1 fmol [40]. Mass resolution was controlled to be above 8,000, enabling resolution of monoisotopic mass signals for z ≤ 6. After charge deconvolution, mass deviation was <25 ppm for monoisotopic resolution and <100 ppm for unresolved peaks (z > 6). The analytical precision of the set-up was assessed by reproducibility achieved for repeated measurements of the same replicate and by the reproducibility achieved for repeated preparations and measurements of the same urine sample [40]. To ensure high data consistency, a minimum of 950 peptides/proteins had to be detected with a minimal MS resolution of 8,000 in a minimal migration time interval of 10 minutes. By following this set-up, CE-MS enabled reproducible and robust high-resolution urinary proteome analysis.

Data processing
Mass spectral ion peaks representing identical molecules at different charge states were deconvoluted into single masses using MosaiquesVisu software [48]. For noise filtering, signals with z > 1 observed in a minimum of 3 consecutive spectra with a signal-to-noise ratio of at least 4 were considered. MosaiquesVisu employs a probabilistic clustering algorithm and uses both isotopic distribution (for z ≤ 6) as well as conjugated masses for charge-state determination of peptides/proteins. The resulting peak list characterizes each polypeptide by its mass and its migration time. TOF-MS data were calibrated utilizing 80 reference masses exactly determined by Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS). For calibration, linear regression is performed. Both capillary electrophoresis (CE)-migration time and ion signal intensity (amplitude) show variability, mostly due to different amounts of salt and peptides in the sample, and were consequently normalized. Reference signals of more than 1,700 urinary polypeptides were used for CE-time calibration by local regression [49]. For normalization of analytical and urine dilution variances, MS signal intensities were normalized relative to 29 "housekeeping" peptides with small relative standard deviation. For calibration, local regression is performed [50]. The obtained peak lists characterized each polypeptide by its molecular mass [Da], normalized CE migration time [min] and normalized signal intensity. To avoid artifacts (specific individual peptides), only detected peptides with frequency >20% were deposited, matched, and annotated in a Microsoft SQL database allowing further statistical analysis. For clustering, peptides in different samples were considered identical if mass deviation was ≤50 ppm for small or ≤75 ppm for larger peptides. Due to analyte diffusion effects, CE peak widths increase with CE migration time. In the data clustering process this effect was considered by linearly increasing cluster widths over the entire electropherogram (19 min to 45 min) from 2% to 5%. After calibration, mean deviation of migration time was controlled to be below 0.35 minutes. All annotated data are available in Table S5C.
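The matching rule described above can be illustrated with a minimal Python sketch. It is not the MosaiquesVisu implementation: the function names are hypothetical, the mass boundary between "small" and "larger" peptides (`small_mass_limit`) is an assumed placeholder because the text does not state it, and treating the 2% to 5% cluster width as a relative tolerance on migration time is likewise an assumption.

```python
def ppm_deviation(mass_a, mass_b):
    """Relative mass difference in parts per million, with mass_b as reference."""
    return abs(mass_a - mass_b) / mass_b * 1e6


def time_tolerance_fraction(t_min, t_start=19.0, t_end=45.0,
                            w_start=0.02, w_end=0.05):
    """Relative cluster width, growing linearly from 2% at 19 min to 5% at 45 min."""
    t = min(max(t_min, t_start), t_end)  # clamp to the electropherogram range
    return w_start + (w_end - w_start) * (t - t_start) / (t_end - t_start)


def same_peptide(mass_a, time_a, mass_b, time_b, small_mass_limit=4000.0):
    """Decide whether two signals would be clustered as the same peptide.

    small_mass_limit (Da) is an assumed value, not taken from the paper.
    """
    ppm_limit = 50.0 if mass_b <= small_mass_limit else 75.0
    if ppm_deviation(mass_a, mass_b) > ppm_limit:
        return False
    # Assumption: the cluster width is a fraction of the reference migration time.
    return abs(time_a - time_b) <= time_tolerance_fraction(time_b) * time_b


match = same_peptide(1000.00, 25.0, 1000.03, 25.2)     # about 30 ppm apart
mismatch = same_peptide(1000.00, 25.0, 1000.10, 25.0)  # about 100 ppm apart
```

With these assumed settings the first pair is clustered together and the second is not, because its mass deviation exceeds the 50 ppm limit.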
Biomarker discovery. Peptides' P-values were calculated using the base 10 logarithm transformed intensities and the Gaussian approximation to the t-distribution. For multiple testing corrections, P-values were corrected using the false discovery rate (FDR) procedure introduced by Benjamini and Hochberg [51]. The FDR is the expected fraction of false positives among all tests declared significant. FDR was controlled to be ≤0.05, which means that on average no more than 5% of the peptides declared significant are expected to be false positives, and at least 95% are expected to be true positives. The approach is reported to have high statistical power for biomarker discovery in the situation of differential expression between two samples, when subjected to two different treatments, such as disease/no disease [51]. Only proteins that were detected in a diagnostic group of patients in at least 50% of samples were considered for testing. The test was implemented as macros in SAS (www.sas.com) and is part of the multtest R-package (www.bioconductor.org). Nonparametric Kruskal-Wallis one-way analysis of variance [52] (MedCalc version 8.1.1.0, MedCalc Software, Belgium, www.medcalc.be) was used to assess the candidates' dependency on age and DM duration.
Descriptive statistics. Sensitivity, specificity, and 95% confidence intervals (95% CI) were calculated using receiver operating characteristic (ROC) plots [53] (MedCalc version 8.1.1.0, MedCalc Software, Belgium, www.medcalc.be). The ROC curve was obtained by plotting, for all available thresholds, the sensitivity (true positive fraction) on the y-axis against the corresponding 1-specificity (false positive fraction) on the x-axis. The area under the ROC curve (AUC) provides a single measure of overall accuracy that does not depend on any particular threshold.
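As an illustration of the Benjamini-Hochberg adjustment named above, the following Python sketch computes adjusted P-values from a list of raw P-values. It is a generic textbook version, not the SAS macros or R package used in the study; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five peptides.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.27]
adj_p = benjamini_hochberg(raw_p)
significant = [q <= 0.05 for q in adj_p]
```

In this made-up example only the first two peptides remain significant at FDR ≤ 0.05, although four of the five raw P-values are below 0.05.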
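The construction of the ROC curve and its AUC can be sketched in a few lines of Python. This is a generic illustration with hypothetical classification scores and labels, not the MedCalc procedure used in the study; the AUC is obtained here by the trapezoidal rule over the empirical ROC points.

```python
def roc_points(scores, labels):
    """Return (false positive fraction, true positive fraction) per threshold.

    labels: 1 for cases, 0 for controls. A sample is called positive when its
    score is greater than or equal to the threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical scores for three cases (1) and three controls (0).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
area = auc(roc_points(scores, labels))
```

For this toy data the area is 8/9, which equals the fraction of case-control pairs in which the case has the higher score.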

Classification
Disease specific protein/peptide patterns were generated using support vector machine-based (SVM) MosaCluster software [4]. SVMs view a data point (a proband's urine sample) as a p-dimensional vector (p being the number of proteins used in the pattern) and attempt to separate the data points with a (p − 1)-dimensional hyperplane. The hyperplane with the maximal distance to the nearest data point is selected. Classification is performed by determining the Euclidean distance (defined as the SVM score) of the data point to the (p − 1)-dimensional maximal-margin hyperplane, together with the direction of the vector.
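The notion of an SVM score as a signed distance to the separating hyperplane can be shown with a small Python sketch. It assumes a linear hyperplane w·x + b = 0 with hypothetical weights; the kernel and parameters used by MosaCluster are not described here, so this is only an illustration of the geometry.

```python
import math


def svm_score(x, w, b):
    """Signed Euclidean distance of sample x from the hyperplane w.x + b = 0.

    The magnitude is the distance to the hyperplane; the sign indicates on
    which side of it the sample lies, and hence the assigned class.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


# Hypothetical two-peptide pattern (p = 2), so the hyperplane is a line.
w = [3.0, 4.0]
b = -5.0
score_case = svm_score([3.0, 4.0], w, b)     # lies on the positive side
score_control = svm_score([0.0, 0.0], w, b)  # lies on the negative side
```

Here the first sample lies 4 units on the positive side of the hyperplane and the second lies 1 unit on the negative side.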

Sequencing of peptides
In order to identify the defined biomarkers, we applied MS/MS peptide sequencing using CE- or liquid chromatography (LC)-MS/MS analysis including either collision-induced dissociation (CID) [8,54] or electron transfer dissociation (ETD) [38,55,56]. Obtained MS/MS data were submitted to MASCOT (www.matrixscience.com) for search against human entries in the MSDB Protein Database. Accepted parent ion mass deviation was 0.5 Da; accepted fragment ion mass deviation was 0.7 Da. Hits were accepted with MASCOT peptide scores of ≥20. Additionally, ion coverage was controlled to be related to main spectral fragment features (b/y or c/z ion series). If necessary, manual de novo sequencing was performed to confirm the identifications. The number of basic and neutral polar amino acids of the peptide sequences was utilized to correlate peptide sequencing data to CE-MS data, as described [54].

Supporting Information
Table S1 261 DM-specific peptides included in diabetes7 model. Shown are the protein/peptide identification number in the dataset (Protein ID), mass (in Da) and normalized migration time (in min), the p-values [unadjusted using Mann-Whitney U-test], frequency, mean amplitude and standard deviation in the two groups of the cohort, and the regulation factor by diabetes compared to healthy controls. In addition, sequences (modified amino acids: p = hydroxyproline; k = hydroxylysine; m = oxidized methionine), protein names, start and stop amino acid, Swiss-Prot entries and accession numbers are given.
Table S3 "Set II": 131 peptides included in DTspP. Shown are the protein/peptide identification number in the dataset (Protein ID), mass (in Da) and normalized migration time (in min), the adjusted P-values using Benjamini-Hochberg (BH) for training data and unadjusted P-values using Mann-Whitney U-test for validation and CKD cohorts, the frequency, mean amplitude, standard deviation in the two groups of diabetes in the training set and in the group of 315 healthy controls, and the regulation factor for type 1 compared to type 2 and type 1 and 2 diabetes compared to healthy controls. In addition, sequences (modified amino acids: p = hydroxyproline; k = hydroxylysine; m = oxidized methionine), protein names, start and stop amino acid, Swiss-Prot entries and accession numbers are given.