Standardization of Diagnostic Biomarker Concentrations in Urine: The Hematuria Caveat

Sensitive and specific urinary biomarkers can improve patient outcomes in many diseases by informing early diagnosis. Unfortunately, to date, the accuracy of diagnostic urinary biomarkers and their translation into clinical practice have been disappointing. We believe this may be due to inappropriate standardization of diagnostic urinary biomarkers. Our objective was therefore to characterize the effects of standardizing urinary levels of IL-6, IL-8, and VEGF using the commonly applied standards, namely urinary creatinine, osmolarity and protein. First, we report results based on the biomarker levels measured in 120 hematuric patients: 80 with pathologically confirmed bladder cancer, 27 with confounding pathologies and 13 in whom no underlying cause for their hematuria was identified, designated “no diagnosis”. Protein levels were related to final diagnostic categories (p = 0.022, ANOVA). Osmolarity (mean = 529 mOsm; median = 528 mOsm) was normally distributed, while creatinine (mean = 10163 µmol/l; median = 9350 µmol/l) and protein (mean = 0.3297 mg/ml; median = 0.1155 mg/ml) distributions were not. When we compared AUROCs for IL-6, IL-8 and VEGF levels, we found that protein-standardized levels consistently resulted in the lowest AUROCs. This suggests that protein standardization attenuates the “true” differences in biomarker levels between controls and bladder cancer samples. Second, in 72 hematuric patients (48 with bladder cancer and 24 controls) in whom urine samples had been collected on recruitment and at follow-up (median interval = 11 months; range 1 to 20 months), we demonstrate that protein levels were approximately 24% lower at follow-up (Bland Altman plots). There was an association between differences in individual biomarkers and differences in protein levels over time, particularly in control patients. Collectively, our findings identify caveats intrinsic to the common practice of protein standardization in biomarker discovery studies conducted on urine, particularly in patients with hematuria.


Introduction
Advances in proteomics have enhanced our understanding of the urinary proteome [1,2,3,4] and subsequently encouraged biomarker discovery screens in a range of complex diseases [2,3], including bladder cancer [5,6]. Urine has the advantage of ease of access and is relatively stable thermodynamically [3]. Despite these encouraging developments, no biomarker or biomarker combination has to date achieved widespread clinical application as a diagnostic assay. Perhaps this is partly attributable to the range of methodologies used to standardize urinary biomarker levels, which introduces a lack of consistency in reported levels and inhibits cross-study comparisons.
When we reviewed publications on biomarkers for urological pathologies to ascertain the 'correct' methodology to employ for urine normalization, we found inconsistency. As there is no standard methodology, the normalization method employed for any given study is still very much at the discretion of the project investigator, the accessibility of equipment and the available technical expertise. Further, insufficient research into the effects of different standardization approaches means that researchers are employing methods which may introduce bias. Thus there is the potential both for biased data and masking detection of valuable biomarkers secreted into urine at low levels [7].
Some researchers have reported biomarker levels in units per unit volume of urine [5,8,9]; others have standardized biomarker levels using urinary creatinine [10,11,12,13]. Most, however, have opted to use protein as their denominator [5,14,15,16]. Creatinine, the breakdown product of creatine phosphate during muscle metabolism, is filtered out of the blood into the urine by the kidney. Creatinine is usually produced at a fairly constant rate when renal function, metabolism and muscle mass are stable, but production can depend on age, sex, race and size [17]. Serum creatinine and the albumin:creatinine ratio in urine are in clinical use as biomarkers of kidney disease [18]. Osmolarity is a measure of the osmoles of solute per litre of solution and therefore reflects the concentrating ability of the kidneys. Protein is often used to normalize potential bladder cancer biomarkers [5,14,15,16]. Proteinuria is, however, closely associated with diabetes and renal diseases [7,18,19,20].
Potential biomarkers must proceed through rigorous validation before they progress through the phases that span discovery to clinical application [21]. However, in the absence of evidence-based guidelines for the standardization of urinary biomarkers, it is possible that potential biomarkers secreted at low levels into urine have not been identified. Urine standardization guidelines would complement those already established for Standards for Reporting of Diagnostic Accuracy (STARD) [22,23] and would ensure that promising biomarkers could be cross-referenced, thus facilitating their more expeditious development. It is, however, conceivable that individual guidelines tailored to the specifics of different confounding factors may be required.
The aim of this study was to increase our understanding of the consequences and effects of different methods employed to standardize biomarker levels detected in urine collected from hematuric patients. We assessed the effects on three biomarkers previously reported to be associated with bladder cancer, i.e. interleukin-6 (IL-6) [24], IL-8 [25] and vascular endothelial growth factor (VEGF) [26]. Using data collected during a case control study [27], we characterized urinary creatinine, osmolarity and protein levels across patient groups with the following final diagnoses: no diagnosis (n = 13), confounding pathologies (n = 27) and bladder cancer (n = 80). We determined areas under the receiver operating characteristic curve (AUROC) for IL-6, IL-8 and VEGF both for uncorrected data and for data standardized using urinary creatinine, osmolarity or protein. In 72 hematuric patients, we compared the intra-patient variability of levels measured at recruitment and at follow-up. We assessed whether there was any association between the differences in levels of biomarkers on recruitment and those measured at follow-up and the differences similarly detected in levels of the standards in the same samples. We present findings that indicate urine volume standardization is preferable to the use of protein standardization because of the high incidence of proteinuria in the hematuric patient population.

Patient Samples
A case control study approved by the Research Ethics Committee, Faculty of Medicine, Queen's University Belfast (80/04) and the Office for Research Ethics Committees Northern Ireland (ORECNI 80/04), and reviewed by the Belfast City Hospital Trust review board and the Ulster Community and Hospital Trust Research Committee, was conducted according to STARD guidelines [22,23]. Written consent was obtained from 181 patients with hematuria (103 patients with confirmed transitional cell carcinoma and 78 controls) recruited between November 2006 and October 2008 [27]. All patients were white Caucasians except for one of black African origin. Dipstick analysis is a simple and fast analysis of urine undertaken by medical personnel to determine the levels of constituents in urine, including blood, protein, and white blood cells. Dipstick analyses were undertaken on urine samples collected from each of the patients using Aution Sticks 10EA, which were interpreted using PocketChem (Arkray factory, Inc. Japan). The dipstick results for protein were recorded. Approximately 250 mg/l (0.25 mg/ml) is the lower limit of sensitivity for urine dipstick testing [20]. Urine samples from each patient were then stored at -80 °C for a maximum of 12 months prior to triplicate analyses of urinary creatinine, osmolarity, protein, IL-6, IL-8 and VEGF.
First, we analysed data from 120/181 hematuric patients (96 males: 24 females) with a mean age = 66 years. Eighty of these patients had pathologically confirmed transitional cell carcinoma of the bladder (TCCB) and 40 were controls. Of the controls, 27 had confounding pathologies, such as stones, inflammation or benign prostate enlargement. In 13 patients, even after detailed investigations, including cystoscopy and radiological imaging of the upper urinary tract, no underlying cause for their hematuria was identified. The diagnosis for these patients is referred to as “no diagnosis”.
In our second set of analyses, we compared standards and biomarker levels across time in 72/181 patients (60 males: 12 females) with a mean age = 69 years. Urine samples were collected from these 72 patients both at the time of recruitment and at a second visit (median interval = 11 months (range 1 to 20 months)). It was not possible to collect longitudinal samples from all 181 patients recruited to the study because many patients had significant distances to travel to hospital. The characteristics of these 72 patients were representative of the 120 patients previously analysed. Forty-eight of these patients had TCCB. Sixteen of the 24 controls had confounding pathologies and 8 had a final diagnosis of no diagnosis.

SDS PAGE Analyses
Urine samples (2.5 µl/lane) from each patient were investigated for protein using SDS PAGE (16%) analysis. The gels were stained with Coomassie Blue for 1 h and then de-stained in methanol/acetic acid/water (2:1:7) until clear. Protein bands, observed for each patient, were quantified using Quantiscan® software.

Statistical Analyses
Using data from the 120 hematuric patients, we assessed the distribution of the three standards by visual comparison of histograms and boxplots and interpretation of means, medians, skewness and kurtosis. We explored correlations and then used linear regression to determine the extent to which one standard could predict another. To determine the relationship between protein levels measured in urine and protein categories defined using dipstick analyses, we examined a scatter plot of protein levels (mg/ml) against dipstick protein categories to ascertain the range of protein levels within each dipstick category, i.e. “+”, “++”, “+++” and “++++”. We used one-way ANOVA to determine whether protein levels were related to final diagnostic categories. To ascertain their diagnostic potential, we compared the areas under the receiver operating characteristic curve (AUROC) determined for the standardized and uncorrected biomarker levels. To standardize, we divided the average measurement for each of the three biomarkers by the average osmolarity, creatinine or protein level measured in the same patient's urine sample and then log10-transformed the data.
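The standardization and AUROC comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the SPSS procedure used in the study: the function names `standardized_log10` and `auroc` are hypothetical, the AUROC is computed by direct pairwise comparison (equivalent to the Mann-Whitney statistic), and all numbers are invented to show how a denominator that is itself related to diagnosis can shrink the separation between groups.

```python
import math

def standardized_log10(biomarker, standard):
    """log10 of a biomarker level divided by a standard measured in the same sample."""
    return math.log10(biomarker / standard)

def auroc(cases, controls):
    """Probability that a random case scores above a random control (ties count 0.5)."""
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))

# Hypothetical (biomarker level, protein in mg/ml) pairs; NOT data from the study.
cancer = [(120.0, 0.30), (90.0, 0.45), (60.0, 0.10)]
control = [(40.0, 0.16), (30.0, 0.10), (20.0, 0.08)]

# Uncorrected (per unit volume) levels, log10-transformed.
raw_auc = auroc([math.log10(b) for b, _ in cancer],
                [math.log10(b) for b, _ in control])

# Protein-standardized levels, log10-transformed.
std_auc = auroc([standardized_log10(b, p) for b, p in cancer],
                [standardized_log10(b, p) for b, p in control])

print(f"uncorrected AUROC = {raw_auc:.3f}, protein-standardized AUROC = {std_auc:.3f}")
```

In this invented example the cases with the highest protein levels lose most of their separation from controls once protein is used as the denominator, which is the kind of attenuation the comparison of AUROCs is designed to detect.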
Urine samples were obtained from 72 patients on two visits: one on recruitment and a second at follow-up (median interval = 11 months; range 1 to 20 months). To assess the agreement between the levels of the standards on recruitment and at follow-up, we constructed Bland Altman plots and undertook paired t-test analyses. In addition, we were interested to ascertain whether there were significant associations between differences in individual biomarker levels over time and differences in standard levels over time. For each biomarker we divided the mean biomarker level measured at follow-up by the mean biomarker level measured on recruitment, and then computed the log10 of this value. Similarly, for each standard we divided the mean level measured at follow-up by the mean level measured at recruitment and then computed the log10 of this value. To compare these ratios we undertook regression analyses, entering the log difference of each biomarker as the dependent variable and the log differences of creatinine, osmolarity and protein sequentially as the independent variable.
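As a rough illustration of the agreement analysis, the sketch below computes the quantities usually read off a Bland Altman plot of log-transformed paired data: the mean difference and the 95% limits of agreement (mean ± 1.96 standard deviations of the differences). The function name `bland_altman_log` and the paired values are hypothetical, and the use of natural logarithms is an assumption made for the example.

```python
import math
import statistics

def bland_altman_log(recruitment, follow_up):
    """Mean log_e difference (follow-up minus recruitment) and 95% limits of agreement."""
    diffs = [math.log(f) - math.log(r) for r, f in zip(recruitment, follow_up)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation of the differences
    return mean_diff, mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff

# Hypothetical paired protein levels (mg/ml); NOT data from the study.
recruit = [0.10, 0.40, 0.08, 0.25, 1.20]
follow = [0.08, 0.20, 0.09, 0.25, 0.60]

mean_diff, lower, upper = bland_altman_log(recruit, follow)

# Back-transforming the mean log difference gives the geometric mean ratio
# of follow-up to recruitment levels.
ratio = math.exp(mean_diff)

print(f"mean log_e difference = {mean_diff:.2f} (limits {lower:.2f} to {upper:.2f})")
print(f"geometric mean ratio follow-up/recruitment = {ratio:.2f}")
```

Because the differences are taken on the log scale, a negative mean difference corresponds to a ratio below 1, i.e. lower levels at follow-up.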
Statistical analyses were completed using SPSS v17.

Results
Osmolarity (mean = 529 mOsm; median = 528 mOsm) was normally distributed while creatinine (mean = 10163 µmol/l; median = 9350 µmol/l) and protein (mean = 0.3297 mg/ml; median = 0.1155 mg/ml) distributions were not. This is substantiated by skewness and kurtosis values for osmolarity (0.1; -0.5), creatinine (2.2; 8.8), and protein (3.1; 10.6) (Figure 1). Measurements for osmolarity, creatinine, and protein ranged from 103 to 1047 mOsm; 1329 to 44542 µmol/l (1.3 to 44.5 mmol/l); and zero to 3.36 mg/ml, respectively. Two patients had extreme creatinine levels (Figure 1B). These levels, 44542 and 39077 µmol/l respectively, were measured in a 40 year-old male with stone disease and a 58 year-old male with non-muscle invasive TCCB. All other measurements were <24000 µmol/l. Extreme creatinine levels have been reported previously [12]. There was a modest relationship between osmolarity and creatinine in that 51.9% of the variation in creatinine was accounted for by osmolarity (linear regression; R Square = 0.519) (Figure 2). In this study we report 49% false positives and <1% false negatives in dipstick analyses based on our findings that 25/51 patients deemed dipstick positive had protein levels <0.25 mg/ml; and that 4/62 patients with measured protein levels >0.25 mg/ml were deemed dipstick negative (Figure 3).
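For readers unfamiliar with the two shape statistics quoted above, the following sketch shows one simple way to compute them. It uses plain moment-based definitions (skewness as the standardized third moment, kurtosis as the standardized fourth moment minus 3); statistical packages may apply small-sample corrections, so their output can differ slightly. The function name and the example values are hypothetical.

```python
import statistics

def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both 0 for a normal distribution)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical values; NOT data from the study.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [0.05, 0.08, 0.10, 0.12, 0.15, 0.30, 3.36]

sym_skew, sym_kurt = skewness_and_excess_kurtosis(symmetric)
rs_skew, rs_kurt = skewness_and_excess_kurtosis(right_skewed)

print(f"symmetric: skewness = {sym_skew:.2f}, excess kurtosis = {sym_kurt:.2f}")
print(f"right-skewed: skewness = {rs_skew:.2f}, excess kurtosis = {rs_kurt:.2f}")
```

Values near zero are consistent with a symmetric, roughly normal shape, whereas a long right tail driven by a few very high measurements produces large positive skewness and kurtosis.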
Urinary protein levels were related to final diagnostic categories (ANOVA; p = 0.022). Protein levels in urine from bladder cancer patients were higher than in those with no diagnosis (p = 0.073) (Table 1). In contrast, osmolarity and creatinine levels were not significantly related to final diagnoses (ANOVA; p = 0.851 and 0.630, respectively).
Median protein levels were lower at follow-up (0.08 mg/ml) than on recruitment (0.10 mg/ml). Osmolarity and creatinine were constant: mean osmolarity = 519 and 521 mOsm and mean creatinine = 9835 and 9941 µmol/l on recruitment and at follow-up, respectively; median osmolarity = 527 and 515 mOsm and median creatinine = 9086 and 8832 µmol/l, respectively. Bland Altman plots illustrated that protein levels decreased by approximately 24% between recruitment and follow-up (mean log e difference = -0.24 (95% Confidence Interval (CI) 2.18 to -2.66)). In contrast, osmolarity and creatinine were stable with little variation across the scale in the Bland Altman plots (Figure 5). Protein levels decreased between recruitment and follow-up (Paired T-test; p<0.10) (Table 2).
When we studied longitudinal ratios, there were significant associations between the log10 differences in each of the three biomarkers and the log10 differences in protein. These associations between the biomarker and protein ratios were stronger in the control sub-population (n = 24) than in the bladder cancer sub-population (n = 48). In the control sub-population, IL-6 = 20. There were no significant associations between the differences in biomarker levels (follow-up minus recruitment) and the differences similarly determined in either osmolarity or creatinine in the same samples (Figure 6).
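The longitudinal analysis amounts to regressing one log ratio on another. The sketch below is a minimal illustration under stated assumptions: ordinary least squares with a single predictor, written out by hand rather than reproducing the SPSS output, with hypothetical function names and invented paired measurements.

```python
import math

def log10_ratio(follow_up, recruitment):
    """log10 of the follow-up level divided by the recruitment level."""
    return math.log10(follow_up / recruitment)

def simple_linear_regression(x, y):
    """Ordinary least squares of y on x; returns slope, intercept and R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical paired measurements (recruitment, follow-up); NOT data from the study.
protein = [(0.10, 0.05), (0.40, 0.20), (0.08, 0.16), (0.25, 0.25)]
biomarker = [(100.0, 60.0), (80.0, 50.0), (50.0, 80.0), (60.0, 55.0)]

x = [log10_ratio(f, r) for r, f in protein]    # independent variable: standard
y = [log10_ratio(f, r) for r, f in biomarker]  # dependent variable: biomarker

slope, intercept, r_squared = simple_linear_regression(x, y)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R squared = {r_squared:.2f}")
```

A positive slope with a high R squared would indicate that changes in the biomarker over time track changes in the standard in the same samples.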
We analysed the urine from each patient using PAGE. The levels of protein in the urine that we observed on the gel, following equal loading (2.5 µl, i.e. no standardization or normalization), did not reflect the level of the biomarker in the same urine sample. Therefore high levels of protein observed on the PAGE gel did not correlate with high levels of the biomarkers. For example, IL-8 levels did not significantly correlate with the band density frequently observed at approximately 64-66 kDa (Figure 7).

Discussion
We have presented evidence that the high prevalence of proteinuria in hematuric patients introduces a caveat with respect to using protein as the standardizer of urinary biomarker levels. The origin of proteins shed into the urine of patients with proteinuria is dependent on the specific disorder that the patient has [7]. Further, drugs which are often prescribed for hematuric patients, including nonsteroidal anti-inflammatory drugs and occasionally angiotensin-converting enzyme (ACE) inhibitors, can cause increases or decreases in proteinuria [32]. In certain renal diseases large proteins such as albumin leak into the urinary space and the amount of secreted protein very much depends on the specific disease [7]. Dipstick protein analysis detects predominantly albumin. Proteinuria is classified as selective when albumin is the major protein constituent [7]. Albumin is detected as a dense band at approximately 64 to 66 kDa on the SDS PAGE gel, indicating that the corresponding patients have selective proteinuria. In contrast, the patient with a dense band around 13 kDa may have non-selective proteinuria. There was a significant correlation, r = 0.802 (Pearson correlation; p<0.001), between the density of the albumin band quantified using Quantiscan® software and log10 average protein levels, but this, on its own, would not justify the classification of patients with proteinuria as having albuminuria.
This study has therefore demonstrated in three ways the caveats of protein normalization in patients with hematuria: protein levels are not homogeneous across diagnostic groupings in hematuric patients; there is intra-patient variability in protein levels in urine over time; and protein standardization reduced AUROCs for biomarkers previously demonstrated to be elevated in bladder cancer patients [24,25,26]. First, we have demonstrated that urinary protein levels were higher in patients with bladder cancer compared to those with no final diagnosis and that protein per se is associated with final diagnosis. Second, we found that standardization using protein resulted in the lowest AUROCs for each of the three bladder cancer diagnostic biomarkers. This indicates that biomarker differences between controls and bladder cancer patients can be attenuated following protein standardization. Third, we observed that protein levels were generally lower on follow-up, perhaps indicative of successful treatment. However, there were significant associations between the differences determined in each of the biomarkers when recruitment levels were subtracted from follow-up levels and the differences similarly determined in protein in the same samples. This was most evident in controls. These findings suggest that after treatment and/or recovery, protein levels decreased in the control sub-population to a greater extent than in the cancer patients. This finding would arise only if controls were not healthy; in case control studies with healthy controls, protein concentration would not be expected to be lower at the end of the study. The latter associations would support the use of protein normalization, particularly in controls.
However, in light of other findings, particularly considering that the lowest AUROCs were determined following protein normalization and the high prevalence of proteinuria in this patient population, this approach could bias true biomarker levels. These observations demonstrate that it is not appropriate to use protein standardization of urine samples in hematuric patient populations where proteinuria is often a co-morbidity. However, our findings cannot be extrapolated to patients who have proteinuria but are non-hematuric. Proteinuria has widespread causes including ureteric calculi, minimal change glomerulonephritis, diabetes, malaria and congestive heart failure [7,33]. Many of these pathologies present with hematuria [27].
Figure 3. Comparison between measured protein levels and protein dipstick analyses. Total protein levels (mg/ml) in urine were determined by Bradford assay (A 595 nm; Hitachi U2800 spectrophotometer) using Bovine Serum Albumin as standard. Dipstick analyses were undertaken using Aution Sticks 10EA. Analyses were interpreted using PocketChem (Arkray factory, Inc. Japan). Protein levels were plotted against dipstick results with the Y-axis reference line indicating the usual lower limit of sensitivity for urine dipstick testing (0.25 mg/ml). doi:10.1371/journal.pone.0053354.g003
Table 1. Urinary protein levels measured in 120 patients with hematuria were related to final diagnostic categories (ANOVA; p = 0.022). Subsequently, we carried out a one-way ANOVA with post-hoc Dunnett T3 analyses using log10-transformed protein data. Higher protein levels were measured in urine from patients diagnosed with bladder cancer in comparison to those with no diagnosis (p = 0.073). There were no significant differences between the protein levels measured in patients with confounding pathologies and levels measured in the urine from bladder cancer patients (p = 0.621) or between patients with no diagnosis and patients with confounding pathologies (p = 0.316). doi:10.1371/journal.pone.0053354.t001
Interestingly, differences in protein levels between recruitment and follow-up accounted for a significant amount of the differences in biomarker levels at the same time-points. Further, this relationship was strongest in the control samples. This reflects a close relationship between a disease status indicator, i.e. proteinuria, and IL-6, IL-8 and VEGF levels in patients with hematuria.
The persistent trend for researchers to normalize biomarker levels using protein [5,14,15,16] perhaps stems from the concept of equal loading in Western blot experiments, which has been carried through to biomarker studies and then, more recently, to proteomic screens. It is interesting that Chen et al (2010) achieved higher AUROCs for novel potential bladder cancer biomarkers using urine volumes rather than protein-normalized samples [5]. Our data suggest that in hematuric populations in which there is a high incidence of proteinuria, urine volume is likely to be a more accurate, and indeed a simpler, approach to standardization than applying protein as a denominator. The consequences of normalizing using protein in this study were that biomarker levels in patients with proteinuria were proportionately reduced. This approach therefore introduced bias. It might be prudent to consider proteinuria as a contraindication to protein-based standardization of urine in proteomic studies conducted in patients with hematuria. In other biomarker applications different confounding pathologies may play a role and our findings might not apply [7]. This is the first time that the effects of biomarker standardization have been compared across four methodologies simultaneously, i.e. uncorrected levels, creatinine, osmolarity and protein.
Standardization of the urinary biomarker levels using protein attenuated the data, reducing both the sensitivity and specificity of the biomarkers IL-6, IL-8 and VEGF. In this study, urinary creatinine and osmolarity levels were constant in patients over time. Since creatinine and osmolarity did not differ significantly across disease pathologies frequently diagnosed in hematuric patients, our data suggest that creatinine or osmolarity could be used to normalize urinary protein biomarkers. Further, osmolarity levels measured in this study predicted creatinine levels, supporting the notion that osmolarity and creatinine levels in urine are interchangeable. However, differences in IL-6, IL-8 and VEGF measured on recruitment and at follow-up were not significantly associated with differences in either of these standards. In this study we did not evaluate the efficacy of standardization based on 24 hour urine collections, which might be more accurate than the state measurements used in this study. This study provides no justification for normalization using either creatinine or osmolarity when they are determined as state measurements. Uncorrected IL-6, IL-8 and VEGF AUROC analyses were very similar to those normalized using osmolarity and creatinine. Therefore, it makes more sense to use uncorrected biomarker levels for biomarker studies in hematuric patients.
Our study provides evidence that urinary diagnostic biomarkers should be standardized by urine volume in hematuric patients where there is a high incidence of proteinuria. Since proteinuria is a common condition in patients with hypertension, ureteric calculi, minimal change glomerulonephritis, diabetes, malaria and congestive heart failure, our findings may have implications for a wide range of biomarker discovery, biomarker validation and quantitative proteomic studies investigating complex diseases.
Table 2. Urine samples were obtained on two visits, one on recruitment and a second at follow-up (median interval = 11 months; range 1 to 20 months), from 72 patients who had presented with hematuria. The mean difference between log10 protein levels decreased over time (p = 0.097). doi:10.1371/journal.pone.0053354.t002