Augmented Backward Elimination: A Pragmatic and Purposeful Way to Develop Statistical Models

Statistical models are simple mathematical rules derived from empirical data describing the association between an outcome and several explanatory variables. In a typical modeling situation statistical analysis often involves a large number of potential explanatory variables and frequently only partial subject-matter knowledge is available. Therefore, selecting the most suitable variables for a model in an objective and practical manner is usually a non-trivial task. We briefly revisit the purposeful variable selection procedure suggested by Hosmer and Lemeshow which combines significance and change-in-estimate criteria for variable selection and critically discuss the change-in-estimate criterion. We show that using a significance-based threshold for the change-in-estimate criterion reduces to a simple significance-based selection of variables, as if the change-in-estimate criterion is not considered at all. Various extensions to the purposeful variable selection procedure are suggested. We propose to use backward elimination augmented with a standardized change-in-estimate criterion on the quantity of interest usually reported and interpreted in a model for variable selection. Augmented backward elimination has been implemented in a SAS macro for linear, logistic and Cox proportional hazards regression. The algorithm and its implementation were evaluated by means of a simulation study. Augmented backward elimination tends to select larger models than backward elimination and approximates the unselected model up to negligible differences in point estimates of the regression coefficients. On average, regression coefficients obtained after applying augmented backward elimination were less biased relative to the coefficients of correctly specified models than after backward elimination. 
In summary, we propose augmented backward elimination as a reproducible variable selection algorithm that gives the analyst more flexibility in adapting model selection to a specific statistical modeling situation.


Introduction
Statistical modeling is concerned with finding a simple general rule to describe the dependency of an outcome on several explanatory variables. Such rules may be simple linear combinations, or more complex formulas involving product and non-linear terms. Generally, statistical models should fulfill two requirements. First, they should be valid, i.e., provide predictions with acceptable accuracy. Second, they should be practically useful, i.e., a model should allow one to derive conclusions such as 'how large is the expected change in the outcome if one of the explanatory variables changes by one unit'. In a typical modeling situation the analyst is confronted with a large number of potential explanatory variables, and selecting the most suitable ones for a model is usually a non-trivial task.
Statistical models are used in predictive as well as in etiologic research [1]. In the former, one is interested in a simple and well-interpretable rule in order to accurately predict an outcome of interest, while in the latter, the strength of an assumed relationship of a variable of interest, i.e., the exposure variable, with an outcome is investigated. Control of confounding by multivariable adjustment (or other techniques such as propensity scores) is crucial if such relationships are to be estimated from observational rather than from randomized intervention studies [2,3]. Thus, in both types of research valid and useful statistical models are needed.
Backward, forward, and stepwise variable selection algorithms are implemented in most regression software packages, and together with univariate screening they are the algorithms that are used most often to select variables in practice (see e.g. [4,5] Chapter 2). All these algorithms rely only on significance as a sufficient condition to include variables into a model. For example, univariate screening includes variables based on the significance of their associations with the outcome in univariate models, or backward elimination removes insignificant variables one-by-one from a model. An excellent, critical summary of standard variable selection methods can be found in Royston and Sauerbrei ([5], Chapter 2).
Hosmer and Lemeshow proposed the 'purposeful selection algorithm' [6,7] which combines significance and change-in-estimate criteria [8][9][10][11][12] for selecting explanatory variables for a final model and is particularly attractive as it can be realized with standard software. Here, we will readopt the idea of combining significance and change-in-estimate criteria, and we will suggest a simple approximation to quantify the change-in-estimate from which a hypothesis test on the change-in-estimate can be directly derived.
The remainder of the manuscript is organized as follows: the Methods section will first discuss the change-in-estimate criterion and selection by significance. Later, we will present a new proposal for an efficient algorithm, denoted as augmented backward elimination (ABE), combining both criteria. A SAS macro incorporating the ABE algorithm will be introduced [13,14]. The subsequent section summarizes results of a simulation study to evaluate the algorithm. Aspects of application of ABE are discussed by means of a study of progression of chronic kidney disease, including the use of resampling methods for confidence interval estimation and for assessing model stability.

The Change-In-Estimate Criterion Revisited
We denote by $\delta_p^{-a}$ the change-in-estimate, i.e., the change in a regression coefficient $\beta_p$ caused by removal of a variable $X_a$ from an arbitrary linear statistical model with $k$ explanatory variables $X_1, \ldots, X_k$; $p, a \in \{1, \ldots, k\}$; $p \neq a$. (The indices $p$ and $a$ refer to the roles of $X_p$ and $X_a$ in $\delta_p^{-a}$ as the 'passive' and 'active' variables, respectively.) Instead of refitting the model with $X_a$ omitted, we propose to approximate the change-in-estimate, using the estimates $\hat\beta_p$ and $\hat\beta_a$, their covariance $\hat\sigma_{pa}$, and the variance of $\hat\beta_a$, $\hat\sigma_a^2$, as

$$\hat\delta_p^{-a} = -\hat\beta_a \frac{\hat\sigma_{pa}}{\hat\sigma_a^2}.$$

This approximation is motivated by considering $\hat\beta_p$ and $\hat\beta_a$ as random variables with variances $\hat\sigma_p^2$ and $\hat\sigma_a^2$ and covariance $\hat\sigma_{pa}$. The slope of a regression of $\hat\beta_p$ on $\hat\beta_a$, which denotes the expected change in $\hat\beta_p$ if $\hat\beta_a$ is augmented by 1, is then given by $\hat\sigma_{pa}/\hat\sigma_a^2$. Since we would like to approximate what happens if $\hat\beta_a = 0$, i.e., if $\hat\beta_a$ is subtracted from $\hat\beta_a$, we multiply the slope by $-\hat\beta_a$. The approximation not only speeds up the evaluation of the change-in-estimate considerably, but also allows one to directly assess the 'significance' of the change-in-estimate, i.e., to test for collapsibility of the models including and excluding $X_a$. (If $\hat\sigma_{pa} = 0$, then $\hat\delta_p^{-a} = 0$ irrespective of $\hat\beta_a$, and elimination of $X_a$ will not cause a change in $\hat\beta_p$. This case can only be expected in analyses of experiments with factorial designs by linear models, a situation where variable selection is not considered.) Under the null hypothesis of $\beta_a = 0$, equivalent to $\delta_p^{-a} = 0$, variable selection based on significance-testing will control the probability of falsely selecting $X_a$ approximately at the nominal type I error rate. However, the change-in-estimate criterion is usually evaluated using a pre-specified minimum value of $\delta_p^{-a}$ or $\delta_p^{-a}/\beta_p$ as a threshold for leaving $X_a$ in a model [10,12,15], and thus the probability of a false selection of $X_a$ is not controlled. This probability is rather associated with the standard error of $\hat\delta_p^{-a}$, which is higher in smaller samples compared to larger ones.
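The approximation described above (the slope $\hat\sigma_{pa}/\hat\sigma_a^2$ multiplied by $-\hat\beta_a$) can be checked numerically. The following is a minimal sketch, not the authors' SAS implementation: it uses ordinary least squares on simulated toy data, and the helper names `invert`, `ols` and `approx_change` as well as the data-generating values are illustrative assumptions. Because only the ratio of covariance to variance enters, the unscaled matrix $(X'X)^{-1}$ is used in place of the full covariance matrix.

```python
import random


def invert(A):
    """Gauss-Jordan inverse of a small square matrix given as a list of rows."""
    n = len(A)
    M = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [v / d for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


def ols(X, y):
    """Least squares fit; returns coefficients and (X'X)^{-1}."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    return beta, inv


def approx_change(beta, cov, p, a):
    """Approximated change in beta[p] when variable a is removed."""
    return -beta[a] * cov[p][a] / cov[a][a]


# Toy data (assumed values): intercept, x1 (passive), x2 (active, correlated with x1).
rng = random.Random(1)
rows, y = [], []
for _ in range(200):
    x1 = rng.gauss(0, 1)
    x2 = 0.6 * x1 + rng.gauss(0, 1)
    rows.append([1.0, x1, x2])
    y.append(1.0 + 0.5 * x1 + 0.8 * x2 + rng.gauss(0, 1))

beta_full, cov_full = ols(rows, y)
beta_reduced, _ = ols([[r[0], r[1]] for r in rows], y)

delta_approx = approx_change(beta_full, cov_full, p=1, a=2)
delta_refit = beta_reduced[1] - beta_full[1]   # change obtained by actually refitting
print(delta_approx, delta_refit)
```

In ordinary least squares the two numbers agree up to rounding, since the approximation coincides with the omitted-variable formula; in logistic or Cox regression it should be read as an approximation only. If the ratio $\hat\sigma_{pa}/\hat\sigma_a^2$ is treated as fixed, the standard error of the approximated change is $|\hat\sigma_{pa}|/\hat\sigma_a$, so its z-statistic has the same absolute value as that of $\hat\beta_a$, which is one way to see why testing the change-in-estimate mirrors testing $\beta_a = 0$.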
Despite this unfavorable property, the change-in-estimate criterion may still be useful to obtain a model which approximates the unselected model up to negligible differences in point estimates of the regression coefficients, but contains fewer variables. Another justification for incorporating the change-in-estimate criterion in variable selection is to avoid the tendency of purely significance-based selection to select only one out of several correlated variables.
Some authors used a relative criterion $|\delta_p^{-a}/\beta_p|$ with, e.g., 0.1 as the threshold value [6,10,12,15]. This definition may not be suitable if $\beta_p$ is close to zero. We propose the following criteria, which do not share this property, are suitably standardized and focus on the quantities of interest in a regression analysis:
1. In linear regression, regression coefficients depend on the scaling of the explanatory variable and on that of the outcome variable. Scale-independence is attained by evaluating $|\hat\delta_p^{-a}| \, SD(X_p)/SD(Y) \geq \tau$, where $SD(X_p)$ and $SD(Y)$ are the standard deviations of the passive explanatory variable $X_p$ and the outcome $Y$, respectively.
2. In logistic or Cox regression, interest lies in odds and hazard ratios, respectively. This leads us to the standardized criterion $|\exp(\hat\delta_p^{-a} \, SD(X_p)) - 1| \geq \tau$.
The threshold value $\tau$ could be set to, say, 0.05 but can be adapted to the specific modeling situation.
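The two standardized criteria can be written as small functions. This is a sketch under the assumption that the criteria take the form $|\delta| \cdot SD(X_p)/SD(Y)$ for linear regression and $|\exp(\delta \cdot SD(X_p)) - 1|$ for logistic and Cox regression, each compared with the threshold $\tau$; the function names and the example numbers are hypothetical.

```python
import math


def standardized_change_linear(delta, sd_xp, sd_y):
    """Change in the standardized regression coefficient of the passive variable."""
    return abs(delta) * sd_xp / sd_y


def standardized_change_ratio(delta, sd_xp):
    """Relative change in the odds or hazard ratio per standard deviation of X_p."""
    return abs(math.exp(delta * sd_xp) - 1.0)


def exceeds_threshold(value, tau=0.05):
    """True if the active variable would be retained because of the change-in-estimate."""
    return value >= tau


# Assumed example values: a change of 0.02 in a linear model coefficient,
# and a change of 0.03 in a log hazard ratio, both for SD(X_p) = 2.
linear_value = standardized_change_linear(delta=0.02, sd_xp=2.0, sd_y=1.0)
ratio_value = standardized_change_ratio(delta=0.03, sd_xp=2.0)
print(linear_value, exceeds_threshold(linear_value))
print(ratio_value, exceeds_threshold(ratio_value))
```

Because both quantities are expressed per standard deviation of $X_p$, rescaling $X_p$ (for example from years to decades) leaves them unchanged, which is the point of the standardization.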
Usually, the individual explanatory variables play different roles (e.g., exposure variable of interest, important adjustment variable, less important adjustment variable) and this should be reflected in the selection process. We have identified three specific roles of explanatory variables, which may require different handling when evaluating the change-in-estimate criterion: 'passive or active', 'only passive', and 'only active'.

Variable Selection Based on Significance
Most variable selection procedures that are used in practice only rely on significance, e.g., univariate screening, forward selection, stepwise selection, and backward elimination (BE). Literature suggests that BE procedures with a mild significance level criterion, e.g., $\alpha = 0.2$, are superior to other approaches with regard to bias and root mean squared error of regression coefficients [8,9,16,17]. BE has a tendency to underselect important confounders [18], because it ignores variables with a strong association with the exposure, but a relatively weak association with the outcome conditional on the exposure. Royston and Sauerbrei also distinguish between 'BE only' and BE with additional forward steps, in which variables that have already been excluded at earlier iterations are reconsidered for inclusion [5]. They conclude that re-inclusion after exclusion rarely occurs. Therefore, we consider BE-only with a significance criterion of $\alpha = 0.2$ as the consensus method for significance-based variable selection. There is no statistical justification for univariate screening ([17,19], Chapter 4.4). Forward selection may sometimes be preferred over BE for practical reasons, e.g., in very high-dimensional variable selection problems. Stepwise selection, e.g., as implemented in SAS procedures [13], is essentially a forward selection with additional backward steps.

The Initial Working Set of Variables
For estimating etiologic models a priori information should be used to define the initial working set of variables to consider during statistical modeling. This a priori information can often be represented by a directed acyclic graph (DAG) which reflects the conditional dependencies of variables [20,21]. DAGs prompt the analyst to carefully question the causal relationship between all explanatory variables in a model, and they allow one to identify the role of each variable: either as a confounder, a mediator, a variable unrelated to the causal relationship of interest [22], or, incorporating the possibility of unmeasured quantities, a collider [23]. Finally, only variables assumed to be confounders, i.e., variables which are possibly associated with the outcome and with the exposure variable of interest, but which are not on the causal pathway from the exposure to the outcome, should be included for multivariable adjustment. Application of such causal diagrams requires that the analyst knows how the explanatory variables are causally related to each other [24]. However, in many areas of research such knowledge is hardly available or at least very uncertain. For prognostic modeling situations the initial set of variables will be selected based on other reasons, like future availability, the costs of collecting these variables, the reliability of measurements, or the possibility of measurement errors.

Variable Selection Based on Significance and Change-In-Estimate
In summary, we propose to use BE augmented with a standardized change-in-estimate criterion on the quantity of interest for variable selection. We will denote this algorithm as 'augmented backward elimination' (ABE). The algorithm is briefly outlined in Figure 1. The ABE algorithm has been implemented in a SAS macro [13], which is described in more detail in a Technical Report [14]. The SAS macro can handle continuous, binary and time-to-event outcomes by implicitly applying linear regression using PROC REG, logistic regression using PROC LOGISTIC, or Cox proportional hazards regression using PROC PHREG, respectively. Basically, the ABE macro only needs the following specifications:
- Type of model (linear, logistic or Cox)
- Name of the outcome variable
- Names of the explanatory variables from the initial working set
- Roles of explanatory variables from the initial working set ('passive or active', 'only passive', 'only active')
- Significance threshold $\alpha$ (default: 0.2)
- Change-in-estimate threshold $\tau$ (default: 0.05)
Setting $\tau = \infty$ (i.e., to a very large number) turns off the change-in-estimate criterion, and the macro will only perform BE. On the other hand, the specification of $\alpha = 0$ will include variables only because of the change-in-estimate criterion, as then variables are not safe from exclusion because of their p-values. Specifying $\alpha = 1$ will always include all variables.
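The interplay of the two thresholds and the variable roles can be sketched in a few lines of Python. This is an illustrative reading of the procedure for linear regression only, not a port of the SAS macro: Figure 1 is not reproduced here, so the stepping order (re-evaluating the least significant removable variable first) is an assumption, as are the role labels `'both'`, `'passive'` and `'active'`, the helper names, the toy data, and the use of a normal approximation for the Wald p-values.

```python
import math
import random
import statistics


def invert(A):
    """Gauss-Jordan inverse of a small square matrix given as a list of rows."""
    n = len(A)
    M = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [v / d for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


def fit(X, y, cols):
    """OLS on the chosen columns; returns coefficients and their covariance matrix."""
    Xs = [[row[c] for c in cols] for row in X]
    k = len(cols)
    xtx = [[sum(r[i] * r[j] for r in Xs) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(Xs, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(Xs, y))
    s2 = rss / (len(y) - k)
    return beta, [[s2 * v for v in row] for row in inv]


def abe(X, y, names, roles, alpha=0.2, tau=0.05):
    """Column 0 of X is the intercept; roles maps names to 'both', 'passive' or 'active'."""
    sd_x = [statistics.stdev([row[c] for row in X]) if c else 0.0
            for c in range(len(names))]
    sd_y = statistics.stdev(y)
    model = list(range(len(names)))
    while True:
        beta, cov = fit(X, y, model)
        # Removal candidates: non-significant variables that are allowed to be dropped.
        blacklist = []
        for i, c in enumerate(model):
            if c == 0 or roles[names[c]] == "passive":
                continue
            z = beta[i] / math.sqrt(cov[i][i])
            p_value = math.erfc(abs(z) / math.sqrt(2.0))  # normal approximation
            if p_value > alpha:
                blacklist.append((p_value, i))
        blacklist.sort(reverse=True)
        for _, a in blacklist:
            # Keep the candidate if dropping it shifts any monitored coefficient by >= tau.
            relevant = any(
                abs(beta[a] * cov[p][a] / cov[a][a]) * sd_x[c] / sd_y >= tau
                for p, c in enumerate(model)
                if p != a and c != 0 and roles[names[c]] != "active"
            )
            if not relevant:
                del model[a]
                break          # refit without this variable
        else:
            return [names[c] for c in model if c != 0]


# Toy data (assumed): x3 is pure noise, x1 and x2 are correlated and both matter.
rng = random.Random(7)
names = ["const", "x1", "x2", "x3"]
X, y = [], []
for _ in range(100):
    x1, x3 = rng.gauss(0, 1), rng.gauss(0, 1)
    x2 = 0.5 * x1 + rng.gauss(0, 1)
    X.append([1.0, x1, x2, x3])
    y.append(0.5 * x1 + 0.5 * x2 + rng.gauss(0, 1))

roles = {"x1": "both", "x2": "both", "x3": "both"}
selected = abe(X, y, names, roles)
print(selected)
```

The limiting cases described in the text can be reproduced with this sketch: `alpha=1.0` keeps every variable, while `alpha=0.0` together with `tau=float("inf")` removes every variable that is not declared 'passive'.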
We agree with Hosmer and Lemeshow's position that any automated algorithm only suggests a preliminary final model. Such a model should be critically evaluated for possible extensions such as non-linear and non-additive (interaction) effects ([6], Chapter 5.2). Alternatively to the post-hoc inclusion of some transformations of continuous variables to allow for the estimation of non-linear effects, one could first apply an algorithm like 'multivariable fractional polynomials' (MFP) which simultaneously selects variables and determines their functional form by appropriate transformations [5]. Then ABE could be applied by including the possibly transformed continuous variables and all other selected variables as 'only passive' variables, and any further variables which were not selected by MFP could be entered as 'passive or active' variables. It should be mentioned that specifying a significance criterion of $\alpha = 0.2$ does not mean that the model itself or all its regression coefficients are significant at level 0.2. Simulations have shown that the actual significance levels of models derived by any variable selection procedure are usually much higher than the reported levels [25,26]. Likewise, one should be aware that the actual confidence levels of the reported confidence intervals in the final model are often less than the nominal ones. Additionally, performance measures of the model such as $R^2$ or area under the receiver operating characteristic curve are likely to be overestimated, i.e., too optimistic, if directly computed from the final estimates [27]. These phenomena are usually not dramatic if the sample size is large enough compared to the number of variables considered, e.g., if the effective sample size is at least 20 to 50 times the number of variables considered in the initial set. However, they can lead to wrong conclusions in other cases if not appropriately corrected [28].
Since the algorithm is available in a macro, it can easily be applied to bootstrap resamples or subsets of the data at hand, which allows one to derive bootstrap confidence intervals for the regression coefficients (usually wider than their model-based counterparts), or to perform cross-validation to obtain optimism-corrected performance measures. In such analyses, the algorithm is applied to the resamples or subsets without any changes in the parameter settings. It may then result in different final models than obtained in the original analysis, and the final models may even differ between resamples or subsets. Thus, such analyses account for the variation in estimated regression coefficients that is produced by the uncertainty of variable selection in a data set, and they validate the model development strategy rather than the model itself. Later, we will demonstrate the difference between model-based and bootstrap standard errors by means of a real-life example.
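The resampling idea can be sketched generically: the same selection procedure, with unchanged settings, is rerun on each bootstrap resample and the selected variables are tallied. In the sketch below the function `bootstrap_inclusion`, the stand-in selector `toy_select` and the toy data are hypothetical; in a real analysis the selector would be the full ABE or BE procedure.

```python
import random
from collections import Counter


def bootstrap_inclusion(data, select, n_boot=1000, seed=0):
    """Fraction of bootstrap resamples in which each variable is selected.

    `select` is any function mapping a list of observations to a list of
    selected variable names; it is called with identical settings each time.
    Variables that are never selected do not appear in the result.
    """
    rng = random.Random(seed)
    n = len(data)
    counts = Counter()
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]  # draw with replacement
        counts.update(set(select(resample)))
    return {name: count / n_boot for name, count in counts.items()}


def toy_select(rows):
    """Stand-in for a selection algorithm: keep variables whose mean is clearly non-zero."""
    return [v for v in rows[0]
            if abs(sum(r[v] for r in rows) / len(rows)) > 0.1]


rng = random.Random(3)
rows = [{"x1": 1.0 + rng.gauss(0, 1), "x2": rng.gauss(0, 1)} for _ in range(50)]
frequencies = bootstrap_inclusion(rows, toy_select, n_boot=200, seed=1)
print(frequencies)
```

Collecting the coefficient estimates from each resample in the same loop (with zero for unselected variables, or restricted to resamples where the variable was selected) would give the bootstrap standard errors mentioned above; which convention to use is a design choice that should be reported.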

Simulation Study
We evaluated the proposed ABE procedure and compared it to BE, no selection and variable selection based on background knowledge in the setting of an etiologic study. Analyses comprised continuous, binary and time-to-event outcomes and were carried out using our SAS macro ABE.
We simulated seven normally distributed potential explanatory variables $X_1, \ldots, X_7$ among which $X_1$ was the exposure variable of main interest. A latent outcome variable was defined as $Y^* = \beta_1 X_1 + X_2 + X_4 + X_7$. The covariance structure of $X_1, \ldots, X_7$ was defined such that omission of $X_2$, $X_4$ or $X_7$, or false inclusion of $X_3$ could induce bias into the estimate of $\beta_1$, and that a pre-specified variance inflation factor (VIF) of $X_1$ given $X_2, \ldots, X_7$ was attained. From $Y^*$ we generated continuous, binary and time-to-event variables to simulate linear, logistic and Cox regression, respectively. Further simulation parameters were set such that we obtained approximately equal sampling distributions of $\hat\beta_1$ in these three types of regression analyses.
Specifically, $X_2$, $X_3$, $X_4$ and $X_5$ were drawn from a multivariate normal distribution with a mean vector of 0, standard deviations of 1 and bivariate correlation coefficients of 0.5. $X_6$ and $X_7$ were independently drawn from a standard normal distribution. $X_1$ depended on $X_2$, $X_3$ and $X_6$ and was simulated from the equations $X_1 = 0.266(X_2 + X_3 + X_6) + 0.710E$ for scenarios with VIF = 2 and $X_1 = 0.337(X_2 + X_3 + X_6) + 0.449E$ for VIF = 4, where $E$ was a random number drawn from a standard normal distribution. The latent outcome variable was defined as $Y^* = \beta_1 X_1 + X_2 + X_4 + X_7$. Subsequently, we generated continuous, binary and time-to-event outcome variables $Y_C$, $Y_B$ and $Y_T$ from $Y^*$ to simulate linear, logistic and Cox regression, respectively. In particular, $Y_C$ was drawn from a normal distribution with mean $Y^*$ and standard deviation 0.36. $Y_B$ was drawn from a Bernoulli distribution with event probability $1/(1 + \exp[-Y^*])$. The overall expected event probability was approximately 0.5. Weibull distributed survival times $T$ were generated from $Y^*$ and a standard uniform random variable $U$ [29]. To obtain approximately 55% censoring (averaged over all scenarios), follow-up times $F$ were drawn from a uniform $U[0, 3.35]$ distribution, and the observable survival time and status indicators were defined as $Y_T = \min(T, F)$ and $S_T = I(T > F)$, respectively. For Cox regression, all covariates were divided by 2. These definitions guaranteed that the sampling standard deviations of estimates of $\beta_1$ from linear, logistic and Cox regression in the scenarios with VIF = 2 and $\beta_1 = 1$ were approximately equal when the models were specified correctly.
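The data-generating mechanism for the covariates and for the continuous and binary outcomes can be sketched as follows. The coefficients 0.266/0.710 and 0.337/0.449, the correlation of 0.5 and the standard deviation 0.36 are taken from the description above; the function name `simulate_sample`, the shared-factor construction of the equicorrelated variables and the dictionary layout are assumptions. The time-to-event outcome is left out because the formula for the Weibull survival times is not given in the text.

```python
import math
import random


def simulate_sample(n=120, beta1=1.0, vif=2, seed=None):
    """Generate one simulated data set as a list of dictionaries."""
    if vif == 2:
        slope, noise = 0.266, 0.710
    elif vif == 4:
        slope, noise = 0.337, 0.449
    else:
        raise ValueError("only VIF = 2 or VIF = 4 are described")
    rng = random.Random(seed)
    root_half = math.sqrt(0.5)
    sample = []
    for _ in range(n):
        # A shared standard normal factor gives unit variances and pairwise correlation 0.5.
        shared = rng.gauss(0, 1)
        x2, x3, x4, x5 = [root_half * shared + root_half * rng.gauss(0, 1)
                          for _ in range(4)]
        x6, x7 = rng.gauss(0, 1), rng.gauss(0, 1)
        x1 = slope * (x2 + x3 + x6) + noise * rng.gauss(0, 1)
        y_star = beta1 * x1 + x2 + x4 + x7              # latent outcome
        y_c = rng.gauss(y_star, 0.36)                   # continuous outcome
        p_event = 1.0 / (1.0 + math.exp(-y_star))
        y_b = 1 if rng.random() < p_event else 0        # binary outcome
        sample.append({"x1": x1, "x2": x2, "x3": x3, "x4": x4, "x5": x5,
                       "x6": x6, "x7": x7, "y_c": y_c, "y_b": y_b})
    return sample


sample = simulate_sample(seed=1)
print(len(sample), sum(r["y_b"] for r in sample) / len(sample))
```

Repeating the call with different seeds, for each combination of `beta1` and `vif`, would give the kind of factorial simulation design described in the following paragraph.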
In a factorial design we simulated 1000 samples of 120 observations for each combination of true $\beta_1$ (either 0 or 1), VIF (2, 4) and type of regression (linear, logistic or Cox). If $\beta_1 = 1$ and VIF = 2, this sample size gave a power of 50% to reject the null hypothesis $\beta_1 = 0$ at a two-sided significance level of 5% in all three types of regression, when the model was specified correctly. (In other words, in such models the expected p-value for this hypothesis was 0.05.) Each sample was analyzed by a regression on all explanatory variables without selection, applying ABE with $\tau = 0.05$ or 0.1 and $\alpha = 0.05$ or 0.2, and applying BE with $\alpha = 0.05$ or 0.2. Unselected, ABE and BE analyses were then repeated by applying the disjunctive cause criterion [30] assuming that causal relationships between the variables $X_1, \ldots, X_7$ and their likely effects on the outcome were known, which means that $X_5$ was eliminated from the scope of explanatory variables to consider.
For these evaluations, we used correctly specified models as 'benchmark', i.e., those containing $X_1$, $X_2$, $X_4$ and $X_7$ without further selection. We computed the bias and root mean squared error (RMSE) of unselected models, BE and ABE relative to the mean $\hat\beta_1$ from such correctly specified models.

Simulation Results
While the full results of our simulation study are contained in a Technical Report [14], the relative behavior of modeling by ABE, BE or by applying no variable selection can already be understood from the results selected for Table 1. In general, we found that no selection and ABE selection lead to less biased estimates of the exposure effect than BE. The bias of ABE is small in absolute terms (usually around 1-2% and only in logistic regression 6%) and never exceeds the bias of no selection. The bias of BE with $\alpha = 0.2$, although slightly larger, is still acceptable for most practical purposes. BE with $\alpha = 0.2$ has some advantages with respect to RMSE compared to ABE and no selection. The RMSE of ABE is slightly smaller than that of no selection in linear regression, and both procedures yield virtually identical RMSEs in logistic and Cox regression. These observations can be explained by comparing the proportion of selecting 'inflated' and 'biased' models, i.e., models in which noise variables were falsely included or important variables were falsely excluded, respectively. Unselected models always contain such noise variables. In 39-64% of the simulated data sets for linear regression, and in 2-20% of the simulated data sets for logistic and Cox regression, ABE manages to identify and exclude those noise variables but occasionally also eliminates some of the important variables (25-29% in linear regression, 1-7% in logistic and Cox regression). By contrast, BE excludes noise variables more often (54-73%), which likely explains its RMSE advantages. Note that despite BE's nominal significance level of $\alpha = 0.2$, the probability of false inclusion of at least one noise variable lies in the range of 27-46% in our setting. Important variables are frequently missed by BE (33-40% in linear regression, 7-17% in logistic regression, 19-28% in Cox regression), and this causes a slightly higher bias.
In additional simulations which are only reported in the Technical Report [14], we found that lowering the significance level in BE to $\alpha = 0.05$ further increases BE's bias since important adjustment variables are more frequently missed, and this also causes a modest increase in RMSE. Furthermore, increasing the change-in-estimate threshold of ABE to $\tau = 0.1$ makes ABE more similar to BE, i.e., bias is increased but RMSE slightly decreased. With smaller samples, bias and RMSE are generally more inflated with all methods. Finally, incorporating background knowledge into variable selection improves bias and RMSE for all investigated selection procedures. Thus, we conclude that in the scenarios studied, application of ABE with the proposed settings for $\alpha$ and $\tau$ is at least as safe as application of BE with regard to bias, and is at least as good as, but often better than, including all available variables from the initial set for adjustment with regard to bias and RMSE.

Here we want to elucidate the effect of different levels of $U_{OSM}$, the exposure of interest, on the cause-specific hazard. Consequently, patients who died within follow-up but before initiating dialysis are considered as censored [32]. We used the logarithm to base 2 of $U_{OSM}$ for all modeling because of its skewed distribution. Based on a priori knowledge nine explanatory variables measured at baseline are considered as potential confounders: $\log_2$ of creatinine clearance (ml/min), $\log_2$ of proteinuria (g/L), presence of polycystic kidney disease, whether or not beta-blockers, diuretics, or angiotensin-converting enzyme inhibitors and angiotensin II type 1 receptor blockers (ACEI/ARBs) were used, age in decades, gender, and mean arterial pressure (mmHg) (Table 2). We assume that all these variables fulfill the disjunctive cause criterion for selection of potential confounders, i.e., all variables are either a possible cause of the exposure or a possible cause of the outcome.
The largest absolute correlation occurred between $U_{OSM}$ and creatinine clearance (0.59), followed by three correlation coefficients slightly above 0.30 (use of diuretics and age; creatinine clearance and age; proteinuria and ACEI/ARBs). The final model should be as simple as possible and should not include irrelevant variables. BE with a significance threshold $\alpha$ of 0.2 selects six of the ten variables from the initial set into the final model (Table 3). Figure 2 (first row, left column) shows the sensitivity of the absolute standardized regression coefficient of $U_{OSM}$ to the choice of $\alpha$. Model stability was assessed by inclusion frequencies of each variable in 1000 bootstrap resamples, each analyzed with BE and $\alpha = 0.2$. All explanatory variables selected into the (original) final model by BE are selected in at least 60% of all bootstrap resamples. Figure 3 (first row) shows the number of selected variables in the 1000 models of the bootstrap resamples. In 60% of the bootstrap resamples six to seven variables were selected. The sensitivity of the bootstrap inclusion frequencies to the choice of significance threshold $\alpha$ is shown in Figure 3 (first row, right column).

Example
Applying ABE with $\alpha = 0.2$ and $\tau = 0.05$ additionally selects ACEI/ARB use, since its removal changes the standardized hazard ratio of proteinuria by more than 5%. ACEI/ARB use is included in almost 50% of all bootstrap resamples. Figure 3 (second row) also shows that ABE tends to select slightly more variables than BE. From a medical point of view the inclusion of ACEI/ARBs into the model can be explained, as ACEI/ARBs inhibit the activity of the renin-angiotensin-aldosterone system (RAAS), which controls fluid and electrolyte balance through effects on the heart, blood vessels and the kidneys, and have been shown to be renoprotective and slow the progression of chronic nephropathies [33]. Angiotensin II, the main effector of the RAAS, exerts a vasoconstrictory effect on postglomerular arterioles, increasing glomerular hydraulic pressure and ultrafiltration of plasma proteins. Additionally, Angiotensin II has been linked to sustained cell growth, inflammation and fibrosis, which have also been associated with accelerated renal damage. Confidence limits and p-values given in Table 3 do not reflect model uncertainty and hence are likely to underestimate the variability of regression coefficients. Table 4 shows bootstrap standard errors for $U_{OSM}$ which are clearly higher than their model-based counterparts. Robust standard errors correct some but not all of the uncertainty induced by model selection and may be a good compromise if full resampling cannot be applied [34].
Up to now, all variables from the initial set were used as 'passive or active' variables when evaluating the change-in-estimate criterion. If required, we could define only $U_{OSM}$ as 'passive or active' and treat all other explanatory variables as 'only active'. Then only explanatory variables which reach the significance threshold $\alpha$ or change the standardized hazard ratio of $U_{OSM}$ by more than $\tau$ will be selected into the final model. Applying ABE with such redefined roles of variables and with $\alpha = 0.2$ and $\tau = 0.05$ gives the same final model as selected by BE with $\alpha = 0.2$.

Discussion
In biomedical research we are often confronted with complex statistical modeling problems involving a large number of potential explanatory variables and only restricted prior knowledge about their relationships. Therefore, practical and reproducible approaches to statistical modeling are needed.
The first step in finding a practically useful statistical model should always be a careful pre-selection of explanatory variables based on subject-matter knowledge. Often this is the most important prerequisite for any analytical modeling steps to follow. If enough subject-matter knowledge is available causal diagrams may be of help. However, causal diagrams are always based on expert knowledge and opinions and their construction may sometimes not be universally reproducible. This may motivate the careful use of a reproducible data-driven variable selection procedure.
Based on our evaluation of unselected models, ABE and BE, we recommend ABE for development of statistical models when there is only little guidance on which variables to include. Compared to BE, ABE more often avoids bias due to the false exclusion of an important confounding variable. Compared to no variable selection, ABE frequently supplies smaller and thus practically more useful models but with no detrimental consequences on bias or RMSE. By construction, ABE models essentially show only negligible differences compared to unselected models including all candidate variables. In practice, this may be important to demonstrate to reviewers and readers of a research report that all relevant confounders are accounted for. Our proposal for standardization of the change-in-estimate criterion employed by ABE focuses on the quantity of interest in a given type of regression analysis (regression coefficients, hazard ratios or odds ratios). It also considers the scaling of the variables, such that its results are invariant to linear transformations of variables. ABE can be adapted to the statistical modeling problem at hand, by defining the role and thus the importance of each candidate explanatory variable. Our approximation of the change-in-estimate shows that a 'significant' change-in-estimate always results if the variable in question has a significant effect on the outcome. Thus, if 'false positive' selections are to be avoided, a simple significance-based selection such as BE is the method of choice. However, even though ABE and other data-driven variable selection methods may be useful statistical tools, they should not be a replacement for careful thinking about possible causal relationships.

Table 4. Model-based standard error, robust standard error and standard error based on 1000 bootstrap resamples for models selected with backward elimination (BE) and augmented backward elimination (ABE). Abbreviations and symbols: $\alpha$, significance threshold; ABE, augmented backward elimination; BE, backward elimination; $\tau$, change-in-estimate threshold; SE, standard error; $U_{OSM}$, urine osmolarity (mosm/L). doi:10.1371/journal.pone.0113677.t004

Whenever data-dependent variable selection is conducted, reported standard errors and confidence limits understate the true uncertainty of regression coefficients and derived quantities (hazard or odds ratios). We have demonstrated how to use resampling-based methods to obtain more reliable interval estimates and to evaluate model stability.
We have written a SAS macro ABE, which implements augmented backward elimination for linear, logistic and Cox regression. By means of a simulation and an analysis of a biomedical study, we evaluated the ABE algorithm and its implementation in SAS. Depending on the settings of the parameters of ABE (significance threshold $\alpha$, change-in-estimate threshold $\tau$ and roles of candidate explanatory variables), the number of variables in the final model selected by ABE will be between the number of variables selected by BE and the total number of variables. Based on our simulations and practical experiences with ABE, we suggest using a significance threshold of $\alpha = 0.2$ and a change-in-estimate threshold of $\tau = 0.05$. The SAS macro ABE is freely available under a General Public License (GPL) at: http://cemsiis.meduniwien.ac.at/en/kb/science-research/software/statistical-software/abe/.

Supporting Information
Materials S1. SAS code to reproduce the simulation study and the analysis of the urine osmolarity example. doi:10.1371/journal.pone.0113677.s001 (ZIP)