Figure 1.
Plate representation of the stochastic process used to generate the simulation study data.
Circles represent stochastic variables. Double circles represent functions of the stochastic variables. For each simulation, we generate samples. For each sample, we simulate correlated features conforming to a covariance matrix parameterized by the correlation parameter. The response is generated from a linear regression model with residual variance set to 1. The indicator variables control which features influence the response, according to the inclusion probability (which controls the saturation of the model). The regression coefficients are obtained by re-scaling, in order to control the signal-to-noise ratio. See the Methods section for details, including the distributional assumptions associated with these quantities.
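As an illustration only, the generative steps named in this caption can be sketched in Python with NumPy. The caption does not state the distributions or the exact covariance structure (those are in the Methods section), so the sketch makes several loudly labeled assumptions: an AR(1)-style covariance `rho ** |i - j|`, standard normal raw effects, and a signal-to-noise ratio defined as the variance of the linear predictor divided by the residual variance. The function name `simulate_dataset` and all argument names are hypothetical.

```python
import numpy as np

def simulate_dataset(n, p, rho, incl_prob, snr, rng):
    """Draw one dataset following the steps listed in the Figure 1 caption."""
    # ASSUMPTION: AR(1)-style covariance, Sigma[i, j] = rho ** |i - j|.
    idx = np.arange(p)
    sigma = rho ** np.abs(idx[:, None] - idx[None, :])
    x = rng.multivariate_normal(np.zeros(p), sigma, size=n)

    # Indicator variables select which features influence the response.
    gamma = rng.binomial(1, incl_prob, size=p)

    # ASSUMPTION: raw effect sizes are standard normal before re-scaling.
    raw = rng.standard_normal(p) * gamma

    # Re-scale so that beta' Sigma beta / residual_var equals snr
    # (ASSUMED definition of the signal-to-noise ratio).
    residual_var = 1.0
    signal_var = raw @ sigma @ raw
    if signal_var > 0:
        beta = raw * np.sqrt(snr * residual_var / signal_var)
    else:
        beta = raw  # no active features: the response is pure noise

    y = x @ beta + rng.normal(0.0, np.sqrt(residual_var), size=n)
    return x, y, beta, gamma, sigma

rng = np.random.default_rng(0)
x, y, beta, gamma, sigma = simulate_dataset(
    n=200, p=20, rho=0.5, incl_prob=0.3, snr=2.0, rng=rng
)
```

Under these assumptions, a larger `incl_prob` yields a more saturated model, while the re-scaling step keeps the signal-to-noise ratio fixed regardless of how many features are active.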
Figure 2.
Scatter plots comparing the MSE scores produced by ridge-regression, lasso, and elastic-net.
Panel a shows the comparison of ridge-regression vs lasso, panel b compares ridge-regression vs elastic-net, and panel c compares lasso vs elastic-net.
Figure 3.
Distributions of the absolute performance response for ridge-regression across 10 equally spaced bins of the parameter ranges.
The x-axis shows the parameter ranges covered by each of the 10 bins. The y-axis shows the absolute performance response. The red horizontal line represents the median of the response distribution. The dotted line is set at zero.
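The binning described in this caption can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the caption only states that the bins are equally spaced, so the bin edges here are assumed to span the observed minimum and maximum of the parameter, and the data are synthetic placeholders. The names `median_by_bin`, `param`, and `response` are hypothetical.

```python
import numpy as np

def median_by_bin(param, response, n_bins=10):
    """Split `param` into equally spaced bins and take the response median per bin."""
    # ASSUMPTION: edges span the observed range of the parameter.
    edges = np.linspace(param.min(), param.max(), n_bins + 1)
    # Use interior edges only, so indices run from 0 to n_bins - 1
    # and the maximum value falls in the last bin.
    bin_idx = np.digitize(param, edges[1:-1])
    medians = np.array([
        np.median(response[bin_idx == b]) if np.any(bin_idx == b) else np.nan
        for b in range(n_bins)
    ])
    return edges, bin_idx, medians

rng = np.random.default_rng(1)
param = rng.uniform(0.0, 1.0, size=1000)     # placeholder simulation parameter
response = rng.normal(0.0, 1.0, size=1000)   # placeholder performance response
edges, bin_idx, medians = median_by_bin(param, response)
```

Each entry of `medians` summarizes the response distribution within one bin, which is the kind of per-bin summary the distribution plots display.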
Figure 4.
Interaction plots for ridge-regression.
The values of the interaction test statistics are shown at the top of the figures.
Table 1.
Ridge-regression.
Figure 5.
Distributions of the absolute performance response for lasso across 10 equally spaced bins of the parameter ranges.
The x-axis shows the parameter ranges covered by each of the 10 bins. The y-axis shows the absolute performance response. The red horizontal line represents the median of the response distribution. The dotted line is set at zero.
Figure 6.
Interaction plots for lasso.
The values of the interaction test statistics are shown at the top of the figures.
Table 2.
Lasso.
Figure 7.
Distributions of the absolute performance response for elastic-net across 10 equally spaced bins of the parameter ranges.
The x-axis shows the parameter ranges covered by each of the 10 bins. The y-axis shows the absolute performance response. The red horizontal line represents the median of the response distribution. The dotted line is set at zero.
Figure 8.
Interaction plots for elastic-net.
The values of the interaction test statistics are shown at the top of the figures.
Table 3.
Elastic-net.
Figure 9.
Distributions of the relative performance response in the ridge-regression vs lasso comparison across 10 equally spaced bins of the parameter ranges.
The x-axis shows the parameter ranges covered by each of the 10 bins. The y-axis shows the relative performance response. The red horizontal line represents the median of the response distribution. The dotted line is set at zero.
Figure 10.
Interaction plots for the ridge-regression vs lasso comparison.
The values of the interaction test statistics are shown at the top of the figures.
Table 4.
Ridge-regression vs lasso.
Figure 11.
Distributions of the relative performance response in the ridge-regression vs elastic-net comparison across 10 equally spaced bins of the parameter ranges.
The x-axis shows the parameter ranges covered by each of the 10 bins. The y-axis shows the relative performance response. The red horizontal line represents the median of the response distribution. The dotted line is set at zero.
Figure 12.
Interaction plots for the ridge-regression vs elastic-net comparison.
The values of the interaction test statistics are shown at the top of the figures.
Table 5.
Ridge-regression vs elastic-net.
Figure 13.
Distributions of the relative performance response in the elastic-net vs lasso comparison across 10 equally spaced bins of the parameter ranges.
The x-axis shows the parameter ranges covered by each of the 10 bins. The y-axis shows the relative performance response. The red horizontal line represents the median of the response distribution. The dotted line is set at zero.
Table 6.
Lasso vs elastic-net.
Figure 14.
Interaction plots for the lasso vs elastic-net comparison.
The values of the interaction test statistics are shown at the top of the figures.