Fig 1.
Structure of a Wavelet Neural Network.
The network has n inputs in the input layer, m wavelons in the hidden layer, and one neuron in the output layer. A bias θ is added to the WNN output response. Note also the n shortcut connections from the inputs directly to the output neuron.
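For reference, a minimal sketch of the usual output equation of such a network, assuming product-form multidimensional wavelons; the symbols w_j (output weights), a_i (shortcut weights), t_{ji} (translations), λ_{ji} (dilations) and ψ (mother wavelet) are generic names and not necessarily the paper's own notation:

\hat{y}(\mathbf{x}) = \sum_{j=1}^{m} w_j \Psi_j(\mathbf{x}) + \sum_{i=1}^{n} a_i x_i + \theta, \qquad \Psi_j(\mathbf{x}) = \prod_{i=1}^{n} \psi\!\left(\frac{x_i - t_{ji}}{\lambda_{ji}}\right)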
Fig 2.
Training of an EWNN on a two-spiral task.
(a) The two-spiral classification task, each spiral consisting of 97 data points in the 2D Cartesian space. (b) The Morlet wavelet activation function, and (c) the optimal response on the task, where an EWNN with the Morlet wavelet activation function has successfully separated the two classes.
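As a point of reference, a commonly used real-valued form of the Morlet mother wavelet is given below; the central frequency ω_0 varies across the literature (values such as 1.75 or 5 are typical), and the exact constant used in the paper is not stated in the caption:

\psi(t) = \cos(\omega_0 t)\, e^{-t^2/2}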
Table 1.
Characteristics of the datasets used in this study.
Fig 3.
Flowchart of the two phases of the approach.
Phase I is the process of generating optimized EWNNs. Phase II uses the optimized EWNNs to generate the ensemble of EWNNs.
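A minimal Python sketch of this two-phase workflow, under assumptions not stated in the caption: the EWNN placeholder class, its predict interface, and majority voting as the ensemble combination rule are illustrative stand-ins, not the paper's actual procedure.

import numpy as np

class EWNN:
    """Placeholder for one optimized evolutionary wavelet neural network;
    only the predict interface used by the ensemble matters here."""
    def __init__(self, rng):
        self.rng = rng
    def predict(self, X):
        # Stand-in for the forward pass of an evolved EWNN (binary output);
        # the evolutionary training itself is omitted in this stub.
        return self.rng.integers(0, 2, size=len(X))

def phase_one(X_train, y_train, n_runs=50):
    # Phase I: each independent evolutionary run yields one optimized EWNN.
    return [EWNN(np.random.default_rng(run)) for run in range(n_runs)]

def phase_two(models, X):
    # Phase II: combine the optimized EWNNs into an ensemble of EWNNs;
    # simple majority voting over the individual predictions is assumed.
    votes = np.stack([m.predict(X) for m in models])   # (n_models, n_samples)
    return (votes.mean(axis=0) >= 0.5).astype(int)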
Table 2.
Main parameter settings of the evolutionary wavelet neural networks for the different datasets.
Table 3.
Performance of the ensemble EWNN on the different case studies.
Notice the increase in accuracy of the classifiers when an ensemble approach is adopted (second column).
Fig 4.
Identification of significant features.
The figure shows the average number of connections per feature within the EWNN (over 50 independent evolutionary runs) and within its ensemble EWNN-e (over the active classifiers), for the datasets: (a) DDSM [42], (b) LPD [44], and (c) SPD [45]. For all three datasets, and for all features, the average is higher than zero, indicating that no feature should be completely removed from the analysis. For illustration, consider the feature Age in (a): the value means that, on average over the 50 EWNN runs, this feature is connected to one wavelon. Details on the features can be found in the referenced papers [42, 44, 45].
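A small illustration of how the plotted statistic can be computed, assuming (this is an assumption, not the paper's code) that each evolutionary run is summarized by a binary feature-to-wavelon connection matrix:

import numpy as np

# Synthetic stand-in for 50 runs, each a binary (n_features x n_wavelons)
# connection matrix; the shapes and values are illustrative only.
rng = np.random.default_rng(0)
runs = [rng.integers(0, 2, size=(5, 8)) for _ in range(50)]

# Number of wavelons each feature connects to in each run, averaged over
# the runs; this gives one value per feature, i.e. one bar in Fig 4.
avg_connections = np.mean([r.sum(axis=1) for r in runs], axis=0)
print(avg_connections)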
Fig 5.
Should every wavelon be fully connected?
Summation of wavelon dimensionalities for individual EWNNs (over 50 independent evolutionary runs) and for the ensemble EWNN-e (over the number of active classifiers), for the three datasets (a) DDSM, (b) LPD, and (c) SPD. Note the overall increasing trend for DDSM, with a larger number of high-dimensionality wavelons (except for EWNN-e, which shows a decrease in the number of 6-dimensional wavelons). For LPD we see a concentration between 2- and 4-dimensional wavelons for both individual and ensemble EWNNs. Finally, for SPD, we see a concentration at the higher dimensions. These results indicate that connecting every feature to all wavelons is not necessarily the most appropriate choice.
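Similarly, a sketch of the quantity shown in this figure, assuming a wavelon's dimensionality is the number of input features connected to it and that counts are summed over runs (again an illustrative assumption about the underlying bookkeeping, not the paper's code):

import numpy as np

# Synthetic stand-in for 50 runs, each a binary (n_features x n_wavelons)
# connection matrix; values are illustrative only.
rng = np.random.default_rng(1)
runs = [rng.integers(0, 2, size=(5, 8)) for _ in range(50)]

# Dimensionality of every wavelon across all runs, then the number of
# wavelons per dimensionality summed over the runs (the bars in Fig 5).
dims = np.concatenate([r.sum(axis=0) for r in runs])
counts = np.bincount(dims, minlength=6)
print(dict(enumerate(counts)))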
Table 4.
Average number of active EWNNs in the ensembles, using 10-fold cross-validation, across the three datasets.
The average is calculated over 50 independent runs.
Table 5.
Comparison between EWNN/EWNN-e and the different classifiers found in the literature for the DDSM, LPD and SPD datasets.
In the case of LPD, EWNN-e outperformed all methods reported in the literature, reaching a test accuracy of 100%.