Fig 1.
Process of the MST-kNN method.
Starting with the complete dataset, a distance matrix is computed, which forms the basis for a complete graph. A Minimum Spanning Tree is computed within the complete graph. Then, all edges of the tree that do not connect k-Nearest Neighbors are removed, resulting in clusters.
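The steps in this caption can be sketched in a few lines of Python. This is a minimal, single-pass illustration and not the authors' implementation: the Euclidean distance, the function name `mst_knn_clusters`, and the rule that an edge is kept when either endpoint lists the other among its k nearest neighbours are all assumptions made for the example.

```python
import math

def mst_knn_clusters(points, k):
    """Single-pass sketch: keep only MST edges that are also kNN edges."""
    n = len(points)
    # Step 1: distance matrix (Euclidean distance is an assumption).
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Step 2: Minimum Spanning Tree of the complete graph (Prim's algorithm).
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    mst_edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            mst_edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u

    # Step 3: the k nearest neighbours of each point (excluding itself).
    knn = [
        set(sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k])
        for i in range(n)
    ]

    # Step 4: drop MST edges whose endpoints are not nearest neighbours
    # (assumed rule: either direction is enough to keep the edge).
    kept = [(u, v) for u, v in mst_edges if v in knn[u] or u in knn[v]]

    # Step 5: clusters are the connected components of what remains.
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for u, v in kept:
        root[find(u)] = find(v)
    return [find(i) for i in range(n)]

# Hypothetical toy data: two well-separated groups of three points.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = mst_knn_clusters(toy, k=2)
```

On the toy data, the single tree edge bridging the two groups is removed because neither endpoint counts the other among its two nearest neighbours, so two clusters remain.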
Fig 2.
Results of the clustering method with k = 3.
Seven clusters were found: Cluster 0 to Cluster 6. The clusters vary in size, with the largest (Cluster 5) containing 556 respondents and the smallest (Cluster 4) containing 45. Cluster 0 is shown in light yellow, Cluster 1 in green, Cluster 2 in light green, Cluster 3 in blue, Cluster 4 in light orange, Cluster 5 in orange and Cluster 6 in red.
Table 1.
Bottom features for each cluster (presented in ascending order of score).
Table 2.
Top features for each cluster (presented in descending order of score).
Fig 3.
The selected top and bottom features are shown in red and green, respectively. These coloured features form a “shoulder” on either side of the “curve”, as they are characteristically higher or lower than the rest of the bars in this bar chart. The selected bottom and top “shoulders” are also presented in Tables 1 and 2.
Fig 4.
The selected top and bottom features are shown in red and green, respectively. These coloured features form a “shoulder” on either side of the “curve”, as they are characteristically higher or lower than the rest of the bars in this bar chart. The selected bottom and top “shoulders” are also presented in Tables 1 and 2.
Fig 5.
The selected top and bottom features are shown in red and green, respectively. These coloured features form a “shoulder” on either side of the “curve”, as they are characteristically higher or lower than the rest of the bars in this bar chart. The selected bottom and top “shoulders” are also presented in Tables 1 and 2.
Fig 6.
The selected top and bottom features are shown in red and green, respectively. These coloured features form a “shoulder” on either side of the “curve”, as they are characteristically higher or lower than the rest of the bars in this bar chart. The selected bottom and top “shoulders” are also presented in Tables 1 and 2.
Fig 7.
The selected top and bottom features are shown in red and green, respectively. These coloured features form a “shoulder” on either side of the “curve”, as they are characteristically higher or lower than the rest of the bars in this bar chart. The selected bottom and top “shoulders” are also presented in Tables 1 and 2.
Fig 8.
The selected top and bottom features are shown in red and green, respectively. These coloured features form a “shoulder” on either side of the “curve”, as they are characteristically higher or lower than the rest of the bars in this bar chart. The selected bottom and top “shoulders” are also presented in Tables 1 and 2.
Fig 9.
The selected top and bottom features are shown in red and green, respectively. These coloured features form a “shoulder” on either side of the “curve”, as they are characteristically higher or lower than the rest of the bars in this bar chart. The selected bottom and top “shoulders” are also presented in Tables 1 and 2.
Table 3.
Best ‘simple’ logistic models for assessing cluster partitioning.
For each model, the Fitness (guided by the Area Under the Curve value) is shown, as well as the best model found by Eureqa.
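To make the Area Under the Curve criterion concrete, the following sketch scores a logistic model by the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. It is a generic illustration only: how Eureqa computes its fitness internally is not described here, and the function names, the model coefficients, and the data are hypothetical.

```python
import math

def logistic(z):
    """Standard logistic function mapping a linear score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical data: one feature x and a binary class membership y.
x = [0.1, 0.4, 0.35, 0.8]
y = [0, 0, 1, 1]

# Hypothetical 'simple' logistic model with made-up coefficients.
scores = [logistic(2.0 * xi - 1.0) for xi in x]
fitness = auc(y, scores)
```

Because the logistic function is monotonic, the AUC depends only on how the model orders the cases, not on the exact probabilities it outputs.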
Table 4.
Best ‘simple’ logistic models for Involvement Class.
For each cluster, the Fitness (guided by the Area Under the Curve value) is shown, as well as the best model found by Eureqa.