Adaptive Dimensionality Reduction with Semi-Supervision (AdDReSS): Classifying Multi-Attribute Biomedical Data | PLOS One


Fig 1.

An example of how AdDReSS improves embedding by incorporating AL.
(a) The original embedding representation generated by SSDR. (b) A support vector machine classifier is used as an active learner. (c) Samples within the low-dimensional embedding that are found to be difficult to classify are selected as candidates for training. (d) SSDR trained on the labels queried by AL provides greater separation of object classes in the low-dimensional embedding.
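Steps (b) and (c) can be sketched as a margin-based query. This is a minimal illustration, not the authors' implementation: it assumes the active learner exposes one signed decision value per sample (as a linear SVM would) and that "difficult to classify" means the smallest absolute decision value. The function name `select_ambiguous` and the scores below are hypothetical.

```python
def select_ambiguous(decision_values, labeled, n_queries):
    """Return indices of the unlabeled samples nearest the decision boundary.

    decision_values: signed classifier scores, one per sample (assumed input).
    labeled: set of indices whose labels are already revealed.
    n_queries: number of labels to request in this round.
    """
    candidates = [i for i in range(len(decision_values)) if i not in labeled]
    # A small |score| means the sample lies close to the boundary (ambiguous).
    candidates.sort(key=lambda i: abs(decision_values[i]))
    return candidates[:n_queries]


# Illustrative scores only; samples 0 and 3 already have labels.
scores = [2.1, -0.05, 0.4, -1.7, 0.1, -0.3]
queried = select_ambiguous(scores, labeled={0, 3}, n_queries=2)
print(queried)
```

The queried indices would then have their labels revealed and passed back to SSDR for the next embedding, as in panel (d).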

Fig 2.

Swiss Roll example.
(a) 3D Swiss Roll with all labels revealed. (b) 3D Swiss Roll with initial labels ℓ(S_tr) revealed. (c) Initial 2D embedding with labels. (d) Initial 2D embedding with initial labels ℓ(S_tr). (e) Ambiguous samples (in blue) are determined via active learning. (f) Region of the Swiss Roll at the class boundary (region is shown as a box in (e)). Note the selection of ambiguous samples (in blue) at the boundary between the two classes (in red and green). (g) Subsequent 2D embedding incorporating newly queried labels from the ambiguous samples. (h) Region near the class boundaries (shown as a box from (g)) revealing the increased separation between the two classes (in red and green) following application of the AdDReSS scheme.

Table 1.

Datasets used for evaluation.

Fig 3.

Selection of mitotic and non-mitotic nuclei from the MITOS2012 dataset.
A nuclei candidate detection algorithm is used and patches centered at each candidate centroid are extracted.

Table 2.

Strategies compared in this work.

Table 3.

Summary of Evaluation Measures.

Fig 4.

Evaluation of Classification Accuracy.
Number of instances for which labels were revealed versus mean ϕ^Acc for AdDReSS, SSAGE, and GE, together with the maximum empirically derived ϕ^Acc across all runs, shown for (a), (b), (c), and (d). The standard deviation of ϕ^Acc is shown as error bounds at each l.

Fig 5.

Evaluation of Silhouette Index.
Number of instances for which labels were revealed versus mean ϕ^SI for AdDReSS, SSAGE, and GE, together with the maximum empirically derived ϕ^SI across all runs, shown for (a), (b), (c), and (d). The standard deviation of ϕ^SI is shown as error bounds at each l.
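For readers unfamiliar with the measure, the standard silhouette index can be sketched as follows. This is the textbook definition with Euclidean distance, which is assumed rather than taken from the article; whether ϕ^SI is computed in exactly this way is not stated in the caption. The function name and the toy points are illustrative only.

```python
import math


def silhouette_index(points, labels):
    """Mean silhouette width over all samples, using Euclidean distance.

    Requires at least two distinct labels.
    """
    def mean_dist(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            widths.append(0.0)  # singleton class: silhouette width taken as 0
            continue
        a = mean_dist(i, own)  # mean distance to the sample's own class
        # Mean distance to the nearest other class.
        b = min(mean_dist(i, m) for other, m in clusters.items() if other != lab)
        widths.append((b - a) / max(a, b))
    return sum(widths) / len(widths)


# Toy 2D embedding: two tight groups, far apart.
embedding = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
separated = silhouette_index(embedding, [0, 0, 1, 1])
mixed = silhouette_index(embedding, [0, 1, 0, 1])
print(separated, mixed)
```

Values near 1 indicate well-separated classes in the embedding, while values near or below 0 indicate overlapping classes, so the well-separated labeling above scores higher than the mixed one.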

Fig 6.

Evaluation of Variance for Classification Accuracy.
Variance of ϕ^Acc at selected numbers of instances for which labels were revealed, for AdDReSS, SSAGE, and GE, is shown for (a), (b), (c), and (d).

Fig 7.

Evaluation of Variance for Silhouette Index.
Variance of ϕ^SI at selected numbers of instances for which labels were revealed, for AdDReSS, SSAGE, and GE, is shown for (a), (b), (c), and (d). GE shows zero variance because labeled information does not affect the GE embedding.

Fig 8.

Evaluation of Raghavan Efficiency.
ϕ^Eff for k ∈ {2, 3} shows the comparative efficiency between AdDReSS and GE, SSAGE and GE, and AdDReSS and SSAGE for (a), (b), (c), and (d).

Table 4.

Percent improvement in Raghavan efficiency via AdDReSS over SSAGE.

Fig 9.

Evaluation of Maximum Information Gain.
ϕ^MIG shows areas of maximum information gain (shown as a dashed black line) in terms of the difference in ϕ^Acc between AdDReSS and SSAGE for (a), (b), (c), and (d).
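One plausible reading of this measure is the largest vertical gap between the two accuracy curves at a shared number of revealed labels. The sketch below implements only that reading; the article's formal definition of ϕ^MIG is not reproduced in this caption, so the function name, the inputs, and the numbers are assumptions for illustration.

```python
def max_information_gain(n_labels, acc_a, acc_b):
    """Return (label count, gain) where curve A most exceeds curve B.

    n_labels: label counts at which both curves were evaluated.
    acc_a, acc_b: accuracies of the two strategies at those counts.
    """
    gains = [a - b for a, b in zip(acc_a, acc_b)]
    best = max(range(len(gains)), key=gains.__getitem__)
    return n_labels[best], gains[best]


# Made-up curves, not results from the article.
counts = [10, 20, 30, 40]
curve_a = [0.60, 0.75, 0.82, 0.85]
curve_b = [0.58, 0.65, 0.78, 0.84]
best_count, best_gain = max_information_gain(counts, curve_a, curve_b)
print(best_count, round(best_gain, 2))
```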

Fig 10.

Evaluation of Maximum Query Efficiency.
ϕ^MQE describes the maximum efficiency in terms of queried labels given the same ϕ^Acc (shown as a dashed black line) between AdDReSS and SSAGE for (a), (b), (c), and (d).
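Read as a horizontal gap between the two curves, this measure asks how many fewer labels one strategy needs to reach the same accuracy as the other. The sketch below follows that interpretation only; it is an assumption, since the caption does not give the formal definition, and the helper names and numbers are hypothetical.

```python
def labels_to_reach(n_labels, acc, target):
    """Smallest label count at which the curve first reaches `target`, else None."""
    for n, a in zip(n_labels, acc):
        if a >= target:
            return n
    return None


def max_query_efficiency(n_labels, acc_a, acc_b):
    """Return (labels saved, accuracy target) for the largest saving of A over B."""
    best_saving, best_target = 0, None
    for target in acc_b:  # use the accuracies B actually attains as targets
        need_a = labels_to_reach(n_labels, acc_a, target)
        need_b = labels_to_reach(n_labels, acc_b, target)
        if need_a is None or need_b is None:
            continue
        saving = need_b - need_a
        if saving > best_saving:
            best_saving, best_target = saving, target
    return best_saving, best_target


# Made-up curves, not results from the article.
counts = [10, 20, 30, 40, 50]
curve_a = [0.70, 0.80, 0.86, 0.88, 0.89]
curve_b = [0.55, 0.62, 0.69, 0.79, 0.86]
result = max_query_efficiency(counts, curve_a, curve_b)
print(result)
```

With ties, the sketch reports the first accuracy target at which the largest saving occurs.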

Fig 11.

Illustration describing Raghavan efficiency.
A refers to the area between the Active Learning curve and the empirically-derived maximum accuracy, and B refers to the area between the Random Sampling curve and the Active Learning curve.
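The two areas can be estimated numerically from sampled accuracy curves. The sketch below uses the trapezoidal rule and combines the areas as B / (A + B), equivalently 1 − A / (A + B), which is the form commonly attributed to Raghavan et al.; the caption itself defines only A and B, so the combining formula, the function names, and the numbers are assumptions.

```python
def trapezoid_area(x, y):
    """Area under a piecewise-linear curve through the points (x[i], y[i])."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(x) - 1))


def raghavan_efficiency(n_labels, acc_active, acc_random, acc_max):
    """Efficiency B / (A + B) of active learning over random sampling (assumed form).

    Undefined (division by zero) if both curves sit at acc_max everywhere.
    """
    # A: gap between the maximum accuracy and the active learning curve.
    area_a = trapezoid_area(n_labels, [acc_max - a for a in acc_active])
    # B: gap between the active learning curve and the random sampling curve.
    area_b = trapezoid_area(n_labels, [a - r for a, r in zip(acc_active, acc_random)])
    return area_b / (area_a + area_b)


# Made-up curves, not results from the article.
counts = [0, 10, 20]
active = [0.5, 0.8, 0.9]
random_sampling = [0.5, 0.6, 0.8]
efficiency = raghavan_efficiency(counts, active, random_sampling, acc_max=0.9)
print(round(efficiency, 3))
```

Under this form, an efficiency of 0 means active learning offers no advantage over random sampling, and values approaching 1 mean the active learning curve reaches the maximum accuracy almost immediately.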
