Fig 1.
Pipeline for computing image representations based on local features.
Local features are computed or defined in the detection step. Local descriptor vectors are then extracted from these patches, pooled from predefined image regions, and encoded. The resulting image representation is finally used for training and classification.
Table 1.
Results reported for the Oxford Flower 17 and Oxford Flower 102 datasets using local features along with the methods detailed for each processing step.
Fig 2.
Earliest vs. early vs. late fusion of shape and color features.
In the earliest fusion strategy, all local shape descriptors are computed from every color channel, followed by concatenation and encoding. In the early fusion strategy, all local features (shape and color) are extracted from the same patches and are locally concatenated before encoding. In the late fusion strategy, the image representations are computed separately for each feature and concatenated thereafter.
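The three strategies differ only in where the concatenation happens relative to the encoding step. The following is a minimal sketch of that difference, not the method used for the reported results: the encoder is assumed to be a toy hard-assignment bag-of-words histogram, and the function names (`encode_bow`, `earliest_fusion`, `early_fusion`, `late_fusion`), codebooks, and descriptor values are hypothetical.

```python
from itertools import chain
from math import dist


def encode_bow(descriptors, codebook):
    """Toy encoder (assumption): normalized histogram of nearest codewords."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        hist[nearest] += 1.0
    total = sum(hist) or 1.0  # guard against images without any local feature
    return [h / total for h in hist]


def earliest_fusion(shape_per_channel, codebook):
    # shape_per_channel[c][i] is the shape descriptor of patch i in color channel c.
    # Per-channel descriptors of the same patch are concatenated, then encoded once.
    joint = [list(chain.from_iterable(parts)) for parts in zip(*shape_per_channel)]
    return encode_bow(joint, codebook)


def early_fusion(shape_desc, color_desc, codebook):
    # Shape and color descriptors of the same patch are concatenated, then encoded once.
    joint = [s + c for s, c in zip(shape_desc, color_desc)]
    return encode_bow(joint, codebook)


def late_fusion(shape_desc, color_desc, shape_codebook, color_codebook):
    # Each feature type is encoded separately; the image representations are concatenated.
    return encode_bow(shape_desc, shape_codebook) + encode_bow(color_desc, color_codebook)


# Hypothetical descriptors for three patches (2-D shape, 1-D color).
shape = [[0.0, 1.0], [1.0, 0.0], [0.9, 0.1]]
color = [[0.2], [0.8], [0.7]]
early = early_fusion(shape, color, [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
late = late_fusion(shape, color, [[0.0, 1.0], [1.0, 0.0]], [[0.0], [1.0]])
```

In this sketch, early fusion yields one histogram over a joint codebook, whereas late fusion yields one histogram per feature type, so its representation length is the sum of the two codebook sizes.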
Fig 3.
Challenging examples from the Jena Flower 30 (JF30) dataset.
(a) and (b) Evolution of two flowers throughout the season. (c) Species with similar visual appearance: Lotus corniculatus vs. Hippocrepis comosa, Scabiosa columbaria vs. Knautia arvensis, Inula hirta vs. Inula salicina.
Fig 4.
Images per species within the Jena Flower 30 (JF30) dataset.
Table 2.
Median number of local features extracted per image for the OF17, the OF102, and the JF30 datasets.
Table 3.
Classification accuracy on the OF17, the OF102, and the JF30 datasets.
Computed using SIFT in combination with the Hessian- and Harris-based detectors, without and with (values in brackets) affine shape estimation.
Table 4.
Class-averaged classification accuracy of the studied shape and color descriptors for the OF17, the OF102, and the JF30 datasets.
Fig 5.
Classification accuracies using (a) DoH-SIFT and (b) DoH-DCD features and different encoding methods for discrete codebook sizes and image representation lengths.
Table 5.
Class-averaged classification accuracies for the fused shape and color descriptors on the OF17, the OF102, and the JF30 datasets, all using the DoH detector.
Table 6.
Class-averaged classification accuracies on the OF17, the OF102, and the JF30 datasets using DoH-OpponentSIFT and an increasing number of pyramid levels (one to three).
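To illustrate how the number of pyramid levels affects the length of the image representation, the following is a minimal sketch of spatial pyramid pooling. The grid layout is an assumption (level l splits the image into 2^l x 2^l cells, starting at l = 0), as are the function name `spatial_pyramid` and the example values; the layout used for the reported results may differ.

```python
def spatial_pyramid(points, words, num_words, levels):
    """Concatenate per-cell codeword histograms over all pyramid levels.

    points: (x, y) feature positions, normalized to [0, 1]
    words:  codeword index assigned to each feature
    Assumed layout: level l uses a 2**l x 2**l grid of cells.
    """
    representation = []
    for level in range(levels):
        cells = 2 ** level
        hists = [[0.0] * num_words for _ in range(cells * cells)]
        for (x, y), w in zip(points, words):
            # Clamp so that a coordinate of exactly 1.0 falls into the last cell.
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hists[row * cells + col][w] += 1.0
        for h in hists:
            representation.extend(h)
    return representation


# Hypothetical example: three features, codebook of two words.
points = [(0.1, 0.1), (0.9, 0.9), (0.6, 0.2)]
words = [0, 1, 1]
lengths = [len(spatial_pyramid(points, words, 2, n)) for n in (1, 2, 3)]
```

Under this assumed layout, a codebook of K words gives representations of length K, 5K, and 21K for one, two, and three levels, so each additional level increases the dimensionality considerably.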