Fig 1.
An image collection comprises multiple views depicting the same object instance from different perspectives.
Table 1.
Overview of previous work utilizing multi-view classification.
Fig 2.
Considered multi-view fusion strategies: (a) general architecture of a deep multi-view CNN; (b) investigated fusion strategies; and (c) fusion strategies mapped onto the ResNet-50 architecture.
Vertical lines mark the insertion of a view-fusion layer.
Fig 3.
Example collections of the three multi-view datasets: (a) CompCars, (b) PlantCLEF, and (c) AntWeb.
Photographs of the ant specimen CASENT0281563 by Estella Ortega retrieved from www.AntWeb.org [32].
Table 2.
Top-1 accuracy refers to the best result reported in previous single-view studies using comparable evaluation protocols.
Fig 4.
Distance matrices for the three datasets.
Diagonal elements denote intra-class distances; off-diagonal elements denote inter-class distances. Classes are sorted from well separable to less separable according to their class-wise silhouette scores.
Table 3.
Multi-view classification results across the three datasets.
Fig 5.
Distribution of class-averaged top-1 classification accuracy for the single-view baseline and the multi-view classification strategies.
White dots indicate the median accuracy; black bars display the interquartile range. Thin black lines extend to the lower and upper adjacent values at 1.5× the interquartile range.
Table 4.
Top-5 accuracy for single-view and multi-view classifications.
Table 5.
Dataset demographics for the Flora Incognita dataset.
Table 6.
Multi-view classification results for the Flora Incognita dataset.