A Hierarchical Probabilistic Model for Rapid Object Categorization in Natural Scenes

doi:10.1371/journal.pone.0020002

Figure 1.

Hierarchical probabilistic model of object categorization in natural scenes.

An object category is modeled as a composition of geometrically related parts, and each part is represented by a probability distribution (PD) over a set of natural object structures. Natural context is modeled by a PD over natural context structures. Object categorization in natural scenes is performed as statistical inference. All PDs were estimated from natural objects and context.

Figure 2.

Coarse models of geometry of animals (medium-body animals) and cars in natural scenes.

(A), Any animal in natural scenes was modeled by two ellipses, one for the head and one for the body. Any car in natural scenes was modeled by one ellipse. (B), Size (left) and orientation (right) distributions of animal heads in natural scenes. (C), Size (left) and orientation (right) distributions of animal bodies in natural scenes. (D), Size (left) and orientation (right) distributions of cars in natural scenes.

Figure 3.

Extracting natural object and scene structures.

Each structure is a structured patch compiled from images of natural objects and scenes. To obtain a set of natural object and context structures, we performed independent component analysis (ICA) on patches of natural scenes and classified the independent components (ICs) into four orientations. We then sampled a large number of patches from natural scenes and classified each patch as being oriented in one of the four orientations according to the root total square amplitude of the ICs at that orientation. We applied this procedure to a 3×3 collection of small patches and to the corresponding single (1×1) big patch. A structure was thus a pair consisting of a 3×3 set of structured patches and the corresponding 1×1 structured patch. The structures shown here are the averages of all patches that shared the same dominant orientation structure at the two spatial scales.
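
The orientation-labeling step described in this caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the IC coefficients of a patch have already been computed and that each IC has been assigned an orientation index from 0 to 3. The function names `dominant_orientation` and `structure_label`, and the use of one shared IC basis for both spatial scales, are assumptions made for the example.

```python
import math

def dominant_orientation(coeffs, ic_orientations, n_orient=4):
    """Return (best_bin, amplitudes) for one patch.

    coeffs: IC coefficients of the patch (assumed precomputed).
    ic_orientations: orientation bin (0..n_orient-1) of each IC (assumed given).
    The amplitude of a bin is the square root of the summed squared
    coefficients of the ICs in that bin (root total square amplitude).
    """
    totals = [0.0] * n_orient
    for c, o in zip(coeffs, ic_orientations):
        totals[o] += c * c
    amplitudes = [math.sqrt(t) for t in totals]
    best = max(range(n_orient), key=lambda k: amplitudes[k])
    return best, amplitudes

def structure_label(coarse_coeffs, fine_coeffs_per_block, ic_orientations):
    """Label a two-scale structure: one orientation for the big patch and
    one orientation for each of the small blocks (hypothetical encoding)."""
    coarse, _ = dominant_orientation(coarse_coeffs, ic_orientations)
    fine = tuple(dominant_orientation(c, ic_orientations)[0]
                 for c in fine_coeffs_per_block)
    return coarse, fine

# Toy example: six ICs spread over four orientation bins.
ic_orientations = [0, 0, 1, 2, 3, 3]
coeffs = [0.1, -0.2, 0.9, 0.05, 0.3, -0.4]
label = structure_label(coeffs, [coeffs] * 9, ic_orientations)
```

Patches that receive the same label at both scales would then be averaged to produce the structures shown in the figure.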

Figure 4.

Examples of frequent object structures.

The upper panels in (A) and (B) are examples of the ICs of images of animals and cars, respectively, at a finer scale. Each frequent structure for the 5 animals and cars was the average of patches that shared the same dominant orientation structure at two spatial scales. The numbers indicate the locations of the structures in the animals and cars. The coarse structural description was for image patches of 48×48 pixels and the fine structural description was for the 3×3 blocks of the same patches (each block had 16×16 pixels). Most of the structures at the two spatial scales are similar except for some fine details, e.g., structures No. 4 and 6 of the zebra. The structures at the second scale (3×3 blocks) contain more detail than those at the first scale (48×48 pixels).
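
The relation between the two scales (one 48×48 patch and its 3×3 grid of 16×16 blocks) can be made concrete with a short sketch. The helper name `split_into_blocks` and the list-of-rows patch representation are assumptions chosen for illustration only.

```python
def split_into_blocks(patch, block=16):
    """Split a square patch (list of pixel rows) into a grid of
    block x block tiles. A 48x48 patch yields a 3x3 grid of 16x16 tiles."""
    size = len(patch)
    if size % block or any(len(row) != size for row in patch):
        raise ValueError("patch must be square with side divisible by block")
    n = size // block
    return [
        [
            [row[c * block:(c + 1) * block]
             for row in patch[r * block:(r + 1) * block]]
            for c in range(n)
        ]
        for r in range(n)
    ]

# Toy 48x48 patch whose pixel value encodes its own position.
patch = [[r * 48 + c for c in range(48)] for r in range(48)]
blocks = split_into_blocks(patch)  # blocks[row][col] is one 16x16 tile
```

The coarse description would be computed on `patch` as a whole and the fine description on each entry of `blocks`.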

Figure 5.

Statistics of object structures.

(A), Relative frequency of occurrence of animal structures. (B), Examples of animal structures. The vertical axis indicates the percentage of the animals in the dataset that shared the structures. (C), The total numbers of structures that were shared by different percentages of the animals in the dataset. (D)–(F), Same format as (A)–(C), respectively, for car structures.

Figure 6.

PDs of selected object structures.

(A), Average posterior probability of being an animal scene based on each selected structure. The thin lines indicate the standard deviation. The inset shows the PD of the posterior probability of being an animal scene based on the structure. (B), The structures with odd indices shown in (A). (C), Relative frequency of occurrence and entropy of the 70 selected structures. (D)–(F), Same format as (A)–(C), respectively, for car structures. (G), Ten examples of fitted generalized Gaussian PDs of the amplitudes of the ICs of joint probability based on the selected animal structures. (H), Ten examples of fitted generalized Gaussian PDs of the amplitudes of the ICs of joint probability based on the selected car structures.
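
The quantities named in this caption have standard textbook definitions, sketched below under stated assumptions. The caption does not give the paper's exact formulas, priors, or the distribution over which entropy was computed, so the two-class Bayes rule, the 0.5 default prior, and the function names here are illustrative choices rather than the authors' method. The generalized Gaussian density uses the common parameterization with scale `alpha` and shape `beta`.

```python
import math

def posterior_animal(lik_animal, lik_other, prior_animal=0.5):
    """Two-class Bayes rule: P(animal | structure).
    lik_animal, lik_other: P(structure | class); the prior is an assumption."""
    num = lik_animal * prior_animal
    den = num + lik_other * (1.0 - prior_animal)
    if den == 0:
        raise ValueError("structure has zero probability under both classes")
    return num / den

def entropy_bits(probs):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def generalized_gaussian_pdf(x, alpha, beta, mu=0.0):
    """Generalized Gaussian density:
    beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x - mu| / alpha) ** beta)."""
    coef = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return coef * math.exp(-((abs(x - mu) / alpha) ** beta))

# Toy numbers: a structure three times more likely in animal scenes.
p = posterior_animal(0.03, 0.01)
h = entropy_bits([p, 1.0 - p])
```

With `beta = 2` and `alpha = sqrt(2)` the density reduces to the standard normal, which is a convenient sanity check when fitting.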

Figure 7.

Object localization in natural scenes.

(A), Two input scenes. (B), Probability maps, i.e., the probability of a scene patch of 48×48 pixels being an animal or car at each pixel. (C), Object candidates (i.e., the ellipses) sampled from the PDs of coarse object geometry estimated from training scenes were overlaid on the probability maps so that the object candidates covered most pixels that had high probability. The dashed and solid ellipses in the upper panel were for animal heads and bodies, respectively.
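
One way to quantify how well a set of candidate ellipses covers the high-probability pixels of a map is sketched below. The caption does not specify the sampling or selection procedure, so the threshold value, the coverage score, and the names `in_ellipse` and `coverage` are assumptions for illustration only.

```python
import math

def in_ellipse(x, y, cx, cy, a, b, theta):
    """True if pixel (x, y) lies inside an ellipse with center (cx, cy),
    semi-axes a and b, rotated by theta radians."""
    dx, dy = x - cx, y - cy
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    u = dx * cos_t + dy * sin_t    # coordinate along the first semi-axis
    v = -dx * sin_t + dy * cos_t   # coordinate along the second semi-axis
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0

def coverage(prob_map, ellipses, threshold=0.5):
    """Fraction of pixels with probability >= threshold that fall inside
    at least one candidate ellipse (cx, cy, a, b, theta)."""
    hot = [(x, y) for y, row in enumerate(prob_map)
           for x, p in enumerate(row) if p >= threshold]
    if not hot:
        return 0.0
    covered = sum(any(in_ellipse(x, y, *e) for e in ellipses) for x, y in hot)
    return covered / len(hot)

# Toy 5x5 map with a high-probability 3x3 center.
prob_map = [[0.9 if 1 <= x <= 3 and 1 <= y <= 3 else 0.1 for x in range(5)]
            for y in range(5)]
full = coverage(prob_map, [(2, 2, 1.5, 1.5, 0.0)])
partial = coverage(prob_map, [(2, 2, 1.5, 0.5, 0.0)])
```

Candidates drawn from the size and orientation distributions could be ranked by such a score, keeping those that cover most high-probability pixels.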

Figure 8.

Categorizing animals in natural scenes.

(A), Examples of four sets of animal scenes, i.e., head, close-body, medium-body, and far-body, and examples of distractors. (B), Performance of categorizing natural scenes with animals. (C), Performance of localizing animals in natural scenes and categorizing animals segmented from natural scenes for close-body and medium-body. (D), Performance of categorizing scenes containing animals and scenes in which the animals were replaced by random noise. (E), Animal categorization in scenes where animals were inserted into distractors. In (D) and (E), red bars show the results of the Serre et al. model on medium-body scenes.

Figure 9.

Categorizing cars in street scenes.

(A), Examples of street scenes containing cars and examples of distractors. (B), Performance of localizing cars in street scenes and categorizing cars segmented from street scenes. (C), Performance of categorizing street scenes containing cars and scenes in which the cars were replaced by random noise. (D), Car categorization in scenes where cars were inserted into distractors. In (C) and (D), red bars show the results of the Serre et al. model.
