A Theory of Cheap Control in Embodied Systems

doi:10.1371/journal.pcbi.1004427

Fig 1.

Sensorimotor loop.

Fig 2.

Causal structure of the reactive SML.

Fig 3.

Ambiguity and redundancy of the sensor measurement.

In this example, an agent navigates the 20 × 20 maze shown in the left panel. The agent is endowed with two sensors (eyes), S_left and S_right. Each sensor measures a weighted average of the walls in its immediate vicinity, as illustrated in the central panel, and outputs one of 8 possible numerical values, as shown in the right panel. There are 400 possible locations in the maze but only 8 × 8 = 64 joint sensor states, so the sensor measurement is highly ambiguous about the world state. Furthermore, the outputs of the two sensors are not independent; they always have the same value at the decimal place. Due to this redundancy, the actual number of joint sensor states is 15 instead of 64.
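
The counting argument in this caption can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the numbers stated in the caption (400 locations, 8 values per sensor, 15 realized joint states), and the helper name `min_locations_per_state` is hypothetical, not taken from the paper. It applies the pigeonhole principle to bound from below how many maze locations must share at least one joint sensor state.

```python
from math import ceil

# Numbers taken from the caption of Fig 3.
n_locations = 20 * 20          # world states: cells of the 20 x 20 maze
n_values_per_sensor = 8        # each sensor outputs one of 8 values
n_joint_nominal = n_values_per_sensor ** 2   # 8 x 8 combinations
n_joint_actual = 15            # joint states that actually occur (redundancy)


def min_locations_per_state(n_world_states: int, n_sensor_states: int) -> int:
    """Pigeonhole bound (hypothetical helper): some sensor state must be
    shared by at least this many world states."""
    return ceil(n_world_states / n_sensor_states)


# Ambiguity: more locations than sensor states, so some state is shared.
bound_nominal = min_locations_per_state(n_locations, n_joint_nominal)
# Redundancy shrinks the set of sensor states and raises the bound.
bound_actual = min_locations_per_state(n_locations, n_joint_actual)

print(n_joint_nominal, bound_nominal, bound_actual)
```

With 64 nominal joint states, at least 7 locations must share some sensor reading; with the 15 states that actually occur, at least 27 must.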

Fig 4.

Locality of world-state transitions.

Between consecutive time steps, the knee of a robot can only move by a small amount. Only very few world-state transitions are possible within one time step (e.g., transitions to neighboring positions). This hexapod is used in the experimental evaluation of our theory in “Experiments with a Hexapod”.

Fig 5.

Illustration of the exponential family Eq (11) of policies.

This figure shows an example with ∣𝒲∣ = 3 and ∣𝒜∣ = 2 and a policy-behavior map ψ with embodied behavior dimension d = 2. In this case, the polytope of all policies is the three-dimensional cube of 3 × 2 row-stochastic matrices shown in the middle. The curved surface within it is the exponential family, which is parametrized by two parameters. The policy-behavior map ψ maps the exponential family to the same set of behaviors (the hexagon illustrated on the right) as the set of all policies.
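
To see why the set of policies is a three-dimensional cube here, note that each of the 3 rows of a 3 × 2 row-stochastic matrix has a single free entry in [0, 1]. The following sketch makes that identification explicit. It is an illustration under this caption's setting only; the function names are hypothetical, and it does not implement the exponential family of Eq (11) or the map ψ.

```python
from itertools import product


def policy_from_cube_point(p):
    """Map a point p in [0, 1]^3 to a 3 x 2 row-stochastic matrix.

    Row w holds the probabilities of the two actions in world state w;
    the second entry is fixed by the first, so each row has one free entry.
    """
    if len(p) != 3 or any(not 0.0 <= x <= 1.0 for x in p):
        raise ValueError("expected three probabilities in [0, 1]")
    return [[x, 1.0 - x] for x in p]


def is_row_stochastic(matrix, tol=1e-12):
    """Check non-negativity and that every row sums to one."""
    return all(
        all(entry >= 0.0 for entry in row) and abs(sum(row) - 1.0) <= tol
        for row in matrix
    )


# The cube's corners are the deterministic policies: one action per state.
vertices = [policy_from_cube_point(corner)
            for corner in product((0.0, 1.0), repeat=3)]

# An interior point of the cube is a stochastic policy.
interior = policy_from_cube_point((0.5, 0.25, 0.9))
```

The cube has 2³ = 8 corners, one for each deterministic assignment of an action to each of the three world states.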

Fig 6.

Illustration of a CRBM policy in the sensorimotor loop.

Fig 7.

Hexapod set-up.

Left-hand side: The simulated hexapod with a display of the joint configurations. Right-hand side: Visualization of the target walking pattern. The plot shows which leg touched the ground at which point in time. Blue areas indicate contact with the ground, while orange areas indicate points in time during which the corresponding leg did not touch the ground. The different legs are plotted along the y-axis, while each point on the x-axis refers to a single point in time.

Fig 8.

Estimation of the support’s cardinality.

Estimation of the support set cardinality (before and after pruning).

Fig 9.

Experimental results.

Performance of the best CRBM for different complexity parameters m in comparison to the performance of the target behavior (horizontal orange line). The vertical blue line indicates the m estimated from the data (see supporting information S2 Text).
