A computational model of stereoscopic prey capture in praying mantises

doi:10.1371/journal.pcbi.1009666

Fig 12

Responses of different model components to a horizontally moving disk at a simulated distance of 2.5 cm from the mantis.

Left, middle and right columns are for targets of size 11.2°, 16.9° and 25.5°, as indicated. Sub-panels in the top two rows show the left and right eyes. In all panels, axis coordinates are in degrees of visual angle relative to the center of the screen, i.e. x = 0, y = 0 corresponds to a location 10 cm directly in front of the mantis. However, the interpretation of the axes differs in each row, as explained below.

Top row (ABC): Snapshots of the filtered images J_L,R(x, y, t), shown as a function of retinal location (x, y) for one particular time t. The axes are therefore simply retinal location, and each pixel represents the value of the filtered image at a particular location in the retina. Pseudocolor represents the images reaching the sensor’s receptive fields, following lowpass spatial filtering, highpass temporal filtering and squaring in the early visual system. The receptive field excitatory region is shown superimposed for comparison. Because each snapshot is for one particular time t, it is also for one particular target position (x_tgt(t), y_tgt(t)) as the target moves across the screen. In this figure, the target was moving horizontally, so y_tgt is in fact independent of time whereas x_tgt = x₀ + Vt. The yellow circle marks the center of the target in that eye at the time shown. The white cross marks the center of the sensor receptive field in that eye; the inner white square marks the boundary of the central excitatory region, while the outer white square marks the boundary of the outer excitatory region. The surrounding inhibitory region extends beyond the range shown in each panel. Thus, parts of the filtered image falling outside the white squares have an inhibitory effect on the sensor.

Middle row (DEF): Inputs to the binocular disparity sensor from the two eyes, v_L,R. The input from each eye is the inner product of the monocular receptive field with the filtered image at that moment in time. It is represented here as a function of target position (x_tgt(t), y_tgt). Since the target is moving horizontally across the screen from left to right, x_tgt is a function of time, whereas y_tgt is constant for a given trajectory. Each pixel row in DEF therefore represents the time-course of the monocular input, v_L,R(t), as the target moves from left to right over the screen, at the vertical location y_tgt corresponding to the height of the pixel row. The axes therefore represent the current location of the target in the retina, and the panel as a whole does not represent an image, since different locations correspond to different times. The pink arrows mark the value of the monocular input in D for the filtered image shown in A.

Bottom row (GHI): Response of the disparity sensor, Eq 2. The axes are now the current visual direction of the moving binocular target, x_c(t) = (x_tgtL(t) + x_tgtR(t))/2 and y_c = (y_tgtL + y_tgtR)/2. Since the target is moving horizontally, x_c is again a function of time, but y_c is constant for a given trajectory. Arrows from D to G mark the target location shown in the top row (A), and thus the response when the target crosses the midline, x_c = 0. For comparison, dotted arrows show the response a little earlier, when x_c was −6°.
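
As a rough illustration of how one pixel row of the middle panels could arise, the following Python sketch computes a monocular input as the inner product of a receptive field with a filtered image while a disk moves along x_tgt = x₀ + Vt. It is a minimal sketch under stated assumptions, not the model used in the paper: the grid extent and spacing, the receptive-field sizes and weights, the speed, the time step, the disparity value and its sign convention are all hypothetical, and the early filtering is replaced by a squared difference between successive frames (a crude temporal highpass followed by squaring, with the spatial lowpass stage omitted). The disparity sensor response of Eq 2 is not reproduced here.

```python
import numpy as np

# Retinal grid in degrees (extent and 0.5 degree spacing are assumptions).
xs = np.arange(-50.0, 50.5, 0.5)
X, Y = np.meshgrid(xs, xs)

def disk(cx, cy, diameter):
    """Image that is 1 inside a disk centred at (cx, cy) and 0 elsewhere."""
    return ((X - cx) ** 2 + (Y - cy) ** 2 <= (diameter / 2) ** 2).astype(float)

def filtered_image(cx, cy, diameter, step):
    """Stand-in for J(x, y, t): squared difference between the current frame
    and the previous one, in which the disk was `step` degrees further left."""
    return (disk(cx, cy, diameter) - disk(cx - step, cy, diameter)) ** 2

def receptive_field(cx, cy, inner=5.0, outer=10.0):
    """Toy receptive field with square regions: central excitation (+1),
    weaker outer excitation (+0.5), inhibitory surround (-0.1).
    The sizes and weights are hypothetical."""
    cheb = np.maximum(np.abs(X - cx), np.abs(Y - cy))  # square contours
    return np.where(cheb <= inner, 1.0, np.where(cheb <= outer, 0.5, -0.1))

def monocular_input(rf, x_tgt, y_tgt, diameter, step):
    """Inner product of the receptive field with the filtered image."""
    return float(np.sum(rf * filtered_image(x_tgt, y_tgt, diameter, step)))

# Horizontal trajectory: x_c = x0 + V*t, with y_c constant.
x0, V, dt = -30.0, 50.0, 0.02      # deg, deg/s, s (hypothetical values)
t = np.arange(61) * dt
x_c = x0 + V * t
y_c = 0.0

# Hypothetical disparity; this sign convention is an assumption.
disparity = 8.0
x_tgt_L = x_c + disparity / 2
x_tgt_R = x_c - disparity / 2

# Receptive fields offset in each eye to match the target disparity.
rf_L = receptive_field(+disparity / 2, 0.0)
rf_R = receptive_field(-disparity / 2, 0.0)

# One pixel row per eye: monocular input as a function of target position.
step = V * dt
v_L = np.array([monocular_input(rf_L, x, y_c, 11.2, step) for x in x_tgt_L])
v_R = np.array([monocular_input(rf_R, x, y_c, 11.2, step) for x in x_tgt_R])
```

Here `v_L` and `v_R` are indexed by time and hence by target position, so each plays the role of a single pixel row in the middle panels for one value of y_tgt. Repeating the sweep over a range of y_tgt values would fill in a whole panel. Because the toy receptive fields are offset by the same amount as the target in each eye, the two inputs follow the same time course in this sketch.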

doi: https://doi.org/10.1371/journal.pcbi.1009666.g012