Feed-Forward Segmentation of Figure-Ground and Assignment of Border-Ownership

Figure-ground segmentation is the division of visual information into objects and their surrounding backgrounds. Two main processes herein are boundary assignment and surface segregation, both of which rely on the integration of global scene information. Recurrent processing, either through intrinsic horizontal connections that link neighbouring neurons or through feedback projections from higher visual areas, provides such information and is considered to be the neural substrate for figure-ground segmentation. In contrast, the role of feedforward projections in figure-ground segmentation is unknown. To better understand this role, we constructed a feedforward spiking model using a biologically plausible neuron model. By means of surround inhibition, our simple three-layered model performs figure-ground segmentation and one-sided border-ownership coding. We propose that the visual system uses feedforward suppression for figure-ground segmentation and border-ownership assignment.


Introduction
Figure-ground segmentation assigns visual elements to either objects or background and is a primary step in visual perception. Two main processes in figure-ground organization are boundary assignment and surface segregation (Fig. 1). Boundaries are detected on the basis of local contrast between visual elements and are assigned to the figural region, not to the surrounding background region. This assignment is called border-ownership. For example, in figure 1a, when the visual system assigns the contrast borders to the light grey area, a vase is perceived on a black background. If the same contrast borders belong to the black regions, two monkey faces are perceived and the light grey area becomes background. Surface segregation is based on the comparison of locally identified visual features across space. A surface is segregated from the background by grouping operations according to Gestalt principles, where similar elements are grouped into coherent objects. For example, in figure 1b, the individual oriented line segments in the centre are grouped together because they share the same orientation, and they are segregated from the elements in the surrounding region because these differ in orientation. Consequently, a textured figure overlying a homogeneous background is perceived. A key factor for figure-ground organization is thus the combination of local with global scene information. In the visual cortex, contextual influences on neuronal activity have been interpreted as the neural substrate of figure-ground perception [1].
Intrinsic horizontal connections that connect surrounding neurons convey information from beyond the classical receptive field and can provide contextual information of the target stimulus.
However, it has been shown that contextual suppressive effects arise from large regions (4-7 mm), while the horizontal spread of axons is limited (up to a 3.5-4.5 mm radius in monkey V1). Together with the slow conduction velocities of these fibers (typically 0.1-0.2 m/sec), these observations cast doubt on a role for horizontal connections in perceptual integration. Feedback projections from higher visual areas to lower areas are more suitable to provide the contextual information necessary for figure-ground segmentation. Feedback projections have high conduction velocities (~3-10 m/sec), have a large spread in V1, and influence surround-mediated responses in V1. Finally, theoretical and most, if not all, computer models explain figure-ground segmentation by recurrent processing through horizontal and/or feedback connections.
Yet several arguments are inconsistent with a leading role of feedback projections in producing contextual effects and figure-ground segmentation. For instance, V2 is the main contributor of feedback to the primary visual cortex, yet inactivation of V2 has no effect on centre-surround interactions of neurons in the primary visual cortex [2]. Surround effects are primarily suppressive, but blockade of intra-cortical inhibition does not significantly reduce surround suppression [3]. Surround suppression is fast and may arrive even earlier than the feedforward-triggered excitatory classical receptive field response [4,5]. This timing is inconsistent with contextual modulation by late feedback. Also, surround suppression in the monkey LGN emerges too fast for an involvement of cortical feedback [6].
In contrast, apart from carrying the sensory information, the role of feedforward projections in producing surround effects related to figure-ground segmentation is unknown. Many findings, however, indicate that contextual effects may modify the feedforward signal, and extra-classical surround suppression is present at the first stages of sensory processing in the retina, LGN and V1. The aim of this study is therefore to better understand the role of feedforward connections in figure-ground organization, in particular in surface segmentation and border-ownership coding. For that reason we constructed a purely feedforward spiking model omitting horizontal and feedback connections (fig. 2) and tested the model on figure-ground segregation of textures (figs. 3, 4) previously used in primate [7,8] and computational [9,10] studies. By means of feedforward surround inhibition, our simple three-layered model performs figure-ground segmentation and one-sided border-ownership coding.

Feature representation in layer 1
The figure-ground image is accurately represented (fig. 5) because the input was mapped directly onto the first layer. Thus, only the spiking neurons [11] at the figure location of the first feature map and at the background region of the second feature map fired spikes. Initially these neurons had a higher firing rate (<100 ms, 180 sp/s) and then settled to a more constant rate (>100 ms, 110 sp/s). Neurons that did not receive input from the figure-ground stimulus (black regions of the input patterns) showed a slight hyperpolarization before the membrane potential stabilized around −64 mV.

Figure-ground segmentation in the second layer
Whereas neurons in the first layer received continuous input from the figure-ground image, neurons in the second layer received spiking input from the first layer. Each neuron received retinotopic excitatory input and global inhibitory input from all spiking neurons in the first layer. For feature map 1 (the central figure) the spatial pattern of spiking activity in the second layer mirrored the excitatory input pattern (figs. 5, 6). In contrast, for feature map 2 (background) the spatial activity pattern changed compared to the input pattern. Neurons that received excitatory input became quiescent, and neurons that did not receive excitatory spiking input fired spikes. This result is explained by rebound spiking as a result of the relatively strong global inhibitory input. In the second feature map many layer 1 neurons were activated by the relatively large background region, which provoked a strong suppression of all layer 2 neurons. For the neurons located on the background this inhibition neutralized the retinotopic activation. For the neurons located at the centre (representing the figure location) this global inhibitory signal was the sole input, resulting in a strong and rapid hyperpolarization of the membrane potential, which caused rebound spiking of these cells. Such a phenomenon of surround activation of otherwise unstimulated neurons has also been described in primate V1 [12]. Moreover, our observation agrees with the notion of cue-invariant figure-ground segregation in the visual cortex [13,14]. Thus for both feature maps figure-ground segregation was achieved: neurons located in the central figural region were active while background neurons were silent. The activation of layer 2 neurons by global inhibitory input was thus independent of direct retinotopic sensory input. Note, however, that rebound spiking is not essential for segregating figure from ground; what matters here is that background neurons are silent.
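The asymmetry between the two feature maps can be sketched with simple arithmetic (a schematic, rate-free sketch in Python; the weight values `w_exc` and `w_inh` are illustrative assumptions, not the parameters of the spiking model):

```python
import numpy as np

def layer2_drive(feature_map, w_exc=1.0, w_inh=0.005):
    """Net feedforward drive to each layer-2 neuron: retinotopic
    excitation minus global inhibition pooled over every active
    layer-1 neuron of the same feature map (illustrative weights)."""
    global_inh = w_inh * feature_map.sum()
    return w_exc * feature_map - global_inh

N = 16
figure = np.zeros((N, N)); figure[6:10, 6:10] = 1.0  # feature map 1: small central figure
ground = 1.0 - figure                                # feature map 2: large background

d1 = layer2_drive(figure)  # few active layer-1 cells: figure cells stay depolarized
d2 = layer2_drive(ground)  # many active layer-1 cells: background cells are cancelled,
                           # and figure cells receive purely hyperpolarizing drive
                           # (the trigger for rebound spiking in the spiking model)
```

With these numbers the figure map yields positive drive at the figure location, while the background map yields negative drive everywhere, most strongly at the unstimulated figure location.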

Assignment of border-ownership in the third layer
Besides figure-ground signals, many neurons in early visual cortex (18% in V1 and 59% in V2) show selective responses to contour borders [15]. In particular, neurons in V2 preferably respond to a contour when it belongs to one side of a figural region and not to the other side of the figure; a phenomenon called one-sided border-ownership assignment. To explain one-sided border-ownership, we applied a basic aggregation of separate sub-regions of receptive fields [16,17], where neurons in the third layer receive both an excitatory and an inhibitory connection from two neighboring neurons located in the second layer of the same feature map. In this way, borders can be detected if the excitatory sub-region receives feedforward input and the inhibitory sub-region does not. For example, layer 3 neurons respond when the excitatory sub-region falls on the figural part and the inhibitory one falls on the background (fig. 7). In essence, the idea of opposite receptive field sub-regions is reminiscent of the opponent model for border-ownership coding as proposed by Zhou et al. [15]. Figure 8 shows the border-ownership coding for a single figure and for two partially overlapping figures. Here neurons respond only when the border of the figure is located at the left side of the receptive field. Activation of both excitatory and inhibitory sub-regions does not lead to a neural spiking response in the third layer; hence, the surface of the figure is not detected. In the case of two overlapping figures, the local contrast between the two figures should be sufficiently large to determine border-ownership assignment.
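This sub-region pairing can be illustrated in a few lines (a minimal sketch assuming binary layer-2 activity and a right-hand inhibitory neighbour; the function name and the choice of neighbour are ours, for illustration only):

```python
import numpy as np

def border_ownership(layer2_activity):
    """Each layer-3 cell receives excitation from layer-2 cell (i, j) and
    inhibition from its right-hand neighbour (i, j+1).  It fires only where
    the excitatory sub-region lies on the active surface and the inhibitory
    sub-region on the silent one, so only one side of the figure is signalled."""
    exc = layer2_activity
    inh = np.zeros_like(layer2_activity)
    inh[:, :-1] = layer2_activity[:, 1:]   # right-hand neighbour; edge column has none
    return ((exc - inh) > 0).astype(int)

N = 16
fig_map = np.zeros((N, N), dtype=int); fig_map[6:10, 6:10] = 1
bo = border_ownership(fig_map)  # responds along one side of the figure only;
                                # interior cells (both sub-regions active) stay silent
```

Pairing the sub-regions the other way round, or drawing them from the other feature map, yields the complementary side preference, in line with the combinations discussed below.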
Three types of edge-detection cells have been described: cells coding edge contrast polarity, cells coding border-ownership, and cells coding a combination of the two. In principle our model can reproduce all the different types of neurons that signal contrast borders by applying different combinations of sub-regions from the second layer and/or from the first layer and from the two feature maps. A further consequence of such combinations of sub-regions is that neurons coding border-ownership are orientation selective. This has also been described in the visual cortex, where edge detection is mainly observed for neurons that have an orientation preference [15]. In the visual cortex, contextual interactions are complex and heterogeneous and are observed for stimuli far outside the classical receptive field. For texture stimuli, Zipser [13] reported figure-ground modulation in V1 for figures up to 10-12°. They further reported a dependency of modulation strength on figure size. Size tuning of surround suppression has also been reported for drifting sinusoidal gratings up to 10 degrees in V1 and LGN [3]. Similarly, surround effects for uniform stimuli extend from 20 up to 40 degrees beyond the classical receptive field [18]. In the case of border-ownership, contextual effects are observed for stimuli 20° from the target stimulus and show only a mild size dependency [15]. To test the model behavior as a function of stimulus size we applied different figure sizes. Figures as small as 1×1 pixels up to figure sizes of 46×46 pixels are detected properly. Compared to small figures, large figures (>32×32 pixels) have a ~40% weaker response modulation (180 sp/s vs. 145 sp/s). The same holds for the border-ownership signal in the third layer, as it is based on the occurrence of the figure-ground signal in the second layer. These results can be explained by the fact that inhibition increases with figure size, thereby lowering the total input to the neurons (fig. 9).
The inhibitory contribution to the input of layer 2 neurons as a function of figure size is shown in figure 9. Because the responses of layer 2 neurons do not follow the change in inhibition in the same way, figures (e.g. 46×46 pixels) larger than the background can still be detected correctly.
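The size dependence follows directly from the pooling rule: with a fixed inhibitory weight per active layer-1 neuron, the inhibitory contribution grows with figure area (a back-of-the-envelope sketch; the weight value is an assumption):

```python
def inhibitory_contribution(fig_side, w_inh=0.005):
    """Global inhibition onto each layer-2 neuron of feature map 1 scales
    with the number of active layer-1 neurons, i.e. with the figure area."""
    return w_inh * fig_side ** 2

# larger figures recruit more inhibition, lowering the net input
small, large = inhibitory_contribution(8), inhibitory_contribution(32)
```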
We also tested the model with more than one figure. Figure 10 presents the results of figure-ground segregation for four figures. Earlier psychophysical experiments demonstrated that these textures are not ambiguous, i.e. the multiple squares are perceived as figures and not as background [19]. For multiple figures, feature detection occurs in the first layer, while figure-ground segregation is observed in the second layer and border-ownership in the final layer. We also tested the network performance on the outlines of one or multiple figures. Outlines were 1 pixel wide and contained concave and convex regions. The inner part of the outline belonged to the second feature map, i.e. to the background. The results show that for both feature maps the outline is detected accurately for a single figure as well as for multiple figures (figs. 11, 12). Here, compared to responses to homogeneous textures, responses to figure elements are enhanced and responses to ground elements, where a figure is presented outside the receptive field, are weakened. This push-pull effect becomes weaker with decreasing stimulus contrast [8]. Besides stronger response modulations, our data show that increasing contrast produces a shorter onset latency of the figure-ground signal (fig. 13c). For a high-contrast figure, the onset latency is about 3 times shorter than for a low-contrast figure; a phenomenon also observed in the visual system ([20]; see fig. 13c). Similar results hold for border-ownership assignment because, in our model, the occurrence of border-ownership assignment is directly related to the timing of figure-ground segregation.

Discussion
The goal of the present study was to examine the role of feedforward connections in figure-ground operations. We found that our three-layered model of spiking neurons could perform figure-ground segregation and one-sided border-ownership assignment in a purely feedforward manner. The feedforward segregation of figure from ground was robust: a decrease of the input contrast by 80% still yielded figure-ground segregation.

Figure-ground segregation & one-sided border-ownership assignment
The first layer transformed the figure-ground texture input into a spike map, which was sent to the layer 2 neurons. In the second layer, neurons received retinotopic excitatory input and global inhibitory connections from all the spiking neurons in the preceding layer of the corresponding feature map. In the first feature map only a minority of the total number of neurons, those at the figure location, contributed to the global inhibitory effect. Consequently, each neuron in the second layer received relatively weak inhibition, which was not sufficiently strong to cancel out the retinotopic excitatory activation (fig. 14). In the other feature map, however, the numerous layer 1 neurons receiving background input together produced a strong global inhibitory input to each second-layer neuron. This inhibitory input was strong enough to cancel out the excitatory activation, thereby silencing the background neurons of the second layer. For those layer 2 neurons that did not receive excitatory input, i.e. the ones at the central figure location, the strong global inhibition resulted in rebound spiking (fig. 14). Thus, in the second feature map, layer 2 neurons at the figure location fired and neurons at the background were silent. Hence, also here the model segregated figure from ground. Note that the important point for figure-ground segregation is not that the figure neurons of the second feature map spike, but that the background neurons are silent (or fire few spikes). In the third layer, neurons received one excitatory and one inhibitory connection from two neighboring neurons in the second layer. Such a combination of spatially separated receptive field sub-regions reproduced one-sided border-ownership assignment (see fig. 8).
Can figure-ground segmentation occur without feedback?
Feedback projections from higher visual areas to lower areas are believed to provide the contextual information necessary for figure-ground segmentation. Yet several studies indicate that feedback projections may not be the sole component producing contextual effects and figure-ground segmentation. For instance, V2 is the main contributor of feedback to the primary visual cortex, yet inactivation of V2 has no effect on centre-surround interactions of neurons in the primary visual cortex [2]. Surround effects are primarily suppressive, but blockade of intra-cortical inhibition does not significantly reduce surround suppression [3]. Surround suppression is fast and may arrive even earlier than the feedforward-triggered excitatory classical receptive field response [4,5]. This timing is inconsistent with contextual modulation by feedback. Also, surround suppression in the monkey LGN emerges too fast for an involvement of cortical feedback [6].
Moreover, removing feedback to V1 (including that from V3, V4, MT and MST, but not V2) impairs figure-ground perception, but does not affect visual detection of textured figure-ground stimuli [22]. This finding implies that figure-ground segmentation occurs without feedback from these extra-striate areas, and without producing visual awareness. This agrees with the belief that figure-ground organization is an automatic process [23]. For example, preserved figure-ground segregation is observed in neglect patients [24], and surface segregation signals evolve independently of attention [19]. Similarly, the assignment of border-ownership precedes object recognition and the deployment of attention [23,25]. Furthermore, the short onset latencies and sometimes incomplete cue invariance suggest that border-ownership assignment is not generated in higher-level visual areas but within the lower visual areas [15]. In addition, figure-ground segmentation depends on the size of the figure region and drops with increasing figure sizes (>8°-12°). This size dependency argues against segregation by feedback, since the termination fields of feedback projections cover large regions of visual space in V1. Finally, an intriguing finding is that contextual neural interactions corresponding to perception are observed at sub-cortical levels in the LGN and even in the retina [26], and that competition for object awareness is fully resolved in monocular visual cortex [27]. Thus, there is considerable evidence against a unique role of feedback in figure-ground segregation, and support for a feedforward component in figure-ground segmentation.
Possible correspondence of the model architecture to the visual system

Visual information entering the retina produces graded potentials, which are converted by ganglion cells into action potentials. Our model neurons in the first layer convert continuous texture input into spikes, and therefore the first layer can correspond to the ganglion cell layer of the retina. The second layer of the model may represent the LGN as the main recipient of ganglion connections. The retinotopic excitatory connections mimic the powerful excitatory synaptic contacts that each LGN neuron receives from one to three retinal ganglion cells [28]. The same retinal ganglion cells also provide inhibitory postsynaptic currents [29]. The influence of inhibition in the LGN, however, comes from a larger retinal region than that of excitation. Likely this is because retinal ganglion cells activate interneurons, resulting in inhibition beyond the region directly activated by ganglion cells [29]. This feedforward inhibition is fast; it takes place at the very beginning of an event-related response [5,29]. For instance, some types of IPSC faithfully follow the EPSC with a latency of 1 ms and are tightly locked to visual stimulation [29]. In our model we reproduced the fast surround inhibition seen in the retinogeniculate system by combining in time the retinotopic excitatory and the global inhibitory input. If our second layer indeed corresponds to the LGN, then figure-ground segregation, particularly for contrast-defined figures, does not start in the cortex but already in the thalamus. Although the existence of figure-ground signals in the LGN is not known, contextual responses matching perception, as well as attention signals, have been described in the LGN (e.g. [26]).
Alternatively, the second layer of our model may correspond to V1. In this case, the LGN is just a relay of retinal information. In a previous version we successfully tested this by adding an extra layer representing the LGN. The thalamocortical connections are highly convergent, maintaining the retinotopic mapping in V1. In V1 they synchronously activate layer 4 spiny cells, which in turn directly activate the upper-layer neurons. Furthermore, thalamocortical synapses specifically and strongly excite the fast-spiking network [30]. Fast-spiking neurons form an inhibitory network connected through electrical synapses and mediate thalamocortical feedforward inhibition [31]. In the visual cortex, feedforward inhibition can suppress large regions [3,4,31,32] and is fast: it can arrive at the target neuron even earlier than excitatory signals [4]. Within the cortex, the conductance of fast-spiking interneurons onto spiny layer 4 neurons is ~10-fold greater than the excitatory conductance [32], and fast-spiking cells mediate strong and fast (<6 ms) thalamocortical feedforward inhibition that can shunt thalamocortical excitation [31,32]. Intra-cortical surround inhibition, on the other hand, is rather slow, taking tens of milliseconds [33]. In our model, the combination in time of excitatory and strong inhibitory inputs mimics the synchronous activation and the strong and fast feedforward inhibition described in the visual system. Finally, inhibition from the surround has been shown to be orientation or direction selective [34-38]. In our model, surround inhibition is also feature-specific, e.g. for orientation.
The last layer may represent V2, which receives its main feedforward input from V1. Neurons in V2 aggregate V1 receptive fields at similar but not identical topographical locations. In such a design, V2 neurons show spatial inhomogeneity in their two-dimensional receptive field structure. V2 receptive fields contain sub-regions that are tuned to similar or dissimilar orientations [16,17]. Accordingly, the response properties of V2 neurons are principally determined by the distribution of the aggregated V1 receptive fields. A further complexity of V2 receptive fields is that V2 neurons combine both excitatory and inhibitory parts of separated receptive fields of V1 neurons [17]. Many of the interactions between the sub-regions are inhibitory, which might be of V2 intra-cortical origin or inherited from V1 [17]. By applying two antagonistic sub-regions of V2 receptive fields we reproduced a simple form of sub-field aggregation of V1 receptive fields. Such a design explains border-ownership assignment. Alternatively, input to V2 may come from LGN cells, in particular the non-standard cells, which project directly to V2. This idea is supported by the observation of V1-V2 correlograms that are centered on zero, indicating coincident firing driven by common input. However, the LGN-V2 connection and its functions are as yet poorly described.

Onset latencies of figure-ground signal and border-ownership coding
A notable outcome of our model is that the figure-ground signal pops out immediately after the arrival of the first spikes, both at the border and at the centre of the figure. Border-ownership assignment also occurred at a similar time as figure-ground segregation. At first glance this may seem odd compared to the often-reported late onset of figure-ground segregation in the visual cortex. General, non-specific surround suppression is one of the earliest contextual effects, taking about 7 ms to develop after response onset [39]. The orientation-specific modulation of responses to centre-surround stimuli occurs a bit later, around 15-20 ms after response onset [39]. Lamme reported onset latencies for figure-ground modulation of 60-120 ms after stimulus onset, which corresponds to 30-60 ms after response onset [40]. In another study, early textured figure-ground segregation was observed 40-80 ms after stimulus onset [41] and did not differ between V1 and V2 neurons; there, figure-ground segregation started 20-60 ms after response onset. Border-ownership assignment for color and grey stimuli starts at ~70 ms after stimulus onset, both in V1 and V2 [15]; this is within 10 ms in V1 and within 25 ms in V2 after response onset. Thus, although frequently described as having a late onset, neural signatures of figure-ground segregation and of border-ownership assignment can arise fast (as fast as 10-20 ms after response onset) both in V1 and V2. So, at closer look, our finding of 5 ms after stimulus onset agrees with the fastest reported onset latencies of figure-ground signals. Moreover, in the visual system, features like orientation first need to be computed before figure-ground segregation can take place, so the time needed to process features is included in the reported onset latencies of figure-ground modulation.
In our model, however, feature specificity was implicitly encoded and thus did not add extra time to the onset of figure-ground segregation. So, when corrected for a latency of ~10 ms for orientation tuning to arise, our figure-ground latencies are close to those observed in the visual system.

Segregation of boundary and surface of the figure
Boundary detection and surface filling-in are further issues related to the onset of figure-ground segregation. Neurophysiological observations show that figure-ground modulation occurs first at the border of the figure, followed by modulation of the center region of the textured figure [40-42]. These findings can be interpreted as a filling-in process or, alternatively, as two independent processes of border detection and a grouping operation where surface responses simply lag behind the border responses. The finding that surface signals, and not boundary signals, are reduced by extra-striate lesions [40] argues for two distinct mechanisms. Also, the finding that the onset of the modulated responses across the whole surface is the same [40] argues against a gradual filling-in process for textured stimuli over time and favors independent mechanisms for boundary and surface detection. Our data show that the whole figure popped out instantaneously and that no filling-in process of the figural region took place. Therefore, our model data fit the idea of two independent mechanisms for local border and surface detection. Local border detection, however, is absent in our model. This absence is explained by the fact that border detection is based on the comparison of local features, where discontinuities form a boundary. To detect local discontinuities, interactions between features are needed; in our model such interactions were not implemented, and thus boundary detection is not possible.
One model [58] may appear to be feedforward. However, this conductance-based model does not use differential equations and can therefore easily be re-defined and interpreted as a feedback model. More important is that the Sakai & Nishimura 2006 model is based on surround fields (iso-orientation suppression and cross-orientation facilitation). These surround fields were not explicitly modeled; instead, numerous (hundreds of) different positions and sizes of surround fields were designed and tested. The neural origin of these surround effects lies in information within V1 [35,36], and it has been demonstrated that in the visual system surround effects are mediated by long-range horizontal connections [59]. Most models are conductance-based, excluding the rich and complex response behavior of neurons. Some models rely on lateral connections for figure-ground organization [49] and demonstrate that feedback is in principle not necessary. However, lateral latencies in the visual cortex are too long to explain contextual effects in figure-ground organization. Other studies add feedback projections to improve the performance of the model; there, the role of feedback is to suppress noise and to enhance figure-ground effects [48,54,55]. These results fit the idea that top-down control has a push-pull effect where relevant signals are enhanced and irrelevant signals suppressed.

Limitations and predictions of our model
Our intention was to test feedforward segregation of textures that had previously been studied in primates and computer models, and which are believed to depend on recurrent processing. The model was not designed for complex or natural images, nor was the intention to obtain state-of-the-art figure-ground segmentation. Rather, to understand the role of feedforward connections in figure-ground segregation, we constructed a minimalistic feedforward architecture. We deliberately omitted recurrent processing, thereby severely constraining the possible outcomes of the model. For example, feature interactions are not possible in the current network because of the lack of horizontal connections. Nevertheless, our simple network advocates a feedforward organization of figure-ground. According to our model data, one-sided border-ownership coding does not depend on local feature contrast but is based on surface segregation of the figure. Thus, our model predicts that local border detection and border-ownership coding employ different neural mechanisms. Furthermore, we modeled global inhibition by adding a negative weight to the feedforward connections and not by introducing local inhibitory cells in layer 2. In this way, the combination in time of excitatory and strong inhibitory inputs mimics the synchronous activation and the strong global inhibition described in the early visual system. Further studies should reveal how figure-ground segregation occurs when inhibitory cells are included. Finally, considering the simplicity of our model, figure-ground segregation may already occur at the earliest stages of visual processing.

Conclusion
In the visual system it is not possible to separate axonal circuits and analyze their function in isolation. Computational modeling of neural networks offers a complementary approach that allows such circuits to be dissected. Using biophysically realistic spiking neurons, we tested to what extent feedforward connections contribute to the neural mechanisms underlying figure-ground organization. Our simple, three-layered feedforward spiking model performs figure-ground segmentation and one-sided border-ownership coding. It turns out that global inhibition and rebound spiking are important ingredients for figure-ground organization. We conclude that figure-ground organization includes, besides feedback, also a feedforward component.

Model architecture
The model is composed of three layers, each containing two arrays of N×N units, or neurons, of the Izhikevich type ([11]; see fig. 2). For all layers we used N = 64. Lower and higher values of N were also tested and did not affect model performance. The two separate arrays of each layer represent two neuronal cell populations with opposite preference for a single feature.
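A single unit of this type can be simulated in a few lines (a sketch using the published Izhikevich equations with a phasic-bursting parameter set from [11]; the exact parameter values of the present model are not restated here, so the numbers below are assumptions):

```python
def izhikevich(I, a=0.02, b=0.25, c=-55.0, d=0.05, dt=0.2, T=200.0):
    """Euler integration of the Izhikevich neuron.  The defaults are the
    phasic-bursting set from Izhikevich [11]; I is a constant input current
    in the model's dimensionless units.  Returns the spike times (ms)."""
    n = int(T / dt)
    v, u = c, b * c                     # initial conditions v(0) = c, u(0) = b*v(0)
    spikes = []
    for k in range(n):
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + I)   # eq. (1)
        u += dt * a * (b * v - u)                            # eq. (2)
        if v >= 30.0:                                        # reset, eq. (3)
            spikes.append(k * dt)
            v, u = c, u + d
    return spikes

bursts = izhikevich(I=0.6)   # a constant step current elicits an initial burst
quiet = izhikevich(I=0.0)    # without input the cell settles to rest
```

A phasic-bursting unit fires a burst at stimulus onset and then falls largely silent, which is the property the model exploits to report the beginning of stimulation.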

Connections
The feedforward connections between the layers are divided into excitatory and inhibitory connections (fig. 2b). All excitatory connections are retinotopic (point-to-point connections), where neuron N_ij in one layer solely connects to neuron N_ij in the next layer. Thus the excitatory part of a neuron's receptive field has size one. The pattern of inhibitory connections differs between layers. Neurons in the first layer do not receive inhibitory signals from the texture input. In the second layer, all neurons of a feature map (see Inputs below) receive inhibition from all neurons located in the same feature map of the first layer. In the third layer, a neuron N_ij receives feature-specific inhibition from a neighbor of neuron N_ij located in the second layer. In principle there are eight neighbors; for simplicity we chose only one (see fig. 2b). Inhibition is achieved by assigning negative weights to the connections. Neither intralaminar connections, i.e. horizontal connections between neurons within or across feature maps, nor feedback connections, i.e. connections from higher layers to lower layers, are included in the network architecture.
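Under these rules the weight structure for one feature map can be written down explicitly (a sketch with flattened indexing; the weight magnitudes are illustrative assumptions, and the right-hand neighbour stands in for the one chosen neighbour):

```python
import numpy as np

def build_weights(N, w_exc=1.0, w_inh_global=-0.005, w_inh_neigh=-1.0):
    """Feedforward weights for one feature map, neuron (i, j) -> index i*N + j.
    W12 (layer 1 -> 2): retinotopic excitation plus all-to-all inhibition.
    W23 (layer 2 -> 3): retinotopic excitation plus inhibition from a single
    neighbour (here the right-hand one); edge neurons lack that neighbour."""
    n = N * N
    W12 = w_exc * np.eye(n) + w_inh_global * np.ones((n, n))
    W23 = w_exc * np.eye(n)
    for i in range(N):
        for j in range(N - 1):
            W23[i * N + j, i * N + j + 1] = w_inh_neigh
    return W12, W23

W12, W23 = build_weights(4)
```

Note that with this formulation each layer-2 neuron's own retinotopic partner also contributes to the global inhibitory pool, which matches the all-to-all description above.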

Inputs
The studied textured figures are arrays of N×N pixels, with N as in the model, containing one or four centered squares (fig. 2c, 3a). Input arrays are binary (0 or 1) and correspond to the preference for a single visual feature, such as luminance, orientation, direction of motion, or color. In other words, 1 stands for optimal tuning whereas 0 stands for the opposite. For every shape its binary complement is also included (fig. 2c, 3a). The complementary input thus represents the reverse preference for the visual feature. These two arrays are referred to as feature map 1 and feature map 2. For instance, the first one corresponds to the orientation of the line segments in the centre square of figure 1b and the second to the surrounding line segments, which have the opposite orientation (see fig. 2c). Together they form the figure-ground texture. The two feature maps are processed by separate neuronal pathways (channels). We also used the outlines of the figures as input (fig. 3b). For border-ownership coding, single squares were placed to either the left or the right side to clarify the side preference (fig. 4a). Two partially overlapping squares were also used for border-ownership (fig. 4b). In this case, the two small squares (figures) belong to one feature map. The pixel values of the additional square were 0.3.
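A minimal sketch of such an input pair, assuming a single centred square; the figure size used here is an arbitrary illustrative choice, not a value specified in the text:

```python
import numpy as np

def make_feature_maps(N=64, size=16):
    """Binary figure-ground texture: feature map 1 is a centred
    size x size square of ones on a background of zeros; feature map 2
    is its binary complement, i.e. the reverse feature preference."""
    fm1 = np.zeros((N, N))
    lo = N // 2 - size // 2
    fm1[lo:lo + size, lo:lo + size] = 1.0
    fm2 = 1.0 - fm1
    return fm1, fm2
```

Every pixel is covered by exactly one of the two maps, so fm1 + fm2 is an array of ones; the two maps then drive the two separate channels.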

Neuronal cell type
Hodgkin-Huxley models are too slow for network operation, while integrate-and-fire models are unrealistically simple and incapable of producing the rich spiking and bursting dynamics exhibited by cortical neurons. We therefore opted for the spiking neurons of Izhikevich [11]. These neurons combine the biological plausibility of Hodgkin-Huxley-type dynamics with the computational efficiency of integrate-and-fire neurons, and are capable of producing the rich firing patterns exhibited by real biological neurons. We chose the neurons to be phasic bursting because feedforward connections rely on bursting neurons, which report the beginning of stimulation by transmitting a burst. In the brain, bursts are important to overcome synaptic transmission failure and to reduce neuronal noise. They can also transmit the saliency of the input, and bursts can be used for selective communication between neurons [11]. In the evolution law for v (eq. 1), a capacitance factor C was omitted [11]. When dimensions are reintroduced, voltages are read in mV and time in ms. As initial conditions at t_0 = 0 we set v(t_0) = c, u(t_0) = b·v(t_0) for all positions in our arrays (since we deal with two-dimensional objects, equations (1) and (2) are actually meant for v → v_ij, u → u_ij, I → I_ij, with i,j = 1, …, N, and condition (3) is in fact applied to v_ij, u_ij). We used the Euler method with Δt = 0.20 ms.

Model dynamics
The input current I in (1) is the result of summing different matrix contributions of the form

I = w_exc · F + (w_inh / N^2) · (Σ_{k,l} F_kl) · 1_NxN,

where 'exc' stands for 'excitatory', 'inh' for 'inhibitory', and i,j are spatial indices. Further, for layers 1 and 2, F is either the two-dimensional figure itself or the binary array defined by the presence of spikes, i.e., with ones where condition (3) is satisfied and zeros elsewhere. The symbol 1_NxN denotes an N×N matrix containing just ones. Since excitatory receptive fields have size one, excitatory signals are point-by-point (retinotopic) copies of F itself, multiplied by the corresponding weight. The inhibitory part, whose associated receptive field has the same size as F, produces a spatially constant term (hence the 1_NxN matrix), which is proportional to the normalized sum of all the F coefficients times the inhibitory weight. Thus, all layer-2 units within a feature map receive the same inhibitory input. In our design, the employed weights are w_exc = 3, w_inh = 0 for the texture input and w_exc = 400, w_inh = −900 for the signals from layer 1 to layer 2. The weight values are the result of a heuristic process and can be changed without critically affecting the model performance.
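The input rule and one Euler step of the unit dynamics can be sketched as follows. This is an illustrative implementation, not the authors' code: the Izhikevich parameters a, b, c, d are assumed phasic-bursting values from [11], and the threshold-and-reset logic is the standard form of that model.

```python
import numpy as np

def input_current(F, w_exc, w_inh):
    """Retinotopic excitation plus a spatially constant inhibitory term:
    I = w_exc * F + w_inh * (sum of F) / N^2. The scalar second term
    broadcasts over the array, playing the role of the 1_NxN matrix."""
    N = F.shape[0]
    return w_exc * F + w_inh * F.sum() / N**2

def izhikevich_step(v, u, I, dt=0.20, a=0.02, b=0.25, c=-55.0, d=0.05):
    """One Euler step (dt in ms). Units that crossed the 30 mV threshold
    are reset (v <- c, u <- u + d) before integrating the dynamics."""
    spiked = v >= 30.0
    v = np.where(spiked, c, v)
    u = np.where(spiked, u + d, u)
    dv = 0.04 * v**2 + 5.0 * v + 140.0 - u + I   # evolution law for v
    du = a * (b * v - u)                          # recovery variable
    return v + dt * dv, u + dt * du, spiked
```

Driving resting units (v = c, u = b·c) with a strong constant current makes them fire within a few steps, which is what carries the figure signal forward.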
The path from layer 2 to layer 3, where border-ownership assignments take place, may be described in terms of two receptive subfields, inhibitory and excitatory, both of size one and next to each other. Their working is more easily expressed by means of the convolution

I_3 = w_3 · (t_BO * F_2),

where I_3 indicates the total input to layer 3, w_3 is the weight (w_3 = 200), F_2 is the spike map at layer 2, * denotes convolution, and the applied filter is given by the 2×1 matrix t_BO = (−1 1)^T.
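Applying the 2×1 filter amounts to pairing each unit's excitation with inhibition from a single neighbour. A sketch, with the neighbour taken as the unit one row above and zero padding at the array border (both assumptions for illustration):

```python
import numpy as np

def layer3_input(F2, w3=200.0):
    """I3 = w3 * (t_BO convolved with F2), t_BO = (-1, 1)^T: each
    layer-3 unit is excited by unit (i, j) of the layer-2 spike map and
    inhibited by the neighbour one row above (zero padding at the edge)."""
    above = np.zeros_like(F2)
    above[1:, :] = F2[:-1, :]          # spike map shifted down one row
    return w3 * (F2 - above)
```

For a square spike map, excitation and inhibition cancel inside the figure, so only one side of the figure's border receives positive input, i.e. one-sided border-ownership coding.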