University of Groningen Implications of behavioral architecture for the evolution of self-organized division of labor

Division of labor has been studied separately from a proximate self-organization and an ultimate evolutionary perspective. We aim to bring together these two perspectives. So far this has been done by choosing a behavioral mechanism a priori and considering the evolution of the properties of this mechanism. Here we use artificial neural networks to allow for a more open architecture. We study whether emergent division of labor can evolve in two different network architectures; a simple feedforward network, and a more complex network that includes the possibility of self-feedback from previous experiences. We focus on two aspects of division of labor; worker specialization and the ratio of work performed for each task. Colony fitness is maximized by both reducing idleness and achieving a predefined optimal work ratio. Our results indicate that architectural constraints play an important role for the outcome of evolution. With the simplest network, only genetically determined specialization is possible. This imposes several limitations on worker specialization. Moreover, in order to minimize idleness, networks evolve a biased work ratio, even when an unbiased work ratio would be optimal. By adding self-feedback to the network we increase the network’s flexibility and worker specialization evolves under a wider parameter range. Optimal work ratios are more easily achieved with the self-feedback network, but still provide a challenge when combined with worker specialization. Citation: Duarte A, Scholtens E, Weissing FJ (2012) Implications of Behavioral Architecture for the Evolution of Self-Organized Division of Labor. PLoS Comput Biol 8(3): e1002430. doi:10.1371/journal.pcbi.1002430 Editor: Olaf Sporns, Indiana University, United States of America Received August 30, 2011; Accepted February 1, 2012; Published March 22, 2012 Copyright: 2012 Duarte et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Funding: A. Duarte was supported by a Ubbo Emmius PhD scholarship, attributed by the Centre for Ecological and Evolutionary Studies, University of Groningen. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing Interests: The authors have declared that no competing interests exist. * E-mail: f.j.weissing@rug.nl ¤ Current address: Department of Zoology, University of Cambridge, Cambridge, United Kingdom


Introduction
Division of labor is ubiquitous in nature. The major evolutionary transitions, such as the separation of germ and soma and the transition from prokaryotes to eukaryotes, were accompanied by an increase in division of labor [1]. The transition from solitary to eusocial in insects encompasses the evolution of a reproductive caste and a sterile worker caste. Furthermore, division of labor among sterile workers also evolved, in which different groups of workers specialize in different functions, such as foraging and brood care [2]. Colony growth and survival is strongly dependent on the coordinated interaction of a large number of workers. This nonreproductive division of labor is therefore often considered a major determinant of the ecological success of eusocial insects and will be the focus of the work presented here.
Empirical evidence suggests that eusociality has evolved in associations of close kin [3,4]. Variation in behavioral tendencies can be found in forced associations of non-social individuals, leading to incipient forms of division of labor [5,6]. Undoubtedly, a source of variation is key to generating consistent inter-individual differences and task specialization [7]. The questions that arise are how and why such variation arises among close kin. Here we explore some of the mechanisms and conditions through which task specialization can evolve in groups of related individuals.
Recent work on division of labor in insect societies has focused on the self-organization properties of colony behavior. According to a variety of models [8][9][10][11][12] colony properties emerge from the behavior of individual workers whose reactions to the environment is governed by simple rules. The behavioral rules leading to emergent specialization are probably shaped by natural selection [10,13], yet only few studies have focused on the evolution of these rules [14,15]. Previous work focusing on the benefits of task specialization in other systems (e.g. enzyme-substrate specialization, coordination in co-viruses) generally disregard the mechanisms underlying it, viewing instead specialists and generalists as fixed behavioral strategies [16,17]. It is thus important to develop models that integrate the evolutionary and self-organization perspective, in order to create a better understanding of division of labor and its evolution [7].
In previous work, we took the response threshold model [1] as a starting point for an evolutionary model for division of labor (A. Duarte, I. Pen, L. Keller and F.J. Weissing, submitted). In the response threshold model, individuals compare an environmental stimulus for a task with their response thresholds; they perform the task if the stimulus is above their threshold, otherwise they remain idle. Using this predefined behavioral architecture, we allowed the evolution of threshold values and showed that division of labor can evolve from a homogeneous population via evolutionary branching, but only if there are clear fitness benefits of individual specialization. Our work also revealed that the response threshold model has the drawback that it imposes severe constraints on the distribution of workers over tasks.
Here we look at a more flexible behavioral architecture that is represented by a simple artificial neural network (ANN). ANNs simulate the processing of stimuli by individuals, from stimulus perception by receptor nodes to effector nodes determining the behavioral output [18,19]. ANNs have been used in evolutionary robotics to understand the evolution of communication and cooperation [20][21][22]. In a recent paper, Lichocki et al. showed that ANN's, in comparison to response threshold mechanisms, allow for more efficient worker allocation through task switching [23]. Here we examine the effect of the architecture of ANN's in worker specialization and worker allocation, in a context where task switching is detrimental.
In the response threshold model, the response to taskassociated stimuli is determined by task-associated thresholds.
The stimuli, which reflect the colony's need for work on the various tasks, change dynamically due to two factors: there is an inherent tendency for the stimuli to increase, and they are decreased whenever the corresponding task is performed. We keep most assumptions of the threshold model but allow the taskassociated stimuli to be processed by an ANN. In principle both the architecture of the network and the way information is processed could evolve [24,25], however, we for simplicity, we focus on predefined architectures (with a fixed number of receptor and effector nodes) and allow only for the evolution of connections between the nodes. The stimuli are processed by an ANN consisting of two receptor nodes and two effector nodes (Figure 1). In a second part of our study, we keep the same network structure but allow for the evolution of a feedback from the effector nodes to the processing of the stimuli ( Figure 1C). In other words, an effect of previous experience on current decisions can evolve. An effect of previous experience on task preference, leading to division of labor, has been observed in natural colonies [26], thus it would be interesting to observe under which circumstances it could evolve.
We investigate if these slightly more sophisticated mechanisms for processing input allow for the evolution of adaptive division of labor. More precisely, we study whether task specialization among workers can evolve and moreover, whether an appropriate distribution of workers over tasks can be achieved. Throughout, the main question is whether, and to what extent, the evolution of self-organized division of labor is determined by the underlying architecture of behavior.

Model
The general aspects of the model follow A. Duarte, I. Pen, L. Keller and F.J. Weissing (subm.). We consider a population of M colonies, each founded by a single-mated individual that produces N workers (typically M~100, N~100). Each colony goes through a work phase consisting of T time steps (T~100), where all individuals perceive stimuli associated with two tasks and decide whether to perform one of the tasks or remain idle. The amount of

Author Summary
In insect colonies, different individuals specialize in different tasks related to colony maintenance and growth. Unveiling why this division of labor evolved and how individuals decide which task to take on is crucial for our understanding of complex group behavior. Here we model the evolution of general behavioral rules for processing environmental signals of task need in social insect colonies, using artificial neural networks. We examine the patterns of individual specialization that arise in the course of evolution. Division of labor is likely to evolve if switching between tasks decreases worker productivity, but the pattern of division of labor across colonies is highly dependent on the architecture of the networks considered. In networks that allow for a feedback of previous experience on future task choice, division of labor can evolve across the whole population of colonies. In networks where this feedback is not allowed, the presence of division of labor is constrained by the specific genetic composition of colonies. Network architecture also affects how fine-tuned the worker allocation to different tasks can be when the tasks have different requirements. work performed and the distribution of workers over tasks determines the fitness of a colony, which corresponds to the number of reproductives produced. Selection occurs because the colonies of a given generation are founded by pairs of reproductives produced in the previous generation. Hence colonies where the workers perform their tasks in the most efficient and coordinated way spread the genes of their foundresses most effectively.
In line with [8], we assume that there are two tasks and two taskassociated stimuli. Stimuli increase each time step by a fixed amount d and decrease by an amount a whenever a worker performs the task, following [8] (a~0:03 and d~1 in our simulations). In the response threshold model, the association between stimuli and task was also expressed in the fact that individuals were more likely to perform a task for which the stimulus was high. However, in the present model, this is not necessarily the case. An association between task and stimulus is present because the performance of a given task decreases a given stimulus. Workers are assessed in random order and, once an individual works, the corresponding stimulus value is immediately decreased, such that the next worker to be assessed experiences a different stimulus value.

Artificial neural networks
The first network studied is a simple feedforward network [19] that consists of two stimulus input nodes and two behavioral output nodes, all four nodes being connected ( Figure 1B). Each input node perceives a task-associated stimulus with a certain error e (drawn from a normal distribution with mean 0 and standard deviation 1). The two signals are then processed and transmitted to the output neurons, via connections with weights w ij that are evolvable properties of the network. Output nodes receive a weighted sum of the stimuli, generally designated activation energy. The activation energy n i of an output node i is thus: Each output neuron is characterized by a threshold h i , which is another evolvable property. If the activation energy of an output neuron exceeds the threshold, the neuron is activated, meaning that an individual is willing to perform the respective task. If both output neurons are activated, one task is chosen at random. Note that the response threshold model implemented in previous work is in fact a special case of the feedforward neural network, where w 11~w22~1 and w 12~w21~0 ( Figure 1A). The main difference between our feedforward ANN model and the response threshold model is thus the evolution of the connection weights that determine how incoming information is processed and interpreted. The initial values of connection weights in our simulations are: w 11~w22~1 and w 12~w21~0 . Changes in the connection weights and thresholds take place when new individuals are produced, via mutation (see below). During the lifetime of an individual, the parameters of its network are fixed. Thus we do not consider the changing of connection weights with learning, for example. The second network architecture studied is a recurrent network [19]. It includes all previous nodes and connections, and in addition it has two self-feedback loops ( Figure 1C). The activation energy in a given time step will affect the activation energy in the next time step: n i (tz1)~P m j~1 w ij : S j (tz1)zf i : n i (t). The connection weight f i given to the previous activation energy (from here on called the self-feedback connection) is also an evolvable property that changes through mutation and natural selection during production of new individuals. During the lifetime of individuals, however, there is no change occurring in the parameters of the networks. Self-feedback connection weights were initialized at zero, which is equivalent to the feedforward network, without any influence of past experience in current decisions.

Fitness
After the work phase, the fitness of each colony is computed based on how much work the workers performed for each task. Fitness is assumed to be proportional to the weighted geometric mean of work done for both tasks: where A i is the total number of acts performed for task i (A. Duarte, I. Pen, L. Keller, F.J. Weissing, subm.). We take the geometric rather than the arithmetic mean in order to ensure that fitness can only be achieved if both tasks are being performed. The weighing factor b allows us to consider the (realistic) situation that not all tasks need to be performed equally often. For the fitness function (2), fitness is maximized if idleness is eliminated (i.e., if A 1 zA 2 is maximal) and if the workers distribute over tasks according to the ratio A 1 : A 2~b : (1{b). In other words, to maximize fitness the proportion p 1 of work allocated to task 1 by the colony should be equal to b: Each generation, 2M reproductive offspring are produced in total in the population. Colonies contribute to the population's pool of sexual individuals in proportion to their fitness. Population size is thus fixed. The reproductive individuals then form M pairs randomly. From each pair one individual will found a new colony with N workers, while the old colonies are eliminated.

Genetic details
We allowed for the evolution of all connection weights and thresholds of output nodes, giving us in a total 6 (resp. 8) evolving traits. These traits are encoded by 6 (resp. 8) gene loci. The alleles at these loci correspond to real numbers, with threshold alleles being larger or equal to zero, while connection weight alleles may also attain negative values. To keep the genetic assumptions as simple as possible, we assume that all individuals are haploid and that the network of each individual is fully determined by its genotype.
Genotypes of workers and sexuals are similarly inherited: Both types of individuals are offspring of the mated colony foundress, and possess alleles for thresholds and connection weights. Our model allows genetic linkage of the threshold loci or linkage of the connection weight loci, but both types of loci are considered to be sufficiently far apart in the genome to make them segregate independently. The degree of linkage is determined by a parameter r (0ƒrƒ 1 2 ) that corresponds to a recombination rate.
With probability 1{r, the threshold alleles (resp. the connection weight alleles) are inherited as a block from one of the two parents; with probability r, the parent whose allele is transmitted is chosen independently of what happens at the other loci.
Mutation occurs with probability m at each locus; when a mutation occurs, the genetic value at that locus is changed by adding a real number to it that is drawn from a normal distribution with mean 0 and standard deviation s m . In our simulations, we typically used m~0:1 and s m~0 :1.

Measuring worker specialization
We evaluate colony-level characteristics such as the proportion of work devoted to each task and the level of individual specialization. For each individual we calculate at the end of a simulation the fraction q of time steps that it stayed in the same task from that time step to the next. We average q over all workers and normalize this measure by dividing q q by the probability that individuals stay in the same task merely due to chance. The latter is given by p 2 1 zp 2 2 , where p i is the proportion of work devoted to task i. By subtracting 1 from the value thus obtained, we obtain a measure of worker specialization that ranges between 21 and 1 (A. Duarte, I. Pen, L. Keller and F. J. Weissing, subm.): When D is close to 1, there is a high degree of division of labor, and individuals stay in the same task much more often than expected by chance. If D is close to zero, workers switch between tasks at random. If D is lower than zero, individuals switch task more often than expected by chance.

Switching costs
Worker specialization can be adaptive if there is a cost to switching tasks (such as a time cost if tasks are confined to different locations, or a cognitive cost), or if specialized workers perform their task with higher efficiency [27]. Here we implemented a time cost scenario, by imposing c time steps of inactivity whenever an individual chooses to switch from one task to the other.

Results
Simulations of the neural network model, with different network architectures were ran for b~1 2 , b~3 4 and switching costs c ranging from 0 to 5 time steps. We also tested the influence of recombination between the loci coding the neural network in the evolution of specialization. There were 10 replicates per parameter combination. The evolutionary patterns of the components of the neural networks were examined (thresholds of output neuron and connection weights) at the population level. Overall, connection weights were far more important than the thresholds in determining the behavior of networks. Hence we do not address here the evolutionary trajectories of threshold loci for the feedforward network. These can be found in the Supplementary Material (Text S1 and Figure S1).

Feedforward network
Optimal worker distribution 1:1. When b~1 2 , both tasks are equally needed, and a 1:1 distribution of workers over tasks would be optimal (see (3)). It is therefore somewhat surprising that, in the absence of switching costs, all replicate populations evolved a work distribution where one of the tasks was performed three times more often than the other (Figure 2A, top panel). From here on we refer to the task performed most often as the ''preferred task''. Which task was preferred varied among replicate populations, but within a population all colonies preferred the same task. Variation among colonies in fitness values was small; all colonies reached approximately 94% of the maximum fitness (see Figure S3). Higher fitness values could not be achieved due to the deviation from a 1:1 task distribution.
Typically in our simulations, both 'incoming' connection weights of one of the two output neurons (the neuron corresponding to the preferred task) became positive over evolutionary time (Figure 2A, bottom panel). As for the incoming connections of the other output neuron (corresponding to the nonpreferred task), the direct connection (w 11 in the example simulation of Figure 2A) became positive, while the crossconnection (w 21 , in Figure 2A) typically became weak, oscillating between positive and negative values. In all simulations, the strongest positive connection was between the stimulus input neuron of the non-preferred task to the output neuron of the preferred task (w 12 , in Figure 2A). Hence, individuals use the stimulus for one task (their non-preferred task) to motivate them for performing the other task (their preferred one). As a consequence, they continue performing their preferred task, even if the stimulus level of this task has become very low ( Figure S2). For this parameter combination (b~K, c~0), the degree of recombination had no effect on the outcome of the simulations ( Figure S4A).
In the presence of switching costs, the results are considerably different. When switching costs were low (c~1), worker specialization only evolved in the absence of recombination (r~0), with 61.467.2% of the colonies (mean 6 SD) evolving values of Dw0.5. When c = 2, worker specialization also evolved in the presence of recombination ( Figure 2B). Here 35.668.2% of the colonies showed Dw0.5. In all simulations with c §2 there was a clear (but weak) positive relationship between colony fitness and the degree of worker specialization within the colony; colonies with high mean specialization have a fitness advantage of approximately 20% over non-specialized colonies ( Figure S3).
The bias in favor of one of the tasks that was observed in the absence of switching costs was much less pronounced or even absent in the presence of such costs. For c~2, initially most colonies show a work distribution close to 1:1 ( Figure 2B, top panel). After about 3500 generations, a new pattern arises, with part of the colonies having a pronounced bias toward task 1, while the other colonies have a bias toward task 2. The simulation shown is representative for higher switching costs (c §2), but to a certain extent the outcome depends on the detailed assumptions. If, for example, recombination was not allowed in the simulation of Figure 2B (i.e., r~0), three different types of colonies evolved (with p 1~0 :4, p 1~0 :5 and p 1~0 :6, respectively; see Text S1 and Figure S4B).
The neuronal connection weights linking input neurons to the corresponding output neurons (i.e., w 11 and w 22 ) tended to evolve positive values, between 0 and 4 ( Figure 2B, bottom panel). One of the cross-connections (i.e., w 12 or w 21 ) showed evolutionary branching [28], that is, polymorphism evolved from an initially monomorphic state. Figure 2B is representative in that w 21 branches into a bimodal distribution, with one branch becoming negative and the other positive. When such branching occurs, two distinctly different types of networks coexist in the population (Figure 3, top panel). This is crucial for worker specialization: a high degree of specialization only occurred in colonies where the two parents differed in the sign of one of their cross-connection weights. From Figure 3 we can deduce how specialization occurs in a colony with dissimilar parents. The key difference between the parents' networks is the genotypic value of w 21 , which determines that one parent (arbitrarily labelled 'male') is a specialist for task 2, while the 'female' shows a large area of the stimulus space where both tasks are activated and where accordingly one of the two tasks is chosen at random (Figure 3, bottom panel). The workers produced by these parents will be divided among those two phenotypes. Stimulus increase initially occurs for both tasks, until the stimuli levels reach a region where individuals with a positive w 21 will perform task 1. As a consequence, only stimulus 2 will keep increasing, until an area is reached where individuals with a negative w 21 will start performing task 2. The decreasing stimulus of task 2 means that fewer workers will do task 1, because the main motivating force to do task 1 is the positive w 21 . Hence, stimulus for task 1 will also increase. Individuals are then in an area of the stimulus space where half of them will work randomly on either task, while the other half will only perform task 2.
For 0vcƒ3, branching occurred at only one of the crossconnections, while for cw3 both cross-connections branched in some of the simulations.
In the absence of recombination, evolution leads to a higher degree of worker specialization ( Figure S4B). Evolutionary branching occurs now for all the connection weights and for the thresholds as well. The area in stimulus space where networks choose both tasks is much smaller in the absence of recombination ( Figure S5), leading to more pronounced differences between workers and, hence, more specialization. Branching of more loci means that networks will be more differentiated than seen previously for cases with recombination. . Optimal worker distribution 3:1. In view of eq. (3), when b~3 4 , the optimal worker distribution over tasks is 3:1, with task 1 being performed 3 times more often than task 2 (i.e. p 1~0 :75).
Populations indeed evolved a worker distribution approaching this value ( Figure 4A, top panel). In absence of switching costs, task 1 was performed 76.760.42% of the time (mean 6 SD across all replicate populations, for r~0:5). All colonies attained more than 99% of maximum fitness, with a few colonies achieving the maximum ( Figure S3). A general pattern in the evolution of connection weights was the strengthening of the cross-connection w 21 and the disappearance of connection w 22 (as in Figure 4A, bottom panel). This explains the observed increase in performance of task 1. The cross-connections once more play an important role; since the strongest incentive to do task 1 comes from the stimulus of task 2, this allows workers to keep doing task 1 even if the stimulus for that particular task is depleted.
Worker specialization did evolve, but only in the absence of recombination (r~0). Even then, specialization levels of D §0:5 were only obtained for a larger number of colonies when switching costs were high (c §3). When worker specialization did not evolve (as in Figure 4B, top panel), colonies evolved work distributions even more biased than p 1~0 :75. When the work distribution is that strongly biased, the probability to stick to the previous task (p 2 1 zp 2 2 ) is high even if tasks are taken on at random. Hence, by evolving a work distribution with more than 80% of the work devoted to task 1the number of switches decreases, thus allowing colonies to avoid switching costs even in the absence of worker specialization. In this case, connection w 22 reached lower values than for the simulations without switching costs ( Figure 4B, bottom panel).

Recurrent network: self-feedback
We tested the behavior of a more complex network, where the activation energy of an output neuron could have a feedback on the activation energy at the next time step ( Figure 1C). The selffeedback connections were allowed to co-evolve with the rest of the network. We ran ten replicate simulations for all the parameter combinations tested above.
Optimal worker distribution 1:1. In contrast to the results of the feedforward network, the optimal worker distribution p 1~0 :5 was now realized in a high proportion of colonies (e.g., figs. 5AB). However, this proportion decreased with increasing switching costs. For c~0, the proportion of colonies with p 1~0 :5 was 99.960.3% when r~0:5 and 100% when r~0. For c~2, this proportion was at 76.564.1% when r~0:5 and 46.767.6% when r~0 (mean 6 SD number of colonies across replicates).
When c~0, all colonies in all replicate simulations achieved the maximum possible fitness, indicating that all workers are active all the time ( Figure S6). Workers switched randomly between tasks (D = 0 for all colonies, Figure 5A). This was achieved by evolving positive self-feedback connections allowing workers to continue working even in the absence of an external stimulus for a task. Connection weights from stimuli input neurons to output neurons were also positive ( Figure S7).
Worker specialization evolved already for low switching costs (c~1), but the behavior shown by colonies, for all cw0, differs considerably in the simulations in the presence or absence of recombination. In the presence of recombination, all colonies within a population reached a high value of D ( Figure 5B). In the absence of recombination, populations typically consisted of colonies with low D and colonies with high D ( Figure 5D). For  Figure 2 are followed. (A) c~0. Top graphs: p 1 increases to the optimal value 0.75; specialization remains low. Bottom graphs: connection weights incoming at output node 1 become positive (strongest connection weight being w 21 ); connection weights incoming at output node 2 become negative (w 12 ) or positive, but very close to zero (w 22 ). (B) c~2. Top graphs:p 1 increases to values above 0.75; D remains low. Bottom graphs: similar to when c~0, but w 22 is closer to zero. doi:10.1371/journal.pcbi.1002430.g004 c = 1, for example, 2567% of the colonies (mean 6 SD across replicates) had Dv0:2, while 6667% of colonies showed Dw0:5.
In the simulations where all colonies exhibited a high level of worker specialization, self-feedback connections evolved very high positive values (as in Figure 6, top panel). The connection weights from task stimulus to corresponding output neuron (w 11 and w 22 ) evolved to positive values, while cross-connection weights (w 12 and w 21 ) evolved to negative values (as in Figure 6, bottom panel). In these simulations, the evolved strategy leading to division of labor uses the strong self-feedback connections, accompanied by negative cross-connection weights, to create differentiation between individuals. Since individuals from the beginning perceive different levels of stimuli, differences in activation energy will occur and will be amplified in subsequent time steps, creating consistent differences among individuals. Hence division of labor is achieved by experience-based specialization. In (A), switching costs are absent: p 1 quickly reaches the optimal value 0.5; worker specialization does not evolve. In (B), c~1: p 1 becomes more variable, but still approximates the optimal value 0.5; D rapidly increases to its maximal value 1, for all colonies in the population. (CD) r~0. In (C), switching costs are absent: the evolutionary dynamics is as in (A). In (D), c~2: not all colonies can evolve worker specialization, and p 1 is also more variable across colonies. doi:10.1371/journal.pcbi.1002430.g005 In the simulations where colonies differ in their degree of worker specialization, neuronal connections (including self-feedback connections) show evolutionary branching, with one branch showing positive values and the other branch negative values or values close to zero (Figure 7). In this case, evolutionary branching allows for the co-existence of different genetically determined specialists, as seen previously for the simpler feedforward architecture.
Optimal worker distribution 3:1. In the absence of switching costs, the mean p 1 calculated across replicates was 0.75 and, hence, corresponding to the optimal value for fitness ( Figure 8A). Interestingly, worker specialization was negative (Dv0) in all colonies in 19 out of 20 simulations (encompassing both simulations where recombination is present as well as where it is absent). In other words, individuals switched more often between tasks than expected by chance.
Worker specialization never evolved for c~1. For 2ƒcƒ3, specialization only evolved in the absence of recombination. These results are shown in the Supplementary Material (Text S2, Figures  S8, S9, S10). For c~3, in two of the replicates, all colonies show high levels of specialization, accompanied by the optimal worker distribution ( Figure S8A). In these particular replicates the self- feedback connections became strongly positive ( Figure S9). In all other replicates only about half the colonies showed Dw0:5, while the other half had no specialization ( Figure S8B). The distribution of workers over tasks was highly variable, with very few colonies actually achieving p 1~0 :75. The networks in these populations showed evolutionary branching of self-feedback connections ( Figure S10).
For higher switching costs (c §4), worker specialization could evolve in the presence of recombination, but only in three replicates out of 20, in the evolutionary time considered (results not shown). In these replicates, all colonies combined high levels of specialization and a work distribution very close to the optimal value of 0.75. Worker specialization was again achieved through two different types of networks; one where evolutionary branching occurs in key neuronal connections (particularly self-feedback connections), and the other through evolution of strong positive self-feedback connections (not shown). The first network type leads to a population where only half the colonies have specialized workers, and the correct work proportion is hardly achieved; the second network type leads to a population where all colonies have a high level of specialization and the optimal work proportion.

Discussion
Here we studied whether and how two different neural network architectures enable the evolution of self-organized division of labor and adaptive task ratios. Our results are summarized in table 1.
With a feedforward network (table 1), worker specialization evolved more easily (i.e. at lower switching costs) in the absence of recombination. In the absence of recombination the connection weights can co-evolve as a tightly linked block of genes, making it easier to evolve specific combinations of connection weights favoring specialization. Recombination   pushes populations into a solution where only one connection weight locus branches, the rest of the network being relatively homogeneous in the population. This allows worker specialization to occur, but to a lesser extent than in the absence of recombination, because at least one of the parent networks in a specialized colony behaves as a generalist for a large range of stimulus combinations. A large percentage of colonies showed no worker specialization, hence, no division of labor. This is because random mating allows for couples with similar genotypes to produce colonies where workers are too similar and therefore division of labor cannot emerge. Previous work on the response threshold model (A. Duarte, I. Pen, L. Keller and F. J. Weissing, subm.) showed that the work ratio could not easily deviate from 1:1, even if a biased work ratio was optimal. In contrast, in the case of the feedforward network, the work ratio was always biased for one of the tasks, even when a symmetric work ratio was optimal (table 1). Owing to selection for minimizing idleness, the evolved networks maximized the amount of work done by using the stimulus from one of the tasks to stimulate workers to perform the other task. In this way, one of the tasks was performed in excess (the 'preferred' task), even when its associated stimulus had been depleted. Although this may seem counter-intuitive, it represents an advantage over networks that attempt to maximize both tasks, because these networks would be limited to the work strictly necessary to reduce stimuli to zero. When b~3 4 , the optimal work ratio was achieved, but only in the absence of switching costs. When switching costs were present, the most common evolved strategy was to increase the proportion of work for task 1 in order to minimize switching among tasks. Some of the limitations of the simple feedforward network were eliminated in the slightly more complex architecture of the recurrent network, where previous activation energies feed back on current activation energies. Worker specialization evolved at low switching costs, now both in the presence and absence of recombination (table 1), at least for b~1 2 . Interestingly, the presence of recombination favored an outcome where all colonies showed a high degree of specialization. In these populations, specialization does not depend on the presence of two complementary networks in the parents of a colony (as in figure 3), but on a strengthening of the self-feedback connections. This allows for initial differences between individuals in stimulus perception to be amplified in subsequent time steps and leads to behavioral differentiation through reinforcement of previous experiences. In the presence of recombination, this strategy prevails. However, when no recombination occurs, evolutionary branching of connection weights is still the prevalent strategy through which worker specialization evolves. Why is the experience-based strategy not observed in all simulations? A likely reason is that to reach this strategy, the values of neural connections must first pass through values where, in the absence of recombination, evolutionary branching is more advantageous. Hence, the evolutionary outcome is dependent on initial conditions. We confirmed this by running simulations where the self-feedback connections were initialized at higher values (e.g., f 1~f2~2 ); in this case all populations evolved the experience-based strategy rather than evolutionary branching (results not shown). The evolution of an experience-based strategy is affected by stochastic effects at the moment that the population passes the ''branching point'', namely on the direction and magnitude of genetic variation, that may lead to local fitness optima. The two strategies may thus represent alternative stable states. The mean population fitness of the genetic specialization (evolutionary branching) is noticeably lower than the mean population fitness of the experience-based strategy ( Figure S7). The recurrent network also allowed for the optimal work ratio to be reached in most cases, at least by part of the population (Table 1), even in the presence of switching costs. When b~1 2 , the self-feedback connections allow the continuous activation of both tasks, stimulating individuals that had previously done a task to do it again, even in the absence of the corresponding task stimulus. With this architecture it is also harder to attain the optimal work ratio when b~3 4 and switching costs are considered, and only few replicate populations show both p 1~0 :75 and high degree of worker specialization.
The recurrent network has similarities with the reinforced threshold model, in which individual thresholds are lowered after the performance of the respective tasks and increased when the tasks are not performed [9,29]. In both models, initial differences in experience lead to consistent behavioral differentiation, thus bypassing the need of specific genetic combinations for the emergence of task specialization. However, in terms of the distribution of workers over tasks, the reinforced threshold model suffers from the same limitations as the fixed threshold model, with worker distribution being mainly dependent on the parameters of stimulus dynamics (A. Duarte, T. Janzen, F.J. Weissing and I. Pen, in prep.).
Our results highlight the importance of considering asymmetries in models of division of labor. In the evolutionary response threshold model by A. Duarte, I. Pen, L. Keller and F. J. Weissing (subm.), we show that a biased p 1 -value cannot be obtained through the evolution of thresholds. To achieve a biased p 1 -value in this model, asymmetry must be present in the environment (e.g. in the values of task-associated stimuli [8]) to which the responsethreshold mechanism then responds. However, in reality, asymmetries in the work distribution might also arise from the ability of individuals to perceive and prioritize tasks differently. Here we show that, for both types of networks studied, it is not easy to evolve strict worker specialization together with an asymmetric distribution of workers over tasks. A major difficulty is that in case of genetically determined specialization the work proportion is dependent, to a large extent, on the proportions of different specialists in each colony. Since we only consider singlemated foundresses, colonies in our model show either equal proportions of the two specialist strategies or only one of the specialist strategies. Evolving experience-based specialization enables an asymmetric work distribution and division of labor (although at a lower degree of worker specialization than under symmetric conditions, and only in the absence of recombination), yet the trajectory towards this strategy is subject to stochastic effects that may diverge evolution towards genetically determined specialization or towards an increase of performance of the most needed task beyond its optimal level.
The observed difficulty in favoring a specific work ratio under switching costs indicates that the simple behavioral architectures investigated are limited in the ability to evolve efficient solutions to complex optimization problems. In the presence of switching costs, it is important for colonies to maximize worker specialization, while at the same time minimizing the number of idle workers and optimizing the work ratio. The behavioral architectures considered thus far were only able to evolve sub-optimal solutions to this multi-faceted problem.
Modelling the evolution of behavioral mechanisms by means of artificial neural networks presents several advantages when compared to a priori chosen behavioral architectures such as a response threshold mechanism. First, mechanisms potentially leading to self-organized division of labor are not built into the model, but must emerge from the model. Second, evolving neural networks transcend some limitation of the human mind. When asked to design plausible mechanisms, the imagination of most modellers is limited to simple and intuitive mechanisms (like a response-threshold mechanism) that our mind can easily envisage. For example, it is unlikely that one would envisage a mechanism where a task-associated stimulus does not stimulate the performance of its corresponding task, but of a different one, as it occurs in the feedforward network. By using an independent modelling setup, we can get an idea whether, and to what extent, the results based on the more standard implementations are robust. In our case, the simple feedforward network is too constrained to achieve worker specialization and an appropriate distribution of workers over tasks. By adding a simple elemental feedback the resulting recurrent network had a much higher evolutionary potential. In future models we could consider the evolution of the network's topology, e.g. by allowing the addition and elimination of neurons and connections to an existing network through mutation [24].
The simple feed-forward neural network was constrained by a problem already present with the response threshold mechanism: to get specialization at the colony level, the coexistence of two specialist genotypes is necessary. Random mating and recombination played an important role in the evolutionary outcome. In general we observed that recombination made it more difficult for genetic specialization to evolve. With recombination, evolutionary branching at multiple loci occurred only rarely, at very high switching costs. This is in accordance with the argument that, in constant environments, recombination may destroy favorable allelic combinations [30,31]. Our model suggests that in systems where strong genetic task determination and high recombination rates exist, multiple mating would be favored, in order to increase the chance that workers have favorable allelic combinations. This is in accordance to what we observe in honeybees [32,33]. Under the recurrent network architecture, recombination may also play a beneficial role by creating more genetic variation in the self-feedback connections, which could favor division of labor emerging through the experience-based strategy.
The purpose of our approach was not to represent the behavioral architecture of real organisms, but to present a conceptual model that could shed some light on the role of architectural constraints in the evolution of self-organized division of labor. A limitation of this approach is that the larger the network, the more difficult it is to draw conclusions that are biologically relevant. We have implemented two very simple networks, and yet already have six to eight evolvable parameters. We were able to understand the interaction of the networks with the environment and pinpoint the key connections that allowed for specific behaviors, but this may not be possible for more complex architectures.
The fitness function used (eq. 2) favored the minimization of idleness. Although it is not unrealistic to assume that more work will translate to higher colony productivity, in reality social insect colonies contain a large proportion of idle workers [34][35][36]. Examples of circumstances that would allow the presence of idle workers include environmental perturbations that require quick recruitment of ''stand-by'' workers, advantage of energy-saving strategies under poor resource conditions, and selective neutrality of ''incompetent'' workers due to highly redundant organization of work [36] (and references therein). As stressed before, here we present a conceptual model for the effect of behavioral architectures in division of labor, and necessarily simplify certain assumptions. A more realistic version of our model would treat fitness as the number of offspring produced by a colony, and explicitly consider the nature of the different tasks (e.g. foraging and brood care).
Division of labor is a broad topic, with many aspects that were outside the scope of this study. Previous theoretical work has focused on the evolution of differentiated multicellularity, the evolution of germ and soma in multicellular organisms, and the effect of developmental plasticity in gene expression as a cause of individual differentiation [37][38][39][40]. Here we focused on the evolution of behavioral task specialization in groups where reproductive altruism (analogous to germ-soma differentiation) has already evolved, an assumption which is in line with a recent comparative analysis of the evolutionary history of division of labor [41]. We did not consider the role of developmental plasticity, although this plays an important role in the differentiation of morphological castes [42]. Underlying the different questions concerning division of labor, however, is a problem of functional optimization: Organisms can increase their reproductive success if they perform different tasks efficiently. Dividing tasks among lower-level units within the organism or colony (often referred to as a superorganism) is a solution to the problem. What our model suggests is that the particular behavioral rules through which task specialization arises may impact the evolutionary outcome. in main text. All colonies achieve the highest possible fitness, because they are now able to achieve the optimal ratio among tasks (3:1). As expected in absence of switching costs, there is no relation between fitness and D. (D) c~2, corresponding to fig. 4B in main text. Colonies do not reach high D, yet fitness changes with D in a non-monotonic way.

Supporting Information
(EPS) Figure S4 Evolutionary dynamics of two representative simulations of the evolution of a feedforward neural network, for b~K and r~0.