Using digital organisms to study the evolutionary consequences of whole genome duplication and polyploidy

The potential role of whole genome duplication (WGD) in evolution is controversial. Whereas some view WGD mainly as detrimental and an evolutionary ‘dead end’, there is growing evidence that the long-term establishment of polyploidy might be linked to environmental change, stressful conditions, or periods of extinction. However, despite much research, the mechanistic underpinnings of why and how polyploids might be able to outcompete non-polyploids at times of environmental upheaval remain indefinable. Here, we improved our recently developed bio-inspired framework, combining an artificial genome with an agent-based system, to form a population of so-called Digital Organisms (DOs), to examine the impact of WGD on evolution under different environmental scenarios mimicking extinction events of varying strength and frequency. We found that, under stable environments, DOs with non-duplicated genomes formed the majority, if not all, of the population, whereas the numbers of DOs with duplicated genomes increased under dramatically challenging environments. After tracking the evolutionary trajectories of individual genomes in terms of sequence and encoded gene regulatory networks (GRNs), we propose that duplicated GRNs might provide polyploids with better chances to acquire the drastic changes necessary to adapt to challenging conditions, thus endowing DOs with increased adaptive potential under extinction events. In contrast, under stable environments, random mutations might easily render the GRN less well adapted to such environments, a phenomenon that is exacerbated in duplicated, more complex GRNs. We believe that our results provide some additional insights into how genome duplication and polyploidy might help organisms to compete for novel niches and survive ecological turmoil, and confirm the usefulness of our computational simulation in studying the role of WGD in evolution and adaptation, helping to overcome some of the traditional limitations of evolution experiments with model organisms.


Introduction
Polyploidy, or the duplication of entire genomes, is a common phenomenon in the evolutionary history of many eukaryotic organisms, especially plants. Whole  (WGDs) have often been linked to speciation and species diversification, increase in biological complexity, alleviating the effects caused by extinction events, and an overall increased environmental and mutational robustness [1][2][3]. Nevertheless, although the prevalence of WGDs has now been firmly established, the effects and significance of WGDs for evolution remain vividly discussed. Whereas some regard WGD mainly as an evolutionary dead end, others see WGD primarily as an opportunity and source of evolutionary innovation. In support of the former, there is the observation that recent WGDs seem to outnumber established ancient polyploidy events by several orders of magnitude [4][5][6]. This relative paucity of anciently established WGDs seems to suggest that many WGDs have not survived on the long run, which may be due to the well-known detrimental effects of polyploidy, caused by minority cytotype exclusion, genomic instability, mitotic and meiotic abnormalities, or epigenetics changes [7].
On the other hand, we do observe 'ancient' organisms that underwent and survived WGDs, so that their descendants have outcompeted their diploid progenitors and bear traces of the duplication in their genomes. This is the case for several major eukaryote lineages such as vertebrates [8,9], fishes [10], ciliate protozoans [11], hemiascomycetous yeasts [12,13], and particularly plants [14][15][16]. This observation, together with earlier work on specific adaptations of polyploids, led some to conclude that WGDs might provide an adaptive advantage particularly under unstable, challenging and stressful environments or during periods of environmental upheaval [6,17]. This hypothesis is supported, among others, by the wave of successful WGDs that seems to have occurred around the Cretaceous-Paleogene extinction event or K-Pg boundary [18][19][20][21].
Although there is increasing support for a correlation between polyploid establishment and environmental challenge [6,17], the molecular evolutionary mechanisms behind this remain vague. Ideally, to comprehensively investigate the effects of WGDs on evolution and study the link between adaptation and polyploidy, we need to collect detailed evolutionary trajectory data of organisms with and without WGDs (e.g., the complete mutational landscape, dynamic gene expression profiling data, real-time adaptation data, etc.) under different environmental contexts. However, although there have been some very interesting and successful attempts, collecting these data from real biological evolution experiments is difficult and bound by experimental limitations [22]. Computational approaches, notwithstanding many other limitations, have the advantage that they can collect at least some of such evolutionary trajectory data and have already shown great potential in investigating the potential adaptive role of WGD. For example, computational evolutionary modeling of populations of so-called 'virtual cells' has been used to study the adaptive potential of polyploidy in times of drastic and enduring environmental change [23]. Although WGD was established only in a minority of lineages, polyploids were significantly more successful at adaptation (and readaptation), and therefore WGD seemed, at least in some cases, a powerful mechanism to cope with environmental challenges. However, as far as we understand, the model of Cuypers and Hogeweg (23) does not allow modeling the connection of the gene regulatory network (GRN) of these virtual cells with a particular environment.
We have recently published a computational framework aimed at mimicking biological evolution [24][25][26]. Our framework is based on so-called 'digital organisms' (DOs), which have their own genomes replicating under specific mutation and WGD rates. Furthermore, the expression of every gene in the genome is regulated by a set of environmental signals, or sensors, and the orchestrated activity of regulatory genes defined in the genome. The overall expression pattern of all genes encoded in the genome defines the GRN of the DO, which in turn determines its behavior, defined by a set of actuators. The strengths and advantages of our bio-inspired agent-based modeling approach have already been tested under a relatively simple setup, such as emergent swarm behavior and overall enhanced adaptation in a changing environment [24,25].
Here, we expand our previously developed framework to study the effects of polyploidy in a challenging environment. By considering the adaptation of each individual DO as an interactive and developmental process in evolution, we have a better chance to observe when and under which circumstances polyploidy can pose a potential selective advantage to the organism or population as a whole. Because the effects and consequences of polyploidy are complex and probably also often lineage-specific, we would like to stress that in the present study, we only focus on how polyploidy or WGDs might affect the evolution of GRNs and what might be some of the major constraints on the evolution of GRNs, and hence adaptation of their hosts, subsequent to WGDs. We do not study (the probability of) short-term establishment of polyploidy, nor longer-term phenomena such as gene loss or the functional divergence of genes (sub-or neofunctionalization). Also, reproductive effects (e.g. sexual reproduction and recombination) and population-specific features, such as allele frequency changes following WGDs, are not considered here. Although we are of course aware that these are all very important to fully understand the significance of polyploidy and WGD, we nevertheless hope that, by focusing on certain molecular properties, some aspects of WGDs can be better understood.
By analyzing multiple simulations of populations of clonal DOs under various environmental contexts, we found that DOs with non-duplicated genomes conformed most, if not all, of the population under stable environments. In contrast, environmental stress or change usually resulted in higher rates of extinction in the population, whereas WGDs seemed to increase the possibility of survival under dramatically challenging environments. Furthermore, tracking mutational changes in populations led us to observe that the GRNs in duplicated genomes tend to show greater flexibility with fewer mutations compared to those in non-duplicated genomes. Our results provide some initial insights into how genome duplication might help organisms to diversify, compete for novel niches, and survive ecological turmoil.

Methods
Our framework is an adaptation of work previously described in detail elsewhere [24,25]. In short, to improve the adaptability and robustness of digital organisms under changing environments, we have developed a swarm evolutionary DO framework which is inspired by the concept of complex adaptive systems in biological evolution [24]. In this framework, we use the behavior of a swarm of virtual organisms or individuals and agents (i.e. individual 'entities'-programs or functions-that can interact with each other and with the environment) to resemble gene regulation and adaptation of biological organisms. Complex adaptive systems originate through self-organization at different levels (the genome, GRN, organism, population, . . .) leading to swarm DOs that can more efficiently cope with the challenge of changing environments. Such a model did not only demonstrate the advantage of testing the adaptability of DOs in certain environments or under certain conditions [25,27], but also provides a way to simulate, at least to some extent, the dynamic evolutionary trajectories that occur in biological evolution. By tracking the behavior of both DOs and agents (representing, amongst other things, expressed genes or gene products, i.e. proteins) during simulation, we can recreate certain evolutionary contexts and replay evolutionary scenarios over and over again and study the emergence of particular adaptations. Specific extensions made to our framework include an increase of simulation platform (a grid of 1000 by 1000 cells), allowing a larger population to evolve (up to 20k DOs), and a much larger genome size (starting from 100 kb, but which can grow-through WGDs-to a size that is only limited by the amount of memory).

Description of the simulation framework
Below, we provide a summarized description of the different components of our simulation framework, as well as their interactions, focusing on the ones relevant for the current study. A more detailed description of the framework can be found elsewhere [24,25].
Digital Organism (DO). Every DO is defined by a genome and a set of i) virtual sensors, which allow DOs to measure different ecological and physiological parameters and ii) actuators, which allow them to interact with each other and with the environment and define the overall behavior of the DO [25] (Fig 1). These actuators confer DOs the ability to replicate, move, attack, defend or search for food [24,25]. For the purposes of this work, we will focus on the actuators replication and movement.
Genes and genomes. In our framework, the genome of a DO has been inspired by the model previously proposed by Reil [28], and consists of a randomly created string of four digits representing each of the four nucleotides with a total initial size of 100 kb. Genes in the genome are not pre-specified, but identified in the randomly built genome, and typically a genome contains about 150-200 genes (Fig 2). Up to 1000 different gene products can be encoded by a single genome, each defined by a unique pre-specified cis-sequence of four digits, and associated with a specific identity number and decay rate. Compared to the initial model of Reil, our model makes an explicit distinction among three kinds of genes; i) structural genes, which control every particular behavior of the DO or actuator (e.g., replication or  Fig 2D). When this number matches the specific identity number of a particular gene product, there is an increase of its corresponding concentration level. The concentration level of all gene products determines the overall behavior of a DO, which is in turn defined by the set of possible actuators, each one encoded by its corresponding structural genes.   gene is defined by a promoter and a coding region. The promoter must start with a TATA box and be followed by a set of four-digit cis-elements numbering 5-50, each of which is bound by a specific kind of TF. The promoter region is followed by the coding region, whose first four digits define the specific kind of gene to be encoded. No interspersed sequence is allowed between any of the components of the gene. (A) Structural gene, which defines a specific kind of actuator. (B) Regulatory gene, which encodes a specific kind of TF defined by its cis-binding element (i.e., 2231). (C) RNA polymerase gene, which specifically binds to a TATA box (i.e., 1010). (D) Structure of a signaling gene, defining signaling pathways, which are defined as linear equations of sensor inputs. At each time step during the simulation, the value of each kind of sensor input (see Fig 1) is estimated and used as a variable in the equation. When the result of the equation corresponds to the identity number of a particular gene, its expression is activated. Linear equations are encoded in the sequence of the genome as shown. A special part of the gene sequence is specified for representing the equation and is composed of a group of 'elements', with each element having a fixed eight bases slot. The first four bases hold movement), either repressing or promoting it (Fig 2A); ii) regulatory genes, encoding transcription factors (TFs), which can specifically activate or repress (50% of each class) the expression of other genes by binding to specific cis-elements in their promoters ( Fig 2B); and iii) a gene encoding 'RNA polymerase', which specifically binds to a 'TATA' box, which in turn must be present in the promoter of the target gene for proper expression (Fig 2C). Gene expression. The expression level of a particular gene is calculated based on the combination of TF binding cis-elements found in the promoter of that gene (Fig 2). When the gene expression level reaches a minimum threshold, the gene is translated into its corresponding gene product. The total amount of gene product (i.e. dosage) corresponds to the expression level of that gene. At every time step during the simulation, the total amount of gene product decays gradually, mimicking protein degradation, under a specific decay rate that is randomly initialized for each gene at the beginning of the simulation. In turn, the decay of gene product is counteracted by the activity of activating TFs on that particular gene. The expression level of all gene products defines the global GRN of the DO, which is translated into a specific set of actuators [29], which in turn determines the overall behavior of the DO (Fig 3) [25].
Signaling pathways. Signaling pathways are defined as linear equations of sensor inputs (Fig 2D). At each time step during the simulation, the value of each kind of sensor input was estimated and used as a variable in the equation. Every DO has the ability to integrate different sensor input variables into the equation, including its energy level, the total number of surrounding food sources and/or DOs, and the ability to interact with other DOs through defense the address of the corresponding sensor input data while the last four bases hold the weight for that particular sensor input. As such, we have a sensor input value and weight value for a particular sensor input. If we consider the particular sensor input value as a variable, a linear equation is obtained providing an output variable. A signaling pathway is then a combination of several such linear equations and their output values. The signaling gene also contains a field 'gene length' which will affect the number of 'elements' that can be present. Each signaling gene can have a minimum of three and at most 20 elements. These elements can be partially redundant. For example, two different elements may point to the same sensor inputs with different weight values.
https://doi.org/10.1371/journal.pone.0220257.g002 or attack (Fig 1). When the result of the equation corresponds to the identity number of a particular gene, its expression is activated. Linear equations are encoded in the genetic sequence of the genome of a DO (Fig 2D), and can change in number and sequence through WGDs and substitutions, respectively. This way, upon initialization of a new DO, a set of 12 TFs is activated [24,25] (Fig 1). Subsequently, at every time step during the simulation, the linear equations compute a new set of identity numbers, increasing the concentration of the corresponding gene product by one unit.
Food. Food represents a source of energy. Each food source can store from 10,000 up to 20,000 energy units. The energy of a food source increases 300 units at every time step until it reaches 20,000. At that point, the food source divides into two food sources, with the newly created food source being randomly placed around the old one in a neighboring cell. Food sources than thus, for instance, be compared to growing patches of grass, with new patches sprouting from older ones. When a DO finds a food source, it assimilates its energy level, and the food source is removed. In our framework, environmental challenges were introduced through the random removal of food sources at specific rates and time points.

Running simulations
In our evolutionary framework, populations of clonal DOs evolve under fixed WGD and substitution mutation rates, resulting in the evolutionary diversification of the population. Rather than introducing all polyploids at once in the simulation, we decided to gradually introduce them by setting a fixed WGD rate, which is a situation closer to the real evolutionary process [6,30]. At the beginning of the simulation, the WGD rate was arbitrarily fixed at 40%, meaning that when a new DO is created (after replication, inheriting the genome and energy level of its 'parent'), there is a 40% chance it will have its genome duplicated. When the population size of polyploids equals that of the non-polyploids, which usually happens in about 100 to 200 time steps, the WGD operator is permanently removed. This high rate of WGD, as compared with what would be expected in nature, was chosen because lower WGD rates result in extremely long computation times until polyploids and non-polyploids reached similarly sized populations. Because our simulations focused less on the initial establishment of polyploids, but more on the potential longer-term consequences of doubling genome content and structure, we decided to start with populations of similar size as the initial condition of the competition. Similarly, at every replication step, a fixed substitution rate (10 −4 per base) was applied to the offspring genome.
Digital organisms move in a matrix composed of 1000 x 1000 cells, looking for food. Every time a DO moves into a neighboring cell in the matrix, its energy level drops by 70 units. DOs were removed from the population when their energy levels dropped below 0. When the energy level of a particular DO reached a threshold of 20,000, the DO was considered to have reached a level of adaption, and was therefore allowed to replicate (a sign of fitness). However, the final decision of a DO to replicate depends on the relative concentration of the structural gene products involved in promoting and repressing replication. When both concentration levels are equal, the replication has a 50% possibility to happen at that time step.
In order to simulate different evolutionary scenarios, we applied different food removal patterns. In order to mimic environmental turmoil or radical environmental challenges (for instance as caused by an extinction event), we introduced drastic and/or repeated drops of food during our simulations. To investigate the interaction between adaptive potential and WGD, we examined the mutational landscape of the genomes across these various evolutionary and environmental contexts. For every DO and at every time step in the simulation, we computed its ploidy and energy level, as well as i) the short-term genetic distance (STGD), defined as the average number of substitutions (per 10 kb) between the individual DO and its direct parent; ii) the long-term genetic distance (LTGD), defined as the average number of substitutions per 10 kb between the individual DO and the initial ancestor at time 0 in the simulation; and iii) the expression distance (ED), defined as the average of the concentrations of all gene products in a given DO compared to the average of the concentrations of all gene products in the previous time step. When the ED differs by more than 30% (arbitrarily defined) between two time steps, we considered the corresponding DO as having an unstable expression pattern. In our simulation, a stable expression pattern (considered to reflect homeostasis) is necessary to maintain a certain 'phenotype' and is considered a sign that the DO has adapted to its environment. Therefore, a sequence of low ED values helped us to identify those DOs that have evolved effective strategies for adaptation.

Studying the impact of WGD on the evolution of DOs under different environmental conditions
To study the putative effect of WGDs on the evolution of DOs under challenging environmental conditions, we have simulated the evolution of clonal populations of DOs under two different scenarios of food availability (environmental challenges) and specific rates of WGD and nucleotide substitution. At every time step and for every simulation, the population size of polyploids and non-polyploids was computed. Although, after food removal, a drop in population size was generally observed, when and how and to what extent the population size drops can greatly vary from simulation to simulation [25] (see further). It is inherent to the system that DOs with different genomic makeups and encoded GRNs can have very different responses to the same amount of food removal, which makes it hard to summarize the results, except for evaluating overall success when putting polyploids and non-polyploids in competition. In the first scenario, food was dynamically removed every 60 time steps after the total population (polyploids plus non-polyploids) reached 1000 individuals. Starting from a population of 200 individuals at t 0 , having reached a population size of 1000 indicates that at least part of the population has adapted to the current environment and started to replicate rapidly [24,25]. Under this first scenario, we performed 30 simulations under each of the following four food reduction levels (thus 120 simulations in total): no food reduction, except for what is found and consumed by the DOs, 50% food reduction, 70% food reduction, and 85% food reduction. After 2000 time steps, or until the subpopulation of polyploids or non-polyploids became extinct, we compared the population sizes of polyploids and non-polyploids ( Table 1). As can be observed, under no or low food reduction scenarios, at the end of the experiment most of the population consisted of non-polyploid DOs. The population size of polyploids increased with the percentage of food removed, becoming bigger than that of non-polyploids after 70% of the food was removed repeatedly, i.e. every time the total population reached 1000. Taking away 85% of the food resulted in the extinction of the whole population of both polyploids and non-polyploids in all simulations. In the second scenario, 'food removal' was introduced at a fixed time step, namely at time step 300. Because the growth of food throughout the experiment is exponential, introducing food reduction at time step 300 represented an appropriate balance between earlier reduction before polyploid and non-polyploid populations have reached an equilibrium in the total population, and later reduction, which may not remove enough food for re-adaptation or efficient selection to occur. Under this scenario, 10 independent simulations were run under four different levels of fixed food reduction (0%, 30%, 60% and 90%, thus 40 simulations in total), each with populations starting with 200 clonal DOs each. Simulations were run for up to 1000 time steps, or until the subpopulation of polyploids or non-polyploids became extinct or started growing too big to be computationally tractable (due to exponential growth). At the end of every simulation, we checked whether the population size of polyploids was bigger than that of non-polyploids or the other way around (Table 2). Similarly to what we observed under the previous scenario, at 0 and 30% of food reduction, no polyploid DO survived at the end of the experiment, with some polyploid genomes surviving only under a 60% or more of food reduction. Polyploid populations were able to overtake non-polyploid ones only when significant levels of food (90%) were removed ( Table 2). In Fig 4, we show the detailed evolution of population sizes of polyploids and non-polyploids with respect to food energy availability for two independent simulations under 30% or under 90% of fixed food reduction, illustrating the opposite evolutionary trajectories of polyploids and non-polyploids in terms of population sizes under less and more challenging conditions, respectively.
In summary, both simulation scenarios show that, while non-polyploids clearly outcompete polyploids when environmental conditions are stable, when major environmental changes are introduced in the form of drastic reductions of food available, polyploids seem to confer an advantage over non-polyploids. This effect is more noticeable under the first scenario, where the environmental changes are repeated multiple times through recurrent food removal, creating stronger environmental constraints and hence greater needs for multiple re-adaptations (see further).

Evolution of the mutational landscape of DO genomes following WGD
In order to gain further insight on why and how polyploid DO populations have, in most simulations, a greater chance of surviving extreme environmental changes (measured by the number of surviving DOs at the end of the simulation), we examined the evolution of the average long-term genetic distance (LTGD) and short-term genetic distance (STGD) of 'adapted' populations by running an additional set of 30 simulations under a 70% dynamic food reduction scenario and computing STGD and LTGD separately for non-polyploids and polyploids. When referring to an 'adapted population', we mean that each DO in the population has an energy level of at least 20,000 units at that particular time step. This ensures that we only reconstruct the evolutionary trajectories from DOs that have had a chance for replication in the population.  the opposite pattern was generally observed for LTGD, especially at later time steps. Overall differences in STGD and LTGD between non-polyploids and polyploids were further  The significance of whole genome duplication for evolution It should be noted that the substitution rate for all genomes is the same and hence, under the assumption that WGD has no effect on mutational patterns, the average LTGD and STGD should also be the same for both populations. Consequently, the lower values for the STGD in polyploids have to be the result of selection and thus selection on the short-term seems more stringent for polyploids than for non-polyploids. However, once fixed and inherited, mutations might have more chance to be retained for longer times in polyploid populations, likely because of redundancy in the genome-although this needs further investigation-, making polyploids less vulnerable for single detrimental mutations.

Evolution of GRNs following a WGD
We also investigated the effect of WGDs on the encoded GRNs by measuring the overall extent of changes in the concentration levels of its different components (gene products) upon adaptation. In stable environments, we assume that 'adapted' DOs maintain a relatively stable GRN most of the time. In other words, we consider the stability of GRN expression patterns-represented by expression distances (ED) between any DO and its descendants smaller than 30%, see Methods -as an indicator for adaptation. Abrupt food removal may cause rapid changes in gene expression patterns, ultimately altering the underlying GRN of the DO's genome. Indeed, differences in food availability might lead to abrupt changes in the 'sensed' environment, in turn potentially resulting in domino effects on the concentration levels of particular TFs (see Fig 1). To study the link between adaptation and changes in the GRN, we ran 40 additional simulations under a scenario of 70% dynamic food reduction and extracted the subpopulations of polyploid and non-polyploid DOs that had likely established a comparatively stable GRN (ED values lower than 30%, reflecting homeostasis) and reached an at least temporary adaptive state (energy levels above 20,000). The evolutionary trajectories in terms of LTGD were then compared for each subpopulation to that of the total population of DOs in each simulation (Fig 6). Despite the overall higher LTGDs observed among polyploids, we observed that adapted and stable non-polyploids usually need to accumulate considerably more mutations to establish novel GRNs that have adapted to the new environment, while for the polyploid subpopulation, the difference between the adapted subpopulation and the total (including likely non or not-yet adapted DOs) population was less outspoken (Fig 6). In the current implementation of our framework, we do not know whether this is due to positive selection or selective sweeps in one group of DOs or due to purifying selection in the other, but the fact is that different regimes of selection are ongoing in polyploids and non-polyploids. This contrasting pattern between polyploids and non-polyploids is more obvious at later steps of the simulation (Fig 6), suggesting that polyploids and non-polyploids have indeed different ways to respond to environmental changes. In any case, based on these observations, we can infer that, for re-adapting under changed or changing environments, non-polyploid genomes need to accumulate more mutations than polyploid ones (Fig 6).

Discussion
With the purpose of investigating whether WGDs may confer an adaptive advantage in the face of drastic environmental changes or extinction events, such as abrupt reductions in available food resources, we have implemented here an improved version of our previously designed bio-inspired and agent-based computational framework for simulating biological evolution [24,25]. Two different environmental scenarios of food reduction were set: either food was removed dynamically every time the total population reached 1000 individuals, or at a fixed time step during the simulation. From our simulations, we learned that under stable environments, non-polyploids in general do better than polyploids, as reflected by the low or null fraction of polyploids surviving at the end of the different experiments performed under low or null food reduction. Under both scenarios, the relative fraction of polyploid DOs surviving at the end of the experiment increased with increasing levels of food reduction, overcoming the population of non-polyploid DOs when the environmental challenge introduced was strong enough. Taken as whole, these results suggest that WGDs are usually maladaptive under stable environments, but may actually confer an adaptive advantage under certain  The significance of whole genome duplication for evolution environmental constraints, which seems in agreement with recent 'real-life' observations [6,[31][32][33].
Although the substitution rate was the same for all genomes and is not supposed to change after WGD, contrary to our initial expectations, we observed lower STGD among polyploid DOs, suggesting that duplicated genomes are under stronger purifying selection, at least on the short-term. Once fixed and inherited though, mutations might have more chance to be retained for longer times in duplicated genomes, as suggested by the higher LTGD values for polyploids, possibly because of redundancy in the genome, although this needs to be confirmed and further investigated.
We have also shown that DOs with non-duplicated genomes need to accumulate many more mutations than polyploid genomes to reach adaptation (using homeostasis as measured by ED and the energy level as proxies). We interpret this as follows: under stable environments, and in 'well-adapted' organisms with (relatively) stable GRNs, there is always a certain chance that a random mutation renders the GRN (and consequently, its host) less well adapted to that environment. This phenomenon is exacerbated in more complex GRNs. With an increasing number of edges and connections in the network, the probability increases that a random mutation in a regulator affects a larger number of genes, in turn increasing the chance of causing major disturbances to the underlying GRN [34][35][36][37]. In this respect, WGDs increase the number of edges in the network exponentially (Fig 7). For instance, in our simulations, while the average GRN for the total population consisted of 272 instances or agents at the start, this increased to about 1,700 for DOs with a duplicated genome at the end of the simulations.
From previous theoretical and experimental work, it has been shown that gene duplication and divergence may be at the origin of specific properties of biological networks such as GRNs, including increased complexity and, also, modularity, i.e. the occurrence of multiple functional subnetworks or modules formed by highly connected genes that are co-expressed and/or co-regulated by the same set of key regulators, while establishing sparser connections with nodes from different modules [1,[38][39][40][41][42][43][44][45][46]. As a consequence, single substitutions in duplicated genomes have the potential to affect many more key regulators, in turn having a greater impact on the underlying GRN and the resulting phenotypes. Again, whereas this may be disadvantageous in stable environments, under drastic environmental changes, DOs with duplicated genomes may actually have a better chance to acquire the drastic genomic-and consequently phenotypic-changes necessary to survive major changes in the fitness landscape (Fig 8). In addition, the duplication of genomes and their encoded GRNs may result in redundancy, contributing to the genetic or mutational robustness at the GRN level. This facilitates the rewiring of novel functional modules while maintaining the ancestral one, a feature that is expected to be especially advantageous under unstable, recurring, and/or challenging environments [23,41]. In this respect, our observations are compatible with previous claims that increased complexity and modularity results in increased evolvability [36,[47][48][49][50][51][52]. For a population that is already well adapted to a certain environment, the increased complexity and modularity of the duplicated GRNs could be maladaptive or detrimental because the GRNs are more vulnerable to random mutations and the potential for increased evolvability may have little value or direct benefits. On the other hand, the complex and modular structure of duplicated GRNs might allow polyploids to explore a wider evolutionary landscape (Fig 8), providing short-term increased opportunity to adapt to novel, different, or rapidly changing environments, while redundancy facilitates the gradual evolution of duplicated networks for adapting in the longer term. In partial summary, increased complexity, modularity and functional redundancy following WGD would help to explain the different behavior of polyploids and non-polyploids under stable or challenging environments [6].
Although many other factors may be at play as well, which, for simplicity, are not considered here (but will be addressed elsewhere), the effect of WGD on network complexity and redundancy could be part of the explanation why also in natural environments, floras in stable environments might have lower proportions of polyploids than highly, recently disturbed floras. In a recent study, Oberlander et al. [53] suggested that the hyper diverse South African Cape flora, which has anomalously low proportions of polyploids compared to global levels, might be due to long-term climatic and geological stability, supporting the hypothesis that WGD may be rare in stable environments. On the contrary, there is considerable evidence that polyploidy is much more common in disturbed habitats, or habitats that (have) show(n) environmental turmoil [6,31,54]. Previously, we have also wondered about the discrepancy between the existence of many polyploids of fairly recent origin and the scant evidence of ancient polyploidy events, certainly within the same evolutionary lineage [6]. The paucity of polyploidy events that 'survive' and are established in the long term would suggest that polyploidy is usually an evolutionary dead end. Again, this fits with our observation that the The significance of whole genome duplication for evolution Top panels, 3D surface representation of the phenotype space accessible for every combination of two hypothetical phenotypic traits, which ultimately correspond to an organism's specific genotypic configuration. Each coordinate in the XY plane conformed by phenotype 1 and phenotype 2 has a relative fitness value associated on the Z axis. The resulting fitness landscape is formed by a range of hills, each corresponding to the total phenotype space accessible to a specific organism or group of organisms, i.e, their ecological niche, with a local maximum or adaptive peak located at the top. Hills are surrounded by valleys or depressions, colored in dark blue, corresponding to regions of the phenotype space inaccessible to any organism. Bottom panels show the 2D colored image plots corresponding to the fitness landscapes above. Black dots, white dots and red crosses represent organisms occupying the fitness landscape. Concentric circles around organisms represent the areas of the phenotype space accessible by non-polyploid organisms (inner circle) or their polyploid relatives (outer circle). Arrows indicate the trajectories followed by every organism to reach their adaptive peak. Black dots represent organisms that have reached their local maxima and are thus located at their adaptive peaks; white dots represent organisms that survived after a change in the environmental conditions and subsequent shifts of their adaptive peaks; red crosses represent organisms that perish because of big shifts or extinction of their niches. From left to right, three different environmental scenarios are shown. In the left two panels, organisms evolving under a stable environment for a certain time are expected to have reached their local adaptive peaks. In the middle two panels, a small change in environmental conditions (e.g., a small drop in the total food available) may result in small shifts of the peaks or their relative fitness contribution, or eventually in the extinction or emergence of specific ecological niches. In most cases, the new location of the peaks fall within the phenotype area (still) accessible to non-polyploid organisms (inner circles). Under this scenario, non-polyploid organisms are expected to quickly reach the new peaks, rapidly outcompeting their polyploid relatives, which are more likely to fall farther away (outer circles) and have well-known detrimental effects. Finally, in the right panels, a cataclysmic or extinction event (e.g., a big drop in the total food available) is represented, resulting in a reduction in the number of available ecological niches, and big shifts in their relative locations. Under these conditions, although most organisms are expected to perish, polyploids organisms, featured by wider accessible phenotype space (as a result of the potential higher impact of mutations in more complex genomes, see text for details), have better chances to fall near the peak of a newly formed adaptive hill and to develop the necessary evolutionary innovations to colonize empty niches. https://doi.org/10.1371/journal.pone.0220257.g008 The significance of whole genome duplication for evolution PLOS ONE | https://doi.org/10.1371/journal.pone.0220257 July 31, 2019 increased complexity and modularity of the underlying networks may be detrimental (or at least sub-optimal) for species that are well-adapted to their environment, when accumulating (too) many changes (Fig 8). In this respect, it is also noteworthy to mention that during our simulations, rarely two or three WGDs were observed, and never more. Also in real-life organisms, very few examples are known of consecutively established WGDs that have occurred in a short period of time. Some of these presumed exceptions are Musa acuminata, Spirodela polyrhiza, and Arabidopsis thaliana [6]. However, at least for Arabidopsis [55] and Spirodela [56], the number of genes is low compared to what would be expected following several rounds of WGD, suggesting huge gene loss and fractionation following WGD. In line with our observations made in the current study, quickly losing a large amount of the extra genetic material created through a WGD, thereby potentially reducing the complexity of the active GRNs, might have facilitated surviving these major events.
Finally, our observations might also have some broader implications and might help to understand the origin of complex phenotypes and discontinuities in biological evolution. Traditionally, evolution is regarded as progressing at a more or less constant rate, which is linked to the gradual accumulation of small genetic changes over time. This model successfully explains, for instance, the variation between (closely) related species but often fails to explain bigger 'leaps' in evolution. However, there is ample recording of 'novel' or highly divergent species that seem to have emerged in a relatively short evolutionary time frame and sometimes even representing important evolutionary transitions. Such observations often seem difficult to reconcile with an evolutionary model that is based on the gradual accumulation of small changes, and a huge body of literature has been devoted to this seemingly contradictory phenomenon [57,58].
As supported by our simulations, and in previous research [24,25], in a stable environment, gradual evolution can successfully explain the divergence and optimization of evolutionary processes. In contrast, when the environment drastically changes, in a short geological time frame, gradual evolution has difficulty to keep up, while new species or life forms, if different enough from the original ones, can successfully occupy the 'new' niches that have become available. When the environment is rapidly changing, existing species may not have enough time to adapt and will disappear. WGD might provide a way out and might be one way reconciling a model of gradual evolution with adaptation to a rapidly changing environment [3]. Gene and genome duplication has been suggested before as a way to explain 'saltational' jumps in evolution [59][60][61]. Indeed, the duplication of genes and particularly the duplication of entire genomes immediately creates redundant informational 'entities' or 'modules' in the genome, offering possibilities for a more drastic change and a wider exploration of genotypic and phenotypic space (Fig 8). While in a constant environment, in which organisms are already fairly well-adapted, we only could observe the further slow optimization of biological systems, a WGD creates more drastic changes in certain individual species. These individuals often remain hidden in the population-and usually have a hard time competing with their non-polyploid progenitors, for reasons explained higher and elsewhere [6,7,31], but could 'grab their chance' upon different conditions or under different contexts. Increased gene pleiotropy evoked by WGD thus forms a fertile substrate allowing evolution to explore more diverse options possibly explaining those bigger jumps in evolution observed. Again, such evolvability might only be successful when the existing environmental conditions have been (seriously) disrupted (Fig 8). When a new environmental 'equilibrium' has been reached and the new environmental conditions have stabilized again, the increased evolvability through WGD will become less important in further optimizing the system and on the contrary selection on a more complex system (due to WGD, see higher) might be constrained again. Considering polyploidy in the evolutionary history of organisms might thus indeed be one additional way of possibly explaining bigger leaps or major transitions in evolution, as already suggested by Susumu Ohno in 1970 [62], but certainly needs further investigation.

Conclusions
Our observations seem to support the two-bladed sword role claimed for WGD in evolution, constituting on the one hand an evolutionary dead end in stable environments, but on the other hand offering opportunities to avoid or reduce the risk of extinction in unstable and/or drastically changing environments [1,6]. Furthermore, we hope that this work shows the usefulness of our DO-based evolution simulation computational framework in studying the role of WGD in evolution and adaptation, by tracking the evolutionary trajectories of individual artificial genomes in terms of sequence and expression diversification, helping to overcome some of the traditional limitations of evolution experiments with model organisms.