The authors have declared that no competing interests exist.
Conceived and designed the experiments: DMB CAO. Performed the experiments: DMB. Analyzed the data: DMB. Contributed reagents/materials/analysis tools: DMB. Wrote the paper: DMB CAO. Conceived and implemented architectural features: DMB.
We investigate fundamental decisions in the design of instruction set architectures for linear genetic programs that are used as both model systems in evolutionary biology and underlying solution representations in evolutionary computation. We subjected digital organisms with each tested architecture to seven different computational environments designed to present a range of evolutionary challenges. Our goal was to engineer a general purpose architecture that would be effective under a broad range of evolutionary conditions. We evaluated six different types of architectural features for the virtual CPUs: (1) genetic flexibility: we allowed digital organisms to more precisely modify the function of genetic instructions, (2) memory: we provided an increased number of registers in the virtual CPUs, (3) decoupled sensors and actuators: we separated input and output operations to enable greater control over data flow. We also tested a variety of methods to regulate expression: (4) explicit labels that allow programs to dynamically refer to specific genome positions, (5) position-relative search instructions, and (6) multiple new flow control instructions, including conditionals and jumps. Each of these features also adds complication to the instruction set and risks slowing evolution due to epistatic interactions. Two features (multiple argument specification and separated I/O) demonstrated substantial improvements in the majority of test environments, along with versions of each of the remaining architecture modifications that show significant improvements in multiple environments. However, some tested modifications were detrimental, though most exhibit no systematic effects on evolutionary potential, highlighting the robustness of digital evolution. 
Combined, these observations enhance our understanding of how instruction architecture impacts evolutionary potential, enabling the creation of architectures that support more rapid evolution of complex solutions to a broad range of challenges.
Over the past 50 years, the field of evolutionary computation has produced many successful tools, including genetic algorithms
Digital evolution is a type of linear genetic programming that provides a rich setting in which to study evolution under more natural conditions; populations of self-replicating computer programs must survive in a computational world where they are subject to mutations, environmental effects, interactions with other programs, and the pressures of natural selection
The instruction set architecture is the core of every instance of digital evolution, defining the characters and syntax of the genetic language, as well as the virtual hardware upon which that language executes. The design of the instruction set architecture within an evolvable system plays an important role in influencing the robustness and flexibility of evolved solutions
We performed all experiments using executables based on Avida version 2.12, with modifications to support each of the new instruction set architectures that we investigated. The full Avida 2.12 source code is available for download, without cost, from
All statistical tests were conducted using MATLAB 2012a. Configuration files, analysis scripts, and experimental results are available from figshare
The HEADS instruction set architecture is the default virtual CPU configuration in all versions of Avida 2.x, consisting of a Turing complete, von Neumann style architecture. The virtual hardware that implements this instruction set is designed to operate on a genomic program within a circular memory space (as shown on the left side of
Registers (upper right), stacks (lower right), genomic program (left), heads (middle), and environmental channels (lower right). The solid lines depict the default HEADS architectural features. The dashed lines show some of the modifications tested.
Instruction | Description |
add | Add ?BX? to ?CX? and place the result in ?BX? |
dec | Decrement ?BX? by one |
get-head | Copy the position of the ?IP? head into ?CX? |
goto | Move IP to direct match label |
goto-if-n-equ | Move IP to direct match label if BX != CX |
goto-if-less | Move IP to direct match label if BX < CX |
h-alloc | Allocate maximum allowed space for offspring |
h-copy | Copy from read-head to write-head; advance both |
h-divide | Divide code between read and write heads as offspring |
if-copied-seq-comp | Execute next instruction if just copied complement sequence |
if-copied-seq-direct | Execute next instruction if just copied direct-match sequence |
if-copied-lbl-comp | Execute next instruction if just copied complement label |
if-copied-lbl-direct | Execute next instruction if just copied direct-match label |
if-equ-0 | Execute next instruction if ?BX? = 0, else skip it |
if-equ-x | Execute next instruction if BX = ?nop-defined constant?, else skip it |
if-gtr-0 | Execute next instruction if ?BX? > 0, else skip it |
if-gtr-x | Execute next instruction if BX > ?nop-defined constant?, else skip it |
if-less | Execute next instruction if ?BX? < ?CX?, else skip it |
if-less-0 | Execute next instruction if ?BX? < 0, else skip it |
if-n-equ | Execute next instruction if ?BX? != ?CX?, else skip it |
if-not-0 | Execute next instruction if ?BX? != 0, else skip it |
inc | Increment ?BX? by one |
input | Input new number into ?BX? |
IO | Output ?BX?, and input new number back into ?BX? |
jmp-head | Move head ?Flow? by amount in ?CX? register |
label | No-operation; marks the beginning of a genome position label |
mov-head | Move head ?IP? to the flow head |
mov-head-if-less | Move head ?IP? to the flow head if ?BX? < ?CX? |
mov-head-if-n-equ | Move head ?IP? to the flow head if ?BX? != ?CX? |
nand | Nand ?BX? by ?CX? and place the result in ?BX? |
nop-A | No-operation; modifies other instructions |
nop-B | No-operation; modifies other instructions |
nop-C | No-operation; modifies other instructions |
nop-D | No-operation; modifies other instructions |
nop-E | No-operation; modifies other instructions |
nop-F | No-operation; modifies other instructions |
nop-G | No-operation; modifies other instructions |
nop-H | No-operation; modifies other instructions |
nop-I | No-operation; modifies other instructions |
nop-J | No-operation; modifies other instructions |
nop-K | No-operation; modifies other instructions |
nop-L | No-operation; modifies other instructions |
nop-M | No-operation; modifies other instructions |
nop-N | No-operation; modifies other instructions |
nop-O | No-operation; modifies other instructions |
nop-P | No-operation; modifies other instructions |
output | Output ?BX? |
pop | Remove top number from stack and place into ?BX? |
push | Copy number from ?BX? and place it into the stack |
search-lbl-comp-s | Find complement label from genome start and move the flow head |
search-lbl-direct-b | Find direct label backward and move the flow head |
search-lbl-direct-f | Find direct label forward and move the flow head |
search-lbl-direct-s | Find direct label from genome start and move the flow head |
search-seq-comp-s | Find complement sequence from genome start and move the flow head |
search-seq-direct-b | Find direct sequence backward and move the flow head |
search-seq-direct-f | Find direct sequence forward and move the flow head |
search-seq-direct-s | Find direct sequence from genome start and move the flow head |
set-flow | Set flow-head to position in ?CX? |
sg-move | Move one location forward in the Navigation environment |
sg-rotate-l | Rotate heading 45 degrees left in the Navigation environment |
sg-rotate-r | Rotate heading 45 degrees right in the Navigation environment |
sg-sense | Read the value of the current location in the Navigation environment |
shift-r | Shift bits in ?BX? right by one (divide by two) |
shift-l | Shift bits in ?BX? left by one (multiply by two) |
sub | Subtract ?CX? from ?BX? and place the result in ?BX? |
swap | Swap the contents of ?BX? with ?CX? |
swap-stk | Toggle which stack is currently being used |
Description of the instructions used across all tested instruction set architectures. A register name (AX, BX, CX, etc.) or head (IP, FLOW, etc.) surrounded by question marks refers to the default argument used when executed, subject to nop modification. Instructions depicted in
The HEADS instruction set has five flow-control instructions: h-search, jmp-head, mov-head, get-head, and set-flow. Each of these instructions can affect the position of one of the four architectural heads: the instruction pointer (IP), READ head, WRITE head, and FLOW head. The h-search instruction searches the genome, starting from the first executed instruction in the genome, for a label (a sequence of one or more nop instructions) that matches the cyclic complement of the label that follows the instruction, placing the FLOW head after the matching sequence; if the sought-after label is not found, it places the FLOW head on the instruction immediately following the h-search instruction itself. Thus, if the h-search instruction were followed by nop-A nop-A nop-B, it would search the genome for the sequence nop-B nop-B nop-C. This is one of only two instructions in the default HEADS instruction set that is affected by more than one nop instruction, the other being if-copied, described below. The mov-head instruction moves the IP to the current location of the FLOW head. The jmp-head instruction shifts the position of the IP by the amount specified in a register. The get-head instruction places the current location of the IP into a register. Finally, the set-flow instruction moves the FLOW head to the absolute genome location specified by the value in a register.
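The label matching performed by h-search can be illustrated with a minimal Python sketch. It assumes that the three default nops form the cycle nop-A, nop-B, nop-C, back to nop-A, which is consistent with the example in the preceding paragraph. The function names, the list-of-strings genome, and the choice to treat an empty label as "not found" are illustrative assumptions, not the Avida implementation.

```python
# Hypothetical sketch of h-search label matching (not Avida source code).
NOPS = ["nop-A", "nop-B", "nop-C"]  # the three default nops, assumed cyclic


def complement(label):
    """Cyclic complement: each nop maps to the next nop in the cycle."""
    return [NOPS[(NOPS.index(nop) + 1) % len(NOPS)] for nop in label]


def h_search(genome, ip):
    """Return the new FLOW head position for an h-search located at ip."""
    n = len(genome)
    # Read the run of nops that follows the instruction (circular genome).
    label = []
    i = (ip + 1) % n
    while genome[i] in NOPS and len(label) < n - 1:
        label.append(genome[i])
        i = (i + 1) % n
    target = complement(label)
    if target:
        # Scan from the genome start for the first occurrence of the target.
        for start in range(n):
            if all(genome[(start + k) % n] == t for k, t in enumerate(target)):
                return (start + len(target)) % n  # FLOW head lands after match
    # Assumption: no label or no match puts FLOW just after h-search.
    return (ip + 1) % n


genome = ["h-search", "nop-A", "nop-A", "nop-B", "inc",
          "nop-B", "nop-B", "nop-C", "dec"]
flow = h_search(genome, 0)
```

In this example the complement sequence begins at position 5, so the FLOW head would be placed at position 8, on the dec instruction.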
The HEADS set also contains three conditional instructions that will skip a subsequent instruction if the test condition is false. The two basic conditional instructions, if-n-equ and if-less, perform a comparison between two registers. The if-copied instruction interacts with the READ head, evaluating to true if the last sequence of instructions copied matches the complement of the label that follows the instruction. This instruction is primarily for use in conjunction with the replication instructions described below to identify the portion of the genome most recently copied.
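The skip-on-false behavior of the two basic conditionals can be sketched as follows. This is a simplified illustration that uses the default BX and CX operands listed in the instruction table and ignores nop modification; the function name and the dictionary of registers are hypothetical.

```python
# Hypothetical sketch of conditional skipping (nop modification ignored).
def next_ip(instruction, regs, ip):
    """Return the position executed after a conditional located at ip."""
    bx, cx = regs["BX"], regs["CX"]
    tests = {
        "if-n-equ": bx != cx,  # true when the registers differ
        "if-less": bx < cx,    # true when BX is smaller than CX
    }
    condition = tests[instruction]
    # True: execute the next instruction. False: skip over it.
    return ip + 1 if condition else ip + 2
```

For instance, an if-less at position 10 with BX = 3 and CX = 2 evaluates to false, so execution would resume at position 12.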
Seven arithmetic and logic operations are supported in the default HEADS instruction set: add, sub, inc, dec, nand, shift-l, and shift-r. All of these instructions operate on values stored within registers and accept a single nop modifier, which changes the source and destination registers depending on the operation.
Five instructions in HEADS facilitate data movement and environmental interaction. The push, pop, and swap-stk instructions all operate on the two stacks within the architecture. Only one stack is accessible at a time, with the swap-stk instruction toggling the currently active stack, while push and pop copy numbers from registers to the top of the active stack and vice-versa. Each of these instructions can be nop-modified to specify which register should be used. The swap instruction exchanges the values of two registers. The IO instruction interacts with the environment of the digital organism, outputting the current value in a register and replacing it with a value from the environmentally controlled input buffer. Values output via this instruction are evaluated by the environment, potentially triggering a reward or other action if they match one of the tasks in the environment as explained below.
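The two-stack arrangement can be sketched in a few lines of Python. The class and method names are hypothetical, and the value returned when popping an empty stack (zero here) is an assumption made for the sketch rather than something stated in the text.

```python
# Hypothetical sketch of the two stacks with one active at a time.
class Stacks:
    def __init__(self):
        self.stacks = [[], []]
        self.active = 0  # index of the currently accessible stack

    def push(self, regs, reg="BX"):
        """Copy the value of a register onto the active stack."""
        self.stacks[self.active].append(regs[reg])

    def pop(self, regs, reg="BX"):
        """Move the top of the active stack into a register."""
        stack = self.stacks[self.active]
        regs[reg] = stack.pop() if stack else 0  # assumption: empty gives 0

    def swap_stk(self):
        """Toggle which stack is active."""
        self.active = 1 - self.active


regs = {"AX": 0, "BX": 5, "CX": 0}
s = Stacks()
s.push(regs)          # active stack 0 now holds 5
s.swap_stk()
s.pop(regs)           # stack 1 is empty, so BX becomes 0 in this sketch
s.swap_stk()
s.pop(regs, "CX")     # nop-modified pop: 5 moves from stack 0 into CX
```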
Lastly, there are three instructions that facilitate self-replication. The h-alloc instruction allocates additional memory within which the digital organism can copy its offspring. Copying is performed by repeated execution of the h-copy instruction, which duplicates the current instruction found at the READ head to the position marked by the WRITE head and advances both heads. Once copying has been completed, the organism must execute the h-divide instruction to finalize the replication process, extracting the memory between the READ head and the WRITE head as the genome of the offspring.
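The allocate, copy, and divide sequence can be sketched as below. This is a deliberately simplified illustration: the loop stops after a fixed number of copies, whereas an evolved organism would typically detect the end of its genome with if-copied, and copy mutations are omitted. The function name and memory layout are assumptions.

```python
# Hypothetical sketch of self-replication with h-alloc, h-copy, and h-divide.
def replicate(genome):
    size = len(genome)
    # h-alloc: extend memory so the offspring can be written after the parent.
    memory = list(genome) + [None] * size
    read, write = 0, size
    # Repeated h-copy: duplicate the instruction at READ to WRITE, advance both.
    for _ in range(size):
        memory[write] = memory[read]
        read += 1
        write += 1
    # h-divide: memory between the READ and WRITE heads becomes the offspring.
    offspring = memory[read:write]
    parent = memory[:read]
    return parent, offspring


parent, offspring = replicate(["h-alloc", "h-copy", "h-divide"])
```

After the loop the READ head sits at the end of the parent and the WRITE head at the end of the copy, so the slice between them is exactly the newly written genome.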
In the default HEADS instruction set, most instructions can have one aspect of their function modified by a single nop instruction that follows in the genome. We aimed to improve the flexibility with which data could be accessed and modified in the virtual CPUs by implementing the FULLY-ASSOCIATIVE (FA) instruction set. We extended the nop modification system so that most instructions could be modified by more than one nop. The default behavior of all instructions remains the same when not followed by any nop instructions. Instructions that affect only a single register or head behave identically to their HEADS counterparts in the presence of a nop. However, for arithmetic, logic, and conditional instructions that use multiple registers, the FA instruction set will shift all registers to correspond with a single nop given, as well as read subsequent nops, if present, to further specify those parameters. For example, an add instruction, by default will perform
The REGISTER-series of instruction set architectures builds upon the FULLY-ASSOCIATIVE architecture to increase the working register set beyond the three default registers, exposing one or more additional architectural registers, in sets R4, R5, R6, R7, R8, R12, up to a total of 16 in R16. The original design choice was made to minimize the number of registers in order to reduce the complexity of using them, but a larger number of registers has not previously been systematically tested. For each additional register, we added a corresponding nop instruction to the instruction set (nop-D, nop-E, nop-F, etc.). None of the default registers used by the instruction set were altered, meaning that these additional registers can be accessed only when the new nop instructions are used to modify an instruction. Since nop modification is also used for head selection, the additional nop-D in the R4 architecture provides direct access to the FLOW head. In the R5 through R16 architectures, extra unassigned heads that may be used as genome place-markers are available for each additional nop instruction.
The LABEL-series of instruction set architectures extends the R6 architecture (which proved to be the most effective, as described in the results below), explicitly separating genome labels from nop sequences used to modify instruction operands. The intent of this change was to prevent instruction argumentation as facilitated by the FULLY-ASSOCIATIVE architecture from otherwise conflicting with labeled genome positions, especially those used for self-replication. Instructions that operate on genome labels, search-seq-comp-s and if-copied-seq-comp, were extended with variants (search-lbl-comp-s and if-copied-lbl-comp) that recognize sequences of nop instructions only if they begin with the special label instruction (see
Instruction Set | label | if-copied-lbl-comp | if-copied-lbl-direct | if-copied-seq-comp | if-copied-seq-direct | search-lbl-comp-s | search-lbl-direct-s | search-seq-comp-s | search-seq-direct-s |
R6 | • | • | |||||||
LABEL | • | • | • | ||||||
LABEL-DIRECT | • | • | • | ||||||
LABEL-BOTH | • | • | • | • | • | ||||
LABEL-SEQ | • | • | • | • | • | ||||
LABEL-SEQ-DIRECT | • | • | • | • | • | ||||
LABEL-DIRECT-SEQ | • | • | • | • | • | ||||
LABEL-SEQ-BOTH | • | • | • | • | • | • | • | • | • |
Marks in each column indicate that the set contains the relevant instruction.
The SPLIT-IO instruction set architecture alters the LABEL-SEQ-DIRECT architecture, splitting the IO instruction into two separate input and output instructions. Both of the new instructions use the same default register location as the IO instruction and can each be modified by one nop.
The SEARCH-series of instruction set architectures extends the SPLIT-IO architecture with enhanced searching and jumping capabilities. The SEARCH-DIRECTIONAL set adds two pairs of directional search instructions that scan the genome forward or backward relative to the instruction pointer for a label or sequence match. The SEARCH-GOTO set adds a single goto instruction that reads the nop sequence that follows the instruction, if present, and unconditionally jumps to the first genome location following the matching label that begins with a label instruction. If no matching label is found, execution ignores the goto instruction. The SEARCH-GOTOIF group adds two conditional goto variants, goto-if-n-equ and goto-if-less, that execute the jump only if the conditional test evaluates to true.
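The behavior of goto can be sketched in Python as follows. Several details are assumptions made for illustration and are not stated in the text: the search begins at the genome start, the genome is treated as non-circular, the explicit label only needs to begin with the nops that follow goto, and a goto with no nops matches the first bare label instruction.

```python
# Hypothetical sketch of the goto instruction with explicit labels.
NOPS = {"nop-A", "nop-B", "nop-C", "nop-D", "nop-E", "nop-F"}


def goto(genome, ip):
    """Return the next IP for a goto located at position ip."""
    # Read the nop sequence that follows the goto instruction.
    arg = []
    i = ip + 1
    while i < len(genome) and genome[i] in NOPS:
        arg.append(genome[i])
        i += 1
    # A direct-match label is the label instruction followed by the same nops.
    target = ["label"] + arg
    for start in range(len(genome) - len(target) + 1):
        if genome[start:start + len(target)] == target:
            return start + len(target)  # first location after the label
    # No matching label: the goto is ignored and execution continues.
    return ip + 1


genome = ["goto", "nop-A", "nop-B", "inc",
          "label", "nop-A", "nop-B", "dec"]
new_ip = goto(genome, 0)
```

Here the nops after goto are not themselves a jump target, because a target must begin with the label instruction; the jump lands at position 7.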
The FLOW-series of instruction set architectures builds upon the flow control features of the SEARCH-DIRECTIONAL architecture, testing multiple combinations of additional flow control instructions (
Instruction Set | IF0 Instructions | IFX Instructions | MOVHEAD Instructions |
FLOW-IF0 | • | | |
FLOW-IFX | | • | |
FLOW-MVH | | | • |
FLOW-IF0-IFX | • | • | |
FLOW-IFX-MVH | | • | • |
FLOW-IF0-IFX-MVH | • | • | • |
Instruction set by row, with marks in each column indicating that the set contains the relevant instruction group.
We use seven distinct computational environments to evaluate the effectiveness of all tested instruction set architectures. Each environment focuses on a different aspect of the virtual architecture. Environments contain a set of tasks that carry a metabolic reward associated with their performance. These metabolic rewards increase the computation speed of the digital organism's virtual CPU, making it possible to obtain a competitive advantage relative to other organisms in the population.
We focused on two measures to evaluate how well populations solved the computational challenges of the environment when evolved with each instruction set architecture: mean fitness and task success. Both measure the ability of the evolved organisms to perform tasks within the environment.
In most environments, task success will be highly correlated with fitness. However, because organisms in digital evolution must self-replicate, genotypes with identical task success can exhibit vastly different fitness measurements, so both metrics can be informative. Additionally, in some environments task success provides a more consistent measure of the evolutionary potential of the instruction set. For example, in the Limited-9 environment the reduction in resources due to additional task performance may actually reduce average fitness, even though more tasks are being performed.
We evaluated each of the six tested types of hardware modifications in consecutive evolutions of the instruction set architecture. The first hardware modification tested was the FULLY-ASSOCIATIVE set, followed by the REGISTER sets, LABEL sets, SPLIT-IO set, SEARCH sets, and finally the FLOW sets.
In our analysis, the FULLY-ASSOCIATIVE (FA) instruction set, which addresses the flexibility of register data flow, shows significant improvement in six of the seven environments (
Logic-9 | Logic-77 | Match-12 | Fibonacci-32 | Sort-10 | Limited-9 | Navigation | |
HEADS | 19.07 | 12.43 | 0.173 | 3.730 | −0.54 | 4.430 | 1.071 |
(17.71, 19.76) | (11.51, 14.22) | (0.146, 0.224) | (3.300, 4.050) | (−0.63, −0.45) | (4.283, 4.595) | (1.035, 1.383) | |
FA | | | | | | | 1.038 |
(22.70, 23.08) | (35.05, 41.83) | (0.191, 0.251) | (4.474, 5.212) | (−0.45, −0.33) | (4.671, 5.082) | (1.022, 1.069) |
Fitness results of the HEADS and FULLY-ASSOCIATIVE (FA) instruction set architectures, where multiple nop arguments can modify the behavior of an instruction. Each entry shows the median log
Logic-9 | Logic-77 | Match-12 | Fib.-32 | Sort-10 | Limited-9 | Navigation | |
HEADS | 0.829 | 0.176 | 0.145 | 0.206 | 1.31 | 0.909 | 3.97 |
(0.752, 0.839) | (0.161, 0.198) | (0.145, 0.146) | (0.178, 0.238) | (1.08, 1.47) | (0.894, 0.913) | (3.96, 4.35) | |
FA | | | | | 1.55 | | 3.96 |
(0.930, 0.943) | (0.453, 0.546) | (0.147, 0.149) | (0.278, 0.332) | (1.44, 1.67) | (0.924, 0.929) | (3.95, 3.97) |
Task success results of the HEADS and FULLY-ASSOCIATIVE (FA) instruction set architectures. Each entry shows the median normalized task success in the respective environment, with
The REGISTER-series instruction sets generally show little variation in performance (
Logic-9 | Logic-77 | Match-12 | Fib.-32 | Sort-10 | Limited-9 | Navigation | |
FA | 22.63 | 38.85 | 0.223 | 4.828 | 0.09 | 4.934 | 1.027 |
(22.39, 23.01) | (34.76, 43.67) | (0.191, 0.273) | (4.255, 5.332) | (−0.07, 0.21) | (4.661, 5.282) | (1.016, 1.042) | |
R4 | 22.85 | 38.70 | 0.243 | 4.666 | | 5.253 | |
(22.67, 23.01) | (33.90, 43.02) | (0.204, 0.290) | (4.233, 5.142) | (−0.48, −0.32) | (4.925, 5.514) | (1.054, 1.792) | |
R5 | 22.73 | 38.42 | 0.231 | 5.067 | | 5.158 | |
(22.50, 22.86) | (34.43, 42.54) | (0.206, 0.281) | (4.540, 5.623) | (−0.57, −0.39) | (4.300, 5.400) | (1.056, 1.340) | |
R6 | 22.78 | 43.01 | 0.229 | 4.908 | | 5.117 | |
(22.29, 22.97) | (40.01, 45.90) | (0.206, 0.274) | (4.293, 5.719) | (−0.54, −0.34) | (4.925, 5.374) | (1.080, 2.730) | |
R7 | 22.75 | 43.41 | 0.204 | 4.598 | | 5.135 | |
(22.58, 22.97) | (38.97, 45.67) | (0.177, 0.225) | (4.174, 5.078) | (−0.49, −0.29) | (4.978, 5.407) | (1.096, 3.234) | |
R8 | 22.75 | 43.04 | | 4.831 | | 5.292 | |
(22.55, 22.95) | (39.25, 47.78) | (−0.07, 0.19) | (4.392, 5.308) | (−0.57, −0.33) | (5.058, 5.736) | (1.099, 2.815) | |
R12 | 22.62 | 44.26 | | 4.678 | | 5.180 | |
(22.45, 22.76) | (40.82, 48.18) | (−0.12, −0.08) | (4.082, 5.244) | (−0.56, −0.49) | (4.901, 5.621) | (1.114, 3.012) | |
R16 | | 42.26 | | 4.028 | | | |
(19.76, 22.22) | (40.02, 46.26) | (−0.13, −0.10) | (3.620, 4.474) | (−0.59, −0.50) | (5.390, 6.466) | (1.157, 3.326) |
Fitness results of the REGISTER-series instruction set architectures, which vary the number of registers available in the virtual CPUs. Each entry shows the median log
Logic-9 | Logic-77 | Match-12 | Fib.-32 | Sort-10 | Limited-9 | Navigation | |
FA | 0.932 | 0.495 | 0.147 | 0.288 | 2.53 | 0.926 | 3.96 |
(0.921, 0.938) | (0.452, 0.565) | (0.146, 0.148) | (0.263, 0.307) | (2.21, 2.74) | (0.923, 0.929) | (3.95, 3.96) | |
R4 | 0.937 | 0.506 | 0.146 | 0.276 | | 0.923 | |
(0.929, 0.941) | (0.441, 0.554) | (0.145, 0.148) | (0.256, 0.289) | (1.42, 1.67) | (0.920, 0.929) | (3.96, 5.05) | |
R5 | 0.936 | 0.493 | | 0.300 | | 0.927 | |
(0.929, 0.940) | (0.450, 0.544) | (0.144, 0.147) | (0.284, 0.327) | (1.09, 1.62) | (0.923, 0.929) | (3.96, 4.20) | |
R6 | 0.932 | 0.563 | | 0.294 | | 0.930 | |
(0.927, 0.940) | (0.521, 0.592) | (0.144, 0.147) | (0.268, 0.326) | (1.14, 1.160) | (0.926, 0.932) | (3.97, 6.68) | |
R7 | 0.940 | 0.554 | | 0.281 | | 0.928 | |
(0.930, 0.943) | (0.502, 0.592) | (0.142, 0.146) | (0.247, 0.305) | (1.26, 1.63) | (0.923, 0.932) | (3.98, 7.65) | |
R8 | 0.938 | 0.555 | | 0.299 | | 0.927 | |
(0.931, 0.942) | (0.504, 0.613) | (0.078, 0.143) | (0.275, 0.323) | (1.06, 1.57) | (0.924, 0.929) | (3.97, 6.19) | |
R12 | 0.939 | 0.575 | | 0.298 | | 0.930 | |
(0.933, 0.943) | (0.525, 0.613) | (0.077, 0.078) | (0.268, 0.318) | (1.03, 1.11) | (0.928, 0.933) | (3.98, 7.33) | |
R16 | 0.910 | 0.550 | | 0.269 | | 0.928 | |
(0.854, 0.931) | (0.524, 0.589) | (0.077, 0.078) | (0.237, 0.302) | (1.01, 1.08) | (0.925, 0.932) | (3.99, 7.81) |
Task success results of the REGISTER-series instruction set architectures. Each entry shows the median normalized task success in the respective environment, with
The LABEL-series instruction sets show mixed results (
Logic-9 | Logic-77 | Match-12 | Fib.-32 | Sort-10 | Limited-9 | Navigation | |
R6 | 22.59 | 38.42 | 0.216 | 4.572 | −0.22 | 5.205 | 1.537 |
(22.10, 22.85) | (34.96, 44.12) | (0.201, 0.257) | (3.904, 5.134) | (−0.32, −0.09) | (5.016, 5.518) | (1.097, 3.261) | |
LABEL | 22.67 | | 0.248 | | −0.42 | 5.430 | 1.926 |
(22.42, 22.94) | (25.66, 33.37) | (0.218, 0.316) | (5.669, 6.648) | (−0.52, −0.33) | (5.195, 5.630) | (1.108, 3.313) | |
DIRECT | 22.50 | 0.215 | 5.784 | 1.087 | |||
(21.84, 22.74) | (24.38, 31.32) | (0.189, 0.252) | (4.545, 6.174) | (−0.54, −0.40) | (5.429, 6.325) | (1.064, 2.742) | |
BOTH | 22.32 | | 0.203 | | | 5.553 | 3.084 |
(19.68, 22.61) | (28.96, 38.07) | (0.163, 0.230) | (5.715, 6.606) | (−0.56, −0.38) | (5.098, 5.885) | (1.148, 3.396) | |
SEQ | 22.43 | 40.26 | 0.207 | | −0.29 | 5.438 | 2.203 |
(21.94, 22.66) | (35.12, 44.36) | (0.134, 0.257) | (5.797, 6.733) | (−0.36, −0.15) | (5.079, 5.755) | (1.089, 3.161) | |
SEQ | 22.46 | 44.13 | 0.198 | | −0.33 | 5.651 | 2.335 |
DIRECT | (22.23, 22.66) | (39.73, 48.52) | (0.126, 0.217) | (5.212, 6.531) | (−0.44, −0.21) | (5.441, 5.967) | (1.077, 3.335) |
DIRECT | 22.53 | 41.74 | 0.210 | | −0.40 | 5.528 | 2.319 |
SEQ | (22.25, 22.69) | (38.58, 44.36) | (0.183, 0.300) | (5.135, 6.495) | (−0.47, −0.23) | (5.528, 5.968) | (1.093, 3.235) |
SEQ | 22.44 | 39.56 | −0.33 | 3.173 | |||
BOTH | (21.75, 22.68) | (36.64, 42.72) | (−1.42, 0.088) | (5.336, 6.430) | (−0.45, −0.17) | (5.621, 6.374) | (2.955, 3.295) |
Fitness results of the LABEL-series instruction set architectures. Each entry shows the median log
Logic-9 | Logic-77 | Match-12 | Fib.-32 | Sort-10 | Limited-9 | Navigation | |
R6 | 0.926 | 0.505 | 0.144 | 0.278 | 1.86 | 0.930 | 4.33 |
(0.908, 0.937) | (0.461, 0.574) | (0.142, 0.145) | (0.241, 0.303) | (1.62, 2.10) | (0.926, 0.932) | (3.97, 7.76) | |
LABEL | 0.941 | | 0.146 | | | 0.932 | 4.58 |
(0.934, 0.945) | (0.352, 0.450) | (0.144, 0.147) | (0.342, 0.396) | (1.14, 1.57) | (0.928, 0.934) | (3.98, 7.68) | |
DIRECT | 0.937 | | 0.145 | | | 0.934 | 3.98 |
(0.916, 0.943) | (0.329, 0.416) | (0.144, 0.146) | (0.295, 0.383) | (1.10, 1.57) | (0.931, 0.936) | (3.97, 6.16) | |
BOTH | 0.922 | | 0.145 | | | 0.932 | 7.05 |
(0.857, 0.939) | (0.385, 0.495) | (0.140, 0.146) | (0.367, 0.403) | (1.09, 1.51) | (0.929, 0.935) | (4.00, 7.96) | |
SEQ | 0.932 | 0.509 | 0.143 | | 1.68 | 0.928 | 5.31 |
(0.919, 0.938) | (0.460, 0.573) | (0.139, 0.144) | (0.365, 0.405) | (1.57, 2.02) | (0.925, 0.929) | (3.98, 7.57) | |
SEQ | 0.929 | 0.559 | 0.144 | | 1.61 | 0.931 | 5.06 |
DIRECT | (0.918, 0.939) | (0.522, 0.612) | (0.141, 0.146) | (0.309, 0.398) | (1.49, 1.81) | (0.928, 0.934) | (3.98, 7.70) |
DIRECT | 0.932 | 0.542 | 0.143 | | 1.53 | 0.930 | 5.24 |
SEQ | (0.919, 0.941) | (0.500, 0.562) | (0.141, 0.145) | (0.300, 0.399) | (1.36, 1.77) | (0.928, 0.933) | (3.98, 7.63) |
SEQ | 0.926 | 0.517 | | | 1.64 | 0.928 | 7.74 |
BOTH | (0.914, 0.934) | (0.482, 0.545) | (0.078, 0.125) | (0.300, 0.398) | (1.51, 1.94) | (0.923, 0.930) | (7.20, 7.91) |
Task success results of the LABEL-series instruction set architectures. Each entry shows the median normalized task success in the respective environment, with
The SPLIT-IO instruction set shows improvements that are both significant and often substantial in the Logic-9 and Logic-77 environments, the Match-12 environment, and the Fibonacci-32 environment (
Logic-9 | Logic-77 | Match-12 | Fib.-32 | Sort-10 | Limited-9 | Navigation | |
SEQ | 22.56 | 43.57 | 0.207 | 6.106 | −0.32 | 5.812 | 2.641 |
DIRECT | (22.32, 22.72) | (39.73, 46.60) | (0.182, 0.239) | (5.326, 6.549) | (−0.39, −0.24) | (5.390, 6.169) | (1.198, 3.341) |
SPLITIO | | | | | | 5.343 | 1.091 |
(22.87, 23.22) | (50.34, 56.72) | (0.314, 0.360) | (7.983, 8.207) | (−1.03, −1.02) | (5.221, 5.520) | (1.062, 2.920) |
Fitness results of the LABEL-SEQ-DIRECT and SPLIT-IO instruction set architectures. Each entry shows the median log
Logic-9 | Logic-77 | Match-12 | Fib.-32 | Sort-10 | Limited-9 | Navigation | |
SEQ | 0.930 | 0.559 | 0.145 | 0.384 | 1.62 | 0.926 | 6.63 |
DIRECT | (0.920, 0.935) | (0.521, 0.593) | (0.142, 0.146) | (0.318, 0.397) | (1.52, 1.74) | (0.922, 0.928) | (3.99, 7.94) |
SPLITIO | | | | | | | 3.99 |
(0.936, 0.942) | (0.651, 0.707) | (0.148, 0.149) | (0.447, 0.461) | (0.0, 0.0) | (0.927, 0.933) | (3.97, 7.41) |
Task success results of the LABEL-SEQ-DIRECT and SPLIT-IO instruction set architectures. Each entry shows the median normalized task success in the respective environment, with
The three SEARCH-series instruction sets showed little measurable difference in performance for the Logic-9, Match-12, Fibonacci-32, Sort-10, Limited-9, and Navigation environments (
Logic-9 | Logic-77 | Match-12 | Fib.-32 | Sort-10 | Limited-9 | Navigation | |
SPLITIO | 23.19 | 54.58 | 0.307 | 8.139 | −1.03 | 5.477 | 2.909 |
(23.05, 23.25) | (52.46, 58.54) | (0.261, 0.335) | (8.027, 8.318) | (−1.04, −1.02) | (5.014, 5.912) | (1.121, 3.382) | |
SEARCH | 23.02 | | 0.313 | 8.188 | −1.02 | 5.393 | 3.150 |
(22.87, 23.17) | (46.33, 52.21) | (0.265, 0.335) | (8.042, 8.273) | (−1.03, −0.98) | (5.177, 5.745) | (1.708, 3.431) | |
GOTO | 23.13 | | 0.311 | 7.946 | −1.04 | 5.598 | 2.584 |
(22.90, 23.21) | (48.27, 53.34) | (0.232, 0.337) | (7.853, 8.080) | (−1.05, −1.02) | (5.272, 5.850) | (1.084, 3.219) | |
GOTOIF | | | 0.283 | 7.937 | −1.04 | 5.840 | 2.283 |
(22.61, 23.06) | (44.61, 52.01) | (0.223, 0.336) | (7.844, 8.070) | (−1.05, −1.01) | (5.624, 6.059) | (1.322, 3.028) |
Fitness results of the SEARCH-series instruction set architectures. Each entry shows the median log
Logic-9 | Logic-77 | Match-12 | Fib.-32 | Sort-10 | Limited-9 | Navigation | |
SPLITIO | 0.937 | 0.694 | 0.149 | 0.449 | 0.0 | 0.927 | 7.28 |
(0.934, 0.942) | (0.658, 0.719) | (0.148, 0.149) | (0.446, 0.463) | (0.0, 0.0) | (0.926, 0.930) | (3.99, 8.03) | |
SEARCH | | 0.626 | 0.149 | 0.448 | | 0.929 | 7.67 |
(0.932, 0.941) | (0.584, 0.652) | (0.148, 0.150) | (0.445, 0.455) | (0.0, 0.0) | (0.927, 0.932) | (4.60, 8.01) | |
GOTO | 0.940 | | 0.148 | 0.447 | | 0.930 | 6.43 |
(0.935, 0.943) | (0.608, 0.676) | (0.147, 0.149) | (0.444, 0.452) | (0.0, 0.0) | (0.928, 0.933) | (3.99, 7.61) | |
GOTOIF | | | 0.149 | 0.447 | | 0.929 | 5.78 |
(0.928, 0.939) | (0.569, 0.650) | (0.148, 0.150) | (0.445, 0.448) | (0.0, 0.0) | (0.925, 0.933) | (4.15, 7.57) |
Task success results of the SEARCH-series instruction set architectures. Each entry shows the median normalized task success in the respective environment, with
In the SEARCH-GOTO instruction set, we initially tested a variant of the jmp-head instruction that changed its default head to the FLOW head. We observed a notable and often significant drop in fitness in all seven environments with these two instruction sets, leading to the architectures explored here.
The FLOW-series instruction sets tested three groups of flow control instructions separately and in several combinations (
Logic-9 | Logic-77 | Match-12 | Fib.-32 | Sort-10 | Limited-9 | Navigation | |
SEARCH | 23.14 | 48.59 | 0.313 | 8.061 | −1.02 | 5.571 | 3.022 |
(23.01, 23.23) | (46.46, 51.41) | (0.243, 0.346) | (7.990, 8.176) | (−1.04, −1.01) | (5.334, 5.867) | (2.151, 3.387) | |
MVH | | | 0.277 | 8.124 | | 5.474 | |
(22.38, 23.05) | (39.76, 46.35) | (0.216, 0.347) | (7.990, 8.270) | (−1.01, −0.69) | (5.181, 5.771) | (3.729, 4.052) | |
IF0 | | 47.99 | 0.296 | 7.995 | −1.04 | 5.553 | 3.229 |
(22.62, 23.07) | (45.02, 51.50) | (0.258, 0.323) | (7.857, 8.099) | (−1.05, −1.02) | (5.326, 5.855) | (2.919, 3.549) | |
IFX | | 46.40 | | 8.037 | | 5.804 | |
(22.33, 22.86) | (44.51, 48.76) | (0.168, 0.264) | (7.964, 8.198) | (−1.07, −1.03) | (5.460, 6.243) | (3.861, 4.312) | |
IF0-IFX | | 46.00 | | 8.011 | | 5.595 | |
(22.65, 23.01) | (42.64, 49.03) | (0.201, 0.308) | (7.952, 8.078) | (−1.09, −1.07) | (5.292, 6.071) | (4.113, 4.615) | |
IFX | | | 0.311 | 8.063 | −1.00 | 5.751 | |
MVH | (22.74, 23.11) | (38.48, 44.64) | (0.244, 0.346) | (7.980, 8.193) | (−1.03, −0.92) | (5.522, 6.061) | (4.244, 4.983) |
IF0-IFX | 7.995 | −1.01 | 6.077 | ||||
MVH | (21.55, 22.37) | (39.12, 43.84) | (0.189, 0.283) | (7.849, 8.066) | (−1.05, −0.91) | (6.723, 6.625) | (4.576, 6.457) |
Fitness results of the FLOW-series instruction set architectures. Each entry shows the median log
Logic-9 | Logic-77 | Match-12 | Fib.-32 | Sort-10 | Limited-9 | Navigation | |
SEARCH | 0.943 | 0.623 | 0.148 | 0.448 | 0.0 | 0.928 | 7.55 |
(0.939, 0.946) | (0.584, 0.648) | (0.147, 0.149) | (0.444, 0.452) | (0.0, 0.0) | (0.926, 0.932) | (5.51, 8.04) | |
MVH | 0.938 | 0.563 | 0.150 | 0.465 | | | |
(0.930, 0.946) | (0.532, 0.593) | (0.149, 0.150) | (0.452, 0.476) | (0.0, 0.0) | (0.941, 0.948) | (8.40, 8.93) | |
IF0 | 0.611 | 0.148 | 0.447 | 0.0 | 0.932 | 7.78 |
|
(0.924, 0.937) | (0.581, 0.647) | (0.147, 0.149) | (0.446, 0.451) | (0.0, 0.0) | (0.930, 0.935) | (6.52, 8.12) | |
IFX | 0.607 | 0.147 | 0.447 | 0.0 | 0.931 | ||
(0.916, 0.935) | (0.577, 0.622) | (0.146, 0.148) | (0.445, 0.453) | (0.0, 0.0) | (0.929, 0.933) | (8.43, 9.67) | |
IF0-IFX | 0.937 | 0.594 | 0.148 | 0.451 | 0.0 | 0.931 | |
(0.932, 0.941) | (0.550, 0.626) | (0.147, 0.149) | (0.446, 0.459) | (0.0, 0.0) | (0.928, 0.935) | (8.75, 10.12) | |
IFX | 0.945 | 0.459 | |||||
MVH | (0.940, 0.951) | (0.508, 0.576) | (0.149, 0.151) | (0.451, 0.467) | (0.0, 0.0) | (0.937, 0.947) | (9.10, 10.64) |
IF0-IFX | 0.150 | 0.451 | |||||
MVH | (0.901, 0.935) | (0.516, 0.577) | (0.148, 0.150) | (0.447, 0.459) | (0.0, 0.0) | (0.935, 0.945) | (10.23, 13.40) |
Task success results of the FLOW-series instruction set architectures. Each entry shows the median normalized task success in the respective environment, with
Individually, the IF0 instruction group made virtually no difference in performance in any of the seven environments. When tested in combination with the other instruction groups, it showed no clear indication of interaction, positive or negative.
The IFX instruction group, both individually and in combination with other groups, shows positive gains in the Navigation environment in both fitness and task success. This outcome is likely due to the nature of the signposts in this environment.
The third instruction group, MOVHEAD, shows the greatest variation in performance among those tested. In the Logic-77 environment, all instruction sets containing the MOVHEAD group show substantial decreases in median fitness, 14.3% on average. The two combination sets containing MOVHEAD, FLOW-IFX-MOVHEAD and FLOW-IF0-IFX-MOVHEAD, also show corresponding decreases in task success in the Logic-77 environment. The Sort-10, Limited-9, and Navigation environments, on the other hand, show substantial improvements in task success, and often fitness, for all three instruction sets containing the MOVHEAD group. The Navigation environment, notably, approaches a median task success of around 1% when the IFX and MOVHEAD instruction groups are combined, indicating the importance of effective flow control for that environment. The Sort-10 environment improvements are difficult to observe from median values. Indeed, the greatest driver of the improvements is a set of infrequent outliers approaching 0.7% task success, the highest ever observed in the Sort-10 environment.
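The point that rare outliers are invisible in median values can be illustrated with a minimal sketch. The replicate count and the task-success values below are hypothetical, chosen only for illustration, and the percentile bootstrap is an assumed way of producing an interval around the median, not necessarily the procedure used for the tables in this study.

```python
import random
import statistics

# Hypothetical task-success values (in %) for 50 replicate populations:
# most never solve the task, while a handful of rare outliers do.
task_success = [0.0] * 46 + [0.3, 0.5, 0.6, 0.7]


def bootstrap_median_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the median (assumed method)."""
    rng = random.Random(seed)
    # Resample with replacement and record the median of each resample.
    medians = sorted(
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(n_boot)
    )
    lower = medians[int(n_boot * alpha / 2)]
    upper = medians[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


median = statistics.median(task_success)
ci = bootstrap_median_ci(task_success)
best = max(task_success)

print(f"median = {median}, interval = {ci}, best population = {best}")
```

In this toy sample the median and its interval stay at zero even though the best population reaches 0.7, which is why a median-based summary can hide improvements that are driven by a few populations.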
We have investigated the evolutionary potential of six groups of modified instruction set architectures of a digital evolution system, each within seven different computational environments.
The evolutionary potential of the architecture selected as the basis for further experiments in each series (shown in bold) is displayed (right) for the Logic-9, Logic-77, Match-12, Fibonacci-32, Sort-10, Limited-9, and Navigation environments, respectively. Up arrows (black) indicate increased potential, down arrows (gray) indicate decreased potential, and double ended arrows (white) denote no significant trend. In general, FA (fully-associative) and Split-IO (separated input and output operations) demonstrated broadly beneficial impacts on evolutionary potential. The remaining tested modifications highlight the robustness of digital evolution, exhibiting no systematic effects on evolutionary potential.
Two groups of instruction-set modifications yielded broadly beneficial changes in both fitness and task success. The instruction data flow enhancements of the FULLY-ASSOCIATIVE (FA) architecture led to highly significant gains in five of the seven environments. The remaining two environments, Sort-10 and Navigation, show some slight improvement and no discernible difference, respectively. The second group that demonstrated broadly positive results was the SPLIT-IO instruction set. The separation of the input and output operations allows finer-grained data flow between the CPU and the environment. The control afforded by the SPLIT-IO architecture was beneficial to the same five environments as the FA architecture. The Navigation environment showed no particular change in fitness performance, and a small but insubstantial change in task success. The only major detriment from splitting the input and output operations was observed in the Sort-10 environment. As a whole, these two groups indicate that it is beneficial to maintain as much flexibility as possible with regard to instruction interactions. This flexibility allows evolution to finely tune interactions, yielding greater evolutionary potential.
The REGISTER-series, LABEL-series, and SEARCH-series architectures all demonstrated no discernible trend in performance, despite representing 17 of the 25 tested architectures. Some particular environment/instruction set combinations showed significant variations, yet these were rarely substantial in nature. It is particularly surprising that the REGISTER-series instruction sets showed such minimal deviation, given that going from the FA architecture to the R16 architecture represents a greater than five-fold increase in working set and a 50% increase in instruction set size. Similarly, the LABEL-SEQ-BOTH instruction set represents a 20.6% increase in instruction set size, with no substantially negative effect. Taken together, these groups provide additional evidence that the evolutionary process is rather robust to genetic language dilution.
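To give a rough sense of the scale of dilution involved, the following sketch computes how much less likely a single point mutation is to produce one particular instruction after the instruction set grows. It assumes that mutations pick replacement instructions uniformly at random, which is a simplifying assumption made here for illustration; the function name is hypothetical.

```python
def relative_hit_chance(size_increase):
    """Chance of mutating to one specific instruction, relative to the
    original instruction set, after the set grows by `size_increase`
    (0.5 means 50% larger). Assumes uniformly random replacement."""
    return 1.0 / (1.0 + size_increase)


# Size increases quoted in the text: 50% (R16) and 20.6% (LABEL-SEQ-BOTH).
for label, increase in [("R16", 0.5), ("LABEL-SEQ-BOTH", 0.206)]:
    print(f"{label}: {relative_hit_chance(increase):.3f} of the original chance")
```

Under this assumption, a 50% larger instruction set reduces the chance of any specific instruction appearing by one third, which makes the absence of a systematic performance penalty notable.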
The FLOW-series of instruction set architectures represents a third class of outcomes, yielding improved results in a subset of environments and degraded performance in one environment. The Sort-10, Limited-9, and Navigation environments all show substantial gains in both fitness and task success metrics when using instruction sets containing the IFX and MOVHEAD instruction groups. The Logic-77 environment, on the other hand, shows a notable drop in performance. It is possible that this environment does not require a great deal of flow control, and is thus negatively affected by the disruptive nature of the additional flow control instructions. In environments where flow control decisions are critical for success, such as the Sort-10 and Navigation environments, the benefits of more flexible flow control outweigh their disruptive effects.
The Sort-10 environment stands out as the only example where a single, small change – splitting the input and output instructions – made a large destructive difference in performance. Median task success collapsed to be statistically indistinguishable from 0, and remained there despite further beneficial instruction set modifications. These results are likely an artifact of the environment itself, rather than a general trend. We set up the Sort-10 environment to control for random inputs and to, on average, provide no benefit unless active sorting was performed by an organism. However, the inputs for sorting are indeed a random sample of 10 integers. It is possible, due to chance, for a partial ordering of numbers to yield a positive metabolic reward even if the sequence of inputs is simply echoed back to the environment. When using instruction sets featuring the paired-input-and-output instruction, simply mutating this instruction into the section of the genome responsible for replication may be enough to confer the echo capability, presenting an opportunity for lucky organisms to occasionally reap rewards. When the operations are split into two separate instructions, conferring the echo capability requires two coordinated mutations, and the execution cost of performing the task doubles. The combination of these factors most likely contributes to the observed drop in median performance.
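The echo argument can be made concrete with a small simulation. The scoring rule below (ordered adjacent pairs minus their chance expectation) is a hypothetical stand-in for the actual Sort-10 reward, which is not specified in this section; the input range and the per-replication mutation probability are likewise assumptions chosen for illustration.

```python
import random


def ordered_pairs(seq):
    """Count adjacent pairs that are already in non-decreasing order."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a <= b)


def echo_score(seq):
    """Hypothetical reward for echoing the inputs unchanged: ordered
    adjacent pairs minus the number expected by chance."""
    baseline = (len(seq) - 1) / 2
    return ordered_pairs(seq) - baseline


def simulate_echo(trials=20000, n_inputs=10, seed=1):
    """Return (fraction of draws with a positive score, mean score)."""
    rng = random.Random(seed)
    lucky, total = 0, 0.0
    for _ in range(trials):
        inputs = [rng.randrange(1_000_000) for _ in range(n_inputs)]
        score = echo_score(inputs)
        total += score
        lucky += score > 0
    return lucky / trials, total / trials


def echo_mutation_chance(p_single, split_io):
    """Chance of gaining the echo capability in one replication, assuming
    each required mutation occurs independently with probability p_single."""
    return p_single ** 2 if split_io else p_single


lucky_fraction, mean_score = simulate_echo()
print(f"lucky fraction = {lucky_fraction:.3f}, mean score = {mean_score:.3f}")
print("paired I/O:", echo_mutation_chance(0.001, split_io=False))
print("split I/O: ", echo_mutation_chance(0.001, split_io=True))
```

Under these assumptions the average echo reward is close to zero, as the environment intends, yet roughly half of the random input sets still give a positive score. Requiring two independent mutations instead of one squares the already small chance of stumbling on the echo behavior.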
Instruction data flow, working set size, and flow control are the three main features addressed by the six groups of instruction set modifications presented here. All of these features play an important role in implementing a successful sorting algorithm. Despite the modifications in the instruction set architectures we tested, no significantly beneficial change was observed in either fitness or task success within the Sort-10 environment. Most likely, the highly constrained memory size of these architectures limits the potential within this environment. In fact, a hand-written organism that performs the task successfully with the HEADS architecture requires nearly every single stack location in both available stacks. Another factor limiting potential may simply be the time allotted for evolution, which was held constant in our current study. The additional flow control instructions tested in the FLOW-series architectures show some signs of improved success in this environment, with numerous outlier populations. Given additional time to evolve, these and other populations would likely be able to refine the emerging solutions.
When features from all six instruction set groups are combined to form the HEADS-EX architecture, significant and substantial improvements relative to the base HEADS architecture are observed in six of the seven environments.
| Instruction set | Logic-9 | Logic-77 | Match-12 | Fib.-32 | Sort-10 | Limited-9 | Navigation |
| HEADS | 19.44 | 13.50 | 0.194 | 3.453 | −0.47 | 4.328 | 1.656 |
| | (17.74, 19.79) | (11.67, 15.30) | (0.168, 0.248) | (3.216, 3.858) | (−0.61, −0.33) | (4.157, 4.445) | (1.108, 3.606) |
| IFX-MVH | | | | | | | |
| | (22.74, 23.11) | (38.50, 44.56) | (0.245, 0.347) | (7.980, 8.189) | (−1.03, −0.92) | (5.517, 6.049) | (4.244, 4.953) |
Fitness results for the base HEADS and the HEADS-EX instruction set architectures. The HEADS-EX architecture includes features from all six tested feature groups, including fully associative arguments, six registers, direct-matched labels, split-I/O, directional search instructions, the ifx instruction, and conditional mov-head instructions. Each entry shows the median log
| Instruction set | Logic-9 | Logic-77 | Match-12 | Fib.-32 | Sort-10 | Limited-9 | Navigation |
| HEADS | 0.834 | 0.185 | 0.146 | 0.202 | 1.42 | 0.908 | 4.72 |
| | (0.752, 0.844) | (0.162, 0.211) | (0.145, 0.147) | (0.177, 0.228) | (1.08, 1.66) | (0.897, 0.914) | (3.99, 8.23) |
| IFX-MVH | | | | | | | |
| | (0.940, 0.951) | (0.507, 0.577) | (0.149, 0.151) | (0.451, 0.467) | (0.0, 0.0) | (0.937, 0.947) | (9.12, 10.62) |
Task success results for the base HEADS and the HEADS-EX instruction set architectures. The HEADS-EX architecture includes features from all six tested feature groups, including fully associative arguments, six registers, direct-matched labels, split-I/O, directional search instructions, the ifx instruction, and conditional mov-head instructions. Each entry shows the median normalized task success in the respective environment, with
It is clear from this present study that we have just started to identify the most effective genetic hardware for adaptive evolution in digital organisms and there remains room for significant future improvement. Indeed, our current study has focused on modifications within the framework of von Neumann machine code formalisms. We expect that further studies of instruction set architecture enhancements for evolvable systems, both within the limits of von Neumann architectures and the broader range of programming formalisms, will unlock this potential, facilitating advancements in the application of digital evolution and artificial life.
The authors would like to thank Jeff Barrick, Matt Rupp, Chris Strelioff, Aaron P. Wagner, Bess Walker and the members of the MSU Digital Evolution Laboratory for comments on the manuscript.