Dynamics of Complex Systems Built as Coupled Physical, Communication and Decision Layers

This paper proposes a simple model to capture the complexity of multilayer systems whose constituent layers affect, and are affected by, each other. The physical layer is a circuit composed of a power source and resistors in parallel. Every individual agent aims at maximizing its own delivered power by adding, removing or keeping its resistors; the delivered power is in turn a non-linear function of the other agents' behavior, while each agent's decision depends on its own internal state, its perception of the global state, the information received from its neighbors via the communication network and a randomized selfishness. We develop an agent-based simulation to analyze the effects of the number of agents (system size), the communication network topology, communication errors and the minimum power gain that triggers a behavioral change on the system dynamics. Our results show that a wave-like behavior at macro level (caused by individual changes in the decision layer) can only emerge for a specific system size. The ratio between cooperators and defectors depends on the minimum gain assumed: lower minimal gains lead to less cooperation, and vice-versa. Different communication network topologies imply different levels of power utilization and fairness at the physical layer, and a certain level of error in the communication layer induces more cooperation.


I. INTRODUCTION
The modernisation of large-scale engineering infrastructures brings to the table new challenges to their already complicated design [1]- [3]. For example, electric power grids are large-scale engineering systems built to generate, transmit and distribute electricity from generators to end-users [4]- [6]. Although their technological development has never stopped, a strong political demand for a structural change is taking place. Such change basically consists in decentralising generating units (e.g. from nuclear plants to solar panels and wind turbines), spreading electric vehicles (which are mobile batteries and loads) and controlling demand based on information technologies; all in all, the traditional consumer is predicted to become a prosumer: a consumer who participates more actively in the grid management, either by supplying electricity or by decreasing their consumption. Modern power grids will become more dynamic and distributed; this will bring new complexities to their already complex dynamics, together with new research challenges to cope with them.
The same trends - although with their own specificities - can be seen when analysing the modernisation of other large-scale systems, from smart cities [7], [8] to factories of the future [9] or the fifth generation of cellular systems [10]. As in power grids, new complexities in those systems will emerge, followed by a need for a new body of knowledge. Notwithstanding the unquestionable technical evolution, there is still a limited number of simple analytic models able to capture the dynamics of these modern systems, where the physical infrastructure, the information network and regulations affect, and are affected by, each other's dynamics. In this context the present article proposes a discrete-time agent-based model that assumes these three layers as constitutive parts of a multilayer system composed of an electric circuit as the physical infrastructure, a communication network where agents exchange local information, and a set of regulations that define the agents' behaviour.
The proposed model is built as follows. The electric circuit is composed of one constant voltage source, including its inner resistance, and resistors (loads) in parallel [11]. The resistors in parallel are grouped by their controlling agent, as shown in Figure 1. Every agent may switch on or off one of the resistors under its control at every time step. One might expect that the greater the number of active resistors a given agent has, the more power is delivered to it. The actual delivered power is, however, a non-linear, concave function (as presented in Figure 2) of the electric current flowing in the circuit; there exists then a saturation point after which adding more resistors decreases the delivered power for the whole system.
Looking at the whole system, a "tragedy of the commons" kind of problem arises [12], where adding a resistor is individually beneficial while socially harmful. However, in our case, the resource recovers very quickly. The agent's decision regarding its resistors (adding, removing or maintaining) is built upon the following criteria: the behaviour of the agent's neighbours at the last time step, the previous state of the whole system and its own selfishness gene. As we will discuss later, this interactive decision procedure resembles the prisoner's dilemma (e.g. [13]- [15] and references therein). In this case the agents' neighbourhood is defined by the communication network (whose links can be in error), and the selfishness genes of the agents are independent and identically distributed random variables.

A. Complexity sciences
Before presenting our contribution in more detail, it is worth indicating the theoretical ground that supports our findings: complexity sciences. Complexity is a term used in several diverse research fields [16], [17], from theoretical physics to social sciences and biology, to characterise a state that is neither completely deterministic nor random. The so-called complex behaviour emerges in systems whose elements interact; the elements may be heterogeneous and may also adapt their relation rules in accordance with internal and/or external factors.
In his extensive work [18], Wolfram has shown that simple interaction rules applied in one-dimensional cellular automata may lead to unexpectedly intricate patterns - defined therein as complex - over time.
This work tells us that, when viewed at a higher level, the spatial-temporal dynamics of a fairly simple deterministic system composed of homogeneous agents that follow fixed interrelation rules may generate complexity. This fact suggests that decentralised systems, based only on local information, might be functional without any controlling entity. In Wolfram's case the spatial-temporal pattern is determined by the interaction rule.
For some researchers, this fact indicates that the system is able to self-organise without any explicit centralised controller. We can cite here a few illustrative examples regarded as self-organised [16]: ants working in colonies, neurons building a capable brain and birds flying in groups. As an interesting counterpoint to this perspective, one may argue that the interaction rule that the agents follow is per se a kind of central control or a strict regulative force [19]. From this view, many questions may be posed: Where do the interaction rules come from? Are they evolving? Are they changing on a time-scale so much slower that they can be considered as given? These questions in fact still cause hot philosophical debates among biologists, economists, social scientists and other theorists concerning who controls the "invisible hand"; more details about these debates can be found in, for example, [20] and references therein.
When dealing with large-scale engineering systems (the focus of this paper), these questions seem to have clear answers insofar as the system is designed and follows pre-defined requirements. This is true for some systems, but it is far from being a universal feature, mainly when the infrastructure is heavily dependent on human actions and interactions. Road networks provide an educative, well-studied example of this when one tries to understand the formation of traffic jams [21]. Without going into further details, the key to solving the traffic puzzle is not found by looking at what happens in individual cars or at the design of the whole transportation infrastructure. While these aspects are necessary conditions for the formation of the jam, they are not sufficient to explain the phenomenon. The most accepted theory is built upon the interactions between cars and reactions to individual behaviours within a specific region of the road network; one car slowing down in a highly dense highway causes other nearby cars to slow down as well to avoid collisions, which in turn may trigger a traffic jam that will fade away after some time.
This simple example identifies a few important characteristics of complex phenomena: they are spatial and temporal, they never reach stable equilibrium states, the context where the individuals interact - and its perception - is important to individual decisions, and one individual action might cause a local change that might in turn trigger changes in the global state of the system. More details about the so-called complexity sciences can be found in, for instance, [22, Ch. 1].

B. Contributions
Motivated by a growing literature on complexity sciences in general and their application to engineering systems in particular, this article presents a new perspective for analysing multilayer, strongly coupled systems.
We construct a simple, yet illustrative, multilayer model composed of agents that control a set of resistors in an electrical circuit. These agents play an evolutionary "prisoner's dilemma" style of game to decide whether they should collaborate or not, based on the local information gathered from their communication network, the estimated state of the whole system and their own random selfishness. Our results indicate that: (i) a wave-like macro-level spatio-temporal dynamic, caused by changes in individual behaviours at the decision layer, can only emerge for a specific system size; (ii) the ratio between cooperators and defectors depends on the minimum gain assumed - lower minimal gains lead to less cooperation and vice-versa; (iii) different communication network topologies - ring, Watts-Strogatz and Barabasi-Albert graphs [23] - lead to different levels of power utilisation and fairness at the physical layer; and (iv) a certain level of error in the communication layer leads to more cooperative behaviour at the decision layer, affecting the physical layer dynamics in terms of power utilisation and fairness accordingly.

II. RESULTS
In this section we present the main results of this report. Before starting, we think it is worth describing the agents' decision process, which is a key point for understanding our results. We also systematise in Table I some useful notation to facilitate our analyses. More details about the multi-layer system proposed here can be found later in Section IV (Methods).

A. Agents' decision process
We assume a discrete-time system such that the changes in agent behaviour occur in time-steps, denoted by t ∈ Z. At every time-step t, each agent wants to maximise its own power, so their interactions can be viewed as a round-based game [14]. To achieve that goal, the agent has three options: add a resistor, remove one, or do nothing. Table II shows how we classify the agent behaviour.
To make a decision at time t, every agent i looks at what its gain from the previous strategy S_i[t-1] was in order to decide its new state S_i[t]. The decision process for agent i is the following. If the gain λ_i[t-1] is greater than or equal to a system-wide pre-defined minimum λ_min, the agent sticks to its (successful) strategy at time t, i.e. S_i[t] = S_i[t-1]. Otherwise, agent i compares its strategy with its neighbourhood N_i, which will be defined later in this section. If the majority is cooperative, then Σ_{j∈N_i} S_j[t-1] < 0 and the agent under analysis will also cooperate, leading to S_i[t] = -1. Otherwise, the agent draws a random number between 0 and 1 to be compared to its own selfishness gene s_i (which is also randomly generated, as discussed later) in order to decide whether it will start cooperating. If it does not cooperate, it again draws a random number to be compared to the selfishness gene s_i, but now to decide whether it stays inactive (i.e. S_i[t] = 0) or adds another load to the circuit (i.e. S_i[t] = +1).
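The decision rule above can be sketched as follows (a minimal sketch; the function and parameter names are ours, not from the paper, and the two random draws follow the description above):

```python
import random

def decide(gain, prev_state, neighbour_states, selfishness, gain_min=0.01):
    """One agent's decision step.

    States: -1 = cooperate (remove a resistor), 0 = do nothing,
    +1 = defect (add a resistor). `neighbour_states` are the (possibly
    erroneous) states received over the communication network.
    """
    # A successful strategy (gain above the minimum) is kept as-is.
    if gain >= gain_min:
        return prev_state
    # Majority of neighbours cooperating -> cooperate as well.
    if sum(neighbour_states) < 0:
        return -1
    # Otherwise decide randomly against the selfishness gene s_i.
    if random.random() > selfishness:
        return -1          # start cooperating
    if random.random() < selfishness:
        return +1          # add another load
    return 0               # stay inactive

# Example: unsuccessful strategy, cooperative neighbourhood -> cooperate
assert decide(gain=0.0, prev_state=+1, neighbour_states=[-1, -1, 0],
              selfishness=0.9) == -1
```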

B. Communication network
In the multi-layer system proposed here, agent i knows the state S_j[t-1] of the agents j ∈ N_i through a communication network. We assume that agent j always transmits its actual state S_j[t] to agent i. The neighbourhood N_i of agent i is defined as the set of agents j ∈ A\{i} that are directly linked with it. In the case of the ring topology, the cardinality of N_i is 2 for all agents i ∈ A. For more complex network topologies, N_i will be characterised differently [23], as discussed later.
The communication links can also experience errors. An error event means the message received by agent i contains different information than agent j has sent. Let S_{j→i}[t-1] = S_j[t-1] be the state information transmitted from j to i at time-step t and Ŝ_{j→i}[t-1] be the information received by i. We consider that error events are independent and identically distributed such that Pr[Ŝ_{j→i}[t-1] ≠ S_{j→i}[t-1]] = p_err for all t ∈ Z, i ∈ A and j ∈ N_i. It is worth mentioning that the network is a bidirectional graph, so an error event at i → j does not imply an error event at j → i, and vice-versa.
If an error event happens, the received information Ŝ_{j→i}[t-1] will be uniformly distributed over the other possible states. For example, if S_{j→i}[t-1] = -1 and an error happens, then Ŝ_{j→i}[t-1] = 0 or Ŝ_{j→i}[t-1] = +1 will be received with 50% chance each.
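This channel model can be sketched as follows (a hypothetical helper, assuming the uniform choice among the two remaining states described above):

```python
import random

STATES = (-1, 0, +1)

def received_state(sent, p_err):
    """With probability p_err the receiver reads one of the two *other*
    states, chosen uniformly (50% each); otherwise it reads the true state."""
    if random.random() < p_err:
        return random.choice([s for s in STATES if s != sent])
    return sent
```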

C. Physical system
For the physical system presented in Figure 1, there exists a certain number of resistors that leads to the maximum power gain in the system, as shown in Figure 2. If the system sits to the right of the maximum, there is a gain from removing a resistor until the system reaches that point. Conversely, if it sits to the left, there is a gain from adding a resistor.
One may then ask the following question: Is it possible to reach the optimal point while fairly delivering power among the agents (i.e. they all consume about the same amount of power)? In the presence of a central control unit this would be a fairly easy problem: first find the number of resistors that leads to the optimal power, and then fairly allocate them among the agents by some kind of centralised coordination mechanism, like time division in computer networks or cellular systems [24]. For example, if there are ten agents and the optimal number of resistors is twenty, then the central control coordinates the behaviour such that all ten agents have two active resistors, summing up to twenty. However, as discussed before, our model does not consider the presence of a central control, and the agents have limited knowledge about the other agents.
At time-step t, the power P_i[t] that each agent i ∈ A = {1, ..., N} consumes is given by

P_i[t] = P_typ µ a_i[t] / (a_i[t] + r_i[t] + µ)²,   (1)

where a_i[t] is the number of active resistors agent i possesses, r_i[t] is the number of active resistors in the system excluding the ones controlled by agent i, P_typ = V²/R_V and µ = R/R_V (the derivation is presented in Section IV). The physical system is then described by its size N, the ratio µ of the resistance values and the power source V. The resistors are scaled so that the optimal average number of resistors a*_avg is independent of N, while the voltage is scaled with √N to keep the ratio of power per agent constant, as explained in Section IV. The gain that agent i experiences at time-step t is then defined as

λ_i[t] = (P_i[t] - P_i[t-1]) / P_i[t-1],   (2)

which implies that the agents only use information about the previous time-step t-1. If we expand (2) using (1), the equation that determines λ_i[t] becomes rather complicated. To make the analysis clearer, we apply the following first-order approximation (more details in Section IV):

λ_i[t] ≈ ∆a_i[t]/a_i[t-1] - 2(∆a_i[t] + ∆r_i[t])/(N a_avg[t-1] + µ),   (3)

such that the gain λ_i[t] is now a function of the variation ∆a_i in agent i's own number of resistors, the variation in the number of resistors controlled by the other agents (expressed through the average number of resistors a_avg[t]) and the system parameters N and µ.
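The power function and the gain approximation can be checked numerically; the sketch below uses illustrative parameter values (µ = 20, P_typ = 1, both our choices) and writes the denominator of the approximation as a_i + r_i + µ, which equals N·a_avg + µ:

```python
def power(a_i, r_i, mu=20.0, p_typ=1.0):
    """Delivered power P_i = P_typ * mu * a_i / (a_i + r_i + mu)^2."""
    return p_typ * mu * a_i / (a_i + r_i + mu) ** 2

def gain(a_prev, r_prev, a_now, r_now, mu=20.0):
    """Exact relative gain (P[t] - P[t-1]) / P[t-1]."""
    p_prev = power(a_prev, r_prev, mu)
    return (power(a_now, r_now, mu) - p_prev) / p_prev

def gain_approx(a_prev, r_prev, da, dr, mu=20.0):
    """First-order approximation of the gain."""
    return da / a_prev - 2.0 * (da + dr) / (a_prev + r_prev + mu)

# The total delivered power saturates at a_i + r_i = mu resistors:
assert power(20, 0) > power(21, 0) and power(20, 0) > power(19, 0)
```

The last line illustrates the concavity shown in Figure 2: past the saturation point (here 20 resistors in total), adding resistors reduces the delivered power.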
Let C_t ⊆ A be the set of cooperative agents at time-slot t, where |C_t| is the cardinality of the set. The spatial-temporal average number of cooperators in the system is then c_avg = (1/T) Σ_t |C_t|, with T being the number of simulated time-steps. Figure 3 shows the change of the system behaviour with varying size N. The left side shows the changes in the spatial-temporal average number of cooperators c_avg as a function of the system size N, while the right side shows a representation of a typical spatial-temporal system behaviour for three different sizes N. In the latter, each line of vertical pixels represents the state of the system for one time-step: white means cooperation, red means defection and black means doing nothing.
For small sizes (as when N = 10), one can see a kind of checkerboard pattern where cooperators and defectors alternate along the time and space axes. For middle-sized systems (as when N = 100), the most striking feature is that there exist certain points in time when sudden changes in behaviour happen. The system seems to move closer and closer to a global cooperation state until it suddenly falls back to a state with much less cooperation. Such a pattern becomes even more pronounced for a meshed communication network built as a Watts-Strogatz graph [23], as seen in Figure 4. For larger systems (as when N = 1000), one can see a pattern where cooperation is dominant and only a few stripes of defection appear. As we will see next, this behaviour is not due to simple scaling effects in the system variables, but rather results from inherent scaling effects within the physical layer. In other words, certain behaviours can only be observed for certain system sizes N. Therefore, for a given N, one cannot pre-set the system variables expecting a certain kind of behaviour. Rather, the size of the system is itself a variable that influences its own macro-level behaviour.
Figure 5 shows how the minimal gain λ_min affects the agents' behaviour, quantified by the average number of cooperators in the system c_avg, for two different system sizes N. One can see that bigger values of λ_min lead to more cooperation in the system and to different behaviour patterns. But only in mid-sized systems (as when N = 100) can one see wave-like patterns for certain ranges. We can then infer that the system is dominated by the agent behaviours for high and low thresholds of λ_min, while complex behaviour only emerges in the few cases where a proper interplay between the layers occurs.
To better understand how the minimal gain influences the behaviour, we need to analyse (3). We find that the upper limit for the tipping point is given by 1/λ_min (more details in Section IV), meaning that beyond such a point the gain of adding another resistor becomes too small for a single agent. This stands in contrast to the global optimum, which represents the tipping point after which another resistor added to the system as a whole results in reduced power delivered to the agents. Only in the case that all agents behave equally at every point in time, i.e. a_i[t] = a_avg[t] ∀ i ∈ A, does the tipping point for each agent also become the global optimum at N a_avg[t] = µ.
As the system grows, the feedback that each agent can infer is reduced, and so is its influence on the system as a whole. For large systems, the deviation of a single agent from the average of the rest of the system has little influence on the average of the whole system. This reduced feedback leads to a situation where agents become trapped in a state of what we call cooperative solidarity. Agents in this trapped state have reduced their number of resistors to the minimum (i.e. only one active resistor) while still not having seen a positive gain. Consequently, they remain cooperative, as do their neighbours.
In this scenario, all resources of the system will be used by the agents that are not yet trapped, leading to high inequality levels. For large systems the feedback about individual behaviours is so small that the only way to escape from this solidarity trap is, surprisingly, through communication errors. In this case, the trapped agents (wrongly) believe that some neighbours have stopped cooperating.
For middle-sized systems, however, the gain may become large enough if a sufficient number of agents synchronise in reducing their number of resistors. This results in a positive gain for all the trapped agents, since the system recovers from a state of overuse. This positive gain then leads to agents that no longer seek cooperation, as they are already in the lowest state possible and the system has recovered. This in turn explains the wave-like behaviour in mid-size systems.
Figure 6 shows the system from a different perspective. Let us first define the power utilisation P_util as the ratio between the power utilised by the system and the available power: P_util = (Σ_{i∈A} P_i,avg) / P_avail, where P_avail = V²/(4 R_V) is the maximum power the source can deliver to the loads and P_i,avg is the time-average power of agent i, computed as P_i,avg = (1/T) Σ_t P_i[t].
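A sketch of this metric follows; note that taking the "available power" as the maximum power the source can transfer to a matched load, P_avail = V²/(4 R_V), is our reading, and the parameter values are illustrative:

```python
def power_utilisation(p_avg, v=1.0, r_v=1.0):
    """P_util = sum_i P_i,avg / P_avail, with P_avail = V^2 / (4 R_V)
    being the power delivered to the loads at the matched point R_eq = R_V."""
    p_avail = v ** 2 / (4.0 * r_v)
    return sum(p_avg) / p_avail
```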
By looking at the power utilisation P_util and the average number of resistors a_avg, one can see that both are related to each other. How close the system can operate to the optimum varies wildly between the different system sizes. The only system that can operate very close to the optimum is the very small one with a size of only five agents (N = 5). We also see the sudden jumps in behaviour for mid-size systems, while large ones appear to be the most stable.
To better understand what happens in large systems, we analyse Figure 7, which illustrates the changes in power utilisation and in inequality (fairness) of power usage between the agents, depending on the size of the system N and on different communication network topologies. The inequality (fairness) is measured here by the Gini index [25]: G = (Σ_{i∈A} Σ_{j∈A} |P_i,avg - P_j,avg|) / (2 N Σ_{j∈A} P_j,avg), where total equality yields G = 0 and the biggest inequality yields G = 1.
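The Gini index over the agents' time-average powers can be computed with the standard pairwise form (a sketch; O(N²), fine for the system sizes considered here):

```python
def gini(values):
    """Gini index G = sum_ij |x_i - x_j| / (2 * N * sum_j x_j):
    0 means total equality; values close to 1 mean maximum inequality."""
    n = len(values)
    total = sum(values)
    pairwise = sum(abs(x - y) for x in values for y in values)
    return pairwise / (2.0 * n * total)
```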
We see that, for small systems, the results are very close together, especially for the complex networks Watts-Strogatz and Barabasi-Albert [23]. When the system size increases, on the other hand, one can see a growing difference between the results of the different topologies, suggesting that the communication layer plays a big role in the system dynamics.
One can also see a drastic change for the Barabasi-Albert network with m = 4, which shows a much lower power utilisation than all other networks. This is due to the fact that the lowest degree of any node in this network is four. This means that, for a given agent to break free from the solidarity trap, at least four communication errors must happen, instead of two in the ring topology or only one in Watts-Strogatz (for the nodes with a degree of one). Therefore, the power is underutilised in the Barabasi-Albert network.
The Gini index analysis indicates that, when the system size grows, a few agents receive most of the power, which from a global perspective makes the system very stable. We also see how the outcome is dominated by the structure of the communication network when large systems are considered.
Figure 8 presents more evidence of how the communication system starts dominating the system dynamics when the system size N grows. For small systems, one can only see a small reaction, with a lot of scattering, when raising the communication error probability. Surprisingly, we can see global maxima appearing for mid-size and large systems. This fact may indicate that the communication error probability has a similar effect on our systems as temperature has on the susceptibility of physical systems [26]. When p_err = 0, the system can become trapped in a state of solidarity with 100% cooperation (minimal power consumption). With an increasing error probability, the system gains a random aspect that allows agents to defect. However, if the error probability becomes too high, the state information exchange becomes worthless and the system is dominated by randomness.

III. DISCUSSIONS
We believe that our proposed multi-layer system can indeed emulate features of a real-world system with coupled physical, communication and decision layers. Nonetheless, our framework has not been developed to model any specific infrastructure. Our idea is to construct a toy model whose components are rather simple and easy to understand, but where unpredictable behaviour can emerge in certain circumstances, resembling real-world phenomena such as modern power grids [5] or cities [8]. As previously mentioned, large-scale systems built upon those layers are becoming more and more usual. In any case, we believe that it is also worth discussing the design of the components employed herein. Let us first deal with the rules the agents follow. One might compare the agents to humans, or to machines acting on behalf of humans. In either case, the possible variety of behaviours may be as large as infinite. The same can be said for the interactions that occur between the agents. In this work, the decision procedure and the communication network connections have been arbitrarily chosen to be understandable and justifiable.
Although our simulations assume the network topology and the decision rule (including the selfishness gene and the minimal power gain) as given, both of them could be evolved as part of the simulation.
We could argue that our model allows for complete explanations in the sense that we need not find explanations that are external to the system; everything can be constructed within the system domain. This, however, might mask the results by, for example, slowing down changes in behaviour [19, Ch. 5].
Regarding the basic agent principle of maximising power, it is important to remember that power consumption is usually just a by-product of making life more comfortable and less labour-intensive by using more loads in the electric power grid. Similarly to humans, the agents assumed here are most of the time reluctant to remove a resistor once it is installed. They only consider doing so when they did not get a large enough gain from adding a resistor (which means that the system might be overused) or when the social pressure is too strong (by being non-cooperative in a cooperative neighbourhood).
To further justify the concept of cooperative solidarity, one has to understand prisoner's-dilemma-type situations [15]. If the system is close to the optimal point, the agents see a very low gain, so they would have to change their strategy. If most of them reduce their number of resistors, they might receive a little less power (in the case of the system being on the left side of the optimum) or a little more by getting closer to the optimum. However, if only a certain percentage of them reduce their number of resistors while others raise theirs, the latter gain more power even if the system is heavily overused. If most of them add resistors, however, everyone is worse off than before, which resembles the tragedy of the commons [12].
One of the core findings of this report is that not only is it possible to create systems with complex behaviour through a combination of a few simple parts (as in [18]), but every layer of the system has an influence; in some cases, one layer can even dominate the global system behaviour. Special note should be taken of the strong size-dependency of the model, since real-world engineering systems are always subject to changes in the number of users after the initial deployment, leading to unforeseen situations (e.g. power grids [5], cities [8] or highways [21]).
Another interesting aspect can be found in the system's behaviour in response to communication errors. As shown in Figure 8, a sharp peak exists in the number of cooperators in the system for a communication error probability of about 1%. From an engineering standpoint, one might prefer a system without errors, which (ideally) leads to a stable and more predictable behaviour. However, the existence of even a small amount of errors leads to a significant change in behaviour, and sometimes errors might even be desired (e.g. to unfreeze the system from the solidarity trap).
Let us assume that the system should operate in a state of very high cooperation. Our results indicate that an external attacker trying to disturb the system does not have to shut down the whole communication network to break the dominant cooperative state. Rather, due to the coupled nature of the system as a whole, it would suffice to generate a small amount of randomly-generated erroneous messages, as this will unfreeze the system and might impose a new systemic dynamics. Consequently, security precautions should be designed accordingly.

A. Physical layer
The physical layer used in this paper is depicted in Figure 1. The value of the resistors R that are under the control of the agents i ∈ A should be scaled with the number of agents in the system so that R = N R_0, where R_0 is an arbitrarily chosen constant value, independent of the system size.
If the agents have k appliances in total, the equivalent resistance of the system is R_eq = R/k (k resistors of value R in parallel) [11]. The system starts with the agents having random numbers of resistors such that the equivalent resistance is above the system-wide optimal point R*_eq = R_V, computed in terms of the power consumed by the agents.
Let us now describe the system from the point of view of a single agent i ∈ A. For agent i, all the other agents can be combined at time-slot t into an equivalent resistance R/r_i[t], where r_i[t] stands for the number of active resistors in the system that do not belong to agent i. The agent itself is then described by the equivalent resistance R/a_i[t], where a_i[t] is its own number of active resistors. We can then derive the power P_i[t] that agent i consumes at time-slot t as

P_i[t] = a_i[t] (I[t] R_eq[t])² / R = P_typ µ a_i[t] / (a_i[t] + r_i[t] + µ)²,

where V is the voltage source, R_eq[t] = R/(a_i[t] + r_i[t]) is the equivalent resistance of all loads, I[t] = V/(R_V + R_eq[t]) is the electric current passing through the source resistor R_V, P_typ = V²/R_V and µ = R/R_V. Note that we scale the voltage with the square root of the system size so that the power available for the agents stays constant.
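The closed-form expression can be sanity-checked against a direct circuit computation (a sketch with illustrative values V = 1, R_V = 1, R = 20; the R = N R_0 and √N voltage scalings are omitted for brevity):

```python
def power_direct(a_i, r_i, v=1.0, r_v=1.0, r=20.0):
    """Compute P_i from first principles: k = a_i + r_i loads of value R
    in parallel behind the source resistance R_V."""
    k = a_i + r_i
    r_eq = r / k                    # equivalent load resistance
    current = v / (r_v + r_eq)      # current through the source resistor
    v_load = current * r_eq         # voltage across the parallel loads
    return a_i * v_load ** 2 / r    # power dissipated in agent i's loads

def power_closed(a_i, r_i, v=1.0, r_v=1.0, r=20.0):
    """Closed form: P_i = P_typ * mu * a_i / (a_i + r_i + mu)^2."""
    mu = r / r_v
    p_typ = v ** 2 / r_v
    return p_typ * mu * a_i / (a_i + r_i + mu) ** 2
```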

B. Communication layer
The communication layer allows for the exchange of information between the agents, which is a necessary condition for coordinating their actions in the system. In this case, we need to describe what kind of information is sent by the agents and how they build links, thus creating their neighbourhood sets.
Every agent i ∈ A sends to its neighbours (to be defined next) a message containing its own state at time step t, S_i[t] ∈ {-1, 0, +1}. The transmitted message may be detected in error with a given error probability p_err ∈ [0, 1], independently of everything else. If an error event happens in the message detection, the receiving agent reads one of the two other possible states with equal probability. In this case, the error probability p_err is the only parameter we have control over.
It is worth mentioning that, although we assume p_err as given, it is in fact a result of the communication strategy used [24]. Nevertheless, we believe that such a more realistic approach goes beyond the focus of this work.
The topology of the communication network defines the neighbourhood set of the agents. In this report we focus on three classes of graphs: ring, Watts-Strogatz (WS) and Barabasi-Albert (BA) [23]. The ring network is defined such that agent i is connected to agents i - 1 and i + 1 (refer to Figure 1). In this case, agent 1 is connected to agent N and vice-versa, so the graph topology resembles a ring. To define more complex neighbourhoods, we employ the Watts-Strogatz (WS) and Barabasi-Albert (BA) graphs used for social networks. The WS graph is constructed as follows. The graph starts as a regular lattice where each node has exactly K neighbours. The links are then rewired with probability β, resulting in a more random structure.
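A pure-Python sketch of the WS construction (a standard ring-lattice-plus-rewiring generator, not the authors' code; k must be even):

```python
import random

def watts_strogatz(n, k, beta, seed=None):
    """Start from a ring lattice where each node links to its k/2 nearest
    neighbours on each side, then rewire each lattice link with
    probability beta to a uniformly chosen non-neighbour."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):                      # build the regular lattice
        for d in range(1, k // 2 + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    for i in range(n):                      # rewire with probability beta
        for d in range(1, k // 2 + 1):
            j = (i + d) % n
            if rng.random() < beta and j in adj[i]:
                candidates = [c for c in range(n)
                              if c != i and c not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj
```

With β = 0 this is the regular lattice; rewiring preserves the total number of links while shortening average path lengths.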
The BA graph is formed by adding the desired number of nodes step-by-step, starting from a small initial set. Each new node is preferentially connected to nodes that already have a high degree: an existing node i with degree k_i is chosen with probability p_i = k_i / Σ_j k_j, generating a network whose degree distribution is given by a power law.
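A sketch of this preferential-attachment construction (a standard BA generator, not the authors' code; sampling uniformly from a degree-weighted endpoint list implements p_i ∝ k_i):

```python
import random

def barabasi_albert(n, m, seed=None):
    """Grow a BA graph: each new node connects to m existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Start from a small fully connected core of m + 1 nodes.
    adj = {i: {j for j in range(m + 1) if j != i} for i in range(m + 1)}
    # One entry per edge endpoint: uniform sampling from this list is
    # sampling proportionally to degree.
    endpoints = [i for i, nbrs in adj.items() for _ in nbrs]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        adj[new] = set(targets)
        for t in targets:
            adj[t].add(new)
            endpoints += [new, t]
    return adj

G = barabasi_albert(100, m=4, seed=1)
assert min(len(nbrs) for nbrs in G.values()) >= 4  # lowest degree is m
```

The final assertion reflects the property used in the Results section: with m = 4, every node in the BA network has degree at least four.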

C. Agent behaviour
We explain here the behaviour of the agents and its relation to the physical and communication layers previously defined. Let us start with the scenario where none of the neighbours of agent i, defined by the neighbourhood set N_i, is cooperative. This is the default state at the beginning. Agent i then behaves randomly according to its selfishness gene s_i. The selfishness gene is a uniformly distributed random number assigned to every agent i ∈ A before the simulation itself starts.
The decision procedure is the following: a random number ξ is drawn. If ξ > s_i, agent i switches to the cooperative mode and removes one resistor. Otherwise, it switches to one of the non-cooperative modes. This means that, when s_i is large, agent i is more selfish; conversely, agent i is more cooperative when s_i is small. In the non-cooperative modes, the agent draws a second random number to decide whether it adds a resistor (if ξ < s_i) or does nothing (otherwise). The decision process is consistent, but depends on random variables; therefore, the higher the selfishness, the higher the probability that an agent starts accumulating resistors.
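The two-draw procedure above can be sketched as follows; the function name `decide` is illustrative, and the return value is the change Δa_i in the agent's number of resistors:

```python
import random

def decide(s_i, rng=random):
    """One decision step for an agent with no cooperative neighbours.

    Returns the change in the agent's number of resistors:
    -1 (cooperate: remove one), +1 (defect: add one), 0 (do nothing).
    """
    if rng.random() > s_i:   # xi > s_i: switch to the cooperative mode
        return -1
    # Non-cooperative modes: draw again to choose between adding and keeping.
    if rng.random() < s_i:   # xi < s_i: add a resistor
        return +1
    return 0
```

A fully altruistic agent (s_i = 0) always removes a resistor, while a fully selfish one (s_i = 1) always adds one, matching the text's observation that high selfishness leads to resistor accumulation.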
The agent behaviour also depends on the minimal gain λ_min that the agents need in order to stick to their strategy. We assume that λ_min is fixed and pre-defined before the simulation. The gain λ_i[t] that agent i has at time-slot t is the relative change of its delivered power, λ_i[t] = ΔP_i[t]/P_i[t]. Let us now assume that the functions of t are continuous so that we can use the total derivative rule as follows:

dP_i/P_i = da_i/a_i − (2 R_V)/(n R_V + R) dn,

where P_i is given by (1) and dn = da_i + dr_i, with dr_i denoting the change in the number of resistors owned by the other agents. Now, returning to the discrete domain, we consider ΔP_i[t] = P_i[t] − P_i[t − 1], and likewise for Δa_i[t], Δr_i[t] and Δn[t]. Then, the gain λ_i[t] can be evaluated as

λ_i[t] = Δa_i[t]/a_i[t] − (2 R_V (Δa_i[t] + Δr_i[t]))/(n[t] R_V + R).

The advantage of proceeding in this way is that one can see that the gain depends not only on the amount of resistors agent i possesses and on the behaviour of the system, but also on the system size N.
The higher the number of resistors agent i has, the smaller the first term is, since Δa_i[t] can only be −1, 0, or +1. For a very large system (N → ∞) and agent i non-cooperative (i.e. Δa_i[t] = 1), agent i reaches the minimum gain when it reaches a_max = 1/λ_min.
For example, the minimal gain λ_min = 0.0005 used in most of the simulation scenarios leads to a_max = 2000, which is far above the optimal point a*_avg = µ = 100. The second term, which has a negative leading sign, provides some sort of feedback. We can split this term into the feedback from individual actions Δa_i[t] and from external actions Δr_i[t]. The negative leading sign means that this term reduces the gain when the amount of resistors rises (Δa_i[t] = 1), and delivers a positive contribution when the agent cooperates (Δa_i[t] = −1) or keeps its state by doing nothing (Δa_i[t] = 0). In any case, the effect of such feedback diminishes with rising system size N.
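A minimal numerical sketch of this gain makes the two limits easy to check. It assumes the parallel-circuit power P_i = a_i V² R / (n R_V + R)², so that the first-order relative change is Δa_i/a_i − 2 R_V (Δa_i + Δr_i)/(n R_V + R); the function name and parameter values are illustrative:

```python
def gain(a_i, delta_a_i, delta_r_i, n, R, R_V):
    """First-order relative power gain of agent i.

    a_i: resistors owned by agent i; delta_a_i: its action (-1, 0, +1);
    delta_r_i: net change by all other agents; n: total resistors in
    the system; R, R_V: load and source internal resistance.
    """
    return delta_a_i / a_i - 2 * R_V * (delta_a_i + delta_r_i) / (n * R_V + R)
```

In a very large system (n huge) the feedback term vanishes, so a defector's gain reduces to 1/a_i and drops to λ_min exactly at a_i = 1/λ_min; in a small system, cooperating (Δa_i = −1) can yield a positive gain through the feedback term.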

D. Simulation tools
This research was carried out using different open-source Python tools, including NumPy [27], IPython [28], NetworkX [29] and Matplotlib [30]. We would like to acknowledge the computing facilities of CSC - IT Center for Science Ltd. [31], which were used to run the simulation scenarios.
Fig. 1. Circuit representing the physical layer of the system. The circuit is composed of a voltage source V with internal resistance R_V and resistors of resistance R connected in parallel. The resistors are related to agents that can add, remove or keep the resistors under their control. The minimum number of resistors an agent can have is one and there is no maximum. We define N as the size of the system. Besides the physical layer, the agents are connected in a communication network so that a given agent has access to the information related to the previous action of its first-order neighbours [23]. In the ring topology illustrated, every agent is connected with two other agents: agent i is connected with agents i − 1 and i + 1, with i = 1, 2, ..., N. In the ring topology, agents 1 and N are neighbours.
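The total power this circuit delivers to the loads can be sketched as follows, assuming n identical resistors R in parallel fed by a source of voltage V and internal resistance R_V (function and variable names are illustrative):

```python
def p_all(n, V, R, R_V):
    """Total power delivered to n parallel resistors R fed by a
    voltage source V with internal resistance R_V."""
    r_eq = R / n                 # equivalent resistance of the parallel bank
    i = V / (R_V + r_eq)         # current drawn from the source
    return i**2 * r_eq           # power dissipated in the loads
```

By the maximum power transfer theorem, P_all peaks when the equivalent load matches the source resistance, i.e. at n = R/R_V, where P_all = V²/(4 R_V) = P_typ/4; adding resistors beyond that point lowers the total delivered power.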

TABLE II
CLASSIFICATION OF AGENT i ∈ A BEHAVIOUR BASED ON ITS ACTION AT TIME t.

Figure: Normalised power delivered P_all[t]/P_typ to the agents with a rising number of resistors n[t] in the system, where P_typ = V²/R_V and P_all[t] = Σ_{i∈A} P_i[t], with P_i[t] given by (1).