Quid Pro Quo: A Mechanism for Fair Collaboration in Networked Systems

Collaboration may be understood as the execution of coordinated tasks (in the most general sense) by groups of users who cooperate to achieve a common goal. Collaboration is a fundamental assumption and requirement for the correct operation of many communication systems. The main challenge when creating collaborative systems in a decentralized manner is dealing with the fact that users may behave in selfish ways, trying to obtain the benefits of the tasks without participating in their execution. In this context, Game Theory has been instrumental in modeling collaborative systems and the task allocation problem, and in designing mechanisms for the optimal allocation of tasks. In this paper, we revisit the classical assumptions of these models and propose a new approach to this problem. First, we establish a system model based on heterogeneous nodes (users, players), and propose a basic distributed mechanism so that, when a new task appears, it is assigned to the most suitable node. The classical technique for compensating a node that executes a task is the use of payments (which in most networks are hard or impossible to implement). Instead, we propose a distributed mechanism for the optimal allocation of tasks without payments. We prove this mechanism to be robust even in the presence of independent selfish or rationally limited players. Additionally, our model is based on very weak assumptions, which makes the proposed mechanisms suitable for implementation in networked systems (e.g., the Internet).


Introduction
Selfish behavior is becoming a subject of great concern and practical importance to network designers [1]. Game Theory is the preferred approach to the design of communication systems with (potentially) selfish entities. This has led to the proposal of a number of interesting protocols and mechanisms for networks based on Game Theory concepts [2,3]. However, in the study of networks under conventional models, a collection of simplifying assumptions is typically made. For instance, it is assumed that selfish users are rational, that they are homogeneous, that they can compute a Nash equilibrium, that their utility function is known, etc. There are many systems in which these assumptions are not very realistic.
In this paper we revisit the study of communication systems with selfish users (or players), reevaluating and relaxing the above-mentioned common assumptions. In particular, we pose the problem of analyzing and designing a fair collaborative system under a very weak set of game theoretic assumptions. In this general context, we propose mechanisms to implement this collaborative system with provable properties, like the fairness of the system and the truthfulness of its users. The mechanisms proposed can be applied to such varied technologies as social and crowd computing, Web 2.0, P2P, opportunistic networks, and cloud technologies.
As mentioned, we abstract the problem to be solved as the fair execution of tasks in a decentralized collaborative system. The main challenge when creating collaborative systems in a decentralized manner is dealing with the fact that system nodes may behave in selfish ways, trying to obtain the benefits of the tasks without participating in their execution. (This is the realm of Game Theory, which has been instrumental in modeling collaborative systems and the task allocation problem, and in designing mechanisms for the optimal allocation of tasks.) We assume that all nodes have an interest in having the tasks done. However, establishing fair mechanisms for sharing the generated workload is not straightforward. (E.g., in current P2P systems, a small fraction of peers usually takes on most of the required effort, and this causes reduced performance, lack of reliability, low incentive to participate for fair users, etc.) It would therefore be desirable for each node to take responsibility for the execution of a balanced fraction of the tasks.
The objective is to establish some kind of protocol to share the task execution costs. For this, we need to consider the concept of ability or opportunity of execution. Let us assume that each node has some capacity for timely execution of a given task. This capacity may vary over time and with the type of task. For example, at a given time, a node may have free bandwidth but a fully utilized CPU, while its situation could be the opposite at another time. Hence, at a particular moment, a node may have greater ability to perform tasks involving communication, while at a later time it may prefer tasks that are more CPU-intensive.
This opportunity or ability is related to the notion of task execution cost. In other words, we define the cost as some kind of metric measuring the capability of executing a particular task at a given time. Hence, the cost varies from one task to another (even when the task is the same, but at a later time). In Game Theory, closely related to cost, there is the notion of utility. We define the utility as the cost savings associated with work not done. Hence, given that all nodes are interested in the execution of the tasks, a node gets more utility whenever it avoids running tasks by letting other nodes run them.
Clearly, when trying to formalize a model based on these notions, a number of problems arise. First, a node's costs are only known by the node itself. For external entities it would be difficult to audit or check whether a particular node has more or less CPU capacity. In Game Theory, this concept is called private information. To obtain the private information of a node, the basic mechanism is to ask for it directly and expect the node to declare its value correctly.
For us, each node is a computing node that belongs to a user who can alter her node's behavior for her own benefit (i.e., she may declare false costs trying to avoid the execution of tasks). Whenever this happens, we say that the user acts in a selfish way. This selfishness is one of the factors that may distort the internal workings of a distributed application. The loss of system performance produced by selfish nodes is a parameter to consider, and it is called the price of anarchy [4,5].
Therefore, the problem we face consists of designing a system capable of assigning tasks to nodes so that all the tasks are executed and the total cost incurred is minimal. When the behavior of nodes is guaranteed to be fair, this is just a simple optimization exercise. However, when nodes may choose whether to be selfish, the problem becomes much more complex. In this paper we propose an algorithm that, based on game theory principles, solves this problem. We have called this algorithm the Quid Pro Quo Mechanism (QPQ). The name comes from a Latin expression commonly used by lawyers, which may be translated as "This for that" or "A thing for another". This expression is often used when someone does a job and expects an equivalent compensation in exchange. We use this expression since it reflects the spirit of the algorithm: due to the lack of payments in our model, the nodes work for others with the hope that others will work for them in the future.

State of the Art
As described above, the problem addressed in this paper is the allocation of task executions to potentially selfish users. This problem has been extensively studied in the literature. One important related work was carried out by Rosenschein et al. [6], where they define a "Task Oriented Domain". Even though they obtain fairly relevant conclusions, they do not shed any light on the specific problem considered here, since their model makes strong assumptions, such as knowledge of the task costs or a bargaining power over time. Recently, the use of game theory to model selfish behavior in the design of distributed systems has been proposed. Some works have appeared using mechanism design, a branch of mathematics derived from game theory, which provides the required background for the study and design of distributed systems under the action of selfish nodes (see, e.g., [7,8,9]).
In this direction, our QPQ algorithm is similar to the mechanism proposed by Jackson et al. [10]. In that work, they present an interesting new type of mechanism (called "linking mechanism") which, instead of offering incentives or payments to players, limits the spectrum of players' responses to a probability distribution known by the game designer. In that paper the authors proved that a linking mechanism is valid when the players' possible decisions are distributed following discrete probabilities. Additionally, the authors show that a linking mechanism can also be used for repeated games. Even though the work of Jackson et al. is very relevant to the problem we consider, it does not offer a method for the construction of mechanisms when the game is based on unknown continuous probability distributions, as assumed here. A second work that explores the idea of linking mechanisms is due to Ferenc [11]. In that paper, he proposes a mechanism which limits player responses by restricting the first two moments (mean and variance) of the probability distribution, that distribution being known to the designer. Both works reflect the main idea behind the concept of linking mechanism: when a game consists of multiple instances of the same basic decision problem (e.g., saying yes or no, choosing among a number of discrete options), it is possible to define selfishness-resistant algorithms by restricting the players' responses to a given distribution. Hence, in that case, the frequency with which a player declares a particular decision is known beforehand.
In the specific areas of computing and communications, it is important to remark that most mechanisms proposed for dealing with selfish agents make unrealistic assumptions [12]. In this direction, Bauer et al. [13] criticize many of these hypotheses, reviewing well-known works [14,15,16] to show that they are not applicable in real environments. Specifically, they identify two common strong artificial assumptions:
1. The assumption that the designer of the algorithms has some knowledge about the preferences of the nodes.
2. The assumption that the interaction among players is limited to a single round (while it is well known in the literature that a solution for a single round does not necessarily apply when the game is repeated).

Contributions
In this paper, we address the problem of task allocation while relaxing these (and other) common hypotheses, so that the obtained results can be applied in real environments. The contributions of this paper are twofold. First, to the best of our knowledge, this is the first work proposing a linking mechanism solution without prior knowledge of the distribution of the players' decisions, and without a payment system among them. Second, we generalize and improve previous works in the area to provide algorithms that are suitable for repeated task allocation in real communication and computing systems, even in the presence of selfish or non-rational users.
As we previously claimed, we do not want to restrict our mechanism to a set of unrealistic hypotheses. Instead, we establish a number of requirements that our model must satisfy. These requirements should provide the appropriate flexibility to guarantee the applicability of our results in real environments.

Abstract utility metrics
We assume, as an abstract notion, that the cost of executing a task to a node depends on its interest in the task, its opportunity or ability to execute it, or its degree of willingness to cooperate. We need to accept that each node may measure this parameter in its very own metric and units. Hence, for example, a node may decide on the cost of a task according to the occupation of its CPU, but another one may prefer to make it depend on its available bandwidth. In a real scenario, the number of factors that can influence the execution cost of a task can be extremely large. In this direction, our model must enable each node to define, in a flexible way, how costs (and utilities) are measured.
No payment system
Payments are, in their most basic interpretation, a way of exchanging costs. Many existing mechanisms base their incentive schemes on the existence of payments. For payments to be possible, it is necessary that all players share a common currency reference (euro, dollar, etc.). However, given our previous requirement, it is not clear how we can find that shared currency reference in our model. If a node measures its costs in terms of, for example, reputation, it can hardly "pay" another node that measures its costs in CPU units. Hence, in our work, we assume that payments are not possible.
Player's rationality
In game theory, most of the existing algorithms require players to be perfectly rational. This means that a player, using the available information, should always be capable of selecting the best strategy (the one that maximizes her utility). However, this is a controversial hypothesis which has drawn much criticism. Accepting this assumption means that players are capable of mathematically calculating all alternatives, which in some cases requires solving complex (NP-hard) problems. Clearly, this is not always feasible for all players. Hence, we commit ourselves to proposing mechanisms suitable for finding quasi-optimal task allocations, even in the presence of rationally limited players.
Incentive to participate
In relation to players' rationality, even when we are able to find a global quasi-optimal task allocation, it is possible that the behavior of rationally limited users harms the benefit of other players. In this direction, we add a stronger requirement: the mechanism must ensure an incentive to participate in the game for all nodes, regardless of whether they are rational or not.
No central entity
A final requirement we impose is that the system must be able to work without any kind of central entity. This means that the proposed mechanisms must be amenable to a completely distributed implementation.

Structure
The rest of the paper is structured as follows. In Section 2 we provide a formal definition of the problem and define basic terminology. In Section 3 we present a basic linking mechanism, and evaluate the issues that need to be faced to make it suitable for our problem. In Section 4 we present the QPQ mechanism, and formally prove its properties. In Section 5 we describe how QPQ could be used in real environments. Finally, Section 6 concludes the paper.

Definitions
To establish a formal framework for the problem, let us provide some definitions.
Definition 2.1 (Problem). The problem of the assignment of tasks is a tuple $\langle T, N, C \rangle$ where:
1. $T = \{t_1, t_2, \ldots\}$ is the (not necessarily finite) set of tasks that are issued to the system over time ($t_k$ is the task issued at time step $k$). We assume tasks to be atomic, independent, and of fixed duration $\sigma$. (For simplicity we will assume $\sigma = 1$, i.e., each task takes one time step to be executed.) Note that we assume that complex tasks may be divided into atomic tasks.
2. $N = \{1, 2, \ldots, n\}$ is an ordered list of nodes or players, where $N$ is assumed to be finite.
3. $(C_i(t))_{i \in N}$ is a vector of costs (or utilities), where $C_i(t)$ is the cost of executing task $t \in T$ by node $i$. This information is private (only known by node $i$).
It is important to remark on some aspects of the above definitions. First, we assume that the set of tasks is not known beforehand. Tasks appear one by one in a sequence of time steps, which drive our discrete time evolution. Hence, the arrival of a new task dictates the start of a new round of our repeated game. We assume that tasks are mutually independent and that the execution of a task does not influence the cost of subsequent ones. Moreover, we require that each task be completely executed by the time the next task is issued. For simplicity, we assume that the mechanisms to coordinate the allocation of the tasks take negligible time (with respect to the time step). Finally, we assume that every node that is assigned a task by the allocation mechanism actually executes the task.
Hence, as tasks are issued, each node $i \in N$ estimates a sequence of costs $C_i(t_1), C_i(t_2), \ldots, C_i(t_k), \ldots$, which we assume to be independent samples of a probability distribution $\sigma_i \in \Delta(S_i)$ characterizing node $i$'s behavior. In this context, we denote by $S_i$ the distribution support (i.e., the range of values for which the probability is nonzero) and by $\Delta(S_i)$ the set of all possible probability distributions over $S_i$. From now on, we will consider that $C_i$ is a real-valued random variable with probability distribution $\sigma_i \in \Delta(S_i)$. To simplify the notation, we define realizations of this random variable as $c_i(t) = C_i(t)$, $t \in T$. When clear from the context, we may remove the task $t$ from the notation $c_i(t)$, writing simply $c_i$.
Given that all players enjoy the result of any task executed in the system, we can define the utility of a player as the savings obtained by not executing some tasks (i.e., the benefit obtained from participating in the cooperative computing scheme and not doing all the work by itself). That is, the utility $u_i(t)$ of node $i$ corresponding to a given task $t$ is given by
$$u_i(t) = \begin{cases} c_i(t) & \text{if node } i \text{ does not execute task } t, \\ 0 & \text{otherwise,} \end{cases}$$
and the total utility of node $i$ is $u_i = \sum_{t \in T} u_i(t)$. We define $U_i$ as the random variable associated with the total utility of node $i$. In a similar way, we denote by $W_i$ the real-valued random variable associated with the cost actually executed by player $i$, and by $w_i(t)$ its concrete realization for task $t$. Note that each task could be executed or not by a particular player. Hence,
$$w_i(t) = \begin{cases} c_i(t) & \text{if node } i \text{ executes task } t, \\ 0 & \text{otherwise.} \end{cases} \tag{2.2}$$
Finally, we assume that communication between players is reliable and concurrent. In particular, in the mechanisms we propose all players exchange their values $c_i(t)$. We assume that these values are correctly received by the players in a time that is negligible with respect to the time step (hence the reliability property). Additionally, we assume that each player sends its value before receiving the value of any of the other players (hence the concurrency property).
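As a small illustration of these definitions, the following sketch accumulates the total utility and the total executed work of a single node over a sequence of tasks. It assumes the reading used above: a task contributes its cost to the utility when the node does not execute it, and to the work when it does. The function name and the sample values are hypothetical.

```python
def utility_and_work(costs, executed):
    """Return (total utility, total work) of one node.

    costs[k]    -- the node's cost c_i(t_k) for task k
    executed[k] -- True if the node itself executed task k
    """
    # Utility: cost saved on every task that somebody else executed.
    utility = sum(c for c, ran in zip(costs, executed) if not ran)
    # Work: cost actually paid on the tasks the node executed itself.
    work = sum(c for c, ran in zip(costs, executed) if ran)
    return utility, work


costs = [4.0, 1.0, 3.0]            # illustrative private costs
executed = [False, True, False]    # the node ran only the second task
total_utility, total_work = utility_and_work(costs, executed)
print(total_utility, total_work)
```

Since every task cost lands in exactly one of the two sums, utility plus work always equals the cost of doing all tasks alone.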

Basic Linking Mechanism
As mentioned above, a linking mechanism is applicable to repeated games where the decision (also known as message) of players is restricted to a particular known set. In our problem, the decision is the cost $c_i(t)$ of the task. With this concept in mind, let us define our first algorithmic attempt to solve the problem by applying a linking mechanism, presented in Algorithm 1.
Algorithm 1 Simple linking mechanism (code for node $i$ and a generic task $t$, which is omitted from the notation)
  Estimate and publish the cost $c_i$ of the task
  Wait to receive the costs $c_j$ from the other players
  for all $j \in N$ do
    if not Accepted($c_j$, Historic$_j$) then
      $c_j \leftarrow$ Random($c_{-j}$)
    end if
    Historic$_j \leftarrow$ Historic$_j \cup \{c_j\}$
  end for
  $d \leftarrow \arg\min_{j \in N} c_j$
  if $d = i$ then
    execute the task
  else
    do nothing (node $d$ will execute the task)
  end if
As can be observed, for each task, each player estimates the cost of computing the task and publishes it. Publication means broadcasting a message with the cost to all players (although any other means of distribution, like shared memory, can be used). By assumption, a player sends its cost before it receives any of the others (concurrency, which implies that the published costs do not depend on each other), and all of the costs are correctly received at each player (reliability). Then, the algorithm assigns the task to the player that publishes the lowest cost. If players publish their real costs, this maximizes the total utility. However, this kind of approach could drive selfish users to publish fake costs in order to avoid executing tasks. For this reason, we add an acceptance test. When a published cost is not considered acceptable, the system generates a random value for the cost of that node in that round. The implementation of this acceptance test will be discussed later; however, it is important to remark that it contains the linking part of the mechanism (it depends on the historical values published by that particular node). Just as an example, if we mandate that nodes must publish costs between 0 and 1 following a uniform distribution, then we could consider unacceptable any values deviating from that distribution. It is also important to note that all nodes use the same acceptance test with the same history. Hence, they all reach the same accept-or-reject decision. Then, if players reject a value $c_j$, the value Random($c_{-j}$) is in fact generated deterministically from the set of values $c_{-j} = \bigcup_{k \neq j} \{c_k\}$, so that all players re-generate the same value for $j$.
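To make the control flow of Algorithm 1 concrete, here is a minimal sketch of one round as seen by every node. Several pieces are assumptions made only for illustration: the acceptance test is a placeholder that merely checks the range [0, 1] from the example above, Random(c_-j) is realized with a SHA-256 hash of the other players' originally published costs, and all function names are hypothetical.

```python
import hashlib


def deterministic_random(other_costs):
    """Hypothetical Random(c_-j): a value in [0, 1] derived only from the
    other players' published costs, so every node recomputes the same one."""
    data = ",".join(repr(c) for c in other_costs).encode()
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def accepted(cost, history):
    """Placeholder acceptance test: only checks the mandated range [0, 1].
    A real test would also compare the history against the distribution."""
    return 0.0 <= cost <= 1.0


def linking_round(published, histories):
    """Run one round and return the index d of the node that executes."""
    published = list(published)
    costs = list(published)
    for j, c_j in enumerate(published):
        if not accepted(c_j, histories[j]):
            # Replace the rejected value using only the others' costs.
            others = published[:j] + published[j + 1:]
            costs[j] = deterministic_random(others)
        histories[j].append(costs[j])
    # The node with the lowest (possibly replaced) cost executes the task.
    return min(range(len(costs)), key=lambda j: costs[j])


histories = [[], [], []]
d = linking_round([0.4, 7.0, 0.2], histories)  # node 1 is out of range
print(d, histories[1])
```

Because the replacement value depends only on data that every node has received, each node can run this function locally and reach the same decision without a coordinator.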
Algorithm 1 is meant to provide intuition on how we build our mechanism, but it clearly has several issues that contradict our previously stated requirements. In particular, fair allocation is not guaranteed. For instance, there is no way of defining a notion of fairness within this algorithm, given that costs may have different meanings for different players. Additionally, given that costs are abstract notions, we cannot have any a priori information on the shape of their corresponding distributions. So, it is not clear how to implement the acceptance test.
Digging into these problems, it is easy to understand that one of their causes is the fact that, given our requirements, each player has the right to measure her costs in her preferred metric. (Hence, each player may have a different distribution with a different support.) For this reason, cost comparisons cannot be easily made. Additionally, there is a second aspect that must be addressed. In the literature about linking mechanisms, authors assume that instances of the game (rounds) are simultaneous in time. In this case, defining the acceptance function over the set of values is easier. However, in our case, tasks are issued, and hence players generate their costs, over time. Then, from the point of view of the designer, it is not clear how to determine the acceptance of a value by comparing it with a certain probability distribution. The issue is even worse given the fact that this distribution is not known by the designer. To solve all these problems we propose a novel solution based on applying a transformation over the utility function.
Utility normalization
Given that the utility is defined as the work not done by a node, we may use as utility function of a node its probability distribution of costs. Once this is done, we may modify Algorithm 1 and normalize players' utilities so that they may be compared with each other. To normalize we use a transformation called the Probability Integral Transformation (PIT). Our idea is to use the known fact that applying the cumulative distribution function of a continuous random variable to that same variable yields a uniformly distributed variable [17]. More formally, the PIT is defined as follows.
Definition 3.1 (Probability Integral Transformation). Let $X$ be a continuous random variable with Cumulative Distribution Function (CDF) $F$; that is, $X \sim F$. Then, the probability integral transformation defines a new random variable $Y$ as $Y = F(X)$.
As mentioned above, our interest in the PIT is due to the following lemma.
Lemma 3.1 (PIT follows uniform distribution). Let $X$ be a continuous random variable with CDF $F$; then $F(X)$ follows a uniform distribution on the interval $[0, 1]$. That is, the random variable $Y$ defined by the probability integral transformation $Y = F(X)$ follows a normalized uniform distribution.
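The lemma can be checked numerically. The sketch below assumes, purely for illustration, that a node's costs follow an exponential distribution with mean 2; it applies the corresponding CDF to each sample and compares the first two moments of the result with those of a uniform variable on [0, 1] (mean 1/2, variance 1/12). The distribution, seed, and sample size are arbitrary choices.

```python
import math
import random


def exponential_cdf(x, rate):
    """CDF of an exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * x)


rng = random.Random(0)
rate = 0.5  # assumed cost distribution: exponential with mean 2
costs = [rng.expovariate(rate) for _ in range(20000)]

# Probability integral transformation: Y = F(X).
normalized = [exponential_cdf(c, rate) for c in costs]

# A uniform variable on [0, 1] has mean 1/2 and variance 1/12.
mean = sum(normalized) / len(normalized)
variance = sum((y - mean) ** 2 for y in normalized) / len(normalized)
print(round(mean, 3), round(variance, 3))
```

Any other continuous distribution could be substituted, as long as the CDF applied is the one the samples were drawn from.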
Note that $X$ does not need to be a continuous random variable. In the case that the player's costs follow a discrete distribution, it is still possible to perform a similar transformation called the Generalized Distributional Transform [18], whose properties are equivalent to those of the PIT.
Definition 3.2 (Generalized Distributional Transform). Let $X$ be a random variable (not necessarily continuous) with cumulative distribution function $F$ and let $V \sim U(0, 1)$ be a random variable with uniform distribution in $[0, 1]$ independent of $X$. The modified distribution function $F(x, \lambda)$ is defined as
$$F(x, \lambda) = \Pr[X < x] + \lambda \Pr[X = x].$$
From this, we can define the generalized distributional transform of $X$ as $Y = F(X, V)$, which can be proved to follow a uniform distribution on the unit interval.
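For a discrete cost distribution, the transform can be sketched as follows, taking the modified distribution function in its standard form F(x, λ) = Pr[X < x] + λ Pr[X = x]. The probability mass function, the seed, and the helper name are assumptions made for the example.

```python
import random


def distributional_transform(x, pmf, v):
    """F(x, v) = Pr[X < x] + v * Pr[X = x] for a discrete X given by pmf."""
    below = sum(p for value, p in pmf.items() if value < x)
    return below + v * pmf.get(x, 0.0)


rng = random.Random(1)
pmf = {1: 0.5, 2: 0.3, 5: 0.2}  # assumed discrete cost distribution
values, weights = zip(*pmf.items())
samples = rng.choices(values, weights=weights, k=20000)

# Each sample is smoothed with an independent uniform draw V.
normalized = [distributional_transform(x, pmf, rng.random()) for x in samples]
mean = sum(normalized) / len(normalized)
print(round(mean, 3))
```

The independent draw V spreads each atom of the discrete distribution over an interval whose length equals its probability, which is what makes the result uniform.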
Proofs of these properties can be found in [18]. Many studies in economics use this definition and its properties, such as [19] or [20]. In our case, to simplify the notation, we refer to both transformations as PIT, regardless of whether the base distribution is continuous or discrete.
Coming back to Algorithm 1, our idea is to modify it by applying the PIT on the players' declared costs. Hence, instead of publishing the values from its real probability distribution, a player must publish the normalized ones, so that the new algorithm chooses for execution the player minimizing the normalized cost values instead of the original costs. Fig. 1 illustrates this process.
Based on these arguments, it is clear that the PIT provides a mechanism for comparing (normalized) node costs. However, we may wonder if the proposed transformation is valid, in the sense that it may not preserve the preferences of the player. To solve this issue, it suffices to notice that what we are doing is changing the space of preferences. Therefore, the PIT somehow means that, instead of asking the user "How much does it cost to execute the task?", we inquire for something like "What percentage of tasks do you prefer to this one?" At the end of the day, and for our objectives, these questions are requesting the same information, but the latter is normalized in the interval [0, 1], which is a great advantage.
Fig. 1. As can be seen from the depicted arrows, the fact that player A has a lower cost than player B (0.3 versus 11) does not mean that player A will be assigned the task. Instead, after applying the PIT, player B is the one publishing the lower normalized cost.
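The situation described for players A and B can be reproduced with a short example. The cost distributions below are assumptions chosen only to match the quoted costs (0.3 and 11): player A is taken to draw costs uniformly from [0, 0.5] and player B uniformly from [10, 20].

```python
def uniform_cdf(x, low, high):
    """CDF of a uniform distribution on [low, high]."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


# Raw costs: A looks cheaper than B (0.3 versus 11).
cost_a, cost_b = 0.3, 11.0

# Normalized costs: each player applies the PIT of her own distribution.
normalized_a = uniform_cdf(cost_a, 0.0, 0.5)    # a rather expensive task for A
normalized_b = uniform_cdf(cost_b, 10.0, 20.0)  # a rather cheap task for B

executor = "A" if normalized_a < normalized_b else "B"
print(normalized_a, normalized_b, executor)
```

Under these assumed distributions the task is among the cheapest that B ever sees, while it is above average for A, so B publishes the lower normalized cost.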
Although from an analytic point of view we assume that players can compute the PIT perfectly, in a practical setup players do not need to assume any a priori probability distribution. They can simply generate costs using their particular distribution and apply the PIT using the successively generated samples. This process uses what in statistics is known as the Empirical Cumulative Distribution Function (ECDF). We will review this concept later, when we analyze the practical formulation of QPQ in Section 5.
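One possible way to realize this empirical PIT is sketched below. It is an assumption about how a node might do it, not the formulation given later in the paper: the node keeps its past costs sorted and publishes the fraction of observed costs (including the current one) that do not exceed the current cost. The class name is hypothetical.

```python
from bisect import bisect_right, insort


class EmpiricalPIT:
    """Normalize costs with the ECDF of the costs seen so far."""

    def __init__(self):
        self._sorted_costs = []

    def normalize(self, cost):
        # Record the new sample, keeping the list sorted.
        insort(self._sorted_costs, cost)
        # ECDF value: fraction of recorded samples that are <= cost.
        rank = bisect_right(self._sorted_costs, cost)
        return rank / len(self._sorted_costs)


pit = EmpiricalPIT()
published = [pit.normalize(c) for c in [5.0, 1.0, 3.0, 4.0]]
print(published)
```

The first few published values are coarse because the ECDF is built from very few samples; the approximation improves as rounds accumulate.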
Acceptance test
Once we know the properties of the PIT, it is clear how we can implement the linking mechanism for the acceptance test. The idea is that any player correctly applying the PIT to her real cost distribution must generate published normalized cost values that are uniformly distributed on the unit interval. Hence, from the point of view of the mechanism designer, the problem consists in determining whether or not these published values follow that uniform distribution. There is a wide range of tests that allow checking this. These tests are called Goodness-of-Fit (GoF) tests.
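As one concrete example of such a test, the sketch below uses the one-sample Kolmogorov-Smirnov statistic against the uniform distribution on [0, 1]. The choice of this particular test, the asymptotic critical constant 1.36 (the usual value for a 5% significance level), and the function names are assumptions for illustration; the text does not prescribe a specific GoF test.

```python
import math
import random


def ks_statistic_uniform(samples):
    """Kolmogorov-Smirnov distance between the samples and U(0, 1)."""
    ordered = sorted(samples)
    n = len(ordered)
    distance = 0.0
    for k, value in enumerate(ordered, start=1):
        # The ECDF jumps from (k - 1) / n to k / n at `value`;
        # the uniform CDF evaluated at `value` is `value` itself.
        distance = max(distance, k / n - value, value - (k - 1) / n)
    return distance


def gof_test(samples, critical=1.36):
    """Accept when D_n <= critical / sqrt(n) (asymptotic threshold)."""
    n = len(samples)
    return ks_statistic_uniform(samples) <= critical / math.sqrt(n)


rng = random.Random(7)
honest = [rng.random() for _ in range(500)]                # uniform on [0, 1]
inflated = [0.5 + 0.5 * rng.random() for _ in range(500)]  # never below 0.5
print(gof_test(honest), gof_test(inflated))
```

The inflated sequence is always rejected here, because no value falls below 0.5 and the distance to the uniform CDF is therefore at least 0.5. An honest sequence is accepted with high probability, but not with certainty, which is the false-negative risk discussed in the text.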
Continuing with this argument, we propose to implement the acceptance test of our algorithm by using some GoF test on the sequence of transformed costs published by the player. Whenever a player is honest and declares the values obtained by applying the PIT to her own distribution, these values will be uniformly distributed in the unit interval. In that case (with high probability) the GoF test will accept the samples. More importantly, for any reasonable value of the threshold, the error of this process tends to zero as the number of samples (rounds) increases. For the study of our analytic results, we assume that GoF tests are perfect and this error is zero. (We will review this concept again in our practical implementation of QPQ, in Section 5.)
Punishment
In the case that a dishonest player tries to avoid the execution of tasks, one possible strategy is to generate increasing cost values, so that the PIT-transformed values are close to one. However, this type of behavior is quickly detected by the test. An open question is how to punish this and any other player whose GoF test comes out negative. One possibility is to force the node to execute the task. Unfortunately, this policy would force fair players to execute tasks in cases of false negatives.
Another possibility, inspired by previous works on linking mechanisms, is to reject the value declared by the player and generate a new random value according to the normalized uniform distribution. Additionally, we require that no central entity exist in the system. For these reasons, we propose to use a deterministic (repeatable) random generator that any of the remaining nodes can use to calculate the new value. (We deal with the practical aspects of this approach in Section 5.) At first sight, this strategy may seem a very poor punishment, given that there is always a chance that a player gets away with a lie. However, later in this paper we will prove that this is not only enough to discourage dishonest players, but also a crucial ingredient to guarantee that our mechanism is strategy-proof.

The Quid Pro Quo Mechanism
After describing the different ingredients of our solution, we are able to propose the final algorithm, which we call the Quid Pro Quo (QPQ) mechanism. The details can be observed in Algorithm 2.
Algorithm 2 Quid Pro Quo mechanism (code for node $i$ and a generic task $t$, which is omitted from the notation)
  Estimate the cost $c_i$ of the task
  Publish the normalized cost $\bar{c}_i = \mathrm{PIT}(c_i)$
  Wait to receive the normalized costs $\bar{c}_j$ from the other players
  for all $j \in N$ do
    if not GoFTest($\bar{c}_j$, Historic$_j$, p-th$_j$) then
      $\bar{c}_j \leftarrow$ Random($\bar{c}_{-j}$)
    end if
    Historic$_j \leftarrow$ Historic$_j \cup \{\bar{c}_j\}$
  end for
  $d \leftarrow \arg\min_{j \in N} \bar{c}_j$
  if $d = i$ then
    execute the task
  else
    do nothing (node $d$ will execute the task)
  end if
Note that we use $\bar{c}_i$ to denote the PIT-normalized cost to be published, while $c_i$ is the actual cost. We also store in $\bar{c}_i$ the pseudorandom value that replaces the value published by $i$ when it does not pass the acceptance test. (Hopefully context will allow disambiguation.) It is important to notice that the algorithm is the same for all participants, and that it is based on information known by them. Therefore, no central entity is required. When a task is issued, each node estimates its own cost and publishes its PIT-normalized value. This value is then received by all other players. When a player has all the values, she checks whether any player published a dishonest value by applying the GoF test. If a value does not pass the test, it is regenerated as described above, by using a pseudorandom generator of uniformly distributed values in [0, 1] (one that allows all players to generate the same value). With these reviewed values, the player proceeds to determine whether its own value is the minimum, in which case it executes the task, publishing the results to the rest of the nodes if necessary.
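The following sketch puts the pieces of one QPQ round together as executed locally by each node. It is a hedged illustration under several assumptions that are not taken from the paper: the GoF test is a Kolmogorov-Smirnov test with the asymptotic 5% constant 1.36 standing in for the threshold p-th, the shared pseudorandom value is derived from a SHA-256 hash of the round index, the offending node's index, and the other published values, and all names are hypothetical.

```python
import hashlib
import math


def ks_distance(samples):
    """Kolmogorov-Smirnov distance between the samples and U(0, 1)."""
    ordered = sorted(samples)
    n = len(ordered)
    return max(max(k / n - v, v - (k - 1) / n)
               for k, v in enumerate(ordered, start=1))


def gof_test(value, history, critical=1.36):
    """Accept `value` if history plus value still looks uniform on [0, 1]."""
    if not 0.0 <= value <= 1.0:
        return False
    samples = history + [value]
    return ks_distance(samples) <= critical / math.sqrt(len(samples))


def shared_random(round_index, j, others):
    """Value in [0, 1] that every node can recompute from public data."""
    seed = repr((round_index, j, others)).encode()
    digest = hashlib.sha256(seed).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def qpq_round(round_index, published, histories):
    """Return the index d of the node that must execute the task."""
    published = list(published)
    reviewed = []
    for j, value in enumerate(published):
        if not gof_test(value, histories[j]):
            # Rejected: replace with a value all nodes derive identically.
            others = tuple(published[:j] + published[j + 1:])
            value = shared_random(round_index, j, others)
        histories[j].append(value)
        reviewed.append(value)
    return min(range(len(reviewed)), key=reviewed.__getitem__)


histories = [[], [], []]
print(qpq_round(0, [0.7, 0.2, 0.9], histories))
```

Each node runs `qpq_round` on the same inputs and executes the task only if the returned index is its own. The asymptotic threshold is very permissive for short histories, so a practical implementation would need a more careful choice for the first rounds.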
In the following sections, we formally study the expected harm (or reduction of benefit) that dishonest behavior causes in QPQ. Intuition says that the loss due to a dishonest player should be comparable to having that player execute tasks at random. Indeed, we show below that, independently of their behavior, nodes can never expect a profit lower than the one obtained through a mechanism in which tasks are randomly assigned. This property is very useful in case the node is not capable of accurately evaluating its costs (it is non-rational).
Another important aspect is that QPQ guarantees a minimum benefit to the entire system, even if one or more players are non-rational or rationally limited. In this sense, we will show that the best strategy for any player is to act as if the rest of the players were rational and fair. That is, incorrect behaviors of some players do not alter the strategy of correct players. In the next section, we prove all these claims in a formal way.

Formal Analysis of QPQ
Our QPQ algorithm is strongly inspired by the work of Jackson et al. [10]. Hence, some of our proofs have been adapted from the ones provided there. We now review the most relevant properties of the QPQ mechanism presented in Algorithm 2. Assuming that the number of rounds (tasks) is large enough, and that players' costs are independent of each other, we prove the following properties.
1. QPQ is optimal in the sense that it minimizes the total work done when all players are honest.
2. For any player, the rest of the players can be seen as a single aggregated player. For each task, the aggregated player's cost is the smallest of its members' costs. These costs follow a Beta distribution.
3. The best strategy of a player is independent of the behavior of the rest.
4. The strategy that optimizes the utility of a player is being honest. In game theory terminology, this means that QPQ is strategy-proof.
5. Each player always obtains a positive expected utility, which is determined by the number of players.
6. An irrational or rationally limited player always obtains a positive profit.
7. The system is fair in the allocation of tasks and in normalized effort. That is, all the players will run the same number of tasks and perform a similar normalized effort (in expectation).
8. When the number of players is high enough, QPQ ensures very attractive performance.
To address the mathematical analysis of the algorithm, we will assume that the PIT and GoF steps are perfect. In fact, with a large number of samples, these processes have errors close to zero. Another aspect that simplifies our analysis is the idea of the aggregated player. We evaluate the performance of a node playing against a "fictitious" node that aggregates the responses of all other nodes. This aggregated player behaves by publishing, at each round, the minimum of all the normalized costs of the players in the aggregation. This approach is compatible with all the assumptions of the model and is helpful because it significantly simplifies the analysis.
To make our notation clearer, given a task, we use $x$ to denote the true normalized cost $\bar{c}_i$ of player $i$ for that task, while $X$ or $X_i$ is the random variable for that value. When executing QPQ, players may publish $x = \bar{c}_i$ or a false value. In the latter case, we use $z$ to denote the dishonest value $\hat{c}_i$ and also, overloading the notation, the regenerated random value that replaces it when the GoF test fails. We assume that the $z$ values are realizations of some random variable $Z$. Given a task with cost $c_i$, the player obtains a normalized utility $\bar{u}_i = \bar{c}_i$ when she does not execute the task (independently of what she published) and performs a normalized work $\bar{w}_i = \bar{c}_i$ when she executes the task (where $\bar{W}_i$ denotes the corresponding random variable). Additionally, we use $y$ to denote the value $\min \bar{c}_{-i}$ published by an aggregated player. Following mechanism design notation, we say that the (social) decision function $d$ selects the player that publishes the minimum normalized cost, $d = \arg\min_{j \in N} \bar{c}_j$. Then, we define $\Pr[d = i]$ as the probability that player $i$ declares the minimum value and executes the task. When working with the aggregated player, $Y$ is a vector of random variables, and we use $\Pr[Y \le y]$ to denote the probability that at least one element of $Y$, say $Y_j$, satisfies $Y_j \le y$.
With this notation in mind, we can prove that, for any player $i$, the expectation of the declared costs is equal to the expected utility plus the expected work. Additionally, this quantity is a constant, i.e.,

$$E[\bar{U}_i] + E[\bar{W}_i] = E[X_i] = \frac{1}{2}. \qquad (4.1)$$

This means that a player maximizes her utility when she minimizes her work, and vice versa. In the following propositions, we will use this fact.

Players' normalized costs distributions
We argue here that all players' normalized costs follow independent uniform distributions on [0, 1]. When players are honest, their reported values follow a uniform distribution on [0, 1]. This follows from the properties of the PIT transformation introduced above. On the other hand, when a player is dishonest, she may change the distribution of her normalized costs in an attempt to obtain extra benefit. However, we assume that in this case the GoF test fails. Then, her attempt will be detected, and the value will be replaced by a pseudorandom value drawn from an independent uniform distribution on [0, 1]. A final case is that the dishonest player generates fake normalized costs that follow a uniform distribution on [0, 1], hence passing the GoF test. In this case, the normalized cost $\bar{c}_i(t)$ for a task $t$ is independent of the values $\bar{c}_j(t)$ of the other players since, because of concurrency, the value has to be sent before the others are received. Hence, the following result holds (Proposition 4.1).

Optimality. The QPQ algorithm is optimal in the sense that, if all players are honest, it minimizes the total normalized work done.
Proposition 4.2. Assume that all players are honest. For a given set $T$ of tasks, there is no mechanism $M$ such that

$$\sum_{i \in N} E[\bar{W}_i^M] < \sum_{i \in N} E[\bar{W}_i],$$

where $\bar{W}_i^M$ is the random variable associated with the normalized work done by player $i$ when using mechanism $M$.
Proof. The proof is straightforward by contradiction. Assume that such a mechanism $M$ exists. Then there must be at least one task for which $\bar{w}^M < \bar{w}$. However, the social decision function of QPQ always selects the player publishing the minimum of the normalized costs, so $M$ cannot select another player capable of executing the task at a lower cost. We conclude that $M$ cannot exist.
Aggregated player. It is assumed that players' normalized costs have independent uniform distributions on [0, 1]. Hence, the probability density function of each player $i$ is $f_i(\bar{c}_i) = 1$ on that interval. Thus, the cost of an aggregated player for $n - 1$ nodes follows a probability distribution $\mathrm{Beta}(y; 1, n - 1)$, as shown next.

Proof. Recall that the cost of an aggregated player is the minimum of the normalized costs of the players in the aggregation. The CDF $F(\cdot)$ of that cost can be obtained as follows. Let us assume that the players in the aggregation are 1 to $n - 1$. Then,

$$F(y) = \Pr[Y \le y] = 1 - \prod_{j=1}^{n-1} \Pr[Y_j > y] = 1 - (1 - y)^{n-1},$$

where $Y_j$ is the random variable associated with the normalized cost of node $j$. Hence, the probability density function is

$$f(y) = \frac{dF(y)}{dy} = (n - 1)(1 - y)^{n-2}.$$

The Beta distribution is defined as follows [21]:

$$\mathrm{Beta}(x; \alpha, \beta) = \frac{x^{\alpha - 1}(1 - x)^{\beta - 1}}{B(\alpha, \beta)},$$

where $B(\cdot, \cdot)$ is the Beta function. Now, since $B(1, n - 1) = 1/(n - 1)$, it is easy to check that $f(y) = \mathrm{Beta}(y; 1, n - 1)$.
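The distribution of the aggregated player's cost can be checked numerically. The following is a minimal sketch, assuming honest players whose normalized costs are independent uniform values; the function names, the sample size, and the seed are illustrative choices and are not part of the mechanism. It compares the empirical CDF of the minimum of $n - 1$ uniform values with the CDF of a Beta(1, n − 1) distribution, $1 - (1 - y)^{n-1}$.

```python
import random


def aggregated_cost(rng, n):
    """Cost published by the aggregated player: minimum of n-1 uniform values."""
    return min(rng.random() for _ in range(n - 1))


def beta_1_m_cdf(y, m):
    """CDF of a Beta(1, m) distribution on [0, 1]: 1 - (1 - y)^m."""
    return 1.0 - (1.0 - y) ** m


def max_cdf_gap(n, samples=20000, seed=1):
    """Largest gap, at the sample points, between the empirical CDF of the
    aggregated cost and the Beta(1, n-1) CDF."""
    rng = random.Random(seed)
    ys = sorted(aggregated_cost(rng, n) for _ in range(samples))
    return max(
        abs((k + 1) / samples - beta_1_m_cdf(y, n - 1))
        for k, y in enumerate(ys)
    )


if __name__ == "__main__":
    for n in (2, 5, 10):
        print(n, round(max_cdf_gap(n), 4))
```

With a few thousand samples the gap should be small for every $n$, which is what the closed-form CDF predicts.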
Players' strategies. Every rational player knows that the rest of the players follow uniform and independent distributions. The question a selfish rational player asks is which strategy yields the greatest possible benefit. If a player uses a distribution other than the uniform, her values will be rejected by the GoF test and will be regenerated from a uniform distribution. However, a player could lie following a uniform distribution that is not independent of her actual values. Note that QPQ does not know the true normalized costs (they are private) and uses, for the assignment decision, the declared value or the random value assigned by the system if a lie is detected. In both cases, the aggregated player sees a random variable $Z$ that must follow a uniform distribution. We now show that either case leads the player to worse results than her own honest distribution, so the player has no incentive to cheat. Let us first quantify the expected work done by honest players.
Proposition 4.4. The expected normalized work $E[\bar{W}_i]$ done by an honest player $i$ is $\frac{1}{n + n^2}$.

Proof. Recall that we assume that player $i$ is in the system with an aggregated player of $n - 1$ nodes. Then, $\Pr[d = i]$ is the probability that player $i$ publishes a normalized cost smaller than that of the aggregated player, so

$$E[\bar{W}_i] = \int_0^1 x \Pr[Y > x]\, dx = \int_0^1 x (1 - x)^{n-1}\, dx = B(2, n) = \frac{1}{n(n + 1)} = \frac{1}{n + n^2}.$$

Notice that we use the probability distribution of the aggregated player derived in Proposition 4.3.
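The value stated in Proposition 4.4 can be verified with exact arithmetic. The sketch below assumes that the expected work of an honest player is the integral of $x(1 - x)^{n-1}$ over [0, 1] (the player's uniform cost weighted by the probability that the aggregated Beta(1, n − 1) player publishes a larger value); the function name is illustrative. It expands the binomial and integrates term by term using rational numbers.

```python
from fractions import Fraction
from math import comb


def honest_expected_work(n):
    """Exact value of the integral of x * (1 - x)^(n-1) over [0, 1].

    (1 - x)^(n-1) is expanded as sum_k C(n-1, k) * (-x)^k, and each term
    x^(k+1) integrates to 1 / (k + 2).
    """
    return sum(
        Fraction((-1) ** k * comb(n - 1, k), k + 2)
        for k in range(n)
    )


if __name__ == "__main__":
    for n in range(1, 8):
        # Each line should show the same value twice: the integral and 1 / (n + n^2).
        print(n, honest_expected_work(n), Fraction(1, n + n * n))
```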
Proposition 4.5. The total normalized work done by an aggregated player $j$ (aggregating $n - 1$ nodes), with costs $x = \bar{c}_j$, does not change when a player $i$ (not in the aggregation) declares dishonest values $z = \hat{c}_i$.
Proof. Let us abuse the notation and use $z$ to denote the dishonest values declared by $i$ if the GoF test is passed, or the regenerated values if it is not. Let $Z$ be the uniform random variable associated with these values. We assume that there is a bivariate probability distribution with density $f_{x,z}(x, z)$ that relates both values $x$ and $z$. In that case, the marginal distribution for $z$ must be uniform. Therefore, we have, Hence, the expected work done by $j$ is where $E[\hat{W}_j]$ is the expected work done by the aggregated player $j$ when player $i$ lies. But, as we have uniform marginals, the above expression becomes Which is equal to the total work done by the aggregated player $j$ when $i$ is honest, that can be computed as follows.
In summary, an aggregated player $j$ expects to perform the same amount of work, independently of the behavior of a given player $i$ not in the aggregation. I.e., its expected work is not affected by whether $i$ is honest or dishonest. This allows us to prove that the optimal strategy for a player is to be honest.

Proposition 4.6. A player $i$ never does more normalized work (in expectation) by being honest. That is, $E[\bar{W}_i] \le E[\hat{W}_i]$, where $E[\hat{W}_i]$ is the expected work performed by player $i$ when she is dishonest.
Proof. For the sake of contradiction, let us suppose this proposition is false. Hence, there is some set of tasks for which, if $i$ is not honest, she performs less work in expectation, i.e., $E[\bar{W}_i] > E[\hat{W}_i]$. Additionally, using Proposition 4.5, we know that the aggregated player $j$ will do the same expected work, i.e., $E[\bar{W}_j] = E[\hat{W}_j]$. However, if the above inequality were true, QPQ would not be optimal, since a mechanism that reproduces the same task assignments made when $i$ lies (in the presence of honest players) would have less expected work. Clearly, this contradicts Proposition 4.2. Therefore, the best strategy for a player (the one minimizing her normalized work done) is to be honest.
We complement this result with the following property (Proposition 4.7).

Proof. The values $z$ used to decide whether to assign a task to player $i$ follow a uniform distribution that is independent of the actual costs of $i$. Hence,

$$E[\hat{W}_i] = E[X_i] \Pr[d = i] = \frac{1}{2} \cdot \frac{1}{n} = \frac{1}{2n}.$$

From this result, Proposition 4.4, and Eq. 4.1, we directly derive the following theorem.
Hence, since the sum of the expected work and expected utility is $\frac{1}{2}$, players obtain higher expected utilities by being honest than by publishing dishonest normalized costs.
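The gap between honest and dishonest play can be illustrated with a small simulation. This is a sketch under stated assumptions, not the simulation setup used in this paper: the dishonest player publishes uniform values that are independent of her true normalized costs (so they pass a perfect GoF test), the other $n - 1$ players are honest, and the function name, number of rounds, and seed are illustrative.

```python
import random


def expected_work(n, honest, rounds=100000, seed=11):
    """Average normalized work per round done by one player against n-1 honest players.

    If `honest` is False, the published value is uniform but independent of
    the true cost; the work done when assigned is always the true cost.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(rounds):
        true_cost = rng.random()
        published = true_cost if honest else rng.random()
        aggregated = min(rng.random() for _ in range(n - 1))
        if published < aggregated:  # the player publishes the minimum value
            total += true_cost
    return total / rounds


if __name__ == "__main__":
    n = 4
    print("honest   ", expected_work(n, honest=True), "theory", 1 / (n + n * n))
    print("dishonest", expected_work(n, honest=False), "theory", 1 / (2 * n))
```

Since expected utility plus expected work equals 1/2, the larger work measured for the dishonest strategy translates directly into a lower expected utility.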
Real expected utility. Note that the normalized work done by an honest player, as calculated above, is equal to $\frac{1}{n + n^2}$. But we may wonder what the real (not normalized) work done is. We can calculate it in terms of real utility as follows.
Theorem 4.9.For each player i, the real expected utility is where the real cost of player i is a continuous random variable with support Ω, probability density function f i (•), and CDF F i (•).

Bounds
Finally, it is useful to define a ratio that measures how the efficiency of the QPQ mechanism degrades with the selfish behavior of the players. Following concepts similar to the "price of anarchy" [4], we define the measure of efficiency as the ratio between the utility of an equilibrium (usually the "worst equilibrium") and the utility of some optimal solution.
Obviously, the player's normalized utility must be between 0, when the node runs all tasks, and $\frac{1}{2}$, when the node does not execute any task. But there are two levels that may be considered as references to establish the goodness of the algorithm. On the one hand, when a node runs a fraction $\frac{1}{n}$ of the tasks, chosen completely at random, the expected effort is $\frac{1}{2n}$. On the other hand, the maximum benefit a player $i$ could get occurs when the tasks she executes are exactly her cheapest tasks. In this case, the expected utility would be

$$\frac{1}{2} - \int_0^{1/n} x\, dx = \frac{1}{2} - \frac{1}{2n^2}.$$

Although this case has null probability, we propose to use this concept for our definition of the measure of efficiency.

Definition 4.1 (Measure of efficiency). We define the measure of efficiency of an algorithm $M$ for task assignment under selfish behaviour as the ratio between the expected normalized utility obtained under some equilibrium and $\frac{1}{2} - \frac{1}{2n^2}$.

Hence, we can compute the efficiency of QPQ as

$$\frac{\frac{1}{2} - \frac{1}{n + n^2}}{\frac{1}{2} - \frac{1}{2n^2}} = \frac{n(n + 2)}{(n + 1)^2}.$$

Note that the efficiency of QPQ is close to 1 when the number of participants is high. For instance, with just 10 nodes the efficiency of QPQ is 0.991.
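The efficiency figure quoted for 10 nodes can be reproduced with a few lines of exact arithmetic. This sketch rests on two assumptions: the honest expected normalized utility is $\frac{1}{2} - \frac{1}{n + n^2}$ (from Proposition 4.4 and the fact that utility plus work equals 1/2), and the reference utility is $\frac{1}{2} - \frac{1}{2n^2}$, i.e., the player executes exactly her cheapest $1/n$ fraction of tasks. The function name is illustrative.

```python
from fractions import Fraction


def qpq_efficiency(n):
    """Ratio between the honest expected utility and the assumed best-case utility.

    Requires n >= 2 (for n = 1 the assumed best-case utility is zero).
    """
    honest_utility = Fraction(1, 2) - Fraction(1, n + n * n)
    best_case_utility = Fraction(1, 2) - Fraction(1, 2 * n * n)
    return honest_utility / best_case_utility


if __name__ == "__main__":
    for n in (2, 5, 10, 100):
        ratio = qpq_efficiency(n)
        print(n, ratio, float(ratio))
```

Under these assumptions the ratio simplifies to $n(n + 2)/(n + 1)^2$, which for $n = 10$ is 120/121, approximately 0.9917 and consistent with the value reported above.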

Implementing QPQ in real environments
In this section, we analyze the restrictions for implementing QPQ in real environments.
From the above sections, we may claim that the computation and communication capabilities required by the algorithm are affordable with current technology. We do not claim that implementing such capabilities would be easy, since there are many technological challenges to address; previous works show some of them [22]. Thus, our only claim is that it would be feasible. However, beyond the required communication and computation capabilities, a number of issues arise. The first one concerns the definition of selfishness itself. This paper is mainly focused on detecting and neutralizing users that publish values not coming from the PIT of their real costs. However, other non-cooperative harmful behaviors are possible such as, for example, not executing tasks at all, or executing them incorrectly. Arguably, most of these behaviors can be avoided using a two-step scheme. First, by detecting such behaviors (previous works in the area show that it is possible [23,24]). Second, by establishing a punishment strong enough to discourage misbehaving players from repeating them. For example, we may adopt the radical solution of simply expelling misbehaving users. To guarantee that expelled players cannot participate again, all that is needed is that user identities are unique and cannot change across game instances. Note that QPQ does not discard misbehaving users, because it assumes that the publication of dishonest values cannot be distinguished from the publication of values generated by rationally limited players, and it would not be reasonable to expel the latter from the game given that, in a realistic scenario, all players have some rational limitations (i.e., it is not possible to estimate costs with total accuracy). Hence, QPQ's approach of keeping them in the system is one of the most difficult ways of dealing with selfish users.
Coming back to the subtleties of QPQ, another point to consider is how to regenerate the random value when the system detects a lie. As we said before, we require a deterministic (repeatable) random generator that any of the remaining nodes can use to calculate the new value. One possibility for generating the random value is to use a hash function over the published normalized costs of the other nodes. Alternatively, it is possible to request a random value from each player (except the player in question) and apply the hash function to them. Yet another possibility is to use techniques similar to the procedures proposed by Aumann et al. [25] to generate jointly controlled lotteries. For example, for two players, we can request random values from both and replace the liar's value by the sum of these numbers if the sum is less than 1, or by the sum minus one otherwise. With this scheme, it is easy to show that when one of the players declares random values according to a uniform distribution, the process generates random values that are also uniform over [0, 1], regardless of what the other player does. In conclusion, there are several mechanisms suitable for generating the punishment random value independently of the behavior of a dishonest player.
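Two of the regeneration options discussed here can be sketched as follows. This is an illustration under assumptions, not a prescribed implementation: SHA-256 as the hash function, the textual encoding of the values, and the function names are all hypothetical choices, and every node is assumed to serialize the published values in exactly the same way. The two-player lottery takes the sum reduced modulo 1 (the sum itself when it is below 1, the sum minus one otherwise), so that the result stays in [0, 1).

```python
import hashlib


def regenerate_value(other_values, player_id, round_id):
    """Deterministic replacement value in [0, 1) that every node can recompute.

    Hashes the normalized costs published by the other nodes together with the
    identity of the rejected player and the round (hypothetical encoding).
    """
    parts = [str(player_id), str(round_id)] + [repr(v) for v in other_values]
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).digest()
    # Keep 53 bits so that the quotient is an exact float strictly below 1.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2 ** 53


def joint_lottery(a, b):
    """Two-player jointly controlled lottery: (a + b) mod 1 for a, b in [0, 1)."""
    s = a + b
    return s if s < 1.0 else s - 1.0


if __name__ == "__main__":
    print(regenerate_value([0.2, 0.7], player_id=3, round_id=5))
    print(joint_lottery(0.3, 0.4), joint_lottery(0.8, 0.5))
```

In the lottery, if either input is uniform on [0, 1) and independent of the other, the output is uniform as well, which is the property the text relies on.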
Another obstacle on the way to a potential implementation of the mechanism is the acceptance test. We have assumed a perfect GoF test function. This is similar to assuming that we have a very large (ideally infinite) number of samples for detecting lies with the usual tests. In a real system, this solution is impractical, since nodes would need to store all the historical values of the rest of the players, and initially the number of samples is necessarily limited. As we saw before, we propose that players simply generate costs using their particular distribution and apply the PIT using the successively generated samples. The CDF used for the PIT is synthesized from the existing samples $y_i$. This CDF obtained from samples is known in statistics as the Empirical Cumulative Distribution Function (ECDF).

Definition 5.1 (Empirical Cumulative Distribution Function). The empirical cumulative distribution function (ECDF) $F_n$ for $n$ observations $y_i$ is defined as

$$F_n(y) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}\{y_i \le y\},$$

where $\mathbb{1}\{A\}$ is the indicator function (or characteristic function) of event $A$. In our case, it is defined as

$$\mathbb{1}\{y_i \le y\} = \begin{cases} 1 & \text{if } y_i \le y, \\ 0 & \text{otherwise.} \end{cases}$$

Obviously, this process has an error that tends to zero as the number of samples (rounds) increases, as proved by the Glivenko-Cantelli theorem [26].
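The ECDF-based normalization can be sketched in a few lines. The following assumes that a node keeps a non-empty list of its own past real costs and normalizes a new cost by evaluating the ECDF of that list; the function names are illustrative, and whether the new cost is added to the list before or after normalization is left open here.

```python
from bisect import bisect_right


def ecdf(samples):
    """Return the empirical CDF F_n of a non-empty list of observations."""
    ordered = sorted(samples)
    n = len(ordered)

    def F(y):
        # Number of observations y_i <= y, divided by n.
        return bisect_right(ordered, y) / n

    return F


def pit_normalize(cost, history):
    """Normalized cost in [0, 1]: the ECDF of past costs evaluated at `cost`."""
    return ecdf(history)(cost)


if __name__ == "__main__":
    past_costs = [10.0, 12.0, 14.0, 16.0]
    print(pit_normalize(11.0, past_costs))
    print(pit_normalize(20.0, past_costs))
```

With few samples the normalized values take only a handful of distinct levels, which is one way to see why the error of the process is large in the first rounds.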
Regarding the GoF test used, a large number of GoF tests have been proposed in the scientific literature. Some of them may be applied to discrete distributions and others require continuous distributions. The Kolmogorov-Smirnov (KS) test [27,28] is probably the best-known test for continuous distributions, mainly due to its simplicity. The KS test calculates the greatest distance between the ECDF associated with a sequence of samples and the CDF we want to check. It may be defined by the following expression:

$$D = \max_{1 \le i \le n} \max\left\{ \frac{i}{n} - F(x_i),\; F(x_i) - \frac{i - 1}{n} \right\},$$

where $F(\cdot)$ is the CDF to check, $n$ is the number of samples, and $(x_1, x_2, \ldots, x_n)$ is the set of samples arranged in increasing order. What makes the KS test so versatile is that the distribution of the distance $D$ does not depend on the theoretical probability distribution (null hypothesis). Several authors, such as Smirnov [28] and Birnbaum and Tingey [29], have obtained exact and approximate expressions of the distribution of the variable $D$ as a function of the number of available samples. Due to the complexity of such expressions, the KS test is often used through tables containing the most common percentiles.

Figure 2: Curves of our elastic p-value thresholds as a function of the normalized utility of a player. When we have a small number of rounds (blue line, 10 rounds) our system is quite tough, but as the number of rounds increases (the yellow line corresponds to 100 rounds and the green line to 500 rounds), our proposal is more relaxed and accepts values if the player's utility is within a reasonable range.
We propose to use the KS test as the GoF test of QPQ. Hence, whenever a new normalized cost is issued, we apply the KS test to it together with the historical sequence of that player, and obtain the corresponding p-value. Note that, in statistical significance testing, the p-value is the probability, under the null hypothesis, of obtaining a test statistic at least as extreme as the one actually observed. The value is accepted by the test when that p-value is above a particular acceptance threshold, p-th.
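A KS-based acceptance test against the uniform distribution on [0, 1] (whose CDF is $F(x) = x$) can be sketched as follows. The statistic is the standard two-sided KS distance. The p-value uses the asymptotic Kolmogorov series with a commonly used small-sample correction factor; that approximation, the cut-off for very small statistics, and all function names are assumptions of this sketch, since the text only states that tables or exact expressions may be used.

```python
import math


def ks_statistic(samples, cdf=lambda x: x):
    """Two-sided KS distance between the ECDF of `samples` and `cdf`."""
    xs = sorted(samples)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(xs)
    )


def ks_pvalue(d, n, terms=100):
    """Approximate p-value of a KS distance d obtained from n samples."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        # The alternating series converges poorly here; the p-value is ~1.
        return 1.0
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * total))


def ks_accept(value, history, p_threshold):
    """Accept `value` if the sequence history + [value] passes the test."""
    samples = list(history) + [value]
    return ks_pvalue(ks_statistic(samples), len(samples)) > p_threshold


if __name__ == "__main__":
    spread = [(i + 0.5) / 50 for i in range(49)]
    skewed = [0.9 + 0.001 * i for i in range(49)]
    print(ks_accept(0.99, spread, 0.05), ks_accept(0.95, skewed, 0.05))
```

The first history is spread evenly over [0, 1] and is accepted, while the second is concentrated near 0.9 and is rejected.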
For practical reasons, we need to reduce the history of a user to a relatively small number of samples. Hence, we propose a slight modification to the acceptance test of Algorithm 2 to make it implementable in real systems. With this modification, each node applies the KS test using only a small number of the latest published values. However, this makes the KS test prone to inaccurate estimations. For example, a selfish node could publish values following a Beta(1, 0.9) distribution. With high probability, this situation could not be detected with sample sequences of small length. In addition to choosing a large enough sample size (our simulations show that 50 samples are enough), we adjust the threshold to refine the test. The idea is to modify the acceptance threshold so that it is hardened when the actual normalized utility of the player is higher than the theoretical expectation, and relaxed when players are losing more than expected. There are many ways of implementing this idea, but we propose the following expression , where δ is a tuning parameter, µ is the expected normalized utility of all players and µ k is the actual normalized utility of the player at round k. To illustrate this idea we depict Fig. 2, which represents the value of this threshold as a function of the total normalized utility of the player. Clearly, the above formula is entirely empirical, although the simulations presented below show that it fits our requirements well. One of the reasons that led to this proposal is the idea that a new player must "pay" some kind of "fee" when she enters the system. In this way, we want to avoid, or at least reduce, the problem of low-cost identities or cheap pseudonyms. With our proposal, QPQ assigns tasks almost randomly at the beginning, while later, when more information about the players is available, QPQ assigns tasks optimally. Each player has to "pay" at the beginning by working on random assignments and thus has no incentive to exit and reenter the system. The final implementable QPQ algorithm we propose is presented in Algorithm 3.
Algorithm 3 Implementable Quid Pro Quo mechanism (code for node i, and a generic task t, omitted)
1: Estimate the cost $c_i$ of the task
2: Publish the normalized cost $\bar{c}_i = \mathrm{PIT}(c_i)$
3: Wait to receive the normalized costs $\bar{c}_j$ from the other players
4: for all $j \in N$ do
if not KSTest($\bar{c}_j$, Historic_j, p-th_j) then
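The control flow that every node runs for one task can be sketched as below. This is a hedged illustration of the steps described in the text, not the authors' listing: `accept` stands in for the KS-based acceptance test, `regenerate` for the shared pseudorandom generator, and the window length, the tie-breaking rule by player identifier, and the choice of storing the final (possibly regenerated) value in the history are assumptions of this sketch.

```python
def qpq_round(published, histories, accept, regenerate, window=50):
    """Decide which player executes one task.

    published:  dict player -> normalized cost received for this task
    histories:  dict player -> list of earlier values (updated in place)
    accept:     callable(value, recent_history) -> bool (acceptance test)
    regenerate: callable(player, other_values) -> float in [0, 1]
    Returns (executor, final_values). Every node that runs this function on
    the same inputs obtains the same result, so no central entity is needed.
    """
    final = {}
    for player in sorted(published):
        value = published[player]
        recent = histories.setdefault(player, [])[-window:]
        if not accept(value, recent):
            # Replace the rejected value with a value all nodes can recompute.
            others = [published[p] for p in sorted(published) if p != player]
            value = regenerate(player, others)
        histories[player].append(value)
        final[player] = value
    # The player with the minimum final value executes the task; ties are
    # broken by player identifier (an assumption of this sketch).
    executor = min(sorted(final), key=final.get)
    return executor, final


if __name__ == "__main__":
    accept = lambda value, recent: value > 0.0       # toy acceptance test
    regenerate = lambda player, others: 0.25         # toy shared generator
    print(qpq_round({"a": 0.4, "b": 0.0, "c": 0.7}, {}, accept, regenerate))
```

In the toy run, the value published by "b" is rejected and replaced by 0.25, which is still the minimum, so "b" executes the task.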

Simulations
Through simulations, we have checked various aspects of the implementable QPQ. First of all, we wondered whether the new GoF test may punish fair players through false rejections (honest values that fail the test). In this direction, Fig. 3 represents the boxplot of the expectation of the normalized work done in 100 rounds when all players are honest and no GoF test is applied. This picture serves as a control and allows us to compare it with the same game when the GoF test of Algorithm 3 is introduced, using a history of 50 samples for the KS test and δ = 2. The results are depicted in Fig. 4. As can be seen, the performance loss caused by false rejections is minimal and barely noticeable in these scenarios.
The next question is to what extent selfish users can fool the algorithm and improve their utility. We have simulated dishonest behavior by using several distributions close to the uniform but with a higher mean, taking advantage of the properties of the Beta function, so that these distributions try to pass the implementable KS test and, at the same time, obtain some profit in the long run. Again, we have run simulations considering a game with two nodes, one honest (uniform) and one dishonest, for a set of 1,000 rounds, with a history length of 50 samples for the implementable KS test and with δ = 2. The results can be seen in Table 1, which shows the normalized player utilities for different scenarios. In the table, the name "Uniform" represents honest nodes, "Random" is used for non-rational players generating random costs and, finally, "Beta" and "Normal" are used for dishonest players following those distributions. As can be observed, honest utilities remain quite constant, while non-rational and dishonest utilities decrease, although never below a given limit. Interestingly, this behavior is maintained even in the extreme case of a Beta(1, 0.9) distribution. Observe that, when the number of samples is small (around 50), a Beta(1, 0.9) is so similar to a uniform distribution that it is hardly distinguishable to the eye.
Finally, for the same simulation scenario, in Fig. 5 we compare the behavior of the implementable KS test of Algorithm 3 for fair users (uniform) playing against a node with several manipulative profiles (Beta distribution variants) as the number of rounds increases. As can be observed, the honest player rapidly gets her values to pass the test, while the dishonest player quickly gets into trouble because her values are rejected, even with distributions very similar to the uniform.

Distributions
9: $d \leftarrow \arg\min_{j \in N} \bar{c}_j$
10: if $d = i$ then

Figure 1: At the top, we can see the execution task cost histograms of two different players. Note that they follow different probability distributions. At the bottom, we depict the Cumulative Distribution Function (CDF) for both. As can be noticed through the depicted arrows, the fact that player A has a lower cost than player B (0.3 versus 11) does not mean that player A will be assigned the task. Instead, when applying the PIT, player B is the one publishing the lower normalized cost.

Proposition 4.1. The set of final normalized costs considered in Line 10 of Algorithm 2 are drawn from independent and identically distributed (iid) random variables, with uniform distribution on [0, 1].

Proposition 4.7. When a player $i$ publishes dishonest non-uniform values, or values independent of her true normalized uniform distribution, she performs in expectation $E[\hat{W}_i] = \frac{1}{2n}$ work.