Quantum Two Player Game in Thermal Environment

A two-player quantum game is considered in the presence of thermal decoherence. It is shown how a thermal environment, modeled within the rigorous Davies approach, affects the payoffs of the players. The conditions under which decoherence has either a beneficial or a pernicious effect are identified. The general considerations are exemplified by the quantum version of the Prisoner's Dilemma.


Introduction
Information processing is a physical phenomenon, and therefore information theory is inseparable from both applied and fundamental physics. Attention to the quantum aspects of information processing has revealed new perspectives in computation, cryptography and communication methods. In numerous cases a quantum description of the system provides some advantages over the classical situation, at least in theory. Does quantum mechanics offer more subtle mechanisms of playing games? In game theory one often has to consider strategies that are probabilistic mixtures of pure strategies [1,2]. Can they be intertwined in a more complicated way by exploiting interference or entanglement? There certainly are situations in which quantum theory can enlarge the set of possible strategies [3][4][5]. This is a very nontrivial issue, as genuine quantum systems are usually unstable and their preparation and maintenance might be difficult, e.g. due to decoherence [6][7][8][9][10]. Note that the quantum formalism can be used in game theory in a more abstract way, without any reference to physical quantum states [11][12][13]; decoherence is not a problem in such approaches. The question is whether quantum games are of any practical value. The answer is positive, and some commercial cryptographic and communication methods and products are already available. The field of quantum auctions seems to be promising too [14,15].
In this paper we show how decoherence in quantum games can be described in terms of completely positive Davies maps [16]. This should be compared with the approaches presented in [6][7][8][9][10][17]. We focus our attention on the quantum Prisoner's Dilemma [4], but the approach can be used in other games too. We show that properly utilized decoherence can, under certain circumstances, have a beneficial effect. The paper is organized as follows. We begin with a brief presentation of the quantum game formalism. Then we describe our approach to decoherence in quantum games. Finally, we discuss some problems that should be addressed in the near future.

Quantum game
The general definition of a quantum game would be involved. Here by a quantum game we understand a quantum system that can be manipulated by at least one party and for which utilities of moves can be reasonably defined. We shall suppose that all players know the state of the game at the beginning and, possibly, at some crucial stages of the actual game being played. We neglect the possible technical problems with actual identification of the state. Implementation of a quantum game should include measuring apparatuses and information channels that provide necessary information on the state of the game at crucial stages and specify the moment and methods of its termination. We will not discuss these issues here.
We will consider only two-player quantum games; the generalization to the N-player case is straightforward. We therefore suppose that a two-player quantum game Γ = (H, ρ_i, S_A, S_B, P_A, P_B) is completely specified by the underlying Hilbert space H of the quantum system [18], the initial state given by the density matrix ρ_i ∈ S(H), where S(H) is the associated state space, the sets S_A and S_B of quantum operations representing moves (strategies) of the players, and the pay-off (utility) functions P_A and P_B, which specify the pay-off for each player after the final measurement is performed on the final state ρ_f. A quantum strategy s_A ∈ S_A, s_B ∈ S_B is a collection of admissible quantum operations, that is, mappings of the space of states into itself. One usually supposes that they are completely positive trace-preserving maps. This scheme for a quantum two-player game can be implemented as the quantum map of Eq (1), in which the initial state of Eq (2) describes identical starting positions of Alice (A) and Bob (B). Using entanglement is one of the possible ways to utilize the power of quantum mechanics in quantum games. Here the state of the players is transformed into an entangled state by the operation of Eq (3), built from the entangling operator J(γ) of Eq (4), which is expressed in terms of the identity operator I and the Pauli matrix σ_x. Note that, due to the presence of noise, the amount of entanglement does not necessarily increase with increasing γ. Due to omnipresent decoherence, the entangled state of the two players can be affected by thermal dissipation and dephasing described by a completely positive Davies map D. A description of its detailed properties is postponed to the next section.
For the standard (canonical) matrix representation of quantum states of a two-level system, the initial state of Eq (2) and the entangling operator of Eq (4) take explicit matrix forms, and so does Eq (3) evaluated for the initial state of Eq (2) [19]. The individual strategies S_X, X = A, B, of the players are implemented as follows. In this work we assume that only two classical pure strategies are available: the identity and the flip operation. We also allow Bob to follow his classical move with a pure quantum strategy, given by a unitary transformation with the explicit matrix form [19]
\[ U(\theta,\alpha,\beta) = \begin{pmatrix} e^{i\alpha}\cos(\theta/2) & i e^{i\beta}\sin(\theta/2) \\ i e^{-i\beta}\sin(\theta/2) & e^{-i\alpha}\cos(\theta/2) \end{pmatrix} \]
In other words, we assume that the pure strategies of the two players differ because only one of them (Bob) recognizes that information is stored in qubits rather than bits, i.e. Bob can utilize a richer class of operations formalized by U. This knowledge is beneficial provided that Bob can influence Alice's strategy. This is the case, as J(γ) in Eq (4) entangles Alice's and Bob's systems for γ ≠ 0. Let us emphasize that in the presence of entanglement the actions of the players are not fully independent, as their qubits remain correlated. In real systems the correlation is never maximal due to the omnipresent decoherence affecting the qubits, represented in Eq (1) by D. It is assumed here that decoherence influences the players during the time in which they are selecting their strategies S. It is clear that, in general, decoherence affects the quantum states used in the game at any stage of its time evolution. However, if a considered time interval (e.g. between the preparation of the initial state Eq (2) and the application of the entangling operator J) is significantly shorter than the time scale of the decoherence process, one can safely neglect any dissipation of information in that interval. Physically, we assume that the only time interval comparable to (or larger than) the decoherence time scale is the one required by the players to work out their strategies. We then ask how the quantum game becomes modified for a given model of quantum dissipation.
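The ingredients described above can be sketched numerically. The matrix U(θ, α, β) below is the one written in the text. The initial state |00⟩⟨00| and the entangling operator J(γ) = cos(γ/2) I⊗I + i sin(γ/2) σ_x⊗σ_x are assumptions of this sketch (a common choice built from I and σ_x), not forms confirmed by the text. All function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)


def quantum_strategy(theta, alpha, beta):
    """Bob's unitary U(theta, alpha, beta), as written in the text."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([
        [np.exp(1j * alpha) * c, 1j * np.exp(1j * beta) * s],
        [1j * np.exp(-1j * beta) * s, np.exp(-1j * alpha) * c],
    ])


def entangler(gamma):
    """ASSUMED form: J = cos(gamma/2) I(x)I + i sin(gamma/2) sx(x)sx."""
    return (np.cos(gamma / 2) * np.kron(I2, I2)
            + 1j * np.sin(gamma / 2) * np.kron(SIGMA_X, SIGMA_X))


def is_unitary(m):
    """True if m^dagger m equals the identity up to rounding."""
    return np.allclose(m.conj().T @ m, np.eye(m.shape[0]))


# ASSUMED initial state |00><00| (basis order |00>, |01>, |10>, |11>).
RHO_I = np.zeros((4, 4), dtype=complex)
RHO_I[0, 0] = 1.0


def entangled_state(gamma):
    """State shared by the players after the (assumed) entangling step."""
    j = entangler(gamma)
    return j @ RHO_I @ j.conj().T


def alice_purity(rho_ab):
    """Purity Tr(rho_A^2) of Alice's reduced state (Bob traced out)."""
    rho_a = np.einsum("ajbj->ab", rho_ab.reshape(2, 2, 2, 2))
    return float(np.trace(rho_a @ rho_a).real)
```

Under these assumptions, `alice_purity(entangled_state(gamma))` equals 1 at γ = 0 (no entanglement) and drops to 1/2 at γ = π/2 (maximal entanglement of the noiseless pure state).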

The model of decoherence
The only fully natural source of decoherence affecting quantum systems is their environment, which causes both energy and information dissipation. For our model considerations we assume that at least one of the two qubits in the state χ_AB = J(ρ_i), shared by Alice and Bob just before they apply their strategies S_{A,B}, interacts with its own environment E_{A,B}. As Alice and Bob can be separated from each other, we neglect any direct interaction both between their qubits (via a proper Hamiltonian term) and between the environments E_{A,B}. In other words, the Hamiltonian of the total system is simplified to the form of Eq (14). We also assume that the qubits A and B are identical. Moreover, we assume that the interaction between the qubits and their environments satisfies the Davies weak coupling approach [20,21]. The Davies approach allows for a mathematically rigorous construction of a qubit's reduced dynamics (with respect to the environment) in terms of a completely positive (strictly Markovian) semigroup [20,21]. Moreover, Davies semigroups can be rigorously and consistently derived from microscopic Hamiltonian models of open systems [20], so they satisfy the most desired thermodynamic and statistical-mechanical properties, such as the detailed balance condition [21]. The Davies approximation has been successfully used in studies of various problems in quantum information and the physics of open quantum systems, including entanglement dynamics [22], quantum discord [23,24], properties of geometric phases of qubits [25] and thermodynamic properties of nano-systems [26]. Here, instead of exploring the full power of Davies semigroups, we consider only certain elements of Davies dynamical semigroups: Davies maps [16], which inherit all the properties proved to hold for Davies semigroups, with complete positivity as the most desired among them.
Here we adopt the notation of Ref. [27] (instead of that used in Ref. [16]) and recapitulate the explicit form of Davies maps applied to the state given by Eq (8), which yields the state χ_AB(t) of Eq (16). We consider three possibilities: the environment affects only Alice's qubit, only Bob's qubit, or both qubits. Here U_{A,B} denotes the Hamiltonian dynamics of a noiseless qubit and D = D_{A,B}(p, A, G, ω, t) is the Davies map [16], which acts on the coherences as
\[ D|1\rangle\langle 0| = e^{i\omega t - G t}\,|1\rangle\langle 0| \tag{20} \]
\[ D|0\rangle\langle 1| = e^{-i\omega t - G t}\,|0\rangle\langle 1| \tag{21} \]
Equivalently, in the coherence vector formalism adopted in Ref. [16], the map is represented by a matrix acting on the density matrix in the column-vector representation, i.e. on a column vector built from the matrix elements ρ_00, ρ_01, ρ_10 and ρ_11. Let us notice that, contrary to the Dirac bra-ket formalism which we adopt in this work, the coherence vector formalism is not very convenient for presenting states of composite qubit-qubit systems, as it requires vectors with 16 elements, and two-qubit operators require (16 × 16)-dimensional matrices. The parameter p ∈ [0, 1/2] appearing in the map is related to the temperature (here we set k_B = 1). The parameters A = 1/τ_R and G = 1/τ_D, if interpreted in terms of spin relaxation dynamics [28], are related to the energy relaxation time τ_R and the dephasing time τ_D, respectively [16]. The parameters A, G and p depend solely on the details of the qubit-environment coupling encoded in the Hamiltonian Eq (14) [21]. Fulfilling the inequalities of Ref. [28] guarantees that the Davies map is a trace-preserving completely positive map [21], as it is an element of the Davies semigroup, which is proved to be completely positive and trace preserving [21]. This property allows one to apply Davies maps to any part of a composite system, also when the subsystems are initially entangled. This is crucial, as the decoherence D in Eq (1) is a tensor product of two maps, at least one of which is a Davies map. Complete positivity guarantees that the 'output' χ_AB(t) in Eq (16) is a quantum state. The limiting case A = 0 and G ≠ 0 corresponds to pure dephasing without dissipation of energy.
The Davies decoherence introduces two parameters, A and G, which modify the quantum game in addition to the 'generic set' consisting of the entangling parameter γ and the three parameters constituting U in Eq (12).
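A minimal numerical sketch of a single-qubit Davies map applied to one half of a two-qubit state may help to fix ideas. The action on the coherences follows Eqs (20) and (21). The action on the populations is an assumption of this sketch: a standard thermal-relaxation form in which the populations relax at rate A towards the stationary populations 1 − p for |0⟩ and p for |1⟩. The function names, and the choice G ≥ A/2 used in the usage note, are likewise illustrative assumptions.

```python
import numpy as np


def davies_qubit(op, p, A, G, omega, t):
    """Apply a Davies-type map to a 2x2 operator `op` (basis |0>, |1>).

    Coherences follow Eqs (20)-(21) of the text. The population part is
    an ASSUMED standard form with stationary populations (1 - p, p).
    """
    relax = 1.0 - np.exp(-A * t)
    out = np.zeros((2, 2), dtype=complex)
    out[0, 0] = op[0, 0] * (1 - p * relax) + op[1, 1] * (1 - p) * relax
    out[1, 1] = op[0, 0] * p * relax + op[1, 1] * (1 - (1 - p) * relax)
    out[1, 0] = np.exp(1j * omega * t - G * t) * op[1, 0]   # Eq (20)
    out[0, 1] = np.exp(-1j * omega * t - G * t) * op[0, 1]  # Eq (21)
    return out


def davies_on_alice(rho_ab, p, A, G, omega, t):
    """Apply the map to Alice's qubit only (Bob's qubit left untouched).

    The two-qubit matrix is viewed as rho[a, b, a', b']; for fixed (b, b')
    the 2x2 block in (a, a') is an operator on Alice's qubit, and the
    linear map is applied block by block.
    """
    r = np.asarray(rho_ab, dtype=complex).reshape(2, 2, 2, 2)
    out = np.zeros_like(r)
    for b in range(2):
        for bp in range(2):
            out[:, b, :, bp] = davies_qubit(r[:, b, :, bp], p, A, G, omega, t)
    return out.reshape(4, 4)
```

With, say, p = 0.2, A = G = 1 and an entangled input, the output of `davies_on_alice` keeps unit trace and non-negative eigenvalues, Alice's populations approach (1 − p, p) at long times, and setting A = 0 leaves all populations unchanged (pure dephasing).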

Payoffs
The payoffs, i.e. the results of the game, are calculated as expectation values weighted by game-dependent real numbers a, b, c, d constituting the payoff matrix, which leads to the pay-off operators of Eq (28). For example, the strategy profile (S_A = 0, S_B = 1) is encoded in the quantum state |01⟩ and results in payoffs c for Bob and d for Alice. The trace operation represents projective measurements performed on the output state. We describe the influence of a thermal environment in terms of the payoff differences of Eqs (29)-(32), where the tilde denotes the "noisy" player. The signs of the Δ's in Eqs (29)-(32) identify the winner of the game, i.e. the one of the two players whose payoff is larger. Below we present explicit formulas calculated for four typical quantum strategies. As the general formulas are very complicated, we present the case θ = π/2 for simplicity. Let us notice that, due to the symmetry of the system, the payoff differences do not depend on a and b. There is also no difference between the payoff of the 'noisy' player and that which is unaffected by the environment, provided that there is no energy exchange between the noisy qubit and the thermal bath, i.e. A = 0.
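The payoff rule just described can be illustrated with a short sketch. Only the assignment for |01⟩ (c for Bob, d for Alice) is taken from the text. The remaining entries (a for both players on |00⟩, b for both on |11⟩, and the mirrored assignment on |10⟩) are assumptions of this sketch, chosen to match the usual Prisoner's Dilemma ordering c > a > b > d. The function names are hypothetical, and Δ$ is taken as Bob's payoff minus Alice's, so that a positive value means Bob is winning.

```python
import numpy as np


def payoff_operators(a, b, c, d):
    """Diagonal payoff operators in the basis |00>, |01>, |10>, |11>.

    The first label refers to Alice, the second to Bob. Only the |01>
    entry (c for Bob, d for Alice) is stated in the text; the other
    entries are ASSUMED.
    """
    p_alice = np.diag([a, d, c, b]).astype(complex)
    p_bob = np.diag([a, c, d, b]).astype(complex)
    return p_alice, p_bob


def payoffs(rho_f, a, b, c, d):
    """Expected payoffs (Alice, Bob) as traces Tr(P rho_f)."""
    p_alice, p_bob = payoff_operators(a, b, c, d)
    return (float(np.trace(p_alice @ rho_f).real),
            float(np.trace(p_bob @ rho_f).real))


def payoff_difference(rho_f, a, b, c, d):
    """Delta$ = $_B - $_A; positive means Bob is winning."""
    pay_a, pay_b = payoffs(rho_f, a, b, c, d)
    return pay_b - pay_a
```

With this assignment the difference reduces to (c − d)(ρ_{01,01} − ρ_{10,10}), so a and b drop out, in line with the remark above that the payoff differences do not depend on a and b.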
Let us start by considering the strategy profile (I,IU). For qualitative predictions of the character of the payoffs, the long-time behavior of the corresponding formulas is particularly important. We consider Δ$^∞_X := lim_{t→∞} Δ$_X, where X = A, B, AB, i.e. we assume that the payoffs are calculated at a time significantly longer than the thermal equilibration time of the player's qubit. These long-time limits are obtained with no restriction imposed on θ. Explicit formulas for the remaining strategies are postponed to the Appendix. There is also the non-trivial issue of the factorizability of probabilities. We are aware of no reliable method to analyze this problem in simulations. The interested reader is referred to [17,29,30] for a discussion of this problem.

Results and Discussion
In this section we study one of the best-known examples of a game: the celebrated Prisoner's Dilemma (PD) [1,2]. The Prisoner's Dilemma is often used for the analysis of various aspects of cooperation in economics, biology and network science [31]. The story says that two rational agents (prisoners) have to decide, without communication, whether to cooperate or not. They might decide not to cooperate, even if it is obvious that they would be better off cooperating. In its quantum version [4] the game ceases to be paradoxical for some classes of quantum strategies, but we should stress that the dilemma disappears due to a dramatic enlargement of the set of strategies for both agents. Therefore, the quantum Prisoner's Dilemma is a quite new game that reduces to the classical one if the strategy sets are properly restricted. The general quantum game considered so far reduces to the PD provided that the parameters in the payoffs Eq (28) fulfill the relation c > a > b > d [31]. We choose the following values:
(a, b, c, d) = (3, 1, 5, 0)    (39)
Further, we analyze in detail four strategy profiles of the players (or prisoners), assuming that one of them (Bob) can apply both classical and quantum strategies. Our aim is to present the dependence of the difference between Bob's and Alice's payoffs on the entangling parameter γ in Eq (4). Initially we limit our attention to the case in which there is only one noisy qubit, belonging either to Bob or to Alice.
First we consider the (I,IU) strategy profile. The payoff differences Eqs (30) and (31), calculated at different time instants, are presented in Fig (1). The quantum part of Bob's strategy is chosen to be U = U(π/2, 0, π/2). The payoff difference Δ$ can be either positive (Bob is winning) or negative, depending on the value of γ. This dependence is strongly affected by the thermal environment. Moreover, it differs markedly depending on whether the environment is attached to Bob's or to Alice's qubit. Let us notice that when the thermal environment affects Bob's qubit, his payoff is, in the long-time limit, always larger than the payoff of Alice, i.e. Δ$_B > 0 for all γ. This is not the case when the noisy qubit belongs to Alice: there is a range of γ for which Δ$_A < 0, i.e. $_B < $_A. In other words, the situation in which Alice possesses the noisy qubit and can control or choose γ is favorable for her if she tries to win or at least to minimize her losses. Let us also notice that there are parameters γ < π/4 such that Δ$_A > Δ$_B > Δ$ and, simultaneously, Δ$ < 0. This range of parameters is particularly favorable for Bob, who wins due to the presence of the thermal environment.
As the second example we consider the (I,FU) strategy profile. Here, in the absence of noise, the γ-dependence of the payoff difference Δ$ is trivial: Bob always wins. In the presence of a thermal bath this no longer holds. As presented in Fig (2), there exist values of γ resulting in Δ$_A < 0, i.e. Alice's payoff becomes larger for a sufficiently long time of interaction between her qubit and the environment.
The payoffs for the remaining two strategy profiles, (F,FU) and (F,IU), can be obtained from (I,IU) and (I,FU), respectively, by a change of sign: Δ$ → −Δ$ and Δ$_{A,B} → −Δ$_{A,B}. This symmetry is generic for the PD game Eq (39) and does not depend on the specific choice of the quantum part U of Bob's strategy.
The parameter γ is not the only one which affects Alice's chance to win against Bob. We consider the case when the noisy qubit belongs to Alice. The energy relaxation, parametrized by A, is one of the factors which most significantly affect the character of thermal dissipation. As discussed in the previous section, for A = 0 (i.e. when there is only pure dephasing with no energy dissipation) Δ$_A = Δ$_B and Bob's 'quantum benefit' becomes neutralized. The larger A is, the more the payoffs of Bob and Alice differ, as presented for two strategies in Fig (3). The effect of increasing temperature is visualized in Fig (4). In the limit of high temperature Δ$_A is small but positive. In other words, for some strategies and for given γ, Alice defeats Bob by warming her qubit.
The parameter A can also influence payoffs in games where the classical part of the strategy is given by V_c = I/2 + F/2, i.e. for the game averaged with respect to both classical strategies. The results presented in Fig (5) indicate two basic features. First, after averaging, Δ$_A = Δ$_B. Second, changing A changes the γ-'periodicity' of the payoffs.
Fig 5. Payoff differences Eq (29) as a function of A calculated at time t = 2 for a mixed Alice-Bob strategy profile (V_c, V_c U) with V_c = I/2 + F/2 and the quantum strategy Eq (12) with U = U(π/2, 0, π/2). The thermal Davies environment (with G = 1) influences only Alice's qubit. We set A = 0 and p = 0 in panels where these parameters are fixed.
The temperature dependence of the payoffs is clearly visible in the long-time limit (t → ∞) of the payoff differences Eq (29), here calculated for the (I,IU) strategy profile.

Conclusions
A method of taking decoherence into account in quantum game theory has been presented. We have assumed that the interaction between the qubits and their environments is weak and satisfies the requirements for applying the Davies weak coupling approach to the reduced dynamics [21]. Specifically, we have represented decoherence via Davies maps. Our analysis shows that the dependence of the payoff differences on the entangling parameter γ is strongly affected by a thermal environment. The temperature dependence of the payoffs is noticeable in the long-time limit. Moreover, the presented analysis stresses that the payoffs can vary dramatically depending on whether the environment is attached to Bob's or to Alice's qubit. This effect can be beneficial for one of the players, as presented graphically for various special cases of payoff differences. It would be of great interest to adapt this approach to quantum games on networks of agents [32][33][34], because systems involving a large number of simple variables with mutual interactions appear frequently in various fields of research.

Appendix: Explicit payoffs formulas
Here we provide explicit formulas for the payoffs for the three remaining quantum strategies (I,FU), (F,IU) and (F,FU) with θ = π/2, together with their long-time limits calculated for an arbitrary value of θ.
(I,FU)