A behavioural model of minority language shift: Theory and empirical evidence

Natural languages and their speech communities tend to compete for speakers, much as firms compete for market share. As a result, some languages come under shift pressure that may lead to their extinction. This work studies the dynamics of language shift in the context of modern bilingual societies like the Basque Country, Ireland and Wales. They all have two official, linguistically distant languages: A, spoken by all, and B, spoken by a bilingual minority. They also have a bilingual education system that ensures a steady flow of new bilinguals. However, a decay in the use of B is observed, signalling that shift processes are at work. To investigate this apparent paradox, we use an approach that is novel in the literature on language competition. We build a behavioural game model in which bilinguals choose either language A or B for each interaction, so that they play the game repeatedly. We present a theorem predicting that, under reasonable assumptions, any given population of bilinguals will converge to a linguistic convention, namely an evolutionarily stable equilibrium of the game, that always embeds a proportion of bilinguals shifting to A. We validate this result by means of an empirical version of the model, showing that the predictions fit the observed data on street use of Basque and daily use of Irish and Welsh well.

Having in mind readers who do not belong to the very technical strand, i.e. who are not familiar with our methodology, we revised the entire text again, acknowledging your interest and efforts to get the message of our work through all the concepts and mathematical derivations. We believe that we have now found an adequate mixture of rigorous technical parts and intuitive examples and explanations (or wording). In the present second revision we also introduced some additional clarifications.
Comment 2: "On page 8 you write: 'The replicators are the pure strategies R and H.' For non-specialists (e.g. me) at least a little surprising. How can ...? But certainly correct (for game theorists), could you explain in detail?"

Answer 2: Thanks for pointing this out; you will see that in this new version we have added many more explanations or changed the wording to provide a more intuitive approach. The standard replicator dynamics, which is the one used in our work, presumes that bilingual individuals only play the pure strategies of language use: R and H. These strategies can be copied or replicated. That is, the choice of R by one bilingual can be adopted, by way of imitation, by other bilingual(s), and similarly for H. The percentage of bilinguals playing a given strategy changes continuously. It expands or contracts depending on the expected payoff to the strategy relative to the average expected payoff of the system. This is what the replicator dynamics tells us. At each moment of time, the replicator equation (see below why we say "equation", in the singular) gives us the proportion of bilinguals playing R and the proportion of those playing H.

Comment 3: SI 9: new term: "one-population replicator dynamics". Why "one-population"?

Answer 3: Probably you are thinking of something more complicated, but we actually refer to something quite simple: the LUG is only played by the population of bilinguals, because monolinguals cannot choose a language. This is now emphasized a bit more. Consequently, the LUG is a one-population game, from which we obtain the one-population replicator dynamics.
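To make the mechanism described in Answer 2 concrete, the following is a minimal numerical sketch of a one-population, two-strategy replicator dynamics. It is an illustration only: the payoff matrix `PAYOFF`, the step size and the function names are assumptions chosen for the example and are not the expected payoffs of the LUG. The share of bilinguals playing R grows when R earns more than the population average and shrinks otherwise. The share playing H is simply one minus that share, so a single equation is enough to track both.

```python
# Hypothetical payoff matrix (NOT the LUG payoffs): rows are the player's own
# strategy (R, H), columns are the opponent's strategy (R, H).
PAYOFF = [[-1.0, 2.0],
          [0.0, 1.0]]


def replicator_step(p, payoff, dt=0.01):
    """One explicit Euler step of the replicator equation for the share p playing R."""
    u_r = payoff[0][0] * p + payoff[0][1] * (1.0 - p)  # expected payoff to R
    u_h = payoff[1][0] * p + payoff[1][1] * (1.0 - p)  # expected payoff to H
    average = p * u_r + (1.0 - p) * u_h                # population average payoff
    # The share of R expands if u_r exceeds the average and contracts otherwise.
    return p + dt * p * (u_r - average)


def simulate(p0, payoff, steps=20000, dt=0.01):
    """Iterate the Euler step from the initial share p0 and return the final share."""
    p = p0
    for _ in range(steps):
        p = replicator_step(p, payoff, dt)
    return p


if __name__ == "__main__":
    for start in (0.1, 0.9):
        print(start, "->", round(simulate(start, PAYOFF), 4))
```

With this illustrative matrix, interior starting shares below and above one half both move toward the same interior share, while the boundary shares 0 and 1 stay where they are.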

Comment 4:
"I think that in particular Theorem 1 (page 9) could be presented in a way which permits better access to non-specialists.
Some ideas: Realize that the terminology might be repulsive for linguists et al. Could you try not to shock non-specialists with terminology like: "standard replicator dynamics", "replicator dynamics equation" in other place just called "replicator equation", "evolutionary stable state strategy ESS", "Nash equilibrium", "global attractor", most of which presented in just 3 lines on page 9? Of course, by searching in Wikipedia also non-specialists can learn what all that means, but will they do so? And I doubt that you really need all of this terminology: it could be mentioned in footnotes for specialists who otherwise might be frustrated to miss it.
Structure the explanation of the most important Theorem 1, i.e. in particular allow more space for deriving the replicator equation for the derivative of p_i. Possibly without making extensive use of game-theory terminology.
I believe equ. (1) is correct, but have to confess that I was not able to convince myself." Answer 4:

Presentation of Theorem 1:
We see your point. In this spirit but also sticking to what we said before (looking for a 'compromise') we revised that part. In particular, see the presentation of the theorem in the Introduction and the discussions right before and after the Theorem.
We believe that the reader does not need to be a game theory specialist to accept that conventions (of any nature) are quite stable, and even hard to change. Recall that the Example on the admittedly somewhat trivial 'road game' introduces the main game concepts needed; see also Remark 3 for an intuitive understanding of the theorem. It basically says that the mentioned equilibrium of the LUG is a linguistic convention explaining why a proportion of bilinguals may shift to the majoritarian language A.
Let us now try to explain how to obtain equation (1), our equilibrium $p_i^*$.

The derivation of $p_i^*$ and its stability:
We have added this in the SI, in order to show how one could proceed to obtain equation (1). It is probably simpler than you might believe. We need to know the points where the replicator dynamics equation stops moving. Those points are called the rest points of the equation. Consider the points $p_i = 0$ and $p_i = 1$. If we substitute these values of $p_i$ into the right-hand side of the replicator equation, then the dynamics stops; i.e., $\dot{p}_i = 0$. Hence, $p_i = 0$ and $p_i = 1$ are rest points. Now, let us take the expression inside the brackets $[\dots]$. If we set this expression equal to 0, then $\dot{p}_i = 0$ as well. Specifically, consider the equation obtained by setting the bracketed expression to zero. Solving this equation for $p_i$, you obtain the equilibrium $p_i^*$ given in equation (1). So $p_i^*$ is a rest point. Is it stable? Consider any $p_i$ such that $0 < p_i < p_i^*$. Then, comparing $p_i$ with the expression for $p_i^*$, one finds $\dot{p}_i > 0$, which leads $p_i$ toward $p_i^*$. For any $p_i$ such that $1 > p_i > p_i^*$, the same procedure shows that $\dot{p}_i < 0$, so $p_i$ decreases toward $p_i^*$. Hence, dynamically, $p_i^*$ is stable and a global attractor in $(0, 1)$, and both $p_i = 0$ and $p_i = 1$ are unstable rest points. As said, we have now added this explanation in the SI.
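The same steps can be followed on a generic example. The sketch below is not equation (1) of the paper: it assumes a two-strategy replicator equation for a symmetric game whose normalized payoff matrix has diagonal entries $a_1 < 0$ and $a_2 < 0$ and zeros elsewhere, with $p$ the share playing the first strategy. The symbols $a_1$, $a_2$ and $p$ are introduced here for illustration only.

```latex
\begin{align*}
\dot{p} &= p\,(1-p)\,\bigl[a_1 p - a_2 (1-p)\bigr],
  \qquad a_1 < 0,\ a_2 < 0, \\[4pt]
% Rest points: p = 0 and p = 1 make the outer factor vanish;
% the interior rest point makes the bracket vanish.
0 &= a_1 p^{*} - a_2 (1-p^{*})
  \;\Longrightarrow\;
  p^{*} = \frac{a_2}{a_1 + a_2} \in (0,1), \\[4pt]
% Rewrite the bracket around p^*:
a_1 p - a_2 (1-p) &= (a_1 + a_2)\,(p - p^{*}), \\[4pt]
% Since a_1 + a_2 < 0 and p(1-p) > 0 on (0,1):
0 < p < p^{*} &\;\Longrightarrow\; \dot{p} > 0,
  \qquad
  p^{*} < p < 1 \;\Longrightarrow\; \dot{p} < 0 .
\end{align*}
```

In this generic case the interior rest point attracts every starting share in $(0, 1)$, and the two boundary rest points are unstable, which is the same qualitative pattern as described above.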
We hope to have answered your doubts in an understandable way.

Comment 5 with Answer:
Typing errors and the like: We are very thankful for these comments and have tried to address them all, except point 4 (we are aware that different communities unfortunately sometimes use different notations). Note that $g'$ denotes the derivative of the function $g$ with respect to its argument.

Comment 6:
Terminology: 1. Please define "private information". 2. Is there a difference between "payoff" and "benefit"? On SI p.13 you even sometimes change from benefit to profit, that appears confusing. And how about "utility", e.g. "perceptible utility gain" (SI p.7, third line from bottom): what is the difference from payoff or benefit? My confusion continues on SI p.8 where the new term "preference intensity" appears.
5. SI p.11: why $p_i \in (0, 0.293)$? 6. SI p.12-13: what is the difference E(KE) - P(KE), E(DU) - P(DU), resp.? Is "predicted" not the same as "expected"?

Answer 6: 1. In the context of the LUG, it means that when you participate in an interaction, you do not know, ex ante, whether the interlocutor is bilingual or monolingual. Similarly, your interlocutor does not have information about your linguistic type. This information is therefore private. Hence, there is uncertainty for the bilingual involved in the interaction. Once the interaction starts, and if you are a bilingual who chooses the strategy Reveal, then the information that you are bilingual is revealed and ceases to be private. The uncertainty has disappeared. However, if two bilinguals meet and both play Hide, the information remains private, the uncertainty is not resolved, and they will speak in the majoritarian language A.
2. We use the concepts "utility" and "perceptible utility gain" simply because they are standard in theoretical models of this kind. However, we have extended the notation to offer readers from other fields a better understanding. Our notation is commonly used in measurement theory, games and decision theory, and was introduced by the corresponding authors who worked in those fields (see for instance (11), (12)). The concept "perceptible utility gain" is a psychophysical expression that is used in Measurement Theory. It means that, given the bounded discrimination capacities of the human mind, utility differences below the perception threshold are not distinguishable. Binary relations like similarity and indifference are the outcome of bounded human rationality.
In game theory, "payoff" and "von Neumann and Morgenstern (sorry!) utility" or just "utility" are used equivalently. In the context of utility theory, "benefit", "profit" and "net benefit" refer to levels of utility or payoff to the bilingual player. Whenever one of those terms is introduced in our work, you will find the mathematical expression of the term, such as profit in line 7, page 13. Net profit is described by equation (5), and so on. What matters is the mathematical expression of the term. It may happen that the same mathematical expression is sometimes called benefit and sometimes profit, but they mean the same mathematically. Frustration cost means a "utility loss".
3. "Preference intensity": this concept is introduced in Assumption A.3, page 7 of the main text, line 283, where we note that $m(\cdot) > n$ shows that the preference intensity for using B decreases as the argument of $m$ increases. Here $m(\cdot)$ denotes the payoff (or utility) obtained from speaking in B, and $n$ is the payoff from speaking in A, a positive constant. The inequality $m(\cdot) > n > 0$ means that the bilingual of locality $i$ prefers using language B. The intensity of this preference is measured by the size of the difference $m(\cdot) - n$. Since the function $m$ is assumed to be decreasing in its argument, the preference intensity decreases as that argument increases in $(0, \alpha_i)$.
4. We certainly understand your point, but it is not that we fancy the term. We use it because it is well established in the game-theory community (you can find it in Wikipedia as a popular game theory model). But we have added 'so-called' and put it in quotation marks. Let us explain: we show in the SI that the expected payoff matrix of the LUG is symmetric. In game theory there is a classification of symmetric 2x2 games. After the normalization of the 2x2 symmetric matrix, there are four types of symmetric games, depending on the payoffs on the main diagonal (off this diagonal all elements are 0). If the normalized payoffs on the main diagonal are both negative, which is the case for our expected payoff matrix, then the game is said to have the structure of the Hawk-Dove Game. In game theory it is common to mention this when you face a symmetric matrix like this one. Finally, notice that we do not use the term in the main text, to avoid misunderstandings.
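The normalization and classification just described can be sketched in a few lines. This is an illustration under stated assumptions, not code from our work: the function names are hypothetical, the example matrices are textbook-style numbers rather than the LUG payoffs, and the labels for the two mixed-sign cases follow one common textbook naming, which other sources may phrase differently.

```python
def normalize(payoff):
    """Return the diagonal (a1, a2) of the normalized symmetric 2x2 game.

    payoff[r][c] is the row player's payoff when playing r against c.
    Subtracting the off-diagonal entry from each column leaves zeros off the
    diagonal and does not change best replies.
    """
    (a, b), (c, d) = payoff
    return a - c, d - b


def classify(payoff):
    """Classify a symmetric 2x2 game by the signs of its normalized diagonal."""
    a1, a2 = normalize(payoff)
    if a1 < 0 and a2 < 0:
        return "Hawk-Dove"
    if a1 > 0 and a2 > 0:
        return "Coordination"
    if a1 < 0 < a2:
        return "Dominance (second strategy dominant)"
    if a2 < 0 < a1:
        return "Dominance (first strategy dominant)"
    return "Degenerate (a zero on the normalized diagonal)"


if __name__ == "__main__":
    hawk_dove_example = [[-1, 2], [0, 1]]     # illustrative numbers only
    coordination_example = [[2, 0], [0, 1]]   # illustrative numbers only
    print(normalize(hawk_dove_example), classify(hawk_dove_example))
    print(normalize(coordination_example), classify(coordination_example))
```

Only the signs of the two normalized diagonal entries matter for the classification, which is why the game can be assigned to the Hawk-Dove type without reference to the specific payoff levels.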
5. Because it is the solution of the quadratic equation resulting from the inequality that you find two lines below in the text.
6. This is a really good question, and the terminology is admittedly sometimes confusing. While it is true that 'expectation' would be the safer term from a statistical point of view, unfortunately the much less specific term 'prediction' is by far more popular and widespread across different communities. Please allow us to omit here any mathematical explanation of the general differences between prediction and expectation in mathematical statistics; it might be sufficient for our purpose that in these models and in our context it is, fortunately, the same. We have included a sentence in the revised version that clarifies this issue, saying that here we can and will use the terms prediction and expectation synonymously. So the easy answer to your question is that in this paper, yes, it is the same. However, we first estimate the equation for conditional expectations, and then use these estimates to predict the entire street or daily use functions.
To conclude, we hope you will understand our difficult position of using a formal model while being accessible to a broad audience, from linguists to mathematicians, physicists, psychologists and behavioural scientists. We have tried to 'smooth' the text to be open to all those potential readers.
Thank you for your work as a highly interested reviewer of our manuscript.