Exact Derivation of a Finite-Size Scaling Law and Corrections to Scaling in the Geometric Galton-Watson Process

The theory of finite-size scaling explains how the singular behavior of thermodynamic quantities at the critical point of a phase transition emerges when the size of the system becomes infinite. Usually, this theory is presented in a phenomenological way. Here, we demonstrate exactly the existence of a finite-size scaling law for Galton-Watson branching processes when the number of offspring of each individual follows either a geometric distribution or a generalized geometric distribution. We also derive the corrections to scaling and the limits of validity of the finite-size scaling law away from the critical point. A mapping between branching processes and random walks allows us to establish that these results also hold for the latter case, for which the order parameter turns out to be the probability of hitting a distant boundary.

Boltzmann constant), yielding the reduced magnetic field $h = H/(k_B T)$. Additionally, one may consider a system of units in which $\mu$ and $h$ are dimensionless. The former, $\mu$, will be the order parameter, whereas $h$ and $\tau$ are control parameters.
"Near" the critical point of the transition, defined by $\tau = h = 0$, the equation of state fulfills a scaling law, which gives $\mu$ as a function of $\tau$ and $h$ as

$$\mu = |h|^{\beta/\Delta}\, \hat F_\pm\!\left(\frac{\tau}{|h|^{1/\Delta}}\right), \tag{1}$$

where $\beta$ and $\Delta$ are critical exponents, and $\hat F_\pm$ represents two scaling functions, one ($+$) for $h > 0$ and another one ($-$) for $h < 0$. The scaling law Eq (1) indicates the invariance of the equation of state under appropriate scale transformations (which are linear transformations of the axes $\mu$, $\tau$ and $h$). By the universality property, many different systems share the same values of the critical exponents and the same scaling functions, and then the scaling law Eq (1) constitutes a law of corresponding states [8].
For instance, for the mean-field theory or the Landau theory of the Ising model [9,10], $\beta = 1/2$, $\Delta = 3/2$, and the scaling function $\hat F_\pm$ is given by the two real solutions of $x = |\hat F_\pm(x)|^{-1} - |\hat F_\pm(x)|^{2}/3$, Ref. [11]. This yields

$$\hat F_\pm(x) \to \begin{cases} \pm(-3x)^{1/2} & \text{for } x \to -\infty,\\ \pm 3^{1/3} & \text{for } x = 0,\\ \pm 1/x & \text{for } x \to \infty, \end{cases} \tag{3}$$

leading to the equation of the spontaneous magnetization, the critical-isotherm equation, and the Curie-Weiss law, respectively [9]. As $\hat F_\pm(x)$ is a smooth function, it is only at the critical point that a sharp transition emerges. It is important that the correlation length $\xi$ fulfills a scaling law analogous to Eq (1),

$$\xi = |h|^{-\nu/\Delta}\, \hat G_\pm\!\left(\frac{\tau}{|h|^{1/\Delta}}\right),$$

with $\nu$ another critical exponent and $\hat G_\pm$ another pair of scaling functions. Then the main fact of critical phenomena is that $\xi$ diverges (goes to infinity) right at the critical point (as $\nu$ and $\Delta$ are positive). For instance, at the critical isotherm, $\tau = 0$, one has $\xi \propto 1/|h|^{\nu/\Delta}$, whereas at zero field, $\xi \propto 1/|\tau|^{\nu}$.

Strictly, all these equations are only valid in the thermodynamic limit. For a system of finite size $L$ (in all dimensions [1]) the correlation length cannot be infinite. When $L$ is much larger than the correlation length one does not expect the finiteness of the system to have any influence on its behavior; however, this is not the case when $L$ becomes smaller than the correlation length of the corresponding infinite system [1]. So, one can introduce a phenomenological additional dependence on $\xi/L$ in the equation of state [7], as

$$\mu = |h|^{\beta/\Delta}\, \hat F_\pm\!\left(\frac{\tau}{|h|^{1/\Delta}}, \frac{\xi}{L}\right) = L^{-\beta/\nu}\, \tilde F_\pm\!\left(L^{1/\nu}\tau, L^{\Delta/\nu}|h|\right) = L^{-\beta/\nu}\, F\!\left(L^{1/\nu}\tau, L^{\Delta/\nu}h\right), \tag{5}$$

where the terms $|h|^{\beta/\Delta}$ and $\tau/|h|^{1/\Delta}$ have been transformed to $L^{-\beta/\nu}$ and $L^{1/\nu}\tau$, respectively. The previous equation constitutes a finite-size scaling law or ansatz, where now $\hat F_\pm$, $\tilde F_\pm$ and $F$ become bivariate scaling functions, with the latter unifying the positive and negative values of $h$. The finite-size-scaling ansatz can be verified by plotting $\mu L^{\beta/\nu}$ versus $\tau L^{1/\nu}$ and $h L^{\Delta/\nu}$; if a data collapse emerges, this gives the shape of the scaling function $F$. In this way, finite-size behavior is determined from the critical exponents of the infinite system [1].
Although finite-size scaling is usually derived in this phenomenological way, there have been exact derivations for particular systems [12]. Note that for a finite system with $h = 0$ the system size $L$ plays a role similar to that of the inverse of the magnetic field in an infinite system, or more precisely, $L^{1/\nu}$ acts as $1/|h|^{1/\Delta}$, and in this way, one expects that the scaling function $F$ in Eq (5), as a function of its first argument, behaves qualitatively as the scaling function $\hat F_\pm$ in Eq (1). This implies that a sharp transition can only take place for $L \to \infty$, i.e., in the thermodynamic limit. There are numerous examples in the literature about the "smoothness" of phase transitions for finite systems, see for instance Ref. [13].

2 Introduction: Phase transition in the Galton-Watson process

The Galton-Watson process [14,15] provides the simplest model for the growth (and decline) of a biological population [16], but it is equally applicable to the growth of a nuclear reaction [17], an earthquake [18], or mean-field self-organized critical processes in general [18][19][20][21]. It belongs to a more general class of models known as branching processes. The Galton-Watson process starts with one single element that replicates, producing more elements, called offspring, which also replicate, producing more elements and so on. The model is stochastic, as the (total) number of offspring produced by each element is random, characterized by a distribution that is the same for all elements and also independent of the number of offspring of the other elements.
In mathematical terms, the probability that the number of offspring $K$ of one element takes the value $k$ is given by $P[K = k]$, with $k$ taking discrete values from 0 to $\infty$. In this paper we will consider that $P[K = k]$ is given by the geometric distribution, or by the generalized geometric distribution, but the model is totally general. The distribution $P[K = k]$ completely defines the model, as, we insist, the numbers of offspring of the different elements are identically distributed and independent. The initial element defines the 0-th generation, its offspring are the first generation, and so on. An index $t$ labels each generation. The model does not incorporate time, but one can interpret $t$ as a discrete time. An important auxiliary variable is $N_t$, which counts the number of elements in each generation, starting with $N_0 = 1$ (one single original element).
The key question to ask is whether the process becomes extinct, i.e., $N_t = 0$ as $t \to \infty$, or not (in which case it goes on forever). A fundamental result in the theory of branching processes [15,18] is that the probability of extinction $P_{\rm ext}$ can be obtained from

$$P_{\rm ext} = \lim_{t\to\infty} f^{t}(0),$$

where $f^{t}(s)$ is the $t$-th composition of the probability generating function $f(s)$ of the random variable $K$,

$$f(s) = \sum_{k=0}^{\infty} P[K = k]\, s^{k}.$$

As we iterate successive compositions of $f(s)$ starting from $s = 0$, the limit is given by the smallest fixed point $s^{*}$ of $f(s)$ in the interval $[0, 1]$; so, $s^{*}$ necessarily satisfies $s^{*} = f(s^{*})$, but it is the smallest value in $[0, 1]$ verifying such a relation. Introducing the probability of survival, or probability of non-extinction, $\rho$, fulfilling $P_{\rm ext} = s^{*} = 1 - \rho$, the fixed-point condition becomes

$$1 - \rho = f(1 - \rho) = \sum_{k=0}^{\infty} P[K = k]\,(1 - \rho)^{k}.$$

From here, it is clear by normalization that $\rho = 0$ is a possible solution. Expanding the equation up to second order in $\rho$ using the binomial theorem one gets

$$1 - \rho \simeq 1 - \langle K \rangle \rho + \frac{\langle K(K-1) \rangle}{2}\, \rho^{2}.$$

The solutions, in terms of the mean number of offspring, $m = \langle K \rangle$, and close to $m = 1$, are then

$$\rho = \begin{cases} 0 & \text{for } m \le 1,\\ 2(m-1)/\sigma_c^{2} & \text{for } m > 1, \end{cases} \tag{11}$$

where we have used that when $\rho$ is close to zero (from above) $m$ is close to one, and therefore $\langle K(K-1) \rangle = \sigma^{2} + m(m-1) \simeq \sigma_c^{2}$, where $\sigma_c^{2}$ is the variance of $K$ when its mean is one. It can be proved that there are no other fixed points than the two above [15,18].
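The fixed-point iteration described above can be sketched numerically. The following is a minimal illustration, not part of the original derivation: it assumes the geometric offspring law $P[K = k] = p q^k$ considered later in the paper, whose generating function is $f(s) = p/(1 - qs)$ with $m = q/p$; the function names `pgf_geometric` and `extinction_probability` are hypothetical.

```python
def pgf_geometric(s, p):
    """Generating function f(s) = p / (1 - q s) of P[K = k] = p q^k (assumed law)."""
    q = 1.0 - p
    return p / (1.0 - q * s)


def extinction_probability(f, tol=1e-12, max_iter=1_000_000):
    """Iterate s -> f(s) from s = 0 until successive iterates differ by less than tol.

    Starting at 0, the iterates increase towards the smallest fixed point
    of f in [0, 1], which is the extinction probability.
    """
    s = 0.0
    for _ in range(max_iter):
        s_next = f(s)
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    return s  # not converged within max_iter (slow convergence happens at m = 1)


if __name__ == "__main__":
    for p in (0.6, 0.4):  # subcritical (m = 2/3) and supercritical (m = 3/2)
        m = (1.0 - p) / p
        rho = 1.0 - extinction_probability(lambda s: pgf_geometric(s, p))
        print(f"m = {m:.3f}  survival probability = {rho:.6f}")
```

Close to $m = 1$ the survival probability obtained this way should approach $2(m - 1)/\sigma_c^2$, with $\sigma_c^2 = 2$ for the geometric case; exactly at $m = 1$ the iteration converges only algebraically, so the stopping rule above is not adequate there.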
It is clear that the case in which the offspring distribution verifies $m = 1$ is critical, in the sense that it separates two very different "phases" of the system: sure extinction if $m \le 1$ and non-sure extinction (and the possibility of a "demographic" explosion) for $m > 1$. Even more, this phase diagram is analogous to the spontaneous (zero-field) behavior of a magnetic system, Eq (3), if we identify $m - 1$ with the control parameter $\tau$ and $\rho$ with the order parameter $\mu$, and so we can talk about a phase transition in the Galton-Watson model [18] with critical point at $m = m_c = 1$. Note then that $\sigma_c^{2}$ becomes the variance of the number of offspring in the critical case. There are, though, two quantitative differences: $\beta = 1$ (in contrast to $\beta = 1/2$ in the magnetic example above) and the ordered phase (non-zero order parameter) is now above the critical point. Eq (11) also tells us that when the distance to the critical point, $m - 1$, is rescaled by $\sigma_c^{2}$ the behavior of the transition is universal, i.e., independent of the underlying distribution of the number of offspring $K$.
In this paper we investigate this phase transition for a finite number of generations, i.e., when the number of generations is limited by $t \le L$. In a previous paper [22] we expanded $f(f^{t}(0))$ around the critical point $s^{*}$ to obtain a general finite-size-scaling law for the probability of survival $\rho$. Here we follow a different, more direct approach, particularized for a geometric distribution of the number of offspring, which will also allow us to obtain the corrections to scaling.
After the introduction to finite-size scaling in critical phenomena in the previous section and the introduction to branching processes in this section, in Sec. 3 we analyze the finite-size effects in the critical properties of the Galton-Watson process when the offspring distribution is given by the geometric distribution. Two different order parameters are explored, $\rho$ and $\rho/(1 - \rho)$, and the corrections to scaling and the range of validity of the scaling law are obtained as well. In Sec. 4 we establish that our scaling law also describes the escape probability of a simple one-dimensional random walk. In Sec. 5 we generalize the finite-size scaling law to the so-called generalized geometric distribution. An appendix gives some details of the calculations of Secs. 3 and 5.

Finite-size scaling in the geometric Galton-Watson process
We consider the Galton-Watson model with a finite number of generations L, which means that the process is stopped when it reaches the L-th generation, i.e., the elements of this generation are not allowed to replicate. Viewing the process as a branched tree, L becomes the height of the tree and is therefore a measure of system size (more precisely, the height of the tree is L + 1, counting the 0-th generation).
The extinction of this process is given by the event $N_L = 0$, as extinction at any generation $t < L$ is included in the case $N_L = 0$ (extinction is forever, as it is an absorbing state). In the same way as for an unbounded system, the probability of extinction will be

$$P_{\rm ext}(L) = f^{L}(0)$$

(we only make explicit the dependence on $L$, but a hidden dependence exists on the parameters of the distribution of $K$, in particular on $m$). The probability of extinction is obtained then as the $L$-th composition of the probability generating function of the distribution of the number of offspring, but note that as $L$ is not infinite, $f^{L}(0)$ will not reach the fixed point $s^{*}$. Although formally the problem is solved by the calculation of $f^{L}(0)$, in general it is not feasible to arrive at an explicit expression for the composition, even for small values of $L$. A remarkable exception is the case when $K$ follows the geometric distribution, given by

$$P[K = k] = p\, q^{k} \tag{13}$$

for $k = 0, 1, \dots, \infty$ (and zero otherwise) and with $q = 1 - p$. The only parameter of the distribution is $p$, which is called the success probability.

The geometric distribution has a straightforward interpretation in terms of biological populations. For instance, consider that the elements that replicate are female individuals, and each female has a probability $q$ of producing another female and a probability $p$ of producing a male. Each female reproduces until she gets a male, and once the male is obtained the mother does not reproduce anymore. Although getting a male is considered a "success" (this is just a name), it is the female individuals that are counted as offspring, so $K$ counts the number of females, disregarding the male. Note that another variant of the geometric distribution also counts the male; for us this would be a shifted geometric distribution, and it is not considered here.
The probability generating function of the geometric distribution turns out to be

$$f(s) = \frac{p}{1 - qs},$$

from which the mean is obtained as $m = \langle K \rangle = f'(1) = q/p$ and the variance as $\sigma^{2} = f''(1) - m(m - 1) = q/p^{2}$, see Ref. [18]. Note that the critical point of the corresponding Galton-Watson process is at $m = q/p = 1$ and so $p_c = q_c = 1/2$, with a critical variance $\sigma_c^{2} = 2$. The fundamental property (for our problem) of the geometric distribution comes from the fact that its probability generating function is a fractional linear function [15], also called a linear fractional function [23]. In this case the successive compositions of $f(s)$ can be computed for any $L$, yielding

$$f^{L}(s) = \frac{s_0 (1 - s) + \kappa^{L} (s - s_0)}{(1 - s) + \kappa^{L} (s - s_0)},$$

see Ref. [23] or Eq (58) in our Appendix. The constant $s_0$ is a fixed point of $f(s)$ different from 1 (this fixed point, $s_0$, always exists except for $m = 1$), and the constant $\kappa$ is given in the Appendix. Then, the probability of survival will be

$$\rho(L) = 1 - f^{L}(0) = \frac{1 - s_0}{1 - s_0 \kappa^{L}}, \tag{16}$$

which contains the solution to our problem. For the geometric distribution the fixed point $s_0$ is at $s_0 = p/q = m^{-1}$, and then $\kappa = p/q = m^{-1}$ (see Appendix); therefore, substituting into Eq (16) we get

$$\rho(L) = \frac{m^{L}(m - 1)}{m^{L+1} - 1}. \tag{17}$$

This exact equation provides the order parameter $\rho$ as a function of the control parameter $m$ for any system size $L$ (in the case of the geometric distribution).

In order to verify whether a scaling law is fulfilled it is convenient to introduce the rescaled distance to the critical point,

$$x = L^{1/\nu}(m - 1),$$

where the "distance" $m - 1$ is rescaled (divided) by the term $1/L^{1/\nu}$, with the value of the exponent $\nu$ unknown. Substituting $m = 1 + x/L^{1/\nu}$ into Eq (17) we observe that the rescaled survival probability $L^{1/\nu}\rho(L)$ in the limit $L \to \infty$ tends either to zero or to infinity (depending on the sign of $x$ and on whether $\nu > 0$ or $\nu < 0$), except in the case $\nu = 1$. For $\nu = 1$ and close to the critical point, the limit of $L^{1/\nu}\rho(L)$ is a positive value that only depends on $x$, which is the signature of a scaling law. Indeed, rewriting Eq (17) in terms of $x$, using that $m^{L} \to e^{x}$ for $\nu = 1$, leads to

$$\rho(L) \simeq \frac{1}{L}\, \frac{x\, e^{x}}{e^{x} - 1}$$

up to the lowest order in $L^{-1}$. Taking into account that the variance at the critical point is $\sigma_c^{2} = 2$, the scaling law can be written as

$$\rho(L) \simeq \frac{1}{\sigma_c^{2} L}\, F\big(L(m - 1)\big), \tag{22}$$

with scaling function

$$F(x) = \frac{2x\, e^{x}}{e^{x} - 1}, \tag{23}$$

in total agreement with Ref. [22]. The reason to introduce the value of $\sigma_c^{2}$ will become clearer when we consider the generalized geometric case, in Sec. 5. Note that the scaling law obtained here for the Galton-Watson process is very similar to the purely mathematical case considered in Sec. 2.5.1 (p. 85) of Ref. [2].
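As a numerical cross-check of this section, the sketch below compares three routes to the survival probability for the geometric case: direct $L$-fold composition of the generating function, the closed form that follows from Eq (16) with $s_0 = \kappa = m^{-1}$, and the finite-size scaling law. The formulas $f(s) = p/(1 - qs)$, $\rho(L) = m^L(m-1)/(m^{L+1}-1)$ and $F(x) = 2x e^x/(e^x - 1)$ are taken as assumptions derived from the definitions in the text, and all function names are hypothetical.

```python
import math


def rho_by_composition(m, L):
    """Survival probability 1 - f^L(0), composing f(s) = p/(1 - q s) L times."""
    p = 1.0 / (1.0 + m)  # from m = q/p with q = 1 - p
    q = 1.0 - p
    s = 0.0
    for _ in range(L):
        s = p / (1.0 - q * s)
    return 1.0 - s


def rho_closed_form(m, L):
    """Assumed closed form rho(L) = m^L (m - 1) / (m^(L+1) - 1); 1/(L+1) at m = 1."""
    if m == 1.0:
        return 1.0 / (L + 1)
    return m**L * (m - 1.0) / (m ** (L + 1) - 1.0)


def scaling_function(x):
    """Assumed scaling function F(x) = 2 x e^x / (e^x - 1), with F(0) = 2."""
    if x == 0.0:
        return 2.0
    return 2.0 * x * math.exp(x) / math.expm1(x)


def rho_scaling(m, L, var_c=2.0):
    """Scaling-law approximation F(L (m - 1)) / (sigma_c^2 L)."""
    return scaling_function(L * (m - 1.0)) / (var_c * L)


if __name__ == "__main__":
    L = 100
    for m in (0.97, 0.99, 1.0, 1.01, 1.03):
        print(m, rho_by_composition(m, L), rho_closed_form(m, L), rho_scaling(m, L))
```

The first two columns should agree to floating-point accuracy, while the third differs by a relative amount that shrinks as $L$ grows at fixed $x = L(m - 1)$.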
It is important that the scaling function Eq (23) fulfills

$$F(x) \to \begin{cases} 2|x|\, e^{-|x|} & \text{for } x \to -\infty,\\ 2 & \text{for } x \to 0,\\ 2x & \text{for } x \to \infty. \end{cases}$$

Although our calculation does not include the critical case, $x = 0$, the Appendix shows that the critical case is indeed also described by the value of the scaling function $F$ at $x = 0$. Therefore, there is a removable singularity at $x = 0$. The limit behavior of $F$, substituted into the scaling law, leads to

$$\rho(L) = \begin{cases} 2\sigma_c^{-2}(1 - m)\, e^{-L(1 - m)} & \text{for } m < 1 \text{ and } L \to \infty,\\ 2\sigma_c^{-2} L^{-1} & \text{for } m = 1,\\ 2\sigma_c^{-2}(m - 1) & \text{for } m > 1 \text{ and } L \to \infty. \end{cases}$$

We see that the infinite-size case, Eq (11), is recovered when $L$ is infinite, and that it is only in this case that a sharp transition exists. Comparison with Eq (3) allows one to see which are the equivalents of the "critical isotherm" and "spontaneous magnetization" laws for the Galton-Watson process. For the latter case we see that $\beta = 1$. The Curie-Weiss law is not fulfilled, as for $m < 1$ $\rho$ does not decay as a power law in $L$ but exponentially.
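We may also obtain the corrections to scaling by keeping terms beyond the leading one. Going back to Eq (17), we substitute there the exact expression $m^{L} = (1 + x/L)^{L} = e^{x}(1 + \sum_n a_n)$, with $a_1 = -x^{2}/(2L)$, $a_2 = x^{3}/(3L^{2})$, etc.; then,

$$\rho(L) = \frac{F(x)}{\sigma_c^{2} L}\, \frac{1 + \sum_n a_n}{1 + u \sum_n b_n},$$

with $u = e^{x}/(e^{x} - 1)$ and $\sum_n b_n = x/L + (1 + x/L)\sum_n a_n$. The first terms of the different sums are

$$\sum_n a_n = -\frac{x^{2}}{2L} + \frac{x^{3}}{3L^{2}} + \frac{x^{4}}{8L^{2}} + \cdots, \qquad \sum_n b_n = \frac{x}{L} - \frac{x^{2}}{2L} - \frac{x^{3}}{6L^{2}} + \frac{x^{4}}{8L^{2}} + \cdots.$$

Let us study the behavior as far from the critical point as possible. Below it ($x < 0$), we take $x \to -\infty$ and then $u \to 0$ (exponentially in $x$); therefore, only $\sum_n a_n$ contributes and we get

$$\rho(L) = \frac{F(x)}{\sigma_c^{2} L} \left(1 - \frac{x^{2}}{2L} + \cdots \right); \tag{35}$$

so, the first correction-to-scaling term goes as $-x^{2}/(2L) = -L(m - 1)^{2}/2$. This means that if this term is of order $\varepsilon$ (i.e., $L(m - 1)^{2}/2 = \varepsilon$) all other terms are of higher order in $\varepsilon$, in the limit $L \to \infty$. This is so because the rest of the terms are either of the form $x^{2k}/L^{k}$, which is of order $\varepsilon^{k}$, or carry additional factors of $x/L$, which vanish as $L \to \infty$ at fixed $\varepsilon$.

Above the critical point ($x > 0$) we consider $x \to \infty$; then $u \to 1$ and the sums lead to the cancellation of all terms that are not powers of $x/L$, so

$$\rho(L) = \frac{F(x)}{\sigma_c^{2} L} \left(1 - \frac{x}{L} + \frac{x^{2}}{L^{2}} - \cdots \right). \tag{38}$$

The first correction to scaling is given by the term $-x/L$. If we impose this to be of order $\varepsilon$ (i.e., $\varepsilon = x/L = m - 1$), we obtain the limit of validity of the scaling law above the critical point. In summary, the scaling law will hold in the range

$$1 - \sqrt{\frac{2\varepsilon}{L}} < m < 1 + \varepsilon, \tag{39}$$

with $\varepsilon \ll 1$. For instance, for a 5% error [defined through the ratio between the approximation given by the scaling law and the exact $\rho(L)$, Eq (17)], $\varepsilon = 0.05$ and then $1 - \sqrt{0.1/L} < m < 1.05$. Fig 1 shows that this is valid for $L$-values above 40 for $m < 1$ and above 160 for $m > 1$. Note that the range of validity that we obtain, Eq (39), is much larger than the one implicit in Ref. [22], $1 - c/L < m < 1 + c/L$, with $c$ a constant. If we do not take the limits $x \to \pm\infty$, we have, keeping terms up to first order in $1/L$,

$$\rho(L) \simeq \frac{F(x)}{\sigma_c^{2} L} \left[1 - (1 - u)\frac{x^{2}}{2L} - u\, \frac{x}{L} \right], \tag{40}$$

which is also shown in Fig 1a and 1b. A scaling law with a broader range of validity is obtained by taking as an order parameter not $\rho$ but $\rho/(1 - \rho)$.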
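The size of the corrections can be inspected numerically. The sketch below is only an illustration under stated assumptions: it takes the exact survival probability as $\rho(L) = m^L(m-1)/(m^{L+1}-1)$, the scaling function as $F(x) = 2xe^x/(e^x-1)$, and the first-order correction factor as $1 - (1-u)x^2/(2L) - ux/L$ with $u = e^x/(e^x - 1)$, which is what the expansion above gives when only terms of order $1/L$ are kept. The function names are hypothetical.

```python
import math


def rho_exact(m, L):
    """Assumed exact survival probability for the geometric case (m != 1)."""
    return m**L * (m - 1.0) / (m ** (L + 1) - 1.0)


def rho_scaling(m, L, var_c=2.0):
    """Leading-order scaling law F(x) / (sigma_c^2 L), with x = L (m - 1) != 0."""
    x = L * (m - 1.0)
    return 2.0 * x * math.exp(x) / math.expm1(x) / (var_c * L)


def rho_first_order(m, L, var_c=2.0):
    """Scaling law times the first-order (1/L) correction factor."""
    x = L * (m - 1.0)
    u = math.exp(x) / math.expm1(x)
    factor = 1.0 - (1.0 - u) * x * x / (2.0 * L) - u * x / L
    return rho_scaling(m, L, var_c) * factor


def validity_range(eps, L):
    """Interval of m where the first correction stays below eps:
    1 - sqrt(2 eps / L) < m < 1 + eps."""
    return 1.0 - math.sqrt(2.0 * eps / L), 1.0 + eps


if __name__ == "__main__":
    L = 100
    for m in (0.97, 0.99, 1.01, 1.03):
        exact = rho_exact(m, L)
        print(m, rho_scaling(m, L) / exact, rho_first_order(m, L) / exact)
    print(validity_range(0.05, 40))
```

Ratios close to one indicate a good approximation; the corrected expression is expected to be markedly closer to the exact value than the bare scaling law when $x$ is of order one.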
This is just the ratio between the number of realizations that survive at $t = L$ and the number that are extinct at $t = L$. From Eq (17) we obtain

$$\frac{\rho(L)}{1 - \rho(L)} = \frac{m^{L}(m - 1)}{m^{L} - 1}, \tag{41}$$

and proceeding as in the preceding case, we get

$$\frac{\rho(L)}{1 - \rho(L)} = \frac{F(x)}{\sigma_c^{2} L}\, \frac{1 + \sum_n a_n}{1 + u \sum_n a_n} = \frac{F(x)}{\sigma_c^{2} L} \left[ 1 + (1 - u) \sum_n a_n - u(1 - u) \Big(\sum_n a_n\Big)^{2} + u^{2}(1 - u) \Big(\sum_n a_n\Big)^{3} + \cdots \right].$$

The factors $u^{k}(1 - u) = -e^{kx}/(e^{x} - 1)^{k+1}$ go to zero exponentially fast when $x \to \pm\infty$, except the first one ($k = 0$) when $x \to -\infty$, for which $1 - u \to 1$. This is the only contribution away from the critical point, and so (below the critical point) the correction to scaling goes as $-x^{2}/(2L)$. The range of validity of the scaling law is then given by

$$m > 1 - \sqrt{\frac{2\varepsilon}{L}}, \tag{44}$$

i.e., the scaling law is valid arbitrarily far from the critical point in the supercritical region, as the correction term there decays exponentially fast in $x$. If we keep $x$ finite and terms up to first order in $1/L$ we arrive at

$$\frac{\rho(L)}{1 - \rho(L)} \simeq \frac{F(x)}{\sigma_c^{2} L} \left[ 1 - (1 - u)\, \frac{x^{2}}{2L} \right]. \tag{45}$$

This can be verified in Fig 2, where the scaling law describes system sizes as small as $L = 10$ arbitrarily far from the critical point in the supercritical region.
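The broader validity of the alternative order parameter can be illustrated with a few lines of code. This is a sketch under assumptions: the exact expressions $\rho = m^L(m-1)/(m^{L+1}-1)$ and $\rho/(1-\rho) = m^L(m-1)/(m^L-1)$ are taken as following from Eq (17), the scaling function is taken as $F(x) = 2xe^x/(e^x-1)$, and the function names are hypothetical.

```python
import math


def scaling_prediction(m, L, var_c=2.0):
    """Common scaling-law value F(x) / (sigma_c^2 L), x = L (m - 1) != 0."""
    x = L * (m - 1.0)
    return 2.0 * x * math.exp(x) / math.expm1(x) / (var_c * L)


def rho_exact(m, L):
    """Assumed exact survival probability, geometric case (m != 1)."""
    return m**L * (m - 1.0) / (m ** (L + 1) - 1.0)


def odds_exact(m, L):
    """Assumed exact rho / (1 - rho) = m^L (m - 1) / (m^L - 1)."""
    return m**L * (m - 1.0) / (m**L - 1.0)


if __name__ == "__main__":
    L = 10  # a deliberately small system
    for m in (1.1, 1.5, 2.0):
        pred = scaling_prediction(m, L)
        print(m, pred / rho_exact(m, L), pred / odds_exact(m, L))
```

Far above the critical point the ratio for $\rho$ drifts away from one, whereas the ratio for $\rho/(1 - \rho)$ stays close to one even for $L = 10$, consistent with the range of validity stated above.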

Applicability to random walks
Thanks to a well-known mapping between branching processes and random walks [24,25], our finite-size scaling law is also applicable to the latter system. Concretely, a one-dimensional random walk can be obtained from the geometric Galton-Watson branching process by following the branches sequentially. Instead of considering that each generation $t$ of the process is generated in parallel from the previous one (as the identification of the index $t$ with time suggests) one changes the order in which offspring appear. The position of a walker in the tree associated with the branching process determines which element (which node of the tree) replicates.

Fig 1. (a) Comparison of the exact probability of survival, $\rho(L)$, given by Eq (17), with the approximations given by the scaling law, Eq (22), and by the scaling law with the first correction to scaling, Eq (40), for different $m$ and $L$. (b) The same with a logarithmic y-axis. (c) The same data, taking the ratio between the approximation given by the scaling law, $F(x)/(\sigma_c^{2} L)$, Eq (22), and the exact value of $\rho(L)$. Larger values of $L$ are included in this case. The program used to draw the figure is provided as S1 File. doi:10.1371/journal.pone.0161586.g001
The walker is initially located at the root (the element at the 0-th generation), and moves to one of the elements in the first generation (it does not matter which one). If this element has its own offspring, the walker moves to one of these, and so on. A branch is followed sequentially until the branch becomes extinct (the last element has no offspring), and then the walker moves back to the parent of the last element (from generation $t$ to $t - 1$); if this parent has more offspring the walker follows the branch of one of the remaining offspring; if not, the walker moves back to the previous parent (at generation $t - 2$) and so on. Note then that the walker passes twice through each link or edge between parent and offspring. If, arbitrarily, we consider that the root is at the bottom of the tree (as in real, biological trees!) and each new generation is one level above the previous one, the walker travels up and down through the whole tree.
The one-dimensional random walk is obtained from the projection of the position of the walker on the axis counting the number of generations, so the $t$-axis of the branching process becomes the spatial axis of the random walk. Then, the walker moves up with probability $q$ and down with probability $p$ (the parameters of the geometric distribution). Notice that the mapping is possible and exact because the number of offspring follows the geometric distribution, Eq (13).

Fig 2. The exact behavior is given by Eq (41), and the scaling law with the first correction to scaling is given by Eq (45). It becomes clear how the performance of the finite-size scaling law is even better than for $\rho(L)$, in particular for $m > 1$. The program used to draw the figure is provided as S1 File.
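The mapping can be made concrete with a small simulation. The sketch below is an illustration only, with hypothetical names: a walker steps up with probability $q$ (which creates a new child of the current node) and down with probability $p = 1 - q$ (which closes the current node), stopping when it steps below the root. Each node then ends up with a number of children distributed as $p q^k$, and the walk lasts exactly twice the number of elements minus one.

```python
import random


def walk_and_tree(q, rng, max_steps=10**6):
    """Run the walk until it leaves the root downwards (position -1).

    Returns (steps, n_nodes, max_height, offspring_counts), where
    offspring_counts lists the number of children of every node of the tree.
    """
    stack = [0]            # children created so far for each node on the current branch
    steps = 0
    n_nodes = 1            # the root
    max_height = 0         # highest generation reached
    offspring_counts = []
    while stack:           # the walker's position is len(stack) - 1
        if steps >= max_steps:
            raise RuntimeError("walk did not terminate within max_steps")
        steps += 1
        if rng.random() < q:
            stack[-1] += 1             # up-step: the current node gets one more child
            stack.append(0)            # and the walker moves onto that child
            n_nodes += 1
            max_height = max(max_height, len(stack) - 1)
        else:
            offspring_counts.append(stack.pop())  # down-step: the node is complete
    return steps, n_nodes, max_height, offspring_counts


if __name__ == "__main__":
    rng = random.Random(12345)
    steps, n_nodes, height, counts = walk_and_tree(0.4, rng)
    print(steps, n_nodes, height, counts)
```

A subcritical value such as $q = 0.4$ is used here so that the walk terminates with probability one; the `max_steps` guard is an assumption added only to keep the sketch safe for other parameter values.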
The finite-size condition imposed on the branching process translates into the existence of a reflecting boundary at $t = L$ for the random walk, and then, the probability of survival $\rho$ of the branching process turns out to be the probability of hitting the reflecting boundary, $P_{\rm hit}$, for the random walk. The walk also has an absorbing boundary at $t = -1$, where it dies (after a duration equal to twice the number of elements, minus one).
After all these considerations, the mapping is established, and we can write a finite-size scaling relation for the hitting probability,

$$P_{\rm hit} \simeq \frac{1}{\sigma_c^{2} L}\, F(x),$$

with $F(x)$ given by Eq (23) and

$$x = L(m - 1) = L\left(\frac{q}{p} - 1\right).$$

Remember that this is valid for large $L$ and close to the critical point $q = q_c = 1/2$, because $m = q/p$. In particular, the corrections to scaling of the previous section hold in exactly the same way when the relationships are written in terms of $m$ or $x = L(m - 1)$. In fact, the previous scaling law describes the probability that a random walk starting next to the absorbing boundary hits the other boundary, independently of the nature of the latter (reflecting or not), as it is only the first-passage time that matters. In this way, the one-dimensional random walk, the simplest system in statistical physics, displays a continuous phase transition with finite-size scaling, for which the corrections to scaling can be easily obtained as well.
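The hitting probability can also be computed directly for the walk, without reference to the branching process. The sketch below is a minimal illustration with hypothetical names: it solves the first-step equations $h(i) = q\,h(i+1) + p\,h(i-1)$ with $h(-1) = 0$ and $h(L) = 1$ for a walker starting at site 0, using the fact that the increments $h(i) - h(i-1)$ form a geometric sequence of ratio $p/q$. The comparison with $m^L(m-1)/(m^{L+1}-1)$ assumes that this is the closed form of Eq (17).

```python
def hitting_probability(q, L):
    """Probability that a walk started at site 0, stepping +1 with probability q
    and -1 with probability p = 1 - q, reaches site L before site -1.

    With d_i = h(i) - h(i-1), the first-step equations give d_{i+1} = (p/q) d_i.
    Taking d_0 = 1 (unnormalized), h(0) = d_0 / (d_0 + ... + d_L).
    """
    p = 1.0 - q
    ratio = p / q
    d = 1.0
    total = 0.0
    for _ in range(L + 1):
        total += d
        d *= ratio
    return 1.0 / total


def rho_branching(m, L):
    """Assumed closed form of the branching-process survival probability."""
    if m == 1.0:
        return 1.0 / (L + 1)
    return m**L * (m - 1.0) / (m ** (L + 1) - 1.0)


if __name__ == "__main__":
    L = 20
    for q in (0.45, 0.5, 0.55):
        m = q / (1.0 - q)
        print(q, hitting_probability(q, L), rho_branching(m, L))
```

The two columns are expected to coincide, which is the content of the mapping; for the fair walk ($q = 1/2$) both reduce to $1/(L + 1)$.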

The generalized geometric distribution
The previous analysis of the geometric Galton-Watson process in terms of fractional linear functions (see Appendix) suggests a generalization of the problem. We may consider the generalized geometric distribution, in which the zero-offspring probability, $P[K = 0]$, is released from following the geometric distribution and instead takes a free value $p_0$, which is a new parameter. The rest of the values of $K$ follow the geometric distribution, but rescaled by the factor $(1 - p_0)/(1 - p)$ (because of normalization). In a formula,

$$P[K = k] = \begin{cases} p_0 & \text{for } k = 0,\\ (1 - p_0)\, p\, q^{k-1} & \text{for } k = 1, 2, \dots, \infty, \end{cases}$$

and zero otherwise. We recover the usual geometric distribution for $p_0 = p$. The generating function is indeed a fractional linear function,

$$f(s) = \frac{p_0 + (p - p_0)\, s}{1 - qs}, \tag{49}$$

which yields $m = f'(1) = (1 - p_0)/p$ and $\sigma^{2} = (1 + p_0 - p)(1 - p_0)/p^{2}$. The critical point turns out to be at $p_c = 1 - p_0$. The analysis of Sec. 3 is fully applicable in this case, in particular Eq (16). We need to know that $s_0 = p_0/q$ and $\kappa = m^{-1}$ (see Appendix); in fact, we write $s_0$ as a function of $m$ and $p_0$, which is $s_0 = p_0 m/(m - q_0)$, with $q_0 = 1 - p_0$. Notice that we study the transition keeping $p_0$ fixed.
Substituting into the formula for the order parameter $\rho(L)$, Eq (16), we arrive at

$$\rho(L) = \frac{q_0 (m - 1)\, m^{L}}{(m - q_0)\, m^{L} - p_0 m}.$$

Introducing again the rescaled variable $x = L^{1/\nu}(m - 1)$, and taking the limit $L \to \infty$, the only non-trivial limit arises for $\nu = 1$. In this case, up to first order in $1/L$ and introducing the critical variance $\sigma_c^{2} = 2p_0/(1 - p_0)$, we get

$$\rho(L) \simeq \frac{1}{\sigma_c^{2} L}\, F\big(L(m - 1)\big),$$

which is the same scaling law as for the geometric case, with the scaling function $F(x)$ given again by Eq (23).
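The universality of the scaling law with respect to $p_0$ can be checked numerically. The sketch below is an illustration under assumptions: the generating function is taken as $f(s) = [p_0 + (p - p_0)s]/(1 - qs)$ with $p = (1 - p_0)/m$, the scaling function as $F(x) = 2xe^x/(e^x - 1)$, and the critical variance as $\sigma_c^2 = 2p_0/(1 - p_0)$; the function names are hypothetical.

```python
import math


def survival_generalized(m, p0, L):
    """Survival probability 1 - f^L(0) for the generalized geometric law,
    obtained by composing f(s) = (p0 + (p - p0) s) / (1 - q s) L times."""
    p = (1.0 - p0) / m      # from m = (1 - p0) / p; requires m >= 1 - p0
    q = 1.0 - p
    s = 0.0
    for _ in range(L):
        s = (p0 + (p - p0) * s) / (1.0 - q * s)
    return 1.0 - s


def scaling_prediction(m, p0, L):
    """Scaling law F(L (m - 1)) / (sigma_c^2 L) with sigma_c^2 = 2 p0 / (1 - p0)."""
    x = L * (m - 1.0)
    big_f = 2.0 if x == 0.0 else 2.0 * x * math.exp(x) / math.expm1(x)
    var_c = 2.0 * p0 / (1.0 - p0)
    return big_f / (var_c * L)


if __name__ == "__main__":
    L = 1000
    for p0 in (0.2, 0.5, 0.8):
        for x in (-2.0, 0.0, 2.0):
            m = 1.0 + x / L
            exact = survival_generalized(m, p0, L)
            print(p0, x, exact * 2.0 * p0 / (1.0 - p0) * L,
                  scaling_prediction(m, p0, L) * 2.0 * p0 / (1.0 - p0) * L)
```

The printed values of $\sigma_c^2 L \rho(L)$ should collapse, for each $x$, onto the same value of $F(x)$ irrespective of $p_0$, up to corrections of order $1/L$.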

Summary
We have presented here direct analogies between branching processes and thermodynamic phase transitions. We have considered the classical Galton-Watson model of branching processes when the number of offspring $K$ per element is given by the geometric distribution. This process has as natural control and order parameters the mean value of $K$ and the probability of survival $\rho$, respectively. We study finite-size effects by imposing an upper limit $L$ on the number of generations. After obtaining the exact expression for the equation of state, that is, the dependence of the order parameter on the control parameter, Eq (17), we introduce the rescaled distance to the critical point, $x = L^{1/\nu}(m - 1)$. When $\nu = 1$ we demonstrate that a finite-size scaling law, Eq (22), emerges in the limit $L \to \infty$.
In general, the theory of critical phenomena does "not explain why in some systems scaling holds for only 1-2% away from the critical point and in other systems it holds for 30-40% away" [26]. In particular, finite-size scaling should work when the system size tends to infinity and the control parameter approaches the critical point; nevertheless, in practice, finite-size scaling predictions turn out to apply to rather small systems at a non-negligible distance from the critical point [1]. We provide a quantitative derivation of these limits for the finite-size scaling behavior of the Galton-Watson process, Eq (39), thanks to the calculation of the corrections to scaling, Eqs (35) and (38), or Eq (40). If we define an alternative order parameter as $\rho/(1 - \rho)$, the same scaling law holds, but with a larger range of validity, given by Eq (44). In this case the corrections to scaling are given by Eq (35) below the critical point, or by Eq (45) in general.
A straightforward mapping between branching processes and random walks allows one to establish that all our results for the survival probability of a geometric Galton-Watson process are equally valid for the probability that a one-dimensional random walk, starting above but close to an absorbing origin and evolving through $\pm 1$ increments, reaches a distance to the origin equal to $L$. In this way, a subcritical Galton-Watson process corresponds to a random walk with a bias towards the negative ($-1$) increment, for which the hitting probability becomes zero as $L \to \infty$. On the other hand, the supercritical case corresponds to a random walk with a positive bias in the increment, for which there exists a non-zero probability that the walk never returns to the origin in the limit $L \to \infty$. Obviously then, the critical case is the one of a fair random walk. To the best of our knowledge, the one-dimensional random walk provides the simplest example of a system exhibiting a finite-size scaling law. Therefore, the analogies between branching processes and equilibrium phase transitions are totally applicable to the one-dimensional random walk.
constant $\kappa$ turns out to be, substituting $s_0$, identical to the inverse of the mean, i.e.,

$$\kappa = m^{-1}.$$

For the generalized geometric distribution, from its probability generating function, Eq (49), and from the definition of fractional linear functions, Eq (53), one establishes that $a = p_0$, $b = p - p_0$, $c = 1$, and $d = -q$, and the fixed point $s_0$ turns out to be

$$s_0 = \frac{p_0}{q},$$

which, for the particular case of the geometric distribution, defined by $p_0 = p$, turns into

$$s_0 = \frac{p}{q}.$$

The knowledge of the value of the fixed point $s_0$ leads to the explicit form for $f^{t}(s)$.

At the critical point, given by $m = 1$, it is necessary to follow a separate approach. For the generalized geometric distribution, the critical point is given by $p = 1 - p_0$, which, substituting into the probability generating function, Eq (49), leads to

$$f(s) = \frac{p_0 + (1 - 2p_0)\, s}{1 - p_0 s}.$$

Induction leads directly to

$$f^{t}(s) = \frac{(1 - p_0)\{t p_0 + [1 - (t + 1) p_0]\, s\}}{1 + (t - 2) p_0 - (t - 1) p_0^{2} - (t p_0 - t p_0^{2})\, s} \tag{64}$$

$$= \frac{t p_0 + [1 - (t + 1) p_0]\, s}{1 + (t - 1) p_0 - t p_0 s}, \tag{65}$$

from where the order parameter of the transition turns out to be

$$\rho(L) = 1 - f^{L}(0) = \frac{1 - p_0}{1 + (L - 1) p_0} \simeq \frac{1 - p_0}{p_0 L} = \frac{2}{\sigma_c^{2} L},$$

taking the limit of large $L$ and using the expression above for $\sigma_c^{2}$. This is in perfect agreement with the results obtained for $m \ne 1$. Note that the results for the geometric distribution are a particular case corresponding to $p_0 = p = 1/2$ at $m = 1$.
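The induction for the critical case can be checked with exact rational arithmetic. The sketch below is an illustration with hypothetical names: it assumes the critical generating function $f(s) = [p_0 + (1 - 2p_0)s]/(1 - p_0 s)$ and compares its repeated composition with the closed form $f^t(s) = \{t p_0 + [1 - (t+1)p_0]s\}/[1 + (t-1)p_0 - t p_0 s]$.

```python
from fractions import Fraction


def f_critical(s, p0):
    """Critical generating function (p = 1 - p0): (p0 + (1 - 2 p0) s) / (1 - p0 s)."""
    return (p0 + (1 - 2 * p0) * s) / (1 - p0 * s)


def f_t_closed(s, t, p0):
    """Closed form of the t-th composition at the critical point."""
    return (t * p0 + (1 - (t + 1) * p0) * s) / (1 + (t - 1) * p0 - t * p0 * s)


def check_induction(p0, s, t_max):
    """Return True if composing f_critical t times matches f_t_closed for t = 1..t_max."""
    value = s
    for t in range(1, t_max + 1):
        value = f_critical(value, p0)
        if value != f_t_closed(s, t, p0):
            return False
    return True


if __name__ == "__main__":
    p0 = Fraction(3, 10)
    print(check_induction(p0, Fraction(1, 7), 30))
    L = 10**6
    rho = 1 - f_t_closed(Fraction(0), L, p0)
    var_c = 2 * p0 / (1 - p0)
    print(float(rho * var_c * L))  # expected to be close to F(0) = 2
```

Using `Fraction` avoids any rounding, so equality between the composed and closed forms is tested exactly for the chosen values of $p_0$, $s$ and $t$.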