S2 Appendix for: The effect of cash transfer programs on educational mobility
Florian Chávez-Juárez

In this paper I develop a model to reproduce the phenomenon of high intergenerational correlations in education observed in Latin American countries. The model is based on empirical evidence and implemented through agent-based modeling techniques. The effect of conditional cash transfer programs on educational mobility is then analyzed. The results suggest that conditional cash transfer programs can substantially increase intergenerational mobility in education. I find that using parental education as an eligibility criterion and adapting the subsidies to the income level can improve the efficiency of a program in increasing educational mobility as compared to a purely income-based program.


S2.1 Overview of parameters
In the S3 Appendix I present sensitivity checks for some of the parameters, particularly those for which no empirical calibration was possible.

S2.2 Simulation strategy
In some agent-based models the results depend on the initial conditions, and it is therefore important to test the sensitivity of the model to them (Helbling, 2012). Fortunately, the model I present in this study is very insensitive to the initial conditions and quickly converges to a stable situation (steady state). A way to reduce the potential impact of initial conditions is to run the model for several periods and to use only the results obtained once the model is stable (Tesfatsion, 2002, 2003). I follow this approach and exclude the first periods from the analysis. Let us have a look at how the model behaves at the very beginning of the simulation. is independent of the initial conditions. To ensure that all results are completely independent of the initial conditions, I never use the first 10 periods of the simulation in the results. The vertical line in the graphs depicts this period, and we can see that at that point in time the model has already been stable for several periods. Hence, if the objective is to compare two steady states, we can simply simulate the two settings with the same random seed and compare the values at the steady state. By repeating this procedure for different random seeds we obtain a large number of observations at the population level. For instance, if we use 10 different random seeds and compare periods 11 to 50, we will have 10 × 40 = 400 data points at the population level.
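The pooling of steady-state observations across random seeds can be illustrated with a minimal sketch. The function `toy_run` is a hypothetical stand-in for one run of the model (a short transient followed by noise around a constant level); it is not the model of this study. Only the 50 periods per run, the 10 discarded periods and the 10 seeds are taken from the text.

```python
import numpy as np

N_PERIODS = 50   # total length of each run (from the text)
BURN_IN = 10     # initial periods that are never used (from the text)


def toy_run(seed, n_periods=N_PERIODS):
    """Hypothetical stand-in for one model run.

    Returns one population-level statistic per period: a transient that
    dies out within a few periods, plus noise around a constant level.
    This is NOT the model of the paper, only a placeholder series.
    """
    rng = np.random.default_rng(seed)
    periods = np.arange(1, n_periods + 1)
    transient = 0.3 * np.exp(-(periods - 1.0))
    noise = rng.normal(0.0, 0.01, size=n_periods)
    return 0.5 + transient + noise


def pooled_steady_state(seeds, burn_in=BURN_IN):
    """Drop the burn-in of every run and stack the rest into one sample."""
    runs = [toy_run(seed)[burn_in:] for seed in seeds]  # keeps periods 11..50
    return np.concatenate(runs)


sample = pooled_steady_state(seeds=range(10))
print(sample.size)  # 10 seeds x 40 periods = 400
```

To compare two settings, the same list of seeds would be passed to both, so that each pair of runs differs only in the setting and not in the random draws.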
When analyzing policy interventions, the strategy is slightly more complicated because more stages must be considered. If we introduced the policy from the beginning, the reaction to the initial conditions under the policy intervention might differ from the reaction in a government-free simulation. In such a case the starting point under the policy intervention would differ from that in the baseline, and we would confound the effects of the policy measure with this difference. To avoid this, I first simulate the model without policy interventions until the model is stable and then introduce the policy. Figure S2.2 depicts this using the example of the intergenerational correlation with the father. The gray markers refer to the simulation without any policy intervention, while for the black markers the policy intervention starts in period 25. Both simulations are based on exactly the same random seed. The solid line refers to a linear spline regression, which allows us to analyze whether there is a significant trend in the data at each stage of the simulation. In this case, the only slope that is significantly different from zero is that of the first spline, for periods one to three.
For the remaining splines we cannot reject the hypothesis of a steady state as the coefficients are not significantly different from zero. Now, let us have a closer look at the four stages indicated in the graph and divided by vertical lines.
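A linear spline regression of this kind can be sketched as follows. This is a minimal illustration and not necessarily the estimation routine used for Figure S2.2: the knot positions (3, 10, 25, 30), the function name `spline_slopes` and the toy series are assumptions made for the example. The function fits a continuous piecewise-linear trend by ordinary least squares and returns the slope of each segment together with its t-ratio.

```python
import numpy as np


def spline_slopes(t, y, knots):
    """Fit a continuous linear spline by OLS; return segment slopes and t-ratios."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept, t, and one hinge term max(t - k, 0) per knot.
    hinges = [np.maximum(t - k, 0.0) for k in knots]
    X = np.column_stack([np.ones_like(t), t] + hinges)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    # The slope of segment j is the coefficient on t plus the first j hinge coefficients.
    n_seg = len(knots) + 1
    A = np.zeros((n_seg, X.shape[1]))
    for j in range(n_seg):
        A[j, 1:2 + j] = 1.0
    slopes = A @ beta
    se = np.sqrt(np.einsum("ij,jk,ik->i", A, cov, A))  # diag(A cov A')
    return slopes, slopes / se


# Toy series (assumption): short transient, then noise around a constant level.
rng = np.random.default_rng(0)
periods = np.arange(1, 51)
series = 0.5 + 0.3 * np.exp(-(periods - 1.0)) + rng.normal(0.0, 0.01, size=50)

slopes, t_ratios = spline_slopes(periods, series, knots=[3, 10, 25, 30])
for j, (b, tr) in enumerate(zip(slopes, t_ratios), start=1):
    print(f"segment {j}: slope = {b:+.4f}, t = {tr:+.2f}")
```

A segment whose t-ratio is small in absolute value is one for which a zero slope, and hence a steady state, cannot be rejected.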
The whole simulation includes 50 periods, which are divided into four stages. The first stage is the initialization period of 10 ticks. The data from this first stage are never used for the presentation of results in this study to avoid any kind of initialization effects.
The second stage is a pre-intervention steady state phase. The third stage starts in period 25, when the policy intervention is introduced. The policy intervention period is divided into two sub-periods: the short-run and the long-run effects. We can see that once the policy measure is introduced, the model almost immediately jumps to the new steady state. Even though the effects are almost instantaneous, it can be interesting to analyze this short-run period in detail. In contrast, if we want to focus on the steady state comparisons, I start using the data points only five periods after the policy intervention was introduced. This rather short period can be justified by the almost immediate jump to the new steady state that we can observe in Figure S2.2. Figure 3 in the main body of the article is an example of such an analysis.
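The division of the 50 periods into four stages can be written down as a small lookup. The boundaries 10, 25 and 50 come from the text. The short-run window of periods 25 to 29 is my reading of "five periods after the policy intervention was introduced" and should be treated as an assumption; it leaves 21 periods in the last stage. The names `stage_of` and `SHORT_RUN_LENGTH` are hypothetical.

```python
from collections import Counter

INIT_END = 10         # last period of the initialization stage (from the text)
POLICY_START = 25     # first period with the policy intervention (from the text)
SHORT_RUN_LENGTH = 5  # assumed reading of "five periods after" the introduction
LAST_PERIOD = 50      # length of the whole simulation (from the text)


def stage_of(period):
    """Map a period (1..50) to its stage (1..4)."""
    if not 1 <= period <= LAST_PERIOD:
        raise ValueError(f"period must be in 1..{LAST_PERIOD}, got {period}")
    if period <= INIT_END:
        return 1  # initialization, never used in the results
    if period < POLICY_START:
        return 2  # pre-intervention steady state
    if period < POLICY_START + SHORT_RUN_LENGTH:
        return 3  # short-run effects of the policy
    return 4      # long-run (new steady state) effects


counts = Counter(stage_of(p) for p in range(1, LAST_PERIOD + 1))
print(dict(counts))  # {1: 10, 2: 14, 3: 5, 4: 21}
```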
I will now present an overview of the different simulation settings used to produce the tables and figures in the results section of this study. Table S2.2 provides an overview of the samples used to produce the different figures and tables in section 4. The column Runs per setting refers to the number of different random seeds used, and the column Used periods per run shows how many periods (ticks) were taken from each of them. The number of data points refers to the number of data points used for each statistic. A statistic is, for instance, one correlation, one point in a scatter plot, or one curve when using a non-parametric regression. The column Steady state comparison indicates whether the situations were compared in the short run (stage 3 in Figure S2.2) or at the steady state (stage 4 in Figure S2.2).

Table S2.2
Table or figure   Runs per setting   Used periods per run   Number of data points   Steady state comparison
Table 3           10                 40                     400                     yes
Figure 2          4                  20                     80                      yes
Figure 3          10                 21                     ≈ 210K                  yes
Note: The number of data points refers to each line.

The baseline results, where no policy intervention is introduced, are based on 10 different random seeds, and from each of them I consider 40 periods. Hence, each statistic presented in Table 3 is the average of 400 population statistics, each based on several thousand individuals. In addition to the results reported in this study, I also varied the number of data points considered to see whether this affects the results. In no case did changing the number of data points have an impact on the results.