Sharp increase in inequality in education in times of the COVID-19-pandemic

The COVID-19-pandemic forced many countries to close schools abruptly in the spring of 2020. These school closures and the subsequent period of distance learning have led to concerns about increasing inequality in education, as children from lower-educated and poorer families have less access to (additional) resources at home. This study analyzes differences in declines in learning gains in primary education in the Netherlands for reading, spelling and math, using rich data on standardized test scores and register data on student and parental background for almost 300,000 unique students. The results show large inequalities in the learning loss based on parental education and parental income, on top of already existing inequalities. The results call for a national focus on interventions specifically targeting vulnerable students.


Introduction
The COVID-19-pandemic of early 2020 interrupted or even completely halted the learning of children in many countries around the world. Globally, schools were closed for an average of almost 95 school days between March 2020 and February 2021 [1], which is equivalent to almost half a school year in countries where a school year is 40 weeks. In many western countries, schools continued to teach remotely. However, there were many challenges related to distance learning, such as access to digital learning devices and digital learning gaps [e.g., 2-4]. This prompted serious worries about social-emotional problems and learning loss. Despite the lack of adequate data in many countries, some studies appeared on the use of online learning tools by students [e.g., 5] and on the effect of distance schooling on performance and learning gains of students in primary education. Although some studies did not find significant learning losses [e.g., no effects on reading in the USA [6], no learning deficits in schools with a large share of students with advantaged backgrounds in Australia [7]], most studies report negative consequences of the school closures for children's educational development [Belgium [2], UK [4,8], Italy [9], Switzerland [10], Germany [11], USA [12-15], Norway [16]]. For higher education, the results are less consistent: some find negative effects [17] while others indicate that distance learning might have made students more efficient [18] or see little effect [19]. There are worries that some groups of students experienced lower learning gains due to the school closures and the COVID-19-pandemic than others. Our hypothesis is that the school closures and the pandemic resulted in increased inequality in skill development for students from specific backgrounds (socio-economic status, income and migration background). There is reason to believe that inequalities have indeed increased due to the school closures.
For instance, in the Netherlands especially lower-educated parents felt less capable in helping their children with their schoolwork [20,21]. In the United Kingdom, we see that middle class parents spent more time on home schooling than parents from the working class [22,23]. If this is the case, and these learning losses persist, they can be detrimental for development of skills in the long run, and in turn lead to an increase of the existing inequalities in opportunities in education and on the labor market [24].
Previous studies on inequalities based on socioeconomic background variables in students' learning gains during the school closures were hampered by data limitations. Some were limited by their data on educational performance: they used relatively small samples, focused on a specific region rather than a nationally representative sample, or were limited to only one grade level or subject [6,8,9,12,13,16]. Others had limited information on students' background characteristics. They used school-level indicators [2,7,11] or relatively uninformative categories. For example, a recent study based on Dutch data was only able to distinguish between families in which at least one parent had a lower secondary degree (92%) and families in which both parents had less than a lower secondary degree (8%) [3]. Our study improves upon these studies in several ways: 1) as a result of the widespread use of standardized testing in the Netherlands, we have a large sample of students who were tested shortly before and after the first lockdown, 2) we have rich student background information at the individual level, including multiple student background variables that indicate whether a student is disadvantaged or not, based on meaningful and informative categories, and 3) we focus on effects for separate grade levels and three different subjects (reading, spelling and mathematics), which show large variation, instead of only looking at overall effects or one subject. Therefore, in the study at hand, we are able to look in greater detail at background differences between students and present results showing that the learning loss due to the school closures is unequally distributed: students from disadvantaged backgrounds have suffered much more than their fellow students. To show this, we use standardized test score data from the Netherlands and link this to register data on student and parental background for primary school students.

COVID-19 educational policy changes in the Netherlands
Although compulsory education starts at age 5, Dutch children generally enter primary school at age 4. They remain in primary school up to age 12, after which they enter secondary school and are tracked according to their ability. Almost all schools in the Netherlands are public schools (99%) funded by the Ministry of Education, Culture and Science [25].
On February 27, 2020, the first COVID-19 patient was reported in the Netherlands. Primary schools closed on March 16, 2020 and reopened on May 11, 2020. Vulnerable children, and children of parents with essential occupations who could not work from home, were allowed to come to school during the school closure. However, these children usually followed the same program as the children who had to stay at home and comprised only around 5% of all children in this first period of school closure. Up to June 7, 2020, children only went to school half of the time. In this way, groups were smaller and it was easier to keep distance. From June 8 onwards, schools went back to their usual schedule. Children and teachers were still urged to stay at home when they showed any symptoms associated with COVID-19.
The Netherlands was relatively well equipped for online education, as a total of 96% of Dutch households have internet access at home [26]. Additionally, the Dutch government made 2.5 million euros available in March to support online learning. This money was used to buy laptops and/or to provide internet access for 7,000 students. This money was supplemented with another 3.8 million euros in May 2020. In total, over 16,000 laptops and tablets were financed in this way [27]. Nevertheless, the school closure happened relatively suddenly, with no time to prepare. Teachers had to improvise, students suddenly had to structure their own school day, and parents had to act as teachers for their children. Although we do not know exactly how much education children received while schools were closed, there are strong indications that children spent less time on their education than usual. Studies in Germany and Switzerland report a considerable reduction in studying time during school closings [28,29]. Moreover, a survey among Dutch parents revealed that parents, especially in disadvantaged families, often did not feel equipped to support their children during the school closing [20].
Children in countries with longer school closings and less internet access might have experienced larger learning losses and larger inequalities because they experienced prolonged periods of limited and unequal access to education. In line with this hypothesis, a recent study in Italy [9] reports larger learning losses (0.19 SD) than previous studies in the Netherlands (0.08 SD) [3]. In Italy, schools were closed for 15 weeks (one of the first and longest school closings in Europe). Moreover, Italy has one of the lowest shares of households with a broadband connection [30], and 12% of the students between 6 and 17 years old did not have access to a computer or digital tools at home in 2018/2019 [31]. However, contradicting the idea that longer school closings result in larger learning deficits, a study in Belgium, where schools closed for 8.5 weeks, reports a reduction in mathematics scores of 0.19 SD [2], which is similar in size to the effects found in Italy (15 weeks). Altogether, more research based on country comparisons is needed to be able to state that longer lockdowns result in larger learning deficits and an increase in educational inequalities.
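To make the comparison above concrete, the reported effect sizes can be divided by the length of the closure. The sketch below is only a crude illustration: the effect sizes and the Italian and Belgian closure lengths are the figures quoted above, the Dutch closure length of 8 weeks is derived from the closure dates given earlier (March 16 to May 11, 2020), and the calculation ignores differences between the studies in subject, grade and test timing. The function and variable names are hypothetical.

```python
from datetime import date

# Closure length in the Netherlands, derived from the dates in the text
# (assumption: only the full closure is counted, not the half-time period).
nl_weeks = (date(2020, 5, 11) - date(2020, 3, 16)).days / 7

# (reported learning loss in SD, weeks of school closure)
closures = {
    "Italy [9]": (0.19, 15.0),
    "Belgium [2]": (0.19, 8.5),
    "Netherlands [3]": (0.08, nl_weeks),
}


def loss_per_week(loss_sd, weeks_closed):
    """Average reported learning loss (in SD) per week of school closure."""
    return loss_sd / weeks_closed


rates = {country: loss_per_week(*values) for country, values in closures.items()}

for country, rate in rates.items():
    print(f"{country}: {rate:.4f} SD per week of closure")
```

Under these assumptions the loss per week differs considerably between Belgium and Italy, which is consistent with the point that closure length alone does not account for the size of the reported deficits.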

Materials
In the Netherlands, students take standardized tests from grade 1 to grade 6 in primary education. These standardized tests come from different suppliers, with the largest supplier being CITO, with which we collaborated for this paper. Furthermore, schools use administration systems to store the information about the standardized test scores. Three administration systems exported the data on standardized test scores from school year 2013/2014 onwards as part of the Netherlands Cohort Study on Education (NCO) project, a national project initiated by the Dutch Research Council (for a description of this project, see [32]). With permission of the schools, the administration system exports the data on the standardized test scores to Statistics Netherlands, which pseudonymizes the student-id and school-id. Before any data was exported, parents were informed about the project and data export by the school, and were given the opportunity (during 4 to 6 weeks) to object to the export of their child(ren)'s data (by informing the school in writing or orally). The school registered any objections in their administration system, and data was not exported for those students whose parents objected.
The data was collected over a period of three months with two exports from the administration systems: the first export took place on the 30th of November 2020, the second on the 18th of January 2021. These exports cover school years 2013/2014 to 2019/2020 and gradually include more students in more grades. For more information, see Table 1. In total, information on standardized test scores was gathered for 291,635 unique students in 1,319 schools. After cleaning the data, the total sample used for the analyses in this paper comes down to 201,819 students in 1,178 schools.

National standardized test scores
From grade 1 to grade 6, students take standardized tests twice a year: a midterm test, most often administered in January and February of the school year, and an end-of-term test, mostly administered in June and July, right before the summer holidays. For most schools, these are digital tests. Some schools opt for the pen-and-paper version. Due to the school closure in the spring of 2020, for the school year 2019/2020 the end-of-term test could be postponed until after the summer holidays, which many schools did: about a quarter of schools decided to test their students after the summer holidays in August, September or even October. Test supplier CITO recalculated the test scores from August, September and October to account for the extra time until the test, and to make them comparable to the test scores of students who took the test before summer. Note that the tests written during the pandemic were exactly the same type and format as before the pandemic, and there is no within-school variation in the type and format of the tests before and during the pandemic.
We use test scores in the domains reading, spelling and math. Table 2 shows the number of test records and unique students per domain. The test in math contains both abstract problems and contextual problems that describe a concrete task. The reading test assesses the student's ability to understand written texts, including both factual and literary content. Lastly, the test in spelling asks students to write down a series of words (no verbs), demonstrating that they have learned the spelling rules. For reading, there is no midterm test in the first grade; therefore, the learning gains between the midterm test and end-of-term test cannot be calculated for grade 1.
The learning gains are defined based on the standardized test scores and are calculated by subtracting the score on the midterm test from the end-of-term test of each domain within a school year, with the condition that the student must have taken a midterm and end-of-term test within the same school year at the same school. To remove the influence of outliers, the top and bottom 1% of the absolute learning gains scores are not included in the analyses.
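The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analyses: the record layout (keys `student`, `school`, `year`, `domain`, `test`, `score`) and the function names are hypothetical, and "absolute learning gains" is read here as the raw (unstandardized) gain rather than its absolute value.

```python
from collections import defaultdict


def learning_gains(records):
    """Return end-of-term minus midterm score per student, school, year and domain.

    A gain is only defined when both tests were taken in the same school
    year at the same school, as required in the text.
    """
    tests_by_key = defaultdict(dict)
    for r in records:
        key = (r["student"], r["school"], r["year"], r["domain"])
        tests_by_key[key][r["test"]] = r["score"]  # test is "mid" or "end"

    return {
        key: tests["end"] - tests["mid"]
        for key, tests in tests_by_key.items()
        if "mid" in tests and "end" in tests
    }


def trim_extremes(gains, share=0.01):
    """Drop the lowest and highest `share` of the gains (1% each by default)."""
    ordered = sorted(gains.items(), key=lambda item: item[1])
    k = int(len(ordered) * share)  # number of observations cut at each end
    kept = ordered[k:len(ordered) - k] if k else ordered
    return dict(kept)
```

Because the school is part of the key, a student who took the midterm at one school and the end-of-term test at another does not receive a learning gain, which mirrors the condition stated above.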

Student background variables
In the secured virtual environment of Statistics Netherlands, standardized test scores can be matched to background information of the students and their parents. Note that the data in the environment of Statistics Netherlands are pseudonymized such that data are fully anonymous to the researchers that use these data. The data on background information that we use are the highest education level and highest income of parents, student migration background and student gender. Parental education is defined as (1) low when the highest degree obtained by (one of) the parents is in pre-vocational secondary education (vmbo b/k) or upper secondary vocational education level 1 (mbo 1), or when the parent completed only grades 7 to 9 of pre-vocational secondary education (vmbo gl/tl), senior general secondary education or university preparatory education; (2) middle when the highest degree is in upper secondary vocational education level 2, 3 or 4, or when the parent completed senior general secondary education or university preparatory education; and (3) high when the highest degree is from a university of applied sciences or higher. This division of parental education over three categories is also used in the Netherlands Cohort Study on Education and leads to categories that are not only relevant at the content level, but also provide us with groups large enough to have statistical power. Highest parental income is defined as (1) low when the highest income of one of the parents is below the minimum income level, (2) middle when the income is higher than the minimum income level but below twice the minimum income level, and (3) high when the income of one of the parents is higher than twice the minimum income level. Students' migration background is defined as either a Dutch or western background, or a non-western background.
Students with a Dutch or western background are combined into one category because the data contains only very few students with a western background, and the results of these two groups are very comparable. In terms of parental education and household income, students in our sample with a non-western migration background are more likely to come from households with relatively low educated parents (26% compared to 6% for the native Dutch and western migrant student sample) and a relatively low income (45% compared to 16%). Lastly, the gender of the student is defined as male or female.

Representativeness
The data on standardized test scores are only available for schools who gave permission to export the test scores from their administrative system to Statistics Netherlands. As a result, we do not have full population data and consequently selectivity of the sample might play a role.
In the school year 2019/2020, there were a total of 6,174 primary schools in the Netherlands. The 1,178 schools in our sample therefore comprise 19% of the total number of schools. Two main sources of selectivity into the sample can be identified. First, the schools that decided to participate in the data collection might not be random. In exchange for sharing the standardized test scores, schools received a report on the performance of their school relative to other schools with a comparable student population. We can expect that active schools, which are keen to monitor their progress, are especially interested in the reports and more likely to participate in the data collection project. Second, not every student is tested. Schools tend to exclude students who are absent (e.g., due to illness) or have a very large learning loss. For these students, schools feel a test is not possible or useful. Usually, the number of students per school who are excluded from the standardized tests is relatively small. However, in 2020, after the schools had been closed for several weeks, more schools decided to skip the standardized tests for a larger share of the student population. It is reasonable to assume that students with larger learning losses are less often tested. Therefore, it is likely that our data is not representative of the whole population, and additional tests on our sample in comparison to the full population confirm this. Table 3 shows the representativeness of our sample in comparison to the full population (based on the National Cohort Study on Education [32]) on student and school background characteristics. Overall, we see that students with a non-western migration background and students with low parental income are over-represented in our sample. Furthermore, schools in our sample tend to be larger schools located in more urbanized areas.
To limit the impact of selectivity and over-representation of certain students and schools, we use inverse probability weights. In calculating the weights, we use population data on all students enrolled in Dutch primary education and calculate the probability of being in our test score dataset separately per academic year, grade, and test subject domain as a function of students' observable characteristics. These characteristics are parental education, income, migration background, gender, percentage of students with low-educated parents at the school, number of students at the school, urbanization level (based on location of the school), province (based on location of the school) and school denomination. Table 4 shows the unstandardized learning growth for the three domains for the 2 years before the pandemic and the year of the pandemic separately. It also shows the learning growth split by group of parental education. Table 4 is used to calculate the normal average learning growth per week in the 20 weeks between the midterm and end-of-term test, and the deviation from this in the COVID-19-year. For example, if the normal learning growth for reading is 7 (in 20 weeks' time), and during the pandemic it is only 5, the decline in learning growth in weeks is 20 - (20/7) × 5 = 5.7.
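The two calculations in this paragraph can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used for the paper: the text does not specify how the inclusion probability is modeled, so the sketch estimates it as the sampled share within each cell of observed characteristics, and it assumes every sampled student also appears in the population data. All function and key names are hypothetical.

```python
from collections import Counter


def inverse_probability_weights(population, sample, keys):
    """Weight each sampled student by 1 / P(in sample | characteristics).

    The probability is estimated per cell of the characteristics in `keys`
    as (sampled students in cell) / (population students in cell), so the
    weight equals population count divided by sample count.
    Assumes every cell observed in the sample also occurs in the population.
    """
    def cell(row):
        return tuple(row[k] for k in keys)

    population_counts = Counter(cell(r) for r in population)
    sample_counts = Counter(cell(r) for r in sample)
    return [population_counts[cell(r)] / sample_counts[cell(r)] for r in sample]


def delay_in_weeks(normal_gain, covid_gain, weeks=20):
    """Convert a lower learning gain into weeks of delay.

    Follows the worked example in the text: weeks - (weeks / normal_gain) * covid_gain.
    """
    return weeks - (weeks / normal_gain) * covid_gain


# Worked example from the text: normal gain 7, pandemic gain 5, 20 weeks.
print(round(delay_in_weeks(7, 5), 1))
```

With cell-based weights, the weights of the sampled students in a cell add up to the number of students in that cell in the population, which is what makes the weighted sample resemble the population on the chosen characteristics.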

Methods
In order to estimate the effect of the COVID-19 related school closure on students' learning gain, we compare the learning gain between the midterm and the end-of-term test of the COVID-19-exposed cohort (2019/2020) to the learning gain of students from the two previous cohorts using OLS regressions. To account for potential differences in observable characteristics between students of different cohorts, we add controls for student gender, student household income, migration background, and parental educational background. Further, since for some students of the 2019/2020 cohort the end-of-term test was postponed until the start of the next academic year, we add a dummy indicating whether the test was taken at the end of the 2019/2020 academic year or at the beginning of 2020/2021. The resulting regression equation resembles a difference-in-differences design. In this equation, Δy_ij stands for the difference in achievement between the end-of-term test and the midterm test for student i in grade j, T_ij is an indicator for the COVID-19 exposed 2019/2020 cohort, X_ij is a vector consisting of the aforementioned control variables, and ε_ijs is the school-level clustered error term. β is our coefficient of interest, which captures the difference in average learning gain between the COVID-19 exposed 2019/2020 cohort and the average learning gain of the (pooled) preceding two cohorts (2017/2018 and 2018/2019). Identification of the COVID-19 effect hinges on the assumption that the learning gain of the different cohorts would have followed a similar trend in the absence of the pandemic. While this assumption is fundamentally untestable, we can provide supporting evidence for it by looking at the variability of learning gains for all grades over time. If these trends are stable, we can be reasonably sure that the difference between the 2019/2020 cohort and the previous two cohorts was caused by the impact of the pandemic.
The results of these analyses are discussed in the section on trends in learning gain over time. In order to estimate the heterogeneous impact of the COVID-19-pandemic along student background characteristics, we add an interaction between the treatment dummy and the student characteristic of interest to the regression. In the resulting equation, C_ij stands for one of the aforementioned student characteristics: gender, parental education, household income, and migration background. The vector of control variables X_ij still includes all other student characteristics. As a robustness check, we also present results of analyses where, apart from the interaction, we do not include any of the other control variables, with similar results (see Tables 5-10). Finally, a concern could be raised that some of the student characteristics we observe are capturing similar things. For example, parental education and household income are likely to be strongly correlated. In order to isolate the additional impact of COVID-19 along (for example) household income, we therefore run analyses where we control for the interaction between parental education and the treatment dummy in addition to the interaction with household income. In addition to household income, we do this for student gender and migration background as well. In the resulting equation, E_ij stands for the highest level of obtained parental education. As mentioned before, in our main specification we use inverse probability weighting to obtain results representative of the whole Dutch primary school population. As a robustness check, we run the same analyses without employing weights as well as using entropy-balancing weights ensuring covariate balance between the COVID-19 exposed cohort and the control cohorts (similar to the method used by [3]), and obtain similar results (see Tables 5-10).
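The three specifications described in this section can be written out as a sketch that is consistent with the symbol definitions in the text. The intercept and coefficient names other than β (α, γ, λ, δ, μ, θ) are notational assumptions, as is the choice to treat the dummy for a postponed end-of-term test as part of X_ij. In the second and third equations X_ij is taken to exclude the variables that are written out separately, and for categorical characteristics the coefficients on C_ij and E_ij are vectors.

```latex
\begin{align}
\Delta y_{ij} &= \alpha + \beta\, T_{ij} + \gamma' X_{ij} + \varepsilon_{ijs} \\
\Delta y_{ij} &= \alpha + \beta\, T_{ij} + \lambda\, C_{ij}
                 + \delta\,(T_{ij} \times C_{ij}) + \gamma' X_{ij} + \varepsilon_{ijs} \\
\Delta y_{ij} &= \alpha + \beta\, T_{ij} + \lambda\, C_{ij}
                 + \delta\,(T_{ij} \times C_{ij}) + \mu\, E_{ij}
                 + \theta\,(T_{ij} \times E_{ij}) + \gamma' X_{ij} + \varepsilon_{ijs}
\end{align}
```

In this notation, δ captures how the COVID-19 effect differs by the characteristic C_ij, and in the third equation it does so conditional on the differential effect by parental education, θ.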

Results
In this section, we show the consequences for inequality during the COVID-19-pandemic. Looking at the different grades, we see a gradual increase in the learning loss from grade 1 to grade 5 across all domains, with some outliers, such as spelling in grade 4. For reading, we see that students experience about 0.06 to 0.20 SD learning loss compared to students from previous cohorts. Spelling shows a similar result, with students experiencing about 0.13 to 0.18 SD learning loss. Math shows the largest deficits in learning, with on average 0.13 (grade 1) to 0.33 (grade 5) SD learning loss.
Although most students learned less in 2019/2020 than their peers in previous cohorts, some students show larger learning loss than others, leading to (increasing) inequality between students. We look at four dimensions of inequality: (1) by parental education, (2) by family income, (3) by migration background and (4) by gender. Fig 2 shows that children with low-educated parents learned less between the midterm and end-of-term test than their peers with high-educated parents, and that the differences are largest in grades 1, 2 and 3, and for spelling and math. (Note that alternative specifications in which we use four categories of parental education, or in which we use three categories which are not based on parental education but on the indication, used for funding purposes, of whether a child is a regular child, has a disadvantaged background or has a very disadvantaged background, yield very similar results and the same conclusions.) The results show, for example, that children of high-educated parents experience about 0.1 SD more learning gains during the year 2019/2020 compared to children of low-educated parents. In other words, the learning loss due to school closings is larger for students of low-educated parents, and inequalities have grown because of this. The differences between students of high- and low-educated parents are statistically significant for spelling and math but not for reading, implying that the role of parental background in educational development during the first school closure due to the COVID-19-pandemic is largest for math and spelling, and less pronounced for reading comprehension. Altogether, these findings show that the existing differences in learning gains based on parental education prior to the COVID-19-pandemic have increased during the spring of 2020 when learning was disrupted.
These increased differences based on parental education are not surprising, since students were more dependent on the help their parents could provide with schoolwork during the school closure. This finding is also confirmed by other studies: parents in the Netherlands with lower educational attainment felt less capable to help their children with schoolwork [20,21].
Parental income also plays a role: Fig 3 shows that children from medium and higher income households increase their scores between the midterm and end-of-term test more strongly in the COVID-19-year than their peers from a family with a lower household income, with the largest effects in grades 2 and 3, and for spelling and math. For example, children from medium and high income households experience about 0.05 SD more learning gains than children in low-income households. Note that the relation between income and learning gains comes on top of the additional role of parental education during the pandemic. We explicitly take into account the effect of parental education on learning gains and its additional role during the pandemic, and we still see an effect of household income during the school closure. It is not surprising that we find an effect of parental income on top of the effect of parental education: parents with higher household income were more likely to be able to afford additional help for their children during their time at home. One study suggested that they provide more private access to additional online learning materials [33].
We also looked at the role of migration background, again on top of effects of parental education. The results in Fig 4 indicate that, conditional on the effect of parental education, overall, students with a non-western migration background did not perform significantly worse than other students (native and with a western migration background) during the COVID-19-pandemic. We only find a small significant result for math for grades 2 and 3, and for all grades taken together. Note that, if we do not condition on effects of parental education, we do find significant differences for migration background. Hence, overall, we find that the increased inequality during the pandemic is based on parental education and parental income rather than on migration background.
Lastly, we find that there are no significant differences for gender, neither with nor without controlling for the effect of parental education. Girls seem to perform slightly worse on reading and math but the coefficients are small and almost only statistically significant when all grade levels are taken together (see Table 14).

Robustness checks
Our preferred model, used throughout the main text, includes inverse probability weights to obtain results that are representative for the entire Dutch primary school population, and we compare the learning gains between the midterm and the end-of-term test in the COVID-19-year to the gains in the two years prior. Furthermore, in the analyses mapping the disparate impact of COVID-19 along several student characteristics we control for all other background characteristics. These choices could potentially influence our results and their interpretation. Therefore, in this section we show the results of additional analyses where we change the specification of our main model.
To demonstrate how the choice of including inverse probability weights influences the results, Tables 5 and 6 show the results when using entropy-balancing weights and unweighted regressions, respectively. While there are some slight differences in terms of significance levels of certain coefficients, the overall pattern of lower learning gains during the COVID-19-year for students from low-income households and low educated parents, especially in spelling and math, remains similar in magnitude.
In Table 7 we run our main specification without controlling for student gender and migration background, and the dummy accounting for whether the end-of-term test was taken at the end of the school year of 2019/2020 or at the beginning of the 2020/2021 school year. It could be that the interaction effects found on student household income and parental educational background only hold conditional on these other student characteristics. If so, this complicates the interpretation of our results. Fortunately, the exclusion of these additional control variables does not change the found associations. Table 8 shows how the results change when we control for students' learning gains that they obtained in the previous year. Including prior performance helps in addressing potential differences between cohorts in the trend of their cognitive development. However, the downside of this specification is that we do not observe prior performance for students that are in the first grade (or second grade for the reading domain), and they drop from the analyses as a result. For the other grades, the results are similar to the main specification. Prior learning gains are significantly positively related to later learning gains for all but one subgroup (grade 5 spelling), but its inclusion does not alter the size and significance of the main results. Table 9 adds school fixed effects to the regression. With this we control for the possibility that time-invariant factors at the school level are driving our results. This could be the case, for example, when students are strongly sorted into schools according to their background characteristics. In this case the association between student characteristics and learning gains could be driven by (unobserved) differences between schools that house different kinds of student populations. The results from Table 9 however show that including school fixed effects does not change the patterns of the found associations.
Finally, Table 10 shows the results of our main specification as well as the previously discussed robustness checks for the pooled sample over all grades. This table further demonstrates that while there are some slight differences in terms of significance between specifications when looking at grades separately, the overall picture of the disparate impact of COVID-19 on student learning gains along household income and parental education levels remains strong and is robust to various alternative model specifications.

Trends in learning gain over time
The interpretation of the differences in learning gains between the cohort affected by the COVID-19 induced lockdown and previous cohorts as attributable to the impact of COVID-19 hinges on the assumption that learning gains would have been similar in the absence of the pandemic. While this assumption is untestable, we can provide supporting evidence for it by looking at the variability of learning gains for all grades over time. If these trends are relatively stable, we can be reasonably sure that the difference between the 2019/2020 cohort and the previous cohorts was caused by the impact of the pandemic. Figs 5-7 show the trends in learning gains per grade over time by plotting the (inverse probability population weighted) unstandardized learning gains for all available cohorts in reading, spelling, and math, respectively. Because our data does not go back equally far for all grades, the lines are of different lengths. As noted earlier, we do not have information on grade 1 learning gains for the reading domain, as grade 1 students do not take a midterm test for this domain. For higher grades, we also have fewer available cohorts due to the manner in which data was collected (see also Table 1). The figures clearly show a marked decrease in learning gains for the COVID-19 affected cohort of the school year 2019/2020 relative to the prior cohorts in all domains and for most grades. For the spelling and math domains, learning gains prior to the COVID-19 cohort were remarkably stable over time across grades 1 through 4. For grade 5, the 2018/2019 cohort had somewhat lower learning gains than the 2017/2018 cohort. However, since these are the only two pre-COVID-19 cohorts for which grade 5 data is available, it is unclear whether this represents a random fluctuation between cohorts or part of a longer trend of declining grade 5 learning gains. For reading, the results are less clear.
Grades 2 and 3 show a less stable trend over time than the other grades. For the 2017/2018 and 2018/2019 cohorts that make up the control group in our main estimation sample, however, the differences in learning gain between the two cohorts are relatively small for all grades.
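The stability check described above can be illustrated with a minimal Python sketch. It computes a weighted mean learning gain per grade and cohort and then the spread of those means across the pre-COVID-19 cohorts. The function names, the record layout, and all numbers are hypothetical and serve only as an illustration; they are not the study's data or code, and the weights merely stand in for inverse probability population weights.

```python
from collections import defaultdict


def weighted_mean_gains(records):
    """Return {(grade, cohort): weighted mean learning gain}.

    Each record is (grade, cohort, gain, weight). The weight stands in for
    an inverse probability population weight (made-up values here).
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # key -> [sum(w * gain), sum(w)]
    for grade, cohort, gain, weight in records:
        acc = sums[(grade, cohort)]
        acc[0] += weight * gain
        acc[1] += weight
    return {key: wg / w for key, (wg, w) in sums.items()}


def pre_covid_range(means, grade, covid_cohort="2019/2020"):
    """Spread (max - min) of mean gains across pre-COVID cohorts of a grade."""
    values = [m for (g, c), m in means.items()
              if g == grade and c != covid_cohort]
    return max(values) - min(values)


# Made-up illustrative records: (grade, cohort, gain, weight)
records = [
    (4, "2017/2018", 10.0, 1.0),
    (4, "2017/2018", 12.0, 3.0),
    (4, "2018/2019", 11.0, 1.0),
    (4, "2018/2019", 12.0, 3.0),
    (4, "2019/2020", 8.0, 1.0),
    (4, "2019/2020", 9.0, 1.0),
]

means = weighted_mean_gains(records)
spread = pre_covid_range(means, grade=4)
drop = means[(4, "2018/2019")] - means[(4, "2019/2020")]
```

In this toy example the pre-COVID-19 means differ only slightly from each other, while the drop to the 2019/2020 cohort is much larger. That is the kind of pattern that would support the stable-trend assumption.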
A different way of showing whether the 2019/2020 COVID-19 affected cohort is somewhat of an outlier in terms of their regular learning gain, is to plot the prior performance of this Looking at the figures, this is indeed what we see for spelling and math. Both cohorts are on remarkably similar learning gain trajectories from grade 1 through grade 4. In grade 5, the 2019/2020 cohort experiences a stronger decline in learning gain, especially in math, than the students of the previous cohort that were unaffected by the pandemic. For reading, the results are again less stable. The 2018/2019 cohort experienced a stronger decline in learning gain in grade 3 relative to the other grades and the 2019/2020 cohort. The overall pattern of decreasing learning gain from grade 2 to grade 3 and increasing learning gain from grade 3 to grade 4 is visible for both cohorts, however, and the grade 5 learning gain of the pandemic-affected 2019/2020 cohort does decrease more strongly than the learning gain of the prior, unaffected cohort.

Conclusions
This study describes the additional inequality in learning gains of primary school students in the Netherlands during 12 weeks of disrupted learning due to the COVID-19-pandemic for three domains: reading, spelling and math. We show large inequalities in the learning loss. These results are quite alarming and indicate an average delay in learning of about 5.5 weeks for reading and around 3 weeks for spelling and math, with larger deficits in the higher grades. Relative to the roughly 20 weeks between the midterm and end-of-term tests, this is substantial. It is to some extent reassuring that, in general, the decline in learning gains does not take place in the lowest grades, in which the foundation for math and language skills is laid [34]. On the other hand, the decline in learning gains is larger for students from a low socioeconomic background (lower parental education and household income) for spelling and math, and these inequalities are larger in the early grades. We see a delay of around 4 weeks for spelling and math for students with low-educated parents. This implies that during the school closure period students with low-educated parents hardly learned anything. However, there are no statistically significant socioeconomic status differences in reading scores. Previous research has shown that the home environment is important for the development of literacy skills and reading motivation [35,36]. In line with this finding, some have also suggested that center-based reading interventions might be less effective than mathematics interventions [37,38]. Our finding that reading skills are less affected by the school closure supports the idea that the family environment plays an important role in the development of reading skills, also when schools are open. In contrast, the limited increase in socioeconomic inequalities in reading skills is not in line with our expectations.
Normally, family environment is an important source of inequality in reading skills [36], and we would expect inequalities to rise when schools closed and the role of family environment increased. We attribute the additional inequality in learning loss in math and spelling by parental education and income to the better resources available to students from more advantaged families: students with higher-educated parents most likely all possessed a laptop, had parents who were able and willing to help with schoolwork, and could even afford additional private tutoring if needed. The results call for a national focus on reducing the learning loss of students with lower-educated parents and lower household income. It is worrisome, and unfortunately not unlikely, that the increased inequalities in learning loss due to the pandemic may lead to long-lasting inequalities, deepening the gap in adult outcomes between groups in the population. This underlines the need for targeted interventions to reduce the current inequalities in learning loss caused by the pandemic.
Notes to the regression tables (including Table 11, underlying regression results main effects (Fig 1), and the table referring to Fig 2): The outcome variable, learning gain between the midterm and the end-of-term test, has been standardized within grade using the inverse probability population weighted means and standard deviations. The baseline category for "COVID-19 year" is the pooled students from the 2017/2018 and 2018/2019 school years. The baseline categories for household income and parental education are "low"; where reported, the baseline category for gender is "boy" and for migration background "no migration background". Where included, additional controls are student gender, students' migration background, a dummy indicating whether students took their end-of-term test at the start of the next, rather than at the end of the current, school year, and, depending on the table, parental income and an indicator for student grade. Observations are weighted using entropy weights or inverse probability weights, depending on the table. Standard errors are clustered at the school level and are omitted for brevity.
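The within-grade standardization of the outcome variable mentioned in the table notes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the row layout and the example values are hypothetical, and the weighted standard deviation is computed without a degrees-of-freedom correction, which is an assumption on our part.

```python
import math
from collections import defaultdict


def standardize_within_grade(rows):
    """Standardize learning gains within grade using weighted moments.

    rows: list of (grade, gain, weight); the weight stands in for an inverse
    probability population weight. Returns z-scores in the input order.
    Assumes every grade has a nonzero weighted standard deviation.
    """
    by_grade = defaultdict(list)
    for grade, gain, weight in rows:
        by_grade[grade].append((gain, weight))

    stats = {}
    for grade, pairs in by_grade.items():
        total_w = sum(w for _, w in pairs)
        mean = sum(w * g for g, w in pairs) / total_w
        # Weighted variance without degrees-of-freedom correction (assumption).
        var = sum(w * (g - mean) ** 2 for g, w in pairs) / total_w
        stats[grade] = (mean, math.sqrt(var))

    return [(gain - stats[grade][0]) / stats[grade][1]
            for grade, gain, _ in rows]


# Made-up rows: (grade, learning gain, weight)
rows = [
    (3, 8.0, 1.0),
    (3, 12.0, 1.0),
    (4, 5.0, 1.0),
    (4, 7.0, 3.0),
]
z = standardize_within_grade(rows)
```

After this step, the weighted mean of the standardized gains is zero and their weighted standard deviation is one within each grade, so coefficients for different grades can be read in standard deviation units.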

This article shows that schools matter, especially for the most vulnerable groups. Distance learning may prevent part of the damage but cannot compensate for classroom teaching. The policy implications of these findings are therefore twofold: 1) government budgets that are made available to make up for learning loss should give schools with many students with low-educated parents and low household income a large share of the pie, and 2) in the event of another crisis, or if the current COVID-19-pandemic continues, schools should be closed only as a very last resort to avoid further inequalities.