Measuring the effect of climate change on migration flows: Limitations of existing data and analytical frameworks

The aim of this paper is to review quantitative large-N studies that investigate the effects of climate change on migration flows. Recent meta-analyses have shown that most studies find that climate change influences migration flows. There are, however, also many studies that find no effects or show that effects depend on specific contexts. To better understand this complexity, we argue that we need to discuss in more detail how to measure climate change and migration, how these measurements relate to each other and how we can conceptualise the relationship between these two phenomena. After presenting current approaches to measuring climate change, international migration and internal migration, along with their strengths and weaknesses, we discuss ways to overcome the limitations of existing analytical frameworks.


Introduction
Migration and climate change have become two of the most salient political and social issues over the last decades, and research on their relationship is now abundant. While some studies find that climate change influences migration flows [1], others find no effect [2] or reveal only indirect effects [3,4]. Recent meta-analyses [5][6][7] have shown that most studies find that effects are dependent on specific contexts. It also appears that this broad range of findings can be explained by the fact that the studies use a wide variety of climatic and migration variables as well as different samples and estimation strategies [5,8]. Against this background, we argue that we need to discuss in more detail how to measure climate change and migration, how these measurements relate to each other and how we can conceptualise the complex relationship between these two phenomena. The aim of this paper is to review quantitative large-N studies that investigate the effects of climate change on migration flows in economics and related fields. We thus focus on one specific approach and its limitations and leave out qualitative small-N research, which faces different challenges. In contrast to other review studies that systematically focus on the evidence collected so far [5][6][7] or on methodological approaches [9,10], we consider a large selection of key studies to discuss various limitations in this field that we believe simplify the picture of how climate change affects migration flows [11,12]. In the following, we first present current approaches to measuring climate change, international migration and internal migration, along with their strengths and weaknesses, before discussing ways to overcome the limitations of existing analytical frameworks.

Measuring climate change
In most large-N studies, climate change has been quantified through time series of temperature and precipitation changes or related measurements of droughts, heatwaves, soil moisture and evapotranspiration (see S1 Table in the Supplementary Material for an overview). These can take simple forms such as annual or monthly means at country or grid cell level [13,14]. Some studies use weighting to tailor the indicators to the hypothesised mechanism of climate influence on migration: for instance, measurements can be weighted by agricultural areas and aggregated over the main growing season to obtain a proxy for crop growing conditions, or by population density if the hypothesis focusses on the direct effect of weather on humans [3,15].
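The weighting step described above can be sketched in a few lines. The following is a minimal illustration, assuming a regular grid held in NumPy arrays; the function names, the toy grid and the choice of growing-season months are hypothetical and are not taken from any of the cited studies.

```python
import numpy as np

def growing_season_mean(monthly, season_months):
    """Average a (12, lat, lon) array of monthly values over selected months.

    season_months uses calendar numbering (1 = January); which months form
    the main growing season is an assumption the analyst has to supply.
    """
    idx = [m - 1 for m in season_months]
    return np.asarray(monthly, dtype=float)[idx].mean(axis=0)

def weighted_area_mean(field, weights, mask):
    """Weighted spatial mean of a gridded variable over one region.

    field   : (lat, lon) array, e.g. growing-season mean temperature
    weights : (lat, lon) array, e.g. cropland area or population per cell
    mask    : (lat, lon) boolean array, True for cells inside the region
    """
    field = np.asarray(field, dtype=float)
    # Ignore cells outside the region and cells with missing data.
    valid = np.asarray(mask, dtype=bool) & np.isfinite(field)
    w = np.where(valid, np.asarray(weights, dtype=float), 0.0)
    total = w.sum()
    if total == 0:
        return float("nan")  # no weighted cells in the region
    return float((np.where(valid, field, 0.0) * w).sum() / total)

# Toy 2 x 2 grid: the same temperature field under two weighting schemes.
temperature = np.array([[10.0, 20.0], [30.0, 40.0]])
cropland = np.array([[1.0, 1.0], [0.0, 2.0]])
population = np.array([[3.0, 1.0], [0.0, 0.0]])
inside = np.ones((2, 2), dtype=bool)

crop_weighted = weighted_area_mean(temperature, cropland, inside)
pop_weighted = weighted_area_mean(temperature, population, inside)
```

In this toy case the two weightings yield clearly different values for the same temperature field, which is why the choice of weights should follow the hypothesised mechanism.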
Historical temperature and precipitation data for such studies is readily available in gridded format, for instance from the University of East Anglia Climatic Research Unit (CRU) [16]. Such gridded observational products rely on direct measurements by weather stations, meaning that data quality can be lower in regions of the world with low station density. Reanalysis products integrate direct observations into weather models to overcome limitations in the data regarding spatial coverage, temporal resolution and the number of variables measured. Such datasets, e.g. the ECMWF's ERA5 reanalysis, provide not only information on temperature and precipitation but also many other atmospheric variables on a daily or sub-daily basis and on global grids with a spatial resolution of below one degree [17]. This high spatial and temporal resolution allows researchers to use more elaborate temperature measurements, such as heatwave or warm spell indicators, or to construct multivariate indices that measure the effects of drought conditions, such as the Standardised Precipitation-Evapotranspiration Index (SPEI) [18] or the Soil Moisture Anomalies index [19].
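Drought indices of this kind share a common logic: a water-related quantity is expressed relative to its own historical distribution at a given location. The sketch below shows a deliberately simplified version, a z-score of the water balance (precipitation minus potential evapotranspiration) relative to a baseline period. It is not the SPEI itself, which fits a probability distribution to the accumulated water balance before standardising; the function name, the toy values and the choice of baseline are assumptions made for illustration.

```python
import numpy as np

def standardised_anomaly(values, baseline):
    """Z-score of each value relative to a baseline period.

    values   : 1-D sequence, e.g. annual precipitation minus potential
               evapotranspiration for one grid cell
    baseline : slice selecting the reference years within `values`
    """
    values = np.asarray(values, dtype=float)
    ref = values[baseline]
    mean = ref.mean()
    std = ref.std(ddof=1)  # sample standard deviation of the baseline
    if std == 0:
        raise ValueError("baseline period has no variability")
    return (values - mean) / std

# Toy series: five baseline years followed by one markedly dry year.
water_balance = [1.0, 2.0, 3.0, 4.0, 5.0, -1.0]
z = standardised_anomaly(water_balance, slice(0, 5))
```

The last element of `z` is strongly negative, flagging the final year as unusually dry relative to the baseline rather than in absolute terms.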
Indeed, although temperature and precipitation are important variables that lend themselves to easy interpretation based on everyday experience, they may not always be the most appropriate and comprehensive measure of climate change when studying climate-induced migration flows. While the mechanisms linking climate change with migration are not fully understood, there is a rich and growing literature on the many impacts that climate change has on individuals (in cities or rural areas), their societies and economies, not all of which are explained by temperature and precipitation alone [20]. For instance, a combination of high temperature and atmospheric humidity can create conditions under which even healthy people cannot survive outdoors, conditions which are expected to occur more frequently in parts of South Asia as global warming progresses [21]. Due to the urban heat island effect, cities experience even more extreme climate conditions [22]. Similarly, major wildfires, high windspeeds or coastal flooding induced by storms can destroy settlements and force people to flee their homes [23].
Even agricultural production, for which temperature and precipitation are important inputs, is insufficiently explained by linear measurements of those two variables. For example, there are important non-linearities in crop response to heat, and crop yields depend in complex ways on the timing of meteorological inputs relative to the crop's specific growth cycle. Additional variables such as soil moisture are therefore important, possibly inducing path dependencies across adjacent growing seasons [24,25]. Water-related problems such as water scarcity or river floods can result from the interaction of precipitation, evaporation and vegetation processes far upstream from where they appear, meaning that local or country-level precipitation may be a poor proxy. Researchers can instead draw on an increasing array of observational [26] and model-based datasets (e.g. ISIMIP, www.isimip.org) that directly provide the relevant variables to describe flood hazards, crop yields, water scarcity or many other climate impacts. Moreover, the Sixth Assessment Report of the IPCC's Working Group II is an important reference for identifying the relevant variables to assess a given hypothesis in climate migration studies.
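One common way to capture the non-linear crop response to heat mentioned above is to count degree days above temperature thresholds rather than averaging temperature over the season. The sketch below is a hedged illustration only: the thresholds of 10 and 30 degrees Celsius are placeholder assumptions, since appropriate values differ by crop, and the function name is hypothetical.

```python
import numpy as np

def degree_days(daily_temp, lower, upper=None):
    """Sum of daily temperature exceedance above `lower`, capped at `upper`.

    daily_temp : 1-D sequence of daily mean temperatures (degrees Celsius)
    lower      : threshold below which a day contributes nothing
    upper      : optional cap; heat above it does not add to the count
    """
    t = np.asarray(daily_temp, dtype=float)
    if upper is not None:
        t = np.minimum(t, upper)
    return float(np.maximum(t - lower, 0.0).sum())

# Two toy "seasons" with the same mean temperature (20 degrees).
steady = [20.0, 20.0, 20.0, 20.0]
volatile = [5.0, 15.0, 25.0, 35.0]

# Moderate heat (assumed range 10 to 30) and extreme heat (above 30).
gdd_steady = degree_days(steady, lower=10.0, upper=30.0)
gdd_volatile = degree_days(volatile, lower=10.0, upper=30.0)
edd_steady = degree_days(steady, lower=30.0)
edd_volatile = degree_days(volatile, lower=30.0)
```

Both toy seasons have the same mean temperature and the same moderate degree days, but only the volatile one accumulates extreme degree days, a difference that a seasonal mean would hide.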

Measuring international migration
As for international migration flows, there is a trade-off between datasets that include flow data but only refer to OECD destination countries and datasets that include non-OECD destination countries but only provide data on migration stocks (see S1 Table in the Supplementary Material). Several studies [1,16,27] use annual data on bilateral migration flows from a very large number of origin countries to between 19 and 42 mostly OECD destination countries for the 1990s and 2000s. A frequently used dataset in this context is the OECD's International Migration Database (IMD). Although it contains yearly data on inflows and outflows, it is based on national statistics and thus on different definitions and sources [5]. The main limitation of these datasets is that they do not measure South-South migration, an extremely relevant aspect in the study of climate change effects, given that climate change is a particular challenge in exactly these regions.
As a way of dealing with this problem, some studies use census data. Beine and Parsons [2] as well as Cattaneo and Peri [3] draw on the Global Bilateral Migration Database (GBMD) provided by the World Bank [28], which contains migration stocks of 226 origin and destination countries at ten-year intervals between 1960 and 2010. While this allows for investigating a larger number of (destination) countries over several decades, there are also severe constraints. Most importantly, the dataset only constitutes a proxy for migration flows because flows have to be inferred by comparing stocks in consecutive censuses. For example, it is impossible to know whether a decrease in stocks between two censuses results from decreasing flows, increasing return migration or outflows to a third country.
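The ambiguity described above can be made concrete with a small accounting sketch. It assumes a single origin-destination pair and invented numbers; the function names are hypothetical, and the `deaths` term is an additional component included here for completeness rather than one discussed in the text.

```python
def stock_difference(stock_start, stock_end):
    """Change in the migrant stock between two censuses (a net-flow proxy)."""
    return stock_end - stock_start

def stock_at_next_census(stock_start, inflow, returns, onward, deaths=0):
    """Stock at the next census given the (usually unobserved) gross movements.

    inflow  : new arrivals from the origin country
    returns : migrants who went back to the origin country
    onward  : migrants who moved on to a third country
    deaths  : migrants who died in the destination country (assumed term)
    """
    return stock_start + inflow - returns - onward - deaths

start = 1000
# Scenario A: few arrivals, many returns.
end_a = stock_at_next_census(start, inflow=100, returns=300, onward=0)
# Scenario B: many arrivals, but most migrants move on to a third country.
end_b = stock_at_next_census(start, inflow=400, returns=100, onward=500)

proxy_a = stock_difference(start, end_a)
proxy_b = stock_difference(start, end_b)
```

Both scenarios produce the same stock difference although gross inflows differ by a factor of four, so the census-based proxy alone cannot distinguish between them.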
A few studies use a dataset on asylum seekers compiled by the United Nations High Commissioner for Refugees (UNHCR). It contains annual data on the number of asylum applications filed in the destination countries in the period 1951-2016 and the countries of origin of the asylum seekers [29,30]. While this dataset includes flow data for the entire world, it should be noted that only about 10 per cent of all migrants are asylum seekers [30: 1610]. Another problem is that many flow dyads between countries in the dataset go unreported, with debate ongoing over how to treat such cases [31]. A further challenge in using data on asylum seekers is that it is not possible to apply for asylum on grounds of climate change [32]. For this reason, migrants might not apply for asylum but rather cross borders irregularly. To study the effects of climate change on irregular migration, Cottier and Salehyan [33] analyse data collected by Frontex, the European Border and Coast Guard Agency, from national border authorities on irregular border crossings detected at the external borders of the EU and Schengen associated countries.
Recent studies have started to use survey data based on retrospective questions or questions on migration intentions to measure (proxies of) migration flows [34][35][36][37][38]. The main advantage of this kind of data is that it can help circumvent ecological fallacy problems [39] and gain a better understanding of the potential migrants' individual characteristics. Tjaden et al. [40] show that there is a high correlation between migration intentions and emigration at the aggregate level (see also [41]). Such data is problematic, however, in that there might be reporting (recall) and understanding biases.

Measuring internal migration
A series of studies investigate the impact of climate change on internal migration [3,42,43,44,45]. However, data quality here is even more problematic than for international migration. Some single-case studies also use census data [46] or community surveys [47] to measure internal migration, though such sources suffer from problems similar to those described above. More specifically, it is not possible to systematically investigate to what extent internal migration constitutes temporary migration, for example in the case of 'survival migration' [11: 6]. It is often assumed that people return when short-term events such as floods are over. Furthermore, data is often only available for single countries. Peri and Sasahara [48] have recently tried to overcome this lack of data at the subnational level by relating imputed census data to climate data at grid cell level for all countries in the world. However, they only provide data at ten-year intervals, meaning that shorter-term developments are not considered.
For these reasons, most studies have used urbanisation as a measure of internal migration, as it is generally a result of rural-urban migration [49: 19]. The validity of urbanisation as a proxy for internal migration can of course also be questioned because migration is not the only driver of urban population growth. It has, however, been shown that fertility rates in urban areas tend to be lower than in rural regions [50,51], which suggests that natural increase alone is unlikely to account for urban population growth. Another potential problem concerns international migration to urban areas, but this is mostly the case in developed destination countries.
The more important problems concern the lack of consensus on how to define urban areas and the general unavailability of fine-grained census data that would allow researchers to differentiate between urban and rural areas. Urbanisation can be measured by increases in factors characterising cities, such as the size of the urban area, population size and density, shares of urban land uses and land covered by artificial surfaces, building density, and road network density [50,52]. As none of these factors is exclusive to cities, urban areas are often defined in terms of density thresholds. Yet, density thresholds have been criticised due to varying national urban population cut-off values and their general arbitrariness [53,54]. Specifically, an increase in population and population density in a specific area, usually defined by administrative boundaries, can serve as a proxy for urbanisation, though the use of census data limits both the spatial and the temporal resolution.
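The sensitivity of measured urbanisation to the chosen cut-off can be illustrated with a short sketch. The administrative units, their populations and areas, and both thresholds are invented for illustration and do not correspond to any national definition; the function name is hypothetical.

```python
import numpy as np

def urban_share(population, area_km2, threshold):
    """Share of total population living in units classified as urban.

    A unit counts as urban if its density (persons per square km) is at
    least `threshold`.
    """
    population = np.asarray(population, dtype=float)
    density = population / np.asarray(area_km2, dtype=float)
    urban = density >= threshold
    return float(population[urban].sum() / population.sum())

# Toy administrative units (values are invented for illustration).
population = [50_000, 20_000, 8_000, 2_000]
area_km2 = [25.0, 40.0, 40.0, 100.0]
# Resulting densities: 2000, 500, 200 and 20 persons per square km.

share_low_cutoff = urban_share(population, area_km2, threshold=300.0)
share_high_cutoff = urban_share(population, area_km2, threshold=1500.0)
```

With identical underlying data, the measured urban share differs considerably between the two cut-offs, which is one reason why cross-country comparisons based on national definitions are difficult.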
Alternatively, data on urban land cover or built-up areas is globally available in high spatial resolution [55]. However, the temporal resolution is still low, which constitutes a drawback when urbanisation trajectories need to be monitored over a longer period. Night light data is another factor that is sometimes considered [56]. The data has global coverage and is available for many years back. However, whether a place is lit at night is dependent on many factors, such as economic development.
Better data on rural-urban migration would help investigate the interactions between migration and climate change in both directions. Urbanisation processes have repercussions on emission levels of greenhouse gases [57,58] as well as on adaptation to climate change. For example, having more people adopt urban lifestyles could accelerate climate change, while migration to urban coastal zones and high-density urban areas might increase long-term climatic vulnerability [59,60].

Improving analytical frameworks
As we have seen, there are major gaps in the data used to measure migration flows. Closing them will require major efforts by international organisations, both to extend existing databases and to explore new data sources. Moreover, we need to develop detailed analytical frameworks that help us decide which indices best describe the influence of climate change on migration. This would allow us to understand which consequences of climate change lead to which kind of migration under which circumstances. To better understand these relationships, we first need to validate existing data and find out which indices are most relevant in this field. Moreover, future studies should account for the complexity of migration movements, the individual vulnerability of climate change migrants and contextual mechanisms.

Validating existing data
While collecting better and more migration flow data will help overcome some limitations, we also need to validate existing databases to gain a better understanding of what we actually investigate. Most studies will continue using flow data derived from migrant stock data [61]. Berlemann et al. [62] show that the various methods used to generate such flow data, such as stock differentiation, migration rates and demographic accounting, influence what climate change effects are found. Their study provides a good example of how we can start solving some of the problems discussed above. Systematic validity tests of different data sources and indices will allow us to better understand to what extent they are interrelated. Such tests should also be applied to survey data measuring migration aspirations, as we need to gain a better understanding of the extent to which this data helps us measure and predict actual migration.

Accounting for more complex migration movements
Having more valid migration flow data available will also allow us to improve our knowledge of different migration dynamics. With a few exceptions, most current research looks at either internal or international migration. However, this prevents us from understanding how these two forms of migration are interrelated in the context of climate change [2,38,44]. We also know from other fields that migrants do not always move to their preferred destination directly, but take a step-by-step approach in which intermediate destinations and stopovers are usually cities [63]. Ideally, it should be possible to track migrants over a longer period, thus gaining a better understanding of changing migration strategies that possibly reflect waning or increasing climate change effects [64]. To study internal migration, some studies use panel data based on household surveys in Nepal and Indonesia [65,66], censuses in South Africa [47] or credit records in Puerto Rico [67]. Lu et al. [68] draw on call detail records (CDRs), which provide information on the time of calls and mobile phone towers used, to follow the positions and movements of people during a cyclone in Bangladesh. While all these panel data allow for following people over time, they are limited to individual countries and may not be representative, as in the case of credit or call detail records, or they can only be interpreted together with qualitative data [69]. It is even more challenging to follow people who migrate internationally. Initial studies are already investigating the possibilities of smartphone technologies to conduct such research with migrants [70][71][72].

Accounting for individual vulnerability to climate change
While survey data will likely remain of limited use in predicting actual migration, it allows us to better understand who is most affected by climate change, as it provides more detailed information on potential migrants [10: 8-9]. This possibility points to a major gap in the current research on the nexus between climate change and migration. While existing studies primarily predict the number of migrants, we hardly know anything about the composition of migrant groups. This lack of information might be responsible for some of the uncertainty around how climate change affects migration flows. While some argue that it mainly affects vulnerable groups with no other means to adapt to climate change, such as people on low incomes or working in agriculture [3,73], others argue that perceptions of and knowledge about climate change play an important role [70,35,37: 9].

Accounting for mechanisms and impact channels
Different social groups might also react differently in different contexts. It has already been widely shown that countries with a larger agricultural sector are more affected [1,15,27,46]. There are, however, also many other potential channels through which climate change affects migration intensity that have so far received little attention (see also S1 Table in the Supplementary Material). Climate change can, for example, affect an economy more generally [74], but also public health [75], and can have politico-institutional repercussions [76]. All these factors constitute important push factors of migration, making it difficult to disentangle them from direct climate change factors.
More elaborate hypotheses about the mechanisms or channels linking climate change to migration could also help in choosing the appropriate climate change measurement methods in a given study. For instance, a modelling study by Rigaud et al. [77] uses spatially explicit estimates of multi-year average water resources and agricultural yields to calibrate a gravity model of population change and estimate climate-induced changes in internal migration. The underlying hypothesis is that persistent shortfalls in water availability and crop productivity have the potential to undermine rural livelihoods, prompting those affected to move to cities or more productive rural environments. Another study employs flood modelling results to estimate future changes in river flood hazard levels and subsequently to project the future risk of flood-induced population displacement [78]. Focusing on international refugee flows, Abel et al. [79] combine data on asylum applications, conflict occurrence and the SPEI (as a drought indicator) to show that climate conditions in all probability affected refugee flows indirectly through their effect on conflict risk, but only in specific periods and regions of the world. These examples illustrate the different levels at which climate change can influence migration: from climate change-fuelled weather extremes that pose direct threats to human lives, through deteriorating environment-dependent livelihoods, to indirect effects mediated by other migration drivers such as conflict and by the way countries as a whole are differently affected by and able to cope with the effects of climate change.

Conclusion
Boas et al. [80: 902] argue that climate-related migration is not the exception to the norm but rather the 'new normal'. However, empirical evidence on the complex interplay between climate change and migration is still weak. A better understanding of the inherent complexity of this field in the long term requires scholarly consensus on analytical frameworks, definitions and measurement methods. We have highlighted a few strategies to advance research towards such a consensus. Moreover, a wide range of data on climate change and its impacts is available from direct observations, remote sensing and models, much of which has yet to be used to fine-tune the representation of climate in climate migration studies. This goes hand in hand with a need to further develop the underlying hypotheses and to model more explicitly the different potential pathways through which climate change might affect migration. Finally, accounting for the connections between different types and scales of migration can help reveal some of those pathways, painting a more holistic and relevant picture of human mobility in a changing climate.

Supporting information
S1 Table. Overview of key variables and data sources. (DOCX)