Strategic choices of migrants and smugglers in the Central Mediterranean sea

The sea crossing from Libya to Italy is one of the world’s most dangerous and politically contentious migration routes, and yet over half a million people have attempted the crossing since 2014. Leveraging data on aggregate migration flows and individual migration incidents, we estimate how migrants and smugglers have reacted to changes in the border enforcement regime, namely the rise in interceptions by the Libyan Coast Guard starting in 2017 and the corresponding decrease in the probability of rescue to Europe. We find support for a deterrence effect in which attempted crossings along the Central Mediterranean route declined, and a diversion effect in which some migrants substituted to the Western Mediterranean route. At the same time, smugglers adapted their tactics. Using a strategic model of the smuggler’s choice of boat size, we estimate how smugglers trade off between the short-run payoffs to launching overcrowded boats and the long-run costs of making less successful crossing attempts under different levels of enforcement. Taken together, these analyses shed light on how the integration of incident- and flow-level datasets can inform ongoing migration policy debates and identify potential consequences of changing enforcement regimes.


Introduction
There are approximately 272 million international migrants around the world (IOM 2019b), and estimates suggested that a quarter of the international migrant stock were irregular migrants as of 2009 (UNDP 2009). Smuggling networks often enable such flows, moving an estimated 2.5 million people in 2016 for an annual profit of $5.5-7 billion (UN Office on Drugs and Crime 2018).
Limiting human smuggling may be desirable from a security perspective (e.g., in order to enforce the sovereignty of borders) or from a humanitarian perspective (e.g., to reduce exploitation and trafficking of undocumented people and discourage migrants from risking their lives on dangerous crossings). However, past research has found that efforts to limit human smuggling by increasing border enforcement may lead to unintended consequences. For example, researchers have identified a "deterrence-diversion tradeoff" in which some migrants forego the journey altogether, but others adapt by switching to alternative routes which may place them at higher risk (Sorensen and Carrion-Flores 2007). As a result, it is difficult to estimate the effects of migration policies, and there is a need for system-level models to estimate how migrants and smugglers react to changes in the crossing environment.
In this paper, we aim to build such models and apply them to study sea crossings on the Central Mediterranean route from Libya to Italy. Since 2014, over two million people have arrived in Europe by sea and over 20,000 people have died or gone missing (UNHCR 2020b). There are three primary routes through the Mediterranean (illustrated in Figure 1): the Eastern route from Turkey to Greece (approximately 58% of crossings from 2014 to 2020), the Central route from Libya to Italy and Malta (approximately 34%), and the Western route from Morocco to Spain (approximately 8%) (UNHCR 2020b). We focus on the Central Mediterranean route, which is the longest of the three and represents approximately 81% of all casualties observed in this period (IOM 2021). In recent years, this route has been at the center of a contentious policy debate about the role of nongovernmental organization-led (NGO-led) rescues and coast guard interceptions in encouraging or deterring risky migration attempts by sea.

Figure 1. The gold number shows the total number of arrivals to the primary destination country for each route (UNHCR 2020b). The red number below shows the number of dead or missing migrants along the route (IOM 2021). Graphic adapted from UNHCR (2020b).
Starting in mid-2017, the Libyan Coast Guard (LCG) began to intercept a growing share of migrants with Italian and European Union (EU) support, and information on the activity of migrants and smugglers became more elusive. In the face of the drastic change of environment caused by the increased role of the LCG, we seek to answer the following policy questions.
1. To what extent do Libyan interceptions deter the crossing of migrants? Do increased interceptions on the Central route divert migrants towards the other two migration routes? Is there any causal relationship that we can identify from the data?
2. How do smugglers, who help the migrants to cross the sea, elude interception? Is there any evidence around how they adapt their strategy on the incident level, and what are the outcomes of these strategy adaptations?
3. Can we generalize the ideas behind the analyses above to other irregular migration contexts, in order to estimate the consequences of policy changes? What kind of models are able to support such counterfactual inferences?
Answering these questions is significantly more challenging than in previous periods due to the scarcity and bias of incident datasets after the increase in Libyan interceptions in mid-2017. In particular, compared with the reports produced by NGOs which are motivated to disclose their rescue incidents, details on interceptions by the LCG (as well as casualties that occur as a consequence of their activities) are unpublished and the number of interceptions is reported only in aggregate. The most comprehensive dataset from the EU border control agency (Frontex) does not generally cover boats intercepted by the LCG at all. Therefore, we have a combination of biased but high-quality incident data, and comprehensive but low-quality flow data. While this type of data is common in many mixed migration settings (where interceptions are often performed by non-transparent enforcement authorities), it is difficult to analyze because many of the conventional causal analysis methods do not apply to the situation. Indeed, few researchers have analyzed incident-level data on crossings, limiting their ability to study the evolution of smuggler strategy at the boat level.
Our main contribution is therefore the causal analysis of Central Mediterranean crossing behavior both before and after the rise in Libyan interventions, with careful robustness checks. Unlike existing papers that focus on the causal impact of NGO presence (Cusumano and Villa 2019, Deiana et al. 2019) or civil disorder and discontinuous policy change (Camarena et al. 2020), we focus directly on the rate of successful crossing, which determines the benefit that migrants expect to obtain by their crossing attempts. We integrate the flow and the incident data by obtaining a reasonable estimate of the overall rate of interceptions across time, which we then associate with each recorded crossing attempt. We confirm that the rate of successful crossing affects multiple aspects of the Central Mediterranean crossing, from the volume of crossing attempts to crossing characteristics (namely, the boat size).
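One way to picture the integration step described above is a join that attaches, to each recorded incident, the interception rate estimated for the period in which it occurred. The sketch below assumes monthly granularity. All field names and numbers are hypothetical and are not drawn from the Frontex or IOM datasets.

```python
from datetime import date

# Hypothetical monthly flow totals (people), keyed by (year, month).
monthly_flows = {
    (2017, 6): {"rescued": 9000, "intercepted": 1000, "dead_or_missing": 0},
    (2019, 1): {"rescued": 400, "intercepted": 500, "dead_or_missing": 100},
}

# Hypothetical incident records, one per recorded boat.
incidents = [
    {"date": date(2017, 6, 14), "boat_type": "rubber", "people_on_board": 120},
    {"date": date(2019, 1, 3), "boat_type": "wooden", "people_on_board": 45},
]


def interception_rate(flow):
    """Share of all people crossing in a month who were intercepted."""
    total = flow["rescued"] + flow["intercepted"] + flow["dead_or_missing"]
    return flow["intercepted"] / total if total else None


def attach_rates(incident_rows, flows):
    """Return copies of the incident rows with the monthly rate attached."""
    enriched_rows = []
    for row in incident_rows:
        key = (row["date"].year, row["date"].month)
        flow = flows.get(key)
        rate = interception_rate(flow) if flow is not None else None
        enriched_rows.append({**row, "interception_rate": rate})
    return enriched_rows


enriched = attach_rates(incidents, monthly_flows)
```

Incidents that fall in a month with no flow record receive a rate of `None` rather than a guessed value, so that gaps in the flow data stay visible downstream.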
We conduct two primary analyses. First, we build a time series model to estimate how the flow of crossings on the Central Mediterranean route responds to the growing rate of LCG interceptions, which increases the proportion of people returned to Libya and reduces migrants' probability of successful rescue to Europe. We employ an Error Correction Model (ECM) to avoid finding false causal relationships due to spurious correlations. We find a long-run positive relationship between rescue probability and attempted crossings. This is consistent with the somewhat surprising resurgence of sea migrants observed in the third quarter of 2019, as probability of interception declined.
The ECM's second-stage model of the number of crossings has an explanatory power of R² = 0.20 (adjusted R² = 0.16), which implies that the probability of successful rescue has a meaningful effect on the flow size. We estimate that a decline in rescue probability from 90% to 50% corresponds to over 10,000 fewer attempted monthly crossings on average, from over 14,600 expected crossings to approximately 3,400 crossings.
Second, we analyze incident datasets to document the strategic response of smugglers to the increased probability of interception. As Libyan interceptions rise, anecdotal and observational evidence suggests that smugglers and migrants begin to prepare for longer voyages towards Europe.
They increase the use of wooden boats relative to rubber rafts, and reduce the average number of people on board. To systematically explore crossing strategy, we construct a theoretical model of smuggler utility as a function of boat size. We build a discrete choice model of utility because it connects the rate of interception (flow-level data) and the incident-level data (boat size, type of boat) and enables counterfactual estimation with a limited amount of incident records. We estimate the strategic tradeoff between the short-run incentive to crowd more passengers onto wooden boats, and the long-run incentive to avoid Libyan interception by using smaller boats. For rubber boats, we estimate that smaller boats (≤ 50 people) begin to dominate larger boats (> 100 people) as the preferred alternative once the rate of Libyan interception approaches 60%. These results are consistent with the rise in the use of smaller boats in early 2019, when the intensity of LCG activity peaked. This implies that our model can capture continuous changes in the enforcement situation, unlike before-after discontinuity models.
In summary, we build on the evidence base for evaluating policy changes in the Central Mediterranean. The policy implications of our results are as follows:
• Our analysis of the flow-level dataset is consistent with the claim that increased interceptions decrease the odds of crossing. We find that this adjustment occurs relatively quickly in response to changes in the rate of interception, and in the limit, we estimate that the total number of crossings could fall as low as 300 per month. However, we note that the crossings which do continue despite increased enforcement are more perilous because it is less likely that distressed boats will be met with a rescue response.
• Our analysis finds evidence of a constrained diversion effect in which some migrants switch to the Western Mediterranean route when the chance of successful crossing on the Central route is low. Descriptive analysis suggests that the extent of substitution varies by nationality and likely depends on the ease with which migrants can reach coastal departure points from their respective countries. This suggests that rather than focusing on conducting interceptions at individual crossing points (which can simply push migrants towards other crossing points that are less well policed and potentially more treacherous), policymakers should take a more comprehensive view of smuggling routes more generally. This is consistent with broader calls for integrated regional approaches to addressing the underlying drivers of migration, and for the expansion of safe and legal crossing routes (MOAS 2020).
• Our analysis of incident data suggests that smugglers adapt to the changes in the interception rate, and migrants are still elusive in the current environment. Even at times when the LCG operated most actively to block migrants transiting the Libyan coastal zone, its effectiveness was limited by smugglers' strategic response. Boats with smaller numbers on board are estimated to have had an advantage in passing the coastal region and reaching European search and rescue zones, and our model makes it possible to estimate how smugglers weigh this advantage against the possibility of collecting more revenue by adding passengers. Conditional on rescue to Europe, our analysis suggests that the risk of death for passengers has not changed from one period to the next, despite this shift in strategy. However, as smugglers switch to smaller boats which are rescued farther out to sea, it is possible that there is an increasing number of boats which sink without ever being detected for a rescue attempt, thus biasing recent casualty estimates downwards.
The rest of the paper is structured as follows. In Sections 2 and 3, we provide background on the Central Mediterranean policy environment and a review of the literature on migration strategy. In Section 4, we analyze the overall flow of crossings along the Central Mediterranean route. In Section 5, we analyze the incident dataset and present a utility model of smuggling. Section 6 concludes the paper.

Background
The difficulty of policing maritime borders has long made the Mediterranean an attractive route into the European Union. In particular, the Central Mediterranean route from Libya to Italy and Malta draws migrants and refugees who have lived and worked in Libya for years, as well as those who use Libya as a transit country. Altogether, UNHCR reports that since 2014 over 690,000 people have attempted the Central Mediterranean sea crossing (UNHCR 2020c) and over 56,000 have been returned to Libya (UNHCR 2020a), while the International Organization for Migration (IOM) estimates that over 17,000 people have died or gone missing along this route (IOM 2021).
Given the risks involved in the crossing, European policymakers are divided between the humanitarian imperative to save lives in the Central Mediterranean through search and rescue, and the desire to stop irregular migration flows and discourage risky migration. On the one hand, migrants have legitimate reasons for fleeing Libya, where they have faced discrimination, human trafficking, detention in inhumane conditions, and the risk of air strikes from the civil war (Amnesty International 2017, 2020). Conditional on successfully departing from Libya, they may be protected from a forcible return, since international law requires that rescued migrants, refugees, and asylum-seekers be transported to a "place of safety" (UNHCR 2018). On the other hand, smugglers are aware of these protections and actively manipulate them by sending migrants to sea in under-equipped boats that will require rescue. As crossings surged in 2016, the Italian authorities were reporting as many as 30 rescue operations a day, leading to accusations that search and rescue operations were acting as a "ferry service" for migrants and creating a "pull factor" which encouraged people to place their lives at risk (Deutsche Welle 2016, Baczynska 2017).

Policy Context
Below, we characterize the recent policy response in the Central Mediterranean according to three main phases: the dominance of Italian and EU naval missions; the rise of the NGO rescue response; and the growing role of the Libyan Coast Guard. We describe the phases in terms of the activities of these key actors, and thus some parts of them are overlapping. The cost of this policy change became clear in April 2015 when two major shipwrecks claimed over 1,000 lives (Heller and Pezzani 2016). In response to these tragedies, the EU launched Operation Sophia in June 2015, which supplemented Triton with an emphasis on preventing smuggling and destroying migrant ships so they could not be re-used in the future (EUNAVFOR Med Operation Sophia 2018).

Phase 2: The Rise in NGO Rescues (from mid-2015 to mid-2017)
The first NGO to operate in the Mediterranean was the Migrant Offshore Aid Station (MOAS), which began conducting rescues in 2014. In the spring of 2015 it was joined by Médecins sans Frontières (MSF, also known as Doctors without Borders), and a number of other NGOs have since followed suit. When crossings peaked in 2016-2017 there were as many as 13 different boats operating in the region (Zandonini 2017). The NGO presence increased the coordination of rescues, as NGOs would deploy off the coast of Libya and react quickly to any boats that were identified in international waters. However, this led to accusations that NGOs were colluding with smugglers and facilitating irregular migration.
In July 2017, Italy proposed an NGO Code of Conduct which would impose a number of constraints on rescue ships wishing to use its ports (Spagnolo 2017, Cusumano 2019). European countries also began other efforts to limit the operations of NGO ships, for example by initiating legal proceedings against them, preventing them from leaving port, or denying permission to disembark rescued migrants (European Union Agency for Fundamental Rights 2019).

Phase 3: Interceptions by the Libyan Coast Guard (since mid-2017)
Search and rescue (SAR) activity is typically managed by an international network of coastal countries, which formally declare rescue zones and then supervise the response to incidents within their zones. Key SAR zones in the Central Mediterranean are shown in Figure 2.

Formally, the coastal zone within 12 nautical miles (NM) from a country's shore is considered territorial waters, and from a legal perspective is essentially treated like the land within the country's borders. Between 12 and 24 NM from shore are a country's contiguous waters, a zone which is not technically part of the country's territory but in which the country may still enforce some of its laws. Beyond this 24 NM boundary are the search and rescue (SAR) zones (Figure 2), in which the corresponding country's coast guard and/or naval forces are responsible for coordinating rescue operations (Human Rights Watch 2017).

While international ships are generally prevented from returning migrants to Libya because it is not considered a place of safety (UNHCR 2018), the LCG does not abide by these restrictions, and the increasing interventions of the LCG have clearly imperiled migrants. There are reports that the LCG is poorly trained, unprofessional, and ill-equipped to coordinate rescues (EUNAVFOR Med 2018); that it has ties to traffickers and has used its resources in the Libyan civil war (Scavo 2019a,b,c, Tondo 2019); and that it has perpetrated abuses against migrants in the course of rescues (Heller et al. 2018, Amnesty International 2017).

Smuggling Operations and Strategy
Migrants typically depart at night, most commonly in a wooden fishing boat or a rubber raft. Since the EU-led anti-smuggling Operation Sophia began destroying vessels in 2015, the incentive to purchase cheaper disposable rafts has increased (UK House of Lords, European Union Committee 2016). Wooden boats can hold up to 800 people (Grunau 2016), whereas rafts have a maximum capacity of approximately 150-200 people. When boats are overcrowded, the risk of sinking and injury to passengers on board is expected to rise. Therefore, smugglers face a trade-off between collecting additional revenue per passenger and the risks incurred by overloading the boats.
Migrant boats may be equipped with life vests and a satellite phone, and one of the migrants may be chosen to act as navigator. At the peak of the rescue response, boats were often given a limited amount of fuel, with the goal of reaching international waters (Baker 2016); once they passed the 24-nautical-mile boundary from the Libyan shore, they could use their satellite phone (if available) to request a rescue from the Maritime Rescue Coordination Centre (MRCC) in Rome. As the LCG has grown more active in intercepting ships, anecdotal reports suggest that smugglers are equipping boats with more fuel in order to help them evade the coast guard and get further out to sea before requesting assistance (UNHCR 2018); the space taken by the fuel may in turn reduce the passenger capacity of the boats.
While the LCG is formally an adversary of the smuggling operations (since the coast guard is charged with intercepting migrant boats), there is evidence that coast guard members coordinate with, profit from, and/or are involved in smuggling operations (Tondo 2019, Michael et al. 2019, Office to Monitor and Combat Trafficking in Persons 2020).

Related Work
Our study of Central Mediterranean crossings fits into the larger literature on human migration, which models movement patterns as a function of costs (such as travel expenses) and benefits (such as employment opportunities). Informal migration is differentiated by the presence of a third factor: internal and external border enforcement (Orrenius and Zavodny 2015). While there has not been much focus on the process of migration in the management science and operations research literature, related work has studied how to best allocate border patrol and coast guard efforts (Papadaki et al. 2016, Uzun et al. 2016); strategies for humanitarian logistics (Celik et al. 2012, Besiou and Van Wassenhove 2020); and the optimal placement and integration of migrants and refugees (Ahani et al. 2021, Haliassos et al. 2017, AbuJarour and Krasnova 2017).

Strategic Models of Migrants and Smugglers
Smuggling and trafficking involve sophisticated, diversified organizations: "the business is remarkably responsive to change and seems always to remain one or several steps ahead of those seeking to control it" (Salt and Stein 1997). Adaptation has been a key theme in existing research on the US-Mexican border, particularly with respect to the geographic intensity of US border patrol activities. For example, Sorensen and Carrion-Flores (2007) posit that enforcement has two primary effects: (1) a deterrence effect in which the policy discourages migrant crossings; and (2) a diversion effect in which migrants shift their crossings to other parts of the border. As a result of the diversion effect, the overall volume of crossings can be relatively inelastic with respect to border enforcement. Additional empirical research has found a low impact of enforcement on overall crossing volumes, but substitution to other border sectors with higher crossing times and crossing risk, and an increase in the relative proportion of deaths from environmental factors such as dehydration (Gathmann 2008, Cornelius 2001).

Analyses of Central Mediterranean Crossings Before Phase 3
In the Central Mediterranean, the deterrence-diversion debate has been shaped by two competing narratives: a security/border control logic, and a humanitarian/crisis discourse (Steinhilper and Gruijters 2018). A core research question is whether limiting rescue activity and increasing interceptions will discourage attempted crossings, or simply cause migrants to undertake increasingly risky crossings which are unassisted or even unobserved by state and humanitarian actors. crossing risk in order to analyze behavioral responses to policy changes. In their model, migrants make a strategic decision about whether to cross in a safe boat, an unsafe boat, or not at all. They predict that SAR efforts lower crossing risks conditional on boat type, and therefore: encourage more migrants to undertake the journey; lead a larger fraction of crossings to use unsafe boats; and consequently, make departures more sensitive to crossing conditions. Naiditch and Vranceanu (2020) approach the problem from a different angle, building an intertemporal matching model between migrants and smugglers. They predict that greater NGO presence in the Mediterranean will increase the number of migrants and smugglers, lower the costs of smuggling, and increase the likelihood of successful crossing. Smugglers will benefit but the effect on migrant welfare and crossing prices is ambiguous.

Analyses of Central Mediterranean Crossings Including Phase 3
Policy analysis of the Central Mediterranean crossings given the recent rise in LCG intervention faces two key challenges. First, because of the changing political and economic environment, it can be difficult to isolate the causal effect of policy changes on crossing behavior. Prior works have studied the growing role of the LCG as a discontinuous policy change (Camarena et al. 2020), and studied how crossings correlate with NGO presence or capacity on a daily basis (Cusumano and Villa 2019). However, there have been no studies that assess how continuous changes in the overall border enforcement regime affect crossings. We address this gap with the use of an error correction model which allows us to estimate how crossing decisions on the Central and Western Mediterranean routes respond to rescue probabilities in the short and long term. With the help of this model we are able to analyze recent crossing activity through the end of 2019, when we observe a recovering flow of migrants.
Second, the lack of data on events involving the Libyan Coast Guard makes it difficult to connect smuggler choices (i.e., departure ports, boat type, or boat capacity) to outcomes at the incident level, since there is almost no data on boats that are intercepted. This hinders efforts to identify smuggler strategy because it hides the smuggler's reward function. To address this limitation, we borrow from discrete choice models and their connection to inverse reinforcement learning (Abbeel and Ng 2004, Ziebart et al. 2008, Ermon et al. 2015). Using data on the characteristics of a given incident and the choices made by the smugglers, we attempt to infer the parameters of the smuggler's utility function. Specifically, we study the question of boat crowding, and estimate the value that smugglers place on the revenue collected from adding more passengers, relative to the reduced chance of success when using larger boats (which in turn depends on the overall level of Libyan enforcement). We estimate this choice model with newly released Frontex data which, to our knowledge, has not been analyzed in its entirety to study this context.
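To make the discrete choice idea concrete, the following is a minimal sketch of a multinomial logit over boat-size classes. It is not the specification estimated in this paper: the size classes, the linear utility, the "exposure" multipliers, and the parameter values are all hypothetical and chosen only to illustrate how a revenue term can be traded off against an enforcement-dependent chance of success.

```python
import math

# Hypothetical alternatives: (label, typical passengers, exposure to interception).
# Larger boats are assumed to be more exposed; these numbers are illustrative only.
ALTERNATIVES = [
    ("small", 40, 0.6),
    ("medium", 80, 0.9),
    ("large", 130, 1.2),
]


def utilities(interception_rate, theta_revenue, theta_success):
    """Utility of each class: revenue term plus value of an assumed success chance."""
    values = []
    for _, passengers, exposure in ALTERNATIVES:
        p_success = max(0.0, 1.0 - exposure * interception_rate)
        values.append(theta_revenue * passengers + theta_success * p_success)
    return values


def choice_probabilities(interception_rate, theta_revenue, theta_success):
    """Multinomial logit (softmax) probabilities over the alternatives."""
    u = utilities(interception_rate, theta_revenue, theta_success)
    shift = max(u)  # subtract the maximum for numerical stability
    weights = [math.exp(value - shift) for value in u]
    total = sum(weights)
    return [w / total for w in weights]


def log_likelihood(observations, theta_revenue, theta_success):
    """Log-likelihood of observed choices: a list of (interception_rate, chosen_index)."""
    return sum(
        math.log(choice_probabilities(rate, theta_revenue, theta_success)[chosen])
        for rate, chosen in observations
    )


# Hypothetical parameter values, for illustration only.
low_enforcement = choice_probabilities(0.1, theta_revenue=0.02, theta_success=5.0)
high_enforcement = choice_probabilities(0.7, theta_revenue=0.02, theta_success=5.0)
```

Under these assumed parameters the large class is the most likely choice at a low interception rate and the small class at a high one. In an estimation setting, the parameters would instead be chosen to maximize `log_likelihood` over the observed incidents.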

Limitations of Our Research
Our approach has two key limitations. First, we assume that migrants are free to leave (though they may soon be intercepted at sea), and therefore we do not account for efforts to stop migrants from departing in the first place. Camarena et al. (2020) note that reductions in migrant flows during and after 2017 may be a function of either increased coast guard interceptions, or agreements with militias to reduce the availability of smuggling services and prevent departures in the first place; our analysis addresses the former.
Second, while we believe that Frontex collects the most comprehensive incident data in the region, this dataset is biased because it generally does not include Libyan interceptions. This may lead us to overestimate the extent to which smugglers are strategic (because non-strategic actions are likely to be filtered out of the dataset by coast guard interceptions). However, LCG activities are erratic, with gaps in activity on certain days (EUNAVFOR Med 2018), which should help to ensure that a more diverse sample of incidents ultimately enters the Frontex dataset.

Analysis of the Aggregate Flow Dataset
A common argument in favor of stricter border enforcement is that a decreased chance of successful crossing will deter crossing attempts. We assess whether or not this hypothesis is consistent with empirical estimates by investigating how crossings relate to the likelihood of rescue at sea and successful arrival in Italy or Malta.
Since the crossing trend is highly non-stationary, a naïve regression can misidentify the model due to spurious regressions (Granger and Newbold 1974). To address this concern, we adopt a time-series error correction model to analyze the long- and short-term effects of rescue probability on the log odds of crossing (Box-Steffensmeier et al. 2014). In Section 4.3, we show that a reduced chance of successful crossing results in a smaller number of attempted crossings. Section 4.4 similarly analyzes spillovers to the Western Mediterranean route, and finds significant but limited substitution to this route. Finally, Section 4.5 presents descriptive analysis showing that spillover effects vary widely among different nationalities of migrants.
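The spurious regression concern can be illustrated with a small simulation. The sketch below generates pairs of independent random walks (which by construction have no true relationship) and compares the average R² from regressing one on the other in levels with the average R² after first-differencing. The simulation settings (200 replications, 100 observations, the seed) are arbitrary assumptions.

```python
import random


def r_squared(x, y):
    """R^2 from a simple OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def random_walk(n_obs, rng):
    """Cumulative sum of independent standard normal shocks."""
    level, path = 0.0, []
    for _ in range(n_obs):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def first_difference(series):
    return [later - earlier for earlier, later in zip(series, series[1:])]


def average_r2(n_sims=200, n_obs=100, seed=0):
    """Average R^2 in levels and in first differences across simulated pairs."""
    rng = random.Random(seed)
    total_levels, total_diffs = 0.0, 0.0
    for _ in range(n_sims):
        x, y = random_walk(n_obs, rng), random_walk(n_obs, rng)
        total_levels += r_squared(x, y)
        total_diffs += r_squared(first_difference(x), first_difference(y))
    return total_levels / n_sims, total_diffs / n_sims


avg_levels, avg_diffs = average_r2()
```

The regression in levels tends to show a much higher average fit than the regression in differences, even though the two series are unrelated, which is the pattern that motivates moving beyond a naïve regression.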

Data and Setup
We collect data on overall migration flows from the International Organization for Migration (IOM), which provides data on aggregate sea arrivals to Italy and Malta; interceptions by the Libyan and Tunisian Coast Guards; and dead or missing migrants along the Central Mediterranean route (IOM 2019a). This data can be used to calculate the total number of crossing attempts and the likelihood of successful crossing.
For a given month t, we define the number of people crossing on a given route as the sum of people who were rescued, intercepted, or reported dead or missing:

$$N_{t,\text{cross}} = N_{t,\text{rescued}} + N_{t,\text{intercepted}} + N_{t,\text{dead or missing}}$$

Note that when estimating the ECM below, we take the unit of $N_{t,\text{cross}}$ to be in thousands in order to yield coefficients that are more comparable in magnitude.
The probability of rescue is therefore:

$$P_{t,\text{rescue}} = \frac{N_{t,\text{rescued}}}{N_{t,\text{cross}}}$$

where $N_{t,\text{rescued}}$ is the number of people rescued in month t and $N_{t,\text{cross}}$ is the total number of people crossing.

The number of people crossing on the Central Mediterranean route, as well as the probability of each outcome (rescue, return, and sinking), are shown in Figure 3. From inspecting the figure, it is clear that the number of people crossing and the probability of rescue have fallen over time, which coincides with the growing intervention by the LCG in Phase 3. However, it is unknown whether both series simply follow a common downward trend, or whether one series reflects changes in the other over the short or long term. In the short term, crossings might respond to increased rescue probability because migrants already in Libya could depart when it appears that the chances of success are high. In the long run, crossings might respond to increased rescue probability because additional migrants could travel to Libya in order to cross, and because smuggling operations could reconfigure to increase the overall volume of migrants they are able to launch.
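The definitions in this subsection (total crossings as the sum of rescued, intercepted, and dead or missing, and each outcome probability as a share of that total) can be sketched as a small helper. The function name, the dictionary keys, and the example counts are hypothetical.

```python
def crossing_outcomes(rescued, intercepted, dead_or_missing):
    """Monthly crossing total (in thousands) and the share of each outcome."""
    n_cross = rescued + intercepted + dead_or_missing
    if n_cross == 0:
        raise ValueError("no recorded crossings in this month")
    return {
        "n_cross_thousands": n_cross / 1000.0,
        "p_rescue": rescued / n_cross,
        "p_return": intercepted / n_cross,
        "p_dead_or_missing": dead_or_missing / n_cross,
    }


# Hypothetical month: 4,500 rescued, 4,000 intercepted, 500 dead or missing.
example_month = crossing_outcomes(4500, 4000, 500)
```

Because the three outcomes partition the crossing total, the three shares sum to one by construction.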

Model
In this section we provide a brief overview of the ECM, following the exposition in Box-Steffensmeier et al. (2014, Section 6). The development of the ECM was motivated by the observation that when running Ordinary Least Squares (OLS) regressions using non-stationary dependent and independent variables (in our case, $N_{t,\text{cross}}$ and $P_{t,\text{rescue}}$, respectively), there is an elevated risk of finding a significant relationship between the two even when none exists (i.e., a spurious regression (Granger and Newbold 1974)). One solution in this case is to take first differences in order to obtain stationary variables, and then to fit a regression on the first differences. However, this provides insights only about short-run relationships between the variables, and does not account for the fact that these variables may react to each other in the longer term. In fact, it is possible that two non-stationary series have a cointegrating relationship, in which a linear combination of the series is stationary (i.e., they have a stable long-run relationship). For example, it may be the case that in equilibrium:

$$N_{t,\text{cross}} = \beta_0 + \beta_1 P_{t,\text{rescue}} \qquad (1)$$

where $\beta_0$ and $\beta_1$ are coefficients, $N_{t,\text{cross}}$ is the monthly number of crossings (in thousands), and $P_{t,\text{rescue}}$ is defined as above.
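The ECM essentially allows for both of these short- and long-run dynamics. Specifically, we estimate an ECM of the form:

$$\Delta N_{t,\text{cross}} = \alpha_0 + \alpha_1 \hat{z}_{t-1} + \alpha_2 \Delta P_{t-1,\text{rescue}} + \varepsilon_t$$

where $\Delta N_{t,\text{cross}} = N_{t,\text{cross}} - N_{t-1,\text{cross}}$; $\hat{z}_{t-1}$ denotes the observed deviations from equilibrium as defined by Equation 1; $\Delta P_{t-1,\text{rescue}}$ is defined analogously to $\Delta N_{t,\text{cross}}$; and $\varepsilon_t$ is a random error term. In this case, $\alpha_1$ reflects the long-run adjustment behavior that results from divergence between $P_{t,\text{rescue}}$ and $N_{t,\text{cross}}$, whereas $\alpha_2$ reflects the short-run adjustment in $N_{t,\text{cross}}$ that results from a change in $P_{t-1,\text{rescue}}$.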
We conduct our estimation using the Engle-Granger method; the results are reported in Table 1.
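As a rough illustration of a two-step procedure of this kind, the sketch below first fits a static long-run regression of crossings on rescue probability, then regresses differenced crossings on the lagged first-stage residual and the lagged differenced rescue probability. It runs on synthetic data with hypothetical parameters, uses a hand-rolled OLS solver to stay dependency-free, and omits the unit-root test on the first-stage residuals that a full Engle-Granger analysis would include. It is not the estimation code used for Table 1.

```python
import random


def ols(X, y):
    """Least-squares coefficients via the normal equations (X includes a constant)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def engle_granger_ecm(n, p):
    """Two-step estimate: long-run regression, then error-correction regression."""
    # Step 1: static regression of crossings on rescue probability.
    b0, b1 = ols([[1.0, pt] for pt in p], n)
    z = [nt - b0 - b1 * pt for nt, pt in zip(n, p)]  # deviations from equilibrium
    # Step 2: differenced crossings on lagged deviation and lagged differenced probability.
    X, y = [], []
    for t in range(2, len(n)):
        X.append([1.0, z[t - 1], p[t - 1] - p[t - 2]])
        y.append(n[t] - n[t - 1])
    a0, a1, a2 = ols(X, y)
    return {"beta0": b0, "beta1": b1, "alpha0": a0, "alpha1": a1, "alpha2": a2}


# Synthetic data with hypothetical parameters (not estimates from this paper).
rng = random.Random(1)
true_b0, true_b1, true_a1 = 1.0, 10.0, -0.4
p = [0.8]
n = [true_b0 + true_b1 * p[0]]
for _ in range(199):
    deviation = n[-1] - (true_b0 + true_b1 * p[-1])
    n.append(n[-1] + true_a1 * deviation + rng.gauss(0.0, 0.5))
    p.append(min(0.95, max(0.3, p[-1] + rng.gauss(0.0, 0.03))))

estimates = engle_granger_ecm(n, p)
```

A negative `alpha1` indicates that crossings move back towards the long-run relationship after a deviation, and its magnitude is the share of the deviation corrected per period.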
We find significant evidence of long-run adjustment behavior, suggesting that the number of people crossing increases in the probability of rescue. When crossings and rescue probability diverge from their equilibrium relationship, adjustment occurs fairly quickly: our results suggest that the log number crossing falls by approximately 40% of the deviation from equilibrium in each period after the divergence. In other words, within four months over 85% of the adjustment needed to restore equilibrium has occurred. Interestingly, we find no significant evidence of a short-run relationship between crossings and rescue probability.

Table 1. Results of estimating the error correction model: Crossings on the Central route (in thousands) vs. rescues on the Central route. Standard errors in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01.
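The "over 85% within four months" statement follows from compounding the per-period adjustment: if a fraction s of the remaining deviation is closed each month, the share closed after k months is 1 − (1 − s)^k. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def cumulative_adjustment(speed, periods):
    """Share of an initial deviation closed after `periods` periods,
    when a fraction `speed` of the remaining deviation is corrected each period."""
    return 1.0 - (1.0 - speed) ** periods


# Adjustment path for a 40% per-month speed of adjustment.
path = [cumulative_adjustment(0.40, month) for month in range(1, 5)]
```

With s = 0.40 and k = 4 this gives 1 − 0.6^4 = 1 − 0.1296 ≈ 0.87, consistent with the figure quoted above.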
The equilibrium relationship is estimated to be:
Next, we test whether the probability of rescue on the Central Mediterranean route affects crossings on the Western Mediterranean route through Spain. As above, we gather flow data from IOM, which reports the monthly number of sea arrivals in Spain as well as the estimated number of deaths along the Western Mediterranean route. 9 The maximum number of monthly crossings for the Western Mediterranean route was 10,598 in October 2018.
In Table 2, we re-estimate the error correction model from Section 4.3 using the differenced number of crossings (in thousands) on the Western route as the dependent variable. As above, our estimates suggest significant long-run adjustments in response to deviation from the equilibrium relationship between Western Mediterranean crossings and the Central Mediterranean probability of rescue. However, the speed of adjustment is considerably slower (20% vs. 40%). As before, we find no significant short-term effect of the probability of rescue on crossings.
The equilibrium relationship is estimated to be:

Evidence that Substitution Behavior Varies by Nationality
We also investigate route choice by nationality, and find that the choice of route is highly correlated with country of origin. We gather additional information on flows from UNHCR, which reports data on monthly sea arrivals to Italy, Greece, Spain and Cyprus broken down by country of origin (UNHCR 2019, 2020b). Figures 4 and 5 illustrate the share of African migrants crossing on each route by nationality over time. From Figure 4, we can see that North Africans primarily take the Central Mediterranean route, with the exception of Algerians (who seem to take advantage of all three routes) and Moroccans (who tend to take the Western Mediterranean route since Morocco is a key departure point for Spain, but who surprisingly preferred the Central Mediterranean route when crossing was easy on this route).
West Africans have generally substituted from the Central Mediterranean route to the Western Mediterranean route over time. This is likely due to the fact that from West Africa, multiple overland routes exist to either Western or Central Mediterranean departure points. In contrast, East Africans from the Horn of Africa favor the Central Mediterranean route with very little substitution over routes, most likely because there are well-established smuggling routes from the Horn to Libya.
10 That is, (4.31 × 0.9) − (4.31 × 0.5).
Interestingly, migrants from two nationalities which are farther from common overland smuggling routes (Comoros and the Democratic Republic of the Congo) do not always choose the most geographically proximate points; we also see little substitution by these nationalities over time.
Taken together, we can see that many nationalities' preference of migration route varies over time. Furthermore, substitution between routes seems to coincide with major policy changes, such as the growing role of the Libyan Coast Guard in intercepting migrants as formalized by the establishment of the SAR zone in June 2018. However, substitution appears constrained by geographic proximity to departure points and by the availability of overland smuggling routes. 11 Therefore, both the routes chosen and the sensitivity of this choice to crossing conditions vary substantially by region of origin.
11 Two other relevant factors are entry requirements for any borders that must be crossed in order to reach the target departure points (to the extent that border crossings are formally monitored), and the likelihood that members of a given nationality will be approved for an asylum claim (which may give some nationalities an incentive to select routes with a lower probability of detection).

Analysis of the Individual Incident Dataset
Thus far, we have analyzed the data on migration flows over the Central Mediterranean routes and found that migrants' decision to cross on the Central Mediterranean route exhibits a long-term response to changes in the probability of rescue.
While migrants make a strategic decision of whether and where to cross depending on crossing conditions, smugglers may also respond to these conditions by varying their strategy. 12 As the LCG claimed increasing control over the coastal zone, migrant boats located in the Libyan SAR zone were more likely to be intercepted and less likely to be rescued to Europe. Consequently, we observe two high-level shifts in the strategic actions of smugglers along the Central Mediterranean route.
First, the use of wooden boats increased, and the average size of boats departing from Libya (as measured by the number of people on board) grew smaller in this phase. The more aggressive the LCG's interception activities are, the greater the distance migrant boats need to cross to secure a rescue by NGOs or European authorities. As a result, anecdotal reports suggest that smugglers have been using space on boats to load more fuel (UNHCR 2018), rather than collecting more revenue (and potentially, slowing boats down) by adding additional passengers.
Second, the share of boats departing from Tunisia relative to Libya, which was very small at the beginning of Phase 3, increased. In this section, we focus on boats departing from Libya because it is not clear whether local smuggling networks in Libya can choose to launch boats from Tunisia, and because the Libyan route has been historically more popular and has more total incidents than the Tunisian route.
We first describe the incident data in Section 5.1. In Sections 5.2 through 5.4, we build and estimate a choice model of Libyan smugglers' strategic shift to smaller boats in response to LCG interceptions. In Section 5.5, we conduct counterfactual estimation of boat sizes under different levels of Libyan interceptions, which explains the snowballing shift towards small boats observed in 2018-2019. Appendix C.3 provides further descriptive analysis of the incident datasets for context.

Description of the Incident Dataset
We collect incident data from Frontex, the European border control agency which supervises the deployment of aerial and naval assets to patrol the EU's maritime borders. The agency records border-related incidents in its Joint Operations Reporting Application (JORA) database. In response to our request for public access to documents, Frontex has released records of incidents which occurred under Operations Hermes, Triton, and Themis from 2014-2019, including: information on the date of detection; the departure country of the migrants; the number of people involved; the number of deaths; the boat type; and whether the boat was detected inside or outside of Frontex's operational area (asktheEU 2020). In total, the datasets we received contained 4,365 incidents originating in Libya from 2014-2019, including rescues involving actors outside of Frontex, such as NGOs and merchant ships. 13 Each incident corresponds to a boat or (occasionally) a set of multiple boats that is acknowledged by Frontex.

Other Datasets and Analyses
We have also gathered four other incident datasets (from Médecins sans Frontières, Watch The Med, Broadcast Warnings, and the IOM), which contain incidents recorded by NGOs, monitoring efforts, and emergency calls. After comparing these different datasets, we found that Frontex is the most comprehensive dataset in terms of volume. Therefore, we solely use the Frontex dataset for our main analyses and use the other datasets for supplementary analyses.
In Appendix C, we conduct a brief comparative analysis of these different datasets which suggests that after 2017Q3, there was (1) a decrease in boat size and (2) an increase in the share of wooden boats compared with rubber boats. We also observe (3) an increase in incident distance from the SAR border, which implies that the smugglers are traveling further before detection and that the mobility of the boats is increasingly important. Finally, we (4) confirm the relationship between mobility and boat size, showing that the probability of boats being found in the Frontex operational area (i.e., farther from the coast of Libya) increases when the boat size is small.
13 We excluded incidents where the transportation type was land-based (i.e. "bus", "camper van", "on foot", and "car"). The dataset we used to estimate the choice model was further reduced to 1,851 incidents, because it was limited to: incidents occurring from 2016 onwards (since Libyan interception data is only available from 2016); incidents involving rubber boats; and incidents where the number of people on board and the number of vessels involved were reported (i.e., not missing).

Strategy Model for the Shift in Boat Size
A key challenge with this incident dataset is that it is not a representative sample of all movements in the region. In particular, the Frontex data does not cover LCG interceptions. 14 As a result, we do not have the data to estimate how a given set of inputs (e.g. boat type or boat size) translates to outcomes (i.e. rescue, interception, or sinking) at the incident level. Instead, we proceed in two stages. First, we identify the overall probability of interception in a given quarter.
Then, we examine how the behavior of smugglers changes as a function of the quarterly interception probability.
Our model of the smuggler's utility depends on two primary factors: the number of people on board a given boat, and the estimated probability that a boat will be intercepted in the Libyan SAR zone. We aggregate the data by quarter because this secures a sufficient number of incidents sampled per quarter, and because we estimate that three to four months is approximately the amount of time it takes for crossing behavior to react to changes in the probability of rescue (see Section 4 for further discussion on the rate of adjustment), which suggests that this may also be a sensible time scale at which to analyze smuggler responses.

Estimating the Probability of Interception in the Libyan SAR zone
We begin by estimating the quarterly probability that a boat that departs from Libya is intercepted, $p^{qL}_{\text{interception}}$, where $q$ denotes the quarter and $L$ denotes the fact that the boat originated in Libya. This part utilizes the flow data as well as the incident data. While the IOM flow data includes the number of interceptions off the coast of Libya ($N^{qL}_{\text{interception}}$), it includes only the total number of arrivals ($N^{q}_{\text{arrivals}}$) and deaths ($N^{q}_{\text{death}}$) for the Central Mediterranean route, which might include incidents originating in nearby countries such as Egypt or Tunisia. To proceed with the estimation of $p^{qL}_{\text{interception}}$, we therefore rely on the incident-level data from Frontex and make the following two assumptions:
1. Frontex's recorded incidents do not cover interceptions. However, Frontex data is a uniform and nearly comprehensive sample of rescue incidents in the region. This assumption is justified by the fact that the total population in the Frontex dataset generally matches the total number of arrivals reported by IOM (see Figure D.8).
2. The smuggler's decision focuses on the probability of interception and does not independently weigh the probability of sinking, which is small. Therefore, excluding this outcome will not substantially affect the model results. This assumption is justified by the fact that the casualty rate is low relative to the number of interceptions and crossings (see Figure 3), and that migrants are typically very distressed by the prospect of LCG interception.
Additional discussion of these assumptions is provided in Appendix D.1.
Using Assumption (1), the number of arrivals from Libya can be estimated as:

$$\hat{N}^{qL}_{\text{arrivals}} = s^{qL}_{\text{rescue}} \, N^{q}_{\text{arrivals}}$$

where $s^{qL}_{\text{rescue}}$ is the quarterly share of migrants rescued who originate from Libya as opposed to other origin countries, which we estimate from the Frontex data. Using Assumption (2), the smuggler's expected probability of interception given departure from Libya is:

$$p^{qL}_{\text{interception}} = \frac{N^{qL}_{\text{interception}}}{N^{qL}_{\text{interception}} + \hat{N}^{qL}_{\text{arrivals}}}$$

The calculated probability of interception over time is illustrated in Figure 6. For all remaining sections of the paper, we simplify the notation by removing the $L$ superscript and use $p^{q}_{\text{interception}}$ to refer to the estimated rate of interception for boats departing Libya only.
Figure 6. The estimated probability of interception for boats departing Libya by quarter.
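The calculation can be sketched in a few lines of Python. This is a minimal illustration under a stated reading of the two assumptions: Libya-origin arrivals are the Frontex-based Libyan share times total arrivals, and the interception probability is interceptions divided by interceptions plus Libya-origin arrivals, with deaths excluded. The function name and the input numbers are hypothetical.

```python
def interception_probability(n_intercepted_libya, n_arrivals_total,
                             share_rescued_from_libya):
    """Quarterly probability that a boat departing Libya is intercepted.

    n_intercepted_libya      -- interceptions off the Libyan coast (flow data)
    n_arrivals_total         -- all Central Mediterranean arrivals (flow data)
    share_rescued_from_libya -- Libyan share of rescued migrants (incident data)
    """
    if not 0.0 <= share_rescued_from_libya <= 1.0:
        raise ValueError("share must lie in [0, 1]")
    n_arrivals_libya = share_rescued_from_libya * n_arrivals_total
    total_outcomes = n_intercepted_libya + n_arrivals_libya
    if total_outcomes == 0:
        raise ValueError("no interceptions or arrivals in this quarter")
    return n_intercepted_libya / total_outcomes

# Made-up quarter: 3,000 interceptions, 5,000 arrivals, 80% of them from Libya.
print(interception_probability(3000, 5000, 0.8))
```

Excluding deaths from the denominator mirrors Assumption (2); including them would lower the estimated probability slightly in quarters with many fatalities.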

The Smuggler's Utility Function
Having calculated the quarterly interception probability, we proceed to analyze incident-level decision-making. We next assume that the smuggler makes a choice of boat size. We discretize boat size into bins of $n \in \{1\text{-}50,\ 50\text{-}100,\ 100+\}$ migrants. The smuggler then selects $n$ to maximize his payoff according to his utility function, which takes the form:

$$U_{in} = \alpha_n + \beta_n \, p^{q}_{\text{interception}} + \varepsilon_{in} \qquad (3)$$

Here, $i$ represents the choice setting (i.e., the incident); $n$ represents the choice of boat size; $\alpha_n$ represents a boat size fixed effect; $p^{q}_{\text{interception}}$ is the estimated quarterly probability that a boat departing Libya will be intercepted in the Libyan SAR zone, for the quarter that corresponds to incident $i$; $\beta_n$ is a boat size-specific coefficient on the quarterly probability of interception; and $\varepsilon_{in}$ is a choice-specific idiosyncratic error term.
In other words, the smuggler's utility is a function of two deterministic factors:
1. The short-run marginal payoff to the number of migrants chosen ($\alpha_n$), i.e. the total rent collected from all $n$ migrants in exchange for the crossing. We generally expect this payoff to increase in $n$, since each passenger is charged a fare for the journey.
2. The expected long-run reputational payoff of crossing ($\beta_n \, p^{q}_{\text{interception}}$), which depends on the smugglers' risk of interception. 15 Since the probability of interception should vary by boat type, but the precise extent of variation is unknown, we allow for boat-size-specific coefficients on this probability. These coefficients are effectively an estimate of how changes in the interception probability affect utility for different choices of boat size. If small boats provide an advantage when interceptions are high, we expect $\beta_{0-50} > \beta_{50-100} > \beta_{100+}$, whereas if large boats provide an advantage we expect the opposite to be true.
The addition of an idiosyncratic error term $\varepsilon_{in}$ allows for random variation in the behavior of smugglers due to unobservables, such that not all smugglers will select the same boat size choice even under the same conditions. 16 Assuming that $\varepsilon_{in}$ takes a Type-I extreme value distribution, the probability of choosing boat size $n$ takes the form of the standard logit probabilities (Train 2009, Section 3.1). Let $N = \{1\text{-}50,\ 50\text{-}100,\ 100+\}$ be the set of possible boat size choices and let $V_{in} = U_{in} - \varepsilon_{in}$; that is, let $V_{in}$ represent the deterministic part of the utility function. Then,

$$P_{in} = \frac{e^{V_{in}}}{\sum_{m \in N} e^{V_{im}}}$$

This model is also sometimes referred to as the Maximum Entropy Inverse Reinforcement Learning (IRL) model (Abbeel and Ng 2004, Ziebart et al. 2008, Ermon et al. 2015), in which the smuggler's choice of boat size is drawn proportional to his exponentiated expected reward: $P_{in}(\alpha_n, \beta_n) \propto e^{V_{in}(\alpha_n, \beta_n)}$.
15 While there may be no immediate consequences of interception for smugglers (since migrants have typically already paid for passage), one can imagine that smuggling businesses will ultimately suffer if they are unable to secure rescues to Europe for their passengers.
16 For example, this term could include unobserved variation in smuggler costs, resources, or preferences. This was one of the motivations for using a choice model, since it generates variation in the smugglers' optimal boat size choices even under the same observable conditions.
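As an illustration of the logit choice probabilities, the Python sketch below evaluates them for a given interception probability. The parameter values are hypothetical and chosen only for illustration; they are not estimates from this paper, and the smallest bin is used as the zero-utility base category.

```python
import math

BOAT_SIZES = ("0-50", "50-100", "100+")

# Hypothetical parameters (illustration only).
ALPHA = {"0-50": 0.0, "50-100": 0.5, "100+": 1.8}   # boat size fixed effects
BETA = {"0-50": 0.0, "50-100": -1.5, "100+": -3.0}  # interception coefficients

def choice_probabilities(p_interception, alpha=ALPHA, beta=BETA):
    """Logit probabilities P_n = exp(V_n) / sum_m exp(V_m), V_n = a_n + b_n * p."""
    v = {n: alpha[n] + beta[n] * p_interception for n in BOAT_SIZES}
    v_max = max(v.values())  # subtract the maximum for numerical stability
    weights = {n: math.exp(v[n] - v_max) for n in BOAT_SIZES}
    total = sum(weights.values())
    return {n: weights[n] / total for n in BOAT_SIZES}

for p in (0.0, 0.5, 0.9):
    print(p, choice_probabilities(p))
```

Because only utility differences matter, subtracting the maximum utility before exponentiating leaves the probabilities unchanged while avoiding overflow.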

Estimation of the Utility Function from Incident-level Data
Using the model in Section 5.2, we empirically estimate the parameters $\alpha_n$ and $\beta_n$ in Equation 3. Recall that $\alpha_n$ describes the payoff to a given boat size which is independent of $p^{q}_{\text{interception}}$, whereas $\beta_n$ describes the long-term payoff associated with the probability of interception vs. rescue, conditional on boat size. We attempt to recover $\alpha_n$ and $\beta_n$ using incident-level data from Frontex.
We restrict our analysis to incidents involving rubber boats. This choice was made for two primary reasons. First, rubber boats tend to have a more uniform physical size, which means that the number of people is a reasonable proxy for crowding; this is not the case for wooden boats, which may vary dramatically in their capacity. Second, while rubber boats can be imported cheaply, the supply of large wooden boats in the region has become increasingly scarce over time as these boats have sunk or been destroyed by anti-smuggling operations (see Section 2.2); therefore, the choice of how many people to place on board a wooden boat may be exogenously affected by the scarcity of large ships. Restricting our analysis to rubber boats preserves the majority of incidents in the Frontex dataset: 74% of incidents originating in Libya involve rubber boats, relative to just 14% of incidents which involve wooden boats. Further details on the distribution of boat sizes and the relationship between boat size and quarterly interception probabilities are discussed in Appendix D.2.
In our model, the log likelihood of the data is:

$$LL(\alpha, \beta) = \sum_{i \in I} \sum_{n \in N} d_{in} \log P_{in}(\alpha_n, \beta_n)$$

where $I$ is the set of incidents in the dataset and $d_{in}$ is equal to one if $n$ is the boat size actually chosen in incident $i$, and zero otherwise. Estimation was performed in Stata SE 12.0 using the clogit command, which optimizes the log likelihood function using Newton's method.
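The structure of this likelihood can be sketched in Python as follows. This is not the Stata routine used for the estimates: it swaps in plain gradient ascent for Newton's method, uses a handful of made-up incidents, ignores the quarterly re-weighting, and all function names are hypothetical.

```python
import math

SIZES = ("0-50", "50-100", "100+")  # "0-50" is the base category (utility 0)

def probabilities(theta, p):
    """Logit probabilities; theta maps each non-base size to (alpha_n, beta_n)."""
    v = {n: 0.0 if n == SIZES[0] else theta[n][0] + theta[n][1] * p
         for n in SIZES}
    v_max = max(v.values())
    w = {n: math.exp(v[n] - v_max) for n in SIZES}
    total = sum(w.values())
    return {n: w[n] / total for n in SIZES}

def log_likelihood(theta, incidents):
    """Sum over incidents of log P_in for the boat size actually chosen."""
    return sum(math.log(probabilities(theta, p)[chosen])
               for chosen, p in incidents)

def gradient(theta, incidents):
    """dLL/d(alpha_n) = sum_i (d_in - P_in); dLL/d(beta_n) = sum_i (d_in - P_in) p_i."""
    g = {n: [0.0, 0.0] for n in SIZES[1:]}
    for chosen, p in incidents:
        pr = probabilities(theta, p)
        for n in SIZES[1:]:
            resid = (1.0 if n == chosen else 0.0) - pr[n]
            g[n][0] += resid
            g[n][1] += resid * p
    return g

def ascent_step(theta, incidents, lr=0.5):
    """One gradient-ascent step on the mean log likelihood."""
    g, k = gradient(theta, incidents), len(incidents)
    return {n: (theta[n][0] + lr * g[n][0] / k, theta[n][1] + lr * g[n][1] / k)
            for n in SIZES[1:]}

# Made-up incidents: (chosen boat size, quarterly interception probability).
incidents = [("100+", 0.1), ("100+", 0.2), ("50-100", 0.3), ("100+", 0.3),
             ("50-100", 0.6), ("0-50", 0.7), ("0-50", 0.75), ("0-50", 0.8)]
theta = {n: (0.0, 0.0) for n in SIZES[1:]}
for _ in range(200):
    theta = ascent_step(theta, incidents)
print(theta, log_likelihood(theta, incidents))
```

Fixing the base category at zero is what identifies the remaining coefficients, since adding a constant to every utility leaves the choice probabilities unchanged.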

Results
Results from the estimation of the model are shown in Table 3. 17 Because choice probabilities are relative, we must fix the utility of one choice category in order to identify the others. We normalize the utility of the 0-50 boat size category by setting $\alpha_{0-50}$ and $\beta_{0-50}$ to zero. The remaining coefficients are then interpreted in relation to this base category.
We can see that when there is no chance of interception, large boats are generally preferred ($\alpha_{0-50} < \alpha_{50-100} < \alpha_{100+}$). This is consistent with our expectations, since smugglers can likely extract a higher rent by launching more crowded rubber boats. However, our results suggest that larger boat sizes face a disadvantage when the probability of interception is nonzero ($\beta_{0-50} > \beta_{50-100} > \beta_{100+}$). This is consistent with the empirical evidence that large boats are chosen less frequently when interceptions are high, as they may struggle to evade the LCG and reach the EU SAR zones. Taken together, these results support the hypothesis that there is a tradeoff between the short-run payoff to launching large boats and the long-run cost to interception as boats grow more crowded.
In Figure 7 Table D.4 of Appendix D.2 for detailed information). For this reason, we re-weight all incidents such that each quarter in the dataset is given equal weight; for further discussion of the role of the weights and a comparison with unweighted estimates, see Appendix D.4.
Table 3. Parameter estimates for the smuggler's utility model. Standard errors in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01.

Counterfactual Estimation
Equipped with these model estimates, we can conduct simulations to illustrate how smugglers are expected to react to a change in interception rates. Using the incident-level parameter estimates for $\alpha_n$ and $\beta_n$, Figure 8 shows how the expected utility (i.e., the deterministic component of the utility) and choice probability for each boat size varies with the quarterly rate of interceptions. We estimate that large boat sizes are preferred until the interception rate approaches 60%, which occurs starting in the third quarter of 2018 (see Figure 6). After this point, small boats are preferred.
Interestingly, midsize boats are never the dominant choice in expectation. Figure 9 illustrates a hypothetical scenario for rubber boats in which the number of incidents per quarter remains the same, but the probability of interception is increased by 10 percentage points (relative to the baseline interception rate) across all quarters. Using the model and these two different interception levels, we estimate the choice probabilities for each scenario. Across all quarters in our dataset, we predict that the smugglers will respond to the changing environment by increasing the use of small boats and decreasing the use of large boats; the effect on the use of mid-size boats varies by period.
Figure 9. Predicted distribution of boat sizes, given a 10 percentage point increase in the probability of interception. "Baseline" refers to the "true" scenario in which the interception rate matches reality. "Strategic response" refers to the scenario in which smugglers adapt their choice probabilities to reflect the higher probability of interception.
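A counterfactual of this kind can be sketched in Python as follows. The parameter values and baseline interception rates are hypothetical placeholders, chosen so that the large-boat utility crosses the small-boat utility at a 60% interception rate; they are not the estimates in Table 3. Capping the shifted probability at 1 is also an assumption of the sketch.

```python
import math

SIZES = ("0-50", "50-100", "100+")
# Hypothetical parameters (illustration only).
ALPHA = {"0-50": 0.0, "50-100": 0.5, "100+": 1.8}
BETA = {"0-50": 0.0, "50-100": -1.5, "100+": -3.0}

def choice_probabilities(p):
    """Logit choice probabilities at interception probability p."""
    v = {n: ALPHA[n] + BETA[n] * p for n in SIZES}
    v_max = max(v.values())
    w = {n: math.exp(v[n] - v_max) for n in SIZES}
    total = sum(w.values())
    return {n: w[n] / total for n in SIZES}

def crossover_rate(alpha_large, beta_large):
    """Interception rate at which the large-boat utility equals the base (zero)."""
    return -alpha_large / beta_large

def counterfactual(baseline_rates, shift=0.10):
    """Baseline vs. shifted choice probabilities for each quarterly rate."""
    rows = []
    for p in baseline_rates:
        p_shifted = min(1.0, p + shift)  # assumption: probabilities capped at 1
        rows.append((p, choice_probabilities(p), choice_probabilities(p_shifted)))
    return rows

print("crossover rate:", crossover_rate(ALPHA["100+"], BETA["100+"]))
for p, base, shifted in counterfactual([0.2, 0.4, 0.6, 0.8]):
    print(p, round(base["0-50"], 3), round(shifted["0-50"], 3))
```

Multiplying each scenario's probabilities by the number of incidents in the quarter would give a predicted boat-size distribution of the kind plotted in Figure 9.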

Discussion and Conclusions
Prior work on migration flows in the Central Mediterranean has focused on assessing the claim that NGO rescues endanger migrants by encouraging crossings and incentivizing riskier trips. In response to these claims, Italy and the EU more generally have acted to increase the capacity of the LCG and encourage a more aggressive regime of interceptions and returns. However, to date there has been little theoretical analysis of how migrants have responded to these changes in recent years.
Our analysis supports the claim that the growing rate of Libyan interceptions has discouraged migrant crossings on the Central Mediterranean route. However, crossings have continued despite a very high rate of interceptions (approaching 80% for boats departing Libya). Therefore, we also analyze how smugglers have adapted to the changing interception environment. When comparing incident-level datasets (Appendix C.3), we observed evidence of strategic responses along several dimensions. Namely, there appears to be a decline in the number of people per boat and a shift towards the use of wooden boats. These strategy shifts coincide with an overall tendency for incidents to occur closer to the EU SAR zones, and we find a general correlation between boat type/boat size choice and incident locations in these later periods.
To formally analyze these changes, we build a utility model of smuggling. Using a discrete choice model, we estimate that there is a positive payoff to launching larger/more crowded boats, but that this is counterbalanced by a penalty on larger boats that rises with the overall rate of interceptions.
Therefore, smugglers trade off between collecting higher rents by continuing to crowd migrants onto boats and the reputational costs of launching crowded boats that have a lower probability of success.
The trend towards the use of smaller boats may have several implications for migrants crossing on the Central Mediterranean route. On the one hand, boats with fewer passengers may be less likely to sink, since overcrowding can make boats less seaworthy. 18 On the other hand, boats that are physically smaller in size may be less likely to be detected at sea and may be more susceptible to weather conditions on the open water. We do see a rise in the rate of deaths reported by the IOM along the Central Mediterranean route after 2017 (see Figure 3), suggesting that deaths may be occurring in the course of LCG interceptions and/or that ships are sinking without a rescue or interception ever being initiated.
Consistent with other findings in the literature, our work suggests that increased border enforcement induces strategic responses on a number of different dimensions, which can lead to unpredictable or unforeseen consequences for the crossing experience. When debating an increase in border surveillance, it is important to consider implications for both migrant and smuggler strategy, as well as the possible long-run impact of these strategy shifts on fatalities. In general, analyzing these implications is difficult because data on informal migration is typically incomplete and/or biased (if it is available at all), and because causal inference is challenging in such a complex environment. We address the latter challenge through the use of an error correction model to reduce the risk of spurious regressions in the flow data, and address the former by combining biased incident data with more representative flow data in our multinomial choice model.

(1974), 1974: "The master of a ship at sea, on receiving a signal from any source that a ship or aircraft or survival craft thereof is in distress, is bound to proceed with all speed to the assistance of the persons in distress informing them if possible that he is doing so."
SOLAS Amendment, UNHCR (2011), 2004: "This obligation to provide assistance applies regardless of the nationality or status of such persons or the circumstances in which they are found."
International Convention on Maritime Search and Rescue, Ch. 2, UN Treaty Collection (1979), 1979: "Parties shall ensure that assistance be provided to any person in distress at sea. They shall do so regardless of the nationality or status of such a person or the circumstances in which that person is found."
SAR Convention Amendment, UNHCR (2011), 2004: "The Party responsible for the search and rescue region in which such assistance is rendered shall exercise primary responsibility for ensuring such coordination and cooperation occurs, so that survivors assisted are disembarked from the assisting ship and delivered to a place of safety, taking into account the particular circumstances of the case . . . "

B Supplementary Details for the Flow Analysis
Below, we provide additional details on the estimation of the error correction models in Sections 4.3 and 4.4.

B.1.1 Checks on the Model
Dickey-Fuller tests are consistent with the hypothesis that the total crossings, the log total crossings, the log odds of crossing, and the probability of rescue are non-stationary but that their first differences are stationary, suggesting that the ECM is appropriate in this setting. Similarly, the Engle-Granger test supports the hypothesis of cointegrating relationships between (1) the total crossings and the probability of rescue, (2) the log total crossings and the probability of rescue, and (3) the log odds of crossing and the probability of rescue.

B.1.2 Alternative Specifications for the ECM of Crossings on the Central Mediterranean Route
Below, we discuss possible alternative specifications for the model presented in Section 4.3.
In Table B.2, we test three different measures of crossing behavior as the dependent variable: the differenced total number of crossings in thousands, which is the dependent variable used in the main paper (columns 1-3); the differenced log number of crossings in thousands (columns 4-6); and the differenced log odds of crossing (columns 7-9). 19 We also experiment with different specifications of the ECM; the model described in Equation 2 is shown in columns 2, 5, and 8, but we also present the results when excluding the short-run adjustment term (columns 1, 4, and 7) and including lagged crossing behavior (columns 3, 6, and 9).
19 In order to calculate the odds of crossing relative to staying, it is necessary to know how many people could potentially cross in a given period. We assume that the maximum potential number of people crossing is equal to the largest number of crossings observed in any period (29,478 for the Central Mediterranean route), multiplied by 10/9 to ensure that there is no period in which all potential crossings occur (because this would result in division by zero when calculating the odds).
We estimate a similar speed of adjustment across all dependent variables: in each period after a divergence from the equilibrium relationship between crossing behavior and the probability of rescue, we estimate that the number of crossings falls by approximately 40% of the deviation from equilibrium, whereas the log number of crossings falls by approximately 43% of the deviation from equilibrium and the log odds of crossing falls by approximately 46% of the deviation from equilibrium. The estimated speed of adjustment is slightly slower when using a model that omits the differenced probability of rescue from the previous period (the short-run effect), and slightly faster when we also include the differenced dependent variable from the previous period. Our coefficients on the long-run speed of adjustment parameter remain significant across all specifications, and we never estimate a significant short-run adjustment.
The estimated equilibrium relationships for the log number of crossings and the log odds of crossing are:

$$\log(N_{t,\text{cross}}) = -1.17 + 4.13 \, P_{t,\text{rescue}}$$

$$\log\left(\frac{P_{t,\text{cross}}}{P_{t,\text{stay}}}\right) = -5.33 + 5.70 \, P_{t,\text{rescue}}$$

These equations suggest that when the probability of rescue falls from approximately 90% to 50%, the number of monthly crossings will decline by approximately 10,300 to 12,200 people.
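The quoted range can be checked directly from the two equilibrium equations. The Python sketch below does this under one interpretive assumption: the pool of potential crossers in footnote 19 is read as the maximum observed monthly crossings (29,478) scaled by 10/9. Function names are hypothetical.

```python
import math

MAX_MONTHLY_CROSSINGS = 29_478
POTENTIAL_POOL = MAX_MONTHLY_CROSSINGS * 10 / 9  # assumed reading of footnote 19

def crossings_log_model(p_rescue):
    """Monthly crossings in thousands from log(N) = -1.17 + 4.13 * P."""
    return math.exp(-1.17 + 4.13 * p_rescue)

def crossings_log_odds_model(p_rescue):
    """Monthly crossings (people) from log-odds = -5.33 + 5.70 * P."""
    odds = math.exp(-5.33 + 5.70 * p_rescue)
    return POTENTIAL_POOL * odds / (1.0 + odds)

decline_log = crossings_log_model(0.9) - crossings_log_model(0.5)
decline_odds = crossings_log_odds_model(0.9) - crossings_log_odds_model(0.5)
print("log model decline (thousands):", round(decline_log, 2))
print("log-odds model decline (people):", round(decline_odds))
```

Under these assumptions the two specifications imply declines of roughly 10.3 thousand and 12.2 thousand crossings per month, which brackets the range stated above.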

B.1.3 Stability of Coefficient Estimates and Backtesting
In addition to testing alternative specifications of the model, we also tested the robustness of the model to being fit on different time windows, in order to gain insight into the stability of coefficient estimates.
In Figure B.1a, we re-estimate the speed of adjustment over different sliding windows, starting with the five-month window from January-May 2016 and ending with the full 48-month dataset spanning January 2016-December 2019. This allows us to determine how the coefficient estimate changes as more (recent) data is added to the model. As we can see, the estimated speed of adjustment has decreased 20 over time but appears to be stabilizing as more data points are used for estimation, suggesting that we are approaching a more consistent estimate.
Table B.2. Alternative specifications of the error correction model for crossings along the Central Mediterranean route. Standard errors in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01.
The stability of coefficient estimates is particularly important for cases in which such estimates could be used to predict future behavior. Therefore, in Figure B.1b, we compare the observed and predicted period-to-period changes in the number of crossings along the Central Mediterranean route. We train the model on the years 2016-2018 and then predict the period-to-period changes in crossings in 2019 using the trained ECM. For each data point, we then take the true observed arrivals from period $t-1$ and add $\Delta N^{\text{central}}_{t,\text{cross}}$ as predicted by the model to determine the estimated number of arrivals. As shown in the figure, we are able to approximate the observed arrivals trend; the mean absolute error (MAE) is 1,949 arrivals on the training dataset (relative to a monthly average of 11,226 arrivals) and 1,141 arrivals on the test dataset (relative to a monthly average of 2,199 arrivals). While the model fits the training dataset, its predictive power appears limited, as the MAE from the ECM is higher than that from the naïve approach of predicting no change since the previous period.
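The backtesting comparison can be sketched in Python as below: one-step-ahead forecasts add the model's predicted change to the previous observed value, and their MAE is compared with a naive no-change forecast. The series and predicted changes are made-up numbers for illustration, and the function names are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between observed and forecast values."""
    return sum(abs(a - f) for a, f in zip(actual, predicted)) / len(actual)

def one_step_forecasts(observed, predicted_changes):
    """Forecast for period t = observed value at t-1 + predicted change for t.
    predicted_changes[0] is unused because period 0 has no predecessor."""
    return [observed[t - 1] + predicted_changes[t]
            for t in range(1, len(observed))]

def naive_forecasts(observed):
    """No-change benchmark: forecast for period t is the observed value at t-1."""
    return observed[:-1]

# Made-up monthly crossings (thousands) and model-predicted changes.
observed = [3.0, 2.5, 2.8, 2.0, 1.6]
predicted_changes = [0.0, -0.2, 0.1, -0.3, -0.1]

targets = observed[1:]
mae_model = mean_absolute_error(targets, one_step_forecasts(observed, predicted_changes))
mae_naive = mean_absolute_error(targets, naive_forecasts(observed))
print("model MAE:", mae_model, "naive MAE:", mae_naive)
```

A model adds predictive value only if its MAE falls below the naive benchmark on held-out periods; in the toy data the model happens to win, which need not hold for real series.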
20 Recall that the absolute magnitude of the rate of adjustment determines the speed whereas the sign determines the direction of the adjustment, so a less negative rate of adjustment is a "slower" rate of adjustment.

B.2.1 Checks on the Model
Dickey-Fuller tests suggest that the number of crossings (in thousands) in the Western Mediterranean is also non-stationary, and that its first differences are stationary. However, we do not find significant evidence of a cointegrating relationship according to the Engle-Granger test; given that we have only 48 observations for training the model, we may be under-powered to detect such a relationship. We also note that this model seems less stable; Box-Steffensmeier et al. (2014) suggest that when the independent and dependent variables in the equilibrium equation are reversed and the ECM is re-estimated, the long-run adjustment term should remain significant, which is not the case here. While we fit an ECM for comparability with the model of crossings on the Central Mediterranean route, the results of this model in the Western Mediterranean context should be interpreted cautiously.

C.1 Supplementary Data Sources
As described in Section 5, our primary analysis relies on incident-level data from Frontex. In these appendices, we draw on additional datasets to conduct supplementary analysis. Specifically, we gather additional data on incidents from four primary data sources:
• Watch The Med/Alarm Phone: Watch The Med (WTM) is a platform that has monitored migration incidents in the Mediterranean since 2012 (Watch the Med 2020a). Watch The Med data comes primarily from Alarm Phone, an emergency telephone hotline designed to help migrants and refugees at sea. Alarm Phone typically receives calls directly from migrant boats that have been equipped with a satellite phone, which also allows these boats to report their positions. Incident reports may contain information on the ship location and type, the number of people on board, the port of departure, and the details of the rescue. Where geospatial information on incident locations is available, we restrict our dataset to incidents which occurred within the Italian, Maltese, or Libyan rescue zones. This seems to be the most comprehensive incident-level dataset. However, many incidents are missing precise location data, and the dataset does not appear to include Libyan interceptions. Sometimes, multiple boats are reported as part of the same incident.
Watch The Med; Alarm Phone (primarily); 2014-06 to 2019-12; 325; Incident reports that are largely drawn from calls for help to the Alarm Phone NGO hotline. Alarm Phone has no rescue capacity, but it advocates on behalf of migrants, tracks their progress, alerts rescue authorities, requests intervention, and tops up the credit of boats' satellite phones so they can continue making calls. Calls typically come only from boats equipped with satellite phones, or from relatives on shore. Calls to Alarm Phone seem to be lower in phases of high rescue activity, possibly because (1) many boats were independently detected by rescue NGOs in the region without a call; and (2)
Does not include incidents where everyone survived. Sources are mixed in quality (though the dataset does provide a source quality estimate), and sometimes multiple incidents appear to be reported as one.

C.2 Comparison of Alternative Data Sources
The analysis below is one of the first to compare multiple incident-level data sources, which we summarize in more detail in Table C.3. We argue that comparing multiple data sources is important because each dataset captures different phenomena. This is illustrated in Figure C
From the right panel of Figure C.2, we can also see that the coverage of the datasets varies. Frontex reports a peak of over 200 incidents a month, whereas the Watch The Med dataset never contains more than 20 incidents per month for this region. It is also evident that with the growing restrictions on NGO activities, the volume of incidents handled by MSF has declined.
Furthermore, we can observe a shift in the distribution of incidents across datasets over time. For example, the Broadcast Warnings dataset recorded a large volume of incidents from 2015-2018. However, as the Italian authorities began turning over rescue responsibility for these calls to the Libyan authorities, it appears that some migrants have substituted to an alternative channel when requesting assistance: the volume of Broadcast Warnings has fallen even as calls to the independent
22 Watch The Med often associates a single set of geocoordinates with multiple different incidents. IOM attempts to recover incident information from public data sources, and it seems they may use approximate coordinates when no precise data is available.
Above, we show the monthly count of incidents referencing different rescue authorities, which we identified by searching the text of the Broadcast Warnings for the phrases "MRCC Rome", "[J]RCC Libya", "Libyan Coast Guard", "RCC Malta", and "MRCC Tunis." Note that a single message might reference multiple rescue authorities, so the categories are not mutually exclusive.
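The phrase search used to tag Broadcast Warnings by rescue authority can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the original processing code: the message texts are invented, the matching is a case-insensitive substring search, and "[J]RCC Libya" is handled by searching for "RCC Libya" (which also matches "JRCC Libya").

```python
from collections import Counter

# Display label -> substring searched for (assumed matching rule).
# Note that a plain substring search for "RCC Malta" would also match "MRCC Malta".
AUTHORITY_PHRASES = {
    "MRCC Rome": "MRCC Rome",
    "[J]RCC Libya": "RCC Libya",
    "Libyan Coast Guard": "Libyan Coast Guard",
    "RCC Malta": "RCC Malta",
    "MRCC Tunis": "MRCC Tunis",
}


def authorities_in(text):
    """Return the set of authority labels whose phrase appears in the text."""
    upper = text.upper()
    return {label for label, phrase in AUTHORITY_PHRASES.items()
            if phrase.upper() in upper}


def monthly_counts(messages):
    """messages: iterable of (month 'YYYY-MM', text) pairs.
    A message is counted once for every authority it references,
    so the categories are not mutually exclusive."""
    counts = Counter()
    for month, text in messages:
        for label in authorities_in(text):
            counts[(month, label)] += 1
    return counts


# Hypothetical messages, for illustration only.
messages = [
    ("2018-06", "Vessel in distress. Contact MRCC Rome or JRCC Libya."),
    ("2018-06", "Rubber boat adrift; report sightings to Libyan Coast Guard."),
    ("2018-07", "Contact RCC Malta."),
]
counts = monthly_counts(messages)
for key in sorted(counts):
    print(key, counts[key])
```

The first message contributes to two categories at once, which is why the monthly category totals can exceed the number of messages.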

C.3.1 Variation in Strategic Inputs (Boat Type and Size) Over Time
Using a combination of incident-level data sources, we empirically analyze variations in two key strategic choices made by smugglers: how many migrants to place on each boat, and what type of boat to use. Figure  the fact that rafts became an increasingly viable transport option during this period (since rafts are cheaper, and the growing intensity of rescue efforts made rescue more likely), thus decreasing the relative profitability of launching large wooden boats (which have very high fixed costs). If we examine rubber boats, we can find support for this hypothesis: Figure C.4b shows that crowding on rubber boats increased through early 2017 while rescue capacity was high, before beginning to decline towards 2018-2019. Particularly interesting is the trend for the MSF rescues, which tend to occur within the Libyan SAR zone: we can see that crowding still remains higher for these NGO rescues than for other types of incidents. More generally, the fall in boat sizes is consistent with the notion that smugglers are currently using smaller, less crowded boats to move migrants further out to sea before detection, a hypothesis that we evaluate in more detail below.
Next, we analyze the choice of boats over time, which is illustrated in Figure C.5. In the Frontex and MSF data shown in Figure C.5a, we can see that the use of rubber boats peaked in 2016, when NGO rescue activity was at its highest. While MSF continues to rescue a high share of rubber boats, in both datasets we can see an increasing tendency to select wooden boats, which continued through 2019. While the Watch The Med dataset appears to show an increase in the proportion of rubber rafts, this may be because rubber rafts are increasingly likely to call Watch The Med for help: fewer of them are being independently discovered by NGOs or EU ships near the Libyan coast, and in earlier periods these boats had been more likely to call the Italian MRCC for help instead (see Appendix C.2 for further discussion).
Figure C.5 note: Each dot represents the monthly fraction of rubber or wooden boats for the respective dataset, whereas the lines represent locally weighted scatter plot smoothing (LOWESS) fits to the scatter plot trends. Note: The Frontex and Watch the Med datasets included boat types other than rubber or wooden. Therefore, the trends do not sum to one across both plots.

C.3.2 Variation in Outcomes (Boat Location) Over Time
We have little visibility into the activities of the Libyan Coast Guard, since our incident datasets generally focus on NGO or EU-led rescues. In the absence of comprehensive data on interceptions, we focus on proxy outcomes based on the location of individual incidents.
Boats departing Libya pass from the Libyan SAR zone into EU (Italian/Maltese) SAR zones (these zones are illustrated in Figure 2). In Phase 2, boats in either zone were highly likely to be rescued to Europe. However, starting in Phase 3 boats in the Libyan SAR zone were increasingly likely to be captured and returned to Libya, and crossing into the EU SAR zones was more and more important for securing rescue to Europe. Therefore, the average location of individual incidents in
Figure C.6. Distance from the Libyan SAR zone border - Incident location by month and dataset. Each dot represents the monthly median distance or fraction of incidents for the respective data set, whereas the lines represent locally weighted scatter plot smoothing (LOWESS) fits to the scatter plot trends. In Panel (a), negative distances represent incidents on the Libyan side of the border, and positive distances represent incidents on the EU side of the border. In Panel (b), "land" is defined as the land mass of any country bordering the Mediterranean. In Panels (a) and (b), we note that the Frontex data set only included location data for a subset of time periods and for incidents that occurred outside the Frontex operational area, that is, nearer to the coast of Libya.
smugglers are aware that boats must go farther to increase their probability of rescue, and may adapt their strategies accordingly. As noted above, they appear to be launching smaller boats and shifting to wooden boats, but migrants on board may also be delaying their decision to call for help until they are farther away from the Libyan coast. Weak evidence for the latter hypothesis may be seen by comparing incident records from Watch The Med/Alarm Phone to the other incident datasets; in recent periods, calls to Watch The Med/Alarm Phone appear to show a slightly larger shift away from land and into the EU SAR zone.
The one exception to these trends is the IOM missing migrants dataset, in which incidents appear to be moving closer to shore on average. It is important to note that this dataset does include deaths recorded off the coast of Libya (for example, IOM collects data on the number of dead and missing people from Libyan interceptions, and also reports incidents where bodies wash up on shore) and therefore might contain a more representative sample of incidents in the Libyan SAR zone when interception rates are high. Deaths may be occurring increasingly near the coast for two reasons. First, it seems likely that LCG rescues may be more dangerous for migrants on average, due both to the lack of professionalism and expertise by the LCG, and to the fact that migrants sometimes conduct risky maneuvers to avoid capture by the LCG. Second, it is also likely that a larger proportion of boats near the coast go undetected due to the decrease in NGO patrol presence, the slower response time of the LCG, and migrants' reluctance to call for help; this may lead to a growing number of sinking incidents in which bodies wash up on shore.
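The location proxies discussed in this subsection reduce to simple monthly summaries of a signed distance from the Libyan SAR zone border. The sketch below assumes that such a distance has already been computed for each incident (negative on the Libyan side, positive on the EU side); the field names, units, and values are hypothetical, and the smoothing step is omitted.

```python
from collections import defaultdict
from statistics import median


def monthly_summary(incidents):
    """incidents: iterable of (month, signed_distance_km) pairs.
    Returns, per month, the median signed distance and the share of
    incidents located on the EU side of the border (distance > 0)."""
    by_month = defaultdict(list)
    for month, distance in incidents:
        by_month[month].append(distance)
    return {
        month: {
            "median_distance_km": median(values),
            "share_eu_side": sum(1 for v in values if v > 0) / len(values),
        }
        for month, values in sorted(by_month.items())
    }


# Hypothetical incidents, for illustration only.
incidents = [
    ("2017-01", -40.0), ("2017-01", -10.0), ("2017-01", 5.0),
    ("2019-01", -5.0), ("2019-01", 20.0), ("2019-01", 35.0),
]
summary = monthly_summary(incidents)
for month, stats in summary.items():
    print(month, stats)
```

Using the median rather than the mean keeps a single incident far from the border from dominating a month with few observations.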

C.3.3 The Connection Between Strategic Inputs and Outcomes
Finally, we provide summary details on the connection between boat type, crowding, and incident outcomes. We assume that in Phases 1 and 2, reaching a European SAR zone or the Frontex operational area is relatively unimportant to migrants because almost all boats are rescued to Europe regardless of whether or not they are in the Libyan SAR zone. In Phase 3, however, we assume that migrants make the most effort to move away from the Libyan coast before being detected, since exiting the Libyan SAR zone and approaching Europe will make it more likely that they are rescued rather than returned to Libya. At the same time, we expect that the growing rate of interceptions in this period may make the payoffs to different boat sizes and types more pronounced, since a poor choice of boat size or type will leave migrants more vulnerable to interception. This is consistent with Figure C.7, which shows increasingly divergent outcomes by boat size and type in Phase 3. From Figure C.7a, we can see that smaller boats appear to have an advantage in reaching the Frontex operational area. This advantage is most pronounced for boats with fewer than 50 passengers, followed by boats with 50-100 passengers. From Figure C.7b, we see that wooden boats are most successful in reaching the Frontex operational area; this is likely because they are more seaworthy in general, although we do see some rubber boats that are able to cross over successfully. 23

C.4 Inferring Smuggler Strategy from Incident-Level Datasets
In this section, we have analyzed smuggler strategy from two different angles. First, we have shown that the inputs chosen by smugglers (i.e., the size and type of boat) have varied over time. Second, we have documented a corresponding shift in smuggling outcomes (i.e., the location of incidents involving migrants) and shown a correlation between these inputs and outcomes. This provides support for our main strategic model of boat size choice, which has been fit using incident-level data from Frontex.
We conclude with a note on the incident-level datasets analyzed above.
Figure C.7. Incident location by month and dataset, by number of people and boat type. Panel (b): The probability that an incident is in the Frontex operational area, by boat type. Each dot represents the monthly fraction of incidents in the Frontex operational area for the respective boat size or type, whereas the lines represent locally weighted scatter plot smoothing (LOWESS) fits to the scatter plot trends.

D.1 Justification of Assumptions
Below, we justify the assumptions used in the primary incident-level analysis, which we outline in Section 5.2.1.
Assumption 1: Frontex's recorded incidents do not cover interceptions. However, Frontex data is a uniform and nearly comprehensive sample of rescue incidents in the region.
To analyze the representativeness of Frontex's dataset, we compare the number of people involved in Frontex incidents to the total number of sea arrivals to Italy or Malta reported by the IOM.
As shown in Figure D.8, while the datasets do not match exactly, the Frontex dataset appears to capture most people who reached Europe on this route during our study period. A discrepancy emerges in 2018, when the IOM data begins to incorporate data on arrivals to Malta as well as Italy; however, the coverage of the Frontex dataset remains around 70-80%. Therefore, it seems reasonable to assume that the distribution of departure countries in the Frontex dataset approximates the distribution of departure countries in the IOM dataset. In Appendix D.3, we show that Frontex incidents (i.e., incidents in which a boat was rescued to Europe) were slightly less likely to involve dead or missing migrants in Phase 3, even as smaller boats were used and the incidents appear to have occurred farther from Libya. Therefore, it appears that the strategic changes have not increased the risk of death for passengers, conditional on boats being rescued to Europe.
In the IOM flows dataset, there appears to have been a rise in the fatality rate during Phase 3, but IOM does not report whether these fatalities are associated with the Libyan or the Tunisian route. On the Libyan route, we estimate that fatalities may be closely associated with the probability of interception, because (1) migrants may be exposed to risks in the course of LCG operations; (2) the LCG may have a slower overall response time to distress incidents when it is charged with a rescue; and (3) the efforts of migrants to evade the LCG may lead them to sink without detection. Therefore, we expect that the probability of interception may act as a proxy for the risk of sinking.

D.2 Additional Details on the Frontex Incident Dataset
Below, we briefly provide additional summary statistics on the Frontex incident dataset. In Figure D.9, we plot the empirical distribution of boat sizes in the dataset. We see that most wooden boats tend to be small, but that there is a long tail of extremely large boats. In contrast, rubber boats generally hold under 200 people, with a peak around 100-150 passengers. For this reason, we have focused our estimation on rubber boats. The right panel of the figure illustrates that for both types of boats, the average number of people on board is positively correlated with the quarterly probability of rescue, which is consistent with the hypothesis that smugglers are responding strategically to LCG interceptions by changing the size of the boats they launch.

D.3 T-tests Supporting Strategic Shifts in the Frontex Dataset
Next, we briefly compare the characteristics of Frontex incidents originating in Libya during Phases 2 and 3 using two-sample t-tests with unequal variances. This analysis is intended to support the descriptive plots included in Section C.3. From Table D.5, we can see that incidents in Phase 3 have a significantly lower average number of people per boat; are significantly less likely to involve rubber boats; and are significantly more likely to occur in the Frontex operational area. Incidents in Phase 3 are less likely to involve dead or missing migrants, and have fewer deaths on average. However, this latter result may be a function of boat size; when we test for differences in the average proportion of dead or missing people per boat, we find no significant difference.
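As a reminder of what a two-sample t-test with unequal variances (Welch's test) computes, the sketch below evaluates the test statistic and the Welch-Satterthwaite degrees of freedom. The passenger counts are invented for illustration and are not drawn from the Frontex dataset; obtaining a p-value would additionally require the t distribution, which is left to a statistics library.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)
    for two samples that may have different variances."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t_stat = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_stat, df


# Hypothetical people-per-boat samples, for illustration only.
phase2 = [120, 135, 110, 140, 125]
phase3 = [80, 95, 70, 100, 85, 90]

t_stat, df = welch_t(phase2, phase3)
print(f"t = {t_stat:.2f}, df = {df:.2f}")
```

Unlike the pooled-variance test, the degrees of freedom here are generally not an integer and fall below the pooled value of n1 + n2 - 2 when the two sample variances differ.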

D.4 Robustness to Alternative Weighting Schemes
To estimate the utility function using incident-level data, we used frequency weights in which each incident was weighted by the inverse of the total number of incidents in its quarter: $w = 1/N_q$. This ensured that each quarter was given equal weight in the estimation and effectively up-weighted incidents from later quarters, when there were fewer incidents observed.
To show that our results are not an artefact of the weighting scheme, in Columns (4)-(6) of Table D.6 we compare the frequency weights with two alternative weighting schemes. Column (4) shows unweighted estimates, whereas Column (5) shows estimates weighted according to the inverse of the probability of rescue: $w = 1/p_q^{rescue}$. The motivation for weighting according to the probability of rescue is that when the probability of rescue is low, each observed incident should be up-weighted because it represents other, unobserved incidents which were filtered out of the dataset by interceptions.
Note: Standard errors in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01.
When we compare estimates from these alternative weights to the frequency weights in Column (6), we see that the frequency weights lead to estimates that are more extreme in magnitude. That is, the baseline payoff to crowding ($\alpha_n$) is higher, whereas the penalty to interception ($\beta_n$) is more negative. This is unsurprising because, as noted above, the frequency weights place higher weights on incidents later in the dataset, when the probability of interception is higher and the strategic response is more evident.
In Columns (1)-(3) of Table D.6, we also present the results of a simple model where utility consists only of the first ($\alpha_n$) term (i.e., the payoff to crowding) and ignores the penalty to interceptions ($\beta$); again, we consider all three weighting schemes. We can see that in this model, estimates of $\alpha_n$ are more conservative in magnitude than in the full model; this is presumably because the simpler model fails to independently capture the interception penalty associated with larger boats, and this most likely dampens the estimates of $\alpha_n$.
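The two non-trivial weighting schemes compared in this section can be illustrated with a short sketch. The quarter labels and rescue probabilities below are hypothetical and serve only to show how the weights are constructed; they are not estimates from the paper.

```python
from collections import Counter


def frequency_weights(quarters):
    """w_i = 1 / N_q, where N_q is the number of incidents in incident i's quarter."""
    n_q = Counter(quarters)
    return [1.0 / n_q[q] for q in quarters]


def rescue_weights(quarters, p_rescue):
    """w_i = 1 / p_q^rescue, using the rescue probability of incident i's quarter."""
    return [1.0 / p_rescue[q] for q in quarters]


# Hypothetical data: four incidents in an early quarter, two in a later one.
quarters = ["2016Q3"] * 4 + ["2018Q3"] * 2
p_rescue = {"2016Q3": 0.8, "2018Q3": 0.5}  # assumed values, for illustration only

fw = frequency_weights(quarters)
rw = rescue_weights(quarters, p_rescue)

# Under frequency weights, the weights within each quarter sum to one,
# so every quarter contributes equally regardless of how many incidents it has.
totals = Counter()
for q, w in zip(quarters, fw):
    totals[q] += w
print(dict(totals))
print(rw)
```

Both schemes up-weight the later quarter in this example, but for different reasons: the frequency weights respond to the smaller number of observed incidents, while the rescue weights respond to the lower probability that an incident is observed at all.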