The price is right!? A meta-regression analysis on willingness to pay for local food

We study the literature on willingness to pay (WTP) for local food by applying meta-regression analysis to a set of 35 eligible research papers that provide 86 estimates on consumers’ WTP for the attribute “local.” An analysis of the distribution of WTP measures suggests the presence of publication selection bias that favors larger and statistically significant results. The analyzed literature provides evidence for statistically significant differences among consumers’ WTP for various types of product. Moreover, we find that the methodological approach (choice experiments vs. other approaches) and the analyzed country can have a significant influence on the generated WTP for local.


Introduction
Local food production systems are one of agribusiness's major innovations of recent decades [1,2]. According to Mintel's Locavore report, consumers in the US are highly motivated to purchase local food, with almost 50% of them stating that they buy local foods at least weekly [3]. Moreover, in 2019, Mintel released a report looking specifically at private label food and beverage trends in the US. To test priorities in food shopping, Mintel asked consumers which attributes encourage them to buy store brands. Nearly 22% mentioned locally sourced products as a reason [4]. Similarly, in Europe, in 2017, German consumers were asked how often they purchase locally produced foods. Approximately 42% stated 'very often', and 45% answered 'sometimes' [5]. These examples from the US and Europe suggest that interest in local food is an international phenomenon.
The shift toward local foods became noticeable in the late 1990s, when consumer behavior studies linked to food purchasing began to show that consumers prefer local food [6,7]. Since then, the sector experienced significant growth, which triggered extensive empirical research on consumer behavior related to the "local" attribute of the food [2,8]. A qualitative synthesis, shown in Fig 1 (Please see S1 Table for a list of all articles), reveals the growth in the body of research that investigates consumers' demand and preferences for local food. Nevertheless, there is still ambiguity regarding consumers' actual willingness to pay (WTP) for the "local" attribute.
The remainder of the paper is organized as follows. In the next section, we explain the data generation process and describe the identified variables. This also includes an initial graphical investigation of publication selection bias. In the third section, we present and elaborate on the models used to carry out the meta-regression analysis (MRA). The fourth section discusses and interprets the results. Finally, the last section offers some conclusions and more general implications, and discusses limitations and suggestions for future research.

Data
When conducting MRA, it is important to follow a clear approach while searching for the relevant literature. Therefore, we follow the "Meta-Analysis of Economics Research Reporting Guidelines" provided by Stanley et al. (2013 [53]). We conduct a thorough review of the scientific literature using the following electronic databases: To carry out the search, we apply a set of keywords that include "Willingness to Pay", and "Local Food", or "Local", or "Regional", or "State Grown" to ensure that we identify all relevant literature. We use Boolean strings and combine keywords with operators such as AND, and OR to produce more relevant results. We do not include the word "label" in our keywords as it is not usually specified in the article's title, abstract or keyword list, but rather is implied in the way the food attribute "local" is communicated to the consumers. Our search examines titles, abstracts and/or article keywords. Even though other definitions of local food, such as, "state grown" or "regional", are used by the studies, the overarching umbrella term "local" is mentioned by the relevant articles in one of these sources. Fig 2 displays the summary of our search results organized using the "Preferred Reporting Items for Systematic Reviews and Meta-Analyses" (PRISMA) template. PRISMA is a popular tool that is used to conduct and report search results [54,55]. As shown by PRISMA our initial literature identified 149 published papers, including 25 duplicates, such as, same papers yielded by different search engines, earlier versions of papers submitted for conferences, and working papers that were published later. Out of the remaining 124 articles, 52 are not suitable for the present analysis because they do not include a quantitative measure of WTP for "local." For example, they use qualitative methods to carry out the analysis or do not actually estimate WTP.
Because a uniform interpretation of the analyzed measure is of utmost importance for the feasibility of MRA [56,57], we assess the remaining 72 studies (see S1 Table for a list of all articles), and identify 35 articles that include a comparable measure(s) of WTP with clearly reported units, such as dollars per pound or dollars per ounce. The remaining 37 studies do not meet the criterion of a uniform interpretation for various reasons. For instance, twelve studies use consumer segmentation techniques or latent class analysis where participants are segmented based on certain variables (e.g., knowledge of organic/local). However, these studies do not provide all information necessary to analyze each consumer segment independently, hence not yielding an individual WTP measure for the local attribute [25,46,58,47]. We also exclude thirteen studies because they do not provide a clear measure for local (e.g., studies that bundle local with other attributes, such as societal benefits or support for local agriculture and environment, or do not explicitly measure WTP) (e.g., [18,59]). Moreover, we eliminate five studies that use different base prices or only state percentage increase in WTP (e.g., [42,60]). Another six studies do not provide weight units for the product, and instead state use measurements, such as, "loaf of bread" or "box of cereal" (e.g., [61,62]). For these studies, it was not possible to derive a comparable WTP measure. Finally, we exclude one study that was a pilot test with only 27 participants in total.
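The screening counts reported above can be cross-checked with a few lines of arithmetic. The sketch below only restates the numbers given in the text; the variable names and the short labels for the exclusion reasons are illustrative and do not come from the PRISMA diagram itself.

```python
# Counts as reported in the text; names are illustrative.
identified = 149
duplicates = 25
no_quantitative_wtp = 52
included = 35

after_dedup = identified - duplicates                # articles left after removing duplicates
assessed = after_dedup - no_quantitative_wtp         # articles assessed for a comparable WTP measure
excluded_not_comparable = assessed - included        # articles dropped at the final stage

# Reasons for exclusion at the final stage, as listed in the text.
exclusion_reasons = {
    "segmentation or latent class without per-segment WTP": 12,
    "no clear measure for local": 13,
    "different base prices or percentage-only WTP": 5,
    "no weight units for the product": 6,
    "pilot test with 27 participants": 1,
}

# The listed reasons should account for every excluded article.
assert sum(exclusion_reasons.values()) == excluded_not_comparable

print(after_dedup, assessed, excluded_not_comparable)
```

The individual reasons add up to the 37 excluded studies, so the reported flow from 149 identified papers to 35 included articles is internally consistent.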
To ensure consistent coding and reduce the likelihood of errors in the data generation process a minimum of two co-authors read each article in the long list. During in-person meetings, the research group resolved disagreements regarding the inclusion of a specific article in our review and clarified other issues, such as, the most appropriate way to code possibly ambiguous findings and methodological categorizations. Table 1 presents a chronological overview of the 35 articles included in the MRA, providing the year of publication, authors, journal, number of participants, origin (region or country), and type of product analyzed. We also include WTP reported in each article. Multiple WTP estimates for a single product indicate that either more than one sample was utilized (e.g., participants were segmented based on the location of their recruitment), or different definitions of local were applied. Note that in our MRA we focus on a broader set of study design characteristics that potentially drive variation in reported WTP estimates, such as, specific product attributes, demographics of the analyzed sample, definition of local used, and experimental method employed (including hypothetical or non-hypothetical).
We then categorize the identified papers in order to derive binary variables for the underlying study design characteristics that might potentially affect the WTP estimates (e.g., [57]). The derivation of these variables has to be conducted under careful consideration of the tradeoff between a reasonable number of categories that will adequately capture the variation in WTP observations, and variables with low explanatory power relating to categories that only occur in a few articles (i.e., variables with a high share of 0-observations). The resulting categories are described below and summarized in Table 2.
Dependent variable. In our MRA we use the WTP estimates reported by the 35 articles as the dependent variable. We converted the WTP values to $/lb in order to keep the currency and weight units consistent across studies. In addition, we define a WTP measure in terms of the extra percentage that consumers are willing to pay over the base price. The base price was calculated as the average price used by the respective study following, e.g., Dolgopolova and Teuber (2017 [63]). We use the percentage values as a robustness check. It is also important to point out that the final number of WTP measures identified (86) is larger than the number of studies included in the MRA (35), since some of the papers report multiple WTP estimates due to multiple products per study, multiple samples, and/or multiple definitions of local used in a single article. The mean WTP for the local attribute across the included studies is $1.204/lb (0.292 when calculated as a percentage premium). Moreover, the mean number of participants on which the WTP estimates are based is reported. This number is an indicator of how precisely the measure of WTP is estimated [52]. As will become apparent below, the relationship between reported WTP estimates and their precision can be an important indicator for the presence of publication selection bias [52,64]. Table 2 indicates that WTP estimates are, on average, generated based on samples with 622 participants.
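The two versions of the dependent variable can be illustrated with a minimal sketch. It assumes that amounts are already expressed in US dollars (currency conversion is not shown) and that the percentage measure is the WTP divided by the study's average price. The function names and the example study are hypothetical and are not taken from the paper.

```python
# Pounds contained in one unit of each weight measure (1 lb = 16 oz = 0.45359237 kg).
POUNDS_PER_UNIT = {"lb": 1.0, "oz": 1.0 / 16.0, "kg": 1.0 / 0.45359237}


def wtp_per_lb(wtp, unit):
    """Convert a WTP stated in dollars per `unit` into dollars per pound."""
    return wtp / POUNDS_PER_UNIT[unit]


def premium_share(wtp_lb, base_price_lb):
    """WTP as a share of the study's average (base) price, both in $/lb."""
    return wtp_lb / base_price_lb


# Hypothetical study: a premium of $0.25/oz on a product with an average price of $10.00/lb.
wtp = wtp_per_lb(0.25, "oz")
share = premium_share(wtp, 10.0)
print(wtp, share)
```

Studies that report prices per "loaf" or per "box" cannot be passed through such a conversion, which is why they are excluded from the analysis.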
Independent variables.
Year of study. Concerning the underlying study design characteristics, we first identify the year in which the data were collected. This allows us to test whether there is a trend over time in WTP estimates for local. For example, as local food gains more popularity, the WTP might have increased over time. On the other hand, as local food becomes more mainstream as time passes, WTP may subside. We find that the included studies focus on average on 2009 data. However, using this Year of study variable caused multicollinearity problems in the MRA. Therefore, we use a dummy variable with value one for studies conducted post 2011, and zero otherwise [63]. We choose the year 2011 as a threshold because it constitutes the median year of study across the included literature.
Country of study. Additionally, we include a dummy variable, Country of study-US, which captures whether the study was conducted in the US. The "non-US" category includes studies conducted in European countries, such as Germany, Spain and Italy, and one study conducted in The Commonwealth of Dominica. This country-specific variable allows us to identify whether the reported WTP differs between the US and other countries.
Type of product. Literature on WTP for local considers various product types, since consumer preference for local food does not only pertain to fresh products but extends to processed and animal products. For example, surveying randomly selected South Carolina households, Willis et al. (2013 [65]) find that households have a higher WTP for locally grown produce relative to non-locally grown. Conducting an in-store survey among residents of Kentucky, Hu et al. (2009 [38]) find that consumers had a higher WTP for pure blueberry jam, blueberry-lime jam, blueberry yogurt, blueberry dry muffin mix, and blueberry raisinettes with a Kentucky-grown label. Likewise, interviewing consumers in Germany, Wägeli et al. (2016 [11]) find that consumers are willing to pay more for fresh milk, pork cutlets and eggs from the local region. The reported WTP, however, may vary significantly among the products analyzed. Therefore, we separate the included studies by product type, leading to three categories: (i) animal products, such as meat, fish, poultry, eggs, and dairy products; (ii) produce, such as apples, beans, melons, potatoes, strawberries, tomatoes, and yams; and (iii) processed food products, such as blueberry fruit rollups, blueberry raisinettes, jam, and applesauce. The premium for processed local food can be higher than for unprocessed local food because consumers are willing to pay a higher price premium for value-added shelf-stable products [66]. Contrary to this, there might be a discount for processed local food because consumers may only trust unprocessed foods, i.e., whole products that were produced in the region, something that is more difficult to evaluate for processed foods considering that multiple ingredients are involved.
Definition of local. With regard to the definition of "local", the literature has focused particularly on the following four categories: (i) State Grown; (ii) marketing program logos and labels: local brand; Grown Fresh with Care in Delaware, Maryland's Best, Jersey Fresh, PA Preferred, Virginia's Finest, Quality certified Bavaria; (iii) more precise definitions: product of city, province, or region; (iv) any of the following general definitions: local, produced locally, locally grown, or grown "nearby". The reported WTP might differ across these categories, as consumers seem to prefer a more precise definition of local, such as "produced within 50 miles" [14], or more narrowly defined geographical boundaries, such as sub-state regions [13,17,30].
Method. Several methods have been employed across the literature to carry out the analysis with a majority of research based on choice experiments. As can be observed from Table 2, about 85% of identified WTP estimates have been generated using this method. Moreover, another set of studies has focused on a heterogeneous set of methods, such as, auctions and contingent valuations leading to two main method categories: (i) choice experiment, and (ii) other than choice experiment. This allows us to determine if the WTP estimates yielded by conducting choice experiments differ in WTP from the other experimental methods. Also, we include a dummy variable that captures whether the experiment was hypothetical. Table 2 reveals that 80% of included WTP estimates stem from hypothetical experiments. Since hypothetical studies are often criticized for not being incentive compatible, we examine whether there is a difference in reported WTP. However, since previous studies have demonstrated that hypothetical settings can provide a bias free estimate of marginal WTP [67,68,69,70], we expect to find no difference between WTP estimates generated based on hypothetical and non-hypothetical experiments.
Participants' origin. Moreover, for each study we determine the place participants were recruited from. This leads to two separate categories: (i) store shoppers, i.e., participants surveyed in person at the place of purchase, and (ii) "other" participants that were recruited through a market research company/online-survey database or through random representative samples (e.g., recruited by mail, phone, or at areas with heavy traffic, such as downtown museums or holiday parades). The WTP reported in the studies that used store shoppers as participants might be lower because those participants are in the shopping environment and, thus, more price conscious.
Number of attributes. We also include the number of additional attributes the study used to describe the product, since this might have an effect on the resulting WTP estimate. Consumers' choice satisfaction tends to decrease when complexity of the offered alternatives increases [71,72]. For example, having alternatives that are described using many different attributes has a significant effect on the ability to make a choice [73,74]. One reason for this is an intensity of the cognitive effort necessary to make a choice [75]. Therefore, as the number of attributes increases, the complexity of the experiment might have a negative effect on participants' WTP. On the other hand, if the attribute "local" is more salient among others, the number of attributes used will not affect WTP for local.
Demographics. Finally, we account for demographic characteristics of the surveyed participants that may have an effect on the resulting WTP estimate. Among those are the age (the mean reported by a study), gender (% of females), and income [24,23,42,37].
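The coding of study design characteristics into binary variables can be sketched as follows. The base categories (produce, marketing program labels, studies up to 2011, non-US, methods other than choice experiments, non-hypothetical settings, participants recruited outside stores) follow the descriptions in the text. The dictionary keys, the category labels, and the example record are hypothetical and only serve as an illustration.

```python
def code_estimate(record):
    """Map one WTP estimate's study design onto 0/1 dummies (base categories coded 0)."""
    return {
        "year_post_2011": int(record["year"] > 2011),
        "country_us": int(record["country"] == "US"),
        # Product type: produce is the base category.
        "animal": int(record["product"] == "animal"),
        "processed": int(record["product"] == "processed"),
        # Definition of local: marketing program labels are the base category.
        "def_state_grown": int(record["definition"] == "state_grown"),
        "def_region": int(record["definition"] == "region"),
        "def_general": int(record["definition"] == "general"),
        "choice_experiment": int(record["method"] == "choice_experiment"),
        "hypothetical": int(record["hypothetical"]),
        "store_shoppers": int(record["recruitment"] == "store"),
        # Count variable, kept as is.
        "n_attributes": record["n_attributes"],
    }


# Hypothetical estimate: a hypothetical choice experiment on local jam in the US.
example = {
    "year": 2013, "country": "US", "product": "processed",
    "definition": "state_grown", "method": "choice_experiment",
    "hypothetical": True, "recruitment": "online", "n_attributes": 3,
}
coded = code_estimate(example)
print(coded)
```

Coding the base categories as zero is what later allows the regression constant to be read as the WTP of the base group.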

Graphical analysis of publication selection bias
Publication selection bias refers to a tendency to prefer estimating and publishing statistically significant results over results that are not statistically significant [57]. Stanley (2005, 2008 [52,76]) shows that the relationship between analyzed estimates and their precision (e.g., standard errors or sample size) can serve as an indicator for publication selection bias. For example, average t-statistics of around two across the literature of interest, which correspond to statistical significance at approximately the 5% level, are a strong indication of publication selection bias [57].
We use a scatter diagram of the relationship between estimated effects and their precision, also known as funnel plot. This plot can be used as an initial informal indicator for publication selection bias [52,77]. While the most precise estimates at the top of this plot should be close to the true effect, the less precise ones at the bottom of the plot are more dispersed resembling an inverted funnel shape. Without publication selection bias, estimated effects should vary randomly and symmetrically around the true WTP effect as all imprecise estimates at the bottom of the plot have the same chance of being reported [78]. In turn, if the plot is over-weighted on either one of the sides, this is considered an indicator for publication selection bias [52].
There are several ways to measure precision of estimated effects, with the most common one being the inverse of the standard error (e.g., [52,56,79,57]). However, for the WTP effects analyzed in the present case, standard errors are not available as these effects are calculated as a combination of regression coefficients, for which the calculation of standard errors is not possible without additional information on the underlying estimation. Nevertheless, the sample size (n) and particularly its square root (sqrt(n)) can also serve as an adequate precision measure because it is proportional to the inverse of the standard error [52,64]. For example, Stanley (2005 [52]) finds that correlations between (1/SE) and sqrt(n) exceed 0.9. Moreover, Sterne et al. (2000 [80]) and Macaskill et al. (2001 [81]) show that in MRA sqrt(n) is a superior measure of precision compared to standard errors. While standard errors are estimated values that are affected by random sampling errors, this does not apply to sqrt(n) [52]. This is also important for our MRA, as we include precision as an independent variable to explain the variance in reported WTP estimates. Since (1/SE) is measured with error, its inclusion as independent variable in a regression analysis will lead to errors-in-variables bias. In contrast, sqrt(n), although highly correlated with (1/SE), is free of estimation error [52]. Note that focusing on differences between published and working papers to identify potential publication bias is based on the assumption that working papers remain unpublished due to undesirable (i.e., insignificant results). However, it can well be the case that articles are not published and remain working papers due to quality issues with respect to method applied, writing style, etc. These articles do not necessarily have less significant (i.e., less desirable) results. 
Moreover, rational authors likely already follow a strategy of producing "desirable" results in the initial stages of research while preparing for journal publication. Therefore, including working papers in the MRA might in some instances not help to detect publication bias, while the inclusion of precision as an independent variable is more accurate (e.g., [82,83]). We use the mean of the 10% most precisely estimated WTP effects (measured by n) as the measure for the "true" WTP for local food. We do so because standard errors become smaller as n increases, implying that reported WTP approaches the "true" value as n → ∞ [52]. For the WTP for local food literature this value is 0.29, represented by the vertical line in the upper panel of Fig 3. As a robustness check, we also calculate the proxy for the "true" WTP using the 20% most precisely estimated WTP effects, leading to a value of 0.40 (lower panel of Fig 3). In both cases the plot is strongly skewed to the right-hand side of the "true" value, indicating publication selection bias towards larger WTP estimates.
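The proxy for the "true" effect used in the funnel plot, the mean of the most precisely estimated effects, can be sketched as below. The data are made up for illustration and are not the 86 estimates of the paper. How many estimates count as "the top 10%" when the share does not divide evenly is not stated in the text; rounding up is an assumption of this sketch.

```python
import math


def precise_mean(estimates, share):
    """Mean WTP of the `share` most precise estimates, with precision measured by n.

    `estimates` is a list of (wtp, n) pairs. Rounding the cut-off up is an assumption.
    """
    ranked = sorted(estimates, key=lambda pair: pair[1], reverse=True)
    k = max(1, math.ceil(share * len(ranked)))
    return sum(wtp for wtp, _ in ranked[:k]) / k


# Hypothetical (wtp, n) pairs, not taken from the paper.
toy = [(0.35, 2000), (0.55, 1500), (1.2, 300), (2.5, 120), (0.9, 600),
       (1.8, 200), (3.1, 90), (0.7, 800), (1.5, 250), (2.2, 150)]

proxy_10 = precise_mean(toy, 0.10)
proxy_20 = precise_mean(toy, 0.20)

# Share of estimates to the right of the proxy; a value far above one half
# would point to an asymmetric funnel.
right_share = sum(1 for wtp, _ in toy if wtp > proxy_10) / len(toy)
print(proxy_10, proxy_20, right_share)
```

In this toy example almost all estimates lie to the right of the proxy, which is the kind of asymmetry the funnel plot is meant to reveal.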
Despite the usefulness of funnel plots to provide an initial indication of publication selection bias, their weakness lies in the assumption of a single underlying "true" effect. However, different countries, participant types, or product types may be characterized by their own distinct "true" effects [52,57]. The following meta-regression, therefore, provides a more objective test for the asymmetry of WTP estimates that also considers underlying study design characteristics as potential determinants for variation in WTP estimates.

Meta-regression models
We use MRA to assess previous research quantitatively because MRA is a powerful tool that provides information about relationships of interest by combining results from various previous studies [49]. When summarizing empirical findings of studies, focusing on a similar economic phenomenon, MRA is able to go beyond the estimates that are obtained from individual samples [50]. More precisely, MRA uses differences across studies as explanatory variables in a regression model to explain the effect of interest [84]. When combining information from independent but similar research, meta-analysis "borrows power" from multiple studies to improve parameter estimates that are obtained from a single study [85]. This allows us to estimate a proxy for the 'true' WTP effect.
The basic hypothesis of our MRA is that the variation in reported WTP estimates can be explained by the study design characteristics summarized in Table 2. Those include the year of study, number of participants and their origin, product type, definition of "local", methodological approach, the number of additional product attributes used in the study, and some key demographics of participants. Therefore, we estimate several specifications of the following MRA model (e.g., [52,76]):

\[ \mathit{WTP}_i = \beta_0 + \beta_1\,\mathit{precision}_i + \sum_{k} \gamma_k X_{ki} + \varepsilon_i \tag{1} \]

where \(\mathit{WTP}_i\), the dependent variable capturing the i = 1, ..., 86 identified WTP estimates, is regressed on precision, \(X_k\) is a vector containing the k variables related to the study design used to estimate the WTP for "local" (with coefficients \(\gamma_k\)), and \(\varepsilon_i\) is a classical i.i.d. error term. We start with a simple version of (1) that only includes precision measured by sqrt(n) as independent variable:

\[ \mathit{WTP}_i = \beta_0 + \beta_1 \sqrt{n_i} + \varepsilon_i \tag{2} \]

The estimated constant of (2), \(\hat{\beta}_0\), provides a proxy for the "true" WTP effect. Without publication selection bias, the observed WTP effects should vary randomly around this "true" effect, independently of their precision (sqrt(n)) [52,64]. Hence, the t-test for \(\hat{\beta}_1\), also known as the funnel asymmetry test (FAT), can be used to detect publication bias. Accordingly, rejection of \(H_0: \beta_1 = 0\) indicates publication bias [52]. The test of \(H_0: \beta_0 = 0\), also known as the precision effect test (PET), serves as an indicator for the presence of a significant WTP for "local" effect after correction for publication bias [52]. As a robustness check, we also estimate (2) using the number of participants (n) as the precision measure.
Our final MRA model extends (2) by additionally considering all variables \(X_k\) that capture the study design used to estimate the WTP effect:

\[ \mathit{WTP}_i = \beta_0 + \beta_1 \sqrt{n_i} + \sum_{k} \gamma_k X_{ki} + \varepsilon_i \tag{3} \]

The econometric estimation of the MRA models specified by (2) and (3) involves two hurdles. First, heterogeneous variances of the underlying WTP estimates might lead to heteroscedasticity in the error terms (\(\varepsilon_i\)), which causes biased standard errors of (2) and (3) (e.g., [52,50]). Nevertheless, the square root of the number of participants (sqrt(n)) is a good indicator for this heteroscedasticity because it is positively related to the estimation precision [52]. Therefore, to generate efficient estimates of (2) and (3) with corrected standard errors, we use weighted least squares (WLS) regression with sqrt(n) as weights [86,52,56].
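The simple model with precision as the only regressor can be estimated in closed form. The sketch below is a minimal illustration: it assumes that "sqrt(n) as weights" means each squared residual is multiplied by sqrt(n), it uses made-up data, and the function name is hypothetical. It returns point estimates only; the FAT and PET additionally require standard errors.

```python
import math


def fat_pet_wls(wtp, n):
    """Weighted least squares of WTP on sqrt(n), with sqrt(n) also used as weights.

    Returns (b0, b1): b0 is the proxy for the "true" WTP (tested by the PET),
    b1 is the coefficient on precision (tested by the FAT).
    """
    x = [math.sqrt(ni) for ni in n]
    w = x[:]  # assumption: weights equal sqrt(n)
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, wtp)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, wtp))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Made-up data in which WTP falls with sample size, as a biased literature would show.
n_obs = [100, 400, 900, 1600, 2500]
wtp_obs = [2.6, 2.2, 1.8, 1.4, 1.0]
b0, b1 = fat_pet_wls(wtp_obs, n_obs)
print(b0, b1)
```

A negative slope means that smaller, less precise studies report larger WTP, which is the pattern the FAT is designed to pick up.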
A second hurdle arises from the fact that the 86 collected WTP effects constitute 35 clusters of estimates from the same study. Consequently, intra-cluster error correlations may affect WTP observations, which would result in biased standard error estimates of (2) and (3) [50,57]. Therefore, when estimating (2) and (3), we apply several approaches to mitigate intra-study error dependency: (i) WLS with heteroscedasticity robust standard errors; (ii) WLS with cluster robust standard errors; and (iii) wild bootstrapped standard errors. WLS with robust standard errors is considered the base specification, while WLS with cluster robust standard errors is usually considered the superior approach to capture the heteroscedasticity in meta-regression data [77]. Nevertheless, Angrist and Pischke (2008 [87]) show that the minimum number of clusters for its application should be 42. As our data only consist of 35 study clusters, we also apply the wild bootstrap specification as a robustness check, which is particularly suited to meta-regression data with a small number of clusters [88].
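To make the idea of cluster robust standard errors concrete, the following sketch computes a sandwich variance estimate for the two-parameter model, treating each study as one cluster. It is an illustration under stated assumptions: no small-sample correction is applied, the data and study labels are made up, and the function name is hypothetical. Statistical packages typically add a finite-sample adjustment, so their output would differ somewhat. The wild bootstrap is not shown.

```python
import math
from collections import defaultdict


def wls_cluster_robust(y, x, w, cluster):
    """WLS of y on [1, x] with weights w; returns coefficients and cluster robust SEs."""
    # Bread: A = X'WX and its inverse (2x2 closed form).
    s_w = sum(w)
    s_wx = sum(wi * xi for wi, xi in zip(w, x))
    s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    det = s_w * s_wxx - s_wx ** 2
    a_inv = [[s_wxx / det, -s_wx / det], [-s_wx / det, s_w / det]]

    # Coefficients: b = A^{-1} X'Wy.
    s_wy = sum(wi * yi for wi, yi in zip(w, y))
    s_wxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    b0 = a_inv[0][0] * s_wy + a_inv[0][1] * s_wxy
    b1 = a_inv[1][0] * s_wy + a_inv[1][1] * s_wxy

    # Score of each cluster: sum over its estimates of w_i * e_i * [1, x_i].
    scores = defaultdict(lambda: [0.0, 0.0])
    for yi, xi, wi, g in zip(y, x, w, cluster):
        e = yi - b0 - b1 * xi
        scores[g][0] += wi * e
        scores[g][1] += wi * e * xi

    # Meat: B = sum over clusters of s_g s_g'.
    m00 = sum(s[0] * s[0] for s in scores.values())
    m01 = sum(s[0] * s[1] for s in scores.values())
    m11 = sum(s[1] * s[1] for s in scores.values())

    # Sandwich V = A^{-1} B A^{-1}; only the diagonal is needed for the SEs.
    def diag(r):
        return (a_inv[r][0] ** 2 * m00
                + 2 * a_inv[r][0] * a_inv[r][1] * m01
                + a_inv[r][1] ** 2 * m11)

    se0 = math.sqrt(max(diag(0), 0.0))
    se1 = math.sqrt(max(diag(1), 0.0))
    return (b0, b1), (se0, se1)


# Made-up example: six WTP estimates from three studies (clusters).
n_obs = [100, 144, 400, 900, 1600, 2500]
x_obs = [math.sqrt(v) for v in n_obs]
y_obs = [2.4, 2.0, 1.9, 1.2, 1.1, 0.9]
study = ["A", "A", "B", "B", "C", "C"]
coef, se = wls_cluster_robust(y_obs, x_obs, x_obs, study)
print(coef, se)
```

Because residuals are summed within a study before being squared, estimates from the same study are allowed to be correlated, which is the dependence the second hurdle describes.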

Results
We present our meta-regression results in Tables 3 and 4. As described above, we use sqrt(n) and n as precision measures and apply three different approaches to correct for intra-study error correlations. Table 3 presents WLS results for the simple model without additional study design covariates (Eq 2). Columns (3) and (4) display the results for the main specification, WLS with cluster robust standard errors. The significant and negative coefficients of sqrt(n) and n confirm the presence of publication bias already detected by the funnel plot. This finding is consistent across the remaining methods used to control for intra-study error dependence (robust standard errors in columns (1) and (2) as well as Wild bootstrapped standard errors in columns (5) and (6)).
The estimated constant, which serves as a proxy for the "true" WTP for local after correcting for publication bias, is our main parameter of interest. The results in Table 3 indicate the presence of a significant WTP for local effect. That is, the constant estimate is significant in all of our models, implying that, after controlling for publication bias, the weighted average of WTP for local across the included studies ranges between $1.696/lb and $2.076/lb.
Table 4 presents the results of the full MRA models that take into account methodological differences across the studies included in our analysis. Model diagnosis shows that all models are overall significant based on the F-test. S4 Table reports that none of the variance inflation factor (VIF) values exceeds the critical level of 10. Moreover, the correlation matrix reported in S2 Table does not point to high correlations among the set of independent variables. This indicates that a severe degree of multicollinearity among the set of included explanatory variables is not present. To assess the remaining degree of heteroscedasticity after employing a weighted estimation technique, we calculate Breusch-Pagan (BP) test statistics for the final models. Results show that the null hypothesis of homoscedasticity is not rejected in any of the cases (χ² = 11.32, p = 0.660, df = 14 for the specification that uses sqrt(n) as a precision measure; χ² = 11.78, p = 0.624, df = 14 for the specification using the number of participants as precision measure).
Similar to the simple model (Table 3), in our interpretation we focus on the main specification based on WLS with cluster robust standard errors. These results are presented in columns (3) and (4), while the remaining specifications shall serve as robustness tests. The results confirm the presence of publication bias as the coefficients for sqrt(n) and n remain significant and negative after the inclusion of relevant study design covariates. Moreover, the estimated constant terms are significant and positive. This indicates that after controlling for publication bias and variation in the WTP due to methodological and other study-specific characteristics, a premium for local attribute remains present. Note that the constant reflects the "true" empirical WTP for local beyond publication bias if all study design covariates are equal to zero [89]. Hence, one should be careful when interpreting the constant parameter because the results suggest that several covariates included in the model have a significant effect. We discuss this in more detail below. Several covariates related to the study design turn out to be significant. We find that the Year of study estimate is significant and negative, implying that consumers' WTP for local food has decreased over time, perhaps because it became more widely available as even supermarkets are now increasingly offering locally sourced products [90]. On the other hand, the estimated coefficient of Country of study-US is significant and positive, indicating that the studies that were carried out in the US reported a significantly higher WTP for local food than non-US studies. Therefore, researchers, farmers and policymakers should be careful when generalizing the results from different countries. With regards to the products analyzed, we find a significantly higher WTP for local Animal products and Processed products compared to local Produce, the base group of the estimation. 
This indicates that the price premium for processed and, hence, value-added local food products is higher than for unprocessed alternatives, such as produce [66]. This finding also seems to be consistent with the previous research that considers multiple products in their studies, and reports higher WTP for local animal products compared to local produce [37,91]. Future studies should take these results into consideration when deciding on the type of a product to utilize for their analysis.
We find no evidence that reported WTP vary depending on the definition of "local" used in the study. This suggests that consumers do not seem to differentiate between the various labels used to convey the fact that the product is produced locally. Labels based on a general definition of "local", State Grown labels, as well as, labels referring to a specific region yield the same WTP estimates as labels related to marketing programs (the base category of the regression model). The goal of marketing programs is often to increase consumers' WTP. The fact that these labels result in similar WTP as those related to the other definitions indicates that they might lack awareness and support among consumers [92]. This might be due to the fact that they are not widely promoted and explained.
The significant and positive Method-choice experiment variable indicates that employing choice experiments can lead to higher WTP estimates as compared to the application of other experimental techniques, including auctions and contingent valuation. This is in line with Gracia et al. (2012 [59]) and Grebitus et al. (2013a [93]) who find that WTP measures from choice experiments and auctions differ. The coefficient of Hypothetical is insignificant, suggesting that there is no significant difference between the WTP obtained by conducting hypothetical and non-hypothetical experiments. This finding is consistent with previous research that shows that hypothetical studies are a good representation of non-hypothetical settings and provide bias free estimates of marginal WTP [67,68,69,70].
Also, we find that the Number of attributes used in the study has no effect on WTP for local. Similarly, our results suggest that there is no difference in reported WTP among studies that use store shoppers as compared to participants recruited through a market research company, online-survey database or through mailing and calling (the base category of the regression model). This implies that samples of randomly recruited participants through various channels, including on-line, yield similar results to samples that use "real" shoppers as participants. This is consistent with prior research [94,95,96]. However, while we also find no evidence that reported WTP estimates vary significantly over Age, Gender of the study participants has a significant negative effect on the resulting WTP for local, indicating that female consumers might be more price conscious compared to male consumers. The variable income was excluded from the estimation as it is highly correlated with the Country of Study-US variable leading to severe multicollinearity problems.
To interpret the estimated coefficient of the constant, we use model (4) as an example. Setting all categorical variables related to the underlying study design equal to zero, inserting mean values for the variables participants (n) and Gender, and adding the constant yields a statistically significant value of 1.89. This suggests the presence of a significant WTP for local of $1.89/lb after correction for publication bias for the base group (before 2011, non-US, produce, marketing definition of local, no choice experiment or hypothetical, participants recruited outside the shopping locations, and zero additional attributes). Starting from this value, we can calculate the WTP for each combination of study design attributes. For example, focusing on the US increases this value by 0.79, while considering studies post 2011 leads to a decrease of 0.81.
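The calculation described above can be sketched in a few lines. This is a minimal illustration, assuming the adjustments are simply additive, as in a linear regression model. Only the three values quoted in the text (1.89, 0.79, and 0.81) come from the study; the names `BASE_WTP`, `ADJUSTMENTS`, and `predicted_wtp` are hypothetical, and the remaining coefficients of model (4) are not reproduced here.

```python
# Base-group WTP for model (4) as quoted in the text, in $/lb.
BASE_WTP = 1.89

# Adjustments quoted in the text; keys are hypothetical labels.
ADJUSTMENTS = {
    "us": 0.79,          # study conducted in the US
    "post_2011": -0.81,  # study published after 2011
}


def predicted_wtp(*design_attributes):
    """Add the quoted adjustments for the given attributes to the base value.

    Assumes a purely additive (linear) model.
    """
    total = BASE_WTP + sum(ADJUSTMENTS[name] for name in design_attributes)
    return round(total, 2)


us_only = predicted_wtp("us")                # 1.89 + 0.79
post_2011_only = predicted_wtp("post_2011")  # 1.89 - 0.81
```

Under these assumptions, a US study before 2011 would imply roughly 2.68 $/lb, and a non-US study after 2011 roughly 1.08 $/lb.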
Note that the small number of available observations (69) compared to the rather large number of explanatory variables included (14) might point to a problem of model overfitting. Nevertheless, the recommendations on model overfitting by Howell (1987 [97]) and Harris (1985 [98]) state that the number of observations should exceed the number of regressors by at least 50, which is fulfilled in the present case (69 − 14 = 55).
Finally, directly taking WTP as reported in the underlying paper might result in biases caused by, e.g., heterogeneity across products or countries that cannot be controlled for by the explanatory variables included in the MRA [99]. Therefore, as an additional robustness check, we estimated our models using the percentage of WTP premium over the average price of the product (e.g., [63]) as the dependent variable. The results are presented in Tables A and B in S3 Table and largely mirror our original findings. In addition, a funnel plot using the percentage WTP premium closely resembles the plot using the direct WTP measure (see S1 Fig).
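The alternative dependent variable can be illustrated with a short sketch. It assumes that the percentage WTP premium is the reported premium divided by the product's average price, expressed as a fraction; the function name and the example numbers are hypothetical and are not taken from any of the reviewed studies.

```python
def percentage_premium(wtp_premium, average_price):
    """Express a WTP premium as a share of the product's average price.

    Both arguments must be in the same unit (e.g., $/lb).
    """
    if average_price <= 0:
        raise ValueError("average_price must be positive")
    return wtp_premium / average_price


# Hypothetical illustration: a $1.00/lb premium on a product that
# sells for $2.50/lb on average corresponds to a premium share of 0.4.
example_share = percentage_premium(1.00, 2.50)
```

Scaling by the average price makes estimates for cheap and expensive products more comparable, which is the motivation given above for the robustness check.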

Conclusion
The body of research on local food continues to grow, with many articles investigating the premium consumers are willing to pay for local. This literature, however, provides a range of estimates for the local attribute that appears to vary significantly based on, for example, the type of "local" labeling employed [38,48,35,32,39] or the product category used [15,26,22,17,11]. Therefore, the objective of this paper was to determine a holistic estimate of the WTP for the local attribute. To do so, we utilize an MRA, which is a quantitative method used to evaluate the effect of study-specific characteristics on published empirical results. Collecting all relevant evidence on this topic and utilizing a systematic review methodology, we find a significant mean WTP estimate for products labeled as "local" that ranges between $1.70/lb and $2.08/lb (0.414 and 0.522 when the percentage WTP premium is used). As such, this research contributes to the broad literature that studies consumer demand for local food by deriving a proxy for "true" WTP for local.
In addition, we detect publication bias in the literature reviewed because we identify a significant relationship between WTP for "local" estimates and their precision. This suggests that there might be a tendency to select a specific combination of estimates and precision that leads to statistical significance, implying that significant estimates are more likely to be selected for publication.
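A relationship of this kind is commonly examined in meta-regression with a funnel asymmetry regression, in which each estimate is regressed on a measure of its (im)precision. The following is a minimal sketch under stated assumptions, not the specification used in this study: it takes standard errors as the precision measure, weights each observation by the inverse squared standard error, and uses invented data. The function name is hypothetical, and no significance test is computed.

```python
def funnel_asymmetry_fit(estimates, std_errors):
    """Weighted least squares fit of estimate = b0 + b1 * std_error.

    Weights are 1 / std_error**2, so more precise estimates count more.
    Returns (b0, b1): b1 captures the link between estimates and their
    imprecision; b0 is the estimate implied at zero standard error.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    x_bar = sum(w * se for w, se in zip(weights, std_errors)) / total
    y_bar = sum(w * y for w, y in zip(weights, estimates)) / total
    sxy = sum(w * (se - x_bar) * (y - y_bar)
              for w, se, y in zip(weights, std_errors, estimates))
    sxx = sum(w * (se - x_bar) ** 2 for w, se in zip(weights, std_errors))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Invented data in which less precise studies report larger estimates.
std_errors = [0.1, 0.2, 0.4, 0.5, 0.8]
estimates = [1.0 + 2.0 * se for se in std_errors]
b0, b1 = funnel_asymmetry_fit(estimates, std_errors)
```

In this constructed example the fit recovers a slope of about 2 and an intercept of about 1. A slope clearly different from zero is the pattern usually read as a sign of publication selection, while the intercept serves as a bias-corrected estimate.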
Using our analysis, we also examine how major methodological characteristics of the studies affect the WTP estimates. For example, our results indicate that the specific labeling used by a study to convey the "local" attribute does not affect the reported results. This suggests that consumers do not seem to favor any particular definition of local. Therefore, policymakers and farmers should take our findings into consideration when deciding how to promote local food. For example, investing in marketing program labels seems inefficient because such labels do not have a significant effect on consumers' WTP. This might be due to a lack of awareness of such programs.
Instead of looking to increase sales through the use of various labels, farmers should consider extending their product lines to include processed and, hence, value-added local items. By extending their product line, farmers will increase the variety of goods available. This, in turn, may improve their profitability because, according to our results, consumers are willing to pay higher premiums for value-added local products than for local produce. Moreover, it may present farmers with an opportunity to expand their distribution to intermediated channels, since shelf-stable items are highly sought after there.
Our results also indicate that there is no difference in reported WTP measured by hypothetical as opposed to non-hypothetical experiments. On the other hand, utilizing choice experiments seems to result in higher WTP than other experimental techniques, including auctions and contingent valuation. Note that although our findings reveal that the method used (choice experiment vs. other methods) has an impact on the resulting WTP estimate, we cannot draw any conclusions about which experimental method is more appropriate. Therefore, studies that investigate the reliability and validity of various research techniques may explore these findings further.
This research is not without limitations. First, the number of studies included is limited by the search criteria imposed by our research objective. For example, some articles that do not state "local" explicitly in their title, abstract and/or article keywords, even though they include it in their research design, may have been missed. Therefore, it might be beneficial to repeat our analysis in the future using wider search criteria. Meanwhile, research on WTP for the local attribute should focus on including all relevant information in the study description, to ensure transparency and allow for in-depth meta-analyses.
Second, while the increasing number of local food articles over time may suggest a rise in interest in local food, it may also indicate that there have been more food or agricultural economics papers published over the years. Future studies should consider looking at the relative rather than the absolute number of articles published. Third, using the year 2011 as a cutoff point to identify whether there is a change in local food preferences over time may be considered shortsighted. Finally, while the majority of studies used in our MRA utilize some variation of the Random Parameters Logit Model for their estimations, five papers employed another type of estimator. However, given the small number, it is not feasible to control for the type of econometric method used. Also, the type of data collection predetermines the type of econometric analysis used. Therefore, including both types of variables would likely lead to multicollinearity between variables related to the econometric method and variables related to the type of data collection. Nevertheless, the estimation method applied might have an effect on the final WTP value reported.
Despite these limitations, the results of this study are valuable because they provide useful information about consumers' response to labeling and marketing products as local. Our findings can be taken into consideration by future theoretical and empirical research focusing on the WTP for local food. Also, a more precise proxy for the value that consumers place on the local attribute can assist stakeholders from industry, and those involved in policymaking, planning and management, in making better decisions when setting prices and developing promotional activities.
Supporting information S1