Quantifying the foodscape: A systematic review and meta-analysis of the validity of commercially available business data

  • Alexandre Lebel,

    alebel@criucpq.ulaval.ca

    Affiliations Evaluation Platform on Obesity Prevention, Quebec Heart and Lung Institute Research Centre, Quebec City (QC), Canada, Graduate School of Urban Planning and Land Management, Laval University, Quebec City (QC), Canada

  • Madeleine I. G. Daepp,

    Affiliation Department of Urban Studies & Planning, Massachusetts Institute of Technology, Cambridge (MA), United States of America

  • Jason P. Block,

    Affiliation Department of Population Medicine, Harvard Medical School/Harvard Pilgrim Health Care Institute, Boston (MA), United States of America

  • Renée Walker,

    Affiliation Zilber School of Public Health, University of Wisconsin, Milwaukee (WI), United States of America

  • Benoît Lalonde,

    Affiliation Evaluation Platform on Obesity Prevention, Quebec Heart and Lung Institute Research Centre, Quebec City (QC), Canada

  • Yan Kestens,

    Affiliations Social and Preventive Medicine Department, Université de Montréal, Montréal (QC), Canada, Research Centre of Centre hospitalier de l’Université de Montréal, Montréal (QC), Canada

  • S. V. Subramanian

    Affiliation Department of Social and Behavioral Sciences, Harvard School of Public Health, Boston (MA), United States of America

Abstract

This paper reviews studies of the validity of commercially available business (CAB) data on food establishments (“the foodscape”), offering a meta-analysis of characteristics associated with CAB quality and a case study evaluating the performance of commonly-used validity indicators describing the foodscape. Existing validation studies report a broad range in CAB data quality, although most studies conclude that CAB quality is “moderate” to “substantial”. We conclude that current studies may underestimate the quality of CAB data. We recommend that future validation studies use density-adjusted and exposure measures to offer a more meaningful characterization of the relationship of data error with spatial exposure.

Introduction

The influence of local food environments on dietary behaviors has generated much interest among researchers and policymakers concerned about lifestyle, obesity, and other chronic health conditions [1–6]. However, associations between measures of exposure to food establishments (e.g. access or availability) and health or health-related behaviours are mixed [7–11]. While some researchers have found positive associations between measures of food establishment exposure and health outcomes [12–15], several studies report negative associations [16, 17]. Errors in the information used to identify food establishments may contribute to the disparate nature of existing results [18].

Researchers seeking area-based measures of exposure to food establishments, commonly referred to as the “food environment” [10] or the “foodscape” [19, 20], often rely on commercially available business (CAB) data. CAB data are often more readily available than governmental resources (e.g. food establishment inspection or licensing records) and require less time to obtain than field observations, leading to their widespread use [12, 18, 21, 22]. Assessments of the validity of such data sources have increasingly been recognized as an important component of public health studies examining the food environment [23, 24]. These assessments compare CAB data sources characterizing retail food environments with a “gold standard” such as ground truthing–the systematic observation of the study area–or an official government listing (e.g. food safety inspection records).

A recent review of such studies found wide variability in CAB data validity estimates, ultimately recommending that researchers rely on primary data collection whenever possible [25]. Although this solution may be ideal, collecting primary data is often time consuming and expensive, so CAB data are likely to remain in use in research, accompanied by validity measures. However, global measures of validity are sensitive to differences in the total count of food stores, whereas such absolute differences may not significantly affect the relative measures of food environment exposure (such as the number of stores per capita, or other density-adjusted measurements) that researchers ultimately use to explore associations between the foodscape and health [26, 27]. Furthermore, the traditional validation measures are not necessarily comparable across studies, since they are sensitive to sample size and to the dispersion of the measurement distribution [28]. These flaws arise because the validity measures evaluated, which were drawn from the field of epidemiology, were not designed to evaluate spatial exposure data [29, 30]. Counts within a geographic area are known to be driven by the underlying urban density, and are not necessarily a relevant proxy for a specific exposure in an epidemiologic study [31]. For example, where a great number of fast-food restaurants is found, a great number of other services, such as banks or pet shops, will also be found [32]. Drawing on approaches from geography, we argue that per capita measures are potentially more useful for estimating exposure because they show researchers how errors in the CAB data affect measures of exposure to food outlets, and thus offer more insight into the likely effect on researchers’ ultimate outcome of interest: the association of the food environment and diet-related health [31].

The aim of this study was to characterize and interpret existing estimates of the validity of CAB data sets for foodscape research. The methodology includes three components: 1) a systematic review of studies assessing the validity of CAB food establishment data sources in public health and social epidemiology research, 2) a meta-analysis of the results obtained from these studies, and 3) a case study comparing the interpretation of validation measures with the correlation of density-adjusted food environment exposure between a CAB data source and a gold standard. Components (1) and (2) offer researchers a general estimation of the magnitude and type of error commonly observed in CAB data, while component (3) examines the effects error may have on research outcomes.

Materials and methods

Systematic review

The review focused on studies that investigated the validity of CAB food environment data sources. It was performed using PubMed, which primarily gives access to the MEDLINE database of scientific references related to public health and social epidemiology. We created a two-step search procedure: we built two independent search “blocks” and then identified the manuscripts that were present in both blocks. Although the first block contained some “food outlet” terms, it used search terms related to the types or descriptions of data sources that could be used to identify food outlets, including “commercial database”, “ground truthing”, “secondary commercial data”, and variations of these terms combined with the “OR” operator. The second block included terms describing food establishments: “food supply”, “food stores”, “foodscape” OR “eating places”. The detailed search strategy, which presents all keywords used for each block, is available in the supporting information document (S1 File- Search strategy).

The review was limited to primary studies published in English between January 1st 2006 and June 30th 2015, a period of roughly a decade during which considerable progress was made in GIS-based investigations [33]. Titles and abstracts were then examined by two researchers (BL, AL) to identify all studies that compared a CAB data source to a gold standard, such as primary data collection (e.g. ground truthing) or government lists (food establishment inspections or licensing records). For titles and abstracts that did not allow these criteria to be assessed, two researchers examined the entire article (BL, MD) and two researchers checked the final selection (MD, AL). The search procedure was summarized in a flow chart (Fig 1). Examples of manuscripts that did not meet our inclusion criteria are listed in the supporting information document (S2 File- Examples not included).

All included studies reported epidemiologic validation measures to quantify error in the CAB datasets. These measures were typically constructed from the number of true positives, false positives, and false negatives (see Table 1). Authors used these measures to calculate sensitivity (the proportion of establishments in the gold standard also found in the CAB data source), positive predictive value (the proportion of establishments in the CAB data source also found in the gold standard) and concordance (the proportion of all establishments identified in the gold standard or CAB that are in both data sources, including true positives, false positives, and false negatives). Because most studies reported validation measurements across a variety of store types or between multiple CABs, we calculated the median and interquartile range of the measures reported in each study. We also examined whether these studies reported evidence of systematic bias according to the most commonly reported contextual measurements: neighbourhood socioeconomic status, population density, and neighbourhood racial composition. Each paper measured significance differently. As a result, we also relied on author interpretations to evaluate the results; details of author interpretations can be found in the supplementary documentation (S3 File- Author interpretations).
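The three measures defined above can be sketched in a few lines of Python. This is a minimal illustration, not code from any reviewed study; the function name `validity_measures` and the example counts are hypothetical.

```python
def validity_measures(tp: int, fp: int, fn: int) -> dict:
    """Return sensitivity, PPV and concordance from match counts.

    tp: outlets listed in both the CAB data and the gold standard
    fp: outlets listed only in the CAB data
    fn: outlets listed only in the gold standard
    """
    if tp + fn == 0 or tp + fp == 0:
        raise ValueError("each data source must list at least one outlet")
    return {
        "sensitivity": tp / (tp + fn),       # share of gold-standard outlets found in the CAB
        "ppv": tp / (tp + fp),               # share of CAB outlets found in the gold standard
        "concordance": tp / (tp + fp + fn),  # share of all listed outlets found in both
    }


# Hypothetical counts, chosen only for illustration.
scores = validity_measures(tp=60, fp=20, fn=40)
print(scores)
```

Note that none of the three measures uses true negatives, which are undefined here because there is no list of locations where an outlet is correctly absent.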

Table 1. Validity score measurements of a CAB dataset using a gold standard.

https://doi.org/10.1371/journal.pone.0174417.t001

Meta-analysis of CAB validity measures

The second component of this study, a meta-analysis of validity results, aimed to assess whether the use of classification schemes, characteristics of the CAB data source, or the sample size examined in the study were associated with error rates. To construct the meta-analysis, we followed several steps. First, one researcher (MD) extracted the concordance, positive predictive value (PPV), and sensitivity values across stores and CAB types from each reviewed study (S4 File- Meta-analysis dataset). For example, a study that validated both Dun & Bradstreet and InfoUSA data with ground-truthed food outlet locations for supermarkets, grocery stores, and fast food restaurants would have six entries for each of concordance, PPV, and sensitivity (one for each combination of CAB and type of food establishment). Hereafter, we refer to these different types of food establishments as CAB subsamples and the multiple entries per subsample as measures on CAB subsamples.

Boxplots then compared the distributions of sensitivity, PPV and concordance estimates across aggregated samples of all food outlets and across the subsamples, to evaluate whether detailed store type classifications led researchers to report lower validity scores.

Next, we examined the associations between CAB characteristics and levels of validity. Studies commonly reported the geographic region for which the CAB was obtained as well as the CAB name. We used these data to construct scatterplots comparing subsample validity estimates with the sample size (defined below), stratified by country. Boxplots additionally compared the distributions of validity estimates for the most commonly examined CABs (InfoUSA and Dun & Bradstreet).

Finally, we examined the association of sample size and validity. We estimated the correlation of validity measurements of each CAB subsample with its sample size using Spearman’s rank correlation coefficient. Sample size was calculated as the number of food outlets of the type under examination that exist in the CAB, whenever available, or as the total unique outlets examined in either CAB or gold standard when CAB numbers alone were not reported.

Case study comparing validity scores and correlation of per capita exposure

This case study analysis used data from Boston (Massachusetts, USA) to assess the relationship between commonly used validity measures and food outlet exposure per capita at the neighbourhood level, the type of measurement ultimately of interest in health and place research. InfoUSA food outlet data for 2009 (obtained through ESRI Business Analyst) were compared against the 2009 food store database maintained by the city of Boston’s Inspectional Services Department (ISD); the former dataset served as the CAB data, while the latter, a comprehensive, well-maintained and validated government data source, was treated as the gold standard. We considered the ISD data to be the gold standard because the city of Boston is required by law to license all food establishments and to conduct annual food safety inspections [34]. Food safety inspectors visit these fixed locations, and food establishments are required to obtain a permit to operate. Therefore, there is regular “ground truthing” by government officials. Some establishments could nevertheless be missed if they did not obtain proper permits, or if they were mobile installations.

The InfoUSA data set included all business establishments located within 500m buffers of the study’s selected census tracts; North American Industry Classification System (NAICS) codes were used to identify and classify establishments selling food or beverages (n = 7465). Each store classification was reviewed and category assignments were revised according to keywords as well as researchers’ local area knowledge. In the ISD data, each entry was reviewed individually to remove duplicates and non-commercial entities (e.g. children’s feeding programs), and was categorized according to the NAICS codes definition. All establishments (n = 1581) except those without identifiable civic addresses (n = 40) were geocoded with ArcGIS 10.0; the coordinates for addresses that could not be geocoded (n = 4) were obtained from Google Maps and validated in the field.

The clean data sets were merged in ArcGIS 10.0 according to spatial location. Each unique food establishment was examined to determine the number of stores found only in the ISD data (false negatives), those found only in the InfoUSA data (false positives), and those found in both datasets (true positives). These counts were assessed across all food establishments, regardless of classification, as well as across each of four food outlet types: full-service restaurants, fast-food restaurants, caterers and grocery/convenience stores. We considered a listing to be in both data sources (a true positive) if an outlet with the same name was observed on the same street and very close by (within +/- 200 m) in both data sources. Sensitivity, PPV and concordance between the two data sets were calculated following the formulas in Table 1.
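The matching rule described above (same name, same street, within 200 m) could be expressed roughly as follows. This is a sketch under stated assumptions and not the ArcGIS workflow used in the case study: the `Outlet` record, the case-insensitive exact comparison of names and streets, the haversine distance, and the greedy one-to-one matching are all illustrative choices.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Outlet:  # hypothetical record layout
    name: str
    street: str
    lat: float
    lon: float


def distance_m(a: Outlet, b: Outlet) -> float:
    """Great-circle (haversine) distance in metres."""
    r = 6_371_000.0  # mean Earth radius in metres
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(h))


def same_outlet(a: Outlet, b: Outlet, max_m: float = 200.0) -> bool:
    """Same name, same street, and no more than max_m metres apart."""
    return (a.name.strip().lower() == b.name.strip().lower()
            and a.street.strip().lower() == b.street.strip().lower()
            and distance_m(a, b) <= max_m)


def classify(cab: list, gold: list) -> tuple:
    """Return (tp, fp, fn), matching each gold-standard outlet at most once."""
    unmatched_gold = list(gold)
    tp = 0
    for c in cab:
        match = next((g for g in unmatched_gold if same_outlet(c, g)), None)
        if match is not None:
            unmatched_gold.remove(match)
            tp += 1
    return tp, len(cab) - tp, len(unmatched_gold)


# Invented records for illustration only.
cab = [Outlet("Joe's Pizza", "Main St", 42.3601, -71.0589),
       Outlet("Corner Mart", "Elm St", 42.3610, -71.0600)]
gold = [Outlet("joe's pizza", "Main St", 42.3605, -71.0589),
        Outlet("Green Grocer", "Oak St", 42.3620, -71.0610)]
print(classify(cab, gold))
```

In practice, names rarely match exactly across sources, so a real workflow would likely need fuzzy name comparison or manual review, as the case study's record-by-record examination suggests.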

In addition to the validity statistics describing the entire area, we computed the correlation between the per capita food environment exposure estimated by both data sets. We used Spearman’s rank correlation coefficient to account for the non-normal distribution of the data. For each Boston neighbourhood (n = 27), the number of stores per capita, based on the estimated 2009 population in census tracts, was calculated for both the InfoUSA data and the ISD data; the correlation between the per capita exposure across neighbourhoods was then calculated for all food establishments as well as for full-service restaurants, fast food restaurants, caterers, and grocery/convenience stores. In addition to showing the level of consistency between both datasets, we compared the validation measurements (concordance, PPV and sensitivity) with the correlations of the per capita exposure estimates to reveal whether these different validation indicators provided diverging assessments of CAB data quality.
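The per capita comparison can be sketched as follows. The neighbourhood counts and populations are invented for illustration, and the helper names (`per_1000`, `ranks`, `spearman`) are hypothetical; Spearman’s coefficient is computed here as the Pearson correlation of average ranks, which is the standard definition when ties are present.

```python
def per_1000(counts, population):
    """Outlets per 1,000 residents for each neighbourhood."""
    return [1000 * c / p for c, p in zip(counts, population)]


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data for five neighbourhoods (not the Boston data).
population = [12000, 8000, 20000, 5000, 15000]
cab_counts = [30, 12, 80, 5, 45]
gold_counts = [24, 10, 60, 6, 30]

rho = spearman(per_1000(cab_counts, population),
               per_1000(gold_counts, population))
print(round(rho, 3))
```

In this invented example the CAB counts differ from the gold-standard counts in every neighbourhood, yet the ordering of neighbourhoods by density is almost the same in both sources, which is the property the rank correlation captures.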

Results

Systematic review

The systematic search strategy produced 20 manuscripts that validated at least one CAB data source in comparison with a gold standard (Table 2). Twelve studies were conducted in the United States [9, 35–45], four were conducted in the United Kingdom [19, 46–48], two were conducted in Canada [49, 50] and two were conducted in Denmark [51, 52]. Eight of these studies reported concordance between a gold standard and CAB data sources, 15 studies computed positive predictive values (PPV), and 16 computed the sensitivities; five studies reported all three validation indices. The median reported PPV (across all store type subsamples) was 77% (IQR = 30%), sensitivity 60% (IQR = 37%), and concordance 71% (IQR = 57%) across all studies.

Table 2. Summary information for 20 foodscape validation manuscripts, 2006–2015.

https://doi.org/10.1371/journal.pone.0174417.t002

Thirteen studies examined the relationship of CAB data source validity scores and neighbourhood characteristics. Seven of the nine studies that examined neighbourhood socioeconomic status and three of the five studies that examined race concluded that there were no significant differences in CAB validity across neighbourhoods. In contrast, four of the seven studies that examined population density did find evidence of systematic differences in validity according to commercial or population density. It should be noted, however, that many of these studies tested several associations across different subsets of the CAB data without correcting for multiple testing, and thus the results may be subject to an inflated type 1 error rate [53].

Meta-analysis of CAB validity measures

A total of 540 measures on subsamples were extracted from the 20 studies under review. Sixteen studies reported sensitivity (n = 235), 15 studies reported PPV (n = 163), and 8 studies reported concordance.

When aggregate samples were examined (Fig 2), studies reported slightly higher median PPV (79%) and sensitivity (66%) than the medians reported for examinations of subsamples (median PPV = 76%; sensitivity = 59%). Median concordance was higher for the subsamples (74%) than for the aggregate samples (53%); however, subsample estimates had much higher variability (Fig 2B) than did aggregate sample estimates (Fig 2A).

Fig 2. Median and interquartile range of validation measures across reviewed studies.

A) Validity measures reported for the aggregate store sample examined in each study (PPV n = 20; Sens n = 28; and Conc. n = 12). B) Validity measures reported for subsamples, defined as subsets of outlets examined according to outlet type (PPV n = 136; Sens n = 200; and Conc. n = 130). In each boxplot, the dark line indicates the overall median; in panel B, the dark line is the median of the medians reported in studies. The lower and upper hinges of the box are the first and third quartiles, and the whiskers extend to approximately 1.5 times the interquartile range.

https://doi.org/10.1371/journal.pone.0174417.g002

Between-country results (Fig 3) showed a greater range in PPV, sensitivity and concordance estimates in the United States as compared with other countries. Studies report comparable median validity scores in Canada (Sensitivity = 59%; PPV = 71%; Concordance = 71%) and the United Kingdom (Sensitivity = 61%; PPV = 81%; Concordance = 50%) to the medians reported for the US (Sensitivity = 59%; PPV = 75%; Concordance = 74%), although results from Denmark are consistently higher (Sensitivity = 82%; PPV = 94%; Concordance = 78%). However, there is a much smaller range of validity estimates obtained from the studies that have been conducted in Canada (Sensitivity IQR = 17%; PPV IQR = 25%; Concordance IQR = 6%), Denmark (Sensitivity IQR = 10%; PPV IQR = 5%; Concordance IQR = 9%), or the UK (Sensitivity IQR = 30%; PPV IQR = 8%; Concordance IQR = 16%) in contrast with the United States (Sensitivity IQR = 41%; PPV IQR = 35%; Concordance IQR = 61%)—though these differences in variability may be a product of the smaller numbers of studies or the smaller sample sizes for studies conducted outside of the U.S.

Fig 3. Validity estimates across countries.

Points represent the estimate obtained for each validity measure, plotted against the number of outlets listed in the CAB dataset. There is a much smaller range of validity estimates obtained from the studies that have been conducted outside of the United States. However, we can also see that fewer studies have been conducted in Canada, Denmark, and the United Kingdom than in the United States. Furthermore, the studies that have been conducted outside of the United States have smaller sample sizes than many of the U.S. studies.

https://doi.org/10.1371/journal.pone.0174417.g003

In the comparison across different sources of CAB data, median validation scores tended to be lower in studies using Dun & Bradstreet datasets. However, all sources, including government data, showed a similarly wide range of validity measurements across studies, which makes it impossible to clearly identify one data source as more valid than the others (Fig 4).

Fig 4. Validity estimates by dataset.

Boxplots are used to compare the validity estimates from studies that assessed Dun & Bradstreet CAB, DMTI Spatial, Inc.’s Enhanced Points of Interest (POI) or UKPOI CABs, government datasets (e.g. from health registers, SNAP or WIC listings, store licenses, and tax registrations), and InfoUSA CAB.

https://doi.org/10.1371/journal.pone.0174417.g004

The number of stores listed in the CAB was positively associated with sensitivity (Spearman’s rank ρ = 0.178, p = 0.007) and inversely associated with PPV (Spearman’s rank ρ = -0.287, p < 0.001) and concordance (Spearman’s rank ρ = -0.646, p < 0.001), but these associations were in part due to subsamples in which a very small number of stores was examined. As shown in Table 3, when we re-examined the associations of validity measures and sample size while keeping only CAB subsamples with larger numbers of food stores (above 3, 10, and 30 observations), the strength, the sign and the p-value of the correlations changed substantially, suggesting that the correlations were sensitive to the presence of subsamples with small sample sizes.
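One way to see why very small subsamples produce extreme scores is to list every value a validity measure can take for a given number of outlets. The sketch below does this for sensitivity only; it is an arithmetic illustration rather than part of the reported analysis, and the function names are hypothetical.

```python
from fractions import Fraction


def possible_sensitivities(n_gold: int) -> list:
    """Every sensitivity value attainable when the gold standard lists n_gold outlets."""
    return [Fraction(tp, n_gold) for tp in range(n_gold + 1)]


def step_per_outlet(n_gold: int) -> Fraction:
    """Change in sensitivity caused by finding or missing a single outlet."""
    return Fraction(1, n_gold)


for n in (2, 3, 30):  # illustrative subsample sizes only
    first_values = ", ".join(f"{float(v):.0%}" for v in possible_sensitivities(n)[:4])
    shift = float(step_per_outlet(n))
    print(f"n = {n}: lowest attainable values {first_values}; "
          f"one outlet shifts the score by {shift:.1%}")
```

With two outlets, sensitivity can only be 0%, 50% or 100%, so a single missed listing moves the score by 50 percentage points. The same reasoning applies to PPV, with the number of CAB listings in the denominator.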

Table 3. Effect of small samples on Spearman’s rank correlation coefficient for sample size and validity measures.

https://doi.org/10.1371/journal.pone.0174417.t003

Case study comparing validity scores and correlation of per capita exposure

The mean food store density per 1000 people, estimated for the 27 neighbourhoods of Boston, varied between the InfoUSA and the ISD datasets (Table 4). Both datasets had a very high standard deviation, which limited our ability to demonstrate significant differences either for all food stores or for each food store type. The validity estimates obtained for the 2009 Boston foodscape (Table 5) were comparable to those observed across the studies surveyed (Table 2). For all food stores, InfoUSA had a sensitivity of 68%, a PPV of 51%, and a concordance of 41%. According to the Landis scale (<0.00 poor, 0.00–0.20 slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, and 0.81–1.00 almost perfect reliability) [30], which was used to interpret validity scores in a CAB-related literature review [25], the dataset’s sensitivity would qualify as substantially reliable, while its PPV and concordance would be considered moderately reliable.
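As a consistency check on these three figures, the definitions given in the Materials and methods imply that concordance is fully determined by sensitivity and PPV. Writing Se, PPV and C for the three measures (symbols introduced here for convenience, not taken from the original tables):

```latex
\frac{1}{C} = \frac{TP + FP + FN}{TP}
            = \frac{TP + FN}{TP} + \frac{TP + FP}{TP} - 1
            = \frac{1}{Se} + \frac{1}{PPV} - 1
\qquad\Longrightarrow\qquad
C = \left( \frac{1}{0.68} + \frac{1}{0.51} - 1 \right)^{-1} \approx 0.41
```

The result agrees with the reported concordance of 41%. It also shows that concordance can never exceed the smaller of sensitivity and PPV.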

Table 4. Descriptive statistics of food store exposure per 1,000 residents in 27 neighbourhoods of Boston, MA.

https://doi.org/10.1371/journal.pone.0174417.t004

Table 5. InfoUSA’s validity scores and correlation of per capita food store exposure as compared to Boston ISD, 2009.

https://doi.org/10.1371/journal.pone.0174417.t005

In contrast with the validity estimates, the relative food store exposure by neighbourhood—calculated as the number of stores available per capita in each neighbourhood—was similar between the two datasets (Table 5). The correlation of exposure to food stores per 1000 people between the gold standard (ISD) and the CAB (InfoUSA) was 86.9%. For each food store category, the correlations were 99.6% for full-service restaurants, 96.8% for fast-food restaurants, 83.5% for grocery and convenience stores, and 76.9% for caterers. All correlations were significant at the 1% level.

Discussion

Public health authorities and researchers are increasingly seeking to estimate the association of the food environment with health outcomes or diet, but the quality of food environment data poses a significant challenge. The main purpose of this research was to analyse studies assessing the validity of commercially available business (CAB) data sources for food establishments in order to characterize and interpret the validation indicators commonly used in health and place studies. This study consists of three main components: 1) a description of CAB performance across studies, 2) a meta-analysis of the associations of data errors with area characteristics, and 3) a critique of the interpretation of validity measures through an alternative method of validating geographic data.

Between-study CAB performance

The quality of CAB food outlet databases has been the subject of at least twenty studies to date. The reviewed studies used the epidemiological validation measures of sensitivity, positive predictive value and concordance to assess data quality. The resulting measures showed high variability, but the majority of sensitivity and PPV results fell between 40% and 85%. Under the interpretations of the Landis scale, these results correspond to moderate to substantial reliability. However, the Landis scale was originally designed to evaluate Kappa statistics, which are slightly different from the validity measures surveyed in this study (Munoz and Bangdiwala 1997). The Kappa statistic is a measure of precision between raters that compares the observed agreement between two sources with the agreement that would occur by chance; in contrast, sensitivity, PPV and concordance are not adjusted for chance agreement (Viera and Garrett 2005), and thus their levels deserve a stricter interpretation. Furthermore, as Landis and Koch noted, the scale’s thresholds were not supported by empirical investigations, but rather provided a useful benchmark for discussion (Landis and Koch 1977). It is also important to mention that several CAB validation studies directly referred to an interpretation scale proposed by Paquet [54] to analyze the concordance of their observations, which in turn referred to Janse [55]. The latter is actually a meta-analysis of patient-doctor agreement on quality of life, and provides no justification for how to interpret the degree of agreement. Analysing the concordance between CAB databases is a very different research context, and that scale may not be directly transferable. The validity of a CAB would be better evaluated in terms of the error’s likely effect on study outcomes. For example, if 20% of fast food outlets are incorrectly classified in the CAB, will associations of fast food outlet exposure and diet-related health be compromised? Not necessarily, because one type of food outlet may be replaced by a similar type of establishment. In this situation, the validity measure will go down, while the exposure to food outlets of a similar type will stay about the same. As only one study has examined the effect of dataset error on measurements of the food environment [56] and no study, to our knowledge, has examined the effect of dataset error on study outcomes, this question remains unanswered. Future research could address this gap through methods similar to those presented in this paper’s case study (i.e. field research that, in addition to calculating validity scores, also examines the correlations between food environment exposure measures constructed from secondary and from gold-standard data) or through simulation studies that estimate the potential effects of various levels of error on measures of food environment exposure.

This study found a statistically significant relationship between sample size and validity measures. However, the association of sensitivity and CAB sample size reversed direction when subsamples with very few listings—and thus with extreme values—were excluded. Although the associations of PPV and concordance were negative and statistically significant both for all subsamples and for subsamples with large n, excluding subsamples with few listings led to a large decrease in the magnitude of the association. As a result, comparing validity statistics between studies with large differences in the number of observations appears highly questionable, and we recommend that researchers use caution when interpreting data disaggregated into very small subcategories.

This study did not find evidence of noteworthy differences in quality across different CABs. This finding does not support those of a recent review, which reported high levels of agreement for InfoUSA and government data in comparison with other secondary data sources [25]. Although we also observed that these two sources had slightly higher median validity measurements, there is strong variability around the median values, preventing a clear conclusion regarding each CAB’s relative reliability.

Comparability between countries is also limited. There is some evidence that studies conducted in Denmark, Canada, and the United Kingdom obtained higher validity measures than those conducted in the United States, but studies in the former three countries have been much fewer in number and used smaller samples than many of the studies conducted in the U.S.

Associations with area characteristics

This meta-analysis did not reveal evidence of a systematic relationship between CAB error and neighbourhood characteristics such as socioeconomic status or neighbourhood racial composition. Of the nine studies that disaggregated measures by neighbourhood socioeconomic status, only two reported a relationship with validity measures, and three of the five studies that examined racial demographics found no significant association with CAB data validity. These results align with the measures reported in a recent, similar review [25].

Among studies using CAB data in areas with variability in commercial or population density, four out of seven studies found that validity measures differed significantly between areas with high versus low densities. This result is possibly linked to the number of food stores under investigation: as demonstrated above, the smallest samples (n<3) tended to produce extreme validity scores. This finding suggests that validity scores are highly sensitive to very small sample sizes and thus may offer limited insight for studies conducted in rural areas or studies that disaggregate outlet data into many food outlet categories.

Comparison of validity indicators with a measure of exposure

This paper used a case study from Boston (Massachusetts, USA) to compare the validity measures with a more common characterization of spatial exposure data, correlation of per capita exposure. While the three validity scores identified many errors in the CAB data, the per capita exposure to the foodscape was highly correlated between the CAB and gold standard data sources. The validity measures, originally developed to evaluate the quality of diagnostic tests, may not be suited to the measurement of spatial exposure data. The calculation of true positives, false positives, and false negatives requires that the outlet characteristics in the CAB data be nearly identical to those in the gold standard dataset. Many studies did consider listings with slight errors (e.g. incorrect names but correct classifications) as true positives, but minor errors in address or classification would have been listed as false positives, while their corresponding “real-world” outlet would be considered a false negative. Small errors can thus lead to large differences in validity measures despite a high level of similarity between per capita exposure to CAB food outlets and to gold standard food outlets.
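A hypothetical worked example makes this concrete; the counts are invented for illustration and are not drawn from any reviewed study. Suppose the gold standard for a neighbourhood lists 100 outlets and the CAB also lists 100, of which 20 carry an address or classification error large enough to prevent a match. Then TP = 80, FP = 20 and FN = 20:

```latex
\text{sensitivity} = \frac{80}{80 + 20} = 0.80, \qquad
\text{PPV} = \frac{80}{80 + 20} = 0.80, \qquad
\text{concordance} = \frac{80}{80 + 20 + 20} \approx 0.67
```

Both sources nonetheless count 100 outlets, so the number of outlets per capita in that neighbourhood is identical in the two datasets.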

Strengths and limitations

This study is, to our knowledge, the first to compare estimates of food environment dataset validity across countries; our assessment of the association between validity scores and sample sizes also offers researchers insight on the effects of detailed store classification schemes. However, this study did not test for associations between study characteristics (e.g. funding sources or research design) and CAB validity scores. The high variance observed in estimates of sensitivity, specificity, and positive predictive value thus may reflect differences in the quality of the studies examined rather than true differences in dataset quality. It should also be noted that this review relied only on data extracted from published studies. We did not pursue unpublished data; thus the results may be affected by publication bias.

Although exposure measurements would allow a better assessment of the food environment, they also have limitations. The computation of a relative indicator, such as per-capita measures, is clearly pertinent for between-area or between-study comparison analyses, but it is dependent on the geography on which it is computed (e.g. the size and the borders of a neighbourhood) [57]. Also, correlation may not be the best validation tool when the objective is to construct measures of access to food sources (e.g. measuring the closest fast-food restaurant from home, or the mean distance to the three closest convenience stores) for which the precision of the geographic information is particularly important.
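The importance of positional precision for access measures can be sketched as follows. The example assumes planar coordinates in metres and straight-line distances, and the home location, outlet coordinates, and function name are hypothetical.

```python
from math import hypot


def mean_distance_to_k_nearest(home, outlets, k=3):
    """Mean straight-line distance from home to its k closest outlets."""
    distances = sorted(hypot(x - home[0], y - home[1]) for x, y in outlets)
    if len(distances) < k:
        raise ValueError("fewer outlets than k")
    return sum(distances[:k]) / k


home = (0.0, 0.0)
# Hypothetical gold standard positions (metres from home).
gold_outlets = [(300.0, 0.0), (0.0, 400.0), (300.0, 400.0), (1000.0, 0.0)]
# Same four outlets, but the first is geocoded 200 m from its true position.
cab_outlets = [(500.0, 0.0), (0.0, 400.0), (300.0, 400.0), (1000.0, 0.0)]

print(mean_distance_to_k_nearest(home, gold_outlets))  # 400.0
print(mean_distance_to_k_nearest(home, cab_outlets))   # about 466.7
```

Both sources list the same number of outlets, so a count-based or per capita measure for the area would be unchanged, yet the distance-based access measure shifts by roughly 17% because of a single geocoding error.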

Conclusions

All studies inspected here examined global error in preliminary food environments data. Further research is needed to understand how error affects the food environment measurements that are ultimately used in health and place research, but this work can offer guidance for future validation studies.

Although the majority of CAB data sources have moderate to substantial reliability according to the Landis scale, this scale may not provide adequate guidance to evaluate CAB validity. No guidelines currently exist to interpret validity measures specifically for geocoded built environment databases, and their interpretation requires caution. We thus suggest that the analysis of validity measures should be accompanied by a relative measure of exposure. Researchers should further be cautious in disaggregating data by outlet classification and geography, as the use of data subsets with very small sample sizes can lead to the proliferation of extreme results. The results of the case study in Boston brought new insight on this aspect, suggesting that existing validation studies may underestimate the quality of CAB data sources for food environments research. Although validity measures indicated substantial errors between the CAB and the gold standard, when adjusted for neighbourhood population density (i.e. per capita exposure to the foodscape), a relatively high correlation was found between both datasets. Future studies should include measures that better evaluate the effect of error on spatial exposure, such as correlation or “representativity” [50], to offer a more meaningful characterization of CAB data quality when the aim is to estimate exposure to the food environment.

While the high correlation found in this study between measures of per capita exposure obtained from CAB and gold standard data sets will be reassuring to researchers, the results are less promising for practitioners. A policymaker who prohibits fast food restaurants from locating within a set distance of schools, for example, will need exact data on outlet locations; the lower levels of validity observed in our systematic review suggest that policies requiring exact information on store locations will need to be accompanied by improved data collection mechanisms.

Although all CAB datasets include error, the systematic underestimation of CAB data validity may be leading researchers to conduct time- and cost-intensive primary data collection efforts that ultimately lead to little improvement in the research quality. Such primary data collection may be necessary in the case of a study area with high variability in population density, but food environment validation research does not offer evidence of systematic error in relation to race or socioeconomic deprivation. Further research should be conducted to develop validity measurements adapted for geographic data and to quantify the effect of data set error on measures of exposure.

Acknowledgments

This work would not have been possible without the help of Rebecca Joyce, who did rigorous work on geocoding and validating the Boston foodscape, as well as Frédérick Bergeron from the Laval University library, who provided guidance for the literature review strategy.

Author Contributions

  1. Conceptualization: AL MD JB YK SVS.
  2. Data curation: AL MD RW BL.
  3. Formal analysis: AL MD RW BL.
  4. Funding acquisition: AL MD JB SVS.
  5. Investigation: AL MD RW BL.
  6. Methodology: AL MD BL YK SVS.
  7. Project administration: AL MD JB YK SVS.
  8. Resources: AL MD JB RW.
  9. Software: AL MD.
  10. Supervision: AL JB YK SVS.
  11. Validation: MD BL.
  12. Visualization: AL MD BL.
  13. Writing – original draft: AL MD BL.
  14. Writing – review & editing: AL MD JB RW BL YK SVS.

References

  1. McLaren L. Socioeconomic Status and Obesity. Epidemiologic Reviews. 2007;29(1):29–48.
  2. Han E, Powell LM, Zenk SN, Rimkus L, Ohri-Vachaspati P, Chaloupka FJ. Classification bias in commercial business lists for retail food stores in the US. International Journal of Behavioral Nutrition and Physical Activity. 2012;9(1):46.
  3. Fielding JE, Simon PA. Food deserts or food swamps. Arch Intern Med. 2011;171(13):1171–2. pmid:21747012
  4. Karpyn A, Manon M, Treuhaft S, Giang T, Harries C, McCoubrey K. Policy solutions to the 'grocery gap'. Health Aff. 2010;29(3):473–80.
  5. Sturm R, Cohen DA. Zoning for health? The year-old ban on new fast-food restaurants in South LA. Health Aff. 2009;28(6):w1088–w97.
  6. White M. Food access and obesity. Obesity reviews. 2007;8(s1):99–107.
  7. Chaparro MP, Whaley SE, Crespi CM, Koleilat M, Nobari TZ, Seto E, et al. Influences of the neighbourhood food environment on adiposity of low-income preschool-aged children in Los Angeles County: a longitudinal study. Journal of epidemiology and community health. 2014;68(11):1027–33. pmid:25012991
  8. Block JP, Christakis NA, O’Malley AJ, Subramanian SV. Proximity to Food Establishments and Body Mass Index in the Framingham Heart Study Offspring Cohort Over 30 Years. American Journal of Epidemiology. 2011;174(10):1108–14. pmid:21965186
  9. Powell LM, Han E, Zenk SN, Khan T, Quinn CM, Gibbs KP, et al. Field validation of secondary commercial data sources on the retail food outlet environment in the U.S. Health & Place. 2011;17(5):1122–31.
  10. Thornton LE, Pearce JR, Kavanagh AM. Using Geographic Information Systems (GIS) to assess the role of the built environment in influencing obesity: a glossary. International Journal of Behavioral Nutrition and Physical Activity. 2011;8(1):71.
  11. Fleischhacker SE, Evenson KR, Rodriguez DA, Ammerman AS. A systematic review of fast food access studies. Obesity Reviews. 2011;12(5):e460–e71. pmid:20149118
  12. Currie J, DellaVigna S, Moretti E, Pathania V. The effect of fast food restaurants on obesity: National Bureau of Economic Research Cambridge, MA; 2009.
  13. Inagami S, Cohen DA, Brown AF, Asch SM. Body mass index, neighborhood fast food and restaurant concentration, and car ownership. Journal of Urban Health. 2009;86(5):683–95. pmid:19533365
  14. Moore LV, Roux AVD, Nettleton JA, Jacobs DR, Franco M. Fast-Food Consumption, Diet Quality, and Neighborhood Exposure to Fast Food The Multi-Ethnic Study of Atherosclerosis. American journal of epidemiology. 2009;170(1):29–36. pmid:19429879
  15. Thornton LE, Bentley RJ, Kavanagh AM. Fast food purchasing and access to fast food restaurants: a multilevel analysis of VicLANES. International journal of behavioral nutrition and physical activity. 2009;6(1):28.
  16. Frank LD, Sallis JF, Conway TL, Chapman JE, Saelens BE, Bachman W. Many pathways from land use to health: associations between neighborhood walkability and active transportation, body mass index, and air quality. Journal of the American Planning Association. 2006;72(1):75–87.
  17. Pearce J, Hiscock R, Blakely T, Witten K. A national study of the association between neighbourhood access to fast-food outlets and the diet and weight of local residents. Health & place. 2009;15(1):193–7.
  18. Morland KB. Local Food Environments: Food Access in America: Crc Press; 2014.
  19. Lake AA, Burgoine T, Stamp E, Grieve R. The foodscape: classification and field validation of secondary data sources across urban/rural and socio-economic classifications in England. International Journal of Behavioral Nutrition and Physical Activity. 2012;9(1):37.
  20. Macintyre S, Ellaway A, Cummins S. Place effects on health: how can we conceptualise, operationalise and measure them? Social science & medicine. 2002;55(1):125–39.
  21. Daniel M, Paquet C, Auger N, Zang G, Kestens Y. Association of fast-food restaurant and fruit and vegetable store densities with cardiovascular mortality in a metropolitan population. European journal of epidemiology. 2010;25(10):711–9. pmid:20821254
  22. An R, Sturm R. School and Residential Neighborhood Food Environment and Diet Among California Youth. American Journal of Preventive Medicine. 2012;42(2):129–35. pmid:22261208
  23. Lytle LA. Measuring the food environment: state of the science. American journal of preventive medicine. 2009;36(4):S134–S44.
  24. Forsyth A, Lytle L, Riper DV. Finding food: Issues and challenges in using Geographic Information Systems to measure food access. J Transp Land Use. 2010;3(1):43–65. Epub 2010/04/01. PubMed Central PMCID: PMCPMC3153443. pmid:21837264
  25. Fleischhacker SE, Evenson KR, Sharkey J, Pitts SBJ, Rodriguez DA. Validity of secondary retail food outlet data: a systematic review. American journal of preventive medicine. 2013;45(4):462–73. pmid:24050423
  26. Lebel A, Kestens Y, Pampalon R, Thériault M, Daniel M, Subramanian SV. Local context influence, activity space and foodscape exposure in two Canadian metropolitan settings: Is daily mobility exposure associated with overweight? Journal of Obesity. 2012;2012:9.
  27. Kestens Y, Lebel A, Chaix B, Clary C, Daniel M, Pampalon R, et al. Association between Activity Space Exposure to Food Establishments and Individual Risk of Overweight. Plos One. 2012;7(8):e41418. pmid:22936974
  28. Munoz SR, Bangdiwala SI. Interpretation of Kappa and B statistics measures of agreement. Journal of Applied Statistics. 1997;24(1):105–12.
  29. Fletcher RH, Fletcher SW, Fletcher GS. Clinical epidemiology: the essentials: Lippincott Williams & Wilkins; 2012.
  30. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977:159–74. pmid:843571
  31. Clary CM, Ramos Y, Shareck M, Kestens Y. Should we use absolute or relative measures when assessing foodscape exposure in relation to fruit and vegetable intake? Evidence from a wide-scale Canadian study. Preventive medicine. 2015;71:83–7. pmid:25481095
  32. Bader MDM, Schwartz-Soicher O, Jack D, Weiss CC, Richards CA, Quinn JW, et al. More neighborhood retail associated with lower obesity among New York City public high school students. Health & Place. 2013;(in press)(0).
  33. Devillers R, Stein A, Bédard Y, Chrisman N, Fisher P, Shi W. Thirty Years of Research on Spatial Data Quality: Achievements, Failures, and Opportunities. Transactions in GIS. 2010;14(4):387–400.
  34. ISD. Procedure for applying for a health permit. In: department BIs, editor. Boston; 2016.
  35. Bader MDM, Ailshire JA, Morenoff JD, House JS. Measurement of the Local Food Environment: A Comparison of Existing Data Sources. American Journal of Epidemiology. 2010;171(5):609–17. pmid:20123688
  36. Hosler AS, Dharssi A. Identifying retail food stores to evaluate the food environment. American journal of preventive medicine. 2010;39(1):41–4. pmid:20537845
  37. Fleischhacker SE, Rodriguez DA, Evenson KR, Henley A, Gizlice Z, Soto D, et al. Evidence for validity of five secondary data sources for enumerating retail food outlets in seven American Indian communities in North Carolina. Int J Behav Nutr Phys Act. 2012;9:137. Epub 2012/11/24. PubMed Central PMCID: PMCPMC3551728. pmid:23173781
  38. Gustafson AA, Lewis S, Wilson C, Pitts S. Validation of food store environment secondary data source and the role of neighborhood deprivation in Appalachia, Kentucky. BMC Public Health. 2012;12(1):688.
  39. Jilcott SB, McGuirt JT, Imai S, Evenson KR. Measuring the retail food environment in rural and urban North Carolina counties. Journal of Public Health Management and Practice. 2010;16(5):432–40. pmid:20689393
  40. Liese AD, Colabianchi N, Lamichhane AP, Barnes TL, Hibbert JD, Porter DE, et al. Validation of 3 Food Outlet Databases: Completeness and Geospatial Accuracy in Rural and Urban Food Environments. American Journal of Epidemiology. 2010;172(11):1324–33. pmid:20961970
  41. Liese AD, Barnes TL, Lamichhane AP, Hibbert JD, Colabianchi N, Lawson AB. Characterizing the food retail environment: impact of count, type, and geospatial error in 2 secondary data sources. J Nutr Educ Behav. 2013;45(5):435–42. Epub 2013/04/16. PubMed Central PMCID: PMCPMC3713101. pmid:23582231
  42. Longacre MR, Primack BA, Owens PM, Gibson L, Beauregard S, Mackenzie TA, et al. Public directory data sources do not accurately characterize the food environment in two predominantly rural States. Journal of the American Dietetic Association. 2011;111(4):577–82. pmid:21443992
  43. Rossen LM, Pollack KM, Curriero FC. Verification of retail food outlet location data from a local health department using ground-truthing and remote-sensing technology: Assessing differences by neighborhood characteristics. Health & Place. 2012;18(5):956–62.
  44. Rummo PE, Gordon-Larsen P, Albrecht SS. Field validation of food outlet databases: the Latino food environment in North Carolina, USA. Public Health Nutr. 2014:1–6. Epub 2014/06/18.
  45. Wang MC, Gonzalez AA, Ritchie LD, Winkleby MA. The neighborhood food environment: sources of historical data on retail food stores. International Journal of Behavioral Nutrition and Physical Activity. 2006;3(1):15.
  46. Cummins S, Macintyre S. Are secondary data sources on the neighbourhood food environment accurate? Case-study in Glasgow, UK. Preventive Medicine. 2009;49(6):527–8. pmid:19850072
  47. Burgoine T, Harrison F. Comparing the accuracy of two secondary food environment data sources in the UK across socio-economic and urban/rural divides. Int J Health Geogr. 2013;12:2. Epub 2013/01/19. PubMed Central PMCID: PMCPMC3566929. pmid:23327189
  48. Lake AA, Burgoine T, Greenhalgh F, Stamp E, Tyrrell R. The foodscape: Classification and field validation of secondary data sources. Health & Place. 2010;16(4):666–73.
  49. Paquet C, Daniel M, Kestens Y, Léger K, Gauvin L. Field validation of listings of food stores and commercial physical activity establishments from secondary data. International Journal of Behavioral Nutrition and Physical Activity. 2008;5:58.
  50. Clary CM, Kestens Y. Field validation of secondary data sources: a novel measure of representativity applied to a Canadian food outlet database. Int J Behav Nutr Phys Act. 2013;10:77. Epub 2013/06/21. PubMed Central PMCID: PMCPMC3710283. pmid:23782570
  51. Svastisalee CM, Holstein BE, Due P. Validation of presence of supermarkets and fast-food outlets in Copenhagen: case study comparison of multiple sources of secondary data. Public Health Nutrition. 2012;15(07):1228–31.
  52. Toft U, Erbs-Maibing P, Glumer C. Identifying fast-food restaurants using a central register as a measure of the food environment. Scand J Public Health. 2011;39(8):864–9. Epub 2011/10/05. pmid:21969329
  53. Williams VS, Jones LV, Tukey JW. Controlling error in multiple comparisons, with examples from state-to-state differences in educational achievement. Journal of Educational and Behavioral Statistics. 1999;24(1):42–69.
  54. Paquet C, Daniel M, Kestens Y, Leger K, Gauvin L. Field validation of listings of food stores and commercial physical activity establishments from secondary data. Int J Behav Nutr Phys Act. 2008;5:58. Epub 2008/11/13. PubMed Central PMCID: PMCPMC2615441. pmid:19000319
  55. Janse A, Gemke R, Uiterwaal C, Van Der Tweel I, Kimpen J, Sinnema G. Quality of life: patients and doctors don't always agree: a meta-analysis. Journal of Clinical Epidemiology. 2004;57(7):653–61. pmid:15358393
  56. Ma X, Battersby SE, Bell BA, Hibbert JD, Barnes TL, Liese AD. Variation in low food access areas due to data source inaccuracies. Appl Geogr. 2013;45. Epub 2013/12/25. PubMed Central PMCID: PMCPMC3869099.
  57. Riva M, Apparicio P, Gauvin L, Brodeur J-M. Establishing the soundness of administrative spatial units for operationalising the active living potential of residential environments: an exemplar for designing optimal zones. International Journal of Health Geographics. 2008;7(1):1.