Comparing GIS-Based Measures in Access to Mammography and Their Validity in Predicting Neighborhood Risk of Late-Stage Breast Cancer

Background: Assessing the neighborhood environment in access to mammography remains a challenge when investigating its contextual effect on breast cancer-related outcomes. Studies using different Geographic Information Systems (GIS)-based measures have reported inconsistent findings. Methods: We compared GIS-based measures (travel time, service density, and a two-step floating catchment area [2SFCA] method) of access to FDA-accredited mammography facilities in terms of their Spearman correlation, agreement (Kappa) and spatial patterns. As an indicator of predictive validity, we examined their association with the odds of late-stage breast cancer using cancer registry data. Results: The accessibility measures varied considerably in correlation, Kappa and spatial pattern. Measures using shortest travel time (or average) and service density showed low correlations, no agreement, and different spatial patterns. Both types of measures showed low correlations and little agreement with the 2SFCA measures. Of all measures, only the two measures using the 6-timezone-weighted 2SFCA method were associated with increased odds of late-stage breast cancer (quick-distance-decay: odds ratio [OR] = 1.15, 95% confidence interval [CI] = 1.01–1.32; slow-distance-decay: OR = 1.19, 95% CI = 1.03–1.37) after controlling for demographics and neighborhood socioeconomic deprivation. Conclusions: Various GIS-based measures of access to mammography facilities exist, and they differ in principle and in their association with late-stage breast cancer risk. Only the two measures using the 2SFCA method with 6-timezone weighting were associated with increased odds of late-stage breast cancer. These measures incorporate both travel barriers and service competition. Studies may observe different results depending on the measure of accessibility used.


Introduction
Breast cancer is an important public health issue and accounts for about 28% of cancer incidence and 15% of cancer mortality in the United States [1]. Screening mammography reduces the risk of breast cancer death through early detection [2]. Geographic barriers in access to healthcare could significantly affect population health. In recent years, it has become common to investigate the influence of the geographic distribution of mammography services on mammography screening use and stage at diagnosis of breast cancer. However, findings from previous reports vary to a great degree. Some studies found that barriers in spatial accessibility to mammography facilities increased the risk of non-adherence to screening and/or later stage at diagnosis of breast cancer [3,4,5,6,7,8], but other studies did not [9,10,11]. Apart from limitations and potential biases in study design and data collection, inconsistency in these findings might result from the use of varying spatial methods in assessing access to mammography. Few studies have compared differences in Geographic Information System (GIS)-based measures of accessibility.
Previous assessments of spatial accessibility to mammographic service include neighborhood availability (or service density, the number of facilities per population) [3,4,5,9] and travel distance (or travel time) to the nearest facility [10,11,12,13,14]. Service density has been frequently used and is easy to compute. Road-network-based travel distance/time is becoming a popular measure with the rapid development of available GIS techniques. However, both measures have limitations. The former ignores the interaction between population and service facilities across arbitrary neighborhood boundaries, while the latter does not account for the competition among different service facilities (demand) [15]. A gravity model overcomes both limitations by integrating travel barriers and service competition and has become an alternative approximation of spatial accessibility to mammography services. Gravity models have been used extensively by geographers, but have been underutilized by epidemiologists. Two important gravity models are the Kernel Density (KDE) method [16] and the two-step floating catchment area (2SFCA) method [17]. The major limitation of the KDE method is that it ignores travel barriers by using a straight-line Euclidean distance. The 2SFCA method uses the actual road network distance, which is much closer to real-world situations. Recently, a zonal or continuous weighting parameter was added to this method, which allowed for a distance-related decay. This resulted in an enhanced two-step floating catchment area (E2SFCA) method [18] and a Gaussian two-step floating catchment area (G2SFCA) method [19]. More recently, the 2SFCA method was further improved to overcome the influence of rural-urban differences or large irregular study areas by using varied catchment sizes [20] or by aggregating small-area 2SFCA measures to larger neighborhoods [21]. Because it is technically difficult to implement, the 2SFCA method is still underutilized in epidemiology.
In this study, we compared these three methods (nine GIS-based measures) in assessing access to mammography facilities at the block group level in the St. Louis area. In a previous study, we found that the risk of advanced breast cancer was higher in the St. Louis area than elsewhere in Missouri [22]. As an indicator of predictive validity, we also compared the associations of these nine measures with neighborhood risk of late-stage breast cancer after adjusting for demographics and neighborhood socioeconomic deprivation using cancer registry data.

Study Population
The study area includes St. Louis City and St. Louis County, Missouri, which are located in the center of the greater St. Louis Metropolitan area and cover 590 square miles, including 1124 block groups according to the 2000 Census. There are 719,737 women living in both counties, 337,966 of whom are aged 40 and above. Note that St. Louis City is its own county in Missouri. We obtained 2002-2006 primary breast cancer incidence cases from the Missouri Cancer Registry. Using a GIS, the addresses of breast cancer cases were geocoded to the corresponding Census block groups and matched to U.S. Census 2000 TIGER/Line files. Breast cancer stage was defined according to the AJCC staging system as ductal/lobular carcinoma in situ (DCIS/LCIS, stage 0) and invasive breast cancer (stages I, II, III and IV). The study outcome was dichotomized as late-stage breast cancer (stages II-IV) vs. early-stage breast cancer (stages 0-I). Age was categorized as younger than 50 years, 50-64 years, and 65 years and above. Race was grouped as non-Hispanic White, African American, and Other. After excluding 62 ungeocoded cases and 148 cases with missing stage, a total of 4205 breast cancer cases were included in the analysis. This study was approved by Washington University's Institutional Review Board.

GIS-Based Measures in Assessing Spatial Accessibility to Mammography Service
We identified the locations of all 53 U.S. Food and Drug Administration (FDA)-accredited non-mobile mammography facilities operating in the study area during 1997-2001 from the FDA. Facility addresses were geocoded to obtain latitude and longitude using ArcGIS (Version 9.3.1, ESRI Inc., Redlands, CA). Based on three GIS approaches, we calculated nine measures of accessibility: A) nearest facility: (i) shortest travel time (DST), (ii) average of the five shortest travel times (DST5); B) (iii) service density (DES); and C) Two-Step Floating Catchment Area (2SFCA) indices: (iv) unweighted index (SAU), (v) continuous-weighted index (SAC), (vi) 3-timezone-quick weighted index (SA3Q), (vii) 3-timezone-slow weighted index (SA3S), (viii) 6-timezone-quick weighted index (SA6Q), (ix) 6-timezone-slow weighted index (SA6S).
We restricted the background population to women age 40 and above since screening mammography guidelines recommend mammography use for this population [23,24].
A. Nearest facility (facilities). We calculated the shortest travel time (DST) from the population-weighted centroid of each block group to the nearest mammography facility using the ArcGIS Network Analyst extension (Version 9.3.1, ESRI Inc., Redlands, CA). We also calculated the average travel time to the five nearest facilities (DST5).
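As a minimal sketch of how DST and DST5 could be derived once network travel times are available, the Python snippet below takes the travel times (in minutes) from one block group centroid to every facility. The function name `shortest_times` and the sample values are hypothetical; the travel times in the study itself came from ArcGIS Network Analyst.

```python
def shortest_times(times_to_facilities, k=5):
    """Return (DST, DST-k) for one block group.

    times_to_facilities: travel times in minutes to every facility
    (illustrative input, not data from the study).
    """
    ordered = sorted(times_to_facilities)
    nearest = ordered[:k]                 # the k nearest facilities
    dst = ordered[0]                      # shortest travel time
    dst_k = sum(nearest) / len(nearest)   # mean of the k shortest times
    return dst, dst_k


dst, dst5 = shortest_times([12.0, 7.0, 30.0, 9.0, 15.0, 22.0, 18.0])
```

If fewer than five facilities are supplied, the sketch averages over those available, which is a design choice of this example rather than something stated in the text.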
B. Service density. We calculated the service density (DES) by dividing the total number of mammography machines at the facilities that can be reached within 30 minutes (30-minute network buffer) from each block group centroid by that block group's population of women aged 40 and above (Equation 1).
$$D_i = \frac{\sum_{j \in \{t_{ij} \le 30\ \text{min}\}} M_j}{P_i} \qquad (1)$$
where $D_i$ is the density of block group $i$; $M_j$ is the number of mammography machines at facility $j$; $P_i$ is the population of women aged 40 and above in block group $i$; and $t_{ij}$ is the travel time (zone) from census block group $i$ to mammography facility $j$, restricted to facilities that can be reached within 30 minutes from block group $i$.
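The service density described above can be sketched in a few lines of Python, assuming a precomputed travel-time matrix. The names `service_density`, `machines`, `population` and `travel_time` are hypothetical, and returning NaN for block groups without women aged 40 and above is an assumption of this example, not a rule from the study.

```python
def service_density(machines, population, travel_time, cutoff=30.0):
    """Machines reachable within `cutoff` minutes per woman aged 40+.

    machines[j]: machines at facility j
    population[i]: women aged 40+ in block group i
    travel_time[i][j]: minutes from block group i to facility j
    """
    densities = []
    for i, pop in enumerate(population):
        reachable = sum(m for j, m in enumerate(machines)
                        if travel_time[i][j] <= cutoff)
        # Assumption: an empty block group has no defined density
        densities.append(reachable / pop if pop > 0 else float("nan"))
    return densities


density = service_density(
    machines=[2, 1, 3],
    population=[500, 1000],
    travel_time=[[10, 35, 29], [40, 5, 31]],
)
```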
C. Two-Step Floating Catchment Area (2SFCA) Method. We applied the 2SFCA method to compute a spatial accessibility score for each Census block group. First, we computed the road network travel time matrix between all mammography facilities and all Census block group population-weighted centroids using the ArcGIS Network Analyst extension (Version 9.3.1, ESRI Inc., Redlands, CA). The maximum catchment range was set to a 30-minute travel time (driving) based on other accessibility studies [17,18]. Second, we calculated the mammography machine-to-population ratio (women aged 40 and above) of each mammography facility by dividing the number of machines by the weighted population of all Census block groups whose centroids fall within the 30-minute catchment area of that facility (Equation 2).
$$R_j = \frac{M_j}{\sum_{i \in \{t_{ij} \le t_0\}} P_i \, f(t_{ij})} \qquad (2)$$
where $R_j$ denotes the ratio of mammography machines to population for facility $j$; $M_j$ is the number of mammography machines at facility $j$; $P_i$ is the population of block group $i$; $f(t_{ij})$ is the weighting function; and $t_{ij}$ is the travel time (zone) from census block group $i$ to mammography facility $j$.
Third, we calculated the spatial accessibility for each Census block group (Equation 3).
$$A_i = \sum_{j \in \{t_{ij} \le t_0\}} R_j \, f(t_{ij}) \qquad (3)$$
where $A_i$ denotes the spatial accessibility of census block group $i$. We weighted the population and the machine-to-population ratio using a zonal Gaussian decay function, which has been considered an appropriate weighting function for distance decay in the zonal-weighted 2SFCA models [25] (Equation 4), and a continuous Gaussian weighting function in the continuous-weighted 2SFCA model [19] (Equation 5).
where β is the empirical parameter in the decay function and $t_0$ is the maximum catchment travel time (30 minutes in the current study). In addition, we also used the original unweighted 2SFCA model, in which $f(t_{ij}) = 1$. Because it is unclear how the decay of the travel time affects our findings, we used 3 time zones (10 minutes of travel time each) with quick-decay (1.00, 0.51 and 0.07) and slow-decay (1.00, 0.75 and 0.32) weights, and 6 time zones (5 minutes of travel time each) with quick-decay (1.00, 0.82, 0.45, 0.17, 0.04 and 0.01) and slow-decay (1.00, 0.96, 0.85, 0.70, 0.53 and 0.37) weights in the zonal-weighted 2SFCA models as part of a sensitivity analysis. We examined the locations of all mammography facilities and found a slight change in the number of mammography facilities over time. To minimize the potential effect of this change on our findings, we computed the spatial accessibility for each year and used the 5-year average as the final spatial accessibility score.
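The two steps of the zonal-weighted 2SFCA calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, zones are taken as equal-width and half-open (a travel time exactly on a boundary falls into the farther zone, except at the 30-minute limit), and the toy inputs are invented. The helper `gaussian_zone_weights` reflects our own back-calculation: the listed zone weights are reproduced to two decimals by exp(-k²/beta) with zone index k = 0, 1, 2, ... and beta = 1.5, 3.5, 5 and 25, values that are not stated in the text.

```python
import math


def gaussian_zone_weights(n_zones, beta):
    """Weights exp(-k^2 / beta) for zone index k = 0..n_zones-1 (assumed form)."""
    return [round(math.exp(-k * k / beta), 2) for k in range(n_zones)]


def zone_weight(t, weights, t0=30.0):
    """Weight for travel time t in minutes; zero beyond the catchment t0."""
    if t > t0:
        return 0.0
    width = t0 / len(weights)                      # equal-width time zones
    k = min(int(t // width), len(weights) - 1)     # t == t0 falls in the last zone
    return weights[k]


def two_step_fca(machines, population, travel_time, weights, t0=30.0):
    """Accessibility score per block group.

    machines[j]: machines at facility j
    population[i]: women aged 40+ in block group i
    travel_time[i][j]: minutes from block group i to facility j
    """
    n_i, n_j = len(population), len(machines)
    f = [[zone_weight(travel_time[i][j], weights, t0) for j in range(n_j)]
         for i in range(n_i)]
    # Step 1: machine-to-population ratio R_j for each facility
    ratios = []
    for j in range(n_j):
        demand = sum(population[i] * f[i][j] for i in range(n_i))
        ratios.append(machines[j] / demand if demand > 0 else 0.0)
    # Step 2: sum the weighted ratios reachable from each block group
    return [sum(ratios[j] * f[i][j] for j in range(n_j)) for i in range(n_i)]


slow3 = gaussian_zone_weights(3, 3.5)   # back-calculated beta, an assumption
scores = two_step_fca([2, 1], [1000, 2000], [[4, 25], [12, 8]], slow3)
```

Passing `weights=[1.0]` gives the unweighted model. A useful sanity check is that the population-weighted sum of the scores equals the total number of machines whenever every facility has some population in its catchment.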

Neighborhood Socioeconomic Deprivation
It is well known that women have lower screening mammography use in neighborhoods with more socioeconomic (SES) deprivation [26]. In this study, we regarded neighborhood socioeconomic deprivation as a potential neighborhood confounder. Following our previous study [27], we selected 21 Census variables from the 2000 U.S. Census in six domains to construct a composite Census block group-level socioeconomic deprivation index using a multivariate approach. These domains included education, occupation, housing conditions, income and poverty, racial composition, and residential stability (Table 1). A common factor analysis with varimax rotation was applied to construct a deprivation factor from the 21 Census variables. Variables with significantly larger factor loadings on the deprivation factor were selected to build the socioeconomic deprivation index, and its internal consistency was evaluated using Cronbach's alpha coefficient.
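For readers unfamiliar with the internal-consistency check, Cronbach's alpha can be computed from the item variances and the variance of the summed score. The sketch below uses only the Python standard library; the function name and the toy data are hypothetical, and it works on the values as given (whether the Census variables were standardized first is not stated in the text).

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    items: list of k variables, each a list of n block-group values.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(items)
    totals = [sum(vals) for vals in zip(*items)]       # per-unit sum score
    item_var = sum(variance(col) for col in items)     # sum of item variances
    return (k / (k - 1)) * (1 - item_var / variance(totals))


alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
```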

Statistical Analysis
To capture differences in the characteristics of the nine GIS-based measures, we performed the analyses in three parts. First, we calculated Spearman rank correlation coefficients to compare their simple correlations. Second, we categorized all nine GIS-based measures into quartiles and computed weighted Kappa coefficients to examine their agreement. Quartiles reduce the effect of high and low prevalence on the Kappa coefficient [28]. The Kappa agreement was differentiated using a commonly cited scale: κ < 0, no agreement; κ = 0.01-0.20, slight agreement; κ = 0.21-0.40, fair agreement; κ = 0.41-0.60, moderate agreement; κ = 0.61-0.80, substantial agreement; κ > 0.80, almost perfect agreement [29]. Third, we computed global Moran's I indexes to compare differences in spatial autocorrelation, and we also performed Anselin local Moran's I tests to contrast the spatial patterns of these nine GIS-based measures. We specified the neighborhood relationship using the "Inverse Distance" weight function to obtain Moran's I statistics. All spatial features are assumed to influence one another, but the farther away a feature is, the smaller its influence [30]. The global Moran's I index is a spatial autocorrelation measure (feature similarity) ranging from -1 to 1. A value closer to 1 suggests a more clustered global spatial pattern, while a value closer to -1 suggests a more dispersed global spatial pattern. A completely random spatial pattern exists when Moran's I is zero [30]. The Anselin local Moran's I test is a tool to identify contiguous neighborhoods with values similar in magnitude (either high or low) and spatial outliers [30]. A spatial outlier is a local region with a high value surrounded by neighborhoods with significantly low values, or vice versa.
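The agreement and autocorrelation statistics described above can be illustrated with a short, dependency-free Python sketch. It is not the SAS or ArcGIS procedure used in the study: the function names are hypothetical, the Kappa weights are assumed to be linear, quartile ties are broken by input order, and the Moran's I weights are plain inverse distances without row standardization (coincident points are not handled).

```python
import math


def quartile_labels(values):
    """Rank-based quartile label (0..3) for each value; ties broken by order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * 4 // len(values)
    return labels


def weighted_kappa(a, b, k=4):
    """Linearly weighted kappa for two label lists with categories 0..k-1."""
    n = len(a)
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[x][y] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    po = pe = 0.0
    for i in range(k):
        for j in range(k):
            w = 1.0 - abs(i - j) / (k - 1)     # full credit on the diagonal
            po += w * obs[i][j]                # weighted observed agreement
            pe += w * row[i] * col[j]          # weighted chance agreement
    return (po - pe) / (1.0 - pe)


def morans_i(values, coords):
    """Global Moran's I with inverse-distance weights w_ij = 1 / d_ij."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    num = s0 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                w = 1.0 / math.dist(coords[i], coords[j])
                num += w * z[i] * z[j]
                s0 += w
    return (n / s0) * num / sum(d * d for d in z)


q = quartile_labels([0.3, 0.1, 0.9, 0.5, 0.7, 0.2, 0.8, 0.4])
line = [(0, 0), (1, 0), (2, 0), (3, 0)]   # four hypothetical centroids
```

With these toy inputs, a measure compared with itself gives a Kappa of 1, and values grouped at one end of the line yield a larger Moran's I than values that alternate from point to point.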
As an indicator of predictive validity, we examined the associations of the nine GIS-based measures with neighborhood risk of late-stage breast cancer. We applied a generalized linear mixed model to fit the multilevel logistic regression. All breast cancer cases were nested within their residential census block groups. The nine spatial accessibility measures and the socioeconomic deprivation index were dichotomized at the median (below vs. above) to facilitate interpretation. To examine the effect of spatial accessibility on late-stage breast cancer and the impact of neighborhood socioeconomic deprivation, we fitted the models in three ways. First, we used multivariate models that were adjusted for demographics and neighborhood socioeconomic deprivation to examine the independent effect of spatial accessibility. Second, we used jointly-classified models in which the two categories of spatial accessibility and the two categories of neighborhood socioeconomic deprivation were combined into one variable with four categories, which examines nonlinear effects of the combination of both variables. Third, we used stratified models in which the effect of spatial accessibility was examined in each stratum of neighborhood socioeconomic deprivation, which examines the interaction between both variables. Scaled deviance was used to evaluate goodness of fit, with smaller values indicating a better fit. The data were managed and analyzed using SAS (Version 9.2, SAS Institute Inc., Cary, NC). Global and local Moran's I analyses were computed using the ArcGIS spatial statistics tools package, and GIS mapping was performed in ArcMap (ArcGIS, Version 9.3.1, ESRI Inc., Redlands, CA).
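The median split and the four-category joint classification can be sketched as below. The names, the toy values, the coding of the categories, and the choice to place values equal to the median in the lower group are all assumptions of this example; the text does not specify how ties were handled or how the categories were coded.

```python
from statistics import median


def above_median(values):
    """1 if a value is above the sample median, else 0 (ties go to 0)."""
    cut = median(values)
    return [1 if v > cut else 0 for v in values]


def joint_category(low_access, high_deprivation):
    """Combine two 0/1 indicators into one four-level variable.

    0: high access, low deprivation (a natural reference group)
    1: low access, low deprivation
    2: high access, high deprivation
    3: low access, high deprivation
    """
    return [2 * d + a for a, d in zip(low_access, high_deprivation)]


access = [0.8, 0.2, 0.5, 0.9]           # hypothetical accessibility scores
deprivation = [1.0, 1.5, -0.3, -0.4]    # hypothetical deprivation index
low_access = [1 - x for x in above_median(access)]
joint = joint_category(low_access, above_median(deprivation))
```

For travel-time measures the direction is reversed (a longer time means lower access), so `above_median` would be used directly as the low-access indicator.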

Results
Service density measures had a much broader range than measures using shortest travel time(s) and 2SFCA methods (Table 2). The spatial pattern of neighborhood accessibility to mammography service using different spatial methods is shown in Figure 1. For the 2SFCA measures, the methods with distance-decay weighting showed larger variation and broader spatial accessibility ranges (Table 2 and Figure 1) than the unweighted method (SAU vs. SAC, SA3Q, SA3S, SA6Q and SA6S), while quicker zonal weighting made the SA structure broader than slower zonal weighting (SA3Q vs. SA3S and SA6Q vs. SA6S).
The principal components common factor analysis identified the first common factor as the deprivation factor which explained 44.1% of the total variance. The nine Census variables, with large factor loading on the deprivation factor, included the percentage of civilian labor force unemployed, the percentage of vacant households, the percentage of households with no less than one person per room, the percentage of female headed households with dependent children, the percentage of households with public assistance income, the percentage of households with no vehicle, the percentage of households with no phone, the percentage of population below federal poverty line, and the percentage of non-Hispanic African Americans. These nine Census variables indicated a high internal consistency (Cronbach's alpha = 0.93, Table 1).
The [...] Table 3. Table 4 showed [...] (Table 5). Anselin local Moran's I tests exhibited considerable differences in the spatial patterns of the nine GIS-based measures (Figure 2). The areas with high access based on the shortest travel time(s) measures were located mainly in the east-central part of the study area, whereas for the 2SFCA measures they were located in the central part. The service density measures showed smaller cluster areas than the other measures.

Discussion
Our main purpose was to compare varied GIS-based measures of access to mammography service computed using three different spatial approaches, and we also determined their predictive validity in terms of their association with the odds of late-stage breast cancer. Our study demonstrated that the correlation and agreement among the different measures (shortest travel time, service density and 2SFCA measures) were low. Also, the spatial pattern of the measures varied considerably. Only measures using the 6-timezone-weighted 2SFCA method were significantly associated with increased neighborhood odds of late-stage breast cancer after accounting for demographics and neighborhood socioeconomic deprivation. The effect of neighborhood socioeconomic deprivation could be explained in part by neighborhood spatial accessibility. Combined with more deprived neighborhood socioeconomic conditions, lower spatial accessibility to mammography service is associated with greater neighborhood risk of late-stage breast cancer.
Service availability or density is the most common measure in assessing spatial accessibility because it is easy to compute [3,4,5,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48]. It does not require advanced GIS skills; one only needs to link each service location to its corresponding neighborhood or to a predefined buffered area in which that service facility is located. With the rapid development of GIS techniques, it has also become convenient to compute network-based distance from a GIS road network layer. This has led to the frequent use of the shortest travel time (or nearest travel distance) to service locations for assessing service accessibility [6,7,10,11,12,14,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67]. Recently, it has become feasible to create a composite accessibility index using a state-of-the-art two-step floating catchment area (2SFCA) approach. This approach is more reasonable than availability/density and nearest travel distance/shortest travel time because it integrates travel barriers and service competition. However, it is a more sophisticated technique requiring several sequential steps. First, one needs to compute a travel distance/time matrix between service locations and population locations within a predefined catchment area using a GIS, such as ArcGIS Network Analyst. If a study involves large numbers of study neighborhoods and participant locations, this process could be very time-consuming. Second, one needs to compute the population-service ratio for each service location and then a composite accessibility score for each population location using statistical derivations with varied weighting techniques, such as the Enhanced 2SFCA method [18] and the Gaussian 2SFCA method [19]. Additionally, some efforts to improve the technical precision of the 2SFCA method make this approach more complex.
Luo and Whippo explored an approach to reduce the bias due to rural-urban differences by using a predefined base population threshold and a service-to-population ratio threshold to create varied catchment sizes for each service location and population location instead of a fixed catchment size (fixed travel time or distance) [20]. Recently, Bell and Bissonnette et al. developed an extension of the 2SFCA method, called the 3SFCA, which aggregates the small-area spatial accessibility score to a larger study neighborhood [21,68]. Both methods could substantially improve measurement precision and reduce the influence of rural-urban differences, especially when the study area is large or irregular. However, they also add considerable computational burden if the study sample size is large. In the former approach, the travel distance/time matrix needs to be computed over a much larger range, such as 30-minute or even 60- or 90-minute travel time, to capture the specific catchment sizes of each service and population location, while the latter requires an additional step to obtain the accessibility measure.
Briefly, for most researchers, service availability/density and nearest travel distance/shortest travel time are easier to compute, although they ignore travel barriers or service competition. In contrast, the 2SFCA and its extended methods are more technical and require stronger computational skills, although this approach has methodological advantages. Therefore, it is necessary to compare these GIS measures in principle and in predictive validity for a specific study outcome. If there is no significant difference, service availability/density and/or nearest travel distance/shortest travel time could be applied instead of the more complex 2SFCA approaches. Otherwise, it may be better to apply the more advanced 2SFCA approach. It is noteworthy that, for the 2SFCA approach, the number of time zones and the decay weighting parameters should be evaluated for different study outcomes. In our study, more time zones worked better, while the decay rate did not seem to play a role. In addition, for a study area with mixed characteristics, rural-urban differences may need to be considered when assessing spatial accessibility, for example by applying varied catchment sizes [20] or by aggregating small-area accessibility measures to larger neighborhoods [21]. Our study indicated that the GIS-based measures of spatial accessibility exhibit different characteristics. The findings suggest that the weighted 2SFCA method is better than service density and shortest travel time when assessing spatial accessibility to mammography service. Future studies should further investigate and improve the 2SFCA methods and compare GIS-based measures with perceived accessibility when assessing the neighborhood effect of the distribution of mammography service. Appropriate assessment could reduce bias when investigating the effect of spatial accessibility on breast cancer outcomes.
Additionally, precise and reliable measures of spatial accessibility to mammography can not only provide justification for effective multilevel interventions, but also help local and state policy makers and health service planners identify mammography service shortage areas and improve the allocation of mammography services to reduce the geographic disparity in breast cancer-related outcomes that appears to exist in community settings. The selection of GIS-based measures can be extended to other areas of public health, including accessibility to other medical services, the food environment, and alcohol or cigarette sales environments [33,43,48,57,62,66].
There are several strengths to our study. We computed nine GIS-based measures of access to mammography services using three different spatial approaches, including shortest travel time, service density and the 2SFCA method, and systematically compared their correlation, agreement and spatial pattern within a single study region and population. The 2SFCA approach with more time-zone weighting appears to capture more detail in the spatial pattern and to show a significant or stronger association of spatial accessibility to mammography service with late-stage breast cancer. We used the number of mammography machines as the service capacity and the population of women aged 40 and above as the screening-eligible population. We also used the Census block group as the geographic unit, which is much smaller than the ZIP code and can lead to a more precise measurement of accessibility.
Our study also has some limitations. First, our findings may only be generalizable to metropolitan areas. Results may be different when examining more rural areas [18]. Second, spatial accessibility for block groups at the edge of the study area could have been underestimated since we did not include facilities outside the study area. However, this is unlikely to have affected our findings because there was only one mammography facility near the Missouri River. On the west and east sides of the study area, the Missouri River and the Mississippi River formed a natural boundary. Third, except for age and race, our study did not include other individual-level factors that are associated with late-stage breast cancer, such as marital status, low education, unemployment, health insurance coverage, nonparticipation in regular general health check-ups, low interest in health issues and diagnostic delay [69,70,71,72,73]. Additionally, our study assumed that all women with the same travel time had equal opportunity to access a mammography facility, that all facilities provided services of similar quality, and that each mammography radiologist in each mammography facility had equal capacity to read mammography films.
Women with lower income or without health insurance coverage might seek mammography service from safety net providers even if these locations are farther away and provide lower quality services than other facilities. Regardless of these limitations, our findings provide helpful information to policy makers about where accessibility to needed mammography services is lower, which they can use to address these gaps and reduce the odds of late-stage breast cancer diagnosis. Future studies could include additional risk factors and service facility characteristics to validate the independent effect of spatial accessibility to mammography service.
In conclusion, different GIS-based measures appear to describe different concepts based on their intercorrelations, agreements and spatial patterns. Caution should be exercised in selecting a spatial approach in assessing access to mammography when investigating neighborhood contextual effects on breast cancer outcomes. The 2SFCA measure appears to be the best approach based on theoretical considerations, spatial patterns and predictive validity. Our findings suggest that the 2SFCA approach can be a valuable option for epidemiologists when investigating the health effects of the distributions of regional accessibility to services.