## Abstract

Health disparities are commonplace and of broad interest to policy makers, but are also challenging to measure and communicate. The Health Disparity Calculator software (HD*Calc, v1.2.4) offers Monte Carlo simulation (MCS)-based confidence interval (CI) estimation of eleven disparity measures. The MCS approach provides accurate CI estimation, except when data are scarce (e.g., rare cancers). To address sparse data challenges to CI estimation, we propose two solutions: 1) employing the gamma distribution in the MCS and 2) utilizing a zero-inflated Poisson estimate for Poisson sampling in simulation experiments. We evaluate each solution through simulation studies using female breast, female brain, lung, and cervical cancer data from the Surveillance, Epidemiology, and End Results (SEER) program. We compare the coverage probabilities (CPs) of eleven health disparity measures based on simulated datasets. The truncated normal distribution implemented in the MCS with the standard Poisson samples (the default setting of HD*Calc) leads to less-than-optimal coverage probabilities (<95%). When both the gamma distribution and the estimated mean from the zero-inflated Poisson are used for the MCS, the coverage probabilities are close to the nominal level of 95%. Simulation studies also demonstrate that collapsing age categories for better CI estimation is not a pragmatic solution.

**Citation:** Ahn J, Harper S, Yu M, Feuer EJ, Liu B (2019) Improved Monte Carlo methods for estimating confidence intervals for eleven commonly used health disparity measures. PLoS ONE 14(7): e0219542. https://doi.org/10.1371/journal.pone.0219542

**Editor:** Aamir Ahmad, University of South Alabama Mitchell Cancer Institute, UNITED STATES

**Received:** January 15, 2019; **Accepted:** June 14, 2019; **Published:** July 11, 2019

This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

**Data Availability:** Data are available from https://seer.cancer.gov/.

**Funding:** JA was supported by National Cancer Institute of the National Institutes of Health HHSN261201500408P. The funders played a role in data collection and preparation of the manuscript.

**Competing interests:** The authors have declared that no competing interests exist.

## Introduction

Health disparities are commonplace and have received considerable attention in recent years. Understanding the magnitude of health disparities and how they change over time has been challenging for policy makers and researchers interested in identifying effective interventions to reduce or eliminate health disparities [1, 2]. To support that goal, statistical software called HD*Calc [3] calculates eleven health disparity measures on both absolute and relative scales to measure the magnitude and trend in health disparities for specific cancer or other health outcomes. Thus far, a number of papers have employed HD*Calc and reported on the trend of disparities for a diverse range of disease outcomes [4–6]. Recently, two new disparity indices, the extended relative concentration index and the absolute concentration index with directly standardized rates, were developed together with their statistical inferences and will be added to the next version of HD*Calc [7].

When it comes to estimating confidence intervals (CIs), HD*Calc offers two numerical approaches: an analytic approach and a Monte Carlo simulation (MCS)-based approach. The current version of the analytic Taylor-series expansion approach for relative risks, absolute risks, and Index of Disparity measures may lead to conservative confidence intervals because the dependence between the two groups being compared is neglected when deriving partial derivatives [8]. Although the MCS method performs relatively better with no such issue, its coverage probability can deviate from the nominal value (0.95) in situations with sparse data. Because the MCS is a simulation-based approach, inadequate simulation procedures can result in such poor performance. A possible reason for this shortcoming is that a truncated normal distribution is employed in sampling age-adjusted rates (AARs) to ensure positive AARs during the Monte Carlo simulation. When the truncated normal distribution does not represent the true distribution of AARs, CI estimation will depart from the truth. In addition, Ahn et al. (2018) [8] demonstrated that CI estimation tends to be inaccurate when more than 25% of age groups have cancer incidence rates of zero, often referred to as structural zeros. In this circumstance, with the standard Poisson distribution, the simulated counts are always zero because the mean and the variance are identically zero by construction. Ahn et al. (2018) added 1/*population*_{agegroup} to the Poisson mean; however, this is an *ad hoc* modification and does not produce the true population mean. In addition, this approach often fails to address the problem of rare events. Ahn et al. (2018) also suggest collapsing age categories (e.g., 10-year rather than 5-year age groups) as a tentative solution when data are scarce, a strategy commonly used in categorical data analyses. However, details are lacking on: 1) how to combine age categories when different social groups have different frequencies of events and 2) the empirical performance of such a collapsing strategy.

To bridge these gaps, we propose: 1) employing a gamma distribution instead of the truncated normal distribution for the MCS-based method; and 2) using the estimated mean from a zero-inflated Poisson distribution for Poisson simulation experiments when there is an excess of zero incidence among age groups. Through simulation studies, we illustrate how the proposed approaches perform compared to the original MCS approach. We use female breast, female brain, lung, and cervical cancer data from the Surveillance, Epidemiology, and End Results (SEER) program (https://seer.cancer.gov/). We also explore the practicality of the age-category collapsing strategy by conducting exploratory simulation studies.

## Materials and methods

### Variance estimation using the Monte Carlo simulation-based method

For the MCS approach, HD*Calc simulates age-adjusted rates (AARs) for each social group *j* using a left-truncated normal distribution, where the mean and variance are estimated using the sample mean and variance of the normal distribution. In the current version of HD*Calc, the generation of the truncated normal samples can be inefficient in that sampling from the normal distribution is repeatedly performed until a positive value is drawn. Alternatively, a gamma distribution assures a positive value and is more flexible in terms of asymmetry compared to the normal or truncated normal distribution. For very rare disease incidence, or when the distribution of the corresponding AARs is skewed with its mass concentrated near zero, the truncated normal distribution may be less plausible. Instead, the gamma distribution can be used as an alternative for sampling AARs. In applications, the shape and scale parameters of the gamma distribution are determined from the observed sample mean and variance of the AARs. Under the Monte Carlo simulation, the AAR for each social group *j* is repeatedly simulated to obtain *B* simulated values (*B* is set at 1,000 as a default in HD*Calc). Once each health disparity measure is obtained for all iterations *b* = 1, …, *B*, the estimated health disparity measures are sorted, and the lower and upper bounds of the CI are selected by percentile. For example, for a 95% CI, the 2.5th and 97.5th percentile values are chosen.
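
The gamma-based sampling and percentile step described above can be sketched as follows. This is only an illustration under stated assumptions, not HD*Calc's implementation: the function names, the input values, and the toy disparity measure (largest minus smallest simulated rate) are hypothetical, and the gamma parameters are obtained by simple moment matching (mean = shape × scale, variance = shape × scale²).

```python
import random


def gamma_params(mean, var):
    """Moment-match a gamma distribution: mean = shape * scale, var = shape * scale**2."""
    shape = mean ** 2 / var
    scale = var / mean
    return shape, scale


def percentile_ci(values, level=0.95):
    """Percentile interval read off the sorted simulated values."""
    ordered = sorted(values)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    lower = ordered[int(tail * last)]
    upper = ordered[int(round((1.0 - tail) * last))]
    return lower, upper


def simulate_ci(means, variances, B=1000, seed=0):
    """Draw B sets of AARs from matched gamma distributions and return a
    95% percentile CI for a toy measure (largest minus smallest rate)."""
    rng = random.Random(seed)
    params = [gamma_params(m, v) for m, v in zip(means, variances)]
    measures = []
    for _ in range(B):
        # random.gammavariate takes (shape, scale) and always returns a positive value.
        rates = [rng.gammavariate(shape, scale) for shape, scale in params]
        measures.append(max(rates) - min(rates))
    return percentile_ci(measures)


# Hypothetical AARs and variances for three social groups (not SEER values).
lower, upper = simulate_ci([2.0, 3.5, 5.0], [0.8, 1.0, 1.5])
print(f"95% CI for the toy measure: ({lower:.2f}, {upper:.2f})")
```

Because every gamma draw is positive, no rejection loop is needed, in contrast to repeated sampling from a normal distribution until a positive value appears.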

### Simulation studies

We aimed to assess the performance of: 1) the gamma distribution in comparison to the truncated normal distribution under the MCS approach and 2) the zero-inflated Poisson mean for the Poisson distribution in generating simulation datasets. To do so, we conducted a simulation study considering four schemes: (1) the truncated normal distribution with standard Poisson sampling (TNP, the default in HD*Calc, where the Poisson mean is adjusted by 1/*population*_{agegroup}); (2) the truncated normal distribution with Poisson sampling using the estimated mean from the zero-inflated Poisson (TNZP); (3) the gamma distribution with standard Poisson sampling (GP); and (4) the gamma distribution with Poisson sampling using the estimated mean from the zero-inflated Poisson (GZP).

For reference, we used the NCI’s SEER Program data. Without loss of generality, we focused on female breast cancer data from Kentucky and female brain, lung, and cervical cancer data from Iowa, where HD*Calc has been shown to provide inaccurate CIs attributable to excessive proportions of zeros (>25%) [8]. Data characteristics in terms of the number of social groups, minimum and maximum of age-adjusted rates, minimum and maximum variance of AAR, average event count per social group, proportion of zero event counts, and population size are summarized in Table 1. These four cancer types represent ordered and unordered social groups as well as a varying number of social groups used to calculate the summary measure of disparity. For example, lung cancer data comprise household income decile groups (*J* = 10) whereas female brain data comprise six unordered racial groups (*J* = 6).

Each data set consists of 19 standard age groups based on 5-year intervals. For female breast and female brain cancer, six racial/ethnic groups (Hispanic (reference), White, Black, Asian, Asian Pacific Islander, and Unknown) were used. For cervical cancer and lung cancer, two ethnic groups (Spanish Hispanic Latino and Non-Spanish Hispanic Latino (reference)) and ten groups based on household income deciles in 2008 (the lowest decile group is the reference) were used, respectively.

For the sampling design, we use the following terminology. Let *x*_{jk} denote the event count for the *j*-th social group and *k*-th age group, and let *y*_{j} be the age-adjusted rate for the *j*-th social group. When *x*_{jk} is zero for the *j*-th social group and *k*-th age group, simulating from *Poisson*(*x*_{jk}) always leads to zero. In such a case, this can become a structural zero problem.
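
The structural zero problem can be seen in a short sketch. The sampler below is Knuth's multiplication method, used here only because it needs nothing beyond the standard library; the function name and the positive mean of 0.5 are hypothetical and chosen purely for illustration.

```python
import math
import random


def poisson_draw(mean, rng):
    """Knuth's multiplication method for one Poisson draw (suitable for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(1)

# An observed count of zero used directly as the Poisson mean: every draw is zero.
zero_cell = [poisson_draw(0.0, rng) for _ in range(1000)]

# A hypothetical small positive mean: some draws are non-zero.
positive_cell = [poisson_draw(0.5, rng) for _ in range(1000)]

print("events simulated with mean 0:  ", sum(zero_cell))
print("events simulated with mean 0.5:", sum(positive_cell))
```

With a mean of zero the simulated cell carries no variability at all, which is why a strictly positive Poisson mean is needed for cells with no observed events.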

To address this situation, we propose to exploit the estimates of a zero-inflated Poisson distribution to simulate *x*_{jk} [9]. Specifically, the zero-inflated Poisson can be expressed as a mixture distribution, i.e., *Pr*(*x*_{jk} = 0) = *π*_{jk} + (1 − *π*_{jk})exp(−λ_{jk}) and *Pr*(*x*_{jk} = *c*) = (1 − *π*_{jk})λ_{jk}^{*c*}exp(−λ_{jk})/*c*! for *c* = 1, 2, …, where *π*_{jk} is the weight probability of extra zeros at the *k*-th age group and social group *j* and λ_{jk} is the population-adjusted Poisson mean. Using SEER data from 13 states for a specific cancer allows us to observe some non-zero events at a given age group *k*. In this case, we are able to obtain estimates of *π*_{jk} and λ_{jk} using the *zeroinfl()* function in *R* package version 3.11. With these estimates, we consider the maximum of the estimated zero-inflated Poisson mean and 1/*population*_{agegroup} = 1/*n*_{jk} as the Poisson mean in order to avoid structural zeros. That is, when *x*_{jk} = 0, we sample from the Poisson distribution with this mean rather than from the conventional Poisson distribution. Resulting density plots of the proportions of zeros for four cancer types after employing this approach are illustrated in S1 Fig.
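
As a rough sketch of the zero-inflated Poisson mixture and the choice of Poisson mean described above, consider the following. It rests on assumptions that the text does not spell out: the "estimated mean" is taken to be the standard zero-inflated Poisson mean (1 − *π*)λ, expressed on the same scale as 1/*n*_{jk}, and cells with a positive observed count are assumed to use that count as the Poisson mean. All function names and numbers are hypothetical.

```python
import math


def zip_pmf(count, pi, lam):
    """Zero-inflated Poisson probability of observing `count` events."""
    poisson = math.exp(-lam) * lam ** count / math.factorial(count)
    if count == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_mean(pi, lam):
    """Mean of the zero-inflated Poisson mixture."""
    return (1.0 - pi) * lam


def sampling_mean(observed_count, pi_hat, lam_hat, population):
    """Poisson mean used to simulate one social-group-by-age-group cell.

    Assumption: cells with events keep the observed count as the mean; cells
    with zero events use the larger of the ZIP mean and 1/population.
    """
    if observed_count > 0:
        return float(observed_count)
    return max(zip_mean(pi_hat, lam_hat), 1.0 / population)


# Hypothetical estimates for a cell with no observed events.
print(sampling_mean(0, pi_hat=0.6, lam_hat=0.5, population=1000))
```

In practice the estimates of *π*_{jk} and λ_{jk} would come from a fitted zero-inflated model rather than being typed in by hand as they are here.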

For the gamma distribution in the MCS approach, we used the scale and shape parameters corresponding to the mean and variance parameters obtained using the normal distribution. We repeated this procedure 10,000 times to generate simulated event counts and obtain the corresponding AARs. We used HD*Calc to calculate the eleven health disparity measures implemented in the current version (v1.2.4).
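
Results from repeated simulations of this kind are usually summarized by the coverage probability, that is, the share of simulated CIs that contain the reference value. A minimal sketch, with a hypothetical function name and made-up intervals:

```python
def coverage_probability(intervals, true_value):
    """Fraction of (lower, upper) intervals that contain true_value."""
    hits = sum(1 for lower, upper in intervals if lower <= true_value <= upper)
    return hits / len(intervals)


# Hypothetical CIs from four simulated datasets and a reference value of 1.0.
simulated_cis = [(0.7, 1.4), (0.9, 1.6), (1.1, 1.8), (0.5, 1.2)]
print(coverage_probability(simulated_cis, true_value=1.0))
```

A method whose nominal 95% intervals cover the reference value in clearly fewer than 95% of simulated datasets is under-covering.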

## Results

Table 2 reports the simulation results in terms of the coverage probabilities (CPs) of the eleven health disparity measures using the aforementioned four schemes. The TNP yields CPs below the 95% nominal level for most of the eleven measures across all four cancer types, and CPs of approximately 70% to 80% are observed for female brain cancer. The poor performance for female brain cancer may seem surprising; however, the female brain cancer data have zero cancer events in 66% of cells across all age groups. There are subtle improvements in CPs with the TNZP compared to the TNP; however, the improvements are not consistent across health disparity measures. For example, TNZP increased the CPs for 8 out of 11 measures for female breast cancer by 0.5% to 1.6%; however, CPs for Range Difference (RD), Range Ratio (RR), and Index of Disparity (IDisp) decreased by more than 3%. The GP yields relatively better CPs than the TNZP and TNP by reducing the distance from the 95% nominal level across all four cancer types. For example, for lung cancer, the CPs from the GP become closer to 95% with a few exceptions (e.g., Between-Group Variance (BGV) and Theil’s Index (T)). The GZP produces CPs consistently close to 95% for most measures in three cancer types, the exception being female brain cancer. Overall, the GZP appears to produce better CPs than the other three approaches, while GP outperforms TNZP. For female brain cancer data, however, all four schemes demonstrate inaccurate CI estimation.

Analogous to approaches for contingency tables with zero count cells, collapsing age boundaries can serve as an alternative strategy to reduce zero incidence age groups. To evaluate whether collapsing categories can resolve the incorrect CI calculation, we conducted an exploratory study by re-analyzing previously generated simulation datasets using the TNP and GP under the MCS approach. Considering that each SEER data set consists of 19 standard 5-year age groups, it is conceivable that younger age groups are more likely to have zero cancers. We computed the tally of cancer events in consecutive age groups for each cancer type and then used four broad age groups (0-39, 40-54, 55-69, 70 or higher) to obtain at least one cancer event in each age group. We also considered eight different age groups (0-19, 20-34, 35-50, 51-54, 55-59, 60-64, 65-74, 75 or higher) to assess sensitivity of the results to the number of collapsed age groups.

We evaluated the impact of the two strategies for collapsing age groups by comparing two summary measures, that is, the averaged relative change |*HD*_{k′} − *HD*_{k}|/*HD*_{k}, *k* = 19 (original estimates), *k*′ = 4 or 8, and its mean squared error in Table 3. We found approximately 5% and 10% relative changes in health disparity point estimates for lung cancer and female breast cancer, respectively. As expected, collapsing to eight age groups tended to produce health disparity measures that were closer to the original measures than those obtained from the four age group collapsing strategy in both lung and female breast cancer. For female brain and cervical cancer, however, collapsing age groups led to large departures from the original measures regardless of the number of age groups. This inconsistency suggests that collapsing age groups is unlikely to be a reliable strategy for improving CI estimates for measures of health disparity.
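
The averaged relative change used in this comparison can be computed as in the sketch below. The function name and the numbers are hypothetical, and the original estimates are assumed to be positive so that dividing by *HD*_{k} is well behaved.

```python
def averaged_relative_change(collapsed, original):
    """Mean of |HD_k' - HD_k| / HD_k across disparity measures.

    Assumes every original estimate HD_k is positive.
    """
    changes = [abs(c - o) / o for c, o in zip(collapsed, original)]
    return sum(changes) / len(changes)


# Hypothetical point estimates for three measures: 19 age groups vs. collapsed groups.
original = [10.0, 2.0, 0.5]
collapsed = [11.0, 1.8, 0.55]
print(f"{averaged_relative_change(collapsed, original):.1%}")
```

Averaging across measures hides which individual measures move most, so per-measure changes are worth inspecting alongside the summary.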

## Conclusion

This article focuses on the situation in which the Monte Carlo simulation-based approach implemented in HD*Calc fails to accurately estimate CIs in the context of rare cancers or cancers with excess zeros for some age groups. We demonstrated that the gamma distribution in the MCS approach and Poisson sampling with the zero-inflated Poisson mean for simulation experiments can improve CI estimation in HD*Calc. When the gamma distribution and the zero-inflated Poisson mean estimates are employed simultaneously, we observed consistently better performance compared with HD*Calc’s current approach, which is based on the truncated normal distribution. However, when cancer events are rare (e.g., female brain cancer), the proposed solutions could not fully resolve the problem of under-coverage.

## Discussion

Although we demonstrated that utilizing the zero-inflated Poisson mean estimates for Poisson sampling alleviates inaccurate CI estimation issues to some degree, this approach may not be practical in every situation. This is because the zero-inflated Poisson approach requires data for a specific cancer from multiple populations (e.g., counties and states) in order to obtain the empirical estimate of the weight that should be placed on zero events. In addition, unlike the gamma distribution, which can be easily implemented in HD*Calc, the zero-inflated Poisson approach has to be applied manually through simulation experiments. When the zero-inflated Poisson approach is not applicable, the gamma distribution, in place of the truncated normal distribution for the MCS, can be used alone. As illustrated in Table 3, reducing the number of age categories can yield large variability in HD measures, and the resultant HD measures are sensitive to which age groups are combined. Both of these problems make the category-collapsing approach undesirable for better estimating CIs.

Problems with scarce events are likely to occur in the context of rare diseases and when sampling is in less-populated areas. As discussed in [8], when more than 25% of age groups have zero disease incidence, HD*Calc users are advised to be cautious in making inferences from health disparity measures, including interpreting 95% confidence intervals of disparity measures. HD*Calc users may conduct simulation studies in advance to examine the validity of the confidence intervals or standard errors and then make inferences about the presence or trend of health disparities. If a sampling zero problem persists, it may be helpful to reevaluate age group boundaries and to perform a sensitivity analysis that evaluates the effect of regrouping age groups and shows how the results vary.

## Supporting information

### S1 Fig. Density plots of the proportions of zeros for female breast, female brain, lung, and cervical cancer data.

The blue line represents the density when zero-inflated Poisson (ZIP) mean estimates are used for Poisson sampling, while the red line represents the density when the standard 1/*population* is used for Poisson sampling.

https://doi.org/10.1371/journal.pone.0219542.s001

(PDF)

## Acknowledgments

Research reported in this publication was partially supported by the National Cancer Institute of the National Institutes of Health under contract number HHSN261201500408P. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

## References

- 1. Rust G and Cooper LA. How Can Practice-based Research Contribute to the Elimination of Health Disparities? J. Am. Board. Fam. Med., 2007;20:105–114. pmid:17341746
- 2. Strategies for Reducing Health Disparities. https://www.cdc.gov/minorityhealth/strategies2016/index.html.
- 3. Breen N, Scott S, Percy-Laurry A, Lewis D, Glasgow R. Health disparities calculator: A methodologically rigorous tool for analyzing inequalities in population health. American Journal of Public Health, 2014;104:1589–1591. http://seer.cancer.gov/hdcalc pmid:25033114
- 4. Rossen LM and Schoendorf KC. Measuring health disparities: trends in racial-ethnic and socioeconomic disparities in obesity among 2- to 18-year old youth in the United States, 2001-2010. Ann Epidemiol., 2012;22:698–704. pmid:22884768
- 5. Kent EE, Breen N, Lewis DR, de Moor JS, Smith AW, Seibel NL. US trends in survival disparities among adolescents and young adults with non-Hodgkin lymphoma. Cancer Causes Control, 2015;26:1153–1162. pmid:26084209
- 6. Breen N, Lewis D, Gibson JT, Yu M, Harper S. Measuring health disparities: trends in racial-ethnic and socioeconomic disparities in obesity among 2- to 18-year old youth in the United States, 2001-2010. Cancer Causes Control, 2017;28:117–125.
- 7. Yu M, Liu B, Li Y, Zou J, Breen N. Statistical inferences of extended concentration indices for directly standardized rates. Stat. in Med., 2019;38:1–12.
- 8. Ahn J, Harper S, Yu M, Feuer EJ, Liu B, Luta G. Variance estimation and confidence intervals for eleven commonly used health disparity measures. JCO: Clin. Can. Inform., 2018.
- 9. He H, Tang W, Wang W, Crits-Christoph P. Structural zeroes and zero-inflated models. Shanghai Arch Psychiatry, 2014;25:236–242.