Racial and ethnic disparities in HIV diagnoses among heterosexually active persons in the United States nationally and by state, 2018

Background: Despite declining HIV infection rates, persistent racial and ethnic disparities remain. Appropriate calculations of diagnosis rates by HIV transmission category, race and ethnicity, and geography are needed to monitor progress towards reducing systematic disparities in health outcomes. We estimated the number of heterosexually active adults (HAAs) by sex and state to calculate appropriate HIV diagnosis rates and disparity measures within subnational regions.
Methods: The analysis included all HIV diagnoses attributed to heterosexual transmission in 2018 in the United States (50 states and the District of Columbia). Logistic regression models estimated the probability of past-year heterosexual activity among adults in three national health surveys, by sex, age group, race and ethnicity, education category, and marital status. Model-based probabilities were applied to state population data to estimate counts of HAAs by state, with survey-specific estimates synthesized through meta-analysis. HIV diagnoses were overlaid to calculate racial- and ethnic-specific rates, rate differences (RDs), and rate ratios (RRs) among HAAs by sex and state.
Results: Nationally, HAA women have a two-fold higher HIV diagnosis rate than HAA men (rate per 100,000 HAAs, women: 6.57; men: 3.09). Compared to White non-Hispanic HAAs, Black HAAs have a 20-fold higher HIV diagnosis rate (RR, men: 21.28, women: 19.55; RD, men: 15.40, women: 31.78) and Hispanic HAAs have a 4-fold higher HIV diagnosis rate (RR, men: 4.68, women: 4.15; RD, men: 2.79, women: 5.39). Disparities were ubiquitous across regions, with >75% of states in each region having Black-to-White RR ≥10.
Conclusion: The racial and ethnic disparities across regions suggest a system-wide failure, particularly with respect to preventing HIV among Black and Hispanic women. Pervasive disparities emphasize the role for coordinated federal responses such as the current Ending the HIV Epidemic (EHE) initiative.


Introduction
Although human immunodeficiency virus (HIV) diagnosis rates among heterosexually active adults (HAAs) in the United States (US) have declined, reducing racial and ethnic inequities in HIV prevention and treatment is a priority in national and jurisdictional HIV strategies [1][2][3]. A recent analysis among HAA men in the US from 2014-2018 revealed substantial racial and ethnic inequities using 12 separate disparity measures [4,5]. Although HIV diagnosis rates are highest among persons reporting male-to-male sexual contact, 9.4% of new diagnoses among males and 84.6% among females aged ≥13 years are heterosexually acquired [6]. To meet national, state, and local "ending the epidemic" goals, further reduction of new infections among HAAs is needed, particularly among women.
Numerous factors contribute to inequities in HIV acquisition and health outcomes. Social and structural factors include poverty, unstable housing, incarceration, socioeconomic status, educational attainment, access to quality HIV prevention and care, and racial discrimination [7]. A recent analysis of durable viral suppression among adolescents and young adults nationally found that while disparities existed for all racial and ethnic groups, Black persons had the lowest durable viral suppression (36.1%, compared to 50.8%, 46.7%, and 47.3% among persons identifying as White, Hispanic, or other groups), which in turn increases transmission risk [8]. There are several reasons why viral suppression is lower among Black populations. First, minority patients are less likely to have healthcare providers of the same racial and ethnic identity; congruence in identity is associated with better patient-provider relationships, and lower cultural competence among providers is associated with worse HIV care outcomes [9]. Second, lower health literacy among Black persons with HIV may also negatively impact adherence to antiretroviral therapy [10]. Furthermore, structural racism, discrimination, and mistrust in the health system create barriers to HIV services utilization among Black persons [11]. These mechanisms also apply to other racial and ethnic minority groups.
Appropriate calculations of diagnosis rates by HIV transmission category, race and ethnicity, and geography are needed as one important metric in the HIV care continuum to monitor progress towards reducing systematic disparities in health outcomes that are the result of unjust social, economic, and environmental conditions. Racial and ethnic disparities in HIV diagnoses among HAA men and women are documented nationally [4,5,12], but state-level estimates for men and women are lacking. Building on recent work estimating HIV diagnosis rates among men who have sex with men (MSM) [13,14], we use meta-analysis to estimate state-level populations of HAAs, HIV diagnosis rates by sex, and an absolute and a relative measure of Black-White and Hispanic-White disparities in diagnosis rates. We use the term HAA, rather than heterosexual, because sexual activity and sexual orientation differ. The term "heterosexual" denotes sexual orientation, which cannot be deduced from the data. Instead, "heterosexually active" is used to denote reported sexual activity with someone of the opposite sex.

Analytic overview
Consistent with prior work to estimate populations of MSM [13,14] and HAAs [15] nationally, we used a four-step process to develop state-level estimated HIV diagnosis rates, rate differences (RDs), and rate ratios (RRs) of racial and ethnic disparities among HAAs (see Fig 1). First, we used logistic regression models to estimate the probability of past-year heterosexual activity among persons aged ≥18 years in three national health surveys by sex, age group, race and ethnicity, education category, and marital status. Second, model-based probabilities were applied to the American Community Survey (ACS) to generate proportions of HAA men and women by state. Third, we used meta-analysis to synthesize estimated probabilities across surveys; these were applied to the ACS to generate estimated populations of HAAs. Fourth, HIV diagnoses were overlaid to calculate state-level, sex-specific rates by race and ethnicity, and RDs and RRs as measures of racial and ethnic disparities.

Data sources and methods for HAAs
Four publicly-available national surveys were used to estimate HAA populations: National Health and Nutrition Examination Survey (NHANES; pooled 2013-2014 and 2015-2016 waves) [16], National Survey of Family Growth (NSFG; 2015-2017 wave) [17], General Social Survey (GSS; pooled annual surveys from 2014, 2016, and 2018) [18], and ACS (5-year estimates from 2014-2018) [19]. Heterosexual activity among men was defined as self-reported sex with women exclusively in the past year. Heterosexual activity among women was defined as self-reported sex with men exclusively, or sex with both men and women in the past year. Women who had both male and female sex partners were included because while there are occasional case reports of female-to-female HIV transmission, this mode of transmission is rare [20] and thus we were inclusive of all women who reported male sexual partners. The "past year" definition was used to represent current sexual activity.
In the first stage, four logistic regression models estimated the probability of past-year heterosexual activity nationally. Three models developed these estimates separately for adults 18-50 years in NHANES, NSFG, and GSS; this age range was common to all surveys. Models included covariates and their two-way interactions for: age group (18-29, 30-39, and 40-50), sex (female or male), race and ethnicity (White non-Hispanic, Black non-Hispanic, Hispanic, and all other races combined), education category (high school and lower, some college, and college graduate and above), and marital status (never married; married; and widowed, separated, or divorced). The models were developed with an intent to improve their predictive power rather than limiting to the most significant coefficients from a stepwise procedure. The covariates and their categories were selected based on a review of demographic covariates used in prior literature [15], consistency across the three surveys, availability at the state level in the ACS (for the second stage described below), and ability to ensure at least 5 observations in each stratum for sufficient degrees of freedom. Regarding the categorization of race and ethnicity, all other races were combined, and are referred to as "other races," because of the small numbers of survey participants in these groups when stratifying by state, age group, race and ethnicity, marital status, and other covariates as described below. A fourth model estimated the probability of past-year heterosexual activity among adults aged ≥51 years in the GSS, using all covariates except age group because there were too few observations in the higher age group to generate a reliable estimate if the model included additional age group stratifications. All models used available survey weights. NHANES weights were divided by two because we pooled 2013-2014 and 2015-2016 waves. GSS survey weights were not altered because they adjust for non-response, not US population representativeness.
Notes: a) Heterosexual activity among men was defined as sex with women exclusively in the past 12 months; heterosexual activity among women was defined as sex with men exclusively or with both men and women in the past 12 months. b) The logistic regression models contained covariates for age group (18-29, 30-39, and 40-50), sex (female or male), race and ethnicity (White, Black, Hispanic, and other), education category (high school and lower, some college, and college graduate and above), and marital status (never married; married; and widowed, separated, or divorced), yielding 216 unique demographic strata. The logistic regression model for GSS (ages 51+) did not contain a covariate for age, yielding 72 strata.
In the second stage, the ACS was used to project the percentage of state populations in the 216 strata (all age group, sex, race and ethnicity, education, and marital status combinations) for the 50 states and District of Columbia. This was done separately (for 72 sex, race and ethnicity, education, and marital status combinations) for persons aged ≥51 years. Results from the four logistic regressions were applied to the population compositions to estimate the proportion of HAAs aged 18-50 years (NHANES, NSFG, and GSS) and ≥51 years (GSS only) by stratum.
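The second stage can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' SAS implementation: the category labels, the function name `estimate_haa`, and all numeric inputs are hypothetical. It enumerates the 216 strata for ages 18-50 and applies national stratum-specific probabilities to one state's stratum population counts.

```python
from itertools import product

# Covariate categories for ages 18-50 (labels are illustrative).
AGE = ["18-29", "30-39", "40-50"]
SEX = ["female", "male"]
RACE = ["White", "Black", "Hispanic", "Other"]
EDUC = ["HS or lower", "Some college", "College graduate+"]
MARITAL = ["Never married", "Married", "Widowed/separated/divorced"]

# All combinations: 3 * 2 * 4 * 3 * 3 = 216 strata.
STRATA = list(product(AGE, SEX, RACE, EDUC, MARITAL))


def estimate_haa(prob_by_stratum, pop_by_stratum):
    """Apply national stratum probabilities to one state's population.

    prob_by_stratum: stratum -> modeled probability of past-year
        heterosexual activity (hypothetical values here).
    pop_by_stratum: stratum -> state population count in that stratum.
    Returns (estimated number of HAAs, proportion of HAAs).
    """
    total_pop = sum(pop_by_stratum.values())
    haa = sum(prob_by_stratum[s] * n for s, n in pop_by_stratum.items())
    return haa, haa / total_pop


# Hypothetical inputs: the same probability and population in every stratum.
probs = {s: 0.8 for s in STRATA}
pops = {s: 1000 for s in STRATA}
haa_count, haa_proportion = estimate_haa(probs, pops)
print(len(STRATA), round(haa_count), round(haa_proportion, 3))
```

Dropping the age dimension from the product gives the 2 * 4 * 3 * 3 = 72 strata used for persons aged ≥51 years.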
In the third stage, our weighted averages across surveys used a random-effects model; weights were the inverse of within and between survey variances [21]. Special considerations for meta-analysis to synthesize survey-based estimates are different sampling frames, subpopulations, question wording, and data collection timeframes [22]. To address these issues, we used a consistent "past 12 months" definition of heterosexual activity and limited the meta-analysis to ages 18-50 years because that range was common across surveys. Regarding timeframes, no adjustments were made as it is unlikely that heterosexual activity varied at the state level during the data coverage of the surveys used to estimate past-year heterosexual activity (2013-2018).
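As a rough illustration of inverse-variance random-effects pooling, the sketch below uses the DerSimonian-Laird estimator of the between-survey variance. That choice is an assumption: the text does not state which estimator the SAS macro applies, nor whether pooling was done on the probability or the logit scale. The function name and inputs are hypothetical.

```python
import math


def random_effects_pool(estimates, variances):
    """Pool per-survey estimates with random-effects weights.

    estimates: point estimates from each survey for one stratum.
    variances: within-survey variances (squared standard errors).
    Returns (pooled estimate, pooled standard error, tau^2), where tau^2
    is the DerSimonian-Laird between-survey variance (an assumed choice).
    """
    k = len(estimates)
    w = [1.0 / v for v in variances]  # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights: inverse of within- plus between-survey variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical stratum estimates from three surveys.
pooled, se, tau2 = random_effects_pool([0.70, 0.80, 0.90], [0.0001] * 3)
print(round(pooled, 3), round(se, 4), round(tau2, 4))
```

When the surveys disagree by more than their sampling error would predict, tau^2 is positive and the pooled standard error widens accordingly.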
In the fourth stage, we tabulated numbers of HAAs separately for persons aged 18-50 years (meta-analysis results from NHANES, NSFG, and GSS) and persons aged ≥51 years (GSS only).
Standard errors of statewide probabilities of heterosexual activity are cumulative from the different methods in each stage. In the first stage, national estimates with standard errors were bootstrapped to provide standard errors for state estimates. In the third stage, random-effects models synthesized standard errors from survey-based estimates among respondents aged 18-50 years. The standard error for all adults was based on the sum of variances of synthesized results for adults aged 18-50 and ≥51 years, assuming the two estimates were independent.
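The final step of this variance calculation can be illustrated as follows. The sketch assumes the two age-group results are expressed as estimated counts, which is a simplification not confirmed by the text; the function name and numbers are hypothetical. Under independence, variances add, so the combined standard error is the square root of the sum of squared standard errors.

```python
import math


def combine_independent(count_a, se_a, count_b, se_b):
    """Sum two independent estimates and combine their standard errors.

    Assumes the inputs are estimated counts (e.g., HAAs aged 18-50 and
    aged 51+), so that Var(A + B) = Var(A) + Var(B).
    """
    total = count_a + count_b
    se_total = math.sqrt(se_a ** 2 + se_b ** 2)
    return total, se_total


# Hypothetical state-level estimates for the two age groups.
total, se_total = combine_independent(1000.0, 3.0, 500.0, 4.0)
print(total, se_total)
```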
All logistic regression analyses were done in SAS 9.4 (PROC SURVEYLOGISTIC) [23]. The marandom SAS macro was used for meta-analysis [24].

Data sources and methods for HIV diagnoses
HIV diagnoses in 2018 were obtained from the Centers for Disease Control and Prevention's AtlasPlus [25]. We included all persons aged ≥13 years reporting heterosexually acquired HIV. This risk reporting is consistent with our selection of the term HAA, rather than heterosexuals, because sexual orientation is fluid. The classification of HIV risk factor at the time of diagnosis is specific to the mode of transmission presumed to be related to an individual's HIV acquisition and not sexual orientation. We included persons in the "Black/African American," "Hispanic/Latino," and "White" race and ethnicity categories; we excluded "multiple races" because there was no corresponding denominator available to produce HAA estimates. Our focus on the three racial and ethnic groups resulted in <5% of diagnoses being removed from the analysis (2.2% of male and female heterosexually-acquired diagnoses were in the combined ...).
Notes: c) To estimate state-wide probabilities of heterosexual activity, we first calculated the national estimate of recent heterosexual activity for 216 strata among persons aged 18 to 50 (all combinations of sex, race and ethnicity, education, age, and marital status categories) and subsequently applied these estimates to statewide population compositions of these strata. We repeated that exercise for the 72 strata in the GSS-based model estimate for persons aged 51+. d) NSFG covers respondents up to 49 years; a few participants were 50 years of age.
We included all 50 states and the District of Columbia, although AtlasPlus did not contain HIV diagnoses for New Hampshire and diagnoses were not reported by transmission category and race and ethnicity for seven additional states (Delaware, Idaho, Massachusetts, Missouri, Montana, North Dakota, and Vermont).

Rates and disparity measures
We divided new diagnoses by estimated population size to estimate HIV rates per 100,000 HAAs nationally and by state for non-Hispanic White, non-Hispanic Black, and Hispanic adults (hereafter, White, Black, and Hispanic). As disparity measures, we calculated RDs and RRs comparing Black-to-White HAAs and Hispanic-to-White HAAs. All rates and disparity measures were estimated by sex. Hex maps of disparity measures were created using R version 3.6.2 [26]. We opted to use hex maps rather than a conventional map format because interpretations of maps can be biased if the size of the geographical boundary is not proportional to the size of the concept of interest (e.g., some states in the Mountain and West North Central divisions have large geographic areas but fewer diagnoses, whereas New England states and the District of Columbia are populous but have smaller land areas). The background map used publicly available hexagon boundaries for US states [27], and figures were produced using the ggplot2 package [28].
Although we present the standard errors for the estimated number of HAA men and women in each racial and ethnic group by state, we do not report a confidence interval for the diagnosis rates, RDs, and RRs because the standard errors for the denominators are small and the number of diagnoses reflects the full population and not a sample. The hex maps indicate a "not calculated" shading for the states where there were no diagnoses among White HAA men or women, because an RR cannot be estimated with a denominator of zero.
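The rate and disparity calculations described in this section amount to the following arithmetic, shown here as a minimal Python sketch. The function names and the diagnosis and population counts are hypothetical and do not come from the study data. The RR is returned as `None` when the reference (White) rate is zero, mirroring the "not calculated" category on the hex maps.

```python
def rate_per_100k(diagnoses, population):
    """HIV diagnosis rate per 100,000 estimated HAAs."""
    return diagnoses / population * 100_000


def disparity_measures(rate_group, rate_reference):
    """Return (rate difference, rate ratio) versus the reference group.

    The rate ratio is None when the reference rate is zero, because a
    ratio with a zero denominator is undefined.
    """
    rd = rate_group - rate_reference
    rr = rate_group / rate_reference if rate_reference > 0 else None
    return rd, rr


# Hypothetical counts for one state and sex.
black_rate = rate_per_100k(200, 1_000_000)
white_rate = rate_per_100k(50, 5_000_000)
rd, rr = disparity_measures(black_rate, white_rate)
print(round(black_rate, 2), round(white_rate, 2), round(rd, 2), round(rr, 2))
```

Because the RD depends on the absolute size of the rates, it is more sensitive than the RR to the choice of denominator (estimated HAAs versus total population).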

Human subjects review
We relied on data that were all publicly available for direct download online in aggregate and anonymized form without use restrictions. The four surveys used (ACS, GSS, NSFG, and NHANES) went through their own human subjects review for the primary data collection, with details available on their documentation websites. HIV diagnosis data from AtlasPlus are reported in aggregate counts with suppressed values for small counts to reduce reidentification risk in accordance with Centers for Disease Control and Prevention guidelines. Because all data were fully anonymized, available for free public use, and at a granularity that made it impossible to reidentify respondents, we did not need to seek review from our human subjects committee.

Results
Table 1 displays estimated HIV diagnoses per 100,000 HAA White, Black, and Hispanic men, and accompanying RDs and RRs. For states without publicly available HIV diagnoses by race and ethnicity, we provide estimated HAAs only ("population size"). Nationally, the HIV diagnosis rate among states reporting data and the District of Columbia was 3.09 per 100,000 HAA men (for all race and ethnicity categories including "other ...
The hex map visualizations (Figs 2 and 3) support key findings from Table 1: among HAA men, Black-to-White disparities are higher than Hispanic-to-White disparities, and while RDs and RRs vary across states, disparities are ubiquitous across states without a clear geographical pattern. Some states, including the District of Columbia, have reverse disparities (RDs <0.00 and RRs between 0.00 and 1.00); most of these states have small populations and their estimated rates are unstable.
... (Fig 3) illustrate these findings visually. The high Black-to-White and Hispanic-to-White rate ratios in the Northeast are likely driven by the Mid-Atlantic states, as three of the six New England states had insufficient data to calculate rate ratios.

Comparison of rates and disparity measures with different denominators
Qualitatively, conclusions were similar when comparing HIV diagnosis rates and disparity measures between estimates using the Census population (irrespective of reported past-year sexual activity) versus our adjusted estimates among HAAs. HIV diagnosis rates and RDs had higher magnitudes when using estimated HAA population size as denominators, as HAAs are a subset of the US population. However, states' relative rankings on HIV diagnosis rates, RDs, and RRs were consistent when using HAAs or total population as denominators.

Discussion
Our analysis yielded several key findings. Nationally, heterosexually active women have a twofold higher HIV diagnosis rate than heterosexually active men. Compared to White HAAs, HIV diagnosis rates were over 20 times higher among Black HAAs and over 4 times higher among Hispanic HAAs for both men and women. To help illustrate the magnitude of these disparities, for the Black-to-White RRs, 31 of 37 states including the District of Columbia had RR ≥10 among HAA men and 33 of 42 states including the District of Columbia had RR ≥10 among HAA women (omitting states with missing values due to data availability or insufficient data). These disparities were without notable regional patterns and suggest a system-wide failure, particularly with respect to preventing HIV among Black and Hispanic women.
There are several advantages to our approach to estimating numbers of HAAs for use as denominators for heterosexually-acquired HIV. First, it likely yields more appropriate rates compared to using the full Census population. Although RRs as a measure of disparity were similar if rates were calculated using estimated HAAs versus the Census population, differences in rates can influence the magnitude of RDs. Furthermore, using rates with adjusted denominators of estimated HAAs allows providers to convey risk more appropriately to patients and facilitates comparisons of HIV burden across populations. These enhancements are important for documenting progress of policy initiatives such as the federal "Ending the HIV Epidemic in the U.S." (EHE) strategy, which is in an early stage of implementation with jurisdictions recently receiving their second year of funding at the time of this analysis. Second, our HAA estimates can also be adapted to other measures related to HIV prevention and the HIV care continuum, such as use of pre-exposure prophylaxis.
Our finding that racial and ethnic disparities in HIV rates are consistent across states and the District of Columbia suggests that coordinated federal responses such as the EHE may be beneficial to ensure that the needs of all communities are met with a goal of reducing disparities. Although there was interstate variation in HIV diagnosis rates and disparity measures, in several instances this is driven by small numbers of diagnoses when stratifying by state and race and ethnicity. For example, there were reverse disparities (RD<0.0 and RR<1.0) in Alaska, Wyoming, and Maine and disproportionately large Black-to-White disparities in Iowa; each of these states has small counts of new diagnoses. While substantial research highlights the Southeast as the leading edge of the US epidemic [29,30], we find that disparities persist across and within all regions. The federal EHE plan has the potential to address inequities, while considering different risk profiles and needs at the community level, because it provides targeted funding to 57 priority EHE areas that collectively comprise almost two-thirds of new HIV diagnoses among Black and Hispanic persons and funding recipients are required to develop plans to reduce disparities and allocate funding for those purposes [31].
There are several primary explanations for why the HIV diagnosis rate is twice as high among HAA women compared to men. First, HAA women may be more likely to be screened for HIV than HAA men in the context of receiving reproductive healthcare services. Second, there are physiological differences by sex, and receptive penile-vaginal intercourse has twice the likelihood of transmitting an infection per contact compared to insertive penile-vaginal intercourse [32]. Third, many infections among HAA women may be associated with male sex partners who are connected to MSM transmission networks [33]. Other reasons include lack of awareness of their male sexual partners' HIV status and risk factors and higher engagement in risky behaviors among women who have been sexually abused [34], with women experiencing higher rates of intimate partner violence than men [35].
A robust literature describes how the legacy of historic racism and trauma to Black women from the era of slavery to modern times have contributed to worse sexual health outcomes among women; these include but are not limited to coerced medical experimentation, race-based events such as rape and lynching, inadequate healthcare, and social determinants of health [36][37][38]. This historical context highlights the importance of culturally responsive interventions to improve linkage to and retention in highly effective HIV prevention and high-quality HIV care for Black women. Our finding of higher HIV diagnosis rates among Black HAA women is consistent with other research findings on disparities in sexual health outcomes including Black women having higher rates of maternal morbidity and mortality [39], congenital syphilis among their newborns [40,41], higher rates of unplanned pregnancy among women with HIV [42], and lower rates of pre-exposure prophylaxis use than either men or White women [43]. More generally, Black populations including men experience disparities across health conditions including COVID-19 [44], diabetes [45], cancer [46], and others.
Our analysis has several limitations. Our state-level aggregation may mask local variation in HIV diagnosis rates that are attributable to higher likelihood of HIV acquisition in geographic areas with a higher background HIV prevalence. Due to data availability, our diagnoses include adolescents aged 13-17 years because the lowest age category in AtlasPlus is 13-24 years. However, adolescents comprise a small share of HIV infections nationally [6] and the likely impact on our estimated rates is minimal. State-level data by race and ethnicity were missing in AtlasPlus for eight states. Although we were able to calculate RDs for all states with available data, there were additional states where we were unable to calculate RRs because the denominator (diagnosis rates among White HAA males or females) was zero. In most instances except HAA men in DC, this phenomenon is likely because of the small number of diagnosed cases. This suggests that in general, disparity measures in states with small populations may be unstable and should be interpreted with caution. While outside the scope of our analysis, future work might explore methods to minimize the impact of small case counts when examining data with multiple stratifications (i.e., by state, sex, mode of transmission, race and ethnicity, and age group). Due to data availability and small numbers within certain strata, we were unable to examine other races and ethnicities including multiple races, which collectively comprised <5% of all diagnoses nationally. Each survey used to estimate the number of HAAs has common limitations such as non-response bias and potential reporting bias for self-reported sexual activity. Due to data availability, the estimated number of HAAs aged ≥51 years is only based on GSS. In standardizing based on national data, we assume that the probability of heterosexual activity within each demographic stratum is equivalent across states. While this assumption was inadequate for past work estimating denominators of MSM due to differences in urban versus rural areas [47], this is a lesser concern for HAAs, for whom stigma is lower. While we believe that our rates using estimated HAAs as denominators are likely more appropriate than rates using the total Census population, we did not do a formal validity analysis. There are known racial and ethnic disparities in receiving late HIV diagnoses [48][49][50], and our analysis of new diagnoses does not account for underdiagnosis among racial and ethnic minorities. Lastly, the higher observed rates among Black women may be partially a surveillance artifact. Black women who move to the US from countries with high HIV prevalence are more likely to be diagnosed late and enter care late [51,52]; these individuals are identified as new diagnoses in surveillance systems.
Achieving the end of HIV as an epidemic in the US will require focused efforts to lessen persistent racial and ethnic disparities in HIV prevention and treatment, including among HAAs. The persistent HIV-related disparities in some Black and Hispanic communities confirm the need for culturally responsive interventions that address the social and structural factors associated with the disparities. Updating measures of HIV diagnosis and disparities using estimated HAAs as denominators can enable providers to convey risk to patients, provide comparisons of HIV burden across populations, and improve monitoring of national and state policy initiatives to end the epidemic.
Supporting information S1