Developing a dynamic HIV transmission model for 6 U.S. cities: An evidence synthesis

Background Dynamic HIV transmission models can provide evidence-based guidance on optimal combination implementation strategies to treat and prevent HIV/AIDS. However, these models can be extremely data intensive, and the availability of good-quality data characterizing regional microepidemics varies substantially within and across countries. We aim to provide a comprehensive and transparent description of an evidence synthesis process and reporting framework employed to populate and calibrate a dynamic, compartmental HIV transmission model for six US cities.
Methods We executed a mixed-method evidence synthesis strategy to populate model parameters in six categories: (i) initial HIV-negative and HIV-infected populations; (ii) parameters used to calculate the probability of HIV transmission; (iii) screening, diagnosis, treatment and HIV disease progression; (iv) HIV prevention programs; (v) the costs of medical care; and (vi) health utility weights for each stage of HIV disease progression. We identified parameters that required city-specific data and stratification by gender, risk group and race/ethnicity a priori and sought out databases for primary analysis to augment our evidence synthesis. We ranked the quality of each parameter using context- and domain-specific criteria and verified sources and assumptions with our scientific advisory committee.
Findings To inform the 1,667 parameters needed to populate our model, we synthesized evidence from 59 peer-reviewed publications and 24 public health and surveillance reports and executed primary analyses using 11 data sets. Of these 1,667 parameters, 1,517 (91%) were city-specific and 150 (9%) were common for all cities. Notably, 1,074 (64%), 201 (12%) and 312 (19%) parameters corresponded to categories (i), (ii) and (iii), respectively. Parameters ranked as best- to moderate-quality evidence comprised 39% of the common parameters and ranged from 56%-60% across cities for the city-specific parameters. We identified variation in parameter values across cities as well as within cities across risk and race/ethnic groups.
Conclusions Better integration of modelling in decision making can be achieved by systematically reporting on the evidence synthesis process that is used to populate models, and by explicitly assessing the quality of data entered into the model. The effective communication of this process can help prioritize data collection of the most informative components of local HIV prevention and care services in order to reduce decision uncertainty and strengthen model conclusions.


S1 Supplement B: Search Strategy for Model Inputs
We divided our search strategy into two parts: (1) identifying a rank order of potential data sources for each domain, and (2) selection of the best data to use, given additional factors and constraints. We identified the best possible data sources for each domain, ranked by suitability depending on factors unique to each model parameter category [1,2,3]. For example, the most accurate and reliable source for initial total population numbers was city-level census data, while the best source for ART effectiveness estimates came from randomized controlled trials (RCT). For city-specific parameters, we selected source data based on geographic representativeness, stratification level relative to our model requirements, and time-period. For non-city-specific (common) parameters, we selected source data based on study quality, how well the evidence matched the ideal evidence for a given model parameter (Manuscript Table 4), and whether or not our required model parameters could be directly estimated from evidence sources.
For model parameters relating to the probability of HIV transmission (S1 Supplement C, Section 2), condom effectiveness, ART effectiveness, probability of transmission, and disease progression for PLHIV off-ART, we documented our literature search methods, results, and choice of evidence to inform model parameters (S1 Table B2). Specifically, the starting set for our snowball literature search was derived from the references for target parameters previously used in modeling studies [4,5,6]. Literature cited in any of the start-set papers was searched using Google Scholar. From this initial search, we defined key words (first restricting to words in titles) and searched for "review" type publications. The search strategies were broadened when the limited search resulted in no relevant publications. As per our a priori selection criteria (S1 Table B1), we prioritized data published from randomized controlled trials (RCTs) and, where these were not available, relied on peer-reviewed observational studies with or without meta-analysis. Where meta-analysis was not performed for the parameter of interest, or when the evidence included in the meta-analysis was heterogeneous across settings, we reviewed the individual studies included and, if necessary, calculated parameter estimates (not otherwise reported) using the data reported in the original paper.
h Triangulated CD4-specific estimates based on the evidence reviewed and assumptions. Systematic review of observational studies

Reduced probability of unprotected sexual contacts due to HIV diagnosis/awareness
Overall [54], [55], [56], [57]  "allintitle: 'metaanalysis'".
a Snowball search was conducted using Google Scholar.
b Articles were considered relevant if they directly reported estimates of the parameters of interest, or if the parameters of interest could be derived from the contents reported in the papers; seed papers were included if they were relevant.
c Previous studies did not distinguish condom effectiveness for heterosexual and homosexual sex; as there was no seed paper, we used a PubMed MeSH term search strategy.
d Evidence on transmission probability by CD4 strata was limited, so we first obtained the evidence on transmission probability overall and then triangulated the CD4 strata-specific estimates from different evidence sources: a review and meta-analysis of needle-stick studies found that the transmission probability through needle stick is 0.0041 (95% CI: 0.0017-0.0095) among AIDS patients and 0.0024 (0.0014-0.004) among all patients; another study, in which the majority of the sample had AIDS, reported 0.003-0.004; an observational study on the probability of HIV transmission due to deep injuries estimated a transmission probability of 0.023 (0.02-0.07).
e Among all the seed papers and citing papers, estimates were extracted from a total of 8 populations/study samples in North American and European settings, where the primary subtype is HIV-1 B.
f Estimates were directly extracted from the selected papers unless noted. Generally, the per-act transmission probability was reported; we calculated the rates of transmission per 100 person-years based on data reported in the papers.
g We did not use the snowball search strategy for these parameters. We found a meta-analysis paper and, based on its content, were aware that two study populations (the Rakai study and the European study) were commonly used for estimating the duration and infectivity of acute infection (both are retrospective studies); in addition, we identified another two prospective studies based on expert recommendations of the literature.
h Transmission probability through needle sharing is larger than transmission through needle stick and smaller than transmission through deep injuries; and transmission probability increases with lower CD4 counts.
i Rates for male-to-female transmission were transformed to probabilities using p = 1 - exp(-rt), resulting in transmission probabilities ranging from 0.043 to 0.070 (those studies did not distinguish HIV stages or CD4). Assuming the relative risk of AIDS vs. non-AIDS to be 2, and assuming two extreme scenarios in which the studies estimating male-to-female transmission included only AIDS patients and only non-AIDS patients, respectively, we arrive at estimate ranges for male-to-female transmission of >0.043 to <0.140 for CD4 < 200 and >0.0215 to <0.070 for CD4 >= 200; we determined the point estimates as the mid-points of these ranges. Female-to-male transmission probabilities were estimated from the male-to-female estimates and the relative risk estimates.
j One US prospective study observed a 1.3-fold increase of log10 viral load during the acute stage; literature review [26,47]

Each subsection describes the dimension of model input parameters, including the level of stratification used in the model (i.e. gender, race/ethnicity, etc.) and whether a parameter was city-specific or common across cities (total number of parameters for each city).
We also discuss the identification and selection process for evidence sources (outlined in S1 Supplement Table B1), as well as the equations used to derive model input data, when these values were not available directly from reports, literature sources, or estimated directly from primary data analysis. For transparency in our derivation and population of model parameter inputs, plain-language equations in each subsection describe how we derived each model parameter, with subscripts indicating on which subgroups a particular piece of evidence/parameter value is stratified (gender, race/ethnicity, risk group, CD4 cell count among PLHIV).
We also discuss any assumptions and other information required to generate model parameter values, as well as how we incorporated parameter and evidence uncertainty into our model via ranges. We discuss distributional choices and assumptions for probabilistic sensitivity analysis (PSA), based on parameter data types, in S1 Supplement E.

Risk-stratified initial population estimates
Initial population estimates capture city-level population numbers for individuals aged 15-64, stratified on the basis of gender (male or female), race/ethnicity (black/African American, Hispanic/Latino, and non-Hispanic white/others), and HIV risk behavior type (men who have sex with men (MSM), people who inject drugs (PWID), MSM who inject drugs (MWID), and heterosexual (HET)). MSM, MWID, and HET were further stratified into subgroups based on HIV sexual risk behavior intensity (high vs. low) (discussed in section 2.1.1), and PWID and MWID were categorized based on whether they were receiving opioid agonist treatment (OAT) (discussed in section 4.2.1), resulting in 42 population subgroups. We distributed these 42 subgroups among 19 health states, including HIV-negative, PLHIV who are unaware (3 CD4 cell count strata and acute HIV), HIV-diagnosed (3 CD4 strata and acute HIV), PLHIV on-ART and off-ART (3 CD4 strata each) (PLHIV population discussed in section 1.2). We also included PrEP states for HIV-negative, acute and chronic HIV among infected/unaware PLHIV (3 PrEP strata) (discussed in section 4.3.1). This resulted in 798 initial population values (42 subgroups x 19 health states).
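The subgroup count can be checked with a short enumeration. The following Python sketch is illustrative only: it assumes that MSM and MWID are male-only groups, that HET and MSM are split by sexual risk intensity, that PWID are split by OAT status, and that MWID are split by both. The label names are hypothetical and are not taken from the model code.

```python
from itertools import product

GENDERS = ["male", "female"]
RACES = ["black", "hispanic", "white_other"]
RISK_LEVELS = ["high", "low"]
OAT_STATUS = ["on_oat", "off_oat"]
N_HEALTH_STATES = 19  # number of health states stated in the text


def build_subgroups():
    """Return (gender, race, risk_group, risk_level, oat) tuples; labels are hypothetical."""
    groups = []
    # MSM: assumed male only, split by sexual risk intensity.
    groups += [("male", r, "MSM", lvl, None)
               for r, lvl in product(RACES, RISK_LEVELS)]
    # HET: both genders, split by sexual risk intensity.
    groups += [(g, r, "HET", lvl, None)
               for g, r, lvl in product(GENDERS, RACES, RISK_LEVELS)]
    # PWID: both genders, split by OAT status.
    groups += [(g, r, "PWID", None, oat)
               for g, r, oat in product(GENDERS, RACES, OAT_STATUS)]
    # MWID: assumed male only, split by risk intensity and OAT status.
    groups += [("male", r, "MWID", lvl, oat)
               for r, lvl, oat in product(RACES, RISK_LEVELS, OAT_STATUS)]
    return groups


subgroups = build_subgroups()
n_initial_values = len(subgroups) * N_HEALTH_STATES
print(len(subgroups), n_initial_values)
```

Under these assumptions the enumeration gives 6 MSM, 12 HET, 12 PWID and 12 MWID subgroups, which reproduces the 42 subgroups and 798 initial population values reported above.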

Population aged 15-64
Model input parameters: To derive initial populations for all risk groups in our model, we required total population numbers by city, stratified by race/ethnicity (black/African American, Hispanic/Latino, white/other) and gender (male/female) (6 parameter values).
Identification and selection of evidence: We identified central statistics databases for initial population values, prioritizing population estimates based on geographic representativeness, time-period, and sampling method (S1 Supplement Table B1). For total population numbers, we selected city-level census data from the United States Census Bureau, stratified by ethnicity and gender into 6 subgroups and available for each city in 2011 (S2 Supplement Table 1.1.1) [62].

Derivation of model parameters
Estimates for population numbers in each gender/race/ethnicity subgroup were directly available from cross-tabulated census data and did not require any additional assumptions.

PWID population
Model input parameters: We required initial population numbers of PWID by city, stratified by race/ethnicity and gender (6 parameter values).  [63].

Derivation of model parameters
We derived estimates for the PWID population by multiplying race/ethnicity-stratified total population numbers by gender-weighted, race/ethnicity-specific prevalence estimates for each city. We used prevalence estimates from the most recent available year and assumed that prevalence remained constant to 2011. To derive gender- and race/ethnicity-stratified PWID prevalence estimates, we assumed that gender proportions of PWID were equivalent within race/ethnicity strata.
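A minimal sketch of this multiplication is shown below. The function name, the single `male_share_of_pwid` argument (one way of expressing the assumption that the gender split is the same in every race/ethnicity stratum) and all numbers are hypothetical illustrations, not values or code from the study.

```python
def pwid_population(total_pop_by_race, pwid_prev_by_race, male_share_of_pwid):
    """Split PWID counts by race/ethnicity and gender.

    Assumes the male share of PWID is the same in every race/ethnicity stratum.
    """
    estimates = {}
    for race, total_pop in total_pop_by_race.items():
        n_pwid = total_pop * pwid_prev_by_race[race]
        estimates[("male", race)] = n_pwid * male_share_of_pwid
        estimates[("female", race)] = n_pwid * (1.0 - male_share_of_pwid)
    return estimates


# Illustrative inputs only; these are not values from the study.
total_pop = {"black": 200_000, "hispanic": 150_000, "white_other": 400_000}
prevalence = {"black": 0.020, "hispanic": 0.015, "white_other": 0.010}
pwid = pwid_population(total_pop, prevalence, male_share_of_pwid=0.7)
```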

Derivation of model parameters
We derived initial MSM population estimates by multiplying total city-level population estimates for males from census data by county- or CBSA-specific MSM proportions. Given large differences in MSM populations between the boroughs of New York City and the availability of borough-level data, we used population-weighted, borough-level MSM prevalence estimates to derive total MSM in New York City [64]. Furthermore, we assumed that the MSM proportion for Staten Island was equivalent to that for Brooklyn.
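The population weighting for New York City can be sketched as follows. The function name and all input numbers are hypothetical (only three boroughs are shown for brevity); the only element taken from the text is that Staten Island is assigned Brooklyn's proportion.

```python
def weighted_msm_proportion(male_pop_by_area, msm_prop_by_area):
    """Population-weighted average MSM proportion across areas."""
    total_males = sum(male_pop_by_area.values())
    msm_count = sum(male_pop_by_area[a] * msm_prop_by_area[a]
                    for a in male_pop_by_area)
    return msm_count / total_males


# Illustrative inputs only; not values from the study.
male_pop = {"Manhattan": 500_000, "Brooklyn": 800_000, "Staten Island": 150_000}
msm_prop = {"Manhattan": 0.10, "Brooklyn": 0.04}
# Assumption stated in the text: Staten Island takes Brooklyn's proportion.
msm_prop["Staten Island"] = msm_prop["Brooklyn"]

city_prop = weighted_msm_proportion(male_pop, msm_prop)
n_msm = city_prop * sum(male_pop.values())
```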

Derivation of model parameters
We derived MWID population estimates by triangulating from two different derivation methods using NHBS survey data (MSM among PWID population, and PWID among MSM population). Our point estimate averaged these two methods of derivation, and we used each method individually as upper and lower range values.
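The triangulation described above can be sketched as follows. This is a sketch under stated assumptions: the function and argument names are hypothetical, and whether the base populations passed in already include MWID is not specified in the text.

```python
def triangulate_mwid(n_male_pwid, p_msm_among_pwid, n_msm, p_pwid_among_msm):
    """Combine two MWID estimates into a point estimate and a range.

    Method A: MSM share among male PWID applied to the male PWID population.
    Method B: PWID share among MSM applied to the MSM population.
    """
    est_a = n_male_pwid * p_msm_among_pwid
    est_b = n_msm * p_pwid_among_msm
    point = (est_a + est_b) / 2.0
    return point, min(est_a, est_b), max(est_a, est_b)


# Illustrative inputs only; not values from the study.
point, lower, upper = triangulate_mwid(10_000, 0.10, 50_000, 0.03)
```

With these illustrative inputs the two methods give 1,000 and 1,500, so the point estimate is their average and the two individual estimates form the range.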

Heterosexual population
Model input parameters: We required initial population numbers of heterosexuals by city, stratified by race/ethnicity and gender, which we derived within the model.
Identification and selection of evidence: For total population numbers, we selected city-level census data from the United States Census Bureau for each city in 2011. Refer to section 1.1.1 for derivation of overall population estimates by city.

Derivation of model parameters
HIV risk groups in the model were mutually exclusive and collectively exhaustive of the entire population, so we derived the initial heterosexual population as the remaining individuals not identified as PWID, MSM or MWID.
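Because the risk groups partition the population, the heterosexual count is a simple remainder within each gender and race/ethnicity stratum. A minimal sketch, with a hypothetical function name and illustrative numbers:

```python
def heterosexual_population(total, pwid, msm=0.0, mwid=0.0):
    """Remainder of a gender-by-race/ethnicity stratum after removing other risk groups."""
    het = total - (pwid + msm + mwid)
    if het < 0:
        raise ValueError("risk-group estimates exceed the total population")
    return het


# Illustrative inputs only; not values from the study.
het_males = heterosexual_population(total=100_000, pwid=1_500, msm=6_000, mwid=500)
# For female strata, MSM and MWID are zero in this sketch.
het_females = heterosexual_population(total=105_000, pwid=900)
```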

Number of PLHIV
The initial number of PLHIV was a subset of the total population numbers derived in Section 1.1. To derive initial population numbers for PLHIV in our model, we used a combination of city-level surveillance data for diagnosed PLHIV, PLHIV on treatment, as well as estimates of the proportion of PLHIV who are aware of their status. We stratified our estimates according to observed CD4 strata proportions, as well as the proportion of individuals with acute HIV.

HIV infected individuals
Model input parameters: We required initial population values for the number of PLHIV who were diagnosed and infected/unaware by city. For each group of diagnosed and infected/unaware, this included 42 stratified parameter values by gender, race/ethnicity and risk group, as well as high/low sexual risk among HET, MSM, MWID, and OAT status among PWID/MWID x 3 CD4 cell count categories (≥ 500 cells/µL, 200-499 cells/µL, and < 200 cells/µL) (252 parameter values). To capture the relative percentage of PLHIV who were infected but unaware at any given time in the model, we also required the proportion of PLHIV who were aware of their status, stratified by gender, race/ethnicity and risk group (18 parameter values).

Identification and selection of evidence:
We identified city-level HIV surveillance data to derive the total numbers of diagnosed PLHIV, stratified by risk group, ethnicity and gender. We identified national HIV cohort data for CD4 stratification among diagnosed, and literature estimates for CD4 stratification among infected but unaware PLHIV. We identified surveillance estimates for the proportion of PLHIV who are aware of their status among those infected. We selected city-specific surveillance data for initial population numbers of diagnosed PLHIV, stratified individually by race/ethnicity, gender, risk group and CD4 cell count for 2011 (Described in S1 Supplement E) [67,68,69,70,71,72]. We selected data for the proportion of PLHIV who are aware of their status based on gender and risk group, weighted by state-level estimates from CDC reports of the proportion of PLHIV aware of their status relative to the national average (S2 Supplement

Derivation of model parameters
We derived the total number of diagnosed PLHIV directly from surveillance data. We derived the total number of infected PLHIV indirectly, using total diagnosed PLHIV divided by the proportion of PLHIV who are aware of their status. We triangulated the proportion of PLHIV who were aware of their status, for each city, using a weighting for the state-level proportion of PLHIV who are aware of their status relative to the national average and applied this weighting to proportions for each gender and risk group. We assumed that proportions were equivalent across race/ethnic groups. We stratified diagnosed PLHIV by CD4 count according to observed proportions in HIVRN data. We derived estimates for CD4 cell count proportions among MSM, stratified by race/ethnicity, and HET, stratified by race/ethnicity and gender. Due to small sample sizes, we combined estimates for PWID and MWID for each HIVRN region under the assumption that CD4 stratification among PWID and MWID was equivalent. Finally, we distributed PLHIV from unknown risk groups equally among all gender, race/ethnicity and CD4 strata for each city [77].
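The awareness weighting and the back-calculation of total infected PLHIV can be sketched as below. Function names and numbers are hypothetical, and capping the weighted proportion at 1.0 is a safeguard added for this sketch, not something stated in the text.

```python
def city_awareness(national_by_group, state_overall, national_overall):
    """Scale national group-specific awareness by the state-to-national ratio."""
    weight = state_overall / national_overall
    # Capping at 1.0 is an added safeguard, not part of the described method.
    return {group: min(p * weight, 1.0) for group, p in national_by_group.items()}


def infected_from_diagnosed(n_diagnosed, prop_aware):
    """Return (total infected, infected but unaware) from diagnosed counts."""
    if not 0.0 < prop_aware <= 1.0:
        raise ValueError("prop_aware must be in (0, 1]")
    n_infected = n_diagnosed / prop_aware
    return n_infected, n_infected - n_diagnosed


# Illustrative inputs only; not values from the study.
national = {("male", "MSM"): 0.80, ("female", "HET"): 0.88}
aware = city_awareness(national, state_overall=0.90, national_overall=0.86)
n_infected, n_unaware = infected_from_diagnosed(800, 0.80)
```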

ART status
Model input parameters: For PLHIV on-ART and off-ART post-initiation, we required 42 stratified parameter values by gender, race/ethnicity and risk group, as well as high/low sexual risk among HET, MSM, MWID, and OAT status among PWID/MWID, all stratified by 3 CD4 cell count categories (≥ 500 cells/µL, 200-499 cells/µL, and < 200 cells/µL) and by city (252 total parameter values). We also required the proportion of PLHIV ever on ART, and currently on ART, by city, stratified by gender, race/ethnicity and risk group (36 total parameter values).

Identification and selection of evidence:
We identified city-specific surveillance data as the best available evidence. Given limitations in surveillance data for the proportion of individuals who had ever initiated ART, we used the proportion of individuals linked to care as a proxy for those who had ever initiated ART (S2 Supplement  [76].

Derivation of model parameters
Given the lack of available evidence, as well as IAS guidelines recommending immediate ART initiation for all infected PLHIV [78], we assumed that the proportion of diagnosed PLHIV ever initiating ART was approximated by the proportion of diagnosed PLHIV who were linked to care. For initial CD4 distributions of diagnosed individuals on and off ART, we derived cross-tabulated proportions directly from HIVRN data using combined estimates from 2011-2015, stratified by race/ethnicity for all risk groups, in addition to gender among HET. We assumed CD4 distribution to be equivalent for male and female PWID.

PLHIV ever on ART
Initial PLHIV off-ART post-initiation

Model input parameters:
We required estimates of PLHIV in acute stage HIV, stratified by gender, race/ethnicity, risk group, high/low sexual risk among HET, MSM, MWID and OAT status among PWID/MWID, for infected/unaware and diagnosed/ART naïve (84 parameter values).
Identification and selection of evidence: We identified literature sources as the best available evidence to derive the proportion of PLHIV in acute stage HIV among infected/unaware and diagnosed/ART naïve PLHIV (S2 Supplement Table C1.2.7) [75].

Derivation of model parameters
Given limited evidence, we assumed that the proportion of individuals in acute stage HIV was the same for all infected PLHIV, both infected/unaware and diagnosed/ART naïve. Furthermore, we assumed that proportions of individuals with acute HIV were drawn equally from all strata (race/ethnicity, gender, risk group, CD4 cell count).

Population dynamics
In order to capture population dynamics unique to each city, we calculated entry, maturation and mortality rates for each city. For population dynamics among HIV-negative populations, we calculated net population changes stratified by gender and race/ethnicity based on in- and out-maturation (i.e. individuals who are 14 turning 15, and 64 turning 65) combined with long-term projected growth. We calculated out-maturation rates among PLHIV based on diagnosed PLHIV who are 64 turning 65, using surveillance data for each city. For mortality rates, we used a combination of published mortality estimates and primary data analysis for PLHIV, as well as population life tables for HIV-negative individuals.

Population entry, growth and maturation rates
Model parameter inputs: We required population growth parameters for HIV-negative individuals (18 parameter values), as well as out-maturation rates for PLHIV (18 parameter values).
Identification and selection of evidence: We identified census data from the United States Census Bureau as the best available evidence, stratified by race/ethnicity and gender [62], as well as population growth projections for each city.
We derived in-maturation rates based on the population proportion aged 15-19, relative to the total population, and out-maturation rates based on the population proportion aged 60-64, relative to the total population. To ensure that our model matched long-term population growth projections and changing demographics in each city, we used external reports and/or data to adjust total population growth parameters. Projections stratified by age, gender and race/ethnicity, were available for Atlanta [79], Baltimore [80], Los Angeles [81], and Miami [82]. For New York City, race/ethnicity stratified growth projections were not available at the city, county or state level, so we triangulated projections using age-gender stratified estimates combined with national trends for changing race/ethnic population compositions to 2040 [83,84]. For Seattle, stratified growth projections were not available at the city level, so we used population projections for Washington State [85]. We incorporated these projections to ensure that population growth parameters produced projections that matched the overall growth rates from long-term projections accounting for external factors and trends affecting city growth rates. We assumed that population entry and maturation rates were the same across risk groups within race/ethnicity and gender strata (S2 Supplement Tables C1.3 . PLHIV maturation rates were derived from HIV surveillance data, using the same methods as general population maturation. We distinguished PLHIV maturation rates from those of the general population to reflect the different age structure of PLHIV cohorts (S2 Supplement Table C1.3.2).

Mortality rates
Model parameter inputs: We required monthly mortality rates, stratified by race/ethnicity, gender, risk group, high/low risk among MSM, MWID and HET, as well as OAT status among PWID and MWID (42 subgroups [93]. For mortality rates among HIV-negative PWID we used peer-reviewed literature estimates [94], as well as for PWID/MWID who were on OAT (S2 Supplement Table C1.3.5) [95].

Derivation of model parameters
We assumed equivalent mortality rates for infected/unaware, ART-naïve, and off-ART PLHIV within each gender, race/ethnicity, risk group and CD4 stratum. We assumed that mortality rates among high-risk HET, MSM and MWID were equivalent to those of low-risk HET, MSM, and MWID. We assumed that mortality rates for HET and MSM were equivalent between HIV-negative, infected acute HIV and PLHIV with CD4 count ≥ 500 cells/µL, and were based on population life tables. We used standardized mortality ratio multipliers from literature estimates to derive mortality rates for HIV-negative PWID/MWID relative to non-PWID/MWID, as well as all PWID/MWID on OAT relative to off OAT. For PLHIV on-ART, we estimated mortality rates directly from HIVRN data for each stratified subpopulation.
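The use of mortality ratio multipliers can be illustrated with a short sketch. Several elements here are assumptions made for illustration: the function names, the linear conversion from an annual to a monthly rate, the reading that the PWID multiplier yields the off-OAT rate, and all numeric values.

```python
def annual_to_monthly_rate(annual_rate):
    """Convert an annual rate to a monthly rate, assuming a constant rate over the year."""
    return annual_rate / 12.0


def pwid_mortality(base_monthly_rate, smr_pwid, ratio_on_oat):
    """Apply multipliers to a life-table rate for one gender-by-race/ethnicity stratum.

    smr_pwid: mortality ratio for PWID/MWID relative to non-PWID/MWID.
    ratio_on_oat: mortality ratio for those on OAT relative to off OAT.
    """
    off_oat = base_monthly_rate * smr_pwid
    on_oat = off_oat * ratio_on_oat
    return {"off_oat": off_oat, "on_oat": on_oat}


# Illustrative inputs only; not values from the study.
base = annual_to_monthly_rate(0.006)
rates = pwid_mortality(base, smr_pwid=4.0, ratio_on_oat=0.5)
```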

HIV-negative population
Given that the health states in our model were mutually exclusive and collectively exhaustive, we derived the HIV-negative population from population-level census data as the remaining individuals in each population subgroup after all PLHIV health states had been populated.
Model input parameters: We required initial population numbers for HIV-negative individuals, stratified by risk group, race/ethnicity and gender, high/low sexual risk among HET, MSM, MWID, and OAT status among PWID/MWID. We derived these numbers within our model, based on the numbers in all other health states.

Total HIV-negative population
Identification and selection of evidence: For total HIV-negative population numbers, we selected city-level census data from the United States Census Bureau for each city in 2011 (S2 Supplement Table C1.1.1) [62]. Please refer to section 1.1.1 for derivation of overall population estimates by city.

Derivation of model parameters
We derived initial numbers of HIV-negative individuals in the model as the difference between total population numbers and all infected PLHIV (including both infected/unaware and diagnosed).

Screened among HIV-negative
Model input parameters: We required the proportion of HIV-negative individuals who had been HIV-screened at baseline, stratified by risk group, race/ethnicity and gender, for each city (42 parameter values). We also required the average duration of time that HIV-negative individuals remained identified after HIV screening.
Identification and selection of evidence: We identified NHBS, Behavioral Risk Factor Surveillance System (BRFSS) and the New York City Community Health Survey (NYC-CHS) data as the best sources of evidence to derive the initial proportion of HIV-negative individuals screened for HIV (S2 Supplement Table C1.

Derivation of model parameters
We derived testing rates based on the percentage of individuals receiving an HIV test in the past 12 months. We stratified by high/low sexual risk behavior among HET and MSM and we assumed that testing rates for low-risk HET, MSM and MWID were equivalent. We used NHBS data to derive testing rates for high-risk individuals and BRFSS data for low-risk individuals, with the exception of the use of the NYC-CHS for testing rates in New York City [98,99]. To derive ranges for NHBS testing rates, we used stratified BRFSS standard error estimates [94,95]. We assumed that HIV-negative individuals remained identified for 12 months after screening.

Sexual risk behaviours
To model sexual risk behaviour, we included parameters capturing the distinction between individuals with high and low sexual risk behaviours by stratifying initial population estimates based on the number of sexual partners. We also included parameters for the reduction in sexual partners due to HIV diagnosis and the probability of condom use during same- and opposite-sex sexual encounters.

Stratification by high/low sexual risk behavior
Model input parameters: We required proportions of individuals with high and low risk sexual behavior among infected and HIV-negative MSM/MWID and HET, stratified by gender and race/ethnicity for each city (36 total parameter values).

Identification and selection of evidence:
We identified National Survey of Family Growth (NSFG) data as the best source of evidence to derive proportions of high/low risk among HET [100], and NHBS data to derive proportions of high/low risk among MSM/MWID (S2 Supplement Tables C2.1 . We supplemented NSFG and NHBS data with primary analysis of data from the AIDS Linked to IntraVenous Experience (ALIVE) PWID cohort study [101], and from Project AWARE to inform ranges used in sensitivity analyses and/or calibration [102].

Derivation of model parameters
We derived the proportion of high risk among HIV-negative HET using the proportion of individuals who had 5 or more sexual partners in the past 12 months. We derived the proportion of high risk among HIV-negative MSM/MWID by assumption, based on the percentage of MSM reporting condomless sex in their most recent encounter with a casual partner [103]. We took proportions of high risk among MSM/MWID PLHIV [104] and HET PLHIV [105] from literature sources, using the proportion of PLHIV with STDs as a proxy for high-risk behavior. We derived proportions of high risk among infected individuals by assuming a constant multiplier on low-risk HET and MSM/MWID.

Proportion of high risk among HIV-negative
We used 95% confidence intervals derived from NSFG estimates as upper and lower range values.

Number of sexual partners
Model input parameters: We required monthly numbers of same sex partners, stratified by race/ethnicity and risk group for MSM and MWID (18 parameter values). We required monthly numbers of opposite sex partners, stratified by gender, race/ethnicity and risk group (42 parameter values).

Identification and selection of evidence:
We identified NSFG data as the best source of evidence for opposite sex partners among HET/PWID, and same- and opposite-sex sexual partners for low-risk MSM [100]. We selected NHBS data to derive numbers of same- and opposite-sex partners among high-risk MSM and MWID (S2 Supplement  Tables C2.1

Derivation of model parameters
We assumed that monthly numbers of same and opposite sex sexual partners were equivalent between MSM and MWID, as well as opposite sex partners between PWID and HET.
We used NSFG and NHBS means as point estimates and 95% confidence intervals as ranges around point estimates. As we defined high-risk individuals by higher numbers of sexual partners than low-risk individuals, we constrained the lower bounds for high-risk individuals to be greater than or equal to the upper bounds for low-risk individuals in the same race/ethnicity, gender and risk group stratum. This ensured that the number of sexual partners for high-risk individuals in calibration and PSA was never lower than the number for low-risk individuals.
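The range constraint can be expressed as a small helper. This is a sketch with a hypothetical function name and illustrative numbers; it simply raises the lower bound of the high-risk range to the upper bound of the low-risk range when the two intervals overlap.

```python
def constrain_high_risk_range(low_risk_range, high_risk_range):
    """Ensure the high-risk lower bound is not below the low-risk upper bound."""
    _, low_upper = low_risk_range
    high_lower, high_upper = high_risk_range
    new_lower = max(high_lower, low_upper)
    if new_lower > high_upper:
        raise ValueError("high-risk range lies entirely below the low-risk upper bound")
    return (new_lower, high_upper)


# Illustrative monthly partner numbers only; not values from the study.
low_range = (0.10, 0.40)
high_range = (0.30, 1.20)
adjusted_high = constrain_high_risk_range(low_range, high_range)
```

In this illustration the overlapping lower bound of 0.30 is raised to 0.40, so any value drawn from the adjusted high-risk range is at least as large as any value drawn from the low-risk range.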

Decrease in number of sexual partners due to diagnosis
Model parameter inputs: We required an estimate for the decrease in number of sexual partners due to HIV diagnosis, common across cities (1 parameter value).
Identification and selection of evidence: We identified systematic review/meta-analysis literature sources as the best available evidence for the decrease in sexual partners due to diagnosis (S2 Supplement

Derivation of model parameters
We derived the decrease in sexual partners due to diagnosis from literature estimates (S1 Supplement Table B2). We assumed that the percentage reduction in unprotected sex for PLHIV who were aware vs. unaware estimated in the literature was a suitable approximation of sexual risk behavior for the percentage reduction in sexual partners due to diagnosis. We further assumed that the proportional reduction in the number of sexual partners due to diagnosis was constant across gender, race/ethnicity and risk groups.

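Decrease in number of sexual partners due to diagnosis = Percentage reduction in unprotected sex among PLHIV aware vs. unaware of their status.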
We derived range estimates from 95% confidence intervals in literature sources.

Probability of condom use
Model input parameters: We required estimates of condom use probability for heterosexual sex and homosexual sex, stratified by gender and race/ethnicity, as well as by all risk groups for heterosexual sex and MSM/MWID for homosexual sex (60 parameter values).
Identification and selection of evidence: We identified NSFG data as the best available evidence source for condom use probabilities for heterosexual sex among HET and PWID [100], and NHBS data for condom use probabilities for heterosexual and homosexual sex among MSM and MWID (S2 Supplement Tables C2.1

Derivation of model parameters
We assumed that condom use probabilities were equivalent among low-risk HET, low-risk MSM/MWID and PWID, by race/ethnicity and gender. For opposite- and same-sex encounters among high-risk MSM/MWID, we assumed that probabilities were equivalent between MSM and MWID. We used regional data from NSFG to derive estimates for each city, based on census regions.

Condom use probability for heterosexual sex
We used 95% confidence intervals derived from NSFG and NHBS estimates as upper and lower range values.

Injection risk behaviours
To model injection risk behaviour, we included parameters capturing the monthly number of injections for PWID/MWID, proportion of shared injections, as well as the effect of HIV diagnosis on injection sharing.

Number of injections
Model parameter inputs: We required an estimate for the monthly number of injections for PWID/MWID, for each city (1 parameter value).

Identification and selection of evidence:
We identified peer-reviewed modeling studies as the best available evidence for monthly injections (S2 Supplement Table C2.2.1) [94].

Derivation of model parameters
We assumed that number of injections was equivalent across gender, race/ethnicity and between PWID and MWID.

Proportion of shared injections
Model parameter inputs: We required an estimate for the proportion of shared injections for MWID, stratified by race/ethnicity, and PWID, stratified by gender and race/ethnicity for each city (9 parameter values).
Identification and selection of evidence: We identified NHBS data as the best available evidence to derive estimates for the proportion of injections that are shared in Atlanta, Los Angeles, Miami, New York City, and Seattle [106], and we used New York City estimates by assumption for Baltimore, as city-specific NHBS data were unavailable (S2 Supplement Table C2.2.2).

Derivation of model parameters
We derived the proportion of shared injections based on the proportion of PWID and MWID who reported sharing injections in the past 12 months. We assumed that the proportion of shared injections was equivalent across gender and between PWID and MWID.
We used 95% confidence intervals derived from NHBS proportions as upper and lower range estimates.

Decrease in number of shared injections due to diagnosis
Model parameter inputs: We required an estimate for the decrease in shared injections for PWID/MWID due to diagnosis, common across cities (1 parameter value).
Identification and selection of evidence: We identified NHBS data as the best available evidence to derive estimates for the decrease in shared injections due to HIV diagnosis (S2 Supplement Table C2.2.3) [106].

Derivation of model parameters
To estimate the percentage reduction in shared injections, we calculated the reduced probability for distributive injection sharing after diagnosis. We assumed that the percentage reduction in shared injections was equivalent across gender, race/ethnicity and between PWID and MWID.

We used 95% confidence intervals derived from NHBS proportions as upper and lower range estimates.

Sexual mixing patterns
We explicitly modeled sexual mixing patterns within and between race/ethnicity groups, to capture the dynamics of HIV infection among race/ethnicity groups in a particular city, given differences in race/ethnicity composition by city.

Model input parameters:
We required assortative sexual mixing parameters by city, for the proportion of sexual partners of the same race/ethnicity for both HET/PWID and MSM/MWID, stratified by race/ethnicity (6 parameter values).

Derivation of model parameters
We based our estimates of assortative sexual mixing among HET and PWID on the assortative mixing among high and low risk opposite-sex encounters among individuals in the NSFG. We used literature sources for same-sex assortative mixing among MSM and MWID, using estimates from a Houston-based cohort study for Atlanta, Baltimore, Miami and New York City [107], and estimates from a San Francisco cohort study for Los Angeles and Seattle [108]. We assumed the same mixing patterns for HET and PWID, as well as high/low risk among MSM and MWID.

Probability of transmission
To model the probability of HIV transmission, we included parameters capturing the baseline probabilities of transmission via sexual contact and injection (modeled as probabilities of transmission per sexual act or shared injection), and the protective effects of ART and condom use in reducing transmission.

Probability of transmission from sexual contact
Model input parameters: We required estimates for the probability of HIV transmission per heterosexual and homosexual sexual contact, common across cities, race/ethnicity and risk group. Heterosexual estimates were further stratified by whether the contact was male-to-female or female-to-male, and by CD4 cell count/acute HIV (12 parameter values).

Derivation of model input parameters
We derived input parameters for the probability of transmission via sexual contact by triangulating data from different literature sources (Table B2). We used a multiplier to derive the probability of transmission for individuals with acute HIV, relative to chronic HIV with CD4 < 200 cells/µL.
We derived ranges based on the range of 95% confidence intervals in literature estimates.

Probability of transmission from shared injection
Model input parameters: We required estimates for the probability of HIV transmission per injection, common across cities, race/ethnicity, risk group and stratified by CD4 category/acute HIV (4 parameter values).

Identification and selection of evidence:
We identified systematic literature reviews of observational studies as the best available evidence to derive parameter estimates for the probability of transmission from shared injection (S2 Supplement Table C2.4.2) [21,23,24,25].

Derivation of model parameters
We derived input parameters for the probability of transmission from injection by triangulating data from different literature sources (S1 Supplement Table B2).

We derived ranges based on the range of 95% confidence intervals from literature estimates.

ART effectiveness on probability of transmission from sexual contact
Model input parameters: We required estimates of ART effectiveness for transmission via heterosexual and homosexual sex, common across cities, gender, race/ethnicity, and risk groups (all risk groups for heterosexual sex and MSM/MWID for homosexual sex) (2 parameter values).

Identification and selection of evidence:
We identified systematic reviews/meta-analyses of multiple literature sources as the best available evidence sources to estimate parameter value inputs for heterosexual sex [14,15].

Derivation of model parameters
We derived model parameter values for ART effectiveness for transmission via both heterosexual and homosexual contact based on a large meta-analysis study of ART effectiveness [15] (S1 Supplement Table B2).

ART effectiveness heterosexual
We derived ranges based on the range of 95% confidence intervals from literature estimates.

ART effectiveness on probability of transmission from injection
Model input parameters: We required estimates of ART effectiveness for transmission via injection, common across cities, gender, race/ethnicity, and PWID/MWID (1 parameter value).

Identification and selection of evidence:
We identified systematic reviews/meta-analyses of literature estimates as the best available evidence to derive parameter value estimates (S2 Supplement Table C2.4.4) [18,19,20].

Derivation of model parameters
We derived model parameter values by synthesizing literature estimates and ranges (S1 Supplement Table B2).

ART effectiveness injection
We derived ranges based on the range of 95% confidence intervals from literature estimates.  [10,11,12,13].

Derivation of model parameters
We derived model parameter values by synthesizing literature estimates and ranges (S1 Supplement Table B2).

HIV Testing
To model HIV testing, we included model parameters capturing the monthly rates at which infected/unaware PLHIV are diagnosed through HIV-symptom-based case finding (for infected PLHIV with CD4 cell counts < 500 cells/µL), as well as HIV testing rates for HIV-negative individuals and PLHIV who are unaware of their infection.

Identification and selection of evidence:
We identified cohort studies as the best available source of evidence to inform estimates of symptom-based case finding rates [109,110].

Derivation of model parameters
We derived symptom-based case-finding rates for infected/unaware PLHIV with CD4 cell counts of 200-499 cells/µL and < 200 cells/µL from literature estimates, assuming an equal rate across cities, gender, race/ethnicity and risk group.

HIV testing rates
Model input parameters: We required city-specific HIV testing rates, stratified by gender, race/ethnicity, risk group, as well as high/low sexual risk among MSM, MWID and HET and OAT status among PWID/MWID (42 parameter values).
Identification and selection of evidence: We identified NHBS, Behavioral Risk Factor Surveillance System (BRFSS) and New York City Community Health Survey (NYC-CHS) data as the best available sources of evidence to derive testing rates [65,106,111].

Derivation of model parameters
We derived testing rates based on the percentage of individuals receiving an HIV test in the past 12 months (the yearly probability of an HIV test), which we converted to a monthly rate. We stratified testing rates by high/low sexual risk behavior among HET and MSM. We assumed that testing rates were equivalent for low-risk HET, MSM and MWID, and that testing rates were equivalent for PWID and MWID (S2 Supplement

We used NHBS data to derive testing rates for high-risk groups and BRFSS data for low-risk groups, with the exception of New York City, where we used the NYC-CHS [98,99]. To derive ranges for NHBS testing rates, we used stratified BRFSS standard error estimates [94,95]. We assumed that HIV-negative individuals remained identified for 12 months after screening (S2 Supplement
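A minimal sketch of converting a yearly testing probability into a monthly rate, assuming a constant testing hazard over the year (the standard relation rate = -ln(1 - p) / 12). The function names and the 40% input are hypothetical and are not taken from the source data.

```python
import math


def yearly_prob_to_monthly_rate(p_year: float) -> float:
    """Convert a 12-month probability into a constant monthly rate.

    Assumes a constant hazard, so that p_year = 1 - exp(-12 * rate).
    """
    if not 0.0 <= p_year < 1.0:
        raise ValueError("p_year must be in [0, 1)")
    return -math.log(1.0 - p_year) / 12.0


def rate_to_prob(rate: float, months: float = 1.0) -> float:
    """Probability of at least one event within `months` at a constant rate."""
    return 1.0 - math.exp(-rate * months)


# Hypothetical input: 40% of a stratum tested in the past 12 months.
rate = yearly_prob_to_monthly_rate(0.40)
print(f"monthly rate: {rate:.4f}")
print(f"implied 12-month probability: {rate_to_prob(rate, 12):.2f}")
```

Applying the rate over 12 months recovers the yearly probability, which is a quick consistency check on any conversion of this kind.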

ART initiation
To model ART initiation, we included parameters for the proportion of diagnosed PLHIV initiating ART within 30 days of diagnosis, as well as the monthly rate of previously diagnosed PLHIV (more than 30 days past diagnosis) initiating ART. The combination of these two parameters allowed us to calculate the total number of PLHIV initiating ART every month.

Model input parameters:
We required gender-, race/ethnicity- and risk-group-stratified proportions of PLHIV initiating ART immediately upon diagnosis (< 30 days post-diagnosis). We supplemented ART initiation rates from HIVRN primary analysis with MMP data [112].

Derivation of model parameters
We calculated the number of PLHIV initiating ART, using the proportion linked to care as the denominator. To derive monthly ART initiation rates for individuals more than 30 days post-diagnosis, we divided the number of PLHIV initiating ART more than 30 days after diagnosis by the total number of follow-up months among PLHIV who were diagnosed but had not yet initiated ART (PLHIV initiating ART more than 30 days post-diagnosis per month).
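The monthly initiation rate described above is an events-per-person-time calculation. A minimal sketch, with hypothetical counts and a hypothetical function name:

```python
def monthly_initiation_rate(n_initiated_after_30d: int, followup_months: float) -> float:
    """Events divided by person-months at risk.

    n_initiated_after_30d: PLHIV who initiated ART more than 30 days after diagnosis.
    followup_months: total months of follow-up among PLHIV diagnosed but not yet on ART.
    """
    if followup_months <= 0:
        raise ValueError("followup_months must be positive")
    return n_initiated_after_30d / followup_months


# Hypothetical stratum: 120 initiations over 2,400 person-months of follow-up.
rate = monthly_initiation_rate(120, 2400.0)
print(f"{rate:.3f} initiations per person-month")
```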

ART retention and re-initiation
To model ART retention and re-initiation, we included parameters for the monthly probability that PLHIV on ART drop out, stratified by CD4 cell count, as well as the monthly probability that PLHIV re-initiate ART, not stratified by CD4 cell count.

Model input parameters:
We required ART dropout rates by CD4 category, stratified by gender, ethnicity and risk group (54 parameter values).

Identification and selection of evidence:
We identified longitudinal HIV cohort data from HIVRN as the best available evidence to estimate monthly transition rates for PLHIV on-ART to off-ART (S2 Supplement Tables 3.3

Derivation of model parameters
We estimated the probabilities of ART dropout by simultaneously estimating the transition probability between CD4 strata, ART dropout probability, and ART re-initiation probability using a continuous-time multi-state Markov model. Full details on estimation of disease progression are described elsewhere [113].

ART dropout

ART re-initiation
Model input parameters: We required gender, ethnicity and risk-group-stratified rates for individuals re-initiating ART after interruptions in treatment (18 parameter values).

Identification and selection of evidence:
We identified longitudinal HIV cohort data from HIVRN as the best available evidence to estimate monthly transition rates for PLHIV on ART (S2 Supplement Table C3. 3

Derivation of model parameters
We assumed that ART re-initiation rates were equivalent across CD4 cell categories. We estimated ART re-initiation probabilities by simultaneously estimating the transition probability between CD4 strata, ART dropout probability, and ART re-initiation probability using a continuous-time multi-state Markov model. Full details on estimation of disease progression are described elsewhere [113].

HIV disease progression on ART
To model HIV disease progression for PLHIV on ART, we included monthly probabilities for transition between CD4 cell count categories. PLHIV on ART could transition between any of the three CD4 cell count categories.

Derivation of model parameters
We estimated disease progression on ART by simultaneously estimating the transition probability between CD4 strata, ART dropout probability, and ART re-initiation probability using a continuous-time multi-state Markov model. Full details on estimation of disease progression are described elsewhere [113]. We derived ranges from 95% confidence intervals generated in primary analysis of HIVRN data.

HIV disease progression off ART
To model HIV disease progression off ART, we included monthly transition probabilities for transition between CD4 cell count categories for PLHIV off ART.

Model input parameters:
We required transition probabilities from acute to chronic HIV for infected/unaware and diagnosed PLHIV, common across city, gender, race/ethnicity, and risk group (2 parameter values).

Identification and selection of evidence:
We identified longitudinal cohort studies as the best available evidence for estimates of disease progression off ART (S2 Supplement  Tables C3.5

Derivation of model parameters
We derived our estimates of disease progression from 200-499 cells/µL to < 200 cells/µL using the transition rate from HIV to AIDS, observed as events per 100 person-years [110]. We assumed that monthly disease progression rates from ≥ 500 cells/µL to 200-499 cells/µL and from 200-499 cells/µL to < 200 cells/µL for infected/unaware PLHIV were equivalent. We assumed that PLHIV off ART progressed from higher to lower CD4 cell counts.

Transition from acute to chronic states for infected and diagnosed
Model input parameters: We required transition rates from acute to chronic HIV for infected/unaware and diagnosed PLHIV, common across city, gender, race/ethnicity, and risk group (2 parameter values).

Identification and selection of evidence:
We identified longitudinal cohort studies as the best available evidence to estimate transitions from acute to chronic HIV states for infected/unaware and diagnosed PLHIV (S2 Supplement Tables C3.5.2.1 & C3.5.2.2) [50].

Derivation of model parameters
We derived the transition from acute to chronic HIV by converting literature estimates for the duration that PLHIV spend in acute HIV into a monthly rate. We derived transitions from acute to chronic HIV for diagnosed PLHIV based on assumptions relative to infected/unaware PLHIV.
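A minimal sketch of turning a mean duration in a health state into a monthly transition rate, assuming exponentially distributed time in the state (rate = 1 / mean duration). The 2-month duration is a placeholder, not the literature value used in the model.

```python
import math


def duration_to_monthly_rate(mean_duration_months: float) -> float:
    """Monthly exit rate for a state with the given mean duration (in months)."""
    if mean_duration_months <= 0:
        raise ValueError("mean_duration_months must be positive")
    return 1.0 / mean_duration_months


def monthly_exit_probability(rate: float) -> float:
    """Probability of leaving the state within one month at a constant rate."""
    return 1.0 - math.exp(-rate)


# Hypothetical mean duration of acute HIV: 2 months.
rate = duration_to_monthly_rate(2.0)
print(rate, round(monthly_exit_probability(rate), 3))
```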

HIV prevention programs
To model the efficacy of HIV prevention programs, we included model parameters capturing total syringe distribution volumes from syringe services programs (SSP) and OAT coverage among PWID/MWID by city, as well as time-varying estimates of pre-exposure prophylaxis (PrEP) coverage among HIV-negative MSM/MWID.

SSP coverage
We derived SSP coverage parameters for cities in our model based on the total volume of syringe distribution, divided by the PWID/MWID population.

Model input parameters:
To calculate SSP coverage, we required total SSP volume estimates by city (1 parameter value).

Identification and selection of evidence:
We identified the best available evidence for Atlanta based on estimates from the Atlanta Harm Reduction Coalition in 2016 [114]. Estimates for Baltimore were based on the City of Baltimore Syringe Exchange Program in 2016 [115]. Estimates for Los Angeles were based on direct correspondence with the City of Los Angeles AIDS Coordinator's Office [116]. Estimates for Miami were based on national CDC estimates, as local surveillance estimates were not available [117]. Estimates for New York City were based on New York State Department of Health reports in 2012 [118]. Estimates for Seattle were based on direct correspondence with Public Health - Seattle & King County (S2 Supplement Table C4.1) [119].

Derivation of model parameters
We assumed that syringes were distributed equally according to the gender and race/ethnicity proportions in the underlying population of PWID/MWID.

SSP coverage = (total volume of syringes distributed) / (number of PWID and MWID)
We derived lower ranges for Atlanta and Miami based on assumptions, and upper ranges based on national urban SSP averages [117]. We derived SSP low ranges for NYC from 2006 estimates, and high ranges by extrapolating SSP volume trends to 2016.
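The coverage calculation and the proportional allocation assumption can be sketched as follows. The syringe volume, the group labels and the population sizes are hypothetical placeholders, not values from the model.

```python
def ssp_coverage(total_syringes: float, population: float) -> float:
    """Syringes distributed per PWID/MWID."""
    if population <= 0:
        raise ValueError("population must be positive")
    return total_syringes / population


def allocate_syringes(total_syringes: float, group_sizes: dict) -> dict:
    """Split the total volume across groups in proportion to group size."""
    total_population = sum(group_sizes.values())
    return {
        group: total_syringes * size / total_population
        for group, size in group_sizes.items()
    }


# Hypothetical city: 1,000,000 syringes distributed among 10,000 PWID/MWID.
groups = {"group_a": 6000, "group_b": 4000}
print(ssp_coverage(1_000_000, sum(groups.values())))
print(allocate_syringes(1_000_000, groups))
```

Because the allocation is proportional to group size, every group ends up with the same per-capita coverage as the city overall.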

SSP effectiveness for reducing shared injections
Model input parameters: We required a multiplier for reduction in shared injections due to SSP for PWID and MWID, common across cities (1 parameter value).

Identification and selection of evidence:
We identified meta-analyses of observational studies as the best source for estimating the effectiveness of SSP in reducing shared injections (S2 Supplement Table C4.1.2) [120].

Derivation of model parameters
Given the lack of direct evidence for the reduction in shared injections due to SSP, we derived our parameter estimate based on the pooled effect size of SSP in reducing HIV transmission for PWID/MWID. Two pooled effect sizes were reported in the source study (one analysed from all included studies and one only across the subset of higher quality studies) and we derived the parameter based on the latter estimate.
We used 95% confidence intervals from literature estimates to derive high and low ranges for the reduction in shared injections due to SSP.

Opioid agonist treatment
To model opioid agonist treatment (OAT), we included parameters capturing the initial stratification of PWID/MWID on and off OAT. We also included monthly rates for OAT retention, and the efficacy of OAT in improving adherence to ART and reducing shared injections.

Model input parameters:
We required numbers of PWID and MWID receiving OAT, stratified by race/ethnicity for PWID and MWID, as well as gender for PWID (9 parameter values).

Identification and selection of evidence:
We identified administrative databases and surveillance data as the best available evidence for the number of PWID/MWID on OAT. We used TEDS data to estimate the number of individuals receiving methadone in 2010 (S2 Supplement Table C4.2.1) [121]. We estimated the number of individuals receiving buprenorphine using the DATA-waivered physician capacity for each city and the proportion of statewide DATA-waivered physicians in each city (S2 Supplement

We derived ranges for PWID on OAT based on the variation in the proportion of PWID among DATA-waivered physician patients.

OAT entry and dropout
Model input parameters: We required OAT dropout rates for PWID and MWID, common across cities, gender and race/ethnicity (1 parameter value).
Identification and selection of evidence: We identified systematic reviews of cohort studies as the best available evidence to estimate OAT retention (S2 Supplement Table C4.2.6) [125].

Derivation of model parameters
We derived OAT dropout rates for PWID/MWID as the midpoint between the lowest and highest observed dropout rates among cohort studies in the systematic review.
OAT dropout rate = (lowest observed dropout rate + highest observed dropout rate) / 2
We derived ranges using the highest and lowest observed OAT dropout rates among the studies used to derive the midpoint estimate.

Model input parameters:
We required a multiplier for increased retention in ART due to OAT for PWID and MWID, common across cities (1 parameter value).

Identification and selection of evidence:
We identified cohort-based observational studies as the best source for estimating the effectiveness of OAT in improving ART retention (S2 Supplement

Derivation of model parameters
We derived the multiplier for OAT effectiveness on ART dropout rates using odds-ratios for the effect of OAT on reducing ART dropout rates.

We used 95% confidence intervals from literature estimates to derive upper and lower ranges for the reduction in ART dropout due to OAT.

OAT effectiveness for reducing shared injections
Model input parameters: We required a multiplier for proportional reduction in shared injections due to OAT for PWID and MWID, common across cities (1 parameter value).

Identification and selection of evidence:
We identified meta-analyses of cohort studies as the best source for estimating the effectiveness of OAT in reducing shared injections (S2 Supplement Table C4.2.8) [127].

Derivation of model parameters
Given the lack of direct evidence for the reduction in shared injections due to OAT, we derived our parameter estimate based on the rate ratio for HIV infection risk for PWID/MWID on OAT relative to those off OAT.
We used 95% confidence intervals from literature estimates to derive high and low ranges for the reduction in shared injections due to OAT.

Pre-exposure prophylaxis
To model the scale-up of pre-exposure prophylaxis (PrEP) following the approval of medication for use as PrEP in 2012, we included parameters for the number of HIV-negative MSM/MWID on PrEP, screening rates, the duration that HIV-negative individuals remain on PrEP, and the efficacy of PrEP in reducing HIV transmission.

Derivation of model parameters
We set initial numbers of HIV-negative MSM/MWID on PrEP to zero in 2011 (baseline) to begin model calibration, as Truvada was not approved by the FDA for use as PrEP until 2012 [129]. For all cities, we derived total numbers on PrEP for 2012 through 2017 (to account for the recent rapid growth in uptake among MSM) using AIDSVu surveillance data on the number of unique men who had at least one day of prescribed PrEP, by ZIP code [130]. We used state-level growth from 2016 to 2017 to derive city-level estimates for the number of individuals on PrEP in 2017, as city-level data were not available. We derived ranges for the number of individuals on PrEP using a lower bound derived from Seattle & King County surveillance reports [119] and an upper bound from literature estimates [131].

We derived race/ethnicity proportions receiving PrEP for New York City based on Medicaid prescription data [132], and for other cities except Miami using national race/ethnicity proportions of MSM accessing PrEP [128], weighted by the statewide race/ethnicity distribution of the MSM population [133]. For Miami, given a large disparity in race/ethnicity composition between the state of Florida and Miami-Dade County, we weighted the PrEP proportion estimates with city-level MSM demographics that we estimated from model instantiation (S1 Supplement C, Section 3.1.1).

To derive screening rates for individuals on PrEP, we assumed that individuals on PrEP were tested every three months, as this was a requirement for PrEP [134] (S2 Supplement Table C4.3.4). To derive the average duration that HIV-negative individuals remain on PrEP, we assumed that individuals remained identified for 3 months to mirror CDC testing guidelines [134] (S2 Supplement Table C4.3.5).

PrEP effectiveness
Model input parameters: We required multipliers for the reduction in transmission due to PrEP for MSM/MWID, common across cities, race/ethnicity, and risk group (1 parameter value).

Identification and selection of evidence:
We identified literature estimates for PrEP effectiveness in reducing HIV transmission as the best available evidence (S2 Supplement Table C4.3.3) [135,136].

Derivation of model parameters
We derived PrEP effectiveness from PrEP efficacy estimates for individuals maintaining optimal protective levels of adherence (≥4 doses/week) [135], multiplied by the proportion of individuals maintaining protective levels of adherence [136].

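PrEP effectiveness = PrEP efficacy at protective adherence (≥4 doses/week) × proportion of individuals maintaining protective adherence
We derived ranges for PrEP efficacy from 95% confidence intervals in the source study.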
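The product of efficacy and adherence described above can be sketched as follows. All numeric inputs are hypothetical placeholders rather than the estimates from [135,136]; the range simply carries the efficacy confidence limits through the same product.

```python
def prep_effectiveness(efficacy_if_adherent: float, proportion_adherent: float) -> float:
    """Population-level effectiveness = efficacy among adherent users x share adherent."""
    for value in (efficacy_if_adherent, proportion_adherent):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must be proportions in [0, 1]")
    return efficacy_if_adherent * proportion_adherent


# Hypothetical inputs: efficacy 0.90 (95% CI 0.70 to 0.98), 60% protective adherence.
point = prep_effectiveness(0.90, 0.60)
low = prep_effectiveness(0.70, 0.60)
high = prep_effectiveness(0.98, 0.60)
print(round(low, 3), round(point, 3), round(high, 3))
```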

Costs of Medical Care
We derived monthly costs among PLHIV from primary analysis of longitudinal HIV cohort health care utilization data, combined with unit costs for individual expenditure items. We estimated costs among HIV-negative individuals using nationally representative health care expenditure data from the general population.

Identification and selection of evidence:
We identified longitudinal cohort data from HIVRN to estimate health resource use for PLHIV [92]. For unit costs of individual health care components, we selected Medicaid physician fee schedules [137], prescription drug costs from national FSS price schedules [138], costs of diagnostic testing [139], and cost estimates for emergency department visits and inpatient hospitalizations [140] (S2 Supplement Tables C5.1

Derivation of model parameters
We applied unit costs to HIVRN utilization records to derive total medical care costs among PLHIV, stratified by risk group and CD4 cell count. We estimated medical care costs of PLHIV by HIVRN region and applied estimates to cities within each region. Full details of cost estimation are described elsewhere [141].
We used 95% confidence intervals derived from HIVRN primary analysis estimates to derive upper and lower range values.
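Applying unit costs to utilization records amounts to a weighted sum per person-month. A minimal sketch, in which the item names, unit costs and utilization counts are all hypothetical and do not reproduce the fee schedules cited above:

```python
# Hypothetical unit costs in USD per item.
UNIT_COSTS = {
    "outpatient_visit": 100.0,
    "ed_visit": 500.0,
    "inpatient_day": 2000.0,
    "cd4_test": 50.0,
}


def monthly_cost(utilization: dict, unit_costs: dict) -> float:
    """Sum of (mean monthly count of each item) x (unit cost of that item)."""
    return sum(count * unit_costs[item] for item, count in utilization.items())


# Hypothetical mean monthly utilization for one risk group and CD4 stratum.
stratum_utilization = {"outpatient_visit": 0.5, "ed_visit": 0.1, "cd4_test": 0.25}
print(monthly_cost(stratum_utilization, UNIT_COSTS))
```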

Medical costs among HIV-negative
Model input parameters: We required monthly health care costs for HIV-negative, stratified by risk group (PWID, non-PWID), for each city (2 parameter values).

Identification and selection of evidence:
We identified population-level estimates of health care costs from MEPS data in 2016, stratified by census region, to capture health care costs of HIV-negative individuals in our model [142] (S2 Supplement Table C5.2.1). We used a multiplier for costs among HIV-negative PWID based on observed cost differences among HIV-positive PWID and non-PWID in HIVRN primary analysis (S2 Supplement Table C5.2.2) [143,144].

Derivation of model parameters
We used 95% confidence intervals from MEPS estimates to derive high and low ranges. We derived ranges for PWID costs among HIV-negative individuals from literature estimates.

Health utility weights
Quality-adjusted life years (QALYs) for individuals were a primary outcome of our cost-effectiveness model. To calculate QALYs, we included model input parameters that assigned health utility weights to each health state, based on literature estimates and additional assumptions.
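A minimal sketch of how utility weights translate time spent in health states into QALYs. The reference weight of 1 for HIV-negative non-PWID follows the assumption stated in this supplement; the other weight and the state names are hypothetical, and discounting is omitted for simplicity.

```python
def qalys(months_in_state: dict, utility: dict) -> float:
    """QALYs = sum over states of (years spent in state) x (utility weight)."""
    return sum(
        months * utility[state] for state, months in months_in_state.items()
    ) / 12.0


# Reference state weight of 1 as assumed in the text; 0.8 is a placeholder.
UTILITY = {"hiv_negative_non_pwid": 1.0, "hypothetical_other_state": 0.8}

print(qalys({"hiv_negative_non_pwid": 6, "hypothetical_other_state": 6}, UTILITY))
```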

Identification and selection of evidence:
We identified longitudinal cohort studies as the best available evidence to derive health utility weights. We selected health utility weights from literature sources for infected PLHIV [145,146,147,148,149,150] (S2 Supplement  [20], and OAT among PWID (S2 Supplement  Table C6.2.5) [151].

Derivation of model parameters
We used 95% confidence intervals from literature estimates to derive ranges for health utility weights.

HIV-uninfected
Model input parameters: We required health utility weights for HIV-negative individuals, stratified by risk group (PWID vs. non-PWID) and by OAT status among PWID (OAT vs. non-OAT) (3 parameter values).
Identification and selection of evidence: We identified longitudinal cohort studies as the best available evidence to derive health utility weights. For PWID health utility weights, we used literature estimates (S2 Supplement  [151].

Derivation of model parameters
We assumed quality of life weights for non-PWID, HIV-negative individuals to be 1, as a reference health state for relative quality of life weights in other health states.
We used 95% confidence intervals from literature estimates to derive high and low ranges for health utility weights.

S1 Supplement D: Primary Analyses
This supplement provides a brief description of the data sources from which we conducted primary data analysis to populate our model parameter values.

Data source description
The York City, and Seattle). Data were available for all five cities except for HET cycle 4, which was not conducted in New York City and Seattle. All participants with complete interview data who were sexually active, or who actively injected drugs, in the last 12 months (L12M) were included in our analysis. To fit the specifications of our simulation model, we excluded participants who were older than 64 years. Additionally, we excluded those who self-reported as HIV-positive, as we aimed to estimate risk behaviors and PrEP uptake among HIV-negative or HIV status-unaware populations.

Statistical analysis
Descriptive analysis was conducted to obtain summary statistics on parameters of interest, stratified by city, race/ethnicity, gender, risk group (MSM, MSM/PWID, PWID, HET) and calendar year. As MSM/PWID participating in the MSM cycle might differ from those identified in the PWID cycle because of different sampling methods, we estimated two sets of parameters for MSM/PWID, using data from each cycle separately. For parameters such as the annual numbers of opposite-sex and same-sex partners, we obtained the mean, standard deviation, median, interquartile range, and the 10th and 90th percentiles of the estimates. Given the great uncertainty in quantifying individuals' sexual risk behaviors, these multiple statistics describing the risk behavior distribution will assist in our calibration of these parameters. We calculated the proportions, with exact 95% confidence intervals (CIs), of individuals who used a condom every time with opposite-/same-sex partners, PWID who used needles that had been used by others, MSM who reported PrEP indication/uptake, and individuals who received HIV testing during L12M. For the proportion of individuals who received HIV testing in L12M, the denominator contained HIV-negative or HIV status-unaware participants and HIV-positive participants who were diagnosed in L12M.
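A minimal sketch of an exact binomial confidence interval for a proportion, assuming that "exact" refers to the Clopper-Pearson method (the text does not name the method). The bounds are found by bisection on the binomial distribution using only the standard library; the counts are hypothetical.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p). Adequate for moderate n (hundreds)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(g, target: float, tol: float = 1e-12) -> float:
    """Find p in [0, 1] with g(p) = target, for g increasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_ci(k: int, n: int, alpha: float = 0.05):
    """Clopper-Pearson interval for k successes out of n trials."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    # Lower bound: P(X >= k | p) = alpha / 2
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2.0)
    # Upper bound: P(X <= k | p) = alpha / 2, i.e. P(X > k | p) = 1 - alpha / 2
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1.0 - binom_cdf(k, n, p), 1.0 - alpha / 2.0)
    return lower, upper


# Hypothetical stratum: 30 of 100 respondents report the behavior.
print(exact_ci(30, 100))
```

For survey strata with very large n, a beta-quantile implementation from a statistics library would be a more efficient route to the same interval.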
Demographics of sex partners were collected in PWID cycle 4 and HET cycle 4 only, and were used to estimate assortative mixing by race/ethnicity for PWID and HET, respectively, stratified by city. First, we cross-tabulated the participants' race/ethnicity by their most recent sex partners' race/ethnicity. Second, we calculated Newman's assortative mixing coefficients and 95% CIs, stratified by race/ethnicity [152]. This coefficient captures the probability of mixing with the same race/ethnicity, accounting for mixing with the same race/ethnicity by chance alone. As a sensitivity analysis, we also calculated the assortative mixing coefficients using data from up to three of the participants' most recent sex partners.
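A minimal sketch of Newman's assortativity coefficient computed from a participant-by-partner cross-tabulation. It implements the overall coefficient r = (sum_i e_ii - sum_i a_i b_i) / (1 - sum_i a_i b_i); the race/ethnicity-stratified variant and the confidence intervals used in the analysis are not reproduced here, and the example tables are hypothetical.

```python
def newman_assortativity(counts):
    """Overall Newman assortativity from a square cross-tabulation.

    counts[i][j] = number of participants in group i whose partner is in group j.
    """
    n = len(counts)
    total = float(sum(sum(row) for row in counts))
    e = [[c / total for c in row] for row in counts]           # joint proportions
    a = [sum(e[i]) for i in range(n)]                          # participant margins
    b = [sum(e[i][j] for i in range(n)) for j in range(n)]     # partner margins
    observed = sum(e[i][i] for i in range(n))                  # same-group share
    expected = sum(a[i] * b[i] for i in range(n))              # share expected by chance
    return (observed - expected) / (1.0 - expected)


# Hypothetical 3-group table with mostly within-group partnerships.
table = [
    [40, 5, 5],
    [6, 30, 4],
    [3, 2, 25],
]
print(round(newman_assortativity(table), 3))
```

A value of 1 indicates that all partnerships are within group, and 0 indicates mixing no different from chance given the margins.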

Data source description
The states and one territory were sampled, then facilities in those areas providing outpatient HIV care, and finally, eligible PLHIV. MMP methods, including non-response bias analysis and weighting techniques, have been described in detail elsewhere [153,154].

Study sample
The national population of inference for MMP is all HIV-diagnosed persons aged ≥ 18 years living in the United States. For each project area, the population of inference is all HIV-diagnosed persons aged ≥ 18 years whose most recently reported address was within the project area. To calculate city-specific estimates, we received data from 5 project areas: 1) Georgia (Atlanta), 2) Florida (Miami), 3) Los Angeles County, 4) New York City, and 5) Washington State (Seattle).

Statistical analysis
Parameters included CD4 distribution among on-ART and off-ART PLHIV, distribution of CD4 cell counts at diagnosis, as well as the proportion of PLHIV initiating ART within 30 days of diagnosis, stratified by CD4 cell count. We received proportions of heterosexual and homosexual assortative mixing for the last sexual partner and last 5 sexual partners, as well as the number of same-sex and opposite-sex partners in the past 12 months and the proportion of individuals using a condom with all sexual partners.

Data source description
The HIVRN care providers are a consortium of adult and pediatric clinics located in the Northeast (Rochester, Boston, New York City, Baltimore, and Philadelphia), South (Dallas, Memphis, Tampa) and West (Portland, Oakland, and San Diego) regions of the United States. HIVRN sites abstract specified data elements from patients' medical records, including demographic data, service utilization, medications, and laboratory tests; abstracted data are assembled into a single database after quality assurance review.

Study sample
We included individuals aged 15-64 who were enrolled in participating clinics between 01 January 2007 and 30 September 2015.

Statistical analysis
Disease progression on ART, characterized by the transition probabilities between CD4 strata, the ART dropout probability, and the ART re-initiation probability, was estimated simultaneously using a continuous-time multi-state Markov model. This was operationalized by a matrix with 14 possible instantaneous transitions (CD4 ≥500 to 200-499; ≥500 to off-ART; ≥500 to death; 200-499 to ≥500; 200-499 to <200; 200-499 to off-ART; 200-499 to death; <200 to 200-499; <200 to off-ART; <200 to death; off-ART to ≥500; off-ART to 200-499; off-ART to <200; and off-ART to death). The analysis was stratified by region and adjusted for risk group, race/ethnicity and gender.
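The structure of this transition matrix can be sketched as follows. The example builds a generator matrix Q with exactly the 14 permitted instantaneous transitions and derives transition probabilities over an interval t as P(t) = exp(Qt). It illustrates the model structure only, not the estimation procedure: all rate values are hypothetical, the time unit (months) is an assumption, and the helper names are introduced here for illustration.

```python
STATES = ["CD4>=500", "CD4 200-499", "CD4<200", "off-ART", "death"]

# The 14 instantaneous transitions listed above, as (from, to) index pairs.
ALLOWED = [
    (0, 1), (0, 3), (0, 4),
    (1, 0), (1, 2), (1, 3), (1, 4),
    (2, 1), (2, 3), (2, 4),
    (3, 0), (3, 1), (3, 2), (3, 4),
]


def build_generator(rates):
    """Build generator matrix Q; rates maps (from, to) -> rate per unit time."""
    n = len(STATES)
    q = [[0.0] * n for _ in range(n)]
    for (i, j) in ALLOWED:
        q[i][j] = rates[(i, j)]
    for i in range(n):
        # Each diagonal entry makes its row sum to zero.
        q[i][i] = -sum(q[i][j] for j in range(n) if j != i)
    return q


def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_probabilities(q, t, squarings=10, terms=12):
    """P(t) = exp(Q t) via scaling-and-squaring with a truncated Taylor series."""
    n = len(q)
    scale = t / (2 ** squarings)
    step = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    identity = [[float(i == j) for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms + 1):
        # term becomes step^k / k!
        term = [[v / k for v in row] for row in mat_mul(term, step)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# Hypothetical rates per month, chosen only to make the example run.
rates = {pair: 0.01 for pair in ALLOWED}
rates[(1, 0)] = 0.05
Q = build_generator(rates)
P = transition_probabilities(Q, t=1.0)
```

Although CD4 ≥500 to <200 is not an instantaneous transition, the corresponding entry of P(t) is positive because the path can pass through an intermediate state within the interval.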

National Survey of Family Growth (NSFG)
Data source description
The NSFG is a nationally representative survey of the U.S. household population using stratified multi-stage area probability sampling. Face-to-face interviews and audio computer-assisted self-interviews were conducted with women and men aged 15-44 years, capturing information such as sexual behavior, family life, and health status. The public use data files were available on the NSFG website. Additionally, a user agreement can be signed to gain access to a REGION file, which includes a 4-category REGION variable (Northeast, West, Midwest, and South), capturing the respondent's residence at the time of the interview. More detailed residence information can only be accessed at the Census Research Data Centers.

Study sample
We obtained data from the 2011-2013 NSFG survey, including the public use data file and the REGION variable. A total of 5,601 women and 4,815 men were included. We used the NSFG data to estimate the sexual risk behaviors of HETs and low-risk MSM. MSM were defined as males who reported having a male sex partner in the L12M. HET were defined as all participants except MSM and those who reported injecting drugs in the L12M. We did not obtain estimates for PWID due to the small sample size (N=47).

Statistical analysis
We performed survey analysis accounting for the stratified probability sampling design and sampling weights to obtain summary statistics on parameters of interest. The SAS proc surveymeans procedure was used to obtain the weighted means and 95% CIs for parameters including the annual number of same/opposite sex partners, and proc surveyfreq was used to obtain the proportion and 95% CIs of individuals who used a condom every time in the L12M. The domain statement was used to obtain estimates on subgroups stratified by region, race/ethnicity, gender, and risk group (MSM, HET).
Only the demographics of opposite sex partners were available, and were used to estimate the assortative mixing by race/ethnicity for HETs stratified by region. We calculated the Newman's assortative mixing coefficients and 95% CIs for each race/ethnicity group [152].
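The point estimates from a domain analysis are weighted means computed within each subgroup. The sketch below shows that calculation only; it is not a substitute for the SAS procedures named above, and it omits the design-based variance estimation needed for the 95% CIs. The record fields and values are hypothetical.

```python
from collections import defaultdict


def weighted_domain_means(records, domain_keys, value_key, weight_key="weight"):
    """Weighted mean of value_key within each domain (subgroup)."""
    num = defaultdict(float)
    den = defaultdict(float)
    for rec in records:
        domain = tuple(rec[k] for k in domain_keys)
        num[domain] += rec[weight_key] * rec[value_key]
        den[domain] += rec[weight_key]
    return {d: num[d] / den[d] for d in num}


# Hypothetical respondents: number of partners in the L12M and a sampling weight.
records = [
    {"region": "South", "gender": "F", "partners": 1, "weight": 2.0},
    {"region": "South", "gender": "F", "partners": 3, "weight": 1.0},
    {"region": "West", "gender": "M", "partners": 2, "weight": 4.0},
]
means = weighted_domain_means(records, ["region", "gender"], "partners")
```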

Data source description
The AWARE study recruited 5,012 HIV-negative or HIV status-unaware patients (≥ 18 years) seeking services from sexually transmitted disease clinics in nine U.S. cities (Pittsburgh, Jacksonville, Los Angeles, Miami, Portland, Seattle, Columbia, San Francisco, and Washington, DC) between April and December 2010. It was a randomized controlled trial designed to assess the effect of brief patient-centered risk-reduction counseling at the time of a rapid HIV test on the subsequent acquisition of sexually transmitted infections. At the baseline and 6-month follow-up, participants were assessed for their sexual risk behaviors and injection drug use in the last 6 months (L6M).

Study sample
We included all participants aged 18-64 years in our analysis.

Statistical analysis
Descriptive analysis was performed to obtain estimates of participants' baseline HIV risk behaviors in the L6M among the full study sample, as well as among samples from each of the three cities of interest (Los Angeles, Miami, and Seattle) separately. All analyses were stratified by risk group (MSM, MSM/PWID, PWID, and HET), race/ethnicity and gender. We calculated the median and interquartile range estimates for the number of same and opposite sex partners, respectively. For MSM and MSM/PWID, the number of same sex partners was approximated by the number of anal sex partners, and the number of opposite sex partners was approximated by the number of vaginal sex partners. Additionally, we calculated the proportions and the exact 95% CIs of individuals who used a condom every time, and of PWID who reported using needles previously used by others.
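As an illustration of an exact confidence interval for a proportion, the sketch below computes the Clopper-Pearson interval by bisection on the binomial distribution. That "exact" refers to the Clopper-Pearson method is an assumption, and the function names are introduced here for illustration.

```python
from math import comb


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(0, x + 1))


def _solve(f, target, increasing, tol=1e-10):
    """Bisection on [0, 1] for f(p) = target, where f is monotone in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid  # the root lies to the right of mid
        else:
            hi = mid  # the root lies to the left of mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x events observed among n individuals."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2 (increasing in p).
    lower = 0.0 if x == 0 else _solve(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, True)
    # Upper bound: the p at which P(X <= x) equals alpha / 2 (decreasing in p).
    upper = 1.0 if x == n else _solve(
        lambda p: binom_cdf(x, n, p), alpha / 2, False)
    return lower, upper


# Hypothetical example: 5 of 10 participants report consistent condom use.
low, high = exact_ci(5, 10)
```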

Data source description
The ALIVE study is a prospective cohort of adults (aged 18 years or older) who reported injection drug use within the past 11 years, recruited through community outreach in Baltimore, MD. The initial enrollment began in 1988 with 2,938 participants recruited (88% of participants were black/African American), and an additional 1,733 PWID were enrolled through later recruitment efforts. Follow-up visits for ALIVE occur semi-annually. At each visit, participants complete surveys and laboratory testing. Information on behavior, life events, and health-related outcomes is captured using audio computer-assisted self-interview.

Study sample
We included all black/African American participants who had an assessment (whether baseline or follow-up) completed in 2010 and who reported injection drug use in the L6M. We excluded those who were HIV positive and those who were older than 64 years at the time of assessment.

Statistical analysis
Descriptive analysis was employed to obtain estimates of risk behaviors using data from each participant's first assessment in 2010. All analyses were stratified by risk group (MSM/PWID and PWID) and gender. We calculated the median and interquartile range estimates for the number of same and opposite sex partners, respectively, and calculated the proportions and the exact 95% CIs of individuals who used a condom every time, and of PWID who reported sharing needles.

Data source description
The CDC's BRFSS is the largest health-related cross-sectional surveillance survey of U.S. residents and collects health-related risk behaviors and chronic health conditions from adults aged 18 years and older. The BRFSS uses two samples: one for landline telephone respondents and one for cellular telephone respondents, and includes state-level stratification. The public use data files were available on the BRFSS website (www.cdc.gov/brfss) with respondents' geographic information available at the state level, and included indication of residence in a Metropolitan Statistical Area.

Study sample
We obtained data from the public use file of the 2010 BRFSS survey, including the MSCODE variable indicating non-nominal MSA residence. We included all adults aged <65 residing in the center city of an MSA or inside the county containing the center city of an MSA (MSCODE==1 | MSCODE==2) for each respective state, and assumed that population-level HIV testing behavior reported in BRFSS was representative of HIV testing behavior for low risk HET and for low risk MSM.

Statistical analysis
We estimated gender and race/ethnicity stratified HIV testing in the L12M accounting for both design weighting and iterative proportional fitting, and we used weights assigned to each respondent for the landline telephone and cellular telephone combined data to obtain the weighted means and 95% CIs.

Data source description
The New York City CHS is an annual cross-sectional telephone survey of randomly selected adults aged 18 and older from all five boroughs of New York City (Manhattan, Brooklyn, Queens, Bronx, and Staten Island). The survey is conducted using a computer-assisted telephone interviewing system and collects self-reported data from selected respondents with landline telephones and cell phones. The public use data files were available on the CHS website (https://www1.nyc.gov/site/doh/data/data-sets/community-health-survey-public-use-data.page).

Study sample
We obtained data from the public use file of the 2010 CHS survey, and we included all women aged <65 and all men aged <65 not reporting having sex with a man in the past 12 months (MSM==2). We assumed that HIV testing behavior reported in CHS was representative of HIV testing behavior for low risk HET and low risk MSM.

Statistical analysis
We estimated gender and race/ethnicity stratified HIV testing in the L12M accounting for probability of selection and post-stratification weights to obtain the weighted means and 95% CIs.

Data source description
The Substance Abuse and Mental Health Services Administration's (SAMHSA) TEDS admission data set is the only national client-level database on substance abuse treatment, and includes routinely collected information on all individuals admitted to facilities that receive public funds. In TEDS, an admission is defined as the formal acceptance of an individual into substance abuse treatment. The public use data files were available on the TEDS website (https://wwwdasis.samhsa.gov/dasis2/teds.htm) with geographic information available for admissions at the state level, including Core-Based Statistical Areas (CBSA) indicators.

Study sample
We obtained data from the 2010 to 2014 TEDS public use files, and we included all individuals aged ≥15 residing in the CBSA corresponding to each respective city. We defined PWID with OUD as individuals reporting: (i) the primary, secondary or tertiary substance use of heroin, non-prescription methadone, or other opiates and synthetics (including buprenorphine, codeine, hydrocodone, hydromorphone, meperidine, morphine, opium, oxycodone, pentazocine, propoxyphene, tramadol, and any other drug with morphine-like effects); and (ii) injection as route of administration. As the data did not capture repeated treatment admissions, we assumed that each admission reported in TEDS was for a unique individual.

Statistical analysis
We estimated the number of individuals in a given year receiving opioid agonist treatment (OAT; i.e. medication-assisted treatment) by gender and race/ethnicity for every city except Atlanta, because all OAT data (variable METHUSE) were missing for Georgia. As TEDS has been shown to be a conservative source of treatment coverage [155], we defined a conservative lower range by excluding PWID with OUD aged >55 and a liberal upper range by including all individuals reporting use of heroin.

S1 Supplement E: Model Calibration/Validation, Parameter Ranges, and PSA Probability Distributions
Since the data used to populate the model were not always available, representative of specific subpopulations of interest, or up-to-date, and in some cases exhibited large variation, we undertook an extensive model calibration and validation exercise, which will be described in further detail in an upcoming manuscript [156]. We provide an overview below.
We adopted a direct-search algorithm (Nelder-Mead) to iteratively calibrate key parameters with high uncertainty against three sets of observed calibration endpoints: the total number of diagnosed PLHIV in each year, annual new HIV diagnoses, and annual all-cause mortality among PLHIV. We adjusted parameter values within prespecified ranges using 2011 as the baseline year, and compared model outputs with reported values for 2012-2015 until the weighted mean percentage deviation (the goodness-of-fit (GOF) metric) was minimized. We selected the set of key parameters for calibration using a one-at-a-time factor screening approach (the Morris method), on the basis of the underlying uncertainty associated with evidence input estimates and their effects on the calibration targets. We also conducted model validation to assess face validity, as well as the internal and external validity of the model. In particular, we externally validated model projections against the empirically estimated annual number of HIV incident cases.
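One possible form of the goodness-of-fit metric is sketched below: the absolute percentage deviation between model output and the reported value is averaged over years for each target, and the targets are then combined using weights. The exact formula and weights are not given in this overview, so the functional form, the target names and all numbers here are assumptions for illustration. A direct-search optimizer such as Nelder-Mead would repeatedly run the model and minimize this scalar.

```python
def weighted_mean_pct_deviation(model, observed, weights):
    """Weighted mean absolute percentage deviation across calibration targets.

    model, observed: dicts mapping target name -> list of yearly values.
    weights: dict mapping target name -> non-negative weight.
    """
    total_weight = sum(weights[name] for name in observed)
    gof = 0.0
    for name, obs_series in observed.items():
        deviations = [abs(m - o) / o for m, o in zip(model[name], obs_series)]
        gof += weights[name] * sum(deviations) / len(deviations)
    return gof / total_weight


# Hypothetical reported values and model outputs for two calibration years.
observed = {
    "new_diagnoses": [100.0, 100.0],
    "diagnosed_plhiv": [1000.0, 1000.0],
    "all_cause_deaths": [50.0, 50.0],
}
model_output = {
    "new_diagnoses": [110.0, 90.0],
    "diagnosed_plhiv": [1000.0, 1000.0],
    "all_cause_deaths": [45.0, 55.0],
}
weights = {"new_diagnoses": 1.0, "diagnosed_plhiv": 1.0, "all_cause_deaths": 1.0}
gof = weighted_mean_pct_deviation(model_output, observed, weights)
```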
In subsequent sections, we discuss the process for deriving model calibration targets for each city. Due to limitations in publicly available surveillance data, and varying data quality across cities, we also documented the process by which we triangulated city-specific calibration targets from alternative sources. Data and evidence sources for model calibration and validation targets are attached in supplementary Excel files (S2 Supplement Tables E2-E7).

Model Calibration Targets
We derived model calibration targets directly from city-level HIV surveillance reports when available, and describe the methods and assumptions required to create the calibration targets for our model. As with other evidence sources, any race/ethnicity other than black/African American or Hispanic/Latino was categorized as white/other, and any risk group other than MSM, PWID and MWID was classified as HET to match overall totals of diagnoses, total PLHIV and all-cause deaths. Other assumptions that were required due to data limitations for specific cities are detailed below.

New HIV Diagnoses
Yearly numbers of new HIV diagnoses were derived directly from city-level HIV surveillance where available. In New York City [157], Los Angeles [158], Baltimore [159], and Seattle [160], numbers of new HIV diagnoses were available for all years with minimal triangulation required. In most cases, two-way stratified totals were reported (by race/ethnicity and gender, race/ethnicity and risk group, and risk group and gender), and we triangulated three-way stratified estimates to match our target model calibration inputs. For Miami, the numbers of new HIV diagnoses were available for 2013-2015 [161], stratified by race/ethnicity and gender/risk group. We assumed that race/ethnicity proportions were equivalent for each risk group. We derived new diagnoses for 2012 using the state-level diagnoses, adjusted for the relative proportion of new diagnoses from Miami for each PLHIV stratum. For the Atlanta EMA (city boundary encompassed the counties described in S1 Supplement A), we used a 2014 EMA report of new HIV diagnoses [162], combined with Georgia state-level surveillance estimates, to derive relative percentages between the city and state for each stratum of PLHIV [162,163]. We used these percentages to derive city-level estimates for 2012, 2013 and 2015, based on the assumption that city-level numbers remained the same relative proportion of state-level numbers for each stratum as in 2014.
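The proportional scaling used for Atlanta can be written as: city estimate in year y = state count in year y × (city count in 2014 / state count in 2014), applied separately to each stratum. A minimal sketch follows; the stratum labels and all counts are hypothetical, and the function names are introduced here for illustration.

```python
def city_share(city_ref, state_ref):
    """Share of state-level counts attributable to the city, per stratum.

    Assumes every stratum has a non-zero state-level count.
    """
    return {s: city_ref[s] / state_ref[s] for s in city_ref}


def scale_state_to_city(state_counts, share):
    """Apply reference-year shares to another year's state-level counts."""
    return {s: state_counts[s] * share[s] for s in share}


# Hypothetical new diagnoses by (race/ethnicity, risk group) stratum.
city_2014 = {("black", "MSM"): 300, ("white/other", "HET"): 50}
state_2014 = {("black", "MSM"): 600, ("white/other", "HET"): 200}
state_2012 = {("black", "MSM"): 620, ("white/other", "HET"): 180}

share = city_share(city_2014, state_2014)
city_2012 = scale_state_to_city(state_2012, share)
```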

Total Diagnosed PLHIV
Total population numbers for diagnosed PLHIV were derived directly from city-level HIV surveillance reports and data queries where available. In New York City [157], Los Angeles [158], Seattle [160], and Baltimore [159], total numbers of PLHIV were available for all years. As with new diagnoses, the majority were two-way or fully stratified and required minimal triangulation. For Miami, we used data on total diagnosed PLHIV for Miami-Dade county from AIDSVu for 2011-2014 [164], and Miami-Dade county HIV surveillance data for 2015 [68]. For Atlanta, we used 2014 EMA reports to triangulate PLHIV numbers for 2011-2013 and 2015, combined with state-level estimates to derive relative percentages between city and state for each stratum of PLHIV [162]. As with new diagnoses, we used these percentages to derive city-level estimates of the diagnosed PLHIV population for 2012, 2013, and 2015 from state-level numbers, assuming that city-level numbers remained the same relative proportion of state-level numbers for each stratum in those years as in 2014.

All-cause Mortality among PLHIV
We derived yearly all-cause mortality among PLHIV from city-level HIV surveillance reports where possible. For New York City [157] and Los Angeles [158], two-way stratified totals were reported for all-cause mortality among PLHIV, and we triangulated using the same methods as for new HIV diagnoses. For Baltimore, only total HIV-specific deaths were reported at the city level [159], so we used state-level differences between the number of HIV-specific deaths and all-cause deaths among PLHIV to derive relative ratios between HIV-specific mortality and all-cause mortality. We derived the numbers of all-cause deaths for PLHIV in Baltimore using these proportions to distribute all-cause deaths to each stratified subgroup. In Miami, we used state-level numbers of HIV-specific deaths, combined with state-level all-cause mortality among PLHIV, to derive a multiplier for HIV-specific deaths [165]. We applied this multiplier to reported numbers of city-level HIV-specific deaths to derive the number of all-cause deaths among PLHIV for Miami. In Seattle, fully stratified city-level numbers for all-cause mortality among PLHIV were not reported, so we used state-level estimates to derive relative proportions of deaths for every stratum [166]. We applied state-level proportions to city-level totals for all-cause mortality among PLHIV to derive fully stratified city-level mortality totals [160]. For Atlanta, we used state-level estimates of all-cause mortality from 2012-2015, stratified individually by risk group, gender and race/ethnicity [167], weighted by the relative proportion of diagnosed PLHIV within each stratum in the 20 counties, relative to the entire state.
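The multiplier approach described for Miami, and the distribution of a city-level total across strata described for Seattle, reduce to simple arithmetic. The sketch below assumes the multiplier is the ratio of state-level all-cause deaths to state-level HIV-specific deaths among PLHIV; the function names and all numbers are hypothetical.

```python
def all_cause_from_hiv_specific(city_hiv_deaths, state_hiv_deaths,
                                state_all_cause_deaths):
    """Scale city-level HIV-specific deaths by a state-level multiplier."""
    multiplier = state_all_cause_deaths / state_hiv_deaths
    return city_hiv_deaths * multiplier


def distribute(total, proportions):
    """Split a total across strata using proportions that sum to 1."""
    return {stratum: total * p for stratum, p in proportions.items()}


# Hypothetical counts: 200 HIV-specific and 500 all-cause deaths among PLHIV
# at the state level, and 40 HIV-specific deaths reported for the city.
city_all_cause = all_cause_from_hiv_specific(40, 200, 500)

# Hypothetical state-level proportions of deaths by stratum.
by_stratum = distribute(city_all_cause, {"male": 0.6, "female": 0.4})
```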

Model Validation Targets
We used yearly incidence estimates for each city as a validation step for our model. Among the six cities, New York City reported separate incidence estimates for each year from 2012-2015. For other cities, including Los Angeles (for which independent city-level incidence estimates were only available for 2012 and 2013), we derived incidence estimates using the difference between observed yearly state-level diagnoses and yearly state-level incidence estimated by the CDC [167]. We used these proportions to derive city-level incidence numbers based on the number of city-level diagnoses.

Population Growth Projections
To ensure that our model matched long-term population growth projections and changing demographics in each city, we used external reports and/or data to adjust total population growth parameters. Projections stratified by age, gender and race/ethnicity, were available for Atlanta [79], Baltimore [80], Los Angeles [81], and Miami [82]. For New York City, race/ethnicity stratified growth projections were not available at the city, county or state level, so we triangulated projections using age-gender stratified estimates combined with national trends for changing race/ethnic population compositions to 2040 [83,84]. For Seattle, stratified growth projections were not available at the city level, so we used population projections for Washington State [85]. We incorporated these projections to ensure that population growth parameters produced projections that matched the overall growth rates from long-term projections accounting for external factors and trends affecting city growth rates.

Probabilistic Sensitivity Analysis
Prior parameter ranges and probability distributions, from which parameter values were randomly drawn, were used in both the model calibration and probabilistic sensitivity analysis (PSA) processes. Using distributional assumption guidelines [168,169], we fit each parameter with parametric distributions according to the data type and quality of available evidence (S1 Supplement Table E1). Given these differences, the fitted distributions do not necessarily map to the parameter categories presented earlier.
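To illustrate how a PSA draw might be generated, the sketch below fits a beta distribution to a probability-type parameter and a gamma distribution to a positive-valued parameter by the method of moments, then samples one parameter set. Pairing beta with probabilities and gamma with positive quantities is a common convention and is assumed here; the distributions actually assigned to each parameter are those in S1 Supplement Table E1. The parameter names and values are hypothetical.

```python
import random


def beta_from_mean_sd(mean, sd):
    """Method-of-moments Beta(a, b) parameters for a probability-type input."""
    common = mean * (1 - mean) / sd**2 - 1
    if common <= 0:
        raise ValueError("sd too large for a beta distribution with this mean")
    return mean * common, (1 - mean) * common


def gamma_from_mean_sd(mean, sd):
    """Method-of-moments Gamma(shape, scale) parameters for a positive input."""
    shape = (mean / sd) ** 2
    scale = sd**2 / mean
    return shape, scale


def draw_parameter_set(specs, rng):
    """Draw one value per parameter; specs maps name -> (kind, mean, sd)."""
    draws = {}
    for name, (kind, mean, sd) in specs.items():
        if kind == "beta":
            a, b = beta_from_mean_sd(mean, sd)
            draws[name] = rng.betavariate(a, b)
        elif kind == "gamma":
            shape, scale = gamma_from_mean_sd(mean, sd)
            draws[name] = rng.gammavariate(shape, scale)
        else:
            raise ValueError(f"unknown distribution: {kind}")
    return draws


# Hypothetical parameters and moments, for illustration only.
specs = {
    "condom_use_probability": ("beta", 0.5, 0.1),
    "annual_cost": ("gamma", 100.0, 50.0),
}
rng = random.Random(2017)  # fixed seed so the draw is reproducible
sample = draw_parameter_set(specs, rng)
```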
S1 Supplement Figure E1. Duration of acute stage and transmission probability during acute stage
† We did not explicitly model distributional assumptions and ranges for census and surveillance report data given the relatively low levels of uncertainty for these parameters.

S1 Supplement F: Scientific Advisory Committee Survey
Our review of literature and publicly available reports provided the majority of data values needed for the dynamic compartmental model. However, where data were unavailable or were not rated highly in our quality assessment, we sought expert advice from members of our Scientific Advisory Committee (SAC) on how to proceed in finding additional sources and in using the data available to us.
To track experts' advice, we developed a comprehensive, web-based survey tailored for each of the cities using SurveyMonkey® and sent an invitation to our SAC leads. The survey asked 60 questions and was designed to take less than an hour to complete, with mandatory questions built in throughout to ensure complete responses. Participants were asked to review the best data points we found, identify additional sources we were not aware of, and describe their preference among the assumptions we offered to improve, through triangulation, the representativeness of the data we worked with. Participants were also encouraged to share the survey with public health experts in their respective city for additional review and input. The full survey is attached in a supplementary PDF document.
The survey was first made available on July 20, 2017 and remained open until October 31, 2017. Between 1 and 7 experts completed the survey for each city, including 9 SAC members and 5 additional local public health experts. Two participants completed the survey for more than one city. Respondents also identified additional data sources, including internal data/reports not publicly available. We were able to access these reports directly or through assistance from our contacts within each city.
Using respondents' advice, we applied the following assumptions to adjust or incorporate the best available data into our model's starting values (S1 Supplement Table F1).