Estimating Incidence from Prevalence in Generalised HIV Epidemics: Methods and Validation

Background: HIV surveillance of generalised epidemics in Africa primarily relies on prevalence at antenatal clinics, but estimates of incidence in the general population would be more useful. Repeated cross-sectional measures of HIV prevalence are now becoming available for general populations in many countries, and we aim to develop and validate methods that use these data to estimate HIV incidence.
Methods and Findings: Two methods were developed that decompose observed changes in prevalence between two serosurveys into the contributions of new infections and mortality. Method 1 uses cohort mortality rates, and method 2 uses information on survival after infection. The performance of these two methods was assessed using simulated data from a mathematical model and actual data from three community-based cohort studies in Africa. Comparison with simulated data indicated that these methods can accurately estimate incidence rates and changes in incidence in a variety of epidemic conditions. Method 1 is simple to implement but relies on locally appropriate mortality data, whilst method 2 can make use of the same survival distribution in a wide range of scenarios. The estimates from both methods are within the 95% confidence intervals of almost all actual measurements of HIV incidence in adults and young people, and the patterns of incidence over age are correctly captured.
Conclusions: It is possible to estimate incidence from cross-sectional prevalence data with sufficient accuracy to monitor the HIV epidemic. Although these methods will theoretically work in any context, we have been able to test them only in southern and eastern Africa, where HIV epidemics are mature and generalised. The choice of method will depend on the local availability of HIV mortality data.

If we ignore the possibility of sero-conversion and death occurring in the same interval, the relationship between the size of the cohort at successive sero-surveys can be found in the following way:
where $\mu_i$ is the mortality rate for those not infected with HIV. Dividing by $N_{i,0}$:
For the calculations presented here, we use $\mu_i = 0.01$ for all age-ranges, which approximates the observed mortality rate for uninfected individuals aged 15-44 in sub-Saharan Africa [1,2].

Derivation of cross-sectional measures
Case 1: Inter-survey period not longer than width of age-group ($T \le r$)
In this case, the estimate of incidence in the cross-sectional age-group includes the experience of two cohorts (Figure 1(c)). We calculate an average rate from these two cohort incidence rates, weighted according to the time spent by the cohorts in the fixed age-group.
The time spent by each cohort is proportional to the shaded areas in Figure 1(c).
Total person-years spent in fixed age-group: $rT$.
Fraction of person-years spent by the other cohort: $\frac{T}{2r}$ (the two fractions sum to one).
Fraction of person-years spent by cohort $i$: $1 - \frac{T}{2r}$
Hence, equation (7):
Case 2: Inter-survey period longer than width of age-groups ($T > r$)
The same logic applies when the inter-survey period is longer, but the weighting equation must be adjusted because more cohorts pass through the cross-sectional age-groups in the period. If $r < T \le 2r$, three cohorts pass through the cross-sectional age-group:
Fraction of person-years spent by cohort:
Hence, the new expression to replace equation (7) is:
In theory, further expressions could be found for longer inter-survey periods (i.e. $T > 2r$). However, the linear approximations that underlie these methods work better over shorter intervals, so the derived estimates would be less reliable, and this is not recommended.
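The person-year weights for both cases can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes that cohorts are the age-groups of width r at the first survey and that ages are spread uniformly within them, so each weight is an area in the age-time plane divided by rT. The function names and the closed-form Case 2 weights are derived here from that geometry and are not taken from the source.

```python
def cohort_weights(T, r):
    """Fractions of person-years spent in a fixed age-group of width r
    during an inter-survey period T, listed from the cohort that starts
    in the age-group to the youngest cohort that enters it.

    Assumption: uniform age distribution, so weights are areas / (r * T).
    """
    if not 0 < T <= 2 * r:
        raise ValueError("only 0 < T <= 2r is covered")
    if T <= r:
        # Case 1: two cohorts; the starting cohort spends 1 - T/(2r).
        return [1 - T / (2 * r), T / (2 * r)]
    # Case 2 (r < T <= 2r): three cohorts pass through the age-group.
    w_start = r / (2 * T)                  # leaves entirely after r years
    w_middle = 2 - r / T - T / (2 * r)     # ages in, then starts ageing out
    w_young = (T - r) ** 2 / (2 * r * T)   # begins to enter after r years
    return [w_start, w_middle, w_young]


def cross_sectional_rate(cohort_rates, T, r):
    """Weighted average of cohort incidence rates (same order as weights)."""
    weights = cohort_weights(T, r)
    if len(cohort_rates) != len(weights):
        raise ValueError("need one rate per cohort passing through")
    return sum(w * rate for w, rate in zip(weights, cohort_rates))
```

For example, with five-year age-groups and a two-year inter-survey period, `cohort_weights(2, 5)` gives the starting cohort a weight of 1 - 2/10. The two cases agree at T = r, where each of the two cohorts contributes half of the person-years.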

Alternative Formula for Calculating Cohort Survival Rates
Equation (5) holds when $T \approx r$. Experimentation in situations when this is not the case suggests that small departures from this (i.e. $T < r$) will not introduce large errors. However, we have derived an alternative formula that should give a better approximation when

Details of simulation model
The simulation model generates prevalence and mortality measurements as a function of time for a given pattern of incidence. Here $a$ is age and $t$ is the time since the epidemic started (both in years); $U(a,t)$ is the number of uninfected individuals with that many years of sexual activity and $I(a,w,t)$ is the number of individuals infected $w$ years ago with that many years of sexual activity. The rates of change of $U(a,t)$ and $I(a,w,t)$ are described by the following partial differential equations.
$\lambda(a,t)$ is the instantaneous incidence rate; $\mu(a)$ is the rate of mortality from causes other than AIDS; and the AIDS-death hazard applies to those infected at age $x$ who have survived $z$ years. In these simulations, survival after infection is parameterised as in Table 1 and the background mortality rates are based on World Bank estimates for Africa in the pre-AIDS era [3].
The boundary conditions are: $\phi(a)$ is the fraction of the population at age $a$, in the absence of AIDS-mortality. Three scenarios for the age pattern of incidence are simulated: constant incidence over age, incidence highest at older ages (reflecting patterns expected early in epidemics) and incidence highest at younger ages (reflecting patterns observed in mature epidemics) (see Figure 4(b)). To generate changes over time, these age-specific rates are multiplied by a scaling variable, which is set to unity and then increases or decreases (or remains the same) at a specified time (see Figure 3 for illustration of temporal changes).
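The way the scaling variable combines with the age pattern can be sketched as follows. This is an illustration only: the step change, the function names and the numerical rates are assumptions made for the example, since the source gives neither the shape of the temporal change nor the age-specific values.

```python
def scaling(t, change_time, new_level):
    """Scaling variable: unity before change_time, new_level afterwards.

    Assumption: the change is an abrupt step; the source only says the
    variable increases, decreases or stays the same at a specified time.
    """
    return 1.0 if t < change_time else new_level


def constant_pattern(age):
    """Hypothetical age pattern: the same incidence rate at every age."""
    return 0.02  # illustrative value per person-year, not from the source


def incidence(age, t, age_pattern, change_time, new_level):
    """Incidence rate at a given age and time: age pattern times scaling."""
    return age_pattern(age) * scaling(t, change_time, new_level)
```

Setting `new_level` above one, below one, or equal to one reproduces the increasing, decreasing and stable scenarios respectively.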
The impact of ART is simulated by manipulating the survival distribution with HIV so that it instead reflects survival until "ART is started". Models indicate that in the current mode of ART delivery (Hallett et al., submitted), individuals that are started on ART will typically have CD4 cell counts below 200 and be within two years of death without therapy. Therefore, here it is assumed that individuals that can access ART are initiated two years before they would otherwise die of AIDS. Survival for those individuals that do not receive ART is distributed exponentially with mean 2 years.
A fraction, g(t), of the individuals who newly come to need ART receive it. This fraction can change over time to represent no provision in the early part of the epidemic, a linear scale-up of provision over five years to 30%, and then a maintained level thereafter.
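The coverage function g(t) described above can be written as a piecewise-linear function. The sketch below assumes a hypothetical parameter `start` for the year in which scale-up begins, which the source does not specify; the five-year duration and the 30% plateau are taken from the text.

```python
def art_coverage(t, start, duration=5.0, target=0.30):
    """Fraction of individuals newly in need of ART who receive it at time t.

    Zero before `start` (hypothetical scale-up year), rising linearly over
    `duration` years to `target`, then held at that level.
    """
    if t <= start:
        return 0.0
    if t >= start + duration:
        return target
    return target * (t - start) / duration
```

For instance, with scale-up starting in year 20 of the epidemic, coverage halfway through the scale-up (`t = 22.5`) is half of the target.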