Acquisition, prevalence and clearance of type-specific human papillomavirus infections in young sexually active Indian women: A community-based multicentric cohort study

In the context of the ongoing multi-centric HPV vaccine study in India, unvaccinated married women (N = 1484) aged 18–23 years were recruited in 2012–2015 as age-matched controls to the vaccinated women and followed up yearly. We assess type-specific prevalence, natural history and potential determinants of human papillomavirus (HPV) infection in these unvaccinated women. Cervical samples were collected yearly for at least four consecutive years. A Multiplex Type-Specific E7-Based polymerase chain reaction assay was used to detect 21 HPV types. HPV period prevalence was 36.4% over 6 years. The most common HPV types were 16 (6.5%) and 31 (6.1%). The highest persistence was observed for HPV 35 (62.5%) and 52 (25%). The new HPV acquisition rate was 5.6/1000 person-months of observation (PMO), highest for HPV 16 (1.1/1000 PMO). Type-specific clearance rates ranged between 2.9 and 5.5/100 PMO. HPV 16 and/or 18 infections were 41% (95% CI 4–63%) lower among women with 2-<3 years between marriage and first cervical sample collection compared to those with <2 years. HPV prevalence and acquisition rates in young Indian women were lower than in their Western counterparts. That HPV 16 infections were the most common shows the importance and potential impact of HPV vaccination in India. Women with 2–3 years of exposure had a reduced risk, possibly due to higher infection clearance.

Introduction
A working group of the International Agency for Research on Cancer (IARC) categorized 12 types of Human Papillomavirus (HPV), phylogenetically all belonging to the alpha genus, as 'definitely carcinogenic' to humans [1]. These include HPV 16, 31, 33, 35, 52 and 58 (species alpha-9), HPV 18, 39, 45 and 59 (species alpha-7), HPV 51 (species alpha-5) and HPV 56 (species alpha-6). Additional HPV types belonging to genus alpha (types 26, 53, 66, 68, 70, 73 and 82) have been detected in cervical cancer samples at lower frequencies and are considered 'possible or probable high-risk' [2]. Women with a high-risk HPV infection of the cervix have an 11 times higher risk of developing high-grade cervical precancers compared to non-infected women [3]. The study of prevalence of HPV (overall and type-specific) is important not only to understand the burden of infection but also to predict the risk and the incidence of HPV-induced cancers in the population [4,5]. In addition, studying the prevalence of the HPV types targeted by the vaccines in young women has immense public health significance in assessing the early impact of the HPV vaccines. The HPV vaccine has been introduced in the immunization programme in only two provinces (out of a total of 28 provinces and 9 Union-territories) in India [6]. The vaccine is expected to be introduced in the rest of the country in the near future. A population-based estimate of the type-specific prevalence of HPV infection in young women from different regions of the country will create a valuable baseline for future surveillance studies assessing the vaccine impact.
IARC initiated a multi-centric study in India in 2009 to compare the efficacy of a single dose of quadrivalent HPV vaccine (Gardasil™, Merck, NJ, USA) to that of two and three doses. A total of 17,729 girls received three doses, two doses or a single dose of the quadrivalent vaccine by April 2010. Follow-up of the recipients of different doses of the vaccine and a cohort of age-matched unvaccinated women is ongoing, which is expected to generate valuable evidence on the long-term efficacy of a single dose of the vaccine to inform the cervical cancer elimination strategies being formulated by the World Health Organization (WHO).
The current manuscript, based on the IARC HPV vaccine study, reports the type-specific prevalence and natural history of 19 oncogenic (definite and probable high-risk) HPV types and two low-risk types (HPV 6 and 11, responsible for approximately 90% of genital warts) in young sexually active healthy women from seven different provinces of India. We present the data on the type-specific acquisition of new HPV infections, their persistence and clearance. Longitudinal investigation into the natural history of HPV infection allowed us to additionally identify the key determinants of incident infection.

married women recruited at ages 18-23 years. The ongoing trial aims to compare HPV infection rates and incidence of cervical lesions related to HPV infection among recipients of different doses of the vaccine. The vaccinated participants received the first dose of the vaccine between September 2009 and April 2010 when they were 10-18 years old and unmarried. The unvaccinated women were recruited and provided their first cervical cell samples between April 2013 and June 2015, matched for age and study site with the then married vaccinated participants. At recruitment, the trained health workers and nurses of the study interviewed the eligible women for sociodemographic and reproductive information and took their first cervical cell samples. The health workers and nurses then visited every participant in her household every year, enquiring about her general health and wellbeing. As in the vaccinated groups, medically significant events, pregnancy, antenatal and postnatal events, delivery, and migration details were obtained and recorded through the household visits, relatives, the network of social workers, healthcare providers and hospital records. The yearly follow-up is still ongoing. The detailed study protocol and the preliminary findings on vaccine efficacy have been described elsewhere [7,8].
Cervical cell samples for HPV genotyping were collected in PreservCyt™ medium (Hologic, MA, USA) from the women at their recruitment visit and yearly thereafter, until four annual samples were collected. If the fourth cervical sample was positive for any incident HPV type, an additional sample was collected to be tested for persistence. The samples were tested at the Rajiv Gandhi Centre for Biotechnology (RGCB), India, by the HPV type-specific E7 Polymerase Chain Reaction bead-based multiplex genotyping (Luminex Corporation™, Texas, USA) to detect 12 high-risk (HPV 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59), seven possible/probable high-risk (HPV 26, 53, 66, 68, 70, 73, 82) and two low-risk (HPV 6, 11) types of HPV.
As a quality control exercise, a Global HPV DNA proficiency panel [9] provided by the International HPV Reference Center, Karolinska Institute in Stockholm, Sweden in the year 2017 was evaluated. A test was regarded as proficient in genotyping, if it could detect 50 International Units (IU)/5μl of HPV 16 and 18 DNA, and 500 genome equivalents (GE)/5μl of the other HPV types included in the panel, both in samples with single and multiple plasmids. In addition, the specificity of the reported types should be >97%.
The ethics review committees of IARC and the participating centers approved the protocol.

Statistical analysis
The analysis of HPV genotyping, based on the cervical cell samples collected from 29 April 2013 to 10 June 2019 in the unvaccinated cohort, is reported in this manuscript. The following outcomes are evaluated:
1. Type-specific HPV period prevalence, estimated from HPV types present in the participant's first cervical cell sample (prevalent infections) and new ones detected in samples other than the first (newly acquired infections). The period prevalence reported in this analysis is based on the cumulative HPV infections accrued during 6 years of follow-up.
2. Type-specific HPV persistence, defined as having the same HPV type in two consecutive samples taken at least 10 months apart. This analysis included women with at least two sample collections.
3. Type-specific newly acquired HPV infections, which involved women who tested negative for a particular HPV type in their first sample.
4. Type-specific HPV clearance, defined as having a negative HPV test result for a particular type after being identified as positive for that same type in an antecedent sample from the same individual.
5. Combinations of high-risk HPV types depending on whether they may be protected, cross-protected or not protected by the vaccines (HPV 16 and/or 18; HPV 31, 33 and/or 45; and the other 14 oncogenic types) were assessed for prevalence, new acquisition and clearance, in addition to the individual types.
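The outcome definitions above can be illustrated with a minimal Python sketch that classifies one woman's test history for a single HPV type. This is an illustration only, not the study's analysis code (the analyses were run in Stata): the function name `classify_type_history`, the input layout, and the approximation of "10 months" as 304 days are all assumptions introduced here.

```python
from datetime import date

# Assumption: "at least 10 months apart" is approximated as 304 days (10 x 30.4).
MIN_PERSISTENCE_GAP_DAYS = 304


def classify_type_history(samples):
    """Classify one woman's results for ONE HPV type.

    samples: list of (collection_date, is_positive) tuples sorted by date.
    Returns a dict of boolean outcome flags (hypothetical layout).
    """
    if not samples:
        raise ValueError("at least one sample is required")

    # Prevalent infection: the type is present in the first sample.
    prevalent = samples[0][1]

    # Newly acquired: negative in the first sample, positive in a later one.
    newly_acquired = (not prevalent) and any(pos for _, pos in samples[1:])

    # Persistence: same type positive in two consecutive samples
    # collected at least ~10 months apart.
    persistent = any(
        a_pos and b_pos and (b_date - a_date).days >= MIN_PERSISTENCE_GAP_DAYS
        for (a_date, a_pos), (b_date, b_pos) in zip(samples, samples[1:])
    )

    # Clearance: a negative result after an antecedent positive result.
    seen_positive = False
    cleared = False
    for _, pos in samples:
        if pos:
            seen_positive = True
        elif seen_positive:
            cleared = True
            break

    return {
        "prevalent": prevalent,
        "newly_acquired": newly_acquired,
        "persistent": persistent,
        "cleared": cleared,
    }


history = [
    (date(2013, 5, 1), False),
    (date(2014, 5, 1), True),
    (date(2015, 5, 10), True),
    (date(2016, 5, 1), False),
]
flags = classify_type_history(history)
```

In this hypothetical history the type is absent at baseline, detected in two consecutive samples more than 10 months apart, and then absent again, so the infection counts as newly acquired, persistent and eventually cleared.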
The effect of participants' baseline socio-demographic characteristics and cervical sample collection patterns on any HPV infection and HPV 16/18 grouped outcomes was assessed. The sample collection patterns included three variables. The first variable was delayed sample collection. A participant was defined as having a delayed collection if she had a gap of 18 months or more between any consecutive sample collection dates. A participant who had less than four consecutive sample collections as per protocol, and whose due sample collection was delayed by more than 18 months by the date of our data analysis (10 June 2019), was also defined as having a delayed collection. The second and third variables were the number of cervical sample collections per participant, and the gap between the dates of marriage and first sample collection.
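The delayed-collection variable can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: the function names, the conversion of months to days (30.4375 days per month), and the choice to measure the delay of a due sample from the date of the last collection are assumptions, since the text does not specify how the 18 months were counted.

```python
from datetime import date

ANALYSIS_DATE = date(2019, 6, 10)   # date of data analysis stated in the text
DAYS_PER_MONTH = 30.4375            # assumption: average month length
REQUIRED_SAMPLES = 4                # four consecutive collections per protocol
MAX_GAP_MONTHS = 18.0


def months_between(start, end):
    """Approximate number of months between two dates."""
    return (end - start).days / DAYS_PER_MONTH


def has_delayed_collection(collection_dates, analysis_date=ANALYSIS_DATE):
    """Return True if a participant meets either delayed-collection criterion."""
    dates = sorted(collection_dates)

    # Criterion 1: a gap of 18 months or more between consecutive collections.
    if any(months_between(a, b) >= MAX_GAP_MONTHS for a, b in zip(dates, dates[1:])):
        return True

    # Criterion 2: fewer than four collections and the next one overdue by more
    # than 18 months at the analysis date (assumption: counted from the last
    # collection date).
    if 0 < len(dates) < REQUIRED_SAMPLES:
        return months_between(dates[-1], analysis_date) > MAX_GAP_MONTHS

    return False


on_schedule = [date(2014, 6, 1), date(2015, 6, 1), date(2016, 6, 1), date(2017, 6, 1)]
long_gap = [date(2013, 5, 1), date(2014, 5, 1), date(2016, 1, 1)]
overdue = [date(2015, 1, 1), date(2016, 1, 1)]
```

Under these assumptions, `on_schedule` is not delayed, `long_gap` is delayed because of the roughly 20-month interval between its last two collections, and `overdue` is delayed because no sample followed within 18 months of the last one.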
Counting of the HPV type-specific person-months of observation (PMO) for each woman in the prevalence and incidence analysis began at the date of first sample collection (baseline/recruitment visit) and ended either at the date of detection of the specific HPV type of interest, or at the last negative HPV test/sample collection. Women who had only one sample collection were considered to have been followed up for one day. The rates of newly acquired infections were calculated by dividing the number of new infections by the total follow-up time at risk for an individual HPV type and expressed per 1000 person-months together with their exact Poisson 95% confidence intervals (CIs). The rates were assessed based on the individual HPV type, and rates of grouped HPV infection types were also expressed per 1000 person-months, as the ratio of the number of infections to the total combined follow-up time a woman was at risk of acquiring each HPV type in the respective group. The cumulative probability of acquiring a new HPV infection was assessed using Kaplan-Meier estimates of the cumulative hazard function among cohort members who were negative for a particular HPV type tested in their first sample.
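The rate calculation described above can be sketched in Python. This is an illustration under stated assumptions, not the study's Stata code: the function names are hypothetical, the event count and person-months in the example are invented, and the exact Poisson limits are found by bisection on the Poisson distribution function using only the standard library, which is adequate for the modest event counts typical of type-specific analyses.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); intended for modest k and lam."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return min(total, 1.0)


def _solve_lambda(k, target):
    """Find lam such that poisson_cdf(k, lam) == target (the CDF decreases in lam)."""
    lo, hi = 0.0, 2.0 * k + 50.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if poisson_cdf(k, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_poisson_ci(events, alpha=0.05):
    """Exact (Garwood-type) confidence limits for a Poisson count."""
    lower = 0.0 if events == 0 else _solve_lambda(events - 1, 1.0 - alpha / 2.0)
    upper = _solve_lambda(events, alpha / 2.0)
    return lower, upper


def rate_with_ci(events, person_months, per=1000.0):
    """Rate per `per` person-months with its exact Poisson 95% CI."""
    lower, upper = exact_poisson_ci(events)
    scale = per / person_months
    return events * scale, lower * scale, upper * scale


# Hypothetical numbers for illustration: 10 new infections in 10,000 person-months.
rate, ci_low, ci_high = rate_with_ci(10, 10_000)
```

For grouped HPV types, the same function would be called with the summed number of infections and the summed type-specific time at risk across the types in the group, as the text describes.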
PMO for clearance of each HPV type for a woman was estimated starting from the date the HPV infection was first detected to the date the woman turned negative, or to the last date the woman was detected with a persistent HPV infection for that type. Again, type-specific HPV clearance rates and associated exact Poisson 95% CIs were calculated and expressed per 100 person-months, and Kaplan-Meier estimates were obtained for the cumulative probability of HPV clearance over the study period.
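The cumulative probability of clearance can be illustrated with a small product-limit (Kaplan-Meier) estimator in Python. This is a minimal sketch with hypothetical names and invented follow-up data; it is not the code used in the study and omits confidence bands.

```python
def kaplan_meier_cumulative(durations, events):
    """Kaplan-Meier cumulative probability of the event at each event time.

    durations: follow-up time per infection (e.g. months from first detection).
    events: 1 if clearance was observed at that time, 0 if censored.
    Returns a list of (time, cumulative probability) pairs.
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_censored = 0
        # Group all observations that share the same time point.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            else:
                n_censored += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, 1.0 - survival))
        n_at_risk -= n_events + n_censored
    return curve


def median_time(curve):
    """First time at which the cumulative probability reaches 50%, else None."""
    for t, prob in curve:
        if prob >= 0.5:
            return t
    return None


# Hypothetical example: months to clearance (1) or to last positive test (0).
months = [6, 12, 12, 18, 24]
cleared = [1, 1, 0, 1, 0]
curve = kaplan_meier_cumulative(months, cleared)
```

With these invented data the cumulative probability of clearance is 0.2 at 6 months, 0.4 at 12 months and 0.7 at 18 months, so the estimated median time to clearance is 18 months. The same estimator applies to new acquisition if the event is first detection of a type among women negative for it at baseline.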
The analysis to evaluate the effect of the socio-demographic factors and cervical sample collection patterns on the grouped HPV infections outcomes was carried out at the infection level rather than the woman level to increase power (i.e., an individual contributed to the analysis multiple unique HPV types acquired at different time points during follow-up). This analysis was done using generalized estimating equation (GEE) regression models to account for lack of independence between infections occurring within the same individual. All statistical analyses were carried out in Stata

Results
Out of the total 1484 unvaccinated women recruited at eight study sites from seven different provinces, 229 women had one, 202 had two, 312 had three, 735 had four and six had five cervical samples assessed for HPV genotyping until June 2019. The median time of follow-up was 36.9 months (IQR: 0.03-53.0 months; range: 0.03-71.2 months). These participants had a mean age at recruitment of 20.3 years (SD: 1.1), 878 (59.2%) had high school or higher formal education and the majority belonged to the low-income group. The mean age at marriage was 19 years and three-quarters had their first cervical cell sample two years or more after marriage (Table 1).
Table 2 shows the type-specific HPV period prevalence, persistence and new HPV acquisition rate. The prevalence of any HPV infection during the 6-year study period was 36.4% (540/1484), with 19.6% (291/1484) showing infection in the first sample. The most common HPV types, either present in these women at study entry or infecting them during the study period, were HPV 16, 31, 58, 56 and 68, and the corresponding proportions were 6.5%, 6.1%, 4.9%, 4.6% and 4.3%, respectively. The period prevalence was 9.3% for HPV 16 and/or 18, 9.8% for HPV 31, 33 and/or 45 and 26.0% for the other 14 types. HPV 16 and/or 18 were detected in the first sample in 3.4% (51/1484) of participants. The highest persistence was observed for HPV 35, 52, 70, 16 and 18, in that order, the proportions being 62.5%, 25.0%, 25.0%, 23.6% and 22.2%, respectively. The persistence proportions were very low for both HPV 6 (6.7%) and HPV 11 (0%).
In women negative for any of the 21 HPV types tested on their baseline cervical cell sample collection, the acquisition rate of new HPV infections was 5.6 per 1000 PMO. The HPV type-specific acquisition rates of new infections are presented in Table 2. These rates were highest for HPV 16 (1.1/1000 PMO), followed by HPV 58 and 68 (0.8/1000 PMO each) and HPV 56 (0.7/1000 PMO). The acquisition rates were generally higher for the individual high-risk HPV types than for the individual probable high-risk or low-risk types.
The cumulative probability by follow-up time of new HPV infections was assessed for HPV 16 and/or 18, HPV 31, 33 and/or 45, high-risk types other than HPV 16/18/31/33/45 and any of the 21 HPV types (Fig 1). Women were more likely to acquire new infections of high-risk types other than HPV 16 and/or 18 or HPV 31, 33 and/or 45. After 5 years of follow-up, women had a 9.2%, 6.6% and 24.3% risk of being detected with new infections of HPV 16 and/or 18, HPV 31, 33 and/or 45 and other high-risk types, respectively.
Table 3 shows the type-specific clearance rates and survival time to clearance, while Fig 2 presents the Kaplan-Meier plots for the cumulative proportion of participants positive for HPV 16 and/or 18, HPV 31, 33 and/or 45, or high-risk types other than HPV 16/18/31/33/45 who cleared the infection during the study period. The HPV clearance rates and proportions for the three different categories of HPV infections were similar over the course of time. The median times to HPV type-specific clearance were 1.3 years, 1.5 years and 1.5 years, and the 4-year cumulative probabilities of clearance were 97.6%, 99.0% and 97.6% for HPV 16 and/or 18, HPV 31, 33 and/or 45, and other oncogenic types, respectively. There was no significant difference in the clearance rates of the individual types, with rates ranging from 2.9 to 5.5 per 100 PMO.
The effect (obtained from the GEE regression models) of the women's baseline socio-demographic characteristics and cervical cell sample collection patterns on the grouped HPV infection outcomes is shown in
https://doi.org/10.1371/journal.pone.0244242.t001

PLOS ONE
Type-specific HPV acquisition, prevalence and clearance
collections done on schedule; participants providing at least 3 (ARR 0.55; 95% CI 0.40-0.76) compared to those providing 1-2 cervical cell sample collections; and among those whose gap between marriage and provision of the first cervical cell sample collection was 2-<3 years (ARR 0.59; 95% CI 0.37-0.96) compared to participants whose gap was <2 years. The risk of any HPV infection was significantly (2.08-fold) higher in Mizoram compared to Pune, Maharashtra (Table 4). There was no significant difference among the other sites.

Main findings
Our study has systematically evaluated the prevalence and natural history of HPV infection over a median period of nearly 4 years in a young sexually active cohort of women. More than one third of the women were infected over time with any of the 21 HPV types, HPV 16 being the most frequently detected, followed by HPV 31, 58 and 56 in that order. Over 90% of the infected women cleared the infection by 36 months irrespective of the HPV type. The clearance rate was significantly higher in the initial months of infection, a phenomenon well documented by previous studies [10]. This might also partly explain why women with a 2-3 year gap between marriage and first cervical cell collection had a reduced risk of HPV infection, as they may have had time to clear the infection.

Strengths and limitations
The longitudinal nature of our study helped assess HPV persistence, which is the necessary cause of cervical neoplasia. This is the largest cohort so far in India in which HPV genotyping was systematically studied both for prevalence and natural history. Undoubtedly, the most important risk factors of HPV infection are related to sexual practices of the participating women and their partners. Indian society is still conservative and discussion regarding sexual practices, especially premarital or extramarital sex, is taboo in most places. We did not collect any data on sexual practices to avoid embarrassing young women and their possible non-participation. We used the gap between marriage and first cervical sample collection as a representation of exposure time, assuming marriage is a proxy for sexual debut.
Furthermore, our study participants were not selected using any systematic sampling methodology and may not be truly representative of the country or the provinces they belong to. We decided not to screen this very young cohort of women for cervical cancer to avoid harms. It is possible that some of them may have had high-grade cervical premalignant lesions at study entry and the data may not be exactly comparable to the outcomes of the studies that excluded such women. However, the proportion of high-grade lesions is likely to be low in these young Indian women.
Some of the persistent infections reported in our study could be due to reinfection of the cervix, which theoretically could over-estimate the rate of persistence. However, we have followed the definition of persistent infection used in the majority of the natural history studies to ensure comparability of data [11].

Interpretation
Type-specific persistence of high-risk HPV is the best known predictor of a woman developing high-grade cervical precancers in future [3]. The definition of persistence is not uniform across the studies and the reported rates vary widely with age, frequency of sample collection and number of samples collected per participant. However, there is general agreement that the high-risk types tend to persist longer than low-risk types and persistence increases significantly with age, both for low-risk and high-risk types [12,13]. The follow-up of the unvaccinated control cohort (age 15-25 years) in a bivalent HPV vaccine evaluation study showed that more than half of the high-risk HPV infections (54.1%) persisted for longer than 1 year [14]. In our study the persistence proportions of HPV 16 (23.6%) and HPV 18 (22.2%) were higher than most of the probable high-risk or low-risk types but lower than that reported in the bivalent vaccine evaluation study. Another Indian study also observed higher persistence proportions for both HPV 16 (45.6%) and HPV 18 (38.4%) in 16-24 year old women [15].
The reported HPV prevalence among apparently normal women varies widely with geographical location. A meta-analysis of 194 studies conducted between the years 1995 and 2009 estimated the HPV point prevalence among more than a million women with normal cervical cytology [16]. The age-adjusted prevalence of any HPV was 11.7% worldwide, ranging from 9.4% in Asia to 21.1% in Africa. Southern Asia, almost entirely represented by studies from India, had an age-adjusted prevalence of 7.1%. Another study reported significantly lower HPV prevalence among South Asian young women (<25 years of age) compared to those in Africa, Europe or America (12.9% versus 20%) [17].
HPV prevalence reported by Indian studies varied from 2.3% to 36.9%, reflecting the heterogeneity of women included in the studies [17]. High prevalence was observed in studies that recruited predominantly symptomatic women attending health facilities [18-20]. We identified only five studies from India that selected apparently normal women from the community and used a PCR-based assay to detect a large number of HPV types (ranging from 26 to 44) (Table 5) [21-25]. These studies also documented a wide range of HPV prevalence, from 6.1% in south India to 19.2% among tribal women from central India. The point prevalence observed in our study (19.6%) was the highest among the Indian studies, possibly because of the sampling frame (which included some of the sites with high prevalence like Mizoram and Sikkim), sample collection methodology (serial samples collected from the same women over consecutive years), younger age and assay selection. The Multiplex Type-Specific E7-Based PCR assay used in our study is more sensitive than the L1 consensus primer-based PCR assays used in earlier Indian studies, and as a result significantly increased the rate of detection of HPV and the number of infections with multiple HPV types. The high analytical sensitivity of our assay has been established in a range of studies [26-28].
India is generally considered to be a conservative society as far as sexual practices are concerned. The National Family Health Survey (NFHS, 2015-16), which collects data to estimate the critical indicators of health, reported that only 2.5% of unmarried women between 15-19 years of age had ever had sex and only 0.4% of sexually active women aged between 20 and 24 years had multiple partners [29]. In spite of such apparently conservative sexual behaviour of the women, we observed that nearly one third of the study participants became infected with HPV. Aizawl (Mizoram), situated in the north-eastern part of India, had the highest rate of HPV infection in our study, followed by Ambilikai (Tamil Nadu). The Family Health Survey revealed that married women in Mizoram reported the highest average number of partners (3.6%) in the country and the highest HIV prevalence (1.49% compared to the national average of 0.24%). The high HPV prevalence explains the fact that Mizoram reported the highest cervical cancer incidence in the country (28/100,000 women in 2012-2014) [30].
Like our study, HPV 16 was the most prevalent type among normal women in previous studies irrespective of the study location, with prevalence varying between 2.5% in Asia and

Conclusions
Indian women have a high burden of cervical cancer. The primary explanation is that women in India do not have access to population-based organized screening for cervical cancer and only a minuscule proportion of adolescent girls has been covered by HPV vaccination. Our study clearly shows that Indian women are at high risk of being infected with HPV, especially HPV types 16 and 18. Interestingly, the centers reporting the highest prevalence (Mizoram) and the lowest prevalence (Ahmedabad) in our study also had the highest and lowest incidence of cervical cancer among all the sites (Fig 3). Policy-makers in India should not delay introduction of the HPV vaccine any further and should ensure that women have access to quality-assured cancer screening to remain on track to achieve the goal of eliminating cervical cancer.