Emergence as an outbreak of the HIV-1 CRF19_cpx variant in treatment-naïve patients in southern Spain

Background
CRF19_cpx is a complex circulating recombinant form (CRF) of HIV-1. We describe the characteristics of an outbreak of the CRF19_cpx variant among treatment-naïve patients in southern Spain.

Methods
The study was undertaken at the Virgen de la Victoria Hospital, a reference centre for the analysis of HIV-1 genotype in Malaga (Spain). Subtyping was performed with REGA v3.0, and the relationship of our CRF19_cpx sequences to each other and to reference sequences of the same variant was defined by phylogenetic analysis. We reconstructed the phylogeny by the maximum likelihood method with the PhyML program and further confirmed the transmission clusters by Bayesian inference. Additionally, we collected demographic, clinical and immunovirological data.

Results
Between 2011 and 2016, we detected 57 treatment-naïve patients with the CRF19_cpx variant. Of these, 55 formed a very well-defined transmission cluster, phylogenetically close to CRF19_cpx sequences from the United Kingdom. The origin of this subtype in Malaga was dated between 2007 and 2010. Over 50% of the patients presented the non-nucleoside reverse transcriptase inhibitor resistance mutation G190A. This variant was mostly found in young adult Spanish men who have sex with men. Almost half of them were recent seroconverters, though a similar percentage was diagnosed at a late stage of HIV infection. Five cases of AIDS and one non-AIDS-defining death occurred during follow-up. The majority of patients treated with first-line combination antiretroviral therapy (ART) responded.

Conclusions
We report the largest HIV-1 CRF19_cpx cohort of treatment-naïve patients outside Cuba, almost all emerging as an outbreak in the south of Spain. Half the cases had the G190A resistance mutation. Unlike in previous studies, the variant from Malaga seems less pathogenic, with few AIDS events and an excellent response to ART.


Introduction
CRF19_cpx is a complex circulating recombinant form (CRF) of HIV-1, exhibiting a mosaic structure with multiple segments of subtypes A, D, and G. Although molecular epidemiology studies suggest a central African origin for this variant [1][2], CRF19_cpx was first described in HIV-1 patients from Cuba [1], where it was imported around the mid-1980s and rapidly disseminated within local transmission networks [2]. This recombinant form now represents almost 4% of all subtypes and more than 17% of all the CRFs in newly diagnosed Cuban patients [3][4][5]. Other genetic variability studies show even higher prevalence values, reaching up to ~28% in Cuba [6]. Additionally, examination of the Los Alamos HIV Database (LANL, http://www.hiv.lanl.gov/content/sequence/HIV/mainpage.html) shows that most of the entries for CRF19_cpx also come from Cuba (217 out of 285 sequences, last accessed in January 2017).
Clinically, infection with the CRF19_cpx subtype has mostly been linked to rapid progression to AIDS [7]. CRF19_cpx has also been found to be one of the most prevalent viral variants with multiple drug resistance mutations in HIV-1 therapy-naïve patients in Cuba [5,8]. Consequently, the CRF19_cpx subtype seems to be associated with greater pathogenicity and common resistance mutations [7,9].
Despite the very limited spread of the CRF19_cpx variant outside Cuba, a few cases of this recombinant form were recently reported in Tunisia and Spain [10][11]. The United Kingdom, Greece and France are also represented by a small number of sequences in the LANL (48, 2 and 1 sequences, respectively; last accessed in January 2017).
Here, we present a major expansion of this HIV-1 CRF outside Cuba. The variant has been transmitted in southern Spain as an outbreak among men who have sex with men (MSM), diagnosed between 2011 and 2016.

Study population and subtype assignment
The study was undertaken at the Virgen de la Victoria Hospital, a reference centre for the study of HIV-1 genotypic drug resistance for six hospitals in the region of Malaga (southern Spain). A genotype resistance test has been routinely undertaken in our centre since 2004 for all patients with confirmed HIV-1 infection, at the time of diagnosis and before starting combination antiretroviral therapy (ART). A partial region of the HIV-1 pol gene, encoding the complete protease (PR) and partial reverse transcriptase (RT), was sequenced using RT-PCR and Sanger sequencing (Trugene HIV Genotyping Kit, Siemens Healthcare Diagnostics Inc., Tarrytown, NY, USA) or 454 pyrosequencing (GS Junior Titanium Sequencing Kit, Roche Diagnostics GmbH, Mannheim, Germany), depending on the date of sample collection (before or after 2014, respectively). The subtype of each FASTA sequence was assigned with REGA v3.0, and sequences classified as CRF19_cpx were subsequently confirmed by phylogenetic analysis.

Phylogenetic analysis
The relationship of our CRF19_cpx sequences to each other and to the worldwide epidemic of this subtype was characterized by a phylogenetic analysis including another 254 reference sequences of the same variant retrieved from the LANL (198 sequences from Cuba, 47 from the United Kingdom, 4 from Spain, 3 from the USA, 1 from Greece and 1 from Tunisia). The PR and RT sequences (average overall length of 913 nt) were aligned with ClustalX. The phylogeny was reconstructed by the maximum likelihood method with the PhyML v3.0 program [12][13]. Cluster reliability was assessed with two nonparametric branch-support measures implemented in that software: bootstrapping with 100 replicates and the SH-like aLRT test, a branch-support measure in line with the SH tree selection method [14]. The best substitution model was determined with the Findmodel option included in MEGA v6.0 [15], taking the lowest AIC (Akaike Information Criterion) score as the selection criterion [16].
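The AIC-based selection step described above can be illustrated with a minimal Python sketch. It assumes the standard definition AIC = 2k - 2 ln L, where k is the number of free parameters and ln L the maximized log-likelihood. The model names, log-likelihoods and parameter counts below are hypothetical placeholders, not values from this study or output from MEGA.

```python
from typing import Dict, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: AIC = 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def best_model(fits: Dict[str, Tuple[float, int]]) -> str:
    """Return the name of the model with the lowest AIC score."""
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical (log-likelihood, free parameters) pairs, for illustration only.
fits = {
    "JC": (-10250.0, 0),
    "HKY+G": (-10110.0, 5),
    "GTR+G+I": (-10080.0, 10),
}

scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
chosen = best_model(fits)
print(scores)
print(chosen)
```

A richer model is preferred only when its gain in log-likelihood outweighs the penalty of 2 per extra parameter.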

Dating the most recent common ancestor (tMRCA) of the variant in our area
An outbreak cluster was identified in our cohort, extracted from the tree and time-resolved in BEAST. For this purpose, we applied a coalescent-based Bayesian Markov Chain Monte Carlo (MCMC) approach with the program BEAST v2 [18]. Under a relaxed uncorrelated lognormal clock, we set the GTR+G+I substitution model and a lognormal prior distribution of 5 × 10⁻³ substitutions per site per year (standard deviation = 0.5) on the evolutionary rate, in line with the general rate reported in the literature for all subtypes in the pol gene [19][20]. We chose the most appropriate coalescent model (Bayesian and Extended Bayesian Skylines, constant or exponential) to infer the population dynamics of this outbreak, based on the lowest value of Akaike's Information Criterion through MCMC (AICM). The MCMC was run with chain lengths of 100 million states, sampling estimates every 1000th generation. All the parameters were estimated using the software Tracer 1.6 (http://tree.bio.ed.ac.uk/software/tracer/), accepting only traces with an effective sample size (ESS) of >200 to assess the tMRCA.
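To make the sampling scheme concrete, a chain of 100 million states sampled every 1000th state retains 100,000 samples, and the ESS threshold asks how many of those are effectively independent. The Python sketch below assumes a simple estimator that sums autocorrelations up to the first non-positive lag. It is an illustration of the concept only, not the algorithm implemented in Tracer, and the two demonstration traces are artificial.

```python
from typing import Sequence

CHAIN_LENGTH = 100_000_000  # MCMC states, as stated in the text
SAMPLE_EVERY = 1_000        # sampling interval, as stated in the text
retained_samples = CHAIN_LENGTH // SAMPLE_EVERY


def effective_sample_size(trace: Sequence[float]) -> float:
    """Estimate ESS as n / (1 + 2 * sum of positive-lag autocorrelations).

    The sum is truncated at the first non-positive autocorrelation
    (an assumption of this sketch, chosen for simplicity).
    """
    n = len(trace)
    mean = sum(trace) / n
    centred = [x - mean for x in trace]
    variance = sum(c * c for c in centred) / n
    if variance == 0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        autocov = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / n
        rho = autocov / variance
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


# Artificial traces: one with no positive autocorrelation, one strongly autocorrelated.
alternating = [1.0 if i % 2 == 0 else -1.0 for i in range(100)]
stepped = [float(i // 100) for i in range(1000)]

print(retained_samples)
print(effective_sample_size(alternating))
print(effective_sample_size(stepped))
```

A strongly autocorrelated trace yields an ESS far below its raw length, which is why a long chain can still fail the >200 criterion.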

Description and statistical analyses of the study population characteristics
Additionally, we collected demographic, clinical and immunovirological data as well as therapy-related information for the patients with the CRF19_cpx variant. All the patients signed an informed consent form at their first visit to each hospital, containing explicit agreement to the confidential and anonymized use of their routine data, as performed here. We carried out a statistical analysis of these variables with the software SPSS 16.0. Prior to the descriptive analysis, we examined the distribution of each variable in the whole cohort and summarized it with the mean or the median, according to whether or not it followed a normal distribution, respectively. Proportions were compared with the two-sided Fisher exact test. For quantitative variables following a non-normal distribution, the Wilcoxon non-parametric test was used. To evaluate the association of normally distributed quantitative variables with a dichotomous category, the means of the two categories were compared with the Student t test. In all cases statistical significance was set at p < 0.05.
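As an illustration of the comparison of proportions, the following Python sketch computes a two-sided Fisher exact p-value for a 2×2 table using only the standard library. It assumes the common convention of summing the probabilities of all tables that are no more likely than the observed one. The analyses in this study were run in SPSS, and the counts used here are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table at least as extreme (no more probable) as the observed one;
    # the small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts: 3 of 4 in one group versus 1 of 4 in the other.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p, 4))
```

With groups this small the test is very conservative, so even a marked difference in proportions does not reach p < 0.05.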

Results
We subtyped all sequences obtained from any genotype test performed in our hospital in treatment-naïve patients diagnosed since 2004, finding the first case of the CRF19_cpx variant in 2011 (Fig 1). From January 2011 to December 2016, a total of 2566 resistance studies were carried out in naïve patients; 57 (2.2%) had sequences assigned by REGA as subtype CRF19_cpx or similar [Recombinant of 19_cpx, B and Subtype D (19_cpx)] (results in S1 Table). The highest prevalence of this subtype was found in 2016, with 19 patients out of 438 (4.3%).
The Bayesian approach confirmed the presence of a local transmission cluster, with an associated posterior probability (pp) of 0.  1). All these subclusters were also recovered in the ML tree, with one or both of the branch-support measures and thresholds considered (S3B and S4B Figs). Finally, the most recent common ancestor (MRCA) of this outbreak was dated at 2009 (2007.5-2010.0, 95% HPD) by the Bayesian skyline model with a lognormal relaxed molecular clock, selected as reporting the lowest AICM value (S2 Table). On the other hand, there were two sequences phylogenetically separated from each other and from the rest. One of them was sampled in 2013 and showed a close phylogenetic relation (pp = 1.0; bootstrap = 84%; SH-like aLRT = 98%) exclusively to reference sequences from Cuba. The other one, sampled in 2016, did not belong to any subgroup (either reference or local clusters) with a high enough confidence value (pp = 0.538, Fig 2). Thus, at least three separate introductions of CRF19_cpx HIV-1 occurred in our area. This pattern was robustly confirmed by the two phylogenetic reconstructions obtained, the ML tree (also supported by both non-parametric measures) and Bayesian inference.
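The 95% HPD (highest posterior density) interval quoted for the MRCA date is the shortest interval that contains 95% of the posterior samples. A minimal Python sketch of that definition follows, assuming a plain list of posterior draws. It is not the routine used by BEAST or Tracer, and the sample values are artificial.

```python
import math
from typing import List, Tuple


def hpd_interval(samples: List[float], mass: float = 0.95) -> Tuple[float, float]:
    """Return the shortest interval containing at least `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    # Number of samples the interval must cover; the tolerance avoids
    # rounding up because of floating-point error in mass * n.
    k = max(1, math.ceil(mass * n - 1e-9))
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


# Artificial draws with one outlier in the upper tail.
draws = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0]
print(hpd_interval(draws, mass=0.9))
```

Unlike an equal-tailed interval, the HPD interval shifts away from a long tail, which suits skewed posteriors such as node ages.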
Regarding antiretroviral drug resistance, we found the G190A mutation, associated with different levels of non-nucleoside reverse transcriptase inhibitor (NNRTI) resistance, in 29 out of 57 patients with the CRF19_cpx subtype (50.9%), including all the patients from clusters A to D (Fig 3). Outside these clusters but within the outbreak, there were two more patients with this mutation. In addition, we detected the polymorphic mutation V179I/A in 8 patients (14.0%), though this has little direct effect on NNRTI susceptibility. Unlike the patients with the G190A mutation, those with the V179 polymorphism did not show any clear grouping with each other, as depicted in Fig 3 as well as S3 and S4 Figs in Supplementary material.
Furthermore, as shown in Table 2, 54 out of 57 patients were self-reported men who have sex with men (MSM) (94.7%). All were Spanish, except two patients from Argentina and one from France. The average age of the cohort was 35.7 years (27.2-42.0). The initial CD4 count was 387 cells/μL (259-468). Eight patients (14.0%) had <200 cells/μL at diagnosis and 26 (45.6%) presented a late diagnosis (initial CD4 count <350 cells/μL). In addition, the average first viral load was 4.9 log10 copies/mL (4.3-5.5), with a peak value of 5.0 log10 copies/mL (4.5-5.5), the latter being lower in patients with the G190A mutation (4.7 vs. 5.2, p = 0.03). On the other hand, five cases of AIDS (8.8%) were recorded at diagnosis. Only one death was recorded during the study period, due to acute myocardial infarction in a patient two years after his diagnosis and without any AIDS event during the follow-up period. Finally, 54 patients were being treated with first-line combination ART at the end of the study period, 92.6% of them with viral suppression. Other demographic and clinical data about our cohort, as well as the detailed comparison between the groups with and without the G190A mutation, are shown in Table 2.
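The proportions quoted in this Results section can be re-derived from the reported counts. The short Python check below is only an arithmetic illustration; the dictionary keys are labels chosen here for readability, not variable names from the study.

```python
from typing import Dict, Tuple


def percent(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


# (count, denominator) pairs taken from the Results text.
counts: Dict[str, Tuple[int, int]] = {
    "CRF19_cpx among resistance studies 2011-2016": (57, 2566),
    "CRF19_cpx among patients in 2016": (19, 438),
    "G190A mutation": (29, 57),
    "V179I/A polymorphism": (8, 57),
    "MSM": (54, 57),
    "CD4 <200 cells/uL at diagnosis": (8, 57),
    "late diagnosis (CD4 <350 cells/uL)": (26, 57),
    "AIDS cases": (5, 57),
}

percentages = {label: percent(k, n) for label, (k, n) in counts.items()}
for label, value in percentages.items():
    print(f"{label}: {value}%")
```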

Discussion
As far as we know, this study reports the largest cohort of HIV-1 CRF19_cpx outside Cuba.
Although the spread of this recombinant outside Cuba has been previously described [10][11], this is the first time it has arisen as an outbreak of such size. CRF19_cpx has been sampled in our area since 2011, with an estimated introduction in 2009. Up to the end of the study period, we detected this subtype in 57 treatment-naïve patients, 55 of them phylogenetically grouped in a well-defined cluster. Phylogenetic analysis showed the proximity of this cluster to reference sequences of the CRF19_cpx subtype sampled in the United Kingdom between 2008 and 2010. On the other hand, there were also two more sequences of this variant, separate from each other and from the other patients in our area. One of them is not linked to any transmission cluster, not even to other CRF19_cpx references from the LANL. Thus, the pattern shows at least three separate introductions of CRF19_cpx HIV-1 in the Malaga area, only one of them emerging as an outbreak, with a sharp increase of cases during the study period. All the phylogenetic approaches performed, in addition to the epidemiological data discussed below, support its status as a real outbreak. The centre of this outbreak is Malaga, with no relation to any other CRF19_cpx sequence from treatment-naïve patients outside this area [22]. The prevalence of this variant is the second highest among non-B subtypes and recombinants in the Costa del Sol area [23]. Moreover, the prevalence in 2016, the last year included in the study, was over 4%, a figure very similar to that observed in Cuba itself [3]. Kouri et al. highlighted the high fitness score of the CRF19_cpx subtype in the PR region [7]. Although we did not determine the replicative capacity of the specific strains of this variant circulating in our area, its long persistence and its increasing transmission support a high viral fitness.
CRF19_cpx has also been associated with multiple drug resistance mutations in therapy-naïve Cuban patients [5,8]. In this respect, more than half of our cohort carried the G190A mutation, associated with different levels of NNRTI resistance (intermediate resistance to efavirenz, potential low-level resistance to etravirine, high-level resistance to nevirapine, and low-level resistance to rilpivirine). However, the high prevalence of this specific drug resistance mutation differs strikingly from other studies, where it is seldom found at diagnosis [5,24]. In addition, we detected the polymorphism at position 179 of the RT gene at a lower frequency than that seen in another Spanish cohort of 9 CRF19_cpx patients [11]. Unlike the G190A mutation, though, this polymorphism only confers low-level resistance to NNRTI.
Finally, we analysed the demographic and clinical variables in the set of CRF19_cpx patients in our study. These characteristics were similar to those found for non-B infected patients in the Spanish AIDS Research Network Cohort (CoRIS) [25], except for origin and risk category, since we mainly detected CRF19_cpx in Spanish MSM, while other non-B subtypes mainly affect heterosexual and immigrant patients. Regarding clinical and virological data, we found no evidence of the greater pathogenicity of the CRF19_cpx recombinant reported in previous studies [7,9]. Thus, few cases of AIDS were reported. The viral load and CD4 count at HIV-1 diagnosis were also similar to those of the non-B patients from CoRIS and even to those of the overall cohort, including all the subtype B patients [25][26]. All the patients were treated with first-line ART, with no case of treatment or virological failure reported. Consequently, the CRF19_cpx subtype does not seem to be associated with greater pathogenicity in our cohort. Nevertheless, because the variant spread as an outbreak in our area, mostly affecting MSM, we cannot state that these characteristics are specific to the CRF19_cpx subtype, nor can they be generalized to other at-risk populations.
In summary, the CRF19_cpx recombinant has been mainly transmitted as an outbreak over a period of 6 years in southern Spain. Local young MSM are the major risk group affected. This subtype is associated with a high prevalence of cross-class primary resistance to NNRTI in our area, with more than half the cases presenting the G190A resistance mutation. Unlike in previous studies, the CRF19_cpx recombinant from Malaga seems less pathogenic, with few cases of AIDS and an excellent response to first-line ART.
Fig 1. Cases of CRF19_cpx subtype over time. Number of CRF19_cpx variants detected in the Hospital Virgen de la Victoria, from the first case found until the end of the study period (2011-2016). https://doi.org/10.1371/journal.pone.0190544.g001

Fig 2. Bayesian maximum clade credibility phylogenetic tree inferred by MrBayes v3.2 program showing our CRF19_cpx sequences and another 254 reference sequences from the same variant retrieved from LANL. Each patient from our cohort is represented in red by their sample ID, while reference sequences appear in different colours according to the country of sampling (green: Cuba; pink: United Kingdom; turquoise: Spain (other than our cohort); brown: USA; black: Greece; blue: Tunisia). This figure is depicted in detail as a supplementary material (S2 Fig).

Fig 3. Subtree with the 55 CRF19_cpx sequences grouped together (pp = 0.895), forming the identified outbreak as depicted in the Bayesian inference of the phylogeny (see Fig 2). Sequences presenting the G190A mutation are highlighted within the light grey shaded square. Asterisks indicate the detection of V179I/A as applicable. https://doi.org/10.1371/journal.pone.0190544.g003

Fig 3
as well as S3 and S4 Figs in Supplementary material.

Table 1. Summary of the main demographic, clinical and virological data for the seven subclusters found (pp ≥ 0.9),